What is a good definition of language codes and language-region (locale) codes?

Question

  • When should en_GB be used, and when en-GB?
  • What is the difference?
  • Is there an ISO name for this combination of ISO 639-1 (language) and ISO 3166 (country)?

    Recommended answer

    There are several systems for locale identifiers. Many of them look similar at first glance, but they differ once you go deeper.

    Some examples (Serbian as used in Serbia, written in Latin script; Japanese as used in Japan, with radical sorting):

    • UTS-35, ICU, Mac OS X, Flash: sr-Latn-RS, ja-JP@collation=radical
    • Newer UTS-35, BCP 47 extension U: sr-Latn-RS, ja-JP-u-co-unihan
    • Windows 2000, XP: 0x81a, 0x10411
    • Windows Vista, Windows 7: sr-Latn-CS, ja-JP_radical
    • Java: sr_CS, ja_JP
    • Java 7: sr_RS, ja_JP
    • Linux: sr_RS@latin, ja_JP.utf8

    Think of it as different ways of talking about colors (RGB, CMYK, HSV, Pantone, etc.).

    So the question of - vs. _ does not make sense unless you specify the environment you are using. Use - and Java will not understand it; use _ and Windows will not understand it. ICU (and systems built on top of it) accepts both - and _, but produces the _ style.

    There is no ISO standard that covers the language-country combination. There are ISO standards that cover the individual parts (language, country, script), but the exact version of each standard in use also depends on the system the locale identifiers belong to.

    In general you should accept both _ and -, and generate only one of them ("be liberal in what you accept and strict in what you emit"), as ICU does.
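
The "accept both, emit one" advice can be sketched in a few lines of Python. This is a minimal illustration and not ICU's actual algorithm: the function name `normalize_locale_id` and the casing rules (lowercase language, title-case 4-letter script, uppercase 2-letter region, everything after a single-letter extension marker left lowercase) are assumptions made for the example, and POSIX-style suffixes such as `.utf8` or `@latin` are deliberately not handled.

```python
import re


def normalize_locale_id(raw: str, sep: str = "-") -> str:
    """Accept '-' or '_' between subtags, emit only `sep`.

    Hypothetical helper for illustration; it does not validate subtags
    against any registry.
    """
    subtags = [s for s in re.split(r"[-_]", raw.strip()) if s]
    if not subtags:
        raise ValueError(f"empty locale identifier: {raw!r}")

    out = [subtags[0].lower()]          # language, e.g. "en"
    in_extension = False
    for tag in subtags[1:]:
        if in_extension or len(tag) == 1:
            # A single-letter subtag (such as "u") starts an extension;
            # everything after it is kept lowercase.
            in_extension = True
            out.append(tag.lower())
        elif len(tag) == 4 and tag.isalpha():
            out.append(tag.title())     # script, e.g. "Latn"
        elif len(tag) == 2 and tag.isalpha():
            out.append(tag.upper())     # region, e.g. "GB"
        else:
            out.append(tag)             # anything else is passed through
    return sep.join(out)


if __name__ == "__main__":
    for sample in ("en_GB", "EN-gb", "sr_latn_rs", "ja-JP-u-co-unihan"):
        print(sample, "->", normalize_locale_id(sample))
```

Calling `normalize_locale_id("en-GB", sep="_")` produces the underscore style instead, so the choice of output separator stays in one place.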

    If you communicate with systems that use another type of locale identifier, you will have to map to and from your own system, and that mapping will dictate whether you use _ or -. Some of the mappings will be lossy (there is no way to specify alternate calendars on Windows or Linux, or alternate sorting or scripts in Java older than 7, etc.), and round-tripping might not be possible (somewhat like RGB-CMYK conversions).
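
To make the lossiness concrete, here is a rough sketch of mapping a BCP 47 style tag to a POSIX-style name of the form `language_REGION.codeset@modifier`. The function `to_posix_style` and the tiny script-to-modifier table are hypothetical and hand-written for this example; real systems ship their own locale lists, and a name produced here is not guaranteed to be installed anywhere. The point is only that some subtags have nowhere to go and must be reported as dropped.

```python
from typing import List, Tuple

# Hypothetical, hand-written table: only the scripts needed for the example.
_SCRIPT_TO_MODIFIER = {"Latn": "latin", "Cyrl": "cyrillic"}


def to_posix_style(tag: str, codeset: str = "UTF-8") -> Tuple[str, List[str]]:
    """Map a '-'-separated tag to a POSIX-style name.

    Returns (posix_name, dropped), where `dropped` lists the parts of
    the tag that could not be represented in the target form.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    region = ""
    modifier = ""
    dropped: List[str] = []

    for index, sub in enumerate(subtags[1:], start=1):
        if len(sub) == 1:
            # Extension (e.g. "u-co-unihan"): no POSIX-style equivalent here.
            dropped.append("-".join(subtags[index:]))
            break
        if len(sub) == 4 and sub.isalpha():
            name = _SCRIPT_TO_MODIFIER.get(sub.title())
            if name:
                modifier = "@" + name
            else:
                dropped.append(sub)
        elif len(sub) == 2 and sub.isalpha():
            region = "_" + sub.upper()
        else:
            dropped.append(sub)

    return f"{language}{region}.{codeset}{modifier}", dropped


if __name__ == "__main__":
    print(to_posix_style("sr-Latn-RS"))
    print(to_posix_style("ja-JP-u-co-unihan"))
```

In this sketch the script survives as a modifier, but the collation request in `ja-JP-u-co-unihan` ends up in the dropped list, so converting back would not reproduce the original tag.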

    Addition: things differ not only between systems; they also change over time. For instance, Java 7 added support for sr_RS and for scripts, Windows keeps adding support for more locales, new countries get created (the Sudan split, Russia, Serbia) or disappear (East Germany, the U.S.S.R., Yugoslavia), and so on.

    For the internal representation you might want to choose the most powerful system, the one that can represent everything: UTS-35 / BCP 47 (also used by CLDR and ICU).
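
As a rough illustration of what such an internal representation could look like, the sketch below stores the parts separately and serializes them to a BCP 47 style tag on demand. The class name `LocaleId` and its fields are assumptions made for this example; it covers only language, script, region, and "u" extension keywords, and performs no validation. A production system would more likely rely on an existing library such as ICU.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class LocaleId:
    """Hypothetical internal locale record (illustration only)."""

    language: str                         # e.g. "sr"
    script: Optional[str] = None          # e.g. "Latn"
    region: Optional[str] = None          # e.g. "RS"
    # Unicode extension keywords, e.g. {"co": "unihan"} for collation.
    unicode_keywords: Dict[str, str] = field(default_factory=dict)

    def to_bcp47(self) -> str:
        """Serialize to a '-'-separated tag with an optional 'u' extension."""
        parts = [self.language]
        if self.script:
            parts.append(self.script)
        if self.region:
            parts.append(self.region)
        if self.unicode_keywords:
            parts.append("u")
            # Sort keys so the output is stable regardless of insertion order.
            for key in sorted(self.unicode_keywords):
                parts.extend([key, self.unicode_keywords[key]])
        return "-".join(parts)


if __name__ == "__main__":
    print(LocaleId("sr", "Latn", "RS").to_bcp47())
    print(LocaleId("ja", region="JP", unicode_keywords={"co": "unihan"}).to_bcp47())
```

Keeping the fields separate means each target system (Java, Windows, Linux) can get its own serializer, while the richest form stays the single source of truth.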

