Question

en_GB or en-GB? Is there an ISO name for the combination of ISO 639-1 (language) and ISO 3166 (country)?

Recommended answer
There are several systems of locale identifiers. Many of them look similar at first glance, but they diverge once you look deeper.
Some examples (Serbian-Serbia with Latin script, Japanese-Japan with radical sorting):
- UTS-35, ICU, Mac OS X, Flash: sr-Latn-RS, ja-JP@collation=radical
- Newer UTS-35, BCP 47 extension U: sr-Latn-RS, ja-JP-u-co-unihan
- Win 2000, XP: 0x81a, 0x10411
- Vista, Win 7: sr-Latn-CS, ja-JP_radical
- Java: sr_CS, ja_JP
- Java 7: sr_RS, ja_JP
- Linux: sr_RS@latin, ja_JP.utf8
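The numeric Windows identifiers in the list are not opaque: as far as I understand the documented LCID layout, the low 16 bits hold a language ID (10 bits of primary language, 6 bits of sublanguage) and the next 4 bits hold a sort ID. A minimal sketch that splits those fields; the function name `decode_lcid` and the dictionary keys are my own, not a Windows API:

```python
def decode_lcid(lcid: int) -> dict:
    """Split a Windows LCID into its bit fields (assumed layout, see lead-in)."""
    lang_id = lcid & 0xFFFF  # bits 0-15: language ID
    return {
        "sort_id": (lcid >> 16) & 0xF,        # bits 16-19: sort ID
        "lang_id": lang_id,
        "primary_language": lang_id & 0x3FF,  # low 10 bits of the language ID
        "sublanguage": lang_id >> 10,         # high 6 bits of the language ID
    }


for value in (0x81A, 0x10411):
    print(hex(value), decode_lcid(value))
```

This shows why 0x10411 differs from plain 0x411: the language ID is the same, and the extra bit sits in the sort field.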
Think of it like the different ways to talk about colors (RGB, CMYK, HSV, Pantone, etc.).
So - vs. _ does not make sense unless you specify the environment you are using. Use - and Java will not understand it; use _ and Windows will not understand it. ICU (and systems built on top of it) accepts both - and _, but produces the _ style.
There is no ISO standard that covers the language-country combination. But there are ISO standards that cover the individual parts (language, country, script). The exact version of each ISO standard also depends on the system used for locale identifiers.
In general you should accept both _ and -, and generate only one of them ("be liberal in what you accept and strict in what you emit"), as ICU does.
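As a rough illustration of "accept both, emit one", here is a minimal sketch that takes either separator and always emits the hyphenated form with conventional subtag casing. The function name `normalize_locale` is hypothetical, and the parsing is deliberately simplified: it does not validate subtags and does not handle POSIX suffixes such as @latin or .utf8.

```python
import re


def normalize_locale(tag: str) -> str:
    """Accept '-' or '_' as separator; emit a hyphen-separated identifier."""
    parts = [p for p in re.split(r"[-_]", tag.strip()) if p]
    if not parts:
        raise ValueError("empty locale identifier")

    out = [parts[0].lower()]          # language subtag: lowercase
    in_extension = False
    for part in parts[1:]:
        if in_extension or len(part) == 1:
            in_extension = True       # after a singleton (e.g. 'u'), keep lowercase
            out.append(part.lower())
        elif len(part) == 4 and part.isalpha():
            out.append(part.title())  # script subtag, e.g. Latn
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            out.append(part.upper())  # region subtag, e.g. GB or 419
        else:
            out.append(part.lower())  # variants and anything else
    return "-".join(out)


for raw in ("en_GB", "EN-gb", "sr_latn_rs", "ja-JP-u-co-unihan"):
    print(raw, "->", normalize_locale(raw))
```

Emitting - here is an arbitrary choice for the sketch; a system that follows ICU would emit _ instead. What matters is that the output style is fixed while the input style is not.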
If you communicate with systems that use another type of locale identifier, you will have to map to and from your own system's form. That will force you to use _ or -, whichever the other system expects. Some of the mappings will be lossy (there is no way to specify alternate calendars on Windows or Linux, or alternate sorting or scripts in Java older than 7, etc.), and round-tripping might not be possible (somewhat like RGB-CMYK conversions).
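To make the lossy-mapping point concrete, here is a small sketch that maps a structured identifier to a Linux-style name and to a plain language_COUNTRY name (the shape used by Java before 7). The type `LocaleId`, the helper names, and the two-entry script table are assumptions made for illustration; a real mapping table would be much larger.

```python
from typing import NamedTuple, Optional


class LocaleId(NamedTuple):
    language: str
    script: Optional[str] = None
    region: Optional[str] = None


# Illustrative and deliberately incomplete (an assumption, not a standard list).
SCRIPT_TO_POSIX_MODIFIER = {"Latn": "latin", "Cyrl": "cyrillic"}


def to_posix(loc: LocaleId) -> str:
    """Linux-style name: language_REGION@modifier."""
    name = loc.language
    if loc.region:
        name += "_" + loc.region
    modifier = SCRIPT_TO_POSIX_MODIFIER.get(loc.script or "")
    if modifier:
        name += "@" + modifier
    return name


def to_language_country(loc: LocaleId) -> str:
    """language_COUNTRY only: there is no slot for the script, so it is dropped."""
    return loc.language + ("_" + loc.region if loc.region else "")


def from_language_country(name: str) -> LocaleId:
    language, _, region = name.partition("_")
    return LocaleId(language, None, region or None)


original = LocaleId("sr", "Latn", "RS")
print(to_posix(original))              # sr_RS@latin
round_tripped = from_language_country(to_language_country(original))
print(round_tripped == original)       # False: the script did not survive
```

The round trip through the language_COUNTRY form loses the script, which is the kind of information loss described above.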
Addition: things differ not only between systems but also over time. For instance, Java 7 added support for sr_RS and for scripts, Windows keeps adding support for more locales, new countries get created (the Sudan split, Russia, Serbia) or disappear (East Germany, the U.S.S.R., Yugoslavia), and so on.
For the internal representation you might want to choose the most powerful system, one that can represent everything, and that is UTS-35 / BCP 47 (also used by CLDR and ICU).