Script (Unicode)

‎	‎	ᓃ‎	‎	⠿‎
‎	‎	አ‎	文‎	あ‎
ꦏ‎	‎	‎	‎	ழ்‎
‎	ع‎‎	ש‎‎	Д‎	A‎

In Unicode, a script is a collection of letters and other written signs used to represent textual information in one or more writing systems.^[1] Some scripts support only one writing system and language, for example, Armenian. Other scripts support many different writing systems; for example, the Latin script supports English, French, German, Italian, Vietnamese, Latin itself, and several other languages. Some languages have used more than one writing system and therefore more than one script; for example, Turkish was written in the Arabic script before the 20th century but switched to the Latin script in the early part of that century. More or less complementary to scripts are symbols and Unicode control characters.

The unified diacritical characters and unified punctuation characters frequently have the "common" or "inherited" script property. However, the individual scripts often have their own punctuation and diacritics, so that many scripts include not only letters but also diacritic and other marks, punctuation, numerals and even their own idiosyncratic symbols and space characters.

Unicode 15.1 defines 161 separate scripts, including 94 modern scripts and 67 ancient or historic scripts.^[2]^[3] More scripts are in the process of being encoded or have been tentatively allocated for encoding in roadmaps.^[4]

Definition and classification

When multiple languages make use of the same script, there are frequently some differences, particularly in diacritics and other marks. For example, Swedish and English both use the Latin script. However, Swedish includes the character å (sometimes called a Swedish O), while English has no such character, nor does English use the combining ring above diacritic for any character. In general, languages that share a script share many of the same characters. Despite these peripheral differences between the Swedish and English writing systems, they are said to use the same Latin script. Thus, the Unicode abstraction of scripts is a basic organizing technique. The differences among alphabets or writing systems remain and are supported through Unicode's flexible scripts, combining marks and collation algorithms.
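As a small illustration of the å example, the following Python sketch uses the standard `unicodedata` module to show that the precomposed letter decomposes into a Latin base letter plus the combining ring above. The helper name `decompose` is an assumption made for this example only.

```python
import unicodedata

def decompose(text):
    """Return the character names of the canonical decomposition (NFD) of text."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]

swedish_a_ring = "\u00e5"  # å, LATIN SMALL LETTER A WITH RING ABOVE

# The precomposed letter splits into a base letter and a combining mark.
print(decompose(swedish_a_ring))
# → ['LATIN SMALL LETTER A', 'COMBINING RING ABOVE']

# Recomposing (NFC) the two code points gives back the single letter.
print(unicodedata.normalize("NFC", "a\u030a") == swedish_a_ring)  # → True
```

Both forms belong to the same Latin writing tradition; only the sequence of code points differs.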

Script versus writing system

The term "writing system" is sometimes treated as a synonym for "script". However, it can also refer to a specific, concrete writing system supported by a script. For example, the Vietnamese writing system is supported by the Latin script. A writing system may also cover more than one script; for example, the Japanese writing system makes use of the Han, Hiragana and Katakana scripts.
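The Japanese case can be sketched in Python. The standard library does not expose the Unicode Script property, so this example falls back on a rough heuristic based on character-name prefixes. The table `NAME_PREFIX_TO_SCRIPT` and the function `scripts_used` are assumptions made for illustration, and the heuristic is not equivalent to the real property (for instance, some characters whose names begin with KATAKANA have the Common script).

```python
import unicodedata

# Hypothetical heuristic table: character-name prefix -> script name.
# This only approximates the Script property for a few scripts.
NAME_PREFIX_TO_SCRIPT = {
    "CJK UNIFIED IDEOGRAPH": "Han",
    "HIRAGANA": "Hiragana",
    "KATAKANA": "Katakana",
    "LATIN": "Latin",
}

def scripts_used(text):
    """Return the set of (heuristically detected) scripts found in text."""
    found = set()
    for ch in text:
        name = unicodedata.name(ch, "")  # "" for unnamed code points
        for prefix, script in NAME_PREFIX_TO_SCRIPT.items():
            if name.startswith(prefix):
                found.add(script)
                break
    return found

sample = "日本語のテキスト"  # Han, then Hiragana, then Katakana characters
print(sorted(scripts_used(sample)))
# → ['Han', 'Hiragana', 'Katakana']
```

For production use, a library that ships the actual Script property data would be a better basis than name prefixes.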

Most writing systems can be broadly divided into several categories: logographic, syllabic, alphabetic (or segmental), abugida, abjad and featural. However, features of any of these may be found in any given writing system in varying proportions, often making it difficult to categorize a system cleanly. The term complex system is sometimes used to describe those where the admixture makes classification problematic.

Unicode supports all of these types of writing systems through its numerous scripts. Unicode also adds further properties to characters to help differentiate the various characters and the ways they behave within Unicode text-processing algorithms.

Special script property values

In addition to explicit or specific script properties, Unicode uses three special values:^[5]

Common: Unicode can assign a character in the UCS to a single script only. However, many characters—those that are not part of a formal natural-language writing system or are unified across many writing systems—may be used in more than one script (for example, currency signs, symbols, numerals and punctuation marks). In these cases Unicode defines them as belonging to the "common" script (ISO 15924 code "Zyyy").
Inherited: Many diacritics and non-spacing combining characters may be applied to characters from more than one script. In these cases Unicode assigns them to the "inherited" script (ISO 15924 code Zinh), which means that they have the same script class as the base character with which they combine, and so in different contexts they may be treated as belonging to different scripts. For example, U+0308 ̈ COMBINING DIAERESIS may combine either with U+0065 e LATIN SMALL LETTER E to create a Latin ë or with U+0435 е CYRILLIC SMALL LETTER IE for the Cyrillic ё. In the former case, it inherits the Latin script of the base character, whereas in the latter case, it inherits the Cyrillic script of the base character.
Unknown: The value of "unknown" script (ISO 15924 code Zzzz) is given to unassigned, private-use, noncharacter, and surrogate code points.
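The "inherited" behaviour described above can be sketched as follows. The lookup table `SCRIPT_OF` is a toy stand-in covering only the code points from the example (it is not the Unicode Character Database), and `resolved_scripts` is a hypothetical helper name.

```python
# Toy script table for this example only; real data lives in the
# Unicode Character Database (Scripts.txt).
SCRIPT_OF = {
    "\u0065": "Latin",      # e, LATIN SMALL LETTER E
    "\u0435": "Cyrillic",   # е, CYRILLIC SMALL LETTER IE
    "\u0308": "Inherited",  # COMBINING DIAERESIS
    " ": "Common",
}

def resolved_scripts(text):
    """Give each character a script; Inherited marks take the script of the
    character they follow. Characters missing from the toy table are
    reported as "Unknown" purely as a fallback of this sketch."""
    resolved = []
    previous = None
    for ch in text:
        script = SCRIPT_OF.get(ch, "Unknown")
        if script == "Inherited" and previous is not None:
            script = previous  # inherit from the base character
        resolved.append(script)
        previous = script
    return resolved

print(resolved_scripts("e\u0308"))       # → ['Latin', 'Latin']
print(resolved_scripts("\u0435\u0308"))  # → ['Cyrillic', 'Cyrillic']
```

The same combining mark therefore ends up with a different effective script depending on its base character, which is the point of the "inherited" value.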

Character categories within scripts

Unicode provides a general category property for each character, so in addition to belonging to a script, every character also has a general category. Scripts typically include letter characters: uppercase letters, lowercase letters and modifier letters. A few precomposed ligatures, such as ǲ (U+01F2), are categorized as titlecase letters. Such titlecase ligatures are all in the Latin and Greek scripts and are all compatibility characters, and therefore Unicode discourages their use by authors. It is unlikely that new titlecase letters will be added in the future.

Most writing systems do not differentiate between uppercase and lowercase letters. For those scripts, all letters are categorized as "other letter" or "modifier letter". Ideographs such as Unihan ideographs are also categorized as "other letters". A few scripts, however, do differentiate between uppercase and lowercase: Latin, Cyrillic, Greek, Armenian, Georgian, and Deseret. Even for these scripts there are some letters that are neither uppercase nor lowercase.
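The letter categories mentioned above can be inspected with Python's standard `unicodedata.category`, which returns the two-letter general category code. The mapping `LETTER_CLASS` and the helper `letter_class` are names introduced for this sketch only.

```python
import unicodedata

# General category codes for letters and the names used in the text above.
LETTER_CLASS = {
    "Lu": "uppercase letter",
    "Ll": "lowercase letter",
    "Lt": "titlecase letter",
    "Lm": "modifier letter",
    "Lo": "other letter",
}

def letter_class(ch):
    """Describe the letter category of ch, or report that it is not a letter."""
    return LETTER_CLASS.get(unicodedata.category(ch), "not a letter")

# A, a, ǲ (titlecase ligature), ʰ (modifier letter), 文 (ideograph), ש (Hebrew)
for ch in ["A", "a", "\u01f2", "\u02b0", "\u6587", "\u05e9"]:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {letter_class(ch)}")
```

The Han ideograph and the Hebrew letter both report "other letter", since neither script distinguishes case.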

Scripts can also contain characters of any other general category, such as marks (diacritic and otherwise), numbers (numerals), punctuation, separators (word separators such as spaces), symbols and non-graphical format characters. These are included in a particular script when they are unique to that script. Other such characters are generally unified and included in the punctuation or diacritic blocks. However, the bulk of characters in any script (other than the common and inherited scripts) are letters.
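To see these broader classes in practice, the sketch below tallies the major general-category class of each character in a string, using the first letter of the code returned by `unicodedata.category`. The names `MAJOR_CLASS` and `tally_major_classes` are assumptions made for this example.

```python
import unicodedata
from collections import Counter

# First letter of the general category code -> major class.
MAJOR_CLASS = {
    "L": "letter",
    "M": "mark",
    "N": "number",
    "P": "punctuation",
    "S": "symbol",
    "Z": "separator",
    "C": "other",  # control, format, surrogate, private use, unassigned
}

def tally_major_classes(text):
    """Count how many characters of text fall into each major class."""
    return Counter(MAJOR_CLASS[unicodedata.category(ch)[0]] for ch in text)

sample = "e\u0301 42, ok+"  # e + combining acute, space, digits, comma, letters, plus
print(dict(tally_major_classes(sample)))
# → {'letter': 3, 'mark': 1, 'separator': 2, 'number': 2, 'punctuation': 1, 'symbol': 1}
```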

List of scripts in Unicode

Unicode defines over a hundred script names (called "Alias" or "Property value alias"), based on the ISO 15924 list. Unicode uses the "Common" script name for ISO 15924's Zyyy (code for undetermined script), "Inherited" for ISO 15924's Zinh (code for inherited script), and "Unknown" for ISO 15924's Zzzz (code for uncoded script). Among the ISO 15924 script codes that are not used are Zsym (Symbols) and Zmth (Mathematical notation); these are not considered scripts in the Unicode sense.
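The relationship between ISO 15924 codes and Unicode aliases can be pictured as a simple lookup. The sketch below hard-codes only a handful of entries taken from this article; the names `ISO_TO_UNICODE_ALIAS`, `NOT_SCRIPTS_IN_UNICODE` and `unicode_alias` are hypothetical, and a real implementation would load the full property value alias data instead.

```python
import unicodedata

# A few ISO 15924 codes and their Unicode aliases (sample, not the full list).
ISO_TO_UNICODE_ALIAS = {
    "Zyyy": "Common",
    "Zinh": "Inherited",
    "Zzzz": "Unknown",
    "Latn": "Latin",
    "Cyrl": "Cyrillic",
    "Palm": "Palmyrene",
}

# ISO 15924 codes that Unicode does not treat as scripts.
NOT_SCRIPTS_IN_UNICODE = {"Zsym", "Zmth"}

def unicode_alias(iso_code):
    """Return the Unicode alias for an ISO 15924 code, or None if Unicode does
    not treat the code as a script. Codes outside this sample raise KeyError."""
    if iso_code in NOT_SCRIPTS_IN_UNICODE:
        return None
    return ISO_TO_UNICODE_ALIAS[iso_code]

print(unicode_alias("Zyyy"))  # → Common
print(unicode_alias("Zmth"))  # → None

# An alias can also appear in character names, for example for Palmyrene:
print(unicodedata.name("\U00010860"))  # → PALMYRENE LETTER ALEPH
```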

Scripts in ISO 15924^[a]^[b] and in Unicode^[c]^[d]
ISO 15924				Script in Unicode^[e]
Code	ISO number	ISO formal name	Directionality	Unicode Alias^[f]	Version	Characters	Notes	Description
Adlm	166	Adlam	right-to-left script	Adlam	9.0	88		Ch 19.9
Afak	439	Afaka	varies	Not in Unicode, proposal is explored^[i]
Aghb	239	Caucasian Albanian	left-to-right	Caucasian Albanian	7.0	53	Ancient/historic	Ch 8.11
Ahom	338	Ahom, Tai Ahom	left-to-right	Ahom	8.0	65	Ancient/historic	Ch 15.16
Arab	160	Arabic	right-to-left script	Arabic	1.0	1,368		Ch 9.2
Aran	161	Arabic (Nastaliq variant)	mixed	Typographic variant of Arabic (see § Arab)
Armi	124	Imperial Aramaic	right-to-left script	Imperial Aramaic	5.2	31	Ancient/historic	Ch 10.4
Armn	230	Armenian	left-to-right	Armenian	1.0	96		Ch 7.6
Avst	134	Avestan	right-to-left script	Avestan	5.2	61	Ancient/historic	Ch 10.7
Bali	360	Balinese	left-to-right	Balinese	5.0	124		Ch 17.3
Bamu	435	Bamum	left-to-right	Bamum	5.2	657		Ch 19.6
Bass	259	Bassa Vah	left-to-right	Bassa Vah	7.0	36	Ancient/historic	Ch 19.7
Batk	365	Batak	left-to-right	Batak	6.0	56		Ch 17.6
Beng	325	Bengali (Bangla)	left-to-right	Bengali	1.0	96		Ch 12.2
Bhks	334	Bhaiksuki	left-to-right	Bhaiksuki	9.0	97	Ancient/historic	Ch 14.3
Blis	550	Blissymbols	varies	Not in Unicode, proposal is explored^[i]
Bopo	285	Bopomofo	left-to-right, right-to-left script	Bopomofo	1.0	77		Ch 18.3
Brah	300	Brahmi	left-to-right	Brahmi	6.0	115	Ancient/historic	Ch 14.1
Brai	570	Braille	left-to-right	Braille	3.0	256		Ch 21.1
Bugi	367	Buginese	left-to-right	Buginese	4.1	30		Ch 17.2
Buhd	372	Buhid	left-to-right	Buhid	3.2	20		Ch 17.1
Cakm	349	Chakma	left-to-right	Chakma	6.1	71		Ch 13.11
Cans	440	Unified Canadian Aboriginal Syllabics	left-to-right	Canadian Aboriginal	3.0	726		Ch 20.2
Cari	201	Carian	left-to-right, right-to-left script	Carian	5.1	49	Ancient/historic	Ch 8.5
Cham	358	Cham	left-to-right	Cham	5.1	83		Ch 16.10
Cher	445	Cherokee	left-to-right	Cherokee	3.0	172		Ch 20.1
Chis	298	Chisoi	left-to-right	Not in Unicode, proposal is mature^[ii]
Chrs	109	Chorasmian	right-to-left script, top-to-bottom	Chorasmian	13.0	28	Ancient/historic	Ch 10.8
Cirt	291	Cirth	varies	Not in Unicode
Copt	204	Coptic	left-to-right	Coptic	1.0	137	Ancient/historic, disunified from Greek in 4.1	Ch 7.3
Cpmn	402	Cypro-Minoan	left-to-right	Cypro Minoan	14.0	99	Ancient/historic	Ch 8.4
Cprt	403	Cypriot syllabary	right-to-left script	Cypriot	4.0	55	Ancient/historic	Ch 8.3
Cyrl	220	Cyrillic	left-to-right	Cyrillic	1.0	506	Includes typographic variant Old Church Slavonic (see § Cyrs)	Ch 7.4
Cyrs	221	Cyrillic (Old Church Slavonic variant)	varies	Typographic variant of Cyrillic (see § Cyrl); Ancient/historic
Deva	315	Devanagari (Nagari)	left-to-right	Devanagari	1.0	164		Ch 12.1
Diak	342	Dives Akuru	left-to-right	Dives Akuru	13.0	72	Ancient/historic	Ch 15.15
Dogr	328	Dogra	left-to-right	Dogra	11.0	60	Ancient/historic	Ch 15.18
Dsrt	250	Deseret (Mormon)	left-to-right	Deseret	3.1	80		Ch 20.4
Dupl	755	Duployan shorthand, Duployan stenography	left-to-right	Duployan	7.0	143		Ch 21.6
Egyd	070	Egyptian demotic	mixed	Not in Unicode
Egyh	060	Egyptian hieratic	mixed	Not in Unicode
Egyp	050	Egyptian hieroglyphs	right-to-left script, left-to-right	Egyptian Hieroglyphs	5.2	1,110	Ancient/historic	Ch 11.4
Elba	226	Elbasan	left-to-right	Elbasan	7.0	40	Ancient/historic	Ch 8.10
Elym	128	Elymaic	right-to-left script	Elymaic	12.0	23	Ancient/historic	Ch 10.9
Ethi	430	Ethiopic (Geʻez)	left-to-right	Ethiopic	3.0	523		Ch 19.1
Gara	164	Garay	right-to-left	Not in Unicode, approved for version 16.0^[iii]
Geok	241	Khutsuri (Asomtavruli and Nuskhuri)	left-to-right	Georgian			Unicode groups Khutsuri, Asomtavruli and Nuskhuri into 'Georgian' (see § Geok). Similarly, Mkhedruli and Mtavruli are 'Georgian' (see § Geor)	Ch 7.7
Geor	240	Georgian (Mkhedruli and Mtavruli)	left-to-right	Georgian	1.0	173	In Unicode this also includes Nuskhuri (Geok)	Ch 7.7
Glag	225	Glagolitic	left-to-right	Glagolitic	4.1	134	Ancient/historic	Ch 7.5
Gong	312	Gunjala Gondi	left-to-right	Gunjala Gondi	11.0	63		Ch 13.15
Gonm	313	Masaram Gondi	left-to-right	Masaram Gondi	10.0	75		Ch 13.14
Goth	206	Gothic	left-to-right	Gothic	3.1	27	Ancient/historic	Ch 8.9
Gran	343	Grantha	left-to-right	Grantha	7.0	85	Ancient/historic	Ch 15.14
Grek	200	Greek	left-to-right	Greek	1.0	518	Directionality sometimes as boustrophedon	Ch 7.2
Gujr	320	Gujarati	left-to-right	Gujarati	1.0	91		Ch 12.4
Gukh	397	Gurung Khema	left-to-right	Not in Unicode, approved for version 16.0^[iii]
Guru	310	Gurmukhi	left-to-right	Gurmukhi	1.0	80		Ch 12.3
Hanb	503	Han with Bopomofo (alias for Han + Bopomofo)	mixed	See § Hani, § Bopo
Hang	286	Hangul (Hangŭl, Hangeul)	left-to-right, vertical right-to-left	Hangul	1.0	11,739	Hangul syllables relocated in 2.0	Ch 18.6
Hani	500	Han (Hanzi, Kanji, Hanja)	top-to-bottom, columns right-to-left (historically)	Han	1.0	99,030		Ch 18.1
Hano	371	Hanunoo (Hanunóo)	left-to-right, bottom-to-top	Hanunoo	3.2	21		Ch 17.1
Hans	501	Han (Simplified variant)	varies	Subset of Han (Hanzi, Kanji, Hanja) (see § Hani)
Hant	502	Han (Traditional variant)	varies	Subset of § Hani
Hatr	127	Hatran	right-to-left script	Hatran	8.0	26	Ancient/historic	Ch 10.12
Hebr	125	Hebrew	right-to-left script	Hebrew	1.0	134		Ch 9.1
Hira	410	Hiragana	vertical right-to-left, left-to-right	Hiragana	1.0	381		Ch 18.4
Hluw	080	Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs)	left-to-right	Anatolian Hieroglyphs	8.0	583	Ancient/historic	Ch 11.6
Hmng	450	Pahawh Hmong	left-to-right	Pahawh Hmong	7.0	127		Ch 16.11
Hmnp	451	Nyiakeng Puachue Hmong	left-to-right	Nyiakeng Puachue Hmong	12.0	71		Ch 16.12
Hrkt	412	Japanese syllabaries (alias for Hiragana + Katakana)	vertical right-to-left, left-to-right	Katakana or Hiragana			See § Hira, § Kana	Ch 18.4
Hung	176	Old Hungarian (Hungarian Runic)	right-to-left script	Old Hungarian	8.0	108	Ancient/historic	Ch 8.8
Inds	610	Indus (Harappan)	mixed	Not in Unicode, proposal is explored^[i]
Ital	210	Old Italic (Etruscan, Oscan, etc.)	right-to-left script, left-to-right	Old Italic	3.1	39	Ancient/historic	Ch 8.6
Jamo	284	Jamo (alias for Jamo subset of Hangul)	varies	Subset of § Hang
Java	361	Javanese	left-to-right	Javanese	5.2	90		Ch 17.4
Jpan	413	Japanese (alias for Han + Hiragana + Katakana)	varies	See § Hani, § Hira and § Kana
Jurc	510	Jurchen	left-to-right	Not in Unicode
Kali	357	Kayah Li	left-to-right	Kayah Li	5.1	47		Ch 16.9
Kana	411	Katakana	vertical right-to-left, left-to-right	Katakana	1.0	321		Ch 18.4
Kawi	368	Kawi	left-to-right	Kawi	15.0	86	Ancient/historic	Ch 17.9
Khar	305	Kharoshthi	right-to-left script	Kharoshthi	4.1	68	Ancient/historic	Ch 14.2
Khmr	355	Khmer	left-to-right	Khmer	3.0	146		Ch 16.4
Khoj	322	Khojki	left-to-right	Khojki	7.0	65	Ancient/historic	Ch 15.7
Kitl	505	Khitan large script	left-to-right	Not in Unicode
Kits	288	Khitan small script	vertical right-to-left	Khitan Small Script	13.0	471	Ancient/historic	Ch 18.12
Knda	345	Kannada	left-to-right	Kannada	1.0	91		Ch 12.8
Kore	287	Korean (alias for Hangul + Han)	left-to-right	See § Hani, § Hang
Kpel	436	Kpelle	left-to-right	Not in Unicode, proposal is explored^[i]
Krai	396	Kirat Rai	left-to-right	Not in Unicode, approved for version 16.0^[iii]
Kthi	317	Kaithi	left-to-right	Kaithi	5.2	68	Ancient/historic	Ch 15.2
Lana	351	Tai Tham (Lanna)	left-to-right	Tai Tham	5.2	127		Ch 16.7
Laoo	356	Lao	left-to-right	Lao	1.0	83		Ch 16.2
Latf	217	Latin (Fraktur variant)	varies	Typographic variant of Latin (see § Latn)
Latg	216	Latin (Gaelic variant)	left-to-right	Typographic variant of Latin (see § Latn)
Latn	215	Latin	left-to-right	Latin	1.0	1,481	See also: Latin script in Unicode	Ch 7.1
Leke	364	Leke	left-to-right	Not in Unicode
Lepc	335	Lepcha (Róng)	left-to-right	Lepcha	5.1	74		Ch 13.12
Limb	336	Limbu	left-to-right	Limbu	4.0	68		Ch 13.6
Lina	400	Linear A	left-to-right	Linear A	7.0	341	Ancient/historic	Ch 8.1
Linb	401	Linear B	left-to-right	Linear B	4.0	211	Ancient/historic	Ch 8.2
Lisu	399	Lisu (Fraser)	left-to-right	Lisu	5.2	49		Ch 18.9
Loma	437	Loma	left-to-right	Not in Unicode, proposal is explored^[i]
Lyci	202	Lycian	left-to-right	Lycian	5.1	29	Ancient/historic	Ch 8.5
Lydi	116	Lydian	right-to-left script	Lydian	5.1	27	Ancient/historic	Ch 8.5
Mahj	314	Mahajani	left-to-right	Mahajani	7.0	39	Ancient/historic	Ch 15.6
Maka	366	Makasar	left-to-right	Makasar	11.0	25	Ancient/historic	Ch 17.8
Mand	140	Mandaic, Mandaean	right-to-left script	Mandaic	6.0	29		Ch 9.5
Mani	139	Manichaean	right-to-left script	Manichaean	7.0	51	Ancient/historic	Ch 10.5
Marc	332	Marchen	left-to-right	Marchen	9.0	68	Ancient/historic	Ch 14.5
Maya	090	Mayan hieroglyphs	mixed	Not in Unicode
Medf	265	Medefaidrin (Oberi Okaime, Oberi Ɔkaimɛ)	left-to-right	Medefaidrin	11.0	91		Ch 19.10
Mend	438	Mende Kikakui	right-to-left script	Mende Kikakui	7.0	213		Ch 19.8
Merc	101	Meroitic Cursive	right-to-left script	Meroitic Cursive	6.1	90	Ancient/historic	Ch 11.5
Mero	100	Meroitic Hieroglyphs	right-to-left script	Meroitic Hieroglyphs	6.1	32	Ancient/historic	Ch 11.5
Mlym	347	Malayalam	left-to-right	Malayalam	1.0	118		Ch 12.9
Modi	324	Modi, Moḍī	left-to-right	Modi	7.0	79	Ancient/historic	Ch 15.12
Mong	145	Mongolian	vertical left-to-right, left-to-right	Mongolian	3.0	168	Mong includes Clear and Manchu scripts	Ch 13.5
Moon	218	Moon (Moon code, Moon script, Moon type)	mixed	Not in Unicode, proposal is explored^[i]
Mroo	264	Mro, Mru	left-to-right	Mro	7.0	43		Ch 13.8
Mtei	337	Meitei Mayek (Meithei, Meetei)	left-to-right	Meetei Mayek	5.2	79		Ch 13.7
Mult	323	Multani	left-to-right	Multani	8.0	38	Ancient/historic	Ch 15.10
Mymr	350	Myanmar (Burmese)	left-to-right	Myanmar	3.0	223		Ch 16.3
Nagm	295	Nag Mundari	left-to-right	Nag Mundari	15.0	42
Nand	311	Nandinagari	left-to-right	Nandinagari	12.0	65	Ancient/historic	Ch 15.13
Narb	106	Old North Arabian (Ancient North Arabian)	right-to-left script	Old North Arabian	7.0	32	Ancient/historic	Ch 10.1
Nbat	159	Nabataean	right-to-left script	Nabataean	7.0	40	Ancient/historic	Ch 10.10
Newa	333	Newa, Newar, Newari, Nepāla lipi	left-to-right	Newa	9.0	97		Ch 13.3
Nkdb	085	Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba)	left-to-right	Not in Unicode
Nkgb	420	Naxi Geba (na²¹ɕi³³ gʌ²¹ba²¹, 'Na-'Khi ²Ggŏ-¹baw, Nakhi Geba)	left-to-right	Not in Unicode, proposal is explored^[i]
Nkoo	165	N’Ko	right-to-left script	NKo	5.0	62		Ch 19.4
Nshu	499	Nüshu	vertical right-to-left	Nushu	10.0	397		Ch 18.8
Ogam	212	Ogham	bottom-to-top, left-to-right	Ogham	3.0	29	Ancient/historic	Ch 8.14
Olck	261	Ol Chiki (Ol Cemet’, Ol, Santali)	left-to-right	Ol Chiki	5.1	48		Ch 13.10
Onao	296	Ol Onal	left-to-right	Not in Unicode, approved for version 16.0^[iii]
Orkh	175	Old Turkic, Orkhon Runic	right-to-left script	Old Turkic	5.2	73	Ancient/historic	Ch 14.8
Orya	327	Oriya (Odia)	left-to-right	Oriya	1.0	91		Ch 12.5
Osge	219	Osage	left-to-right	Osage	9.0	72		Ch 20.3
Osma	260	Osmanya	left-to-right	Osmanya	4.0	40		Ch 19.2
Ougr	143	Old Uyghur	mixed	Old Uyghur	14.0	26	Ancient/historic	Ch 14.11
Palm	126	Palmyrene	right-to-left script	Palmyrene	7.0	32	Ancient/historic	Ch 10.11
Pauc	263	Pau Cin Hau	left-to-right	Pau Cin Hau	7.0	57		Ch 16.13
Pcun	015	Proto-Cuneiform	left-to-right	Not in Unicode
Pelm	016	Proto-Elamite	left-to-right	Not in Unicode
Perm	227	Old Permic	left-to-right	Old Permic	7.0	43	Ancient/historic	Ch 8.13
Phag	331	Phags-pa	vertical left-to-right	Phags-pa	5.0	56	Ancient/historic	Ch 14.4
Phli	131	Inscriptional Pahlavi	right-to-left script	Inscriptional Pahlavi	5.2	27	Ancient/historic	Ch 10.6
Phlp	132	Psalter Pahlavi	right-to-left script	Psalter Pahlavi	7.0	29	Ancient/historic	Ch 10.6
Phlv	133	Book Pahlavi	mixed	Not in Unicode
Phnx	115	Phoenician	right-to-left script	Phoenician	5.0	29	Ancient/historic^[g]	Ch 10.3
Piqd	293	Klingon (KLI pIqaD)	left-to-right	Rejected for inclusion in Unicode^[iv]^[v]
Plrd	282	Miao (Pollard)	left-to-right	Miao	6.1	149		Ch 18.10
Prti	130	Inscriptional Parthian	right-to-left script	Inscriptional Parthian	5.2	30	Ancient/historic	Ch 10.6
Psin	103	Proto-Sinaitic	mixed	Not in Unicode
Qaaa-Qabx	900-949	Reserved for private use (range)		Not in Unicode
Ranj	303	Ranjana	left-to-right	Not in Unicode
Rjng	363	Rejang (Redjang, Kaganga)	left-to-right	Rejang	5.1	37		Ch 17.5
Rohg	167	Hanifi Rohingya	right-to-left script	Hanifi Rohingya	11.0	50		Ch 16.14
Roro	620	Rongorongo	mixed	Not in Unicode, proposal is explored^[i]
Runr	211	Runic	left-to-right, boustrophedon	Runic	3.0	86	Ancient/historic	Ch 8.7
Samr	123	Samaritan	right-to-left script, top-to-bottom	Samaritan	5.2	61		Ch 9.4
Sara	292	Sarati	mixed	Not in Unicode
Sarb	105	Old South Arabian	right-to-left script	Old South Arabian	5.2	32	Ancient/historic	Ch 10.2
Saur	344	Saurashtra	left-to-right	Saurashtra	5.1	82		Ch 13.13
Sgnw	095	SignWriting	vertical left-to-right	SignWriting	8.0	672		Ch 21.7
Shaw	281	Shavian (Shaw)	left-to-right	Shavian	4.0	48		Ch 8.15
Shrd	319	Sharada, Śāradā	left-to-right	Sharada	6.1	96		Ch 15.3
Shui	530	Shuishu	left-to-right	Not in Unicode
Sidd	302	Siddham, Siddhaṃ, Siddhamātṛkā	left-to-right	Siddham	7.0	92	Ancient/historic	Ch 15.5
Sidt	180	Sidetic	right-to-left	Not in Unicode, proposal is mature^[ii]
Sind	318	Khudawadi, Sindhi	left-to-right	Khudawadi	7.0	69		Ch 15.9
Sinh	348	Sinhala	left-to-right	Sinhala	3.0	111		Ch 13.2
Sogd	141	Sogdian	horizontal and vertical writing in East Asian scripts, top-to-bottom	Sogdian	11.0	42	Ancient/historic	Ch 14.10
Sogo	142	Old Sogdian	right-to-left script	Old Sogdian	11.0	40	Ancient/historic	Ch 14.9
Sora	398	Sora Sompeng	left-to-right	Sora Sompeng	6.1	35		Ch 15.17
Soyo	329	Soyombo	left-to-right	Soyombo	10.0	83	Ancient/historic	Ch 14.7
Sund	362	Sundanese	left-to-right	Sundanese	5.1	72		Ch 17.7
Sunu	274	Sunuwar	left-to-right	Not in Unicode, approved for version 16.0^[iii]
Sylo	316	Syloti Nagri	left-to-right	Syloti Nagri	4.1	45	Ancient/historic	Ch 15.1
Syrc	135	Syriac	right-to-left script	Syriac	3.0	88	Includes typographic variants Estrangelo (see § Syre), Western (§ Syrj), and Eastern (§ Syrn)	Ch 9.3
Syre	138	Syriac (Estrangelo variant)	mixed	Typographic variant of Syriac (see § Syrc)
Syrj	137	Syriac (Western variant)	mixed	Typographic variant of Syriac (see § Syrc)
Syrn	136	Syriac (Eastern variant)	mixed	Typographic variant of Syriac (see § Syrc)
Tagb	373	Tagbanwa	left-to-right	Tagbanwa	3.2	18		Ch 17.1
Takr	321	Takri, Ṭākrī, Ṭāṅkrī	left-to-right	Takri	6.1	68		Ch 15.4
Tale	353	Tai Le	left-to-right	Tai Le	4.0	35		Ch 16.5
Talu	354	New Tai Lue	left-to-right	New Tai Lue	4.1	83		Ch 16.6
Taml	346	Tamil	left-to-right	Tamil	1.0	123		Ch 12.6
Tang	520	Tangut	vertical right-to-left, left-to-right	Tangut	9.0	6,914	Ancient/historic	Ch 18.11
Tavt	359	Tai Viet	left-to-right	Tai Viet	5.2	72		Ch 16.8
Tayo	380	Tai Yo	top-to-bottom, columns right-to-left	Not in Unicode, proposal is mature^[ii]
Telu	340	Telugu	left-to-right	Telugu	1.0	100		Ch 12.7
Teng	290	Tengwar	left-to-right	Not in Unicode
Tfng	120	Tifinagh (Berber)	left-to-right, right-to-left script, top-to-bottom, bottom-to-top	Tifinagh	4.1	59		Ch 19.3
Tglg	370	Tagalog (Baybayin, Alibata)	left-to-right	Tagalog	3.2	23		Ch 17.1
Thaa	170	Thaana	right-to-left script	Thaana	3.0	50		Ch 13.1
Thai	352	Thai	left-to-right	Thai	1.0	86		Ch 16.1
Tibt	330	Tibetan	left-to-right	Tibetan	2.0	207	Added in 1.0, removed in 1.1 and reintroduced in 2.0	Ch 13.4
Tirh	326	Tirhuta	left-to-right	Tirhuta	7.0	82		Ch 15.11
Tnsa	275	Tangsa	left-to-right	Tangsa	14.0	89		Ch 13.18
Todr	229	Todhri	right-to-left	Not in Unicode, approved for version 16.0^[iii]
Tols	299	Tolong Siki	left-to-right	Not in Unicode, proposal is mature^[ii]
Toto	294	Toto	left-to-right	Toto	14.0	31		Ch 13.17
Tutg	341	Tulu-Tigalari	left-to-right	Not in Unicode, approved for version 16.0^[iii]
Ugar	040	Ugaritic	left-to-right	Ugaritic	4.0	31	Ancient/historic	Ch 11.2
Vaii	470	Vai	left-to-right	Vai	5.1	300		Ch 19.5
Visp	280	Visible Speech	left-to-right	Not in Unicode
Vith	228	Vithkuqi	left-to-right	Vithkuqi	14.0	70	Ancient/historic	Ch 8.12
Wara	262	Warang Citi (Varang Kshiti)	left-to-right	Warang Citi	7.0	84		Ch 13.9
Wcho	283	Wancho	left-to-right	Wancho	12.0	59		Ch 13.16
Wole	480	Woleai	mixed	Not in Unicode, proposal is explored^[i]
Xpeo	030	Old Persian	left-to-right	Old Persian	4.1	50	Ancient/historic	Ch 11.3
Xsux	020	Cuneiform, Sumero-Akkadian	left-to-right	Cuneiform	5.0	1,234	Ancient/historic	Ch 11.1
Yezi	192	Yezidi	right-to-left script	Yezidi	13.0	47	Ancient/historic	Ch 9.6
Yiii	460	Yi	left-to-right	Yi	3.0	1,220		Ch 18.7
Zanb	339	Zanabazar Square (Zanabazarin Dörböljin Useg, Xewtee Dörböljin Bicig, Horizontal Square Script)	left-to-right	Zanabazar Square	10.0	72	Ancient/historic	Ch 14.6
Zinh	994	Code for inherited script		Inherited		657
Zmth	995	Mathematical notation		Not a 'script' in Unicode
Zsym	996	Symbols		Not a 'script' in Unicode
Zsye	993	Symbols (emoji variant)		Not a 'script' in Unicode
Zxxx	997	Code for unwritten documents		Not a 'script' in Unicode
Zyyy	998	Code for undetermined script		Common		8,306
Zzzz	999	Code for uncoded script		Unknown		964,234	In Unicode: All other code points
Notes
^[a] ISO 15924 publications, as of 12 September 2023.
^[b] ISO 15924 normative text file, as of 12 September 2023.
^[c] ISO 15924 changes (including aliases for Unicode), as of 12 September 2023.
^[d] Unicode version 15.1.
^[e] Unicode charts.
^[f] Unicode uses the "Property Value Alias" (Alias) as the script name. These alias names are part of Unicode and are published informatively next to ISO 15924. An alias script name may be used in a character name: `Palm`, Palmyrene → U+10860 𐡠 PALMYRENE LETTER ALEPH.
^[g] In Unicode, the Phoenician script is intended for the representation of text in Paleo-Hebrew, Archaic Phoenician, Phoenician, Early Aramaic, Late Phoenician cursive, Phoenician papyri, Siloam Hebrew, Hebrew seals, Ammonite, Moabite, and Punic.^[vi]
References
^[i] "SEI List of Scripts Not Yet Encoded". Unicode Consortium. March 2023. Retrieved 2023-09-25.
^[ii] "Unicode Pipeline § Code Points Provisionally Assigned for Mature Proposals". Unicode Consortium. 2023-09-12. Retrieved 2023-09-25.
^[iii] "Unicode Pipeline § Approved for Publication in Version 16.0". Unicode Consortium. 2023-09-12. Retrieved 2023-09-25.
^[iv] Michael Everson (1997-09-18). "Proposal to encode Klingon in Plane 1 of ISO/IEC 10646-2".
^[v] The Unicode Consortium (2001-08-14). "Approved Minutes of the UTC 87 / L2 184 Joint Meeting".
^[vi] "Middle East-II, Ancient Scripts" (PDF). 15.0.0. The Unicode Consortium. Retrieved 2023-09-25.
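Because the table above is tab-separated, its rows can be read programmatically. The following is a minimal sketch, assuming the nine-column layout given in the header row; the class `ScriptRow` and the function `parse_row` are hypothetical names, and rows that carry no character count (for example, scripts not in Unicode) are simply skipped.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScriptRow:
    code: str            # ISO 15924 four-letter code
    iso_number: str      # kept as text to preserve leading zeros such as "070"
    name: str            # ISO formal name
    directionality: str
    alias: str           # Unicode alias
    version: str         # Unicode version in which the script was added
    characters: int      # number of characters with this script
    notes: str
    description: str     # chapter reference

def parse_row(line: str) -> Optional[ScriptRow]:
    """Parse one tab-separated table row; return None if it has no character count."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (9 - len(fields))       # pad rows with trailing empty cells
    count = fields[6].replace(",", "")       # "1,368" -> "1368"
    if not count.isdigit():
        return None
    return ScriptRow(*fields[:6], int(count), fields[7], fields[8])

# Two rows copied from the table above.
rows = [
    "Arab\t160\tArabic\tright-to-left script\tArabic\t1.0\t1,368\t\tCh 9.2",
    "Armn\t230\tArmenian\tleft-to-right\tArmenian\t1.0\t96\t\tCh 7.6",
]
for row in map(parse_row, rows):
    print(row.code, row.alias, row.characters)
```

Keeping the ISO number as a string is a deliberate choice here, since values such as 070 would otherwise lose their leading zero.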

Missing scripts in Unicode

With each new version of Unicode, new writing systems are added to the international character code. According to a statement by linguist Dr Deborah Anderson of UC Berkeley, there are over 100 writing systems that have not yet been included in Unicode.

The Missing Scripts project, run by the University of Applied Sciences Mainz, Germany, the ANRT Nancy, France, and UC Berkeley, USA, lists 294 known writing systems as of January 2022. Of these, 131 have not yet been encoded in Unicode and so cannot yet be used as standard text on a computer or mobile phone.

References

^[1] "Glossary". unicode.org.
^[2] "Unicode Character Database: Scripts". unicode.org.
^[3] "Chapter 14: Additional Ancient and Historic Scripts". The Unicode Standard, Version 15.0 (PDF). Mountain View, CA: Unicode, Inc. September 2022. ISBN 978-1-936213-32-0.
^[4] "Roadmaps to Unicode". https://www.unicode.org/roadmaps/
^[5] "UAX #24: Unicode Script Property". www.unicode.org.

External links

Script Encoding Initiative: a project at UC Berkeley, USA, working to get more scripts included in the Unicode standard.
The World's Writing Systems: an overview of all 294 known writing systems, each with a typographic reference glyph and its Unicode status.
