# ArabicShaping-5.2.0.txt
# Date: 2009-08-17, 11:11:00 PDT [KW]
#
# This file is a normative contributory data file in the
# Unicode Character Database.
#
# Copyright (c) 1991-2009 Unicode, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# This file defines the shaping classes for Arabic, Syriac, and N'Ko
# positional shaping, repeating in machine-readable form the
# information exemplified in Tables 8-3, 8-7, 8-8, 8-11, 8-12,
# 8-13, and 13-5 of The Unicode Standard, Version 5.2.
#
# See sections 8.2, 8.3, and 13.5 of The Unicode Standard, Version 5.2
# for more information.
#
# Each line contains four fields, separated by a semicolon.
#
# Field 0: the code point, in 4-digit hexadecimal
#   form, of an Arabic, Syriac, or N'Ko character.
#
# Field 1: gives a short schematic name for that character,
#   abbreviated from the normative Unicode character name.
#
# Field 2: defines the joining type (property name: Joining_Type)
#   R Right_Joining
#   L Left_Joining
#   D Dual_Joining
#   C Join_Causing
#   U Non_Joining
#   T Transparent
#   See Section 8.2, Arabic for more information on these types.
#
# Field 3: defines the joining group (property name: Joining_Group)
#
# The values of the joining group are based schematically on character
# names. Where a schematic character name consists of two or more parts separated
# by spaces, the formal Joining_Group property value, as specified in
# PropertyValueAliases.txt, consists of the same name parts joined by
# underscores. Hence, the entry:
#
#   0629; TEH MARBUTA; R; TEH MARBUTA
#
# corresponds to [Joining_Group = Teh_Marbuta].
#
# Note: For historical reasons, the property value [Joining_Group = Hamza_On_Heh_Goal]
# is anachronistically named. It used to apply to both of the following characters
# in earlier versions of the standard:
#
#   U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
#   U+06C3 ARABIC LETTER TEH MARBUTA GOAL
#
# However, it currently applies only to U+06C3, and *not* to U+06C2.
# To avoid destabilizing existing Joining_Group property aliases, the
# value Hamza_On_Heh_Goal has not been changed, despite the fact that it
# no longer applies to Hamza On Heh Goal, but only to Teh Marbuta Goal.
#
# When other cursive scripts are added to the Unicode Standard in
# the future, the joining group value of all their letters will default
# to jg=No_Joining_Group in this data file. Other, more specific
# joining group values will be defined only if an explicit proposal
# to define those values exactly has been approved by the UTC. This
# is the convention exemplified by the N'Ko script. Only the Arabic
# and Syriac scripts currently have explicit joining group values defined.
#
# Note: Code points that are not explicitly listed in this file are
# either of joining type T or U:
#
# - Those that are not explicitly listed and are of General Category Mn, Me, or Cf
#   have joining type T.
# - All others not explicitly listed have joining type U.
#
# For an explicit listing of characters of joining type T, see
# the derived property file DerivedJoiningType.txt.
#
# There are currently no characters of joining type L defined in Unicode.
#
# #############################################################
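The four-field layout described in the header can be read with a few lines of code. The following is a minimal Python sketch, not part of the Unicode data file: the names `ShapingEntry`, `joining_group_alias`, and `parse_arabic_shaping` are hypothetical, and the capitalization rule for the underscore form is an assumption inferred from the single `Teh_Marbuta` example above (PropertyValueAliases.txt remains the authority for the formal values).

```python
from typing import Iterable, Iterator, NamedTuple


class ShapingEntry(NamedTuple):
    code_point: int       # Field 0, parsed from hexadecimal
    schematic_name: str   # Field 1
    joining_type: str     # Field 2: one of R, L, D, C, U, T
    joining_group: str    # Field 3, converted to the underscore form


JOINING_TYPES = {"R", "L", "D", "C", "U", "T"}


def joining_group_alias(schematic: str) -> str:
    """Join the parts of a schematic group name with underscores.

    Assumption: each part is capitalized, as in 'TEH MARBUTA' -> 'Teh_Marbuta'.
    Values that already use underscores (No_Joining_Group) pass through unchanged.
    """
    words = schematic.replace("_", " ").split()
    return "_".join(word.capitalize() for word in words)


def parse_arabic_shaping(lines: Iterable[str]) -> Iterator[ShapingEntry]:
    """Yield one entry per data line, skipping comments and blank lines."""
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop '#' comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) != 4 or fields[2] not in JOINING_TYPES:
            raise ValueError(f"malformed entry: {raw!r}")
        code_point, name, jtype, jgroup = fields
        yield ShapingEntry(int(code_point, 16), name, jtype,
                           joining_group_alias(jgroup))
```

Passing an open file object (or any iterable of lines) to `parse_arabic_shaping` yields the entries lazily; for the header's example line, the resulting entry has `joining_group == "Teh_Marbuta"`.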

# Unicode; Schematic Name; Joining Type; Joining Group

# Arabic characters

0600; ARABIC NUMBER SIGN; U; No_Joining_Group
0601; ARABIC SIGN SANAH; U; No_Joining_Group
0602; ARABIC FOOTNOTE MARKER; U; No_Joining_Group
0603; ARABIC SIGN SAFHA; U; No_Joining_Group
0608; ARABIC RAY; U; No_Joining_Group
060B; AFGHANI SIGN; U; No_Joining_Group
0621; HAMZA; U; No_Joining_Group
0622; MADDA ON ALEF; R; ALEF
0623; HAMZA ON ALEF; R; ALEF
0624; HAMZA ON WAW; R; WAW
0625; HAMZA UNDER ALEF; R; ALEF
0626; HAMZA ON YEH; D; YEH
0627; ALEF; R; ALEF
0628; BEH; D; BEH
0629; TEH MARBUTA; R; TEH MARBUTA
062A; TEH; D; BEH
062B; THEH; D; BEH
062C; JEEM; D; HAH
062D; HAH; D; HAH
062E; KHAH; D; HAH
062F; DAL; R; DAL
0630; THAL; R; DAL
0631; REH; R; REH
0632; ZAIN; R; REH
0633; SEEN; D; SEEN
0634; SHEEN; D; SEEN
0635; SAD; D; SAD
0636; DAD; D; SAD
0637; TAH; D; TAH
0638; ZAH; D; TAH
0639; AIN; D; AIN
063A; GHAIN; D; AIN
063B; KEHEH WITH 2 DOTS ABOVE; D; GAF
063C; KEHEH WITH 3 DOTS BELOW; D; GAF
063D; FARSI YEH WITH INVERTED V; D; FARSI YEH
063E; FARSI YEH WITH 2 DOTS ABOVE; D; FARSI YEH
063F; FARSI YEH WITH 3 DOTS ABOVE; D; FARSI YEH
0640; TATWEEL; C; No_Joining_Group
0641; FEH; D; FEH
0642; QAF; D; QAF
0643; KAF; D; KAF
0644; LAM; D; LAM
0645; MEEM; D; MEEM
0646; NOON; D; NOON
0647; HEH; D; HEH
0648; WAW; R; WAW
0649; ALEF MAKSURA; D; YEH
064A; YEH; D; YEH
066E; DOTLESS BEH; D; BEH
066F; DOTLESS QAF; D; QAF
0671; HAMZAT WASL ON ALEF; R; ALEF
0672; WAVY HAMZA ON ALEF; R; ALEF
0673; WAVY HAMZA UNDER ALEF; R; ALEF
0674; HIGH HAMZA; U; No_Joining_Group
0675; HIGH HAMZA ALEF; R; ALEF
0676; HIGH HAMZA WAW; R; WAW
0677; HIGH HAMZA WAW WITH DAMMA; R; WAW
0678; HIGH HAMZA YEH; D; YEH
0679; TEH WITH SMALL TAH; D; BEH
067A; TEH WITH 2 DOTS VERTICAL ABOVE; D; BEH
067B; BEH WITH 2 DOTS VERTICAL BELOW; D; BEH
067C; TEH WITH RING; D; BEH
067D; TEH WITH 3 DOTS ABOVE DOWNWARD; D; BEH
067E; TEH WITH 3 DOTS BELOW; D; BEH
067F; TEH WITH 4 DOTS ABOVE; D; BEH
0680; BEH WITH 4 DOTS BELOW; D; BEH
0681; HAMZA ON HAH; D; HAH
0682; HAH WITH 2 DOTS VERTICAL ABOVE; D; HAH
0683; HAH WITH MIDDLE 2 DOTS; D; HAH
0684; HAH WITH MIDDLE 2 DOTS VERTICAL; D; HAH
0685; HAH WITH 3 DOTS ABOVE; D; HAH
0686; HAH WITH MIDDLE 3 DOTS DOWNWARD; D; HAH
0687; HAH WITH MIDDLE 4 DOTS; D; HAH
0688; DAL WITH SMALL TAH; R; DAL
0689; DAL WITH RING; R; DAL
068A; DAL WITH DOT BELOW; R; DAL
068B; DAL WITH DOT BELOW AND SMALL TAH; R; DAL
068C; DAL WITH 2 DOTS ABOVE; R; DAL
068D; DAL WITH 2 DOTS BELOW; R; DAL
068E; DAL WITH 3 DOTS ABOVE; R; DAL
068F; DAL WITH 3 DOTS ABOVE DOWNWARD; R; DAL
0690; DAL WITH 4 DOTS ABOVE; R; DAL
0691; REH WITH SMALL TAH; R; REH
0692; REH WITH SMALL V; R; REH
0693; REH WITH RING; R; REH
0694; REH WITH DOT BELOW; R; REH
0695; REH WITH SMALL V BELOW; R; REH
0696; REH WITH DOT BELOW AND DOT ABOVE; R; REH
0697; REH WITH 2 DOTS ABOVE; R; REH
0698; REH WITH 3 DOTS ABOVE; R; REH
0699; REH WITH 4 DOTS ABOVE; R; REH
069A; SEEN WITH DOT BELOW AND DOT ABOVE; D; SEEN
069B; SEEN WITH 3 DOTS BELOW; D; SEEN
069C; SEEN WITH 3 DOTS BELOW AND 3 DOTS ABOVE; D; SEEN
069D; SAD WITH 2 DOTS BELOW; D; SAD
069E; SAD WITH 3 DOTS ABOVE; D; SAD
069F; TAH WITH 3 DOTS ABOVE; D; TAH
06A0; AIN WITH 3 DOTS ABOVE; D; AIN
06A1; DOTLESS FEH; D; FEH
06A2; FEH WITH DOT MOVED BELOW; D; FEH
06A3; FEH WITH DOT BELOW; D; FEH
06A4; FEH WITH 3 DOTS ABOVE; D; FEH
06A5; FEH WITH 3 DOTS BELOW; D; FEH
06A6; FEH WITH 4 DOTS ABOVE; D; FEH
06A7; QAF WITH DOT ABOVE; D; QAF
06A8; QAF WITH 3 DOTS ABOVE; D; QAF
06A9; KEHEH; D; GAF
06AA; SWASH KAF; D; SWASH KAF
06AB; KAF WITH RING; D; GAF
06AC; KAF WITH DOT ABOVE; D; KAF
06AD; KAF WITH 3 DOTS ABOVE; D; KAF
06AE; KAF WITH 3 DOTS BELOW; D; KAF
06AF; GAF; D; GAF
06B0; GAF WITH RING; D; GAF
06B1; GAF WITH 2 DOTS ABOVE; D; GAF
06B2; GAF WITH 2 DOTS BELOW; D; GAF
06B3; GAF WITH 2 DOTS VERTICAL BELOW; D; GAF
06B4; GAF WITH 3 DOTS ABOVE; D; GAF
06B5; LAM WITH SMALL V; D; LAM
06B6; LAM WITH DOT ABOVE; D; LAM
06B7; LAM WITH 3 DOTS ABOVE; D; LAM
06B8; LAM WITH 3 DOTS BELOW; D; LAM
06B9; NOON WITH DOT BELOW; D; NOON
06BA; DOTLESS NOON; D; NOON
06BB; DOTLESS NOON WITH SMALL TAH; D; NOON
06BC; NOON WITH RING; D; NOON
06BD; NYA; D; NYA
06BE; KNOTTED HEH; D; KNOTTED HEH
06BF; HAH WITH MIDDLE 3 DOTS DOWNWARD AND DOT ABOVE; D; HAH
06C0; HAMZA ON HEH; R; TEH MARBUTA
06C1; HEH GOAL; D; HEH GOAL
06C2; HAMZA ON HEH GOAL; D; HEH GOAL
06C3; TEH MARBUTA GOAL; R; HAMZA ON HEH GOAL
06C4; WAW WITH RING; R; WAW
06C5; WAW WITH BAR; R; WAW
06C6; WAW WITH SMALL V; R; WAW
06C7; WAW WITH DAMMA; R; WAW
06C8; WAW WITH ALEF ABOVE; R; WAW
06C9; WAW WITH INVERTED SMALL V; R; WAW
06CA; WAW WITH 2 DOTS ABOVE; R; WAW
06CB; WAW WITH 3 DOTS ABOVE; R; WAW
06CC; FARSI YEH; D; FARSI YEH
06CD; YEH WITH TAIL; R; YEH WITH TAIL
06CE; FARSI YEH WITH SMALL V; D; FARSI YEH
06CF; WAW WITH DOT ABOVE; R; WAW
06D0; YEH WITH 2 DOTS VERTICAL BELOW; D; YEH
06D1; YEH WITH 3 DOTS BELOW; D; YEH
06D2; YEH BARREE; R; YEH BARREE
06D3; HAMZA ON YEH BARREE; R; YEH BARREE
06D5; AE; R; TEH MARBUTA
06DD; ARABIC END OF AYAH; U; No_Joining_Group
06EE; DAL WITH INVERTED V; R; DAL
06EF; REH WITH INVERTED V; R; REH
06FA; SEEN WITH DOT BELOW AND 3 DOTS ABOVE; D; SEEN
06FB; DAD WITH DOT BELOW; D; SAD
06FC; GHAIN WITH DOT BELOW; D; AIN
06FF; HEH WITH INVERTED V; D; KNOTTED HEH

# Syriac characters

0710; ALAPH; R; ALAPH
0712; BETH; D; BETH
0713; GAMAL; D; GAMAL
0714; GAMAL GARSHUNI; D; GAMAL
0715; DALATH; R; DALATH RISH
0716; DOTLESS DALATH RISH; R; DALATH RISH
0717; HE; R; HE
0718; WAW; R; SYRIAC WAW
0719; ZAIN; R; ZAIN
071A; HETH; D; HETH
071B; TETH; D; TETH
071C; TETH GARSHUNI; D; TETH
071D; YUDH; D; YUDH
071E; YUDH HE; R; YUDH HE
071F; KAPH; D; KAPH
0720; LAMADH; D; LAMADH
0721; MIM; D; MIM
0722; NUN; D; NUN
0723; SEMKATH; D; SEMKATH
0724; FINAL SEMKATH; D; FINAL SEMKATH
0725; E; D; E
0726; PE; D; PE
0727; REVERSED PE; D; REVERSED PE
0728; SADHE; R; SADHE
0729; QAPH; D; QAPH
072A; RISH; R; DALATH RISH
072B; SHIN; D; SHIN
072C; TAW; R; TAW
072D; PERSIAN BHETH; D; BETH
072E; PERSIAN GHAMAL; D; GAMAL
072F; PERSIAN DHALATH; R; DALATH RISH
074D; SOGDIAN ZHAIN; R; ZHAIN
074E; SOGDIAN KHAPH; D; KHAPH
074F; SOGDIAN FE; D; FE

# Arabic supplement characters

0750; BEH WITH 3 DOTS HORIZONTALLY BELOW; D; BEH
0751; BEH WITH DOT BELOW AND 3 DOTS ABOVE; D; BEH
0752; BEH WITH 3 DOTS POINTING UPWARDS BELOW; D; BEH
0753; BEH WITH 3 DOTS POINTING UPWARDS BELOW AND 2 DOTS ABOVE; D; BEH
0754; BEH WITH 2 DOTS BELOW AND DOT ABOVE; D; BEH
0755; BEH WITH INVERTED SMALL V BELOW; D; BEH
0756; BEH WITH SMALL V; D; BEH
0757; HAH WITH 2 DOTS ABOVE; D; HAH
0758; HAH WITH 3 DOTS POINTING UPWARDS BELOW; D; HAH
0759; DAL WITH 2 DOTS VERTICALLY BELOW AND SMALL TAH; R; DAL
075A; DAL WITH INVERTED SMALL V BELOW; R; DAL
075B; REH WITH STROKE; R; REH
075C; SEEN WITH 4 DOTS ABOVE; D; SEEN
075D; AIN WITH 2 DOTS ABOVE; D; AIN
075E; AIN WITH 3 DOTS POINTING DOWNWARDS ABOVE; D; AIN
075F; AIN WITH 2 DOTS VERTICALLY ABOVE; D; AIN
0760; FEH WITH 2 DOTS BELOW; D; FEH
0761; FEH WITH 3 DOTS POINTING UPWARDS BELOW; D; FEH
0762; KEHEH WITH DOT ABOVE; D; GAF
0763; KEHEH WITH 3 DOTS ABOVE; D; GAF
0764; KEHEH WITH 3 DOTS POINTING UPWARDS BELOW; D; GAF
0765; MEEM WITH DOT ABOVE; D; MEEM
0766; MEEM WITH DOT BELOW; D; MEEM
0767; NOON WITH 2 DOTS BELOW; D; NOON
0768; NOON WITH SMALL TAH; D; NOON
0769; NOON WITH SMALL V; D; NOON
076A; LAM WITH BAR; D; LAM
076B; REH WITH 2 DOTS VERTICALLY ABOVE; R; REH
076C; REH WITH HAMZA ABOVE; R; REH
076D; SEEN WITH 2 DOTS VERTICALLY ABOVE; D; SEEN
076E; HAH WITH SMALL TAH BELOW; D; HAH
076F; HAH WITH SMALL TAH AND 2 DOTS; D; HAH
0770; SEEN WITH SMALL TAH AND 2 DOTS; D; SEEN
0771; REH WITH SMALL TAH AND 2 DOTS; R; REH
0772; HAH WITH SMALL TAH ABOVE; D; HAH
0773; ALEF WITH DIGIT TWO ABOVE; R; ALEF
0774; ALEF WITH DIGIT THREE ABOVE; R; ALEF
0775; FARSI YEH WITH DIGIT TWO ABOVE; D; FARSI YEH
0776; FARSI YEH WITH DIGIT THREE ABOVE; D; FARSI YEH
0777; YEH WITH DIGIT FOUR BELOW; D; YEH
0778; WAW WITH DIGIT TWO ABOVE; R; WAW
0779; WAW WITH DIGIT THREE ABOVE; R; WAW
077A; YEH BARREE WITH DIGIT TWO ABOVE; D; BURUSHASKI YEH BARREE
077B; YEH BARREE WITH DIGIT THREE ABOVE; D; BURUSHASKI YEH BARREE
077C; HAH WITH DIGIT FOUR BELOW; D; HAH
077D; SEEN WITH DIGIT FOUR ABOVE; D; SEEN
077E; SEEN WITH INVERTED V; D; SEEN
077F; KAF WITH 2 DOTS ABOVE; D; KAF

# N'Ko Characters

07CA; NKO A; D; No_Joining_Group
07CB; NKO EE; D; No_Joining_Group
07CC; NKO I; D; No_Joining_Group
07CD; NKO E; D; No_Joining_Group
07CE; NKO U; D; No_Joining_Group
07CF; NKO OO; D; No_Joining_Group
07D0; NKO O; D; No_Joining_Group
07D1; NKO DAGBASINNA; D; No_Joining_Group
07D2; NKO N; D; No_Joining_Group
07D3; NKO BA; D; No_Joining_Group
07D4; NKO PA; D; No_Joining_Group
07D5; NKO TA; D; No_Joining_Group
07D6; NKO JA; D; No_Joining_Group
07D7; NKO CHA; D; No_Joining_Group
07D8; NKO DA; D; No_Joining_Group
07D9; NKO RA; D; No_Joining_Group
07DA; NKO RRA; D; No_Joining_Group
07DB; NKO SA; D; No_Joining_Group
07DC; NKO GBA; D; No_Joining_Group
07DD; NKO FA; D; No_Joining_Group
07DE; NKO KA; D; No_Joining_Group
07DF; NKO LA; D; No_Joining_Group
07E0; NKO NA WOLOSO; D; No_Joining_Group
07E1; NKO MA; D; No_Joining_Group
07E2; NKO NYA; D; No_Joining_Group
07E3; NKO NA; D; No_Joining_Group
07E4; NKO HA; D; No_Joining_Group
07E5; NKO WA; D; No_Joining_Group
07E6; NKO YA; D; No_Joining_Group
07E7; NKO NYA WOLOSO; D; No_Joining_Group
07E8; NKO JONA JA; D; No_Joining_Group
07E9; NKO JONA CHA; D; No_Joining_Group
07EA; NKO JONA RA; D; No_Joining_Group
07FA; NKO LAJANYALAN; C; No_Joining_Group

# Other

200C; ZERO WIDTH NON-JOINER; U; No_Joining_Group
200D; ZERO WIDTH JOINER; C; No_Joining_Group

# EOF