The *.txt files were copied from

    ftp://www.unicode.org/Public/UNIDATA

with subdirectories 'extracted' and 'auxiliary'

The Unihan files were not included due to space considerations. Also NOT
included were any *.html files and *Test.txt files. It is possible to add the
Unihan files, and edit mktables (see instructions near its beginning) to look
at them.

The file 'version' should exist and be a single line with the Unicode version,
like:
    5.2.0
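
The requirement on 'version' can be checked with a few lines of shell. This
is only a sketch: the helper name check_version_file is hypothetical, and the
three-part numeric pattern is an assumption drawn from the 5.2.0 example. It
is not part of mktables.

```shell
# check_version_file FILE (hypothetical helper)
# Succeeds only if FILE exists and holds exactly one line that looks like
# a dotted version such as 5.2.0. The N.N.N pattern is an assumption.
check_version_file() {
    [ -f "$1" ] || return 1
    awk 'NR == 1 && /^[0-9]+\.[0-9]+\.[0-9]+$/ { ok = 1 }
         END { exit (NR == 1 && ok) ? 0 : 1 }' "$1"
}

# Possible use, from the directory holding the data files:
#   check_version_file version || echo "version file looks wrong"
```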

To be 8.3 filesystem friendly, the names of some of the input files have been
changed from the values that are in the Unicode DB:

    mv PropertyValueAliases.txt PropValueAliases.txt
    mv NamedSequencesProv.txt NamedSqProv.txt
    mv DerivedAge.txt DAge.txt
    mv DerivedCoreProperties.txt DCoreProperties.txt
    mv DerivedNormalizationProps.txt DNormalizationProps.txt
    mv extracted/DerivedBidiClass.txt extracted/DBidiClass.txt
    mv extracted/DerivedBinaryProperties.txt extracted/DBinaryProperties.txt
    mv extracted/DerivedCombiningClass.txt extracted/DCombiningClass.txt
    mv extracted/DerivedDecompositionType.txt extracted/DDecompositionType.txt
    mv extracted/DerivedEastAsianWidth.txt extracted/DEastAsianWidth.txt
    mv extracted/DerivedGeneralCategory.txt extracted/DGeneralCategory.txt
    mv extracted/DerivedJoiningGroup.txt extracted/DJoinGroup.txt
    mv extracted/DerivedJoiningType.txt extracted/DJoinType.txt
    mv extracted/DerivedLineBreak.txt extracted/DLineBreak.txt
    mv extracted/DerivedNumericType.txt extracted/DNumType.txt
    mv extracted/DerivedNumericValues.txt extracted/DNumValues.txt

If you have the Unihan database (5.2 and above), you should also do the
following:

    mv Unihan_DictionaryIndices.txt UnihanIndicesDictionary.txt
    mv Unihan_DictionaryLikeData.txt UnihanDataDictionaryLike.txt
    mv Unihan_IRGSources.txt UnihanIRGSources.txt
    mv Unihan_NumericValues.txt UnihanNumericValues.txt
    mv Unihan_OtherMappings.txt UnihanOtherMappings.txt
    mv Unihan_RadicalStrokeCounts.txt UnihanRadicalStrokeCounts.txt
    mv Unihan_Readings.txt UnihanReadings.txt
    mv Unihan_Variants.txt UnihanVariants.txt
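
Because the Unihan files may or may not be present, a plain mv will fail on
any file that is missing. One way to make the renames safe to rerun is a small
wrapper. The name rename_if_present is hypothetical and this is only a sketch,
not something shipped with Perl.

```shell
# rename_if_present OLD NEW (hypothetical helper)
# Rename OLD to NEW only when OLD exists and NEW does not, so running
# the same list of renames twice does no harm.
rename_if_present() {
    if [ -f "$1" ] && [ ! -e "$2" ]; then
        mv "$1" "$2"
    fi
}

# Two of the Unihan renames from the list above, written with the helper:
rename_if_present Unihan_Readings.txt UnihanReadings.txt
rename_if_present Unihan_Variants.txt UnihanVariants.txt
```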

If you download everything, note that the commands above do not rename files
that mktables does not use, such as the test files. Those names will not work
correctly as-is on 8.3 filesystems.
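
To see which names in a directory would still clash, the following sketch
lists files whose names become identical once shortened. It assumes that
"8.3 friendly" means names stay distinct after truncation to eight characters
plus a three-character extension, compared without regard to case. That
reading is consistent with the renames above but is an assumption. The
function name find_83_collisions is hypothetical, and file names containing
spaces are not handled.

```shell
# find_83_collisions DIR (hypothetical helper)
# Print one line per group of files in DIR whose names collide after
# truncation to 8.3 form (assumed rule: first 8 characters of the base
# name, first 3 of the extension, case-insensitive).
find_83_collisions() {
    for path in "$1"/*; do
        [ -f "$path" ] || continue
        name=${path##*/}
        case $name in
            *.*) base=${name%.*}; ext=${name##*.} ;;
            *)   base=$name;      ext= ;;
        esac
        short=$(printf '%.8s.%.3s' "$base" "$ext" | tr '[:upper:]' '[:lower:]')
        printf '%s %s\n' "$short" "$name"
    done | sort | awk '
        { count[$1]++; names[$1] = names[$1] " " $2 }
        END { for (k in count) if (count[k] > 1) print k ":" names[k] }'
}

# Possible use:
#   find_83_collisions extracted
# A directory holding the unrenamed DerivedBidiClass.txt and
# DerivedBinaryProperties.txt would report both under derivedb.txt.
```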

mktables is used to generate the tables used by the rest of Perl. It will warn
you about any *.txt files in the directory substructure that it doesn't know
about. You should remove any files it identifies, or edit mktables to add them
to its lists of files to process. Alternatively, you can run

    mktables -globlist

to have it try to process these tables generically.

If any files are added, deleted, or renamed, you must run

    mktables -makelist

to generate a new list of all the files.

FOR PUMPKINS

The files are inter-related. If you take the latest UnicodeData.txt, for
example, but leave the older versions of other files, there can be subtle
problems.
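
A rough way to spot mixed versions is to compare the version named in each
file's first line against the 'version' file. This sketch assumes that a data
file begins with a comment such as "# DerivedAge-5.2.0.txt". Files without
such a header line are silently skipped, so a clean result is not proof that
everything matches. The function name check_ucd_versions is hypothetical.

```shell
# check_ucd_versions DIR (hypothetical helper)
# Report *.txt files in DIR and its immediate subdirectories whose first
# line names a different version than DIR/version. Assumes headers of the
# form "# Name-X.Y.Z.txt"; files without one are skipped.
check_ucd_versions() {
    dir=$1
    want=$(cat "$dir/version" 2>/dev/null) || return 1
    status=0
    for path in "$dir"/*.txt "$dir"/*/*.txt; do
        [ -f "$path" ] || continue
        found=$(sed -n '1s/^# .*-\([0-9][0-9.]*\)\.txt.*$/\1/p' "$path")
        [ -n "$found" ] || continue
        if [ "$found" != "$want" ]; then
            printf '%s: header says %s, expected %s\n' "$path" "$found" "$want"
            status=1
        fi
    done
    return $status
}

# Possible use, from the top of the Perl source tree:
#   check_ucd_versions lib/unicore
```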

When moving to a new version of Unicode, you need to update 'version' by hand:

    p4 edit version
    ...

You should look in the Unicode release notes (which are probably towards the
bottom of http://www.unicode.org/reports/tr44/) to see if any properties have
newly been moved to be Obsolete, Deprecated, or Stabilized. The full names for
these should be added to the respective lists near the beginning of mktables,
using an 'if' to add them for just this Unicode version going forward, so that
mktables can continue to be used for earlier Unicode versions.

When putting out a new Perl release, consider whether any of the Deprecated
properties should be moved to Suppressed.

The *.pl files are generated from the *.txt files by the mktables script.
This is now done during the Perl build process, but if you want to try
the old manual way:

    cd lib/unicore
    p4 edit *.pl */*.pl */*/*.pl
    perl ./mktables -P ../../pod -T ../../t/re/uniprops.t -makelist
    p4 revert -a
    cd ../..
    perl Porting/manicheck

If this indicates any new *.pl files (or deleted ones, which is unlikely but
not impossible):

    cd lib/unicore
    p4 add ...
    p4 delete ...
    cd ../..
    p4 edit MANIFEST
    ...

And finally:

    p4 submit

--
jhi@iki.fi; updated by nick@ccl4.org, public@khwilliamson.com