The *.txt files were copied from

    ftp://www.unicode.org/Public/UNIDATA

with subdirectories 'extracted' and 'auxiliary'.

The Unihan files were not included due to space considerations. Also NOT
included were any *.html files and *Test.txt files. It is possible to add the
Unihan files, and edit mktables (see instructions near its beginning) to look

The file 'version' should exist and be a single line with the Unicode version,

To be 8.3 filesystem friendly, the names of some of the input files have been
changed from the values that are in the Unicode DB:

    mv PropertyValueAliases.txt PropValueAliases.txt
    mv NamedSequencesProv.txt NamedSqProv.txt
    mv DerivedAge.txt DAge.txt
    mv DerivedCoreProperties.txt DCoreProperties.txt
    mv DerivedNormalizationProps.txt DNormalizationProps.txt
    mv extracted/DerivedBidiClass.txt extracted/DBidiClass.txt
    mv extracted/DerivedBinaryProperties.txt extracted/DBinaryProperties.txt
    mv extracted/DerivedCombiningClass.txt extracted/DCombiningClass.txt
    mv extracted/DerivedDecompositionType.txt extracted/DDecompositionType.txt
    mv extracted/DerivedEastAsianWidth.txt extracted/DEastAsianWidth.txt
    mv extracted/DerivedGeneralCategory.txt extracted/DGeneralCategory.txt
    mv extracted/DerivedJoiningGroup.txt extracted/DJoinGroup.txt
    mv extracted/DerivedJoiningType.txt extracted/DJoinType.txt
    mv extracted/DerivedLineBreak.txt extracted/DLineBreak.txt
    mv extracted/DerivedNumericType.txt extracted/DNumType.txt
    mv extracted/DerivedNumericValues.txt extracted/DNumValues.txt
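
The apparent point of these renames is that the names stay distinct once
shortened to an 8.3 form. A minimal sketch of checking that, assuming the
shortening keeps the first 8 characters of the base name and the first 3 of
the extension, ignoring case. The helpers to83 and collisions83 are
hypothetical; they are not part of mktables or of Perl:

```shell
# Hypothetical helper: reduce a file name (no directory part) to a
# lowercase 8.3 form.  The truncation rule is an assumption.
to83() {
    name=$1
    base=${name%.*}
    ext=${name##*.}
    if [ "$base" = "$name" ]; then
        ext=""                      # name has no dot, so no extension
    fi
    base=$(printf '%s' "$base" | cut -c1-8)
    ext=$(printf '%s' "$ext" | cut -c1-3)
    if [ -n "$ext" ]; then
        printf '%s.%s\n' "$base" "$ext" | tr '[:upper:]' '[:lower:]'
    else
        printf '%s\n' "$base" | tr '[:upper:]' '[:lower:]'
    fi
}

# Hypothetical helper: print every 8.3 form shared by two or more of
# the names given as arguments.
collisions83() {
    for f in "$@"; do
        to83 "$f"
    done | sort | uniq -d
}
```

With the original Unicode names, collisions83 DerivedBidiClass.txt
DerivedBinaryProperties.txt reports a clash; with the renamed files
DBidiClass.txt and DBinaryProperties.txt it prints nothing.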

If you have the Unihan database (5.2 and above), you should also do the

    mv Unihan_DictionaryIndices.txt UnihanIndicesDictionary.txt
    mv Unihan_DictionaryLikeData.txt UnihanDataDictionaryLike.txt
    mv Unihan_IRGSources.txt UnihanIRGSources.txt
    mv Unihan_NumericValues.txt UnihanNumericValues.txt
    mv Unihan_OtherMappings.txt UnihanOtherMappings.txt
    mv Unihan_RadicalStrokeCounts.txt UnihanRadicalStrokeCounts.txt
    mv Unihan_Readings.txt UnihanReadings.txt
    mv Unihan_Variants.txt UnihanVariants.txt
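
Because the Unihan files are optional, a guarded rename may be more convenient
than a bare mv. A sketch, assuming a POSIX shell; rename_if_present is a
hypothetical helper and is not shipped with Perl:

```shell
# Hypothetical helper: rename OLD to NEW only when OLD exists and NEW
# does not, reporting what was done.
rename_if_present() {
    old=$1
    new=$2
    if [ ! -e "$old" ]; then
        echo "skip: $old not found"
        return 0
    fi
    if [ -e "$new" ]; then
        echo "skip: $new already exists"
        return 0
    fi
    mv -- "$old" "$new" && echo "renamed: $old -> $new"
}
```

For example, rename_if_present Unihan_Readings.txt UnihanReadings.txt does
nothing but print a note when the Unihan files were never downloaded.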

If you download everything, note that files not used by mktables, such as the
test files, are not renamed by the commands above, so they will not work
correctly as-is on 8.3 filesystems.

mktables is used to generate the tables used by the rest of Perl. It will warn
you about any *.txt files in the directory substructure that it doesn't know
about. You should remove any files so identified, or edit mktables to add them
to its lists to process. You can run
to have it try to process these tables generically.

If any files are added, deleted, or renamed, you must run
to generate a new list of all the files.

The files are inter-related. If you take the latest UnicodeData.txt, for
example, but leave the older versions of other files, there can be subtle

When moving to a new version of Unicode, you need to update 'version' by hand
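
After editing 'version' by hand, a quick sanity check can catch a stray blank
line or typo. A minimal sketch, assuming the version is written as three
dot-separated numbers; that format and the helper name check_version_file are
assumptions for illustration, not something mktables provides:

```shell
# Hypothetical helper: verify that the version file exists, has exactly
# one line, and looks like N.N.N (the format is an assumption).
check_version_file() {
    file=${1:-version}
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    # grep -c '' counts lines, including a final unterminated one
    lines=$(grep -c '' "$file" || true)
    if [ "$lines" -ne 1 ]; then
        echo "expected 1 line, found $lines"
        return 1
    fi
    if grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$' "$file"; then
        echo "ok: $(cat "$file")"
    else
        echo "unexpected version format"
        return 1
    fi
}
```

Run with no argument it checks ./version; a path can be passed to check a
file elsewhere.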

You should look in the Unicode release notes (which are probably towards the
bottom of http://www.unicode.org/reports/tr44/) to see if any properties have
newly been moved to be Obsolete, Deprecated, or Stabilized. The full names for
these should be added to the respective lists near the beginning of mktables,
using an 'if' to add them for just this Unicode version going forward, so that
mktables can continue to be used for earlier Unicode versions.

When putting out a new Perl release, consider whether any of the Deprecated
properties should be moved to Suppressed.

The *.pl files are generated from the *.txt files by the mktables script,
more recently done during the Perl build process, but if you want to try

    p4 edit *.pl */*.pl */*/*.pl
    perl ./mktables -P ../../pod -T ../../t/re/uniprops.t -makelist

    perl Porting/manicheck

If any new (or deleted, unlikely but not impossible) *.pl files are indicated:

jhi@iki.fi; updated by nick@ccl4.org, public@khwilliamson.com