If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see the perlpod manpage), which is
specially designed to be readable as is.

This documentation was originally written in Simplified Chinese in the
EUC-CN encoding.

=head1 NAME

perlcn - Perl Guide in Simplified Chinese

=head1 DESCRIPTION

Welcome to the world of Perl!

Starting with version 5.8.0, Perl has extensive Unicode support, and
with it support for many encodings beyond the Latin-based ones; CJK
(Chinese, Japanese, Korean) is one part of that. Unicode is an
international standard that attempts to cover all the characters in the
world: the Western world, the Eastern world, and everything in between
(Greek, Syriac, Arabic, Hebrew, Indic, Native American scripts, and so
on). It also spans multiple operating systems and platforms (such as
the PC and the Macintosh).

Perl itself operates on Unicode. This means that string data inside
Perl can be represented as Unicode, and that Perl's functions and
operators (for example, regular expression matching) can also operate
on Unicode. For input and output, to handle data stored in pre-Unicode
encodings, Perl provides the Encode module, which lets you easily read
and write data in legacy encodings.

The Encode extension module supports the following Simplified Chinese
encodings:

    euc-cn      Extended Unix Code character set, commonly known as
                the Guobiao (national standard) code
    gb2312      Raw (low-bit) GB2312 character table
    gb12345     Raw Traditional Chinese encoding used in China
    iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
    cp936       Code page 936, also known as GBK (extended Guobiao code)
    hz          7-bit escaped GB2312 encoding

For example, to convert a file encoded in euc-cn to Unicode (UTF-8),
you only need to type the following command:

    perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8

Perl also ships with piconv, a character conversion utility written
entirely in Perl. It is used as follows:

    piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
    piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn

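The same kind of conversion can also be done from inside a script with
PerlIO encoding layers. The following is only a minimal sketch and is not
part of the original document: the C<convert_file> helper is hypothetical,
and it assumes the input really is valid euc-cn text.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst, decoding euc-cn on input
# and encoding UTF-8 on output through PerlIO encoding layers.
sub convert_file {
    my ($src, $dst) = @_;
    open(my $in,  '<:encoding(euc-cn)', $src) or die "Cannot read $src: $!";
    open(my $out, '>:encoding(UTF-8)',  $dst) or die "Cannot write $dst: $!";
    print {$out} $_ while <$in>;
    close($out) or die "Cannot close $dst: $!";
    close($in);
    return;
}

# Run as: script.pl file.euc-cn file.utf8
convert_file(@ARGV) if @ARGV == 2;
```

Swapping the two layer names would convert in the opposite direction,
much like the second piconv command.
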
In addition, with the encoding module you can easily write code that
works in units of characters, as shown here:

    #!/usr/bin/env perl
    # Assumes this script itself is saved in the euc-cn encoding.
    # Enable euc-cn string parsing; standard input, standard output,
    # and standard error are all set to the euc-cn encoding.
    use encoding 'euc-cn', STDIN => 'euc-cn',
                           STDOUT => 'euc-cn', STDERR => 'euc-cn';

    print length("骆驼");             # 2 (double quotes denote characters)
    print length('骆驼');             # 4 (single quotes denote bytes)
    print index("谆谆教诲", "蛔唤");  # -1 (does not contain this substring)
    print index('谆谆教诲', '蛔唤');  # 1 (starts at the second byte)

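The character-versus-byte distinction can also be made explicit with the
Encode module mentioned earlier, without relying on the encoding pragma
(which later Perl releases deprecated). This is a hedged sketch rather than
code from the original document; the variable names are illustrative, and
the byte string is the euc-cn form of the two-character string used in the
script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode encode);

# The four euc-cn bytes of the two-character string from the script above.
my $bytes = "\xC2\xE6\xCD\xD5";

# decode() turns encoded bytes into a Perl character string.
my $chars = decode('euc-cn', $bytes);

# encode() turns a character string back into bytes in the target encoding.
my $utf8 = encode('UTF-8', $chars);

printf "%d bytes in euc-cn\n", length $bytes;   # 4
printf "%d characters\n",      length $chars;   # 2
printf "%d bytes in UTF-8\n",  length $utf8;    # 6
```

Because the data is decoded once at the boundary, functions such as
C<length> and C<index> then count characters regardless of which quoting
style the literals use.
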
=head2 Additional Chinese Encodings

If you need more Chinese encodings, you can download the
Encode::HanExtra module from CPAN (L<http://www.cpan.org/>). It
currently provides the following encoding:

    gb18030     Extended Guobiao code, including Traditional Chinese

In addition, the Encode::HanConvert module provides two encodings for
converting between Simplified and Traditional Chinese:

    gbk-trad    Converts between GBK Simplified Chinese and
                Unicode Traditional Chinese
    big5-simp   Converts between Big5 Traditional Chinese and
                Unicode Simplified Chinese

To convert between GBK and Big5, see the b2g.pl and g2b.pl programs
included with that module.

=head2 Further Information

See the extensive documentation that comes with Perl (unfortunately all
written in English) to learn more about Perl and about how to use
Unicode. That said, external resources are plentiful:

=head2 Sites Offering Perl Resources

=over 4

=item L<http://www.perl.com/>

The Perl home page (maintained by O'Reilly)

=item L<http://www.cpan.org/>

The Comprehensive Perl Archive Network

=item L<http://lists.perl.org/>

A list of Perl mailing lists

=back

=head2 Sites for Learning Perl

=over 4

=item L<http://www.oreilly.com.cn/html/perl.html>

O'Reilly Perl books in Simplified Chinese

=back

=head2 Perl User Groups

=over 4

=item L<http://www.pm.org/groups/asia.shtml#China>

A list of Perl Mongers groups in China

=back

=head2 Unicode-Related Sites

=over 4

=item L<http://www.unicode.org/>

The Unicode Consortium (the maker of the Unicode standard)

=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=back

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

Autrijus Tang (唐宗汉) E<lt>autrijus@autrijus.orgE<gt>

=cut