From: Jarkko Hietaniemi
Date: Tue, 16 Apr 2002 04:31:49 +0000 (+0000)
Subject: Add CJK READMEs from Autrijus Tang, Dan Kogai, and
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=d8416318460c9d88f9f5c6724f0703dd06d3d6a9;p=p5sagit%2Fp5-mst-13.2.git

Add CJK READMEs from Autrijus Tang, Dan Kogai, and Jungshik Shin. Regen toc.

p4raw-id: //depot/perl@15944
---

diff --git a/MANIFEST b/MANIFEST
index 6bd5646..6a4044e 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -2192,6 +2192,10 @@ README.apollo Notes about Apollo DomainOS port
 README.beos     Notes about BeOS port
 README.bs2000   Notes about BS2000 POSIX port
 README.ce       Notes about WinCE port
+README.cn       About using Perl and Simplified Chinese
+README.jp       About using Perl and Japanese
+README.ko       About using Perl and Korean
+README.tw       About using Perl and Traditional Chinese
 README.cygwin   Notes about Cygwin port
 README.dgux     Notes about DG/UX port
 README.dos      Notes about DOS/DJGPP port
diff --git a/README.cn b/README.cn
new file mode 100644
index 0000000..eb7f7a9
--- /dev/null
+++ b/README.cn
@@ -0,0 +1,139 @@
+If you read this file _as_is_, just ignore the funny characters you
+see. It is written in the POD format (see perlpod manpage) which is
+specially designed to be readable as is.
+
+The following documentation is written in EUC-CN encoding.
+
+If you are reading this document in an ordinary text editor, please ignore the odd markup characters in it. This
+document is written in POD (a concise documentation format), a format specially designed to be read directly
+by people. For more information about this format, see the perlpod online documentation.
+
+=head1 NAME
+
+perlcn - Simplified Chinese Perl Guide
+
+=head1 DESCRIPTION
+
+Welcome to the world of Perl!
+
+Starting with version 5.8.0, Perl has extensive Unicode support, and with it support for
+many encodings beyond those of the Latin-script languages; CJK (Chinese, Japanese, Korean) is one part of that. Unicode is
+an international standard that attempts to cover all of the world's characters: the Western world, the Eastern world, and everything
+in between (Greek, Syriac, Arabic, Hebrew, Indic, Amerindian, and so on).
+It also accommodates multiple operating systems and platforms (such as PC and Macintosh).
+
+Perl itself operates in Unicode.
This means that string data inside Perl can be represented in
+Unicode, and that Perl's functions and operators (such as regular expression matching) can also operate on Unicode.
+For input and output, to handle data stored in encodings that predate Unicode, Perl provides
+the Encode module, which lets you easily read and write data in legacy encodings.
+
+The Encode extension module supports the following Simplified Chinese encodings:
+
+    euc-cn      Unix extended character set, commonly known as the Guobiao (national standard) code
+    gb2312      The raw (low-bit) GB2312 character map
+    gb12345     The raw Traditional Chinese encoding used in China
+    iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
+    cp936       Code page 936, also known as GBK (extended Guobiao code)
+    hz          7-bit escaped GB2312 encoding
+
+For example, to convert a file in the euc-cn encoding to Unicode, just type the following command:
+
+    perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8
+
+Perl also ships with piconv, a character conversion utility written entirely in Perl. It is used
+as follows:
+
+    piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
+    piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn
+
+In addition, with the encoding module you can easily write code that works in units of characters, as shown below:
+
+    #!/usr/bin/env perl
+    # Enable euc-cn string parsing; standard input, output, and error are all set to euc-cn
+    use encoding 'euc-cn', STDIN => 'euc-cn',
+                 STDOUT => 'euc-cn', STDERR => 'euc-cn';
+
+    print length("骆驼");            # 2 (double quotes mean characters)
+    print length('骆驼');            # 4 (single quotes mean bytes)
+    print index("谆谆教诲", "蛔唤"); # -1 (does not contain this substring)
+    print index('谆谆教诲', '蛔唤'); # 1 (starts at the second byte)
+
+=head2 Additional Chinese Encodings
+
+If you need more Chinese encodings, you can download the Encode::HanExtra module from CPAN (L).
+It currently provides the following encoding:
+
+    gb18030     The extended Guobiao code, which includes Traditional Chinese
+
+In addition, the Encode::HanConvert module provides two encodings for converting between Simplified and Traditional Chinese:
+
+    gbk-trad    Converts between GBK Simplified Chinese and Unicode Traditional Chinese
+    big5-simp   Converts between Big5 Traditional Chinese and Unicode Simplified Chinese
+
+To convert between GBK and Big5, see the two programs b2g.pl and g2b.pl
+that come with that module.
+
+=head2 Further Information
+
+Please refer to the large body of documentation that comes with Perl (unfortunately, all of it written in English) to learn more about
+Perl and about how to use Unicode.
However, external resources are quite plentiful:
+
+=head2 Sites Providing Perl Resources
+
+=over 4
+
+=item L
+
+The Perl home page (maintained by O'Reilly)
+
+=item L
+
+The Comprehensive Perl Archive Network (CPAN)
+
+=item L
+
+List of Perl mailing lists
+
+=back
+
+=head2 Sites for Learning Perl
+
+=over 4
+
+=item L
+
+O'Reilly Perl books in Simplified Chinese
+
+=back
+
+=head2 Perl User Groups
+
+=over 4
+
+=item L
+
+List of Perl user groups (Perl Mongers) in China
+
+=back
+
+=head2 Unicode-Related Sites
+
+=over 4
+
+=item L
+
+The Unicode Consortium (the maker of the Unicode standard)
+
+=item L
+
+UTF-8 and Unicode FAQ for Unix/Linux
+
+=back
+
+=head1 AUTHORS
+
+Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>
+
+Autrijus Tang (唐宗汉) E<lt>autrijus@autrijus.orgE<gt>
+
+=cut
diff --git a/README.jp b/README.jp
new file mode 100644
index 0000000..1a58765
--- /dev/null
+++ b/README.jp
@@ -0,0 +1,200 @@
+If you read this file _as_is_, just ignore the funny characters you
+see. It is written in the POD format (see perlpod manpage) which is
+specially designed to be readable as is.
+
+The following documentation is written in FOO encoding.
+
+=head1 NAME
+
+perljp - Japanese Perl Guide
+
+=head1 DESCRIPTION
+
+Welcome to Perl!
+
+Starting with Perl 5.8.0, Unicode support has been greatly strengthened, and as a result support for character encodings other than Latin ones has been added, including CJK (Chinese, Japanese, Hangul). Unicode is a standard that aims to handle the characters of the whole world in a single character encoding. It already includes characters from East to West and everything in between (Greek, Cyrillic, Arabic, Hebrew, Devanagari, and so on), as well as characters that OS vendors had until now defined on their own (PC and Macintosh).
+
+Perl itself works in Unicode. String literals and regular expressions in Perl scripts assume Unicode. For input and output, the Encode module, which handles the various character encodings that have been in use until now, comes as standard, so converting between Unicode and those encodings is also easy.
+
+The character encodings that Encode currently supports are as follows.
+
+  7bit-jis            AdobeStandardEncoding AdobeSymbol     AdobeZdingbat
+  ascii               big5                big5-hkscs        cp1006
+  cp1026              cp1047              cp1250            cp1251
+  cp1252              cp1253              cp1254            cp1255
+  cp1256              cp1257              cp1258            cp37
+  cp424               cp437               cp500             cp737
+  cp775               cp850               cp852             cp855
+  cp856               cp857               cp860             cp861
+  cp862               cp863               cp864             cp865
+  cp866               cp869               cp874             cp875
+  cp932               cp936               cp949             cp950
+  dingbats            euc-cn              euc-jp            euc-kr
+  gb12345-raw         gb2312-raw          gsm0338           hp-roman8
+  hz                  iso-2022-jp         iso-2022-jp-1     iso-8859-1
+  iso-8859-10         iso-8859-11         iso-8859-13       iso-8859-14
+  iso-8859-15         iso-8859-16         iso-8859-2        iso-8859-3
+  iso-8859-4          iso-8859-5          iso-8859-6        iso-8859-7
+  iso-8859-8          iso-8859-9          iso-ir-165        jis0201-raw
+  jis0208-raw         jis0212-raw         johab             koi8-f
+  koi8-r              koi8-u              ksc5601-raw       MacArabic
+  MacCentralEurRoman  MacChineseSimp      MacChineseTrad    MacCroatian
+  MacCyrillic         MacDingbats         MacFarsi          MacGreek
+  MacHebrew           MacIcelandic        MacJapanese       MacKorean
+  MacRoman            MacRomanian         MacRumanian       MacSami
+  MacSymbol           MacThai             MacTurkish        MacUkrainian
+  nextstep            posix-bc            shiftjis          symbol
+  UCS-2BE             UCS-2LE             UTF-16            UTF-16BE
+  UTF-16LE            UTF-32              UTF-32BE          UTF-32LE
+  utf8                viscii
+
+(114 in total)
+
+For example, to convert a file in character encoding FOO to UTF-8, do the following.
+
+  perl -Mencoding=FOO,STDOUT,utf8 -pe1 <
file.FOO > file.utf8
+
+Perl also ships with piconv, a character-encoding conversion utility written entirely in Perl, so you can do the following as well.
+
+  piconv -f FOO -t utf8 < file.FOO > file.utf8
+  piconv -f utf8 -t FOO < file.utf8 > file.FOO
+
+=head2 About (jcode.pl|Jcode.pm|JPerl)
+
+Before 5.8, Perl could handle at least literals if the script was written in EUC-JP. For input and output there were Jcode.pm ( http://openlab.jp/Jcode/ ) as a module and jcode.pl ( http://srekcah.org/jcode/ ) as a utility for perl4, and many readers probably know that these are widely used in CGI programs that handle Japanese. However, it was not possible to handle regular expressions in Japanese properly.
+
+For Perl 5.005 and earlier there was Jperl, a localized version specialized for Japanese ( http://homepage2.nifty.com/kipp/perl/jperl/index.html ). There was also a Japanese version of MacPerl, the Perl for MacOS 9.x/Classic, called MacJPerl ( http://world.std.com/~habilis/macjperl/ ). These could handle Shift_JIS directly in addition to EUC-JP as the character encoding, and could also handle regular expressions in Japanese.
+
+With Perl 5.8, all of these features are available in Perl itself, and on top of that Perl can handle not only Japanese but all 114 character encodings listed above, and can do so at the same time. In addition, it has become easy to obtain modules for new character encodings from CPAN and elsewhere.
+
+=over 4
+
+=item Input and Output
+
+Each of the following examples converts Shift_JIS input to EUC-JP and prints it.
+
+  # jcode.pl
+  require "jcode.pl";
+  while(<>){
+    jcode::convert(*_, 'euc', 'sjis');
+    print;
+  }
+  # Jcode.pm
+  use Jcode;
+  while(<>){
+    print Jcode->new($_, 'sjis')->euc;
+  }
+  # Perl 5.8
+  use Encode;
+  while(<>){
+    from_to($_, 'shiftjis', 'euc-jp');
+    print;
+  }
+  # Perl 5.8 - using encoding
+  use encoding 'euc-jp', STDIN=>'shiftjis';
+  while(<>){
+    print;
+  }
+
+=item Jperl-Compatible Scripts
+
+ Most scripts written for Jperl can probably be used without modification just by changing the so-called "shebang" line.
+
+   #!/path/to/jperl
+   ↓
+   #!/path/to/perl -Mencoding=euc-jp
+
+ See perldoc encoding for details.
+
+=back
+
+=head2 More Information
+
+Perl comes with a huge amount of documentation that covers in detail Perl's new features, Unicode support, how to use the Encode module, and more (unfortunately, almost all of it is in English). You can browse some of it with the following commands.
+
+  perldoc perlunicode # Perl's Unicode support in general
+  perldoc Encode      # About the Encode module
+  perldoc Encode::JP  # About the Japanese character encodings in particular
+
+=head2 URLs About Perl in General
+
+=over 4
+
+=item L
+
+The Perl home page (O'Reilly and Associates)
+
+=item L
+
+CPAN (Comprehensive Perl Archive Network)
+
+=item L
+
+Perl mailing lists
+
+=back
+
+=head2 URLs Useful for Learning Perl
+
+=over 4
+
+=item L
+
+O'Reilly's Perl books (Traditional Chinese)
+
+=item L
+
+O'Reilly's Perl books (Simplified Chinese)
+
+=item L
+
+O'Reilly's Perl books (Japanese)
+
+=back
+
+=head2 Perl User Groups
+
+=over 4
+
+=item L
+
+China (People's Republic of China)
+
+=item L
+
+Japan
+
+=item L
+
+Korea (Republic of Korea)
+
+=item L
+
+Taiwan (Republic of China)
+
+=back
+
+=head2 Unicode-Related URLs
+
+=over 4
+
+=item L
+
+The Unicode Consortium (the body that defines the Unicode standard)
+
+=item L
+
+UTF-8 and Unicode FAQ for Unix/Linux
+
+=item L
+
+UTF-8 and Unicode FAQ for Unix/Linux (Korean translation)
+
+=back
+
+=head1 AUTHORS
+
+Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>
+Dan Kogai (小飼 弾) E<lt>dankogai@dan.co.jpE<gt>
+
+=cut
diff --git a/README.ko b/README.ko
new file mode 100644
index 0000000..e83bfc2
--- /dev/null
+++ b/README.ko
@@ -0,0 +1,179 @@
+If you read this file _as_is_, just ignore the funny characters you
+see. It is written in the POD format (see perlpod manpage) which is
+specially designed to be readable as is.
+
+This file is in Korean encoded in EUC-KR.
+
+If you are reading this document directly rather than through perldoc, ignore markers such as
+=head, =item, and 'L', which are used to indicate the role of each part.
+This document is written in the POD format, which can be read without much trouble
+even without perldoc. For more details, see the perlpod
+manual.
+
+
+=head1 NAME
+
+perlko - Perl and Korean Encodings
+
+=head1 DESCRIPTION
+
+Welcome to the world of Perl!
+
+Starting with Perl release 5.8.0, Perl has extensive support for Unicode
+and, as a part of that, extensive support for non-Latin character
+encodings, including CJK (Chinese, Japanese, Korean). Unicode is an
+international standard that aims to include all of the world's
+characters: Western, Eastern, and everything in between (Greek,
+Cyrillic, Arabic, Hebrew, Indic, Amerindian, and so on), and
+encodings of various operating system platforms (PC and Macintosh).
+
+Starting with version 5.8.0, Perl has extensive support for
+Unicode/ISO 10646. As part of its Unicode support, it supports the many encodings that were
+in use around the world before Unicode, including in China, Japan, and Korea, and that are still
+widely used today. Unicode aims to accommodate the writing systems of every language used in the world:
+Europe's Latin, Cyrillic, and Greek alphabets, the Brahmi-derived scripts of India and Southeast
+Asia, Arabic, Hebrew, the Han characters of China, Japan, and Korea, Korean Hangul,
+Japanese kana, the writing systems of North American Indians, and so on.
+It therefore encompasses all the character sets and encodings specific to each language,
+country, and operating system that were previously in use.
+
+
+Perl uses Unicode internally to represent characters. More specifically,
+UTF-8 strings can be used inside Perl scripts, and
+the various functions and operators (for example, regular expressions, index, substr) work
+in units of Unicode characters rather than bytes. (For more details, see
+the perlunicode manual.) 'Encode' was written to help with
+input and output in the country- and language-specific encodings that were widely used
+before Unicode became widespread and are still widely used, and with
+handling data and documents in those encodings. Above all, 'Encode' makes it easy
+to convert between a large number of encodings.
+
+'Encode' supports the following Korean encodings.
+
+    euc-kr : A multibyte encoding that uses US-ASCII and KS X 1001 together
+             (commonly called Wansung). See KS X 2901 and RFC 1557.
+    cp949 : Extended Wansung, used in MS-Windows 9x/ME.
+             It adds 8,822 Hangul syllables to euc-kr.
+             Its aliases are uhc, windows-949, x-windows-949, and
+             ks_c_5601-1987.
The last name is not an appropriate
+             name, but Microsoft products use it to mean
+             CP949.
+    johab : The Johab encoding specified in Annex 3 of KS X 1001:1998.
+             Its character repertoire, like that of cp949, is US-ASCII and
+             KS X 1001 plus 8,822 Hangul syllables.
+
+    iso-2022-kr : The encoding for Korean Internet mail exchange specified in RFC 1557.
+             It is the same as euc-kr in that its repertoire is US-ASCII and KS X 1001,
+             but the encoding method differs.
+             It was used until around 1997-8 but is no longer used for mail
+             exchange.
+    ksc5601-raw : The encoding of KS X 1001 (KS C 5601) when placed in GL (that is, with the MSB set to 0).
+             It is hardly ever used on its own, without being combined with US-ASCII,
+             except as a font encoding in X11 and the like
+             (ksc5601.1987-0, where '0' means GL).
+             
+
+For example, to convert a file in the euc-kr encoding to UTF-8, do the
+following.
+
+
+    perl -Mencoding=euc-kr,STDOUT,utf8 -pe1 < file.euckr > file.utf8
+
+The reverse conversion can be done as follows.
+
+    perl -Mencoding=utf8,STDOUT,euc-kr -pe1 < file.utf8 > file.euckr
+
+To make such conversions more convenient, Perl includes piconv,
+which is written purely in Perl using the Encode module.
+As its name suggests, piconv is modeled on the iconv
+found on Unix. It is used as follows.
+
+    piconv -f euc-kr -t utf8 < file.euckr > file.utf8
+    piconv -f utf8 -t euc-kr < file.utf8 > file.euckr
+
+Also, with the 'encoding' module you can easily process text in units of characters
+(rather than bytes) while using a Korean encoding.
+
+    #!/usr/local/bin/perl
+
+    use encoding 'euc-kr', STDIN => 'euc-kr',
+                 STDOUT => 'euc-kr', STDERR => 'euc-kr';
+
+    print length("가나");              # 2 (double quotes request character-based processing)
+    print length('가나');              # 4 (single quotes request byte-based processing)
+    print index("한강, 대동강", "염"); # -1 ('염' is not present)
+    print index('한강, 대동강', '염'); # 7 (the 8th and 9th bytes match
+                                       #    the code value of '염'.)
+
+
+=head2 If You Want to Know More...
+
+Installing Perl brings very detailed documentation along with it, and from this documentation
+you can learn a great deal, not only about Perl in general but also about Unicode support, how to use Encode,
+and so on.
Unfortunately, all of this documentation is currently written in English.
+Besides this documentation, the following resources are available. This list is by no means
+complete; it collects only some representative ones.
+
+
+=head2 Perl Resources
+
+=over 4
+
+=item L
+
+ O'Reilly's Perl web page
+
+=item L
+
+ Comprehensive Perl Archive Network
+
+=item L
+
+ Perl mailing lists. Among the many lists,
+ perl-unicode is where 'Encode' is discussed.
+
+=back
+
+=head2 Korean-Language Sites That Can Help You Study Perl in More Depth
+
+=over 4
+
+=item L
+
+ List of Korean Perl books published by O'Reilly
+
+=item L
+
+ Information and news about CGI, DB, integration, and so on related to Perl
+
+=back
+
+=head2 Unicode Resources
+
+=over 4
+
+=item L
+
+ The Unicode Consortium.
+
+=item L
+
+The web page of ISO/IEC JTC1/SC2/WG2, which produces ISO/IEC 10646 UCS (Universal
+Character Set), the ISO standard that is essentially the same as Unicode.
+
+=item L
+
+ FAQ on using Unicode and UTF-8 on Unix/Linux
+
+=item L
+
+ Korean translation of the FAQ on using Unicode and UTF-8 on Unix/Linux
+
+=back
+
+=head1 AUTHORS
+
+Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>
+Jungshik Shin (신정식) E<lt>jshin@mailaps.orgE<gt>
+
+=cut
diff --git a/README.tw b/README.tw
new file mode 100644
index 0000000..02c0d4e
--- /dev/null
+++ b/README.tw
@@ -0,0 +1,145 @@
+If you read this file _as_is_, just ignore the funny characters you
+see. It is written in the POD format (see perlpod manpage) which is
+specially designed to be readable as is.
+
+The following documentation is written in Big5 encoding.
+
+If you are reading this document in an ordinary text editor, please ignore the odd markup characters in it. This
+document is written in POD (a concise documentation format), a format specially designed to be read directly
+by people. For more information about this format, see the perlpod online documentation.
+
+=head1 NAME
+
+perltw - Traditional Chinese Perl Guide
+
+=head1 DESCRIPTION
+
+Welcome to the world of Perl!
+
+Starting with version 5.8.0, Perl has extensive Unicode support, and with it support for
+many encodings beyond those of the Latin-script languages; CJK (Chinese, Japanese, Korean) is one part of that.
Unicode is
+an international standard that attempts to cover all of the world's characters: the Western world, the Eastern world, and everything
+in between (Greek, Syriac, Arabic, Hebrew, Indic, Amerindian, and so on).
+It also accommodates multiple operating systems and platforms (such as PC and Macintosh).
+
+Perl itself operates in Unicode. This means that string data inside Perl can be represented in
+Unicode, and that Perl's functions and operators (such as regular expression matching) can also operate on Unicode.
+For input and output, to handle data stored in encodings that predate Unicode, Perl provides
+the Encode module, which lets you easily read and write data in legacy encodings.
+
+The Encode extension module supports the following Traditional Chinese encodings:
+
+    big5        The original Big5 encoding
+    big5-hkscs  Big5 + the Hong Kong Supplementary Character Set
+    cp950       Code page 950 (Big5 + characters added by Microsoft)
+
+For example, to convert a file in the Big5 encoding to Unicode, just type the following command:
+
+    perl -Mencoding=big5,STDOUT,utf8 -pe1 < file.big5 > file.utf8
+
+Perl also ships with piconv, a character conversion utility written entirely in Perl. It is used
+as follows:
+
+    piconv -f big5 -t utf8 < file.big5 > file.utf8
+    piconv -f utf8 -t big5 < file.utf8 > file.big5
+
+In addition, with the encoding module you can easily write code that works in units of characters, as shown below:
+
+    #!/usr/bin/env perl
+    # Enable big5 string parsing; standard input, output, and error are all set to big5
+    use encoding 'big5', STDIN => 'big5',
+                 STDOUT => 'big5', STDERR => 'big5';
+
+    print length("Àd¾m"); # 2 (double quotes mean characters)
+    print length('Àd¾m'); # 4 (single quotes mean bytes)
+    print index("½Î½Î±Ð»£", "να"); # -1 (does not contain this substring)
+    print index('½Î½Î±Ð»£', 'να'); # 1 (starts at the second byte)
+
+=head2 Additional Chinese Encodings
+
+If you need more Chinese encodings, you can download the Encode::HanExtra module from CPAN (L).
+It currently provides the following encodings:
+
+    euc-tw      Unix extended character set, covering CNS11643 planes 1-7
+    big5plus    Big5+ from the Chinese Foundation for Digitization Technology
+
+In addition, the Encode::HanConvert module provides two encodings for converting between Simplified and Traditional Chinese:
+
+    big5-simp   Converts between Big5 Traditional Chinese and Unicode Simplified Chinese
+    gbk-trad    Converts between GBK Simplified Chinese and Unicode Traditional Chinese
+
+To convert between GBK and Big5, see the two programs b2g.pl and g2b.pl
+that come with that module.
+
+=head2 Further Information
+
+Please refer to the large body of documentation that comes with Perl (unfortunately, all of it written in English) to learn more about
+Perl and about how to use Unicode.
However, external resources are quite plentiful:
+
+=head2 Sites Providing Perl Resources
+
+=over 4
+
+=item L
+
+The Perl home page (maintained by O'Reilly)
+
+=item L
+
+The Comprehensive Perl Archive Network (CPAN)
+
+=item L
+
+List of Perl mailing lists
+
+=back
+
+=head2 Sites for Learning Perl
+
+=over 4
+
+=item L
+
+O'Reilly Perl books in Traditional Chinese
+
+=item L
+
+The Taiwan Perl linked discussion board (that is, the linked Perl boards of the major BBSes)
+
+=back
+
+=head2 Perl User Groups
+
+=over 4
+
+=item L
+
+List of Perl user groups (Perl Mongers) in Taiwan
+
+=item L
+
+The 藝立協 online chat room
+
+=back
+
+=head2 Unicode-Related Sites
+
+=over 4
+
+=item L
+
+The Unicode Consortium (the maker of the Unicode standard)
+
+=item L
+
+UTF-8 and Unicode FAQ for Unix/Linux
+
+=back
+
+=head1 AUTHORS
+
+Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>
+
+Autrijus Tang (唐宗漢) E<lt>autrijus@autrijus.orgE<gt>
+
+=cut
diff --git a/pod/buildtoc.PL b/pod/buildtoc.PL
index b38bec5..413afd8 100644
--- a/pod/buildtoc.PL
+++ b/pod/buildtoc.PL
@@ -90,6 +90,13 @@ if (-d "pod") {
     perlwin32
     );
 
+@CJKPODS = qw(
+    perlcn
+    perljp
+    perlko
+    perltw
+    );
+
 @pods = (
 
 qw(
@@ -199,13 +206,16 @@ if (-d "pod") {
     ),
 
-    @ARCHPODS
+    @ARCHPODS,
     );
 
 for (@ARCHPODS) { s/$/.pod/ }
 @ARCHPODS{@ARCHPODS} = ();
 
+for (@CJKPODS) { s/$/.pod/ }
+@CJKPODS{@CJKPODS} = ();
+
 for (@pods) { s/$/.pod/ }
 @pods{@pods} = ();
 @PODS{@PODS} = ();
@@ -232,6 +242,11 @@ die "$0: could not find the pod listing of perl.pod\n"
     unless @PERLPODS;
 
 @PERLPODS{@PERLPODS} = ();
 
+# Delete the CJK because we cannot mix their encodings.
+delete @PERLPODS{@CJKPODS};
+delete @PODS{@CJKPODS};
+delete @pods{@CJKPODS};
+
 # Cross-check against ourselves
 # Cross-check against the MANIFEST
 # Cross-check against the perl.pod
@@ -240,7 +255,7 @@ foreach my $i (sort keys %PODS) {
     warn "$0: $i exists but is unknown by buildtoc\n"
         unless exists $pods{$i};
     warn "$0: $i exists but is unknown by ../MANIFEST\n"
-        if !exists $MANIPODS{$i} && !exists $ARCHPODS{$i};
+        if !exists $MANIPODS{$i} && !exists $ARCHPODS{$i} && !exists $CJKPODS{$i};
     warn "$0: $i exists but is unknown by perl.pod\n"
         unless exists $PERLPODS{$i};
 }
diff --git a/pod/perl.pod b/pod/perl.pod
index 2cac3f8..b751d0b 100644
--- a/pod/perl.pod
+++ b/pod/perl.pod
@@ -136,6 +136,13 @@ For ease of access, the Perl manual has been split up into several sections.
     perl5005delta       Perl changes in version 5.005
     perl5004delta       Perl changes in version 5.004
 
+=head2 Language-Specific
+
+    perlcn              Perl for Simplified Chinese (in EUC-CN)
+    perljp              Perl for Japanese (in EUC-JP)
+    perlko              Perl for Korean (in EUC-KR)
+    perltw              Perl for Traditional Chinese (in Big5)
+
 =head2 Platform-Specific
 
     perlaix             Perl notes for AIX
diff --git a/pod/perltoc.pod b/pod/perltoc.pod
index 2fe1c39..8325d43 100644
--- a/pod/perltoc.pod
+++ b/pod/perltoc.pod
@@ -29,6 +29,8 @@ through to locate the proper section you're looking for.
=item Miscellaneous +=item Language-Specific + =item Platform-Specific =back @@ -2264,7 +2266,7 @@ to enable UTF-8/UTF-EBCDIC in scripts =item Unicode Encodings -=item Security Implications of Malformed UTF-8 +=item Security Implications of Unicode =item Unicode in Perl on EBCDIC @@ -2276,6 +2278,16 @@ to enable UTF-8/UTF-EBCDIC in scripts =item BUGS +=over 4 + +=item Interaction with locales + +=item Interaction with extensions + +=item speed + +=back + =item SEE ALSO =back @@ -2426,6 +2438,8 @@ chcp, dataset access, OS/390, z/OS iconv, locales =item Protecting Your Programs +=item Unicode + =back =item SEE ALSO @@ -2717,8 +2731,8 @@ tarball, Announce to the modules list, Announce to clpa, Fix bugs! =back -=head2 perlfaq1 - General Questions About Perl ($Revision: 1.7 $, $Date: -2002/02/21 14:49:15 $) +=head2 perlfaq1 - General Questions About Perl ($Revision: 1.8 $, $Date: +2002/04/07 18:46:13 $) =over 4 @@ -2764,8 +2778,8 @@ Scheme, or Tcl? =back -=head2 perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.9 $, -$Date: 2002/03/09 21:01:13 $) +=head2 perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.12 $, +$Date: 2002/04/09 17:16:05 $) =over 4 @@ -2820,8 +2834,8 @@ References, Tutorials, Task-Oriented, Special Topics =back -=head2 perlfaq3 - Programming Tools ($Revision: 1.15 $, $Date: 2002/02/11 -19:29:52 $) +=head2 perlfaq3 - Programming Tools ($Revision: 1.18 $, $Date: 2002/04/09 +17:11:16 $) =over 4 @@ -2905,8 +2919,8 @@ my C program; what am I doing wrong? =back -=head2 perlfaq4 - Data Manipulation ($Revision: 1.19 $, $Date: 2002/03/11 -22:15:19 $) +=head2 perlfaq4 - Data Manipulation ($Revision: 1.20 $, $Date: 2002/04/07 +18:46:13 $) =over 4 @@ -3122,8 +3136,8 @@ array of hashes or arrays? =back -=head2 perlfaq5 - Files and Formats ($Revision: 1.12 $, $Date: 2002/03/11 -22:25:25 $) +=head2 perlfaq5 - Files and Formats ($Revision: 1.15 $, $Date: 2002/04/12 +02:02:05 $) =over 4 @@ -3215,8 +3229,8 @@ protected files? Isn't this a bug in Perl? 
=back -=head2 perlfaq6 - Regular Expressions ($Revision: 1.8 $, $Date: 2002/01/31 -04:27:55 $) +=head2 perlfaq6 - Regular Expressions ($Revision: 1.10 $, $Date: 2002/04/07 +18:32:57 $) =over 4 @@ -3282,8 +3296,8 @@ file? =back -=head2 perlfaq7 - General Perl Language Issues ($Revision: 1.7 $, $Date: -2002/01/31 04:27:55 $) +=head2 perlfaq7 - General Perl Language Issues ($Revision: 1.8 $, $Date: +2002/03/26 15:48:32 $) =over 4 @@ -3489,7 +3503,7 @@ search path? =back -=head2 perlfaq9 - Networking ($Revision: 1.7 $, $Date: 2002/01/28 04:17:27 +=head2 perlfaq9 - Networking ($Revision: 1.9 $, $Date: 2002/04/07 18:46:13 $) =over 4 @@ -3974,6 +3988,8 @@ C, C

, C, C =item Creating New Variables +GV_ADDMULTI, GV_ADDWARN + =item Reference Counts and Mortality =item Stashes and Globs @@ -4798,6 +4814,12 @@ PerlIO_apply_layers(f,mode,layers), PerlIO_binmode(f,ptype,imode,layers), =item Update auxiliary tools +=item Create debugging macros + +=item truncate to the people + +=item Unicode in Filenames + =back =item Recently done things @@ -4918,10 +4940,6 @@ PerlIO_apply_layers(f,mode,layers), PerlIO_binmode(f,ptype,imode,layers), =item Unicode collation and normalization -=item Create debugging macros - -=item truncate to the people - =item pack/unpack tutorial =back @@ -5119,6 +5137,8 @@ I =item PerlIO is Now The Default +=item Restricted Hashes + =item Safe Signals =item Unicode Overhaul @@ -8416,7 +8436,7 @@ diagnostics =item USAGE -use encoding [I] ;, use encoding I [ STDIN => +use encoding [I] ;, use encoding I [ STDIN =E I ...] ;, no encoding; =item CAVEATS @@ -8429,6 +8449,10 @@ I ...] ;, no encoding; =back +=item NON-ASCII Identifiers and Filter option + +use encoding I Filter=E1; + =item EXAMPLE - Greekperl =item KNOWN PROBLEMS @@ -10431,7 +10455,7 @@ C, C, C, C =item p C, C, C, C, C, C, -C, C +C, C, C =item P @@ -10501,7 +10525,7 @@ C, C =item y -C, C +C =item z @@ -11014,12 +11038,16 @@ Perl code =item PERL ENCODING API -$bytes = encode(ENCODING, $string[, CHECK]), $string = decode(ENCODING, -$bytes[, CHECK]), [$length =] from_to($string, FROM_ENCODING, TO_ENCODING[, -CHECK]) +$octets = encode(ENCODING, $string[, CHECK]), $string = decode(ENCODING, +$octets[, CHECK]), [$length =] from_to($string, FROM_ENCODING, TO_ENCODING +[,CHECK]) =over 4 +=item UTF-8 / utf8 + +$octets = encode_utf8($string);, $string = decode_utf8($octets [, CHECK]); + =item Listing available encodings =item Defining Aliases @@ -11030,16 +11058,6 @@ CHECK]) =item Handling Malformed Data -Scheme 1, Scheme 2, Other Schemes - -=over 4 - -=item UTF-8 / utf8 - -$bytes = encode_utf8($string);, $string = decode_utf8($bytes [, CHECK]); - -=back - 
=item Defining Encodings =item Messing with Perl's Internals @@ -11050,9 +11068,7 @@ is_utf8(STRING [, CHECK]), _utf8_on(STRING), _utf8_off(STRING) =back -=head2 Encode::10646_1, Encode::10656_1 -- for internal use only - -=head2 Encode::Alias - alias defintions to encodings +=head2 Encode::Alias - alias definitions to encodings =over 4 @@ -11140,9 +11156,9 @@ reference, e.g.: =item SEE ALSO -=back +Scheme 1, Scheme 2, Other Schemes -=head2 Encode::Internal -- for internal use only +=back =head2 Encode::JP - Japanese Encodings @@ -11162,13 +11178,9 @@ reference, e.g.: =back -=head2 Encode::JP::2022_JP -- internally used by Encode::JP - -=head2 Encode::JP::2022_JP1 -- internally used by Encode::JP - =head2 Encode::JP::H2Z -- internally used by Encode::JP::2022_JP* -=head2 Encode::JP::JIS -- internally used by Encode::JP +=head2 Encode::JP::JIS7 -- internally used by Encode::JP =head2 Encode::Supported -- Supported encodings by Encode @@ -11188,6 +11200,8 @@ reference, e.g.: =item Built-in Encodings +=item Encode::Unicode -- other Unicode encodings + =item Encode::Byte -- Extended ASCII ISO-8859 and corresponding vendor mappings, KOI8 - De Facto Standard for @@ -11196,7 +11210,7 @@ Cyrillic world =item The CJK: Chinese, Japanese, Korean (Multibyte) Encode::CN -- Continental China, Encode::JP -- Japan, Encode::KR -- Korea, -Encode::TW -- Taiwan, Encode::HanExtra -- More Chinese via CPAN +Encode::HanExtra -- More Chinese via CPAN =item Miscellaneous encodings @@ -11226,7 +11240,8 @@ KS_C_5601-1987, GB2312, Big5, Shift_JIS =item Glossary character repertoire, coded character set (CCS), character encoding scheme -(CES), EUC, ISO-2022, UCS, UCS-2, Unicode, UTF, UTF-16 +(CES), charset (in MIME context), EUC, ISO-2022, UCS, UCS-2, Unicode, UTF, +UTF-16 =item See Also @@ -11239,7 +11254,11 @@ RFC, UC, Unicode Glossary =item Other Notable Sites -czyborra.com, CJK.inf +czyborra.com, CJK.inf, Jungshik Shin's Hangul FAQ + +=item Offline sources + +C by Ken Lunde =back @@ 
-11275,14 +11294,37 @@ czyborra.com, CJK.inf =back -=head2 Encode::Unicode -- for internal use only +=head2 Encode::Unicode -- Various Unicode Transform Format -=head2 Encode::XS -- for internal use only +=over 4 + +=item SYNOPSIS + +=item ABSTRACT + +L says:, Quick Reference + +=item Size, Endianness, and BOM + +=over 4 -=head2 Encode::lib::Encode::10646_1, Encode::10656_1 -- for internal use -only +=item by Size -=head2 Encode::lib::Encode::Alias, Encode::Alias - alias defintions to +=item by Endianness + +BOM as integer when fetched in network byte order + +=back + +=item Surrogate Pairs + +=item SEE ALSO + +=back + +=head2 Encode::XS -- for internal use only + +=head2 Encode::lib::Encode::Alias, Encode::Alias - alias definitions to encodings =over 4 @@ -11330,22 +11372,15 @@ Implementation Base Class =item SEE ALSO -=back - -=head2 Encode::lib::Encode::Internal, Encode::Internal -- for internal use -only - -=head2 Encode::lib::Encode::JP::2022_JP, Encode::JP::2022_JP -- internally -used by Encode::JP +Scheme 1, Scheme 2, Other Schemes -=head2 Encode::lib::Encode::JP::2022_JP1, Encode::JP::2022_JP1 -- -internally used by Encode::JP +=back =head2 Encode::lib::Encode::JP::H2Z, Encode::JP::H2Z -- internally used by Encode::JP::2022_JP* -=head2 Encode::lib::Encode::JP::JIS, Encode::JP::JIS -- internally used by -Encode::JP +=head2 Encode::lib::Encode::JP::JIS7, Encode::JP::JIS7 -- internally used +by Encode::JP =head2 Encode::lib::Encode::Supported, Encode::Supported -- Supported encodings by Encode @@ -11366,6 +11401,8 @@ encodings by Encode =item Built-in Encodings +=item Encode::Unicode -- other Unicode encodings + =item Encode::Byte -- Extended ASCII ISO-8859 and corresponding vendor mappings, KOI8 - De Facto Standard for @@ -11374,7 +11411,7 @@ Cyrillic world =item The CJK: Chinese, Japanese, Korean (Multibyte) Encode::CN -- Continental China, Encode::JP -- Japan, Encode::KR -- Korea, -Encode::TW -- Taiwan, Encode::HanExtra -- More Chinese via CPAN 
+Encode::HanExtra -- More Chinese via CPAN =item Miscellaneous encodings @@ -11404,7 +11441,8 @@ KS_C_5601-1987, GB2312, Big5, Shift_JIS =item Glossary character repertoire, coded character set (CCS), character encoding scheme -(CES), EUC, ISO-2022, UCS, UCS-2, Unicode, UTF, UTF-16 +(CES), charset (in MIME context), EUC, ISO-2022, UCS, UCS-2, Unicode, UTF, +UTF-16 =item See Also @@ -11417,25 +11455,89 @@ RFC, UC, Unicode Glossary =item Other Notable Sites -czyborra.com, CJK.inf +czyborra.com, CJK.inf, Jungshik Shin's Hangul FAQ + +=item Offline sources + +C by Ken Lunde =back =back -=head2 Encode::lib::Encode::Unicode, Encode::Unicode -- for internal use -only +=head2 Encode::lib::Encode::Unicode, Encode::Unicode -- Various Unicode +Transform Format + +=over 4 + +=item SYNOPSIS + +=item ABSTRACT + +L says:, Quick Reference + +=item Size, Endianness, and BOM + +=over 4 + +=item by Size + +=item by Endianness + +BOM as integer when fetched in network byte order + +=back + +=item Surrogate Pairs + +=item SEE ALSO + +=back =head2 Encode::lib::Encode::XS, Encode::XS -- for internal use only -=head2 Encode::lib::Encode::ucs2_le, Encode::ucs2_le -- for internal use -only +=head2 Encode::lib::Encoder, Encode::Encoder -- Object Oriented Encoder + +=over 4 + +=item SYNOPSIS + + use Encode::Encoder; + # Encode::encode("ISO-8859-1", $data); + Encode::Encoder->new($data)->iso_8859_1; # OOP way + # shortcut + use Encode::Encoder qw(encoder); + encoder($data)->iso_8859_1; + # you can stack them! + encoder($data)->iso_8859_1->base64; # provided base64() is defined + # you can use it as a decoder as well + encoder($base64)->bytes('base64')->latin1; + # stringified + print encoder($data)->utf8->latin1; # prints the string in latin1 + # numified + encoder("\x{abcd}\x{ef}g")->utf8 == 6; # true. 
bytes::length($data) -=head2 Encode::lib::Encode::utf8, Encode::utf8 -- for internal use only +=item ABSTRACT + +=item Description + +=over 4 -=head2 Encode::ucs2_le -- for internal use only +=item Predefined Methods -=head2 Encode::utf8 -- for internal use only +$e = Encode::Encoder-Enew([$data, $encoding]);, encoder(), +$e-Edata([$data]), $e-Eencoding([$encoding]), +$e-Ebytes([$encoding]) + +=item Example: base64 transcoder + +=item operator overloading + +=back + +=item SEE ALSO + +=back =head2 Encodencoding, encoding - allows you to write your script in non-asii or non-utf8 @@ -11448,7 +11550,7 @@ non-asii or non-utf8 =item USAGE -use encoding [I] ;, use encoding I [ STDIN => +use encoding [I] ;, use encoding I [ STDIN =E I ...] ;, no encoding; =item CAVEATS @@ -11461,6 +11563,10 @@ I ...] ;, no encoding; =back +=item NON-ASCII Identifiers and Filter option + +use encoding I Filter=E1; + =item EXAMPLE - Greekperl =item KNOWN PROBLEMS @@ -11469,6 +11575,49 @@ I ...] ;, no encoding; =back +=head2 Encoder, Encode::Encoder -- Object Oriented Encoder + +=over 4 + +=item SYNOPSIS + + use Encode::Encoder; + # Encode::encode("ISO-8859-1", $data); + Encode::Encoder->new($data)->iso_8859_1; # OOP way + # shortcut + use Encode::Encoder qw(encoder); + encoder($data)->iso_8859_1; + # you can stack them! + encoder($data)->iso_8859_1->base64; # provided base64() is defined + # you can use it as a decoder as well + encoder($base64)->bytes('base64')->latin1; + # stringified + print encoder($data)->utf8->latin1; # prints the string in latin1 + # numified + encoder("\x{abcd}\x{ef}g")->utf8 == 6; # true. 
bytes::length($data) + +=item ABSTRACT + +=item Description + +=over 4 + +=item Predefined Methods + +$e = Encode::Encoder-Enew([$data, $encoding]);, encoder(), +$e-Edata([$data]), $e-Eencoding([$encoding]), +$e-Ebytes([$encoding]) + +=item Example: base64 transcoder + +=item operator overloading + +=back + +=item SEE ALSO + +=back + =head2 English - use nice English (or awk) names for ugly punctuation variables @@ -12312,15 +12461,18 @@ CONFIGURE, DEFINE, DIR, DISTNAME, DL_FUNCS, DL_VARS, EXCLUDE_EXT, EXE_FILES, FIRST_MAKEFILE, FULLPERL, FULLPERLRUN, FULLPERLRUNINST, FUNCLIST, H, IMPORTS, INC, INCLUDE_EXT, INSTALLARCHLIB, INSTALLBIN, INSTALLDIRS, INSTALLMAN1DIR, INSTALLMAN3DIR, INSTALLPRIVLIB, INSTALLSCRIPT, -INSTALLSITEARCH, INSTALLSITELIB, INST_ARCHLIB, INST_BIN, INST_LIB, -INST_MAN1DIR, INST_MAN3DIR, INST_SCRIPT, LDFROM, LIB, LIBPERL_A, LIBS, -LINKTYPE, MAKEAPERL, MAKEFILE, MAN1PODS, MAN3PODS, MAP_TARGET, MYEXTLIB, -NAME, NEEDS_LINKING, NOECHO, NORECURS, NO_VC, OBJECT, OPTIMIZE, PERL, -PERL_CORE, PERLMAINCC, PERL_ARCHLIB, PERL_LIB, PERL_MALLOC_OK, PERLRUN, -PERLRUNINST, PERL_SRC, PERM_RW, PERM_RWX, PL_FILES, PM, PMLIBDIRS, +INSTALLSITEARCH, INSTALLSITEBIN, INSTALLSITELIB, INSTALLSITEMAN1DIR, +INSTALLSITEMAN3DIR, INSTALLVENDORARCH, INSTALLVENDORBIN, INSTALLVENDORLIB, +INSTALLVENDORMAN1DIR, INSTALLVENDORMAN3DIR, INST_ARCHLIB, INST_BIN, +INST_LIB, INST_MAN1DIR, INST_MAN3DIR, INST_SCRIPT, LDFROM, LIB, LIBPERL_A, +LIBS, LINKTYPE, MAKEAPERL, MAKEFILE, MAN1PODS, MAN3PODS, MAP_TARGET, +MYEXTLIB, NAME, NEEDS_LINKING, NOECHO, NORECURS, NO_VC, OBJECT, OPTIMIZE, +PERL, PERL_CORE, PERLMAINCC, PERL_ARCHLIB, PERL_LIB, PERL_MALLOC_OK, +PERLRUN, PERLRUNINST, PERL_SRC, PERM_RW, PERM_RWX, PL_FILES, PM, PMLIBDIRS, PM_FILTER, POLLUTE, PPM_INSTALL_EXEC, PPM_INSTALL_SCRIPT, PREFIX, -PREREQ_PM, PREREQ_FATAL, PREREQ_PRINT, PRINT_PREREQ, SKIP, TYPEMAPS, -VERSION, VERSION_FROM, XS, XSOPT, XSPROTOARG, XS_VERSION +PREREQ_PM, PREREQ_FATAL, PREREQ_PRINT, PRINT_PREREQ, SITEPREFIX, SKIP, 
+TYPEMAPS, VENDORPREFIX, VERSION, VERSION_FROM, XS, XSOPT, XSPROTOARG, +XS_VERSION =item Additional lowercase attributes @@ -12499,6 +12651,10 @@ C, C =item DESCRIPTION +=item AUTHOR + +=item HISTORY + =back =head2 File::Compare - Compare files or filehandles @@ -13022,6 +13178,10 @@ TopSystemUID =item DESCRIPTION +cacheout EXPR, cacheout MODE, EXPR + +=item CAVEATS + =item BUGS =back