From: Jarkko Hietaniemi Date: Thu, 17 Aug 2000 14:44:02 +0000 (+0000) Subject: Add perlebcdic from Peter Prymmer, regen toc. X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=d396a55899b7bce58ef6008d9af7a500b5175b4a;p=p5sagit%2Fp5-mst-13.2.git Add perlebcdic from Peter Prymmer, regen toc. p4raw-id: //depot/perl@6676 --- diff --git a/MANIFEST b/MANIFEST index 3c7753d..876f5ff 100644 --- a/MANIFEST +++ b/MANIFEST @@ -1099,6 +1099,7 @@ pod/perldebug.pod Debugger info pod/perldelta.pod Changes since last version pod/perldiag.pod Diagnostic info pod/perldsc.pod Data Structures Cookbook +pod/perlebcdic.pod Considerations for running Perl on EBCDIC platforms pod/perlembed.pod Embedding info pod/perlfaq.pod Frequently Asked Questions, Top Level pod/perlfaq1.pod Frequently Asked Questions, Part 1 diff --git a/pod/buildtoc.PL b/pod/buildtoc.PL index 131dac3..140d214 100644 --- a/pod/buildtoc.PL +++ b/pod/buildtoc.PL @@ -88,11 +88,17 @@ if (-d "pod") { perlopentut perlretut - perlref perlre + perlref + perlform - perllocale - perlunicode + + perlboot + perltoot + perltootc + perlobj + perlbot + perltie perlipc perlfork @@ -100,14 +106,11 @@ if (-d "pod") { perlthrtut perlport - perlsec + perllocale + perlunicode + perlebcdic - perlboot - perltoot - perltootc - perlobj - perlbot - perltie + perlsec perlmod perlmodlib diff --git a/pod/perl.pod b/pod/perl.pod index 998fe2f..895f8d9 100644 --- a/pod/perl.pod +++ b/pod/perl.pod @@ -42,11 +42,17 @@ For ease of access, the Perl manual has been split up into several sections: perlopentut Perl open() tutorial perlretut Perl regular expressions tutorial - perlref Perl references, the rest of the story perlre Perl regular expressions, the rest of the story + perlref Perl references, the rest of the story + perlform Perl formats - perllocale Perl locale support - perlunicode Perl unicode support + + perlboot Perl OO tutorial for beginners + perltoot Perl OO tutorial, part 1 + perltootc Perl OO tutorial, part 2 
+ perlobj Perl objects + perlbot Perl OO tricks and examples + perltie Perl objects hidden behind simple variables perlipc Perl interprocess communication perlfork Perl fork() information @@ -54,14 +60,11 @@ For ease of access, the Perl manual has been split up into several sections: perlthrtut Perl threads tutorial perlport Perl portability guide - perlsec Perl security + perllocale Perl locale support + perlunicode Perl unicode support + perlebcdic Considerations for running Perl on EBCDIC platforms - perlboot Perl OO tutorial for beginners - perltoot Perl OO tutorial, part 1 - perltootc Perl OO tutorial, part 2 - perlobj Perl objects - perlbot Perl OO tricks and examples - perltie Perl objects hidden behind simple variables + perlsec Perl security perlmod Perl modules: how they work perlmodlib Perl modules: how to write and use diff --git a/pod/perlebcdic.pod b/pod/perlebcdic.pod new file mode 100644 index 0000000..f27a8de --- /dev/null +++ b/pod/perlebcdic.pod @@ -0,0 +1,1001 @@ +=head1 NAME + +perlebcdic - Considerations for running Perl on EBCDIC platforms + +=head1 DESCRIPTION + +An exploration of some of the issues facing Perl programmers +on EBCDIC based computers. We do not cover localization, +internationalization, or multi byte character set issues (yet). + +Portions that are still incomplete are marked with XXX. + +=head1 COMMON CHARACTER CODE SETS + +=head2 ASCII + +The American Standard Code for Information Interchange is a set of +integers running from 0 to 127 (decimal) that imply character +interpretation by the display and other system(s) of computers. +The range 0..127 is covered by setting the bits in a 7-bit binary +number, hence the set is sometimes referred to as "7-bit ASCII". +ASCII was described by the American National Standards Institute +document ANSI X3.4-1986. It was also described by ISO 646:1991 +(with localization for currency symbols). The full ASCII set is +given in the table below as the first 128 elements. 
Languages that +can be written adequately with the characters in ASCII include +English, Hawaiian, Indonesian, Swahili and some Native American +languages. + +=head2 ISO 8859 + +The ISO 8859-$n are a collection of character code sets from the +International Organization for Standardization (ISO) each of which +adds characters to the ASCII set that are typically found in European +languages many of which are based on the Roman, or Latin, alphabet. + +=head2 Latin 1 (ISO 8859-1) + +A particular 8-bit extension to ASCII that includes grave and acute +accented Latin characters. Languages that can employ ISO 8859-1 +include all the languages covered by ASCII as well as Afrikaans, +Albanian, Basque, Catalan, Danish, Faroese, Finnish, Norwegian, +Portuguese, Spanish, and Swedish. Dutch is covered albeit without +the ij ligature. French is covered too but without the oe ligature. +German can use ISO 8859-1 but must do so without German-style +quotation marks. This set is based on Western European extensions +to ASCII and is commonly encountered in world wide web work. +In IBM character code set identification terminology ISO 8859-1 is +known as CCSID 819 (or sometimes 0819 or even 00819). + +=head2 EBCDIC + +Extended Binary Coded Decimal Interchange Code. The EBCDIC acronym +refers to a large collection of slightly different single and +multi byte coded character sets that are different from ASCII or +ISO 8859-1 and typically run on host computers. The +EBCDIC encodings derive from Hollerith punched card encodings. +The layout on the cards was such that high bits were set for the +upper and lower case alphabet characters [a-z] and [A-Z], but there +were gaps within each Latin alphabet range. + +=head2 13 variant characters + +XXX. + +EBCDIC character sets may be known by character code set identification +numbers (CCSID numbers) or code page numbers. + +=head2 0037 + +Character code set ID 0037 is a mapping of the ASCII plus Latin-1 +characters (i.e. 
ISO 8859-1) to an EBCDIC set. 0037 is used +on the OS/400 operating system that runs on AS/400 computers. +CCSID 37 differs from ISO 8859-1 in 236 places, in other words +they agree on only 20 code point values. + +=head2 1047 + +Character code set ID 1047 is also a mapping of the ASCII plus +Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set. 1047 is +used under Unix System Services for OS/390, and OpenEdition for VM/ESA. +CCSID 1047 differs from CCSID 0037 in eight places. + +=head2 POSIX-BC + +The EBCDIC code page in use on Siemens' BS2000 system is distinct from +1047 and 0037. It is identified below as the POSIX-BC set. + +=head1 SINGLE OCTET TABLES + +The following tables list the ASCII and Latin 1 ordered sets including +the subsets: C0 controls (0..31), ASCII graphics (32..126), delete (127), +C1 controls (128..159), and Latin-1 (a.k.a. ISO 8859-1) (160..255). In the +table non-printing control character names as well as the Latin 1 +extensions to ASCII have been labelled with character names roughly +corresponding to I<The Unicode Standard, Version 2.0> albeit with +substitutions such as s/LATIN// and s/VULGAR// in all cases, +s/CAPITAL LETTER// in some cases, and s/SMALL LETTER ([A-Z])/\l$1/ +in some other cases. The "names" of the C1 control set +(128..159 in ISO 8859-1) are somewhat arbitrary. The differences +between the 0037 and 1047 sets are flagged with ***. The differences +between the 1047 and POSIX-BC sets are flagged with ###. +All ord() numbers listed are decimal. 
If you would rather see this +table listing octal values then run the table (that is, the pod +version of this document since this recipe may not work with +a pod2XXX translation to another format) through: + +=over 4 + +=item recipe 0 + +=back + + perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \ + -e '{printf("%s%-9o%-9o%-9o%-9o\n",$1,$2,$3,$4,$5)}' perlebcdic.pod + +If you would rather see this table listing hexadecimal values then +run the table through: + +=over 4 + +=item recipe 1 + +=back + + perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \ + -e '{printf("%s%-9X%-9X%-9X%-9X\n",$1,$2,$3,$4,$5)}' perlebcdic.pod + + + 8859-1 + chr 0819 0037 1047 POSIX-BC + ---------------------------------------------------------------- + 0 0 0 0 + 1 1 1 1 + 2 2 2 2 + 3 3 3 3 + 4 55 55 55 + 5 45 45 45 + 6 46 46 46 + 7 47 47 47 + 8 22 22 22 + 9 5 5 5 + 10 37 21 21 *** + 11 11 11 11 +
12 12 12 12 + 13 13 13 13 + 14 14 14 14 + 15 15 15 15 + 16 16 16 16 + 17 17 17 17 + 18 18 18 18 + 19 19 19 19 + 20 60 60 60 + 21 61 61 61 + 22 50 50 50 + 23 38 38 38 + 24 24 24 24 + 25 25 25 25 + 26 63 63 63 + 27 39 39 39 + 28 28 28 28 + 29 29 29 29 + 30 30 30 30 + 31 31 31 31 + 32 64 64 64 + ! 33 90 90 90 + " 34 127 127 127 + # 35 123 123 123 + $ 36 91 91 91 + % 37 108 108 108 + & 38 80 80 80 + ' 39 125 125 125 + ( 40 77 77 77 + ) 41 93 93 93 + * 42 92 92 92 + + 43 78 78 78 + , 44 107 107 107 + - 45 96 96 96 + . 46 75 75 75 + / 47 97 97 97 + 0 48 240 240 240 + 1 49 241 241 241 + 2 50 242 242 242 + 3 51 243 243 243 + 4 52 244 244 244 + 5 53 245 245 245 + 6 54 246 246 246 + 7 55 247 247 247 + 8 56 248 248 248 + 9 57 249 249 249 + : 58 122 122 122 + ; 59 94 94 94 + < 60 76 76 76 + = 61 126 126 126 + > 62 110 110 110 + ? 63 111 111 111 + @ 64 124 124 124 + A 65 193 193 193 + B 66 194 194 194 + C 67 195 195 195 + D 68 196 196 196 + E 69 197 197 197 + F 70 198 198 198 + G 71 199 199 199 + H 72 200 200 200 + I 73 201 201 201 + J 74 209 209 209 + K 75 210 210 210 + L 76 211 211 211 + M 77 212 212 212 + N 78 213 213 213 + O 79 214 214 214 + P 80 215 215 215 + Q 81 216 216 216 + R 82 217 217 217 + S 83 226 226 226 + T 84 227 227 227 + U 85 228 228 228 + V 86 229 229 229 + W 87 230 230 230 + X 88 231 231 231 + Y 89 232 232 232 + Z 90 233 233 233 + [ 91 186 173 187 *** ### + \ 92 224 224 188 ### + ] 93 187 189 189 *** + ^ 94 176 95 106 *** ### + _ 95 109 109 109 + ` 96 121 121 74 ### + a 97 129 129 129 + b 98 130 130 130 + c 99 131 131 131 + d 100 132 132 132 + e 101 133 133 133 + f 102 134 134 134 + g 103 135 135 135 + h 104 136 136 136 + i 105 137 137 137 + j 106 145 145 145 + k 107 146 146 146 + l 108 147 147 147 + m 109 148 148 148 + n 110 149 149 149 + o 111 150 150 150 + p 112 151 151 151 + q 113 152 152 152 + r 114 153 153 153 + s 115 162 162 162 + t 116 163 163 163 + u 117 164 164 164 + v 118 165 165 165 + w 119 166 166 166 + x 120 167 167 167 + y 121 168 168 168 + z 
122 169 169 169 + { 123 192 192 251 ### + | 124 79 79 79 + } 125 208 208 253 ### + ~ 126 161 161 255 ### + 127 7 7 7 + 128 32 32 32 + 129 33 33 33 + 130 34 34 34 + 131 35 35 35 + 132 36 36 36 + 133 21 37 37 *** + 134 6 6 6 + 135 23 23 23 + 136 40 40 40 + 137 41 41 41 + 138 42 42 42 + 139 43 43 43 + 140 44 44 44 + 141 9 9 9 + 142 10 10 10 + 143 27 27 27 + 144 48 48 48 + 145 49 49 49 + 146 26 26 26 + 147 51 51 51 + 148 52 52 52 + 149 53 53 53 + 150 54 54 54 + 151 8 8 8 + 152 56 56 56 + 153 57 57 57 + 154 58 58 58 + 155 59 59 59 + 156 4 4 4 + 157 20 20 20 + 158 62 62 62 + 159 255 255 95 ### + 160 65 65 65 + 161 170 170 170 + 162 74 74 176 ### + 163 177 177 177 + 164 159 159 159 + 165 178 178 178 + 166 106 106 208 ### +
167 181 181 181 + 168 189 187 121 *** ### + 169 180 180 180 + 170 154 154 154 + 171 138 138 138 + 172 95 176 186 *** ### + 173 202 202 202 + 174 175 175 175 + 175 188 188 161 ### + 176 144 144 144 + 177 143 143 143 + 178 234 234 234 + 179 250 250 250 + 180 190 190 190 + 181 160 160 160 + 182 182 182 182 + 183 179 179 179 + 184 157 157 157 + 185 218 218 218 + 186 155 155 155 + 187 139 139 139 + 188 183 183 183 + 189 184 184 184 + 190 185 185 185 + 191 171 171 171 + 192 100 100 100 + 193 101 101 101 + 194 98 98 98 + 195 102 102 102 + 196 99 99 99 + 197 103 103 103 + 198 158 158 158 + 199 104 104 104 + 200 116 116 116 + 201 113 113 113 + 202 114 114 114 + 203 115 115 115 + 204 120 120 120 + 205 117 117 117 + 206 118 118 118 + 207 119 119 119 + 208 172 172 172 + 209 105 105 105 + 210 237 237 237 + 211 238 238 238 + 212 235 235 235 + 213 239 239 239 + 214 236 236 236 + 215 191 191 191 + 216 128 128 128 + 217 253 253 224 ### + 218 254 254 254 + 219 251 251 221 ### + 220 252 252 252 + 221 173 186 173 *** ### + 222 174 174 174 + 223 89 89 89 + 224 68 68 68 + 225 69 69 69 + 226 66 66 66 + 227 70 70 70 + 228 67 67 67 + 229 71 71 71 + 230 156 156 156 + 231 72 72 72 + 232 84 84 84 + 233 81 81 81 + 234 82 82 82 + 235 83 83 83 + 236 88 88 88 + 237 85 85 85 + 238 86 86 86 + 239 87 87 87 + 240 140 140 140 + 241 73 73 73 + 242 205 205 205 + 243 206 206 206 + 244 203 203 203 + 245 207 207 207 + 246 204 204 204 + 247 225 225 225 + 248 112 112 112 + 249 221 221 192 ### + 250 222 222 222 + 251 219 219 219 + 252 220 220 220 + 253 141 141 141 + 254 142 142 142 + 255 223 223 223 + +If you would rather see the above table in CCSID 0037 order rather than +ASCII + Latin-1 order then run the table through: + +=over 4 + +=item recipe 2 + +=back + + perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\ + -e '{push(@l,$_)}' \ + -e 'END{print map{$_->[0]}' \ + -e ' sort{$a->[1] <=> $b->[1]}' \ + -e ' map{[$_,substr($_,42,3)]}@l;}' perlebcdic.pod + +If you would rather see it in 
CCSID 1047 order then change the number +42 in the last line to 51, like this: + +=over 4 + +=item recipe 3 + +=back + + perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\ + -e '{push(@l,$_)}' \ + -e 'END{print map{$_->[0]}' \ + -e ' sort{$a->[1] <=> $b->[1]}' \ + -e ' map{[$_,substr($_,51,3)]}@l;}' perlebcdic.pod + +If you would rather see it in POSIX-BC order then change the number +51 in the last line to 60, like this: + +=over 4 + +=item recipe 4 + +=back + + perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\ + -e '{push(@l,$_)}' \ + -e 'END{print map{$_->[0]}' \ + -e ' sort{$a->[1] <=> $b->[1]}' \ + -e ' map{[$_,substr($_,60,3)]}@l;}' perlebcdic.pod + + +=head1 IDENTIFYING CHARACTER CODE SETS + +To determine the character set you are running under from perl one +could use the return value of ord() or chr() to test one or more +character values. For example: + + $is_ascii = "A" eq chr(65); + $is_ebcdic = "A" eq chr(193); + +"\t" is a <HORIZONTAL TABULATION> character, so that: + + $is_ascii = ord("\t") == 9; + $is_ebcdic = ord("\t") == 5; + +To distinguish EBCDIC code pages try looking at one or more of +the characters that differ between them. For example: + + $is_ebcdic_37 = "\n" eq chr(37); + $is_ebcdic_1047 = "\n" eq chr(21); + +Or better still choose a character that is uniquely encoded in each +of the code sets, e.g.: + + $is_ascii = ord('[') == 91; + $is_ebcdic_37 = ord('[') == 186; + $is_ebcdic_1047 = ord('[') == 173; + $is_ebcdic_POSIX_BC = ord('[') == 187; + +However, it would be unwise to write tests such as: + + $is_ascii = "\r" ne chr(13); # WRONG + $is_ascii = "\n" ne chr(10); # ILL ADVISED + +Obviously the first of these will fail to distinguish most ASCII machines +from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC machine since "\r" eq +chr(13) under all of those coded character sets. 
But note too that +because "\n" is chr(13) and "\r" is chr(10) on the Macintosh (which is an +ASCII machine) the second C<$is_ascii> test will lead to trouble there. + +To determine whether or not perl was built under an EBCDIC +code page you can use the Config module like so: + + use Config; + $is_ebcdic = $Config{ebcdic} eq 'define'; + +=head1 CONVERSIONS + +In order to convert a string of characters from one character set to +another a simple list of numbers, such as in the right columns in the +above table, along with perl's tr/// operator is all that is needed. +The data in the table are in ASCII order hence the EBCDIC columns +provide easy to use ASCII to EBCDIC operations that are also easily +reversed. + +For example, to convert ASCII to code page 037 take the output of the second +column from the output of recipe 0 and use it in tr/// like so: + + $cp_037 = + '\000\001\002\003\234\011\206\177\227\215\216\013\014\015\016\017' . + '\020\021\022\023\235\205\010\207\030\031\222\217\034\035\036\037' . + '\200\201\202\203\204\012\027\033\210\211\212\213\214\005\006\007' . + '\220\221\026\223\224\225\226\004\230\231\232\233\024\025\236\032' . + '\040\240\342\344\340\341\343\345\347\361\242\056\074\050\053\174' . + '\046\351\352\353\350\355\356\357\354\337\041\044\052\051\073\254' . + '\055\057\302\304\300\301\303\305\307\321\246\054\045\137\076\077' . + '\370\311\312\313\310\315\316\317\314\140\072\043\100\047\075\042' . + '\330\141\142\143\144\145\146\147\150\151\253\273\360\375\376\261' . + '\260\152\153\154\155\156\157\160\161\162\252\272\346\270\306\244' . + '\265\176\163\164\165\166\167\170\171\172\241\277\320\335\336\256' . + '\136\243\245\267\251\247\266\274\275\276\133\135\257\250\264\327' . + '\173\101\102\103\104\105\106\107\110\111\255\364\366\362\363\365' . + '\175\112\113\114\115\116\117\120\121\122\271\373\374\371\372\377' . + '\134\367\123\124\125\126\127\130\131\132\262\324\326\322\323\325' . 
+ '\060\061\062\063\064\065\066\067\070\071\263\333\334\331\332\237' ; + + my $ebcdic_string = $ascii_string; + # tr/// does not interpolate variables, so build the operator with eval + eval '$ebcdic_string =~ tr/\000-\377/' . $cp_037 . '/'; + +To convert from EBCDIC to ASCII just reverse the order of the tr/// +arguments like so: + + my $ascii_string = $ebcdic_string; + eval '$ascii_string =~ tr/' . $cp_037 . '/\000-\377/'; + +XPG4 interoperability often implies the presence of an I<iconv> utility +available from the shell or from the C library. Consult your system's +documentation for information on iconv. + +On OS/390 see the iconv(1) man page. One way to invoke the iconv +shell utility from within perl would be to: + + $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`; + +or the inverse map: + + $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`; + +XXX iconv under qsh on OS/400? +XXX iconv on VM? +XXX iconv on BS2k? + +For other perl based conversion options see the Convert::* modules on CPAN. + +=head1 OPERATOR DIFFERENCES + +The C<..> range operator treats certain character ranges with +care on EBCDIC machines. For example the following array +will have twenty six elements on either an EBCDIC machine +or an ASCII machine: + + @alphabet = ('A'..'Z'); # $#alphabet == 25 + +The bitwise operators such as & ^ | may return different results +when operating on string or character data in a perl program running +on an EBCDIC machine than when run on an ASCII machine. Here is +an example adapted from the one in L<perlop>: + + # EBCDIC-based examples + print "j p \n" ^ " a h"; # prints "JAPH\n" + print "JA" | " ph\n"; # prints "japh\n" + print "JAPH\nJunk" & "\277\277\277\277\277"; # prints "japh\n"; + print 'p N$' ^ " E chr(0) and "\cA" -> chr(1) as well, but the +thirty three characters that result depend on which code page you are +using. 
The table below uses the character names from the previous table +but with substitutions such as s/START OF/S.O./; s/END OF /E.O./; +s/TRANSMISSION/TRANS./; s/TABULATION/TAB./; s/VERTICAL/VERT./; +s/HORIZONTAL/HORIZ./; s/DEVICE CONTROL/D.C./; s/SEPARATOR/SEP./; +s/NEGATIVE ACKNOWLEDGE/NEG. ACK./. The POSIX-BC and 1047 sets are +identical throughout this range and differ from the 0037 set at only +one spot (21 decimal). Note that "\c\\" maps to two characters +not one. + + chr ord 8859-1 0037 1047 && POSIX-BC + ------------------------------------------------------------------------ + "\c?" 127 " " ***>< + "\c@" 0 ***>< + "\cA" 1 + "\cB" 2 + "\cC" 3 + "\cD" 4 + "\cE" 5 + "\cF" 6 + "\cG" 7 + "\cH" 8 + "\cI" 9 + "\cJ" 10 + "\cK" 11 + "\cL" 12 + "\cM" 13 + "\cN" 14 + "\cO" 15 + "\cP" 16 + "\cQ" 17 + "\cR" 18 + "\cS" 19 + "\cT" 20 + "\cU" 21 *** + "\cV" 22 + "\cW" 23 + "\cX" 24 + "\cY" 25 + "\cZ" 26 + "\c[" 27 + "\c\\" 28 \ \ \ + "\c]" 29 + "\c^" 30 ***>< + "\c_" 31 ***>< + + +=head1 FUNCTION DIFFERENCES + +=over 8 + +=item chr() + +chr() must be given an EBCDIC code number argument to yield a desired +character return value on an EBCDIC machine. For example: + + $CAPITAL_LETTER_A = chr(193); + +=item ord() + +ord() will return EBCDIC code number values on an EBCDIC machine. +For example: + + $the_number_193 = ord("A"); + +=item pack() + +The c and C templates for pack() are dependent upon character set +encoding. Examples of usage on EBCDIC include: + + $foo = pack("CCCC",193,194,195,196); + # $foo eq "ABCD" + $foo = pack("C4",193,194,195,196); + # same thing + + $foo = pack("ccxxcc",193,194,195,196); + # $foo eq "AB\0\0CD" + +=item print() + +One must be careful with scalars and strings that are passed to +print that contain ASCII encodings. One common place +for this to occur is in the output of the MIME type header for +CGI script writing. 
For example, many perl programming guides +recommend something similar to: + + print "Content-type:\ttext/html\015\012\015\012"; + # this may be wrong on EBCDIC + +Under the IBM OS/390 USS Web Server for example you should instead +write that as: + + print "Content-type:\ttext/html\r\n\r\n"; # OK for DGW et alia + +That is because the translation from EBCDIC to ASCII is done +by the web server in this case (such code will not be appropriate for +the Macintosh however). Consult your web server's documentation for +further details. + +=item printf() + +The formats that can convert characters to numbers and vice versa +will be different from their ASCII counterparts when executed +on an EBCDIC machine. Examples include: + + printf("%c%c%c",193,194,195); # prints ABC + +=item sort() + +EBCDIC sort results may differ from ASCII sort results especially for +mixed case strings. This is discussed in more detail below. + +=item sprintf() + +See the discussion of printf() above. An example of the use +of sprintf would be: + + $CAPITAL_LETTER_A = sprintf("%c",193); + +=item unpack() + +See the discussion of pack() above. + +=back + +=head1 REGULAR EXPRESSION DIFFERENCES + +As of perl 5.005_03 letter range regular expressions such as +[A-Z] and [a-z] have been especially coded to not pick up gap +characters. For example characters such as <o WITH CIRCUMFLEX> +that lie between I and J would not be matched by C</[H-K]/>. +If you do want to match such characters in a single octet +regular expression try matching the hex or octal code such +as C</\313/> on EBCDIC or C</\364/> on ASCII machines to +have your regular expression match <o WITH CIRCUMFLEX>. + +Another place to be wary of is the inappropriate use of hex or +octal constants in regular expressions. 
Consider the following +set of subs: + + sub is_c0 { + my $char = substr(shift,0,1); + $char =~ /[\000-\037]/; + } + + sub is_print_ascii { + my $char = substr(shift,0,1); + $char =~ /[\040-\176]/; + } + + sub is_delete { + my $char = substr(shift,0,1); + $char eq "\177"; + } + + sub is_c1 { + my $char = substr(shift,0,1); + $char =~ /[\200-\237]/; + } + + sub is_latin_1 { + my $char = substr(shift,0,1); + $char =~ /[\240-\377]/; + } + +The above would be adequate if the concern was only with numeric codepoints. +However, we may actually be concerned with characters rather than codepoints +and on an EBCDIC machine would like for constructs such as +C<if (is_print_ascii("A")) {print "A is a printable character\n";}> to print +out the expected message. One way to represent the above collection +of character classification subs that is capable of working across the +four coded character sets discussed in this document is as follows: + + sub Is_c0 { + my $char = substr(shift,0,1); + if (ord('^')==94) { # ascii + return $char =~ /[\000-\037]/; + } + if (ord('^')==176) { # 37 + return $char =~ /[\000-\003\067\055-\057\026\005\045\013-\023\074\075\062\046\030\031\077\047\034-\037]/; + } + if (ord('^')==95 || ord('^')==106) { # 1047 || posix-bc + return $char =~ /[\000-\003\067\055-\057\026\005\025\013-\023\074\075\062\046\030\031\077\047\034-\037]/; + } + } + + sub Is_print_ascii { + my $char = substr(shift,0,1); + $char =~ /[ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~]/; + } + + sub Is_delete { + my $char = substr(shift,0,1); + if (ord('^')==94) { # ascii + return $char eq "\177"; + } + else { # ebcdic + return $char eq "\007"; + } + } + + sub Is_c1 { + my $char = substr(shift,0,1); + if (ord('^')==94) { # ascii + return $char =~ /[\200-\237]/; + } + if (ord('^')==176) { # 37 + return $char =~ /[\040-\044\025\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\004\024\076\377]/; + } + if (ord('^')==95) { # 1047 + return $char =~ 
/[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\004\024\076\377]/; + } + if (ord('^')==106) { # posix-bc + return $char =~ + /[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\004\024\076\137]/; + } + } + + sub Is_latin_1 { + my $char = substr(shift,0,1); + if (ord('^')==94) { # ascii + return $char =~ /[\240-\377]/; + } + if (ord('^')==176) { # 37 + return $char =~ + /[\101\252\112\261\237\262\152\265\275\264\232\212\137\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/; + } + if (ord('^')==95) { # 1047 + return $char =~ + /[\101\252\112\261\237\262\152\265\273\264\232\212\260\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\272\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/; + } + if (ord('^')==106) { # posix-bc + return $char =~ + /[\101\252\260\261\237\262\320\265\171\264\232\212\272\312\257\241\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\340\376\335\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\300\336\333\334\215\216\337]/; + } + } + +Note however that only the C<Is_print_ascii()> sub is really independent +of coded character set. 
Another way to write C<Is_latin_1()> would be +to use the characters in the range explicitly: + + sub Is_latin_1 { + my $char = substr(shift,0,1); + $char =~ /[ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/; + } + +That form, however, may run into trouble in network transit (due to the +presence of 8 bit characters) or on non ISO-Latin character sets. + + +=head1 SOCKETS + +Most socket programming assumes ASCII character encodings in network +byte order. Exceptions can include CGI script writing under a +host web server where the server may take care of translation for you. +Most host web servers convert EBCDIC data to ISO-8859-1 or Unicode on +output. + +=head1 SORTING + +One big difference between ASCII based character sets and EBCDIC ones +is the relative positions of upper and lower case letters and the +letters compared to the digits. If sorted on an ASCII based machine the +two letter abbreviation for a physician comes before the two letter +abbreviation for drive, that is: + + @sorted = sort(qw(Dr. dr.)); # @sorted holds qw(Dr. dr.) on ASCII, + # qw(dr. Dr.) on EBCDIC + +The property of lower case before uppercase letters in EBCDIC is +even carried to the Latin 1 EBCDIC pages such as 0037 and 1047. +An example would be that <E WITH DIAERESIS> (203) comes before +<e WITH DIAERESIS> (235) on an ASCII machine, but the latter (83) +comes before the former (115) on an EBCDIC machine. (Astute readers will +note that the upper case version of <SMALL LETTER SHARP S> is +simply "SS" and that the upper case version of <y WITH DIAERESIS> +is not in the 0..255 range but it is at U+0178 in Unicode). + +The sort order will cause differences between results obtained on +ASCII machines versus EBCDIC machines. What follows are some suggestions +on how to deal with these differences. + +=head2 Ignore ASCII vs EBCDIC sort differences. + +This is the least computationally expensive strategy. It may require +some user education. + +=head2 MONOCASE then sort data. 
+ +In order to minimize the expense of monocasing mixed case text try to +C<tr///> towards the character set case most employed within the data. +If the data are primarily UPPERCASE non Latin 1 then apply tr/[a-z]/[A-Z]/ +then sort(). If the data are primarily lowercase non Latin 1 then +apply tr/[A-Z]/[a-z]/ before sorting. If the data are primarily UPPERCASE +and include Latin-1 characters then apply: tr/[a-z]/[A-Z]/; +XXX + +This strategy does not preserve the case of the data and may not be +acceptable. + +=head2 Convert, sort data, then reconvert. + +This is the most expensive proposition that does not employ a network +connection. + +=head2 Perform sorting on one type of machine only. + +This strategy can employ a network connection. As such +it would be computationally expensive. + +=head1 URL ENCODING and DECODING + +Note that some URLs have hexadecimal ASCII codepoints in them in an +attempt to overcome character limitation issues. For example the +tilde character is not on every keyboard hence a URL of the form: + + http://www.pvhp.com/~pvhp/ + +may also be expressed as either of: + + http://www.pvhp.com/%7Epvhp/ + + http://www.pvhp.com/%7epvhp/ + +where 7E is the hexadecimal ASCII codepoint for '~'. 
Here is an example +of decoding such a URL under CCSID 1047: + + $url = 'http://www.pvhp.com/%7Epvhp/'; + # this array assumes code page 1047 + my @a2e_1047 = ( + 0, 1, 2, 3, 55, 45, 46, 47, 22, 5, 21, 11, 12, 13, 14, 15, + 16, 17, 18, 19, 60, 61, 50, 38, 24, 25, 63, 39, 28, 29, 30, 31, + 64, 90,127,123, 91,108, 80,125, 77, 93, 92, 78,107, 96, 75, 97, + 240,241,242,243,244,245,246,247,248,249,122, 94, 76,126,110,111, + 124,193,194,195,196,197,198,199,200,201,209,210,211,212,213,214, + 215,216,217,226,227,228,229,230,231,232,233,173,224,189, 95,109, + 121,129,130,131,132,133,134,135,136,137,145,146,147,148,149,150, + 151,152,153,162,163,164,165,166,167,168,169,192, 79,208,161, 7, + 32, 33, 34, 35, 36, 37, 6, 23, 40, 41, 42, 43, 44, 9, 10, 27, + 48, 49, 26, 51, 52, 53, 54, 8, 56, 57, 58, 59, 4, 20, 62,255, + 65,170, 74,177,159,178,106,181,187,180,154,138,176,202,175,188, + 144,143,234,250,190,160,182,179,157,218,155,139,183,184,185,171, + 100,101, 98,102, 99,103,158,104,116,113,114,115,120,117,118,119, + 172,105,237,238,235,239,236,191,128,253,254,251,252,186,174, 89, + 68, 69, 66, 70, 67, 71,156, 72, 84, 81, 82, 83, 88, 85, 86, 87, + 140, 73,205,206,203,207,204,225,112,221,222,219,220,141,142,223 + ); + $url =~ s/%([0-9a-fA-F]{2})/pack("c",$a2e_1047[hex($1)])/ge; + +=head1 I18N AND L10N + +Internationalization(I18N) and localization(L10N) are supported at least +in principle even on EBCDIC machines. The details are system dependent +and discussed under the L section below. + +=head1 MULTI OCTET CHARACTER SETS + +Double byte EBCDIC code pages (?) XXX. + +UTF-8, UTF-EBCDIC, (?) XXX. + +=head1 OS ISSUES + +There may be a few system dependent issues +of concern to EBCDIC Perl programmers. + +=head2 OS/400 + +=over 8 + +=item IFS access + +XXX. 
+ +=back + +=head2 OS/390 + +=over 8 + +=item dataset access + +For sequential data set access try: + + my @ds_records = `cat //DSNAME`; + +or: + + my @ds_records = `cat //'HLQ.DSNAME'`; + +See also the OS390::Stdio module on CPAN. + +=item locales + +On OS/390 see L for information on locales. The L10N files +are in F. $Config{d_setlocale} is 'define' on OS/390. + +=back + +=head2 VM/ESA? + +XXX. + +=head2 POSIX-BC? + +XXX. + +=head1 REFERENCES + +http://anubis.dkuug.dk/i18n/charmaps + +L. + +http://www.unicode.org/ + +http://www.unicode.org/unicode/reports/tr16/ + +B<The Unicode Standard, Version 2.0> The Unicode Consortium, +ISBN 0-201-48345-9, Addison Wesley Developers Press, July 1996. + +B<CDRA: IBM - Character Data Representation Architecture - Reference and Registry>, IBM SC09-2190-00, December 1996. + +"Demystifying Character Sets", Andrea Vine, Multilingual Computing +& Technology, B<#26 Vol. 10 Issue 4>, August/September 1999; +ISSN 1523-0309; Multilingual Computing Inc. Sandpoint ID, USA. + +=head1 AUTHOR + +Peter Prymmer E<lt>pvhp@best.comE<gt> wrote this in 1999 and 2000 +with CCSID 0819 and 0037 help from Chris Leach and +Andre' Pirard E<lt>A.Pirard@ulg.ac.beE<gt> as well as POSIX-BC +help from Thomas Dorner E<lt>Thomas.Dorner@start.deE<gt>. +Thanks also to Philip Newton and Vickie Cooper. Trademarks, registered +trademarks, service marks and registered service marks used in this +document are the property of their respective owners. 
+ + diff --git a/pod/perltoc.pod b/pod/perltoc.pod index c20160c..282825c 100644 --- a/pod/perltoc.pod +++ b/pod/perltoc.pod @@ -564,7 +564,7 @@ binmode FILEHANDLE, DISCIPLINE, binmode FILEHANDLE, bless REF,CLASSNAME, bless REF, caller EXPR, caller, chdir EXPR, chmod LIST, chomp VARIABLE, chomp LIST, chomp, chop VARIABLE, chop LIST, chop, chown LIST, chr NUMBER, chr, chroot FILENAME, chroot, close FILEHANDLE, close, closedir DIRHANDLE, -connect SOCKET,NAME, continue BLOCK, cos EXPR, crypt PLAINTEXT,SALT, +connect SOCKET,NAME, continue BLOCK, cos EXPR, cos, crypt PLAINTEXT,SALT, dbmclose HASH, dbmopen HASH,DBNAME,MASK, defined EXPR, defined, delete EXPR, die LIST, do BLOCK, do SUBROUTINE(LIST), do EXPR, dump LABEL, dump, each HASH, eof FILEHANDLE, eof (), eof, eval EXPR, eval BLOCK, exec LIST, @@ -590,12 +590,12 @@ mkdir FILENAME,MASK, mkdir FILENAME, msgctl ID,CMD,ARG, msgget KEY,FLAGS, msgrcv ID,VAR,SIZE,TYPE,FLAGS, msgsnd ID,MSG,FLAGS, my EXPR, my EXPR : ATTRIBUTES, next LABEL, next, no Module LIST, oct EXPR, oct, open FILEHANDLE,MODE,LIST, open FILEHANDLE,EXPR, open FILEHANDLE, opendir -DIRHANDLE,EXPR, ord EXPR, ord, our EXPR, pack TEMPLATE,LIST, package, -package NAMESPACE, pipe READHANDLE,WRITEHANDLE, pop ARRAY, pop, pos SCALAR, -pos, print FILEHANDLE LIST, print LIST, print, printf FILEHANDLE FORMAT, -LIST, printf FORMAT, LIST, prototype FUNCTION, push ARRAY,LIST, q/STRING/, -qq/STRING/, qr/STRING/, qx/STRING/, qw/STRING/, quotemeta EXPR, quotemeta, -rand EXPR, rand, read FILEHANDLE,SCALAR,LENGTH,OFFSET, read +DIRHANDLE,EXPR, ord EXPR, ord, our EXPR, pack TEMPLATE,LIST, package +NAMESPACE, package, pipe READHANDLE,WRITEHANDLE, pop ARRAY, pop, pos +SCALAR, pos, print FILEHANDLE LIST, print LIST, print, printf FILEHANDLE +FORMAT, LIST, printf FORMAT, LIST, prototype FUNCTION, push ARRAY,LIST, +q/STRING/, qq/STRING/, qr/STRING/, qx/STRING/, qw/STRING/, quotemeta EXPR, +quotemeta, rand EXPR, rand, read FILEHANDLE,SCALAR,LENGTH,OFFSET, read 
FILEHANDLE,SCALAR,LENGTH, readdir DIRHANDLE, readline EXPR, readlink EXPR, readlink, readpipe EXPR, recv SOCKET,SCALAR,LENGTH,FLAGS, redo LABEL, redo, ref EXPR, ref, rename OLDNAME,NEWNAME, require VERSION, require EXPR, @@ -1027,21 +1027,23 @@ B<-w>, B<-W>, B<-X> =item use strict -=item Looking at data and -w +=item Looking at data and -w and w + +=item help =item Stepping through code =item Placeholder for a, w, t, T -=item Regular expressions +=item REGULAR EXPRESSIONS -=item Some ideas for output +=item OUTPUT TIPS =item CGI =item GUIs -=item Summary +=item SUMMARY =item SEE ALSO @@ -1385,36 +1387,6 @@ and optimizing the final combined regexp =back -=head2 perlref - Perl references and nested data structures - -=over - -=item NOTE - -=item DESCRIPTION - -=over - -=item Making References - -=item Using References - -=item Symbolic references - -=item Not-so-symbolic references - -=item Pseudo-hashes: Using an array as a hash - -=item Function Templates - -=back - -=item WARNING - -=item SEE ALSO - -=back - =head2 perlre - Perl regular expressions =over @@ -1461,6 +1433,36 @@ C<(?(condition)yes-pattern|no-pattern)> =back +=head2 perlref - Perl references and nested data structures + +=over + +=item NOTE + +=item DESCRIPTION + +=over + +=item Making References + +=item Using References + +=item Symbolic references + +=item Not-so-symbolic references + +=item Pseudo-hashes: Using an array as a hash + +=item Function Templates + +=back + +=item WARNING + +=item SEE ALSO + +=back + =head2 perlform - Perl formats =over @@ -1487,108 +1489,217 @@ C<(?(condition)yes-pattern|no-pattern)> =back -=head2 perllocale - Perl locale handling (internationalization and -localization) +=head2 perlboot - Beginner's Object-Oriented Tutorial =over =item DESCRIPTION -=item PREPARING TO USE LOCALES +=over -=item USING LOCALES +=item If we could talk to the animals... 
+ +=item Introducing the method invocation arrow + +=item Invoking a barnyard + +=item The extra parameter of method invocation + +=item Calling a second method to simplify things + +=item Inheriting the windpipes + +=item A few notes about @ISA + +=item Overriding the methods + +=item Starting the search from a different place + +=item The SUPER way of doing things + +=item Where we're at so far... + +=item A horse is a horse, of course of course -- or is it? + +=item Invoking an instance method + +=item Accessing the instance data + +=item How to build a horse + +=item Inheriting the constructor + +=item Making a method work with either classes or instances + +=item Adding parameters to a method + +=item More interesting instances + +=item A horse of a different color + +=item Summary + +=back + +=item SEE ALSO + +=item COPYRIGHT + +=back + +=head2 perltoot - Tom's object-oriented tutorial for perl =over -=item The use locale pragma +=item DESCRIPTION -=item The setlocale function +=item Creating a Class -=item Finding locales +=over -=item LOCALE PROBLEMS +=item Object Representation -=item Temporarily fixing locale problems +=item Class Interface -=item Permanently fixing locale problems +=item Constructors and Instance Methods -=item Permanently fixing your system's locale configuration +=item Planning for the Future: Better Constructors -=item Fixing system locale configuration +=item Destructors -=item The localeconv function +=item Other Object Methods =back -=item LOCALE CATEGORIES +=item Class Data =over -=item Category LC_COLLATE: Collation +=item Accessing Class Data -=item Category LC_CTYPE: Character Types +=item Debugging Methods -=item Category LC_NUMERIC: Numeric Formatting +=item Class Destructors -=item Category LC_MONETARY: Formatting of monetary amounts +=item Documenting the Interface -=item LC_TIME +=back -=item Other categories +=item Aggregation + +=item Inheritance + +=over + +=item Overridden Methods + +=item Multiple Inheritance + +=item 
UNIVERSAL: The Root of All Objects =back -=item SECURITY +=item Alternate Object Representations -B (C, C, C, C and C):, -B (with C<\l>, C<\L>, C<\u> or C<\U>), -B (C):, B (C):, -B (printf() and write()):, B (lc(), lcfirst(), uc(), ucfirst()):, B (localeconv(), strcoll(), strftime(), strxfrm()):, B (isalnum(), isalpha(), isdigit(), isgraph(), -islower(), isprint(), ispunct(), isspace(), isupper(), isxdigit()): +=over -=item ENVIRONMENT +=item Arrays as Objects -PERL_BADLANG, LC_ALL, LANGUAGE, LC_CTYPE, LC_COLLATE, LC_MONETARY, -LC_NUMERIC, LC_TIME, LANG +=item Closures as Objects + +=back + +=item AUTOLOAD: Proxy Methods + +=over + +=item Autoloaded Data Methods + +=item Inherited Autoloaded Data Methods + +=back + +=item Metaclassical Tools + +=over + +=item Class::Struct + +=item Data Members as Variables =item NOTES +=item Object Terminology + +=back + +=item SEE ALSO + +=item AUTHOR AND COPYRIGHT + +=item COPYRIGHT + =over -=item Backward compatibility +=item Acknowledgments -=item I18N:Collate obsolete +=back -=item Sort speed and memory use impacts +=back -=item write() and LC_NUMERIC +=head2 perltootc - Tom's OO Tutorial for Class Data in Perl -=item Freely available locale definitions +=over -=item I18n and l10n +=item DESCRIPTION -=item An imperfect standard +=item Class Data as Package Variables + +=over + +=item Putting All Your Eggs in One Basket + +=item Inheritance Concerns + +=item The Eponymous Meta-Object + +=item Indirect References to Class Data + +=item Monadic Classes + +=item Translucent Attributes =back -=item BUGS +=item Class Data as Lexical Variables =over -=item Broken systems +=item Privacy and Responsibility + +=item File-Scoped Lexicals + +=item More Inheritance Concerns + +=item Locking the Door and Throwing Away the Key + +=item Translucency Revisited =back +=item NOTES + =item SEE ALSO +=item AUTHOR AND COPYRIGHT + +=item ACKNOWLEDGEMENTS + =item HISTORY =back -=head2 perlunicode - Unicode support in Perl +=head2 perlobj - Perl 
objects =over @@ -1596,23 +1707,102 @@ LC_NUMERIC, LC_TIME, LANG =over -=item Important Caveat +=item An Object is Simply a Reference -Input and Output Disciplines, Regular Expressions, C still needed -to enable a few features +=item A Class is Simply a Package -=item Byte and Character semantics +=item A Method is Simply a Subroutine -=item Effects of character semantics +=item Method Invocation -=item Character encodings for input and output +=item WARNING + +=item Default UNIVERSAL methods + +isa(CLASS), can(METHOD), VERSION( [NEED] ) + +=item Destructors + +=item Summary + +=item Two-Phased Garbage Collection =back -=item CAVEATS +=item SEE ALSO + +=back + +=head2 perlbot - Bag'o Object Tricks (the BOT) + +=over + +=item DESCRIPTION + +=item OO SCALING TIPS + +=item INSTANCE VARIABLES + +=item SCALAR INSTANCE VARIABLES + +=item INSTANCE VARIABLE INHERITANCE + +=item OBJECT RELATIONSHIPS + +=item OVERRIDING SUPERCLASS METHODS + +=item USING RELATIONSHIP WITH SDBM + +=item THINKING OF CODE REUSE + +=item CLASS CONTEXT AND THE OBJECT + +=item INHERITING A CONSTRUCTOR + +=item DELEGATION + +=back + +=head2 perltie - how to hide an object class in a simple variable + +=over + +=item SYNOPSIS + +=item DESCRIPTION + +=over + +=item Tying Scalars + +TIESCALAR classname, LIST, FETCH this, STORE this, value, DESTROY this + +=item Tying Arrays + +TIEARRAY classname, LIST, FETCH this, index, STORE this, index, value, +DESTROY this + +=item Tying Hashes + +USER, HOME, CLOBBER, LIST, TIEHASH classname, LIST, FETCH this, key, STORE +this, key, value, DELETE this, key, CLEAR this, EXISTS this, key, FIRSTKEY +this, NEXTKEY this, lastkey, DESTROY this + +=item Tying FileHandles + +TIEHANDLE classname, LIST, WRITE this, LIST, PRINT this, LIST, PRINTF this, +LIST, READ this, LIST, READLINE this, GETC this, CLOSE this, DESTROY this + +=item The C Gotcha + +=back =item SEE ALSO +=item BUGS + +=item AUTHOR + =back =head2 perlipc - Perl interprocess communication (signals, fifos, 
pipes, @@ -1968,372 +2158,274 @@ select RBITS,WBITS,EBITS,TIMEOUT, semctl ID,SEMNUM,CMD,ARG, semget KEY,NSEMS,FLAGS, semop KEY,OPSTRING, setgrent, setpgrp PID,PGRP, setpriority WHICH,WHO,PRIORITY, setpwent, setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL, shmctl ID,CMD,ARG, shmget KEY,SIZE,FLAGS, -shmread ID,VAR,POS,SIZE, shmwrite ID,STRING,POS,SIZE, socketpair -SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL, stat FILEHANDLE, stat EXPR, stat, -symlink OLDFILE,NEWFILE, syscall LIST, sysopen -FILEHANDLE,FILENAME,MODE,PERMS, system LIST, times, truncate -FILEHANDLE,LENGTH, truncate EXPR,LENGTH, umask EXPR, umask, utime LIST, -wait, waitpid PID,FLAGS - -=back - -=item CHANGES - -v1.47, 22 March 2000, v1.46, 12 February 2000, v1.45, 20 December 1999, -v1.44, 19 July 1999, v1.43, 24 May 1999, v1.42, 22 May 1999, v1.41, 19 May -1999, v1.40, 11 April 1999, v1.39, 11 February 1999, v1.38, 31 December -1998, v1.37, 19 December 1998, v1.36, 9 September 1998, v1.35, 13 August -1998, v1.33, 06 August 1998, v1.32, 05 August 1998, v1.30, 03 August 1998, -v1.23, 10 July 1998 - -=item Supported Platforms - -=item SEE ALSO - -=item AUTHORS / CONTRIBUTORS - -=item VERSION - -=back - -=head2 perlsec - Perl security - -=over - -=item DESCRIPTION - -=over - -=item Laundering and Detecting Tainted Data - -=item Switches On the "#!" Line - -=item Cleaning Up Your Path - -=item Security Bugs - -=item Protecting Your Programs - -=back - -=item SEE ALSO - -=back - -=head2 perlboot - Beginner's Object-Oriented Tutorial - -=over - -=item DESCRIPTION - -=over - -=item If we could talk to the animals... - -=item Introducing the method invocation arrow - -=item Invoking a barnyard - -=item The extra parameter of method invocation - -=item Calling a second method to simplify things - -=item Inheriting the windpipes - -=item A few notes about @ISA - -=item Overriding the methods - -=item Starting the search from a different place - -=item The SUPER way of doing things - -=item Where we're at so far... 
- -=item A horse is a horse, of course of course -- or is it? - -=item Invoking an instance method - -=item Accessing the instance data +shmread ID,VAR,POS,SIZE, shmwrite ID,STRING,POS,SIZE, socketpair +SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL, stat FILEHANDLE, stat EXPR, stat, +symlink OLDFILE,NEWFILE, syscall LIST, sysopen +FILEHANDLE,FILENAME,MODE,PERMS, system LIST, times, truncate +FILEHANDLE,LENGTH, truncate EXPR,LENGTH, umask EXPR, umask, utime LIST, +wait, waitpid PID,FLAGS -=item How to build a horse +=back -=item Inheriting the constructor +=item CHANGES -=item Making a method work with either classes or instances +v1.47, 22 March 2000, v1.46, 12 February 2000, v1.45, 20 December 1999, +v1.44, 19 July 1999, v1.43, 24 May 1999, v1.42, 22 May 1999, v1.41, 19 May +1999, v1.40, 11 April 1999, v1.39, 11 February 1999, v1.38, 31 December +1998, v1.37, 19 December 1998, v1.36, 9 September 1998, v1.35, 13 August +1998, v1.33, 06 August 1998, v1.32, 05 August 1998, v1.30, 03 August 1998, +v1.23, 10 July 1998 -=item Adding parameters to a method +=item Supported Platforms -=item More interesting instances +=item SEE ALSO -=item A horse of a different color +=item AUTHORS / CONTRIBUTORS -=item Summary +=item VERSION =back -=item SEE ALSO +=head2 perllocale - Perl locale handling (internationalization and +localization) -=item COPYRIGHT +=over -=back +=item DESCRIPTION -=head2 perltoot - Tom's object-oriented tutorial for perl +=item PREPARING TO USE LOCALES + +=item USING LOCALES =over -=item DESCRIPTION +=item The use locale pragma -=item Creating a Class +=item The setlocale function -=over +=item Finding locales -=item Object Representation +=item LOCALE PROBLEMS -=item Class Interface +=item Temporarily fixing locale problems -=item Constructors and Instance Methods +=item Permanently fixing locale problems -=item Planning for the Future: Better Constructors +=item Permanently fixing your system's locale configuration -=item Destructors +=item Fixing system locale 
configuration -=item Other Object Methods +=item The localeconv function =back -=item Class Data +=item LOCALE CATEGORIES =over -=item Accessing Class Data - -=item Debugging Methods +=item Category LC_COLLATE: Collation -=item Class Destructors +=item Category LC_CTYPE: Character Types -=item Documenting the Interface +=item Category LC_NUMERIC: Numeric Formatting -=back +=item Category LC_MONETARY: Formatting of monetary amounts -=item Aggregation +=item LC_TIME -=item Inheritance +=item Other categories -=over +=back -=item Overridden Methods +=item SECURITY -=item Multiple Inheritance +B (C, C, C, C and C):, +B (with C<\l>, C<\L>, C<\u> or C<\U>), +B (C):, B (C):, +B (printf() and write()):, B (lc(), lcfirst(), uc(), ucfirst()):, B (localeconv(), strcoll(), strftime(), strxfrm()):, B (isalnum(), isalpha(), isdigit(), isgraph(), +islower(), isprint(), ispunct(), isspace(), isupper(), isxdigit()): -=item UNIVERSAL: The Root of All Objects +=item ENVIRONMENT -=back +PERL_BADLANG, LC_ALL, LANGUAGE, LC_CTYPE, LC_COLLATE, LC_MONETARY, +LC_NUMERIC, LC_TIME, LANG -=item Alternate Object Representations +=item NOTES =over -=item Arrays as Objects +=item Backward compatibility -=item Closures as Objects +=item I18N:Collate obsolete -=back +=item Sort speed and memory use impacts -=item AUTOLOAD: Proxy Methods +=item write() and LC_NUMERIC -=over +=item Freely available locale definitions -=item Autoloaded Data Methods +=item I18n and l10n -=item Inherited Autoloaded Data Methods +=item An imperfect standard =back -=item Metaclassical Tools +=item BUGS =over -=item Class::Struct - -=item Data Members as Variables - -=item NOTES - -=item Object Terminology +=item Broken systems =back =item SEE ALSO -=item AUTHOR AND COPYRIGHT - -=item COPYRIGHT - -=over - -=item Acknowledgments - -=back +=item HISTORY =back -=head2 perltootc - Tom's OO Tutorial for Class Data in Perl +=head2 perlunicode - Unicode support in Perl =over =item DESCRIPTION -=item Class Data as Package 
Variables - =over -=item Putting All Your Eggs in One Basket +=item Important Caveat -=item Inheritance Concerns +Input and Output Disciplines, Regular Expressions, C still needed +to enable a few features -=item The Eponymous Meta-Object +=item Byte and Character semantics -=item Indirect References to Class Data +=item Effects of character semantics -=item Monadic Classes +=item Character encodings for input and output -=item Translucent Attributes +=back + +=item CAVEATS + +=item SEE ALSO =back -=item Class Data as Lexical Variables +=head2 perlebcdic - Considerations for running Perl on EBCDIC platforms =over -=item Privacy and Responsibility +=item DESCRIPTION -=item File-Scoped Lexicals +=item COMMON CHARACTER CODE SETS -=item More Inheritance Concerns +=over -=item Locking the Door and Throwing Away the Key +=item ASCII -=item Translucency Revisited +=item ISO 8859 -=back +=item Latin 1 (ISO 8859-1) -=item NOTES +=item EBCDIC -=item SEE ALSO +=item 13 variant characters -=item AUTHOR AND COPYRIGHT +=item 0037 -=item ACKNOWLEDGEMENTS +=item 1047 -=item HISTORY +=item POSIX-BC =back -=head2 perlobj - Perl objects - -=over +=item SINGLE OCTET TABLES -=item DESCRIPTION +recipe 0, recipe 1, recipe 2, recipe 3, recipe 4 -=over +=item IDENTIFYING CHARACTER CODE SETS -=item An Object is Simply a Reference +=item CONVERSIONS -=item A Class is Simply a Package +=item OPERATOR DIFFERENCES -=item A Method is Simply a Subroutine +=item FUNCTION DIFFERENCES -=item Method Invocation +chr(), ord(), pack(), print(), printf(), sort(), sprintf(), unpack() -=item WARNING +=item REGULAR EXPRESSION DIFFERENCES -=item Default UNIVERSAL methods +=item SOCKETS -isa(CLASS), can(METHOD), VERSION( [NEED] ) +=item SORTING -=item Destructors +=over -=item Summary +=item Ignore ASCII vs EBCDIC sort differences. -=item Two-Phased Garbage Collection +=item MONOCASE then sort data. -=back +=item Convert, sort data, then reconvert. 
-=item SEE ALSO +=item Perform sorting on one type of machine only. =back -=head2 perlbot - Bag'o Object Tricks (the BOT) +=item URL ENCODING and DECODING -=over +=item I18N AND L10N -=item DESCRIPTION +=item MULTI OCTET CHARACTER SETS -=item OO SCALING TIPS +=item OS ISSUES -=item INSTANCE VARIABLES +=over -=item SCALAR INSTANCE VARIABLES +=item OS/400 -=item INSTANCE VARIABLE INHERITANCE +IFS access -=item OBJECT RELATIONSHIPS +=item OS/390 -=item OVERRIDING SUPERCLASS METHODS +dataset access, locales -=item USING RELATIONSHIP WITH SDBM +=item VM/ESA? -=item THINKING OF CODE REUSE +=item POSIX-BC? -=item CLASS CONTEXT AND THE OBJECT +=back -=item INHERITING A CONSTRUCTOR +=item REFERENCES -=item DELEGATION +=item AUTHOR =back -=head2 perltie - how to hide an object class in a simple variable +=head2 perlsec - Perl security =over -=item SYNOPSIS - =item DESCRIPTION =over -=item Tying Scalars - -TIESCALAR classname, LIST, FETCH this, STORE this, value, DESTROY this - -=item Tying Arrays - -TIEARRAY classname, LIST, FETCH this, index, STORE this, index, value, -DESTROY this - -=item Tying Hashes +=item Laundering and Detecting Tainted Data -USER, HOME, CLOBBER, LIST, TIEHASH classname, LIST, FETCH this, key, STORE -this, key, value, DELETE this, key, CLEAR this, EXISTS this, key, FIRSTKEY -this, NEXTKEY this, lastkey, DESTROY this +=item Switches On the "#!" Line -=item Tying FileHandles +=item Cleaning Up Your Path -TIEHANDLE classname, LIST, WRITE this, LIST, PRINT this, LIST, PRINTF this, -LIST, READ this, LIST, READLINE this, GETC this, CLOSE this, DESTROY this +=item Security Bugs -=item The C Gotcha +=item Protecting Your Programs =back =item SEE ALSO -=item BUGS - -=item AUTHOR - =back =head2 perlmod - Perl modules (packages and symbol tables) @@ -3251,8 +3343,6 @@ complete? =item How do I fork a daemon process? -=item How do I make my program run with sh and csh? - =item How do I find out if I'm running interactively or not? 
=item How do I timeout a slow event? @@ -5702,6 +5792,8 @@ F, F, F, F =item Segfault in make +=item op/sprintf test failure + =back =item Specific (mis)features of OS/2 port @@ -7694,8 +7786,9 @@ C, C =item c C, C, C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C +C, C, C, C, C, +C, C, C, C, C, C, +C =item C @@ -7751,9 +7844,9 @@ C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, -C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, +C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, @@ -7783,8 +7876,8 @@ C, C, C, C, C, C =item g -C, C, C, C, C, -C, C, C, C, C +C, C, C, C, C, +C, C, C, C, C, C =item h @@ -7795,26 +7888,26 @@ C, C, C, C, C C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, -C, C, C, -C, C, C, C, C, -C, C, C, -C, C, C, C, -C, C, C, -C, C, C, C, -C, C, C +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, C, +C, C, C, C, +C, C, C, C, +C, C, C, C, +C, C, C, C, +C, C, C, C, +C, C, C, +C, C, C, C, C =item k @@ -7822,12 +7915,12 @@ C, C =item l -C, C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C, C, -C, C, C, C, C, C, -C +C, C, C, C, C, +C, C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, +C, C, C, C, C, C, +C, C, C, C, C, C =item m @@ -11689,31 +11782,6 @@ I|I =back -=head2 Pod::PlainText - Convert POD data to formatted ASCII text - -=over - -=item SYNOPSIS - -=item DESCRIPTION - -alt, indent, loose, sentence, width - -=item DIAGNOSTICS - -Bizarre space in item, Can't open %s for reading: %s, Unknown escape: %s, -Unknown sequence: %s, Unmatched =back - -=item RESTRICTIONS - -=item NOTES - -=item SEE ALSO - -=item AUTHOR - -=back - 
=head2 Pod::Plainer - Perl extension for converting Pod to old style Pod. =over @@ -12038,6 +12106,12 @@ Memory, CPU, Snooping, Signals, State Changes =item DESCRIPTION +=over + +=item OBJECT ORIENTED SYNTAX + +=back + =item AUTHOR =back