X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlpodspec.pod;h=40e5371fe3a1a8e95c0576f479b1fb00934a47a5;hb=1bf2966364b6356e9050b17d8920dd4a8ce27d97;hp=145309c58f6dc95d1062910446b9fc57283a450d;hpb=210b36aa2e9e009554be8970c3315c2658c0384f;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlpodspec.pod b/pod/perlpodspec.pod
index 145309c..40e5371 100644
--- a/pod/perlpodspec.pod
+++ b/pod/perlpodspec.pod
@@ -238,7 +238,7 @@ ignored. Examples:
   # This is the first line of program text.
   sub foo { # This is the second.
 
-It is an error to try to I<start> a Pod black with a "=cut" command. In
+It is an error to try to I<start> a Pod block with a "=cut" command. In
 that case, the Pod processor must halt parsing of the input file, and
 must by default emit a warning.
 
@@ -332,6 +332,29 @@ then "text..." will constitute a data paragraph. There is no way
 to use "=for formatname text..." to express "text..." as a verbatim
 paragraph.
 
+=item "=encoding encodingname"
+
+This command, which should occur early in the document (at least
+before any non-US-ASCII data!), declares that this document is
+encoded in the encoding I<encodingname>, which must be
+an encoding name that L<Encoding> recognizes. (Encoding's list
+of supported encodings, in L<Encoding::Supported>, is useful here.)
+If the Pod parser cannot decode the declared encoding, it
+should emit a warning and may abort parsing the document
+altogether.
+
+A document having more than one "=encoding" line should be
+considered an error. Pod processors may silently tolerate this if
+the not-first "=encoding" lines are just duplicates of the
+first one (e.g., if there's a "=use utf8" line, and later on
+another "=use utf8" line). But Pod processors should complain if
+there are contradictory "=encoding" lines in the same document
+(e.g., if there is a "=encoding utf8" early in the document and
+"=encoding big5" later). Pod processors that recognize BOMs
+may also complain if they see an "=encoding" line
+that contradicts the BOM (e.g., if a document with a UTF-16LE
+BOM has an "=encoding shiftjis" line).
+
 =back
 
 If a Pod processor sees any command other than the ones listed
@@ -589,7 +612,7 @@ UTF-16. If the file begins with the three literal byte values
  0xEF 0xBB 0xBF
 
 =for comment
- If toke.c is modified to support UTF32, add mention of those here.
+ If toke.c is modified to support UTF-32, add mention of those here.
 
 =item *
 
@@ -611,11 +634,11 @@ is sufficient to establish this file's encoding.
 
 =for comment
  If/WHEN some brave soul makes these heuristics into a generic
- text-file class (or file discipline?), we can presumably delete
+ text-file class (or PerlIO layer?), we can presumably delete
  mention of these icky details from this file, and can instead
- tell people to just use appropriate class/discipline.
+ tell people to just use appropriate class/layer.
  Auto-recognition of newline sequences would be another desirable
- feature of such a class/discipline.
+ feature of such a class/layer.
  HINT HINT HINT.
 
 =for comment
@@ -883,9 +906,9 @@ character) to the escape sequences or codes necessary for conveying such
 sequences in the target output format. A converter to *roff would, for
 example know that "\xE9" (whether conveyed literally, or via a
 EE<lt>...> sequence) is to be conveyed as "e\\*'".
-Similarly, a program rendering Pod in a MacOS application window, would
+Similarly, a program rendering Pod in a Mac OS application window, would
 presumably need to know that "\xE9" maps to codepoint 142 in MacRoman
-encoding that (at time of writing) is native for MacOS. Such
+encoding that (at time of writing) is native for Mac OS. Such
 Unicode2whatever mappings are presumably already widely available for
 common output formats. (Such mappings may be incomplete! Implementers
 are not expected to bend over backwards in an attempt to render
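
Reviewer's sketch (not part of the patch): the "=encoding" rules added in the second hunk could be checked roughly as follows. This is a minimal Python illustration under stated assumptions, not how any existing Pod processor does it. The names check_encoding_lines, canonical and family are hypothetical, and Python's codecs registry stands in for the Perl module the spec names, so the set of recognized encoding names will differ.

```python
import codecs
import re

# BOM byte sequences and the encoding each one implies.
# UTF-32 is deliberately absent, matching the note in the patch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # 0xEF 0xBB 0xBF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # 0xFE 0xFF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # 0xFF 0xFE
]

ENCODING_LINE = re.compile(r"^=encoding[ \t]+(\S+)", re.MULTILINE)


def canonical(name):
    """Normalize an encoding name; None if Python does not know it.

    Assumption: Python's codec registry replaces Perl's encoding lookup.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def family(name):
    """Drop an endianness suffix so "utf-16" and "utf-16-le" compare equal."""
    for suffix in ("-le", "-be"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def check_encoding_lines(raw):
    """Return warning strings for the "=encoding" lines in raw Pod bytes."""
    bom, bom_enc = next(((b, e) for b, e in BOMS if raw.startswith(b)),
                        (b"", None))
    # With a BOM, decode accordingly; otherwise latin-1 keeps every byte
    # readable so the ASCII "=encoding" lines can be found.
    text = raw[len(bom):].decode(bom_enc or "latin-1", errors="replace")
    declared = ENCODING_LINE.findall(text)
    warnings = []
    if not declared:
        return warnings

    first = canonical(declared[0])
    if first is None:
        warnings.append(f"unknown encoding: {declared[0]}")

    # Duplicates of the first declaration are tolerated silently;
    # anything else is reported as contradictory.
    for later in declared[1:]:
        if canonical(later) != first:
            warnings.append(
                f"contradictory =encoding lines: {declared[0]} vs {later}")

    # Optional check for processors that recognize BOMs.
    if bom_enc and first is not None and family(first) != family(bom_enc):
        warnings.append(
            f"=encoding {declared[0]} contradicts the {bom_enc} BOM")
    return warnings
```

With the patch's own examples, a document containing "=encoding utf8" and later "=encoding big5" yields one warning, while two "=encoding utf8" lines yield none. Whether a processor aborts or only warns is left to the caller, since the spec text says it "should emit a warning and may abort".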