X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlpodspec.pod;h=0b60dfd967c4d536ac82551b675a1baf84bed6c5;hb=62703e7218aceb3f5d30f70a2307dd02e5eb8c63;hp=ab20799a89d29277d4f2941d3afefed5c1d31851;hpb=8939ba947b65b018b80ecab3fe1366287d07d1d7;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlpodspec.pod b/pod/perlpodspec.pod
index ab20799..0b60dfd 100644
--- a/pod/perlpodspec.pod
+++ b/pod/perlpodspec.pod
@@ -238,7 +238,7 @@ ignored. Examples:
   # This is the first line of program text.
   sub foo { # This is the second.
 
-It is an error to try to I<start> a Pod black with a "=cut" command. In
+It is an error to try to I<start> a Pod block with a "=cut" command. In
 that case, the Pod processor must halt parsing of the input file, and
 must by default emit a warning.
 
@@ -332,6 +332,29 @@ then "text..." will constitute a data paragraph. There is no way
 to use "=for formatname text..." to express "text..." as a verbatim
 paragraph.
 
+=item "=encoding encodingname"
+
+This command, which should occur early in the document (at least
+before any non-US-ASCII data!), declares that this document is
+encoded in the encoding I<encodingname>, which must be
+an encoding name that L<Encoding> recognizes. (Encoding's list
+of supported encodings, in L<Encoding::Supported>, is useful here.)
+If the Pod parser cannot decode the declared encoding, it
+should emit a warning and may abort parsing the document
+altogether.
+
+A document having more than one "=encoding" line should be
+considered an error. Pod processors may silently tolerate this if
+the not-first "=encoding" lines are just duplicates of the
+first one (e.g., if there's a "=use utf8" line, and later on
+another "=use utf8" line). But Pod processors should complain if
+there are contradictory "=encoding" lines in the same document
+(e.g., if there is a "=encoding utf8" early in the document and
+"=encoding big5" later). Pod processors that recognize BOMs
+may also complain if they see an "=encoding" line
+that contradicts the BOM (e.g., if a document with a UTF-16LE
+BOM has an "=encoding shiftjis" line).
+
 =back
 
 If a Pod processor sees any command other than the ones listed
@@ -463,7 +486,7 @@ L.
 
 This formatting code is syntactically simple, but semantically
 complex. What it means is that each space in the printable
-content of this code signifies a nonbreaking space.
+content of this code signifies a non-breaking space.
 
 Consider:
 
@@ -474,7 +497,7 @@ Consider:
 Both signify the monospace (c[ode] style) text consisting of "$x", one
 space, "?", one space, ":", one space, "$z". The difference is that in
 the latter, with the S code, those spaces
-are not "normal" spaces, but instead are nonbreaking spaces.
+are not "normal" spaces, but instead are non-breaking spaces.
 
 =back
 
@@ -499,7 +522,7 @@ a "-". This was so that this:
 
 would parse as equivalent to this:
 
-  C<$foo-E<lt>bar>
+  C<$foo-E<gt>bar>
 
 instead of as equivalent to a "C" formatting code containing
 only "$foo-", and then a "bar>" outside the "C" formatting code. This
@@ -589,7 +612,7 @@ UTF-16. If the file begins with the three literal byte values
 0xEF 0xBB 0xBF
 
 =for comment
- If toke.c is modified to support UTF32, add mention of those here.
+ If toke.c is modified to support UTF-32, add mention of those here.
 
 =item *
 
@@ -611,11 +634,11 @@ is sufficient to establish this file's encoding.
 
 =for comment
  If/WHEN some brave soul makes these heuristics into a generic
- text-file class (or file discipline?), we can presumably delete
+ text-file class (or PerlIO layer?), we can presumably delete
  mention of these icky details from this file, and can instead
- tell people to just use appropriate class/discipline.
+ tell people to just use appropriate class/layer.
  Auto-recognition of newline sequences would be another desirable
- feature of such a class/discipline.
+ feature of such a class/layer.
  HINT HINT HINT.
 
 =for comment
@@ -709,10 +732,10 @@ paragraphs.
 =item *
 
 When rendering Pod to a format that has two kinds of hyphens (-), one
-that's a nonbreaking hyphen, and another that's a breakable hyphen
+that's a non-breaking hyphen, and another that's a breakable hyphen
 (as in "object-oriented", which can be split across lines as
 "object-", newline, "oriented"), formatters are encouraged to
-generally translate "-" to nonbreaking hyphen, but may apply
+generally translate "-" to non-breaking hyphen, but may apply
 heuristics to convert some of these to breaking hyphens.
 
 =item *
@@ -969,15 +992,15 @@ EE<lt>euro>1,000,000 Solution|Million::Euros>".
 
 =item *
 
-Some Pod formatters output to formats that implement nonbreaking
+Some Pod formatters output to formats that implement non-breaking
 spaces as an individual character (which I'll call "NBSP"), and
-others output to formats that implement nonbreaking spaces just as
+others output to formats that implement non-breaking spaces just as
 spaces wrapped in a "don't break this across lines" code. Note that
 at the level of Pod, both sorts of codes can occur: Pod can contain a
 NBSP character (whether as a literal, or as a "EE<lt>160>" or
 "EE<lt>nbsp>" code); and Pod can contain "SE<lt>foo
 IE<lt>barE<gt> baz>" codes, where "mere spaces" (character 32) in
-such codes are taken to represent nonbreaking spaces. Pod
+such codes are taken to represent non-breaking spaces. Pod
 parsers should consider supporting the optional parsing of "SE<lt>foo
 IE<lt>barE<gt> baz>" as if it were
 "fooI<NBSP>IE<lt>barE<gt>I<NBSP>baz", and, going the other way, the