is a module or program that converts Pod to some other format (HTML,
plaintext, TeX, PostScript, RTF). A B<Pod processor> might be a
formatter or translator, or might be a program that does something
-else with the Pod (like wordcounting it, scanning for index points,
+else with the Pod (like counting words, scanning for index points,
etc.).
Pod content is contained in B<Pod blocks>. A Pod block starts with a
whitespace character (and there's no "=begin"..."=end" region around).
The "=begin I<identifier>" ... "=end I<identifier>" commands stop
-paragraphs that they surround from being parsed as data or verbatim
+paragraphs that they surround from being parsed as ordinary or verbatim
paragraphs, if I<identifier> doesn't begin with a colon. This
is discussed in detail in the section
L</About Data Paragraphs and "=beginE<sol>=end" Regions>.
# This is the first line of program text.
sub foo { # This is the second.
-It is an error to try to I<start> a Pod black with a "=cut" command. In
+It is an error to try to I<start> a Pod block with a "=cut" command. In
that case, the Pod processor must halt parsing of the input file, and
must by default emit a warning.
=item "=begin formatname"
+=item "=begin formatname parameter"
+
This marks the following paragraphs (until the matching "=end
formatname") as being for some special kind of processing. Unless
"formatname" begins with a colon, the contained non-command
L</About Data Paragraphs and "=beginE<sol>=end" Regions>.
It is advised that formatnames match the regexp
-C<m/\A:?[-a-zA-Z0-9_]+\z/>. Implementors should anticipate future
-expansion in the semantics and syntax of the first parameter
-to "=begin"/"=end"/"=for".
+C<m/\A:?[−a−zA−Z0−9_]+\z/>. Everything following whitespace after the
+formatname is a parameter that may be used by the formatter when dealing
+with this region. This parameter must not be repeated in the "=end"
+paragraph. Implementors should anticipate future expansion in the
+semantics and syntax of the first parameter to "=begin"/"=end"/"=for".
=item "=end formatname"
=item "=encoding encodingname"
This command, which should occur early in the document (at least
-before any non-USASCII data!), declares that this document is
+before any non-US-ASCII data!), declares that this document is
encoded in the encoding I<encodingname>, which must be
-an encoding name that L<Encoding> recognizes. (Encoding's list
-of supported encodings, in L<Encoding::Supported>, is useful here.)
+an encoding name that L<Encode> recognizes. (Encode's list
+of supported encodings, in L<Encode::Supported>, is useful here.)
If the Pod parser cannot decode the declared encoding, it
should emit a warning and may abort parsing the document
altogether.
A document having more than one "=encoding" line should be
considered an error. Pod processors may silently tolerate this if
the not-first "=encoding" lines are just duplicates of the
-first one (e.g., if there's a "=use utf8" line, and later on
-another "=use utf8" line). But Pod processors should complain if
+first one (e.g., if there's a "=encoding utf8" line, and later on
+another "=encoding utf8" line). But Pod processors should complain if
there are contradictory "=encoding" lines in the same document
(e.g., if there is a "=encoding utf8" early in the document and
"=encoding big5" later). Pod processors that recognize BOMs
may also complain if they see an "=encoding" line
-that contradicts the BOM (e.g., if a document with a UTF16LE BOM
-has an "=encoding shiftjis" line).
+that contradicts the BOM (e.g., if a document with a UTF-16LE
+BOM has an "=encoding shiftjis" line).
=back
This formatting code is syntactically simple, but semantically
complex. What it means is that each space in the printable
-content of this code signifies a nonbreaking space.
+content of this code signifies a non-breaking space.
Consider:
Both signify the monospace (c[ode] style) text consisting of
"$x", one space, "?", one space, ":", one space, "$z". The
difference is that in the latter, with the S code, those spaces
-are not "normal" spaces, but instead are nonbreaking spaces.
+are not "normal" spaces, but instead are non-breaking spaces.
=back
would parse as equivalent to this:
- C<$foo-E<lt>bar>
+ C<$foo-E<gt>bar>
instead of as equivalent to a "C" formatting code containing
only "$foo-", and then a "bar>" outside the "C" formatting code. This
0xEF 0xBB 0xBF
=for comment
- If toke.c is modified to support UTF32, add mention of those here.
+ If toke.c is modified to support UTF-32, add mention of those here.
=item *
Pod parsers should not, by default, try to coerce apostrophe (') and
quote (") into smart quotes (little 9's, 66's, 99's, etc), nor try to
turn backtick (`) into anything else but a single backtick character
-(distinct from an openquote character!), nor "--" into anything but
+(distinct from an open quote character!), nor "--" into anything but
two minus signs. They I<must never> do any of those things to text
in CE<lt>...> formatting codes, and never I<ever> to text in verbatim
paragraphs.
=item *
When rendering Pod to a format that has two kinds of hyphens (-), one
-that's a nonbreaking hyphen, and another that's a breakable hyphen
+that's a non-breaking hyphen, and another that's a breakable hyphen
(as in "object-oriented", which can be split across lines as
"object-", newline, "oriented"), formatters are encouraged to
-generally translate "-" to nonbreaking hyphen, but may apply
+generally translate "-" to non-breaking hyphen, but may apply
heuristics to convert some of these to breaking hyphens.
=item *
=item *
-It is up to individual Pod formatter to display good judgment when
+It is up to individual Pod formatter to display good judgement when
confronted with an unrenderable character (which is distinct from an
unknown EE<lt>thing> sequence that the parser couldn't resolve to
anything, renderable or not). It is good practice to map Latin letters
=item *
-Some Pod formatters output to formats that implement nonbreaking
+Some Pod formatters output to formats that implement non-breaking
spaces as an individual character (which I'll call "NBSP"), and
-others output to formats that implement nonbreaking spaces just as
+others output to formats that implement non-breaking spaces just as
spaces wrapped in a "don't break this across lines" code. Note that
at the level of Pod, both sorts of codes can occur: Pod can contain a
NBSP character (whether as a literal, or as a "EE<lt>160>" or
"EE<lt>nbsp>" code); and Pod can contain "SE<lt>foo
IE<lt>barE<gt> baz>" codes, where "mere spaces" (character 32) in
-such codes are taken to represent nonbreaking spaces. Pod
+such codes are taken to represent non-breaking spaces. Pod
parsers should consider supporting the optional parsing of "SE<lt>foo
IE<lt>barE<gt> baz>" as if it were
"fooI<NBSP>IE<lt>barE<gt>I<NBSP>baz", and, going the other way, the
=item Fourth:
The section (AKA "item" in older perlpods), or undef if none. E.g.,
-in L<Getopt::Std/DESCRIPTION>, "DESCRIPTION" is the section. (Note
+in "LE<lt>Getopt::Std/DESCRIPTIONE<gt>", "DESCRIPTION" is the section. (Note
that this is not the same as a manpage section like the "5" in "man 5
crontab". "Section Foo" in the Pod sense means the part of the text
that's introduced by the heading or item whose text is "Foo".)
'url', # what sort of link
"http://www.perl.org/" # original content
+ L<Perl.org|http://www.perl.org/>
+ => "Perl.org", # link text
+ "http://www.perl.org/", # possibly inferred link text
+ "http://www.perl.org/", # name
+ undef, # section
+ 'url', # what sort of link
+ "Perl.org|http://www.perl.org/" # original content
+
Note that you can distinguish URL-links from anything else by the
fact that they match C<m/\A\w+:[^:\s]\S*\z/>. So
C<LE<lt>http://www.perl.comE<gt>> is a URL, but
=item *
-Previous versions of perlpod allowed for a C<LE<lt>sectionE<gt>> syntax
-(as in "C<LE<lt>Object AttributesE<gt>>"), which was not easily distinguishable
-from C<LE<lt>nameE<gt>> syntax. This syntax is no longer in the
-specification, and has been replaced by the C<LE<lt>"section"E<gt>> syntax
-(where the quotes were formerly optional). Pod parsers should tolerate
-the C<LE<lt>sectionE<gt>> syntax, for a while at least. The suggested
-heuristic for distinguishing C<LE<lt>sectionE<gt>> from C<LE<lt>nameE<gt>>
-is that if it contains any whitespace, it's a I<section>. Pod processors
-may warn about this being deprecated syntax.
+Previous versions of perlpod allowed for a C<LE<lt>sectionE<gt>> syntax (as in
+C<LE<lt>Object AttributesE<gt>>), which was not easily distinguishable from
+C<LE<lt>nameE<gt>> syntax and for C<LE<lt>"section"E<gt>> which was only
+slightly less ambiguous. This syntax is no longer in the specification, and
+has been replaced by the C<LE<lt>/sectionE<gt>> syntax (where the slash was
+formerly optional). Pod parsers should tolerate the C<LE<lt>"section"E<gt>>
+syntax, for a while at least. The suggested heuristic for distinguishing
+C<LE<lt>sectionE<gt>> from C<LE<lt>nameE<gt>> is that if it contains any
+whitespace, it's a I<section>. Pod processors should warn about this being
+deprecated syntax.
=back
Ut Enim
-But (for the forseeable future), Pod does not provide any way for Pod
+But (for the foreseeable future), Pod does not provide any way for Pod
authors to distinguish which grouping is meant by the above
"=item"-cluster structure. So formatters should format it like so: