lib/Pod/t/pod.t podlators test
lib/Pod/t/Select.t See if Pod::Select works
lib/Pod/t/termcap.t podlators test
+lib/Pod/t/text-encoding.t podlators test
lib/Pod/t/text-options.t podlators test
lib/Pod/t/text-utf8.t podlators test
lib/Pod/t/text.t podlators test
@ISA = qw(Pod::Simple);
-$VERSION = '2.20';
+$VERSION = '2.21';
# Set the debugging level. If someone has inserted a debug function into this
# class already, use that. Otherwise, use any Pod::Simple debug function
return;
}
+ # If we were given the utf8 option, set an output encoding on our file
+ # handle. Wrap in an eval in case we're using a version of Perl too old
+ # to understand this.
+ #
+ # This is evil because it changes the global state of a file handle that
+ # we may not own. However, we can't just blindly encode all output, since
+ # there may be a pre-applied output encoding (such as from PERL_UNICODE)
+ # and then we would double-encode. This seems to be the least bad
+ # approach.
+ if ($$self{utf8}) {
+ eval { binmode ($$self{output_fh}, ':encoding(UTF-8)') };
+ }
+
# Determine information for the preamble and then output it.
my ($name, $section);
if (defined $$self{name}) {
by many implementations and may even result in segfaults and other bad
behavior.
+Be aware that, when using this option, the input encoding of your POD
+source must be properly declared unless it is US-ASCII or Latin-1. POD
+input without an C<=encoding> command will be assumed to be in Latin-1,
+and if it's actually in UTF-8, the output will be double-encoded. See
+L<perlpod(1)> for more information on the C<=encoding> command.
+
=back
The standard Pod::Simple method parse_file() takes one argument naming the
=head1 BUGS
+Encoding handling assumes that PerlIO is available and does not work
+properly if it isn't. The C<utf8> option is therefore not supported
+unless Perl is built with PerlIO support.
+
There is currently no way to turn off the guesswork that tries to format
unmarked text appropriately, and sometimes it isn't wanted (particularly
when using POD to document something other than Perl). Most of the work
-towards fixing this has now been done, however, and all that's still needed
+toward fixing this has now been done, however, and all that's still needed
is a user interface.
The NAME section should be recognized specially and index entries emitted
=head1 CAVEATS
+If Pod::Man is given the C<utf8> option, the encoding of its output file
+handle will be forced to UTF-8 if possible, overriding any existing
+encoding. This will be done even if the file handle is not created by
+Pod::Man and was passed in from outside. This maintains consistency
+regardless of PERL_UNICODE and other settings.
+
The handling of hyphens and em dashes is somewhat fragile, and one may get
the wrong one under some circumstances. This should only matter for
B<troff> output.
@ISA = qw(Exporter);
@EXPORT = qw(parselink);
-$VERSION = 1.08;
+$VERSION = '1.09';
##############################################################################
# Implementation
formatting codes. Any double quotes around the section are removed as part
of the parsing, as is any leading or trailing whitespace.
-If the text of the LE<lt>E<gt> escape is entirely enclosed in double quotes,
-it's interpreted as a link to a section for backwards compatibility.
+If the text of the LE<lt>E<gt> escape is entirely enclosed in double
+quotes, it's interpreted as a link to a section for backward
+compatibility.
No attempt is made to resolve formatting codes. This must be done after
calling parselink() (since EE<lt>E<gt> formatting codes can be used to
# We have to export pod2text for backward compatibility.
@EXPORT = qw(pod2text);
-$VERSION = 3.11;
+$VERSION = '3.12';
##############################################################################
# Initialization
}
# Output text to the output device. Replace non-breaking spaces with spaces
-# and soft hyphens with nothing.
+# and soft hyphens with nothing, and then try to fix the output encoding if
+# necessary to match the input encoding unless UTF-8 output is forced. This
+# preserves the traditional pass-through behavior of Pod::Text.
sub output {
my ($self, $text) = @_;
$text =~ tr/\240\255/ /d;
+ unless ($$self{opt_utf8} || $$self{CHECKED_ENCODING}) {
+ my $encoding = $$self{encoding} || '';
+ if ($encoding) {
+ eval { binmode ($$self{output_fh}, ":encoding($encoding)") };
+ }
+ $$self{CHECKED_ENCODING} = 1;
+ }
print { $$self{output_fh} } $text;
}
$$self{MARGIN} = $margin; # Default left margin.
$$self{PENDING} = [[]]; # Pending output.
+ # We have to redo encoding handling for each document.
+ delete $$self{CHECKED_ENCODING};
+
+ # If we were given the utf8 option, set an output encoding on our file
+ # handle. Wrap in an eval in case we're using a version of Perl too old
+ # to understand this.
+ #
+ # This is evil because it changes the global state of a file handle that
+ # we may not own. However, we can't just blindly encode all output, since
+ # there may be a pre-applied output encoding (such as from PERL_UNICODE)
+ # and then we would double-encode. This seems to be the least bad
+ # approach.
+ if ($$self{opt_utf8}) {
+ eval { binmode ($$self{output_fh}, ':encoding(UTF-8)') };
+ }
+
return '';
}
Pod::Text - Convert POD data to formatted ASCII text
=for stopwords
-alt stderr Allbery Sean Burke's Christiansen
+alt stderr Allbery Sean Burke's Christiansen UTF-8 pre-Unicode utf8
=head1 SYNOPSIS
Send error messages about invalid POD to standard error instead of
appending a POD ERRORS section to the generated output.
+=item utf8
+
+By default, Pod::Text uses the same output encoding as the input encoding
+of the POD source (provided that Perl was built with PerlIO; otherwise, it
+doesn't encode its output). If this option is given, the output encoding
+is forced to UTF-8.
+
+Be aware that, when using this option, the input encoding of your POD
+source must be properly declared unless it is US-ASCII or Latin-1. POD
+input without an C<=encoding> command will be assumed to be in Latin-1,
+and if it's actually in UTF-8, the output will be double-encoded. See
+L<perlpod(1)> for more information on the C<=encoding> command.
+
=item width
The column at which to wrap text on the right-hand side. Defaults to 76.
=back
+=head1 BUGS
+
+Encoding handling assumes that PerlIO is available and does not work
+properly if it isn't. The C<utf8> option is therefore not supported
+unless Perl is built with PerlIO support.
+
+=head1 CAVEATS
+
+If Pod::Text is given the C<utf8> option, the encoding of its output file
+handle will be forced to UTF-8 if possible, overriding any existing
+encoding. This will be done even if the file handle is not created by
+Pod::Text and was passed in from outside. This maintains consistency
+regardless of PERL_UNICODE and other settings.
+
+If the C<utf8> option is not given, the encoding of the output file handle
+will be forced to the detected encoding of the input POD, which preserves
+whatever the input text is. This ensures backward compatibility with
+earlier, pre-Unicode versions of this module, without large numbers of
+Perl warnings.
+
+This is not ideal, but it seems to be the best compromise. If it doesn't
+work for you, please let me know the details of how it broke.
+
=head1 NOTES
This is a replacement for an earlier Pod::Text module written by Tom
=head1 SEE ALSO
-L<Pod::Simple>, L<Pod::Text::Termcap>, L<pod2text(1)>
+L<Pod::Simple>, L<Pod::Text::Termcap>, L<perlpod(1)>, L<pod2text(1)>
The current version of this module is always available from its web site at
L<http://www.eyrie.org/~eagle/software/podlators/>. It is also part of the
@ISA = qw(Pod::Text);
-$VERSION = 2.04;
+$VERSION = '2.05';
##############################################################################
# Overrides
@ISA = qw(Pod::Text);
-$VERSION = 2.02;
+$VERSION = '2.03';
##############################################################################
# Overrides
@ISA = qw(Pod::Text);
-$VERSION = 2.04;
+$VERSION = '2.05';
##############################################################################
# Overrides
my $n = 2;
eval { binmode (\*DATA, ':encoding(utf-8)') };
+eval { binmode (\*STDOUT, ':encoding(utf-8)') };
while (<DATA>) {
my %options;
next until $_ eq "###\n";
close TMP;
my $parser = Pod::Man->new (%options) or die "Cannot create parser\n";
open (OUT, '> out.tmp') or die "Cannot create out.tmp: $!\n";
- eval { binmode (\*OUT, ':encoding(utf-8)') };
$parser->parse_from_file ('tmp.pod', \*OUT);
close OUT;
my $accents = 0;
#!/usr/bin/perl
#
# t/pod-spelling.t -- Test POD spelling.
+#
+# Copyright 2008 Russ Allbery <rra@stanford.edu>
+#
+# This program is free software; you may redistribute it and/or modify it
+# under the same terms as Perl itself.
# Called to skip all tests with a reason.
sub skip_all {
- print "1..1\n";
- print "ok 1 # skip - @_\n";
+ print "1..0 # Skipped: @_\n";
exit;
}
+# Skip all spelling tests unless flagged to run maintainer tests.
+skip_all "Spelling tests only run for maintainer"
+ unless $ENV{RRA_MAINTAINER_TESTS};
+
# Make sure we have prerequisites. hunspell is currently not supported due to
# lack of support for contractions.
eval 'use Test::Pod 1.00';
--- /dev/null
+#!/usr/bin/perl -w
+#
+# text-encoding.t -- Test Pod::Text with various weird encoding combinations.
+#
+# Copyright 2002, 2004, 2006, 2007, 2008 by Russ Allbery <rra@stanford.edu>
+#
+# This program is free software; you may redistribute it and/or modify it
+# under the same terms as Perl itself.
+
+BEGIN {
+ chdir 't' if -d 't';
+ if ($ENV{PERL_CORE}) {
+ @INC = '../lib';
+ } else {
+ unshift (@INC, '../blib/lib');
+ }
+ $| = 1;
+ print "1..4\n";
+
+ # PerlIO encoding support requires Perl 5.8 or later.
+ if ($] < 5.008) {
+ my $n;
+ for $n (1..4) {
+ print "ok $n # skip -- Perl 5.8 required for UTF-8 support\n";
+ }
+ exit;
+ }
+}
+
+END {
+ print "not ok 1\n" unless $loaded;
+}
+
+use Pod::Text;
+
+$loaded = 1;
+print "ok 1\n";
+
+my $n = 2;
+eval { binmode (\*DATA, ':raw') };
+eval { binmode (\*STDOUT, ':raw') };
+while (<DATA>) {
+ my %opts;
+ $opts{utf8} = 1 if $n == 4;
+ my $parser = Pod::Text->new (%opts) or die "Cannot create parser\n";
+ next until $_ eq "###\n";
+ open (TMP, '> tmp.pod') or die "Cannot create tmp.pod: $!\n";
+ eval { binmode (\*TMP, ':raw') };
+ while (<DATA>) {
+ last if $_ eq "###\n";
+ print TMP $_;
+ }
+ close TMP;
+ open (OUT, '> out.tmp') or die "Cannot create out.tmp: $!\n";
+ $parser->parse_from_file ('tmp.pod', \*OUT);
+ close OUT;
+ open (TMP, 'out.tmp') or die "Cannot open out.tmp: $!\n";
+ eval { binmode (\*TMP, ':raw') };
+ my $output;
+ {
+ local $/;
+ $output = <TMP>;
+ }
+ close TMP;
+ unlink ('tmp.pod', 'out.tmp');
+ my $expected = '';
+ while (<DATA>) {
+ last if $_ eq "###\n";
+ $expected .= $_;
+ }
+ if ($output eq $expected) {
+ print "ok $n\n";
+ } else {
+ print "not ok $n\n";
+ print "Expected\n========\n$expected\nOutput\n======\n$output\n";
+ }
+ $n++;
+}
+
+# Below the marker are bits of POD and corresponding expected text output.
+# This is used to test specific features or problems with Pod::Text. The
+# input and output are separated by lines containing only ###.
+
+__DATA__
+
+###
+=head1 Test of SE<lt>E<gt>
+
+This is S<some whitespace>.
+###
+Test of S<>
+ This is some whitespace.
+
+###
+
+###
+=encoding utf-8
+
+=head1 I can eat glass
+
+=over 4
+
+=item Esperanto
+
+Mi povas manĝi vitron, ĝi ne damaĝas min.
+
+=item Braille
+
+⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
+
+=item Hindi
+
+मैं काँच खा सकता हूँ और मुझे उससे कोई चोट नहीं पहुंचती.
+
+=back
+
+See L<http://www.columbia.edu/kermit/utf8.html>
+###
+I can eat glass
+ Esperanto
+ Mi povas manĝi vitron, ĝi ne damaĝas min.
+
+ Braille
+ ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
+
+ Hindi
+ मैं काँच खा सकता हूँ और मुझे उससे कोई चोट नहीं पहुंचती.
+
+ See <http://www.columbia.edu/kermit/utf8.html>
+
+###
+
+###
+=head1 Beyoncé
+###
+Beyoncé
+###
#
# text-options.t -- Additional tests for Pod::Text options.
#
-# Copyright 2002, 2004, 2006 by Russ Allbery <rra@stanford.edu>
+# Copyright 2002, 2004, 2006, 2008 by Russ Allbery <rra@stanford.edu>
#
# This program is free software; you may redistribute it and/or modify it
# under the same terms as Perl itself.
}
use Pod::Text;
-use Pod::Simple;
$loaded = 1;
print "ok 1\n";
}
close TMP;
open (OUT, '> out.tmp') or die "Cannot create out.tmp: $!\n";
- eval { binmode (\*OUT, ':encoding(utf-8)') };
$parser->parse_from_file ('tmp.pod', \*OUT);
close OUT;
open (TMP, 'out.tmp') or die "Cannot open out.tmp: $!\n";
my $stdin;
@ARGV = map { $_ eq '-' && !$stdin++ ? ('--', $_) : $_ } @ARGV;
-# Parse our options, trying to retain backwards compatibility with pod2man but
+# Parse our options, trying to retain backward compatibility with pod2man but
# allowing short forms as well. --lax is currently ignored.
my %options;
$options{errors} = 'pod';
my $verbose = $options{verbose};
delete $options{verbose};
-# This isn't a valid Pod::Man option and is only accepted for backwards
+# This isn't a valid Pod::Man option and is only accepted for backward
# compatibility.
delete $options{lax};
=item B<-l>, B<--lax>
-No longer used. B<pod2man> used to check its input for validity as a manual
-page, but this should now be done by L<podchecker(1)> instead. Accepted for
-backwards compatibility; this option no longer does anything.
+No longer used. B<pod2man> used to check its input for validity as a
+manual page, but this should now be done by L<podchecker(1)> instead.
+Accepted for backward compatibility; this option no longer does anything.
=item B<-n> I<name>, B<--name>=I<name>
supported by many implementations and may even result in segfaults and
other bad behavior.
+Be aware that, when using this option, the input encoding of your POD
+source must be properly declared unless it is US-ASCII or Latin-1. POD
+input without an C<=encoding> command will be assumed to be in Latin-1,
+and if it's actually in UTF-8, the output will be double-encoded. See
+L<perlpod(1)> for more information on the C<=encoding> command.
+
=item B<-v>, B<--verbose>
Print out the name of each output file as it is being generated.
=head1 SEE ALSO
-L<Pod::Man>, L<Pod::Simple>, L<man(1)>, L<nroff(1)>, L<podchecker(1)>,
-L<troff(1)>, L<man(7)>
+L<Pod::Man>, L<Pod::Simple>, L<man(1)>, L<nroff(1)>, L<perlpod(1)>,
+L<podchecker(1)>, L<troff(1)>, L<man(7)>
The man page documenting the an macro set may be L<man(5)> instead of
L<man(7)> on your system.
Getopt::Long::config ('bundling');
GetOptions (\%options, 'alt|a', 'code', 'color|c', 'help|h', 'indent|i=i',
'loose|l', 'margin|left-margin|m=i', 'overstrike|o',
- 'quotes|q=s', 'sentence|s', 'stderr', 'termcap|t', 'width|w=i')
+ 'quotes|q=s', 'sentence|s', 'stderr', 'termcap|t', 'utf8|u',
+ 'width|w=i')
or exit 1;
pod2usage (1) if $options{help};
pod2text - Convert POD data to formatted ASCII text
=for stopwords
--aclost --alt --stderr Allbery --overstrike overstrike --termcap
+-aclostu --alt --stderr Allbery --overstrike overstrike --termcap --utf8
+UTF-8
=head1 SYNOPSIS
-pod2text [B<-aclost>] [B<--code>] [B<-i> I<indent>] S<[B<-q> I<quotes>]>
+pod2text [B<-aclostu>] [B<--code>] [B<-i> I<indent>] S<[B<-q> I<quotes>]>
[B<--stderr>] S<[B<-w> I<width>]> [I<input> [I<output> ...]]
pod2text B<-h>
your system support termios. With this option, the output of B<pod2text>
will contain terminal control sequences for your current terminal type.
+=item B<-u>, B<--utf8>
+
+By default, B<pod2text> tries to use the same output encoding as its input
+encoding (to be backward-compatible with older versions). This option
+instead forces the output encoding to UTF-8.
+
+Be aware that, when using this option, the input encoding of your POD
+source must be properly declared unless it is US-ASCII or Latin-1. POD
+input without an C<=encoding> command will be assumed to be in Latin-1,
+and if it's actually in UTF-8, the output will be double-encoded. See
+L<perlpod(1)> for more information on the C<=encoding> command.
+
=item B<-w>, B<--width=>I<width>, B<->I<width>
The column at which to wrap text on the right-hand side. Defaults to 76,
=head1 SEE ALSO
L<Pod::Text>, L<Pod::Text::Color>, L<Pod::Text::Overstrike>,
-L<Pod::Text::Termcap>, L<Pod::Simple>
+L<Pod::Text::Termcap>, L<Pod::Simple>, L<perlpod(1)>
The current version of this script is always available from its web site at
L<http://www.eyrie.org/~eagle/software/podlators/>. It is also part of the