1 #############################################################################
2 # Pod/Parser.pm -- package which defines a base class for parsing POD docs.
4 # Based on Tom Christiansen's Pod::Text module
5 # (with extensive modifications).
7 # Copyright (C) 1996-1999 Tom Christiansen. All rights reserved.
8 # This file is part of "PodParser". PodParser is free software;
9 # you can redistribute it and/or modify it under the same terms
11 #############################################################################
15 use vars qw($VERSION);
16 $VERSION = 1.081; ## Current version of this package
17 require 5.004; ## requires this Perl version or later
19 #############################################################################
23 Pod::Parser - base class for creating POD filters and translators
30 @ISA = qw(Pod::Parser);
33 my ($parser, $command, $paragraph, $line_num) = @_;
34 ## Interpret the command and its text; sample actions might be:
35 if ($command eq 'head1') { ... }
36 elsif ($command eq 'head2') { ... }
37 ## ... other commands and their actions
38 my $out_fh = $parser->output_handle();
39 my $expansion = $parser->interpolate($paragraph, $line_num);
40 print $out_fh $expansion;
44 my ($parser, $paragraph, $line_num) = @_;
45 ## Format verbatim paragraph; sample actions might be:
46 my $out_fh = $parser->output_handle();
47 print $out_fh $paragraph;
51 my ($parser, $paragraph, $line_num) = @_;
52 ## Translate/Format this block of text; sample actions might be:
53 my $out_fh = $parser->output_handle();
54 my $expansion = $parser->interpolate($paragraph, $line_num);
55 print $out_fh $expansion;
58 sub interior_sequence {
59 my ($parser, $seq_command, $seq_argument) = @_;
60 ## Expand an interior sequence; sample actions might be:
61 return "*$seq_argument*" if ($seq_command eq 'B');
62 return "`$seq_argument'" if ($seq_command eq 'C');
63 return "_${seq_argument}_" if ($seq_command eq 'I');
64 ## ... other sequence commands and their resulting text
69 ## Create a parser object and have it parse file whose name was
70 ## given on the command-line (use STDIN if no files were given).
71 $parser = new MyParser();
72 $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0);
73 for (@ARGV) { $parser->parse_from_file($_); }
77 perl5.004, Pod::InputObjects, Exporter, FileHandle, Carp
85 B<Pod::Parser> is a base class for creating POD filters and translators.
86 It handles most of the effort involved with parsing the POD sections
87 from an input stream, leaving subclasses free to be concerned only with
88 performing the actual translation of text.
90 B<Pod::Parser> parses PODs, and makes method calls to handle the various
91 components of the POD. Subclasses of B<Pod::Parser> override these methods
92 to translate the POD into whatever output format they desire.
96 To create a POD filter for translating POD documentation into some other
97 format, you create a subclass of B<Pod::Parser> which typically overrides
98 just the base class implementation for the following methods:
116 B<interior_sequence()>
120 You may also want to override the B<begin_input()> and B<end_input()>
121 methods for your subclass (to perform any needed per-file and/or
122 per-document initialization or cleanup).
124 If you need to perform any preprocessing of input before it is parsed
125 you may want to override one or more of B<preprocess_line()> and/or
126 B<preprocess_paragraph()>.
128 Sometimes it may be necessary to make more than one pass over the input
129 files. If this is the case you have several options. You can make the
130 first pass using B<Pod::Parser> and override your methods to store the
131 intermediate results in memory somewhere for the B<end_pod()> method to
132 process. You could use B<Pod::Parser> for several passes with an
133 appropriate state variable to control the operation for each pass. If
134 your input source can't be reset to start at the beginning, you can
135 store it in some other structure as a string or an array and have that
136 structure implement a B<getline()> method (which is all that
137 B<parse_from_filehandle()> uses to read input).
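
As a minimal sketch of the last option, the class below serves lines from an in-memory array through a B<getline()> method. The name C<LineBuffer> and its C<rewind()> method are hypothetical and are not part of B<Pod::Parser>; only the B<getline()> contract comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper class (not part of Pod::Parser): serves lines from
# an in-memory array through a getline() method.
package LineBuffer;

sub new {
    my ($class, @lines) = @_;
    return bless { lines => [@lines], pos => 0 }, $class;
}

# Return the next line, or undef once the input is exhausted.
sub getline {
    my ($self) = @_;
    return undef if $self->{pos} >= @{ $self->{lines} };
    return $self->{lines}[ $self->{pos}++ ];
}

# Rewind so that another pass can re-read the same input.
sub rewind {
    my ($self) = @_;
    $self->{pos} = 0;
    return;
}

package main;

my $buf = LineBuffer->new("=head1 NAME\n", "\n", "Demo text\n");

# First pass: count the lines.
my $count = 0;
$count++ while defined $buf->getline;

# Second pass: start over from the beginning.
$buf->rewind;
my $first = $buf->getline;
```

An object of this kind could be handed to B<parse_from_filehandle()> in place of a real filehandle, and rewound between passes.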
139 Feel free to add any member data fields you need to keep track of things
140 like current font, indentation, horizontal or vertical position, or
141 whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA">
142 to avoid name collisions.
144 For the most part, the B<Pod::Parser> base class should be able to
145 do most of the input parsing for you and leave you free to worry about
146 how to interpret the commands and translate the result.
150 #############################################################################
155 use Pod::InputObjects;
161 ## These "variables" are used as local "glob aliases" for performance
162 use vars qw(%myData @input_stack);
164 #############################################################################
166 =head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES
168 B<Pod::Parser> provides several methods which most subclasses will probably
169 want to override. These methods are as follows:
173 ##---------------------------------------------------------------------------
177 $parser->command($cmd,$text,$line_num,$pod_para);
179 This method should be overridden by subclasses to take the appropriate
180 action when a POD command paragraph (denoted by a line beginning with
181 "=") is encountered. When such a POD directive is seen in the input,
182 this method is called and is passed:
188 the name of the command for this POD paragraph
192 the paragraph text for the given POD paragraph command.
196 the line-number of the beginning of the paragraph
200 a reference to a C<Pod::Paragraph> object which contains further
201 information about the paragraph command (see L<Pod::InputObjects>
206 B<Note> that this method I<is> called for C<=pod> paragraphs.
208 The base class implementation of this method simply treats the raw POD
209 command as a normal block of paragraph text (invoking the B<textblock()>
210 method with the command paragraph).
215 my ($self, $cmd, $text, $line_num, $pod_para) = @_;
216 ## Just treat this like a textblock
217 $self->textblock($pod_para->raw_text(), $line_num, $pod_para);
220 ##---------------------------------------------------------------------------
224 $parser->verbatim($text,$line_num,$pod_para);
226 This method may be overridden by subclasses to take the appropriate
227 action when a block of verbatim text is encountered. It is passed the
228 following parameters:
234 the block of text for the verbatim paragraph
238 the line-number of the beginning of the paragraph
242 a reference to a C<Pod::Paragraph> object which contains further
243 information about the paragraph (see L<Pod::InputObjects>
248 The base class implementation of this method simply prints the textblock
249 (unmodified) to the output filehandle.
254 my ($self, $text, $line_num, $pod_para) = @_;
255 my $out_fh = $self->{_OUTPUT};
259 ##---------------------------------------------------------------------------
261 =head1 B<textblock()>
263 $parser->textblock($text,$line_num,$pod_para);
265 This method may be overridden by subclasses to take the appropriate
266 action when a normal block of POD text is encountered (although the base
267 class method will usually do what you want). It is passed the following
274 the block of text for a POD paragraph
278 the line-number of the beginning of the paragraph
282 a reference to a C<Pod::Paragraph> object which contains further
283 information about the paragraph (see L<Pod::InputObjects>
288 In order to process interior sequences, subclass implementations of
289 this method will probably want to invoke either B<interpolate()> or
290 B<parse_text()>, passing it the text block C<$text>, and the corresponding
291 line number in C<$line_num>, and then perform any desired processing upon
294 The base class implementation of this method simply prints the text block
295 as it occurred in the input stream.
300 my ($self, $text, $line_num, $pod_para) = @_;
301 my $out_fh = $self->{_OUTPUT};
302 print $out_fh $self->interpolate($text, $line_num);
305 ##---------------------------------------------------------------------------
307 =head1 B<interior_sequence()>
309 $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq);
311 This method should be overridden by subclasses to take the appropriate
312 action when an interior sequence is encountered. An interior sequence is
313 an embedded command within a block of text which appears as a command
314 name (usually a single uppercase character) followed immediately by a
315 string of text which is enclosed in angle brackets. This method is
316 passed the sequence command C<$seq_cmd> and the corresponding text
317 C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior
318 sequence that occurs in the string that it is passed. It should return
319 the desired text string to be used in place of the interior sequence.
320 The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence>
321 object which contains further information about the interior sequence.
322 Please see L<Pod::InputObjects> for details if you need to access this
323 additional information.
325 Subclass implementations of this method may wish to invoke the
326 B<nested()> method of C<$pod_seq> to see if it is nested inside
327 some other interior-sequence (and if so, which kind).
329 The base class implementation of the B<interior_sequence()> method
330 simply returns the raw text of the interior sequence (as it occurred
331 in the input) to the caller.
335 sub interior_sequence {
336 my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_;
337 ## Just return the raw text of the interior sequence
338 return $pod_seq->raw_text();
341 #############################################################################
343 =head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES
345 B<Pod::Parser> provides several methods which subclasses may want to override
346 to perform any special pre/post-processing. These methods do I<not> have to
347 be overridden, but it may be useful for subclasses to take advantage of them.
351 ##---------------------------------------------------------------------------
355 my $parser = Pod::Parser->new();
357 This is the constructor for B<Pod::Parser> and its subclasses. You
358 I<do not> need to override this method! It is capable of constructing
359 subclass objects as well as base class objects, provided you use
360 any of the following constructor invocation styles:
362 my $parser1 = MyParser->new();
363 my $parser2 = new MyParser();
364 my $parser3 = $parser2->new();
366 where C<MyParser> is some subclass of B<Pod::Parser>.
368 Using the syntax C<MyParser::new()> to invoke the constructor is I<not>
369 recommended, but if you insist on being able to do this, then the
370 subclass I<will> need to override the B<new()> constructor method. If
371 you do override the constructor, you I<must> be sure to invoke the
372 B<initialize()> method of the newly blessed object.
374 Using any of the above invocations, the first argument to the
375 constructor is always the corresponding package name (or object
376 reference). No other arguments are required, but if desired, an
377 associative array (or hash-table) may be passed to the B<new()>
380 my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 );
381 my $parser2 = new MyParser( -myflag => 1 );
383 All arguments passed to the B<new()> constructor will be treated as
384 key/value pairs in a hash-table. The newly constructed object will be
385 initialized by copying the contents of the given hash-table (which may
386 have been empty). The B<new()> constructor for this class and all of its
387 subclasses returns a blessed reference to the initialized object (hash-table).
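
The constructor behaviour described above can be sketched in a standalone class. C<MyThing> is a hypothetical stand-in for a parser subclass and its B<initialize()> is an empty placeholder; the sketch only illustrates the class-or-object dispatch and the copying of key/value arguments.

```perl
use strict;
use warnings;

package MyThing;    # hypothetical class standing in for a parser subclass

sub new {
    my $this   = shift;
    my $class  = ref($this) || $this;   # called on an object or on a class name
    my %params = @_;                    # remaining arguments are key/value pairs
    my $self   = { %params };           # the object is a copy of those pairs
    bless $self, $class;
    $self->initialize();                # a constructor must invoke initialize()
    return $self;
}

sub initialize { return }               # placeholder initialization hook

package main;

my $p1 = MyThing->new( MYDATA => 'value1' );   # class-name invocation
my $p2 = $p1->new( -myflag => 1 );             # object-reference invocation
```

Both calls yield objects blessed into C<MyThing>; note that C<$p2> receives only its own arguments, not a copy of the fields of C<$p1>.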
392 ## Determine if we were called via an object-ref or a classname
394 my $class = ref($this) || $this;
395 ## Any remaining arguments are treated as initial values for the
396 ## hash that is used to represent this object.
398 my $self = { %params };
399 ## Bless ourselves into the desired class and perform any initialization
405 ##---------------------------------------------------------------------------
407 =head1 B<initialize()>
409 $parser->initialize();
411 This method performs any necessary object initialization. It takes no
412 arguments (other than the object instance of course, which is typically
413 copied to a local variable named C<$self>). If subclasses override this
414 method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>.
423 ##---------------------------------------------------------------------------
425 =head1 B<begin_pod()>
427 $parser->begin_pod();
429 This method is invoked at the beginning of processing for each POD
430 document that is encountered in the input. Subclasses should override
431 this method to perform any per-document initialization.
440 ##---------------------------------------------------------------------------
442 =head1 B<begin_input()>
444 $parser->begin_input();
446 This method is invoked by B<parse_from_filehandle()> immediately I<before>
447 processing input from a filehandle. The base class implementation does
448 nothing; however, subclasses may override it to perform any per-file
451 Note that if multiple files are parsed for a single POD document
452 (perhaps the result of some future C<=include> directive) this method
453 is invoked for every file that is parsed. If you wish to perform certain
454 initializations once per document, then you should use B<begin_pod()>.
463 ##---------------------------------------------------------------------------
465 =head1 B<end_input()>
467 $parser->end_input();
469 This method is invoked by B<parse_from_filehandle()> immediately I<after>
470 processing input from a filehandle. The base class implementation does
471 nothing; however, subclasses may override it to perform any per-file
474 Please note that if multiple files are parsed for a single POD document
475 (perhaps the result of some kind of C<=include> directive) this method
476 is invoked for every file that is parsed. If you wish to perform certain
477 cleanup actions once per document, then you should use B<end_pod()>.
486 ##---------------------------------------------------------------------------
492 This method is invoked at the end of processing for each POD document
493 that is encountered in the input. Subclasses should override this method
494 to perform any per-document finalization.
503 ##---------------------------------------------------------------------------
505 =head1 B<preprocess_line()>
507 $textline = $parser->preprocess_line($text, $line_num);
509 This method should be overridden by subclasses that wish to perform
510 any kind of preprocessing for each I<line> of input (I<before> it has
511 been determined whether or not it is part of a POD paragraph). The
512 parameter C<$text> is the input line; and the parameter C<$line_num> is
513 the line number of the corresponding text line.
515 The value returned should correspond to the new text to use in its
516 place. If the empty string or an undefined value is returned then no
517 further processing will be performed for this line.
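
A possible override is sketched below. The package C<MyParser> is hypothetical (a real subclass would inherit from B<Pod::Parser> rather than define its own constructor), and the C<#!skip> marker is an invented convention used only to show the effect of returning the empty string.

```perl
use strict;
use warnings;

package MyParser;   # hypothetical; a real subclass would inherit from Pod::Parser

sub new { return bless {}, shift }

# Normalize CRLF line endings, and drop any line that starts with the
# made-up marker '#!skip' by returning the empty string.
sub preprocess_line {
    my ($self, $text, $line_num) = @_;
    return '' if $text =~ /^#!skip\b/;   # empty string: no further processing
    $text =~ s/\r\n\z/\n/;               # replacement text used for this line
    return $text;
}

package main;

my $parser  = MyParser->new();
my $kept    = $parser->preprocess_line("=head1 NAME\r\n", 1);
my $dropped = $parser->preprocess_line("#!skip this line\n", 2);
```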
519 Please note that the B<preprocess_line()> method is invoked I<before>
520 the B<preprocess_paragraph()> method. After all (possibly preprocessed)
521 lines in a paragraph have been assembled together and it has been
522 determined that the paragraph is part of the POD documentation from one
523 of the selected sections, then B<preprocess_paragraph()> is invoked.
525 The base class implementation of this method returns the given text.
529 sub preprocess_line {
530 my ($self, $text, $line_num) = @_;
534 ##---------------------------------------------------------------------------
536 =head1 B<preprocess_paragraph()>
538 $textblock = $parser->preprocess_paragraph($text, $line_num);
540 This method should be overridden by subclasses that wish to perform any
541 kind of preprocessing for each block (paragraph) of POD documentation
542 that appears in the input stream. The parameter C<$text> is the POD
543 paragraph from the input file; and the parameter C<$line_num> is the
544 line number for the beginning of the corresponding paragraph.
546 The value returned should correspond to the new text to use in its
547 place. If the empty string or an undefined value is
548 returned, then the given C<$text> is ignored (not processed).
550 This method is invoked after gathering up all the lines in a paragraph
551 but before trying to further parse or interpret them. After
552 B<preprocess_paragraph()> returns, the current cutting state (which
553 is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates
554 to true then input text (including the given C<$text>) is cut (not
555 processed) until the next POD directive is encountered.
557 Please note that the B<preprocess_line()> method is invoked I<before>
558 the B<preprocess_paragraph()> method. After all (possibly preprocessed)
559 lines in a paragraph have been assembled together and it has been
560 determined that the paragraph is part of the POD documentation from one
561 of the selected sections, then B<preprocess_paragraph()> is invoked.
563 The base class implementation of this method returns the given text.
567 sub preprocess_paragraph {
568 my ($self, $text, $line_num) = @_;
572 #############################################################################
574 =head1 METHODS FOR PARSING AND PROCESSING
576 B<Pod::Parser> provides several methods to process input text. These
577 methods typically won't need to be overridden, but subclasses may want
578 to invoke them to exploit their functionality.
582 ##---------------------------------------------------------------------------
584 =head1 B<parse_text()>
586 $ptree1 = $parser->parse_text($text, $line_num);
587 $ptree2 = $parser->parse_text({%opts}, $text, $line_num);
588 $ptree3 = $parser->parse_text(\%opts, $text, $line_num);
590 This method is useful if you need to perform your own interpolation
591 of interior sequences and can't rely upon B<interpolate> to expand
592 them in simple bottom-up order.
594 The parameter C<$text> is a string or block of text to be parsed
595 for interior sequences; and the parameter C<$line_num> is the
596 line number corresponding to the beginning of C<$text>.
598 B<parse_text()> will parse the given text into a parse-tree of "nodes".
599 Each "node" in the parse tree is either a
600 text-string, or a B<Pod::InteriorSequence>. The result returned is a
601 parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects>
602 for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>.
604 If desired, an optional hash-ref may be specified as the first argument
605 to customize certain aspects of the parse-tree that is created and
606 returned. The set of recognized option keywords is:
610 =item B<-expand_seq> =E<gt> I<code-ref>|I<method-name>
612 Normally, the parse-tree returned by B<parse_text()> will contain an
613 unexpanded C<Pod::InteriorSequence> object for each interior-sequence
614 encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand"
615 every interior-sequence it sees by invoking the referenced function
616 (or named method of the parser object) and using the return value as the
619 If a subroutine reference was given, it is invoked as:
621 &$code_ref( $parser, $sequence )
623 and if a method-name was given, it is invoked as:
625 $parser->method_name( $sequence )
627 where C<$parser> is a reference to the parser object, and C<$sequence>
628 is a reference to the interior-sequence object.
629 [I<NOTE>: If the B<interior_sequence()> method is specified, then it is
630 invoked according to the interface specified in L<"interior_sequence()">].
632 =item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name>
634 Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an
635 argument to the referenced subroutine (or named method of the parser
636 object) and return the result instead of the parse-tree object.
638 If a subroutine reference was given, it is invoked as:
640 &$code_ref( $parser, $ptree )
642 and if a method-name was given, it is invoked as:
644 $parser->method_name( $ptree )
646 where C<$parser> is a reference to the parser object, and C<$ptree>
647 is a reference to the parse-tree object.
653 ## This global regex is used to see if the text before a '>' inside
654 ## an interior sequence looks like '-' or '=', but not '--' or '=='
655 use vars qw( $ARROW_RE );
656 $ARROW_RE = join('', qw{ (?: [^=]+= | [^-]+- )$ });
657 #$ARROW_RE = qr/(?:[^=]+=|[^-]+-)$/; ## 5.005+ only!
663 ## Get options and set any defaults
664 my %opts = (ref $_[0]) ? %{ shift() } : ();
665 my $expand_seq = $opts{'-expand_seq'} || undef;
666 my $expand_ptree = $opts{'-expand_ptree'} || undef;
670 my $file = $self->input_file();
671 my ($cmd, $prev) = ('', '');
673 ## Convert method calls into closures, for our convenience
674 my $xseq_sub = $expand_seq;
675 my $xptree_sub = $expand_ptree;
676 if (defined $expand_seq and $expand_seq eq 'interior_sequence') {
677 ## If 'interior_sequence' is the method to use, we have to pass
678 ## more than just the sequence object, we also need to pass the
679 ## sequence name and text.
681 my ($self, $iseq) = @_;
682 my $args = join("", $iseq->parse_tree->children);
683 return $self->interior_sequence($iseq->name, $args, $iseq);
686 ref $xseq_sub or $xseq_sub = sub { shift()->$expand_seq(@_) };
687 ref $xptree_sub or $xptree_sub = sub { shift()->$expand_ptree(@_) };
689 ## Keep track of the "current" interior sequence, and maintain a stack
690 ## of "in progress" sequences.
692 ## NOTE that we push our own "accumulator" at the very beginning of the
693 ## stack. It's really a parse-tree, not a sequence; but it implements
694 ## the methods we need so we can use it to gather-up all the sequences
695 ## and strings we parse. Thus, by the end of our parsing, it should be
696 ## the only thing left on our stack and all we have to do is return it!
698 my $seq = Pod::ParseTree->new();
699 my @seq_stack = ($seq);
701 ## Iterate over all sequence starts/stops, newlines, & text
702 ## (NOTE: split with capturing parens keeps the delimiters)
704 for ( split /([A-Z]<|>|\n)/ ) {
705 ## Keep track of line count
706 ++$line if ($_ eq "\n");
707 ## Look for the beginning of a sequence
708 if ( /^([A-Z])(<)$/ ) {
709 ## Push a new sequence onto the stack of those "in-progress"
710 $seq = Pod::InteriorSequence->new(
711 -name => ($cmd = $1),
712 -ldelim => $2, -rdelim => '',
713 -file => $file, -line => $line
715 (@seq_stack > 1) and $seq->nested($seq_stack[-1]);
716 push @seq_stack, $seq;
718 ## Look for sequence ending (preclude '->' and '=>' inside C<...>)
719 elsif ( (@seq_stack > 1) and
720 /^>$/ and ($cmd ne 'C' or $prev !~ /$ARROW_RE/o) )
722 ## End of current sequence, record terminating delimiter
724 ## Pop it off the stack of "in progress" sequences
726 ## Append result to its parent in current parse tree
727 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq);
728 ## Remember the current cmd-name
729 $cmd = (@seq_stack > 1) ? $seq_stack[-1]->name : '';
732 ## In the middle of a sequence, append this text to it
733 $seq->append($_) if length;
735 ## Remember the "current" sequence and the previously seen token
736 ($seq, $prev) = ( $seq_stack[-1], $_ );
739 ## Handle unterminated sequences
740 while (@seq_stack > 1) {
741 ($cmd, $file, $line) = ($seq->name, $seq->file_line);
743 warn "** Unterminated $cmd<...> at $file line $line\n";
744 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq);
745 $seq = $seq_stack[-1];
748 ## Return the resulting parse-tree
749 my $ptree = (pop @seq_stack)->parse_tree;
750 return $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree;
753 ##---------------------------------------------------------------------------
755 =head1 B<interpolate()>
757 $textblock = $parser->interpolate($text, $line_num);
759 This method translates all text (including any embedded interior sequences)
760 in the given text string C<$text> and returns the interpolated result. The
761 parameter C<$line_num> is the line number corresponding to the beginning
764 B<interpolate()> merely invokes a private method to recursively expand
765 nested interior sequences in bottom-up order (innermost sequences are
766 expanded first). If there is a need to expand nested sequences in
767 some alternate order, use B<parse_text> instead.
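
To illustrate what bottom-up order means, here is a simplified standalone sketch. It is not the module's implementation: it ignores line counting and the special handling of C<-E<gt>> and C<=E<gt>> inside C sequences, and the helpers C<expand_one()> and C<expand_bottom_up()> are hypothetical names. An inner sequence is fully expanded before the sequence that encloses it.

```perl
use strict;
use warnings;

# Hypothetical formatter: turns a sequence name and its (already expanded)
# argument into output text, in the spirit of interior_sequence().
sub expand_one {
    my ($name, $arg) = @_;
    return "*$arg*"   if $name eq 'B';
    return "_${arg}_" if $name eq 'I';
    return "`$arg'"   if $name eq 'C';
    return "$name<$arg>";                # unknown sequences pass through
}

sub expand_bottom_up {
    my ($text) = @_;
    # The bottom of the stack is an accumulator for top-level text.
    my @stack = ( { name => '', text => '' } );
    for my $tok ( split /([A-Z]<|>)/, $text ) {
        next unless length $tok;
        if ( $tok =~ /^([A-Z])<$/ ) {
            # Start of a sequence: begin collecting its argument.
            push @stack, { name => $1, text => '' };
        }
        elsif ( $tok eq '>' and @stack > 1 ) {
            # End of the innermost open sequence: expand it and append
            # the result to its parent.
            my $done = pop @stack;
            $stack[-1]{text} .= expand_one( $done->{name}, $done->{text} );
        }
        else {
            $stack[-1]{text} .= $tok;
        }
    }
    # Unterminated sequences are folded back unexpanded.
    while ( @stack > 1 ) {
        my $open = pop @stack;
        $stack[-1]{text} .= "$open->{name}<$open->{text}";
    }
    return $stack[0]{text};
}

print expand_bottom_up("a B<bold I<it>> z"), "\n";   # prints: a *bold _it_* z
```

Here C<IE<lt>itE<gt>> is expanded first, and its result becomes part of the argument that the enclosing B sequence receives.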
772 my($self, $text, $line_num) = @_;
773 my %parse_opts = ( -expand_seq => 'interior_sequence' );
774 my $ptree = $self->parse_text( \%parse_opts, $text, $line_num );
775 return join "", $ptree->children();
778 ##---------------------------------------------------------------------------
782 =head1 B<parse_paragraph()>
784 $parser->parse_paragraph($text, $line_num);
786 This method takes the text of a POD paragraph to be processed, along
787 with its corresponding line number, and invokes the appropriate method
788 (one of B<command()>, B<verbatim()>, or B<textblock()>).
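
The three-way dispatch can be sketched as a standalone classifier. The function name C<paragraph_kind()> is hypothetical; the two patterns are the ones this method uses to recognize command and verbatim paragraphs.

```perl
use strict;
use warnings;

# Hypothetical helper: report which handler a paragraph would be routed to.
sub paragraph_kind {
    my ($text) = @_;
    return 'command'  if $text =~ /^={1,2}(?=\S)/;   # "=" or "==" then a non-space
    return 'verbatim' if $text =~ /^\s+/;            # indented text
    return 'textblock';                              # anything else
}

my @kinds = map { paragraph_kind($_) }
    ( "=head1 NAME\n", "    my \$x = 1;\n", "Ordinary prose.\n" );
```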
790 This method does I<not> usually need to be overridden by subclasses.
796 sub parse_paragraph {
797 my ($self, $text, $line_num) = @_;
798 local *myData = $self; ## an alias to avoid deref-ing overhead
801 ## This is the end of a non-empty paragraph
802 ## Ignore up until next POD directive if we are cutting
803 if ($myData{_CUTTING}) {
804 return unless ($text =~ /^={1,2}\S/);
805 $myData{_CUTTING} = 0;
808 ## Now we know this is a block of text in a POD section!
810 ##-----------------------------------------------------------------
811 ## This is a hook (hack ;-) for Pod::Select to do its thing without
812 ## having to override methods, but also without Pod::Parser assuming
813 ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS
814 ## field exists then we assume there is an is_selected() method for
815 ## us to invoke (calling $self->can('is_selected') could verify this
816 ## but that is more overhead than I want to incur)
817 ##-----------------------------------------------------------------
819 ## Ignore this block if it isnt in one of the selected sections
820 if (exists $myData{_SELECTED_SECTIONS}) {
821 $self->is_selected($text) or return ($myData{_CUTTING} = 1);
824 ## Perform any desired preprocessing and re-check the "cutting" state
825 $text = $self->preprocess_paragraph($text, $line_num);
826 return 1 unless ((defined $text) and (length $text));
827 return 1 if ($myData{_CUTTING});
829 ## Look for one of the three types of paragraphs
830 my ($pfx, $cmd, $arg, $sep) = ('', '', '', '');
831 my $pod_para = undef;
832 if ($text =~ /^(={1,2})(?=\S)/) {
833 ## Looks like a command paragraph. Capture the command prefix used
834 ## ("=" or "=="), as well as the command-name, its paragraph text,
835 ## and whatever sequence of characters was used to separate them
837 $_ = substr($text, length $pfx);
838 $sep = /(\s+)(?=\S)/ ? $1 : '';
839 ($cmd, $text) = split(" ", $_, 2);
840 ## If this is a "cut" directive then we dont need to do anything
841 ## except return to "cutting" mode.
843 $myData{_CUTTING} = 1;
847 ## Save the attributes indicating how the command was specified.
848 $pod_para = new Pod::Paragraph(
853 -file => $myData{_INFILE},
856 # ## Invoke appropriate callbacks
857 # if (exists $myData{_CALLBACKS}) {
858 # ## Look through the callback list, invoke callbacks,
859 # ## then see if we need to do the default actions
860 # ## (invoke_callbacks will return true if we do).
861 # return 1 unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para);
864 ## A command paragraph
865 $self->command($cmd, $text, $line_num, $pod_para);
867 elsif ($text =~ /^\s+/) {
868 ## Indented text - must be a verbatim paragraph
869 $self->verbatim($text, $line_num, $pod_para);
872 ## Looks like an ordinary block of text
873 $self->textblock($text, $line_num, $pod_para);
878 ##---------------------------------------------------------------------------
880 =head1 B<parse_from_filehandle()>
882 $parser->parse_from_filehandle($in_fh,$out_fh);
884 This method takes an input filehandle (which is assumed to already be
885 opened for reading) and reads the entire input stream looking for blocks
886 (paragraphs) of POD documentation to be processed. If no first argument
887 is given the default input filehandle C<STDIN> is used.
889 The C<$in_fh> parameter may be any object that provides a B<getline()>
890 method to retrieve a single line of input text (hence, an appropriate
891 wrapper object could be used to parse PODs from a single string or an
894 Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled
895 into paragraphs or "blocks" (which are separated by lines containing
896 nothing but whitespace). For each block of POD documentation
897 encountered it will invoke a method to parse the given paragraph.
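
The assembly of lines into paragraphs can be sketched as follows. This is a simplified illustration, not the method itself: it works on a list of lines instead of a filehandle, and it ignores line preprocessing and one-line C<==> command paragraphs. The name C<split_paragraphs()> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: group input lines into paragraphs, where a line
# containing nothing but whitespace ends the current paragraph.
sub split_paragraphs {
    my @lines = @_;
    my @paragraphs;
    my $paragraph = '';
    for my $line (@lines) {
        $paragraph .= $line;                 # the blank line stays in the block
        next unless $line =~ /^\s*$/;        # keep reading until a blank line
        push @paragraphs, $paragraph;
        $paragraph = '';
    }
    # Do not forget a final paragraph that has no trailing blank line.
    push @paragraphs, $paragraph if length $paragraph;
    return @paragraphs;
}

my @blocks = split_paragraphs(
    "=head1 NAME\n", "\n",
    "Some text\n", "more text\n", "\n",
    "    verbatim code\n",
);
```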
899 If a second argument is given then it should correspond to a filehandle where
900 output should be sent (otherwise the default output filehandle is
901 C<STDOUT> if no output filehandle is currently in use).
903 B<NOTE:> For performance reasons, this method caches the input stream at
904 the top of the stack in a local variable. Any attempts by clients to
905 change the stack contents during processing while this method is
906 executing I<will not affect> the input stream used by the current
907 invocation of this method.
909 This method does I<not> usually need to be overridden by subclasses.
913 sub parse_from_filehandle {
915 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
916 my ($in_fh, $out_fh) = @_;
919 ## Put this stream at the top of the stack and do beginning-of-input
920 ## processing. NOTE that $in_fh might be reset during this process.
921 my $topstream = $self->_push_input_stream($in_fh, $out_fh);
922 (exists $opts{-cutting}) and $self->cutting( $opts{-cutting} );
924 ## Initialize line/paragraph
925 my ($textline, $paragraph) = ('', '');
926 my ($nlines, $plines) = (0, 0);
928 ## Use <$fh> instead of $fh->getline where possible (for speed)
930 my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/ or tied $in_fh);
932 ## Read paragraphs line-by-line
933 while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) {
934 $textline = $self->preprocess_line($textline, ++$nlines);
935 next unless ((defined $textline) && (length $textline));
936 $_ = $paragraph; ## save previous contents
938 if ((! length $paragraph) && ($textline =~ /^==/)) {
939 ## '==' denotes a one-line command paragraph
940 $paragraph = $textline;
944 ## Append this line to the current paragraph
945 $paragraph .= $textline;
949 ## See if this line is blank and ends the current paragraph.
950 ## If it isnt, then keep iterating until it is.
951 next unless (($textline =~ /^\s*$/) && (length $paragraph));
953 ## Now process the paragraph
954 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1);
958 ## Dont forget about the last paragraph in the file
959 if (length $paragraph) {
960 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1)
963 ## Now pop the input stream off the top of the input stack.
964 $self->_pop_input_stream();
967 ##---------------------------------------------------------------------------
969 =head1 B<parse_from_file()>
971 $parser->parse_from_file($filename,$outfile);
973 This method takes a filename and does the following:
979 opens the input file for reading and the output file for writing
980 (creating the appropriate filehandles)
984 invokes the B<parse_from_filehandle()> method passing it the
985 corresponding input and output filehandles.
989 closes the input and output files.
993 If the special input filename "-" or "<&STDIN" is given then the STDIN
994 filehandle is used for input (and no open or close is performed). If no
995 input filename is specified then "-" is implied.
997 If a second argument is given then it should be the name of the desired
998 output file. If the special output filename "-" or ">&STDOUT" is given
999 then the STDOUT filehandle is used for output (and no open or close is
1000 performed). If the special output filename ">&STDERR" is given then the
1001 STDERR filehandle is used for output (and no open or close is
1002 performed). If no output filehandle is currently in use and no output
1003 filename is specified, then "-" is implied.
1005 This method does I<not> usually need to be overridden by subclasses.
1009 sub parse_from_file {
1011 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
1012 my ($infile, $outfile) = @_;
1013 my ($in_fh, $out_fh) = (undef, undef);
1014 my ($close_input, $close_output) = (0, 0);
1015 local *myData = $self;
1018 ## Is $infile a filename or a (possibly implied) filehandle
1019 $infile = '-' unless ((defined $infile) && (length $infile));
1020 if (($infile eq '-') || ($infile =~ /^<&(STDIN|0)$/i)) {
1021 ## Not a filename, just a string implying STDIN
1022 $myData{_INFILE} = "<standard input>";
1025 elsif (ref $infile) {
1026 ## Must be a filehandle-ref (or else assume it's a ref to an object
1027 ## that supports the common IO read operations).
1028 $myData{_INFILE} = ${$infile};
1032 ## We have a filename, open it for reading
1033 $myData{_INFILE} = $infile;
1034 $in_fh = FileHandle->new("< $infile") or
1035 croak "Can't open $infile for reading: $!\n";
1039 ## NOTE: we need to be *very* careful when "defaulting" the output
1040 ## file. We only want to use a default if this is the beginning of
1041 ## the entire document (but *not* if this is an included file). We
1042 ## determine this by seeing if the input stream stack has been set-up
1045 unless ((defined $outfile) && (length $outfile)) {
1046 (defined $myData{_TOP_STREAM}) && ($out_fh = $myData{_OUTPUT})
1047 || ($outfile = '-');
1049 ## Is $outfile a filename or a (possibly implied) filehandle
1050 if ((defined $outfile) && (length $outfile)) {
1051 if (($outfile eq '-') || ($outfile =~ /^>&?(?:STDOUT|1)$/i)) {
1052 ## Not a filename, just a string implying STDOUT
1053 $myData{_OUTFILE} = "<standard output>";
1056 elsif ($outfile =~ /^>&(STDERR|2)$/i) {
1057 ## Not a filename, just a string implying STDERR
1058 $myData{_OUTFILE} = "<standard error>";
1061 elsif (ref $outfile) {
1062 ## Must be a filehandle-ref (or else assume it's a ref to an
1063 ## object that supports the common IO write operations).
1064 $myData{_OUTFILE} = ${$outfile};
1068 ## We have a filename, open it for writing
1069 $myData{_OUTFILE} = $outfile;
1070 $out_fh = FileHandle->new("> $outfile") or
1071 croak "Can't open $outfile for writing: $!\n";
1076 ## Whew! That was a lot of work to set up reasonably robust behavior
1077 ## in the case of a non-filename for reading and writing. Now we just
1078 ## have to parse the input and close the handles when we're finished.
1079 $self->parse_from_filehandle(\%opts, $in_fh, $out_fh);
1082 close($in_fh) || croak "Can't close $infile after reading: $!\n";
1084 close($out_fh) || croak "Can't close $outfile after writing: $!\n";
1087 #############################################################################
1089 =head1 ACCESSOR METHODS
1091 Clients of B<Pod::Parser> should use the following methods to access
1092 instance data fields:
1096 ##---------------------------------------------------------------------------
1100 $boolean = $parser->cutting();
1102 Returns the current C<cutting> state: a boolean-valued scalar which
1103 evaluates to true if text from the input file is currently being "cut"
1104 (meaning it is I<not> considered part of the POD document).
1106 $parser->cutting($boolean);
1108 Sets the current C<cutting> state to the given value and returns the
1114 return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING};
1117 ##---------------------------------------------------------------------------
1119 =head1 B<output_file()>
1121 $fname = $parser->output_file();
1123 Returns the name of the output file being written.
1128 return $_[0]->{_OUTFILE};
1131 ##---------------------------------------------------------------------------
1133 =head1 B<output_handle()>
1135 $fhandle = $parser->output_handle();
1137 Returns the output filehandle object.
1142 return $_[0]->{_OUTPUT};
1145 ##---------------------------------------------------------------------------
1147 =head1 B<input_file()>
1149 $fname = $parser->input_file();
1151 Returns the name of the input file being read.
1156 return $_[0]->{_INFILE};
1159 ##---------------------------------------------------------------------------
1161 =head1 B<input_handle()>
1163 $fhandle = $parser->input_handle();
1165 Returns the current input filehandle object.
1170 return $_[0]->{_INPUT};
1173 ##---------------------------------------------------------------------------
1177 =head1 B<input_streams()>
1179 $listref = $parser->input_streams();
1181 Returns a reference to an array which corresponds to the stack of all
1182 the input streams that are currently in the middle of being parsed.
1184 While parsing an input stream, it is possible to invoke
1185 B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input
1186 stream and then return to parsing the previous input stream. Each input
1187 stream to be parsed is pushed onto the end of this input stack
1188 before any of its input is read. The input stream that is currently
1189 being parsed is always at the end (or top) of the input stack. When an
1190 input stream has been exhausted, it is popped off the end of the
1193 Each element on this input stack is a reference to a C<Pod::InputSource>
1194 object. Please see L<Pod::InputObjects> for more details.
1196 This method might be invoked when printing diagnostic messages, for example,
1197 to obtain the name and line number of all the input files that are currently
1205 return $_[0]->{_INPUT_STREAMS};
1208 ##---------------------------------------------------------------------------
1212 =head1 B<top_stream()>
1214 $hashref = $parser->top_stream();
1216 Returns a reference to the hash-table that represents the element
1217 that is currently at the top (end) of the input stream stack
1218 (see L<"input_streams()">). The return value will be C<undef>
1219 if the input stack is empty.
1221 This method might be used when printing diagnostic messages, for example,
1222 to obtain the name and line number of the current input file.
1229 return $_[0]->{_TOP_STREAM} || undef;
1232 #############################################################################
1234 =head1 PRIVATE METHODS AND DATA
1236 B<Pod::Parser> makes use of several internal methods and data fields
1237 which clients should not need to see or use. For the sake of avoiding
1238 name collisions for client data and methods, these methods and fields
1239 are briefly discussed here. Determined hackers may obtain further
1240 information about them by reading the B<Pod::Parser> source code.
1242 Private data fields are stored in the hash-object whose reference is
1243 returned by the B<new()> constructor for this class. The names of all
1244 private methods and data-fields used by B<Pod::Parser> begin with a
1245 prefix of "_" and match the regular expression C</^_\w+$/>.
1249 ##---------------------------------------------------------------------------
1253 =head1 B<_push_input_stream()>
1255 $hashref = $parser->_push_input_stream($in_fh,$out_fh);
1257 This method will push the given input stream on the input stack and
1258 perform any necessary beginning-of-document or beginning-of-file
1259 processing. The argument C<$in_fh> is the input stream filehandle to
1260 push, and C<$out_fh> is the corresponding output filehandle to use (if
1261 it is not given or is undefined, then the current output stream is used,
1262 which defaults to standard output if it doesn't exist yet).
1264 The value returned will be a reference to the hash-table that represents
1265 the new top of the input stream stack. I<Please Note> that it is
1266 possible for this method to use default values for the input and output
1267 file handles. If this happens, you will need to look at the C<_INPUT>
1268 and C<_OUTPUT> instance data members to determine their new values.
1274 sub _push_input_stream {
1275 my ($self, $in_fh, $out_fh) = @_;
1276 local *myData = $self;
1278 ## Initialize stuff for the entire document if this is *not*
1279 ## an included file.
1281 ## NOTE: we need to be *very* careful when "defaulting" the output
1282 ## filehandle. We only want to use a default value if this is the
1283 ## beginning of the entire document (but *not* if this is an included
1285 unless (defined $myData{_TOP_STREAM}) {
1286 $out_fh = \*STDOUT unless (defined $out_fh);
1287 $myData{_CUTTING} = 1; ## current "cutting" state
1288 $myData{_INPUT_STREAMS} = []; ## stack of all input streams
1291 ## Initialize input indicators
1292 $myData{_OUTFILE} = '(unknown)' unless (defined $myData{_OUTFILE});
1293 $myData{_OUTPUT} = $out_fh if (defined $out_fh);
1294 $in_fh = \*STDIN unless (defined $in_fh);
1295 $myData{_INFILE} = '(unknown)' unless (defined $myData{_INFILE});
1296 $myData{_INPUT} = $in_fh;
1297 my $input_top = $myData{_TOP_STREAM}
1298 = new Pod::InputSource(
1299 -name => $myData{_INFILE},
1301 -was_cutting => $myData{_CUTTING}
1303 local *input_stack = $myData{_INPUT_STREAMS};
1304 push(@input_stack, $input_top);
1306 ## Perform beginning-of-document and/or beginning-of-input processing
1307 $self->begin_pod() if (@input_stack == 1);
1308 $self->begin_input();
1313 ##---------------------------------------------------------------------------
1317 =head1 B<_pop_input_stream()>
1319 $hashref = $parser->_pop_input_stream();
1321 This takes no arguments. It will perform any necessary end-of-file or
1322 end-of-document processing and then pop the current input stream from
1323 the top of the input stack.
1325 The value returned will be a reference to the hash-table that represents
1326 the new top of the input stream stack.
1332 sub _pop_input_stream {
1334 local *myData = $self;
1335 local *input_stack = $myData{_INPUT_STREAMS};
1337 ## Perform end-of-input and/or end-of-document processing
1338 $self->end_input() if (@input_stack > 0);
1339 $self->end_pod() if (@input_stack == 1);
1341 ## Restore cutting state to whatever it was before we started
1342 ## parsing this file.
1343 my $old_top = pop(@input_stack);
1344 $myData{_CUTTING} = $old_top->was_cutting();
1346 ## Don't forget to reset the input indicators
1347 my $input_top = undef;
1348 if (@input_stack > 0) {
1349 $input_top = $myData{_TOP_STREAM} = $input_stack[-1];
1350 $myData{_INFILE} = $input_top->name();
1351 $myData{_INPUT} = $input_top->handle();
1353 delete $myData{_TOP_STREAM};
1354 delete $myData{_INPUT_STREAMS};
1360 #############################################################################
1364 L<Pod::InputObjects>, L<Pod::Select>
1366 B<Pod::InputObjects> defines POD input objects corresponding to
1367 command paragraphs, parse-trees, and interior-sequences.
1369 B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability
1370 to selectively include and/or exclude sections of a POD document from being
1371 translated based upon the current heading, subheading, subsubheading, etc.
1374 B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users
1375 the ability to employ I<callback functions> instead of, or in addition
1376 to, overriding methods of the base class.
1379 B<Pod::Select> and B<Pod::Callbacks> do not override any
1380 methods nor do they define any new methods with the same name. Because
1381 of this, they may I<both> be used (in combination) as a base class of
1382 the same subclass in order to combine their functionality without
1383 causing any namespace clashes due to multiple inheritance.
1387 Brad Appleton E<lt>bradapp@enteract.comE<gt>
1389 Based on code for B<Pod::Text> written by
1390 Tom Christiansen E<lt>tchrist@mox.perl.comE<gt>