360aca43 1#############################################################################
2# Pod/Parser.pm -- package which defines a base class for parsing POD docs.
3#
66aff6dd 4# Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved.
360aca43 5# This file is part of "PodParser". PodParser is free software;
6# you can redistribute it and/or modify it under the same terms
7# as Perl itself.
8#############################################################################
9
10package Pod::Parser;
11
12use vars qw($VERSION);
48f30392 13$VERSION = 1.11; ## Current version of this package
360aca43 14require 5.004; ## requires this Perl version or later
15
16#############################################################################
17
18=head1 NAME
19
20Pod::Parser - base class for creating POD filters and translators
21
22=head1 SYNOPSIS
23
24 use Pod::Parser;
25
26 package MyParser;
27 @ISA = qw(Pod::Parser);
28
29 sub command {
30 my ($parser, $command, $paragraph, $line_num) = @_;
31 ## Interpret the command and its text; sample actions might be:
32 if ($command eq 'head1') { ... }
33 elsif ($command eq 'head2') { ... }
34 ## ... other commands and their actions
35 my $out_fh = $parser->output_handle();
36 my $expansion = $parser->interpolate($paragraph, $line_num);
37 print $out_fh $expansion;
38 }
39
40 sub verbatim {
41 my ($parser, $paragraph, $line_num) = @_;
42 ## Format verbatim paragraph; sample actions might be:
43 my $out_fh = $parser->output_handle();
44 print $out_fh $paragraph;
45 }
46
47 sub textblock {
48 my ($parser, $paragraph, $line_num) = @_;
49 ## Translate/Format this block of text; sample actions might be:
50 my $out_fh = $parser->output_handle();
51 my $expansion = $parser->interpolate($paragraph, $line_num);
52 print $out_fh $expansion;
53 }
54
55 sub interior_sequence {
56 my ($parser, $seq_command, $seq_argument) = @_;
57 ## Expand an interior sequence; sample actions might be:
66aff6dd 58 return "*$seq_argument*" if ($seq_command eq 'B');
59 return "`$seq_argument'" if ($seq_command eq 'C');
60 return "_${seq_argument}_" if ($seq_command eq 'I');
360aca43 61 ## ... other sequence commands and their resulting text
62 }
63
64 package main;
65
66 ## Create a parser object and have it parse file whose name was
67 ## given on the command-line (use STDIN if no files were given).
68 $parser = new MyParser();
69 $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0);
70 for (@ARGV) { $parser->parse_from_file($_); }
71
72=head1 REQUIRES
73
475d79b5 74perl5.004, Pod::InputObjects, Exporter, Carp
360aca43 75
76=head1 EXPORTS
77
78Nothing.
79
80=head1 DESCRIPTION
81
82B<Pod::Parser> is a base class for creating POD filters and translators.
83It handles most of the effort involved with parsing the POD sections
84from an input stream, leaving subclasses free to be concerned only with
85performing the actual translation of text.
86
87B<Pod::Parser> parses PODs, and makes method calls to handle the various
88components of the POD. Subclasses of B<Pod::Parser> override these methods
89to translate the POD into whatever output format they desire.
90
91=head1 QUICK OVERVIEW
92
93To create a POD filter for translating POD documentation into some other
94format, you create a subclass of B<Pod::Parser> which typically overrides
95just the base class implementation for the following methods:
96
97=over 2
98
99=item *
100
101B<command()>
102
103=item *
104
105B<verbatim()>
106
107=item *
108
109B<textblock()>
110
111=item *
112
113B<interior_sequence()>
114
115=back
116
117You may also want to override the B<begin_input()> and B<end_input()>
118methods for your subclass (to perform any needed per-file and/or
119per-document initialization or cleanup).
120
121If you need to perform any preprocessing of input before it is parsed
122you may want to override one or more of B<preprocess_line()> and/or
123B<preprocess_paragraph()>.
124
125Sometimes it may be necessary to make more than one pass over the input
126files. If this is the case you have several options. You can make the
127first pass using B<Pod::Parser> and override your methods to store the
128intermediate results in memory somewhere for the B<end_pod()> method to
129process. You could use B<Pod::Parser> for several passes with an
130appropriate state variable to control the operation for each pass. If
131your input source can't be reset to start at the beginning, you can
132store it in some other structure as a string or an array and have that
133structure implement a B<getline()> method (which is all that
134B<parse_from_filehandle()> uses to read input).
135
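As a minimal sketch of the last option, the class below keeps the input
lines in an array and exposes a B<getline()> method. The package name
C<LineBuffer>, its constructor, and its C<rewind()> method are hypothetical
and are not part of B<Pod::Parser>; the only thing taken from the text above
is the statement that B<parse_from_filehandle()> needs nothing beyond
B<getline()>.

```perl
package LineBuffer;   ## hypothetical helper, not part of Pod::Parser
use strict;

## Store a private copy of the input lines plus a read position.
sub new {
    my ($class, @lines) = @_;
    my $self = { LINES => [ @lines ], POS => 0 };
    return bless $self, $class;
}

## Return the next line, or undef once the input is exhausted.
sub getline {
    my $self = shift;
    return undef if $self->{POS} >= @{ $self->{LINES} };
    return $self->{LINES}[ $self->{POS}++ ];
}

## Start over at the first line so another pass can be made.
sub rewind {
    my $self = shift;
    $self->{POS} = 0;
    return $self;
}

package main;

## Read every stored line once, the way a line-oriented reader would.
my $buf = LineBuffer->new("=head1 NAME\n", "\n", "demo - an example\n");
my @seen;
while (defined(my $line = $buf->getline())) { push @seen, $line; }
```

In principle an object like C<$buf> could then be passed to
B<parse_from_filehandle()> in place of a real filehandle, with C<rewind()>
called between passes.
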
136Feel free to add any member data fields you need to keep track of things
137like current font, indentation, horizontal or vertical position, or
138whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA">
139to avoid name collisions.
140
141For the most part, the B<Pod::Parser> base class should be able to
142do most of the input parsing for you and leave you free to worry about
143how to interpret the commands and translate the result.
144
66aff6dd 145Note that all we have described here in this quick overview is the
146simplest, most straightforward use of B<Pod::Parser> to do stream-based
664bb207 147parsing. It is also possible to use the B<Pod::Parser::parse_text> function
148to do more sophisticated tree-based parsing. See L<"TREE-BASED PARSING">.
149
150=head1 PARSING OPTIONS
151
152A I<parse-option> is simply a named option of B<Pod::Parser> with a
153value that corresponds to a certain specified behavior. These various
154behaviors of B<Pod::Parser> may be enabled/disabled by setting
155or unsetting one or more I<parse-options> using the B<parseopts()> method.
156The set of currently accepted parse-options is as follows:
157
158=over 3
159
160=item B<-want_nonPODs> (default: unset)
161
162Normally (by default) B<Pod::Parser> will only provide access to
163the POD sections of the input. Input paragraphs that are not part
164of the POD-format documentation are not made available to the caller
165(not even using B<preprocess_paragraph()>). Setting this option to a
166non-empty, non-zero value will allow B<preprocess_paragraph()> to see
e3237417 167non-POD sections of the input as well as POD sections. The B<cutting()>
664bb207 168method can be used to determine if the corresponding paragraph is a POD
169paragraph, or some other input paragraph.
170
171=item B<-process_cut_cmd> (default: unset)
172
173Normally (by default) B<Pod::Parser> handles the C<=cut> POD directive
174by itself and does not pass it on to the caller for processing. Setting
a5317591 175this option to a non-empty, non-zero value will cause B<Pod::Parser> to
664bb207 176pass the C<=cut> directive to the caller just like any other POD command
177(and hence it may be processed by the B<command()> method).
178
179B<Pod::Parser> will still interpret the C<=cut> directive to mean that
180"cutting mode" has been (re)entered, but the caller will get a chance
181to capture the actual C<=cut> paragraph itself for whatever purpose
182it desires.
183
a5317591 184=item B<-warnings> (default: unset)
185
186Normally (by default) B<Pod::Parser> recognizes a bare minimum of
187pod syntax errors and warnings and issues diagnostic messages
188for errors, but not for warnings. (Use B<Pod::Checker> to do more
189thorough checking of POD syntax.) Setting this option to a non-empty,
190non-zero value will cause B<Pod::Parser> to issue diagnostics for
191the few warnings it recognizes as well as the errors.
192
664bb207 193=back
194
195Please see L<"parseopts()"> for a complete description of the interface
196for the setting and unsetting of parse-options.
197
360aca43 198=cut
199
200#############################################################################
201
202use vars qw(@ISA);
203use strict;
204#use diagnostics;
205use Pod::InputObjects;
206use Carp;
360aca43 207use Exporter;
f0963acb 208require VMS::Filespec if $^O eq 'VMS';
360aca43 209@ISA = qw(Exporter);
210
211## These "variables" are used as local "glob aliases" for performance
664bb207 212use vars qw(%myData %myOpts @input_stack);
360aca43 213
214#############################################################################
215
216=head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES
217
218B<Pod::Parser> provides several methods which most subclasses will probably
219want to override. These methods are as follows:
220
221=cut
222
223##---------------------------------------------------------------------------
224
225=head1 B<command()>
226
227 $parser->command($cmd,$text,$line_num,$pod_para);
228
229This method should be overridden by subclasses to take the appropriate
230action when a POD command paragraph (denoted by a line beginning with
231"=") is encountered. When such a POD directive is seen in the input,
232this method is called and is passed:
233
234=over 3
235
236=item C<$cmd>
237
238the name of the command for this POD paragraph
239
240=item C<$text>
241
242the paragraph text for the given POD paragraph command.
243
244=item C<$line_num>
245
246the line-number of the beginning of the paragraph
247
248=item C<$pod_para>
249
250a reference to a C<Pod::Paragraph> object which contains further
251information about the paragraph command (see L<Pod::InputObjects>
252for details).
253
254=back
255
256B<Note> that this method I<is> called for C<=pod> paragraphs.
257
258The base class implementation of this method simply treats the raw POD
259command as a normal block of paragraph text (invoking the B<textblock()>
260method with the command paragraph).
261
262=cut
263
264sub command {
265 my ($self, $cmd, $text, $line_num, $pod_para) = @_;
266 ## Just treat this like a textblock
267 $self->textblock($pod_para->raw_text(), $line_num, $pod_para);
268}
269
270##---------------------------------------------------------------------------
271
272=head1 B<verbatim()>
273
274 $parser->verbatim($text,$line_num,$pod_para);
275
276This method may be overridden by subclasses to take the appropriate
277action when a block of verbatim text is encountered. It is passed the
278following parameters:
279
280=over 3
281
282=item C<$text>
283
284the block of text for the verbatim paragraph
285
286=item C<$line_num>
287
288the line-number of the beginning of the paragraph
289
290=item C<$pod_para>
291
292a reference to a C<Pod::Paragraph> object which contains further
293information about the paragraph (see L<Pod::InputObjects>
294for details).
295
296=back
297
298The base class implementation of this method simply prints the textblock
299(unmodified) to the output filehandle.
300
301=cut
302
303sub verbatim {
304 my ($self, $text, $line_num, $pod_para) = @_;
305 my $out_fh = $self->{_OUTPUT};
306 print $out_fh $text;
307}
308
309##---------------------------------------------------------------------------
310
311=head1 B<textblock()>
312
313 $parser->textblock($text,$line_num,$pod_para);
314
315This method may be overridden by subclasses to take the appropriate
316action when a normal block of POD text is encountered (although the base
317class method will usually do what you want). It is passed the following
318parameters:
319
320=over 3
321
322=item C<$text>
323
324the block of text for the POD paragraph
325
326=item C<$line_num>
327
328the line-number of the beginning of the paragraph
329
330=item C<$pod_para>
331
332a reference to a C<Pod::Paragraph> object which contains further
333information about the paragraph (see L<Pod::InputObjects>
334for details).
335
336=back
337
338In order to process interior sequences, subclass implementations of
339this method will probably want to invoke either B<interpolate()> or
340B<parse_text()>, passing it the text block C<$text>, and the corresponding
341line number in C<$line_num>, and then perform any desired processing upon
342the returned result.
343
344The base class implementation of this method simply prints the text block
345as it occurred in the input stream.
346
347=cut
348
349sub textblock {
350 my ($self, $text, $line_num, $pod_para) = @_;
351 my $out_fh = $self->{_OUTPUT};
352 print $out_fh $self->interpolate($text, $line_num);
353}
354
355##---------------------------------------------------------------------------
356
357=head1 B<interior_sequence()>
358
359 $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq);
360
361This method should be overridden by subclasses to take the appropriate
362action when an interior sequence is encountered. An interior sequence is
363an embedded command within a block of text which appears as a command
364name (usually a single uppercase character) followed immediately by a
365string of text which is enclosed in angle brackets. This method is
366passed the sequence command C<$seq_cmd> and the corresponding text
367C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior
368sequence that occurs in the string that it is passed. It should return
369the desired text string to be used in place of the interior sequence.
370The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence>
371object which contains further information about the interior sequence.
372Please see L<Pod::InputObjects> for details if you need to access this
373additional information.
374
375Subclass implementations of this method may wish to invoke the
376B<nested()> method of C<$pod_seq> to see if it is nested inside
377some other interior-sequence (and if so, which kind).
378
379The base class implementation of the B<interior_sequence()> method
380simply returns the raw text of the interior sequence (as it occurred
381in the input) to the caller.
382
383=cut
384
385sub interior_sequence {
386 my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_;
387 ## Just return the raw text of the interior sequence
388 return $pod_seq->raw_text();
389}
390
391#############################################################################
392
393=head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES
394
395B<Pod::Parser> provides several methods which subclasses may want to override
396to perform any special pre/post-processing. These methods do I<not> have to
397be overridden, but it may be useful for subclasses to take advantage of them.
398
399=cut
400
401##---------------------------------------------------------------------------
402
403=head1 B<new()>
404
405 my $parser = Pod::Parser->new();
406
407This is the constructor for B<Pod::Parser> and its subclasses. You
408I<do not> need to override this method! It is capable of constructing
409subclass objects as well as base class objects, provided you use
410any of the following constructor invocation styles:
411
412 my $parser1 = MyParser->new();
413 my $parser2 = new MyParser();
414 my $parser3 = $parser2->new();
415
416where C<MyParser> is some subclass of B<Pod::Parser>.
417
418Using the syntax C<MyParser::new()> to invoke the constructor is I<not>
419recommended, but if you insist on being able to do this, then the
420subclass I<will> need to override the B<new()> constructor method. If
421you do override the constructor, you I<must> be sure to invoke the
422B<initialize()> method of the newly blessed object.
423
424Using any of the above invocations, the first argument to the
425constructor is always the corresponding package name (or object
426reference). No other arguments are required, but if desired, an
427associative array (or hash-table) may be passed to the B<new()>
428constructor, as in:
429
430 my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 );
431 my $parser2 = new MyParser( -myflag => 1 );
432
433All arguments passed to the B<new()> constructor will be treated as
434key/value pairs in a hash-table. The newly constructed object will be
435initialized by copying the contents of the given hash-table (which may
436have been empty). The B<new()> constructor for this class and all of its
437subclasses returns a blessed reference to the initialized object (hash-table).
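The sketch below illustrates the invocation styles described above with two
stand-in classes. The package names C<Base> and C<Derived> and the C<COUNT>
field are hypothetical; C<Base> only mimics the constructor pattern
documented here so that the example can run without loading B<Pod::Parser>.

```perl
package Base;   ## stands in for Pod::Parser in this sketch
use strict;

sub new {
    my $this  = shift;
    my $class = ref($this) || $this;   ## object-ref or classname
    my $self  = { @_ };                ## remaining args become hash fields
    bless $self, $class;
    $self->initialize();
    return $self;
}

sub initialize { }                     ## does nothing in the base class

package Derived;
use vars qw(@ISA);
@ISA = qw(Base);

## Overriding initialize() (not new()) is enough to set up extra fields.
sub initialize {
    my $self = shift;
    $self->SUPER::initialize();
    $self->{COUNT} = 0 unless defined $self->{COUNT};
}

package main;

my $p1 = Derived->new( MYDATA => 'abc' );
my $p2 = $p1->new();                   ## same class as $p1, fresh hash
print ref($p1), " ", ref($p2), "\n";
```

Note that in this sketch C<$p1-E<gt>new()> creates a new object of the same
class but does not copy the fields of C<$p1>.
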
438
439=cut
440
441sub new {
442 ## Determine if we were called via an object-ref or a classname
443 my $this = shift;
444 my $class = ref($this) || $this;
445 ## Any remaining arguments are treated as initial values for the
446 ## hash that is used to represent this object.
447 my %params = @_;
448 my $self = { %params };
449 ## Bless ourselves into the desired class and perform any initialization
450 bless $self, $class;
451 $self->initialize();
452 return $self;
453}
454
455##---------------------------------------------------------------------------
456
457=head1 B<initialize()>
458
459 $parser->initialize();
460
461This method performs any necessary object initialization. It takes no
462arguments (other than the object instance of course, which is typically
463copied to a local variable named C<$self>). If subclasses override this
464method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>.
465
466=cut
467
468sub initialize {
469 #my $self = shift;
470 #return;
471}
472
473##---------------------------------------------------------------------------
474
475=head1 B<begin_pod()>
476
477 $parser->begin_pod();
478
479This method is invoked at the beginning of processing for each POD
480document that is encountered in the input. Subclasses should override
481this method to perform any per-document initialization.
482
483=cut
484
485sub begin_pod {
486 #my $self = shift;
487 #return;
488}
489
490##---------------------------------------------------------------------------
491
492=head1 B<begin_input()>
493
494 $parser->begin_input();
495
496This method is invoked by B<parse_from_filehandle()> immediately I<before>
497processing input from a filehandle. The base class implementation does
498nothing; however, subclasses may override it to perform any per-file
499initializations.
500
501Note that if multiple files are parsed for a single POD document
502(perhaps the result of some future C<=include> directive) this method
503is invoked for every file that is parsed. If you wish to perform certain
504initializations once per document, then you should use B<begin_pod()>.
505
506=cut
507
508sub begin_input {
509 #my $self = shift;
510 #return;
511}
512
513##---------------------------------------------------------------------------
514
515=head1 B<end_input()>
516
517 $parser->end_input();
518
519This method is invoked by B<parse_from_filehandle()> immediately I<after>
520processing input from a filehandle. The base class implementation does
521nothing; however, subclasses may override it to perform any per-file
522cleanup actions.
523
524Please note that if multiple files are parsed for a single POD document
525(perhaps the result of some kind of C<=include> directive) this method
526is invoked for every file that is parsed. If you wish to perform certain
527cleanup actions once per document, then you should use B<end_pod()>.
528
529=cut
530
531sub end_input {
532 #my $self = shift;
533 #return;
534}
535
536##---------------------------------------------------------------------------
537
538=head1 B<end_pod()>
539
540 $parser->end_pod();
541
542This method is invoked at the end of processing for each POD document
543that is encountered in the input. Subclasses should override this method
544to perform any per-document finalization.
545
546=cut
547
548sub end_pod {
549 #my $self = shift;
550 #return;
551}
552
553##---------------------------------------------------------------------------
554
555=head1 B<preprocess_line()>
556
557 $textline = $parser->preprocess_line($text, $line_num);
558
559This method should be overridden by subclasses that wish to perform
560any kind of preprocessing for each I<line> of input (I<before> it has
561been determined whether or not it is part of a POD paragraph). The
562parameter C<$text> is the input line; and the parameter C<$line_num> is
563the line number of the corresponding text line.
564
565The value returned should correspond to the new text to use in its
566place. If the empty string or an undefined value is returned then no
567further processing will be performed for this line.
568
569Please note that the B<preprocess_line()> method is invoked I<before>
570the B<preprocess_paragraph()> method. After all (possibly preprocessed)
571lines in a paragraph have been assembled together and it has been
572determined that the paragraph is part of the POD documentation from one
573of the selected sections, then B<preprocess_paragraph()> is invoked.
574
575The base class implementation of this method returns the given text.
576
577=cut
578
579sub preprocess_line {
580 my ($self, $text, $line_num) = @_;
581 return $text;
582}
583
584##---------------------------------------------------------------------------
585
586=head1 B<preprocess_paragraph()>
587
588 $textblock = $parser->preprocess_paragraph($text, $line_num);
589
590This method should be overridden by subclasses that wish to perform any
591kind of preprocessing for each block (paragraph) of POD documentation
592that appears in the input stream. The parameter C<$text> is the POD
593paragraph from the input file; and the parameter C<$line_num> is the
594line number for the beginning of the corresponding paragraph.
595
596The value returned should correspond to the new text to use in its
597place. If the empty string or an undefined value is returned,
598then the given C<$text> is ignored (not processed).
599
e3237417 600This method is invoked after gathering up all the lines in a paragraph
601and after determining the cutting state of the paragraph,
360aca43 602but before trying to further parse or interpret them. After
603B<preprocess_paragraph()> returns, the current cutting state (which
604is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates
e3237417 605to true then input text (including the given C<$text>) is cut (not
360aca43 606processed) until the next POD directive is encountered.
607
608Please note that the B<preprocess_line()> method is invoked I<before>
609the B<preprocess_paragraph()> method. After all (possibly preprocessed)
e3237417 610lines in a paragraph have been assembled together and either it has been
360aca43 611determined that the paragraph is part of the POD documentation from one
66aff6dd 612of the selected sections or the C<-want_nonPODs> option is true,
e3237417 613then B<preprocess_paragraph()> is invoked.
360aca43 614
615The base class implementation of this method returns the given text.
616
617=cut
618
619sub preprocess_paragraph {
620 my ($self, $text, $line_num) = @_;
621 return $text;
622}
623
624#############################################################################
625
626=head1 METHODS FOR PARSING AND PROCESSING
627
628B<Pod::Parser> provides several methods to process input text. These
664bb207 629methods typically won't need to be overridden (and in some cases they
630can't be overridden), but subclasses may want to invoke them to exploit
631their functionality.
360aca43 632
633=cut
634
635##---------------------------------------------------------------------------
636
637=head1 B<parse_text()>
638
639 $ptree1 = $parser->parse_text($text, $line_num);
640 $ptree2 = $parser->parse_text({%opts}, $text, $line_num);
641 $ptree3 = $parser->parse_text(\%opts, $text, $line_num);
642
643This method is useful if you need to perform your own interpolation
644of interior sequences and can't rely upon B<interpolate> to expand
645them in simple bottom-up order.
646
647The parameter C<$text> is a string or block of text to be parsed
648for interior sequences; and the parameter C<$line_num> is the
649line number corresponding to the beginning of C<$text>.
650
651B<parse_text()> will parse the given text into a parse-tree of "nodes"
652(text-strings and interior-sequences). Each "node" in the parse tree is either a
653text-string, or a B<Pod::InteriorSequence>. The result returned is a
654parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects>
655for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>.
656
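The implementation of B<parse_text()> in this file finds the start of each
interior sequence by splitting the text on a pattern with capturing
parentheses, so that the delimiters are kept in the token list. The
following sketch shows just that step in isolation; the sample string and
the variable names C<@tokens> and C<@starts> are illustrative only.

```perl
use strict;

my $text = "a B<bold> and C<< code >> end";

## Capturing parens make split() return the delimiters as tokens too.
my @tokens = split /([A-Z]<(?:<+\s+)?)/, $text;

## Tokens that consist solely of a sequence start, e.g. "B<" or "C<< "
my @starts = grep { /^[A-Z]<(?:<+\s+)?$/ } @tokens;

print join("|", @tokens), "\n";
```

Each token is then either a sequence start or a run of text that may contain
the closing delimiter of the sequence currently in progress.
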
657If desired, an optional hash-ref may be specified as the first argument
658to customize certain aspects of the parse-tree that is created and
659returned. The set of recognized option keywords is:
660
661=over 3
662
663=item B<-expand_seq> =E<gt> I<code-ref>|I<method-name>
664
665Normally, the parse-tree returned by B<parse_text()> will contain an
666unexpanded C<Pod::InteriorSequence> object for each interior-sequence
667encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand"
668every interior-sequence it sees by invoking the referenced function
669(or named method of the parser object) and using the return value as the
670expanded result.
671
672If a subroutine reference was given, it is invoked as:
673
674 &$code_ref( $parser, $sequence )
675
676and if a method-name was given, it is invoked as:
677
678 $parser->method_name( $sequence )
679
680where C<$parser> is a reference to the parser object, and C<$sequence>
681is a reference to the interior-sequence object.
682[I<NOTE>: If the B<interior_sequence()> method is specified, then it is
683invoked according to the interface specified in L<"interior_sequence()">].
684
664bb207 685=item B<-expand_text> =E<gt> I<code-ref>|I<method-name>
686
687Normally, the parse-tree returned by B<parse_text()> will contain a
688text-string for each contiguous sequence of characters outside of an
689interior-sequence. Specifying B<-expand_text> tells B<parse_text()> to
690"preprocess" every such text-string it sees by invoking the referenced
691function (or named method of the parser object) and using the return value
692as the preprocessed (or "expanded") result. [Note that if the result is
693an interior-sequence, then it will I<not> be expanded as specified by the
694B<-expand_seq> option; any such recursive expansion needs to be handled by
695the specified callback routine.]
696
697If a subroutine reference was given, it is invoked as:
698
699 &$code_ref( $parser, $text, $ptree_node )
700
701and if a method-name was given, it is invoked as:
702
703 $parser->method_name( $text, $ptree_node )
704
705where C<$parser> is a reference to the parser object, C<$text> is the
706text-string encountered, and C<$ptree_node> is a reference to the current
707node in the parse-tree (usually an interior-sequence object or else the
708top-level node of the parse-tree).
709
360aca43 710=item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name>
711
712Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an
713argument to the referenced subroutine (or named method of the parser
714object) and return the result instead of the parse-tree object.
715
716If a subroutine reference was given, it is invoked as:
717
718 &$code_ref( $parser, $ptree )
719
720and if a method-name was given, it is invoked as:
721
722 $parser->method_name( $ptree )
723
724where C<$parser> is a reference to the parser object, and C<$ptree>
725is a reference to the parse-tree object.
726
727=back
728
729=cut
730
360aca43 731sub parse_text {
732 my $self = shift;
733 local $_ = '';
734
735 ## Get options and set any defaults
736 my %opts = (ref $_[0]) ? %{ shift() } : ();
737 my $expand_seq = $opts{'-expand_seq'} || undef;
664bb207 738 my $expand_text = $opts{'-expand_text'} || undef;
360aca43 739 my $expand_ptree = $opts{'-expand_ptree'} || undef;
740
741 my $text = shift;
742 my $line = shift;
743 my $file = $self->input_file();
66aff6dd 744 my $cmd = "";
360aca43 745
746 ## Convert method calls into closures, for our convenience
747 my $xseq_sub = $expand_seq;
664bb207 748 my $xtext_sub = $expand_text;
360aca43 749 my $xptree_sub = $expand_ptree;
e9fdc7d2 750 if (defined $expand_seq and $expand_seq eq 'interior_sequence') {
360aca43 751 ## If 'interior_sequence' is the method to use, we have to pass
752 ## more than just the sequence object, we also need to pass the
753 ## sequence name and text.
754 $xseq_sub = sub {
755 my ($self, $iseq) = @_;
756 my $args = join("", $iseq->parse_tree->children);
757 return $self->interior_sequence($iseq->name, $args, $iseq);
758 };
759 }
760 ref $xseq_sub or $xseq_sub = sub { shift()->$expand_seq(@_) };
664bb207 761 ref $xtext_sub or $xtext_sub = sub { shift()->$expand_text(@_) };
360aca43 762 ref $xptree_sub or $xptree_sub = sub { shift()->$expand_ptree(@_) };
66aff6dd 763
360aca43 764 ## Keep track of the "current" interior sequence, and maintain a stack
765 ## of "in progress" sequences.
766 ##
767 ## NOTE that we push our own "accumulator" at the very beginning of the
768 ## stack. It's really a parse-tree, not a sequence; but it implements
769 ## the methods we need so we can use it to gather-up all the sequences
770 ## and strings we parse. Thus, by the end of our parsing, it should be
771 ## the only thing left on our stack and all we have to do is return it!
772 ##
773 my $seq = Pod::ParseTree->new();
774 my @seq_stack = ($seq);
66aff6dd 775 my ($ldelim, $rdelim) = ('', '');
360aca43 776
faee740f 777 ## Iterate over all sequence starts in the text (NOTE: split with
778 ## capturing parens keeps the delimiters)
360aca43 779 $_ = $text;
66aff6dd 780 my @tokens = split /([A-Z]<(?:<+\s+)?)/;
781 while ( @tokens ) {
782 $_ = shift @tokens;
faee740f 783 ## Look for the beginning of a sequence
66aff6dd 784 if ( /^([A-Z])(<(?:<+\s+)?)$/ ) {
e9fdc7d2 785 ## Push a new sequence onto the stack of those "in-progress"
66aff6dd 786 ($cmd, $ldelim) = ($1, $2);
360aca43 787 $seq = Pod::InteriorSequence->new(
66aff6dd 788 -name => $cmd,
789 -ldelim => $ldelim, -rdelim => '',
790 -file => $file, -line => $line
360aca43 791 );
66aff6dd 792 $ldelim =~ s/\s+$//, ($rdelim = $ldelim) =~ tr/</>/;
360aca43 793 (@seq_stack > 1) and $seq->nested($seq_stack[-1]);
794 push @seq_stack, $seq;
795 }
66aff6dd 796 ## Look for sequence ending
797 elsif ( @seq_stack > 1 ) {
798 ## Make sure we match the right kind of closing delimiter
799 my ($seq_end, $post_seq) = ("", "");
800 if ( ($ldelim eq '<' and /\A(.*?)(>)/s)
801 or /\A(.*?)(\s+$rdelim)/s )
802 {
803 ## Found end-of-sequence, capture the interior and the
804 ## closing delimiter, and put the rest back on the
805 ## token-list
806 $post_seq = substr($_, length($1) + length($2));
807 ($_, $seq_end) = ($1, $2);
808 (length $post_seq) and unshift @tokens, $post_seq;
809 }
810 if (length) {
811 ## In the middle of a sequence, append this text to it, and
812 ## don't forget to "expand" it if that's what the caller wanted
813 $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
814 $_ .= $seq_end;
815 }
816 if (length $seq_end) {
817 ## End of current sequence, record terminating delimiter
818 $seq->rdelim($seq_end);
819 ## Pop it off the stack of "in progress" sequences
820 pop @seq_stack;
821 ## Append result to its parent in current parse tree
822 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq)
823 : $seq);
824 ## Remember the current cmd-name and left-delimiter
825 $cmd = (@seq_stack > 1) ? $seq_stack[-1]->name : '';
826 $ldelim = (@seq_stack > 1) ? $seq_stack[-1]->ldelim : '';
827 $ldelim =~ s/\s+$//, ($rdelim = $ldelim) =~ tr/</>/;
828 }
360aca43 829 }
664bb207 830 elsif (length) {
831 ## Not inside any sequence, append this text to the accumulator, and
832 ## don't forget to "expand" it if that's what the caller wanted
833 $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
360aca43 834 }
66aff6dd 835 ## Keep track of line count
836 $line += tr/\n//;
837 ## Remember the "current" sequence
838 $seq = $seq_stack[-1];
360aca43 839 }
840
841 ## Handle unterminated sequences
664bb207 842 my $errorsub = (@seq_stack > 1) ? $self->errorsub() : undef;
360aca43 843 while (@seq_stack > 1) {
844 ($cmd, $file, $line) = ($seq->name, $seq->file_line);
f0963acb 845 $file = VMS::Filespec::unixify($file) if $^O eq 'VMS';
66aff6dd 846 $ldelim = $seq->ldelim;
847 ($rdelim = $ldelim) =~ tr/</>/;
848 $rdelim =~ s/^(\S+)(\s*)$/$2$1/;
360aca43 849 pop @seq_stack;
a5317591 850 my $errmsg = "*** ERROR: unterminated ${cmd}${ldelim}...${rdelim}".
66aff6dd 851 " at line $line in file $file\n";
664bb207 852 (ref $errorsub) and &{$errorsub}($errmsg)
f5daac4a 853 or (defined $errorsub) and $self->$errorsub($errmsg)
664bb207 854 or warn($errmsg);
360aca43 855 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq);
856 $seq = $seq_stack[-1];
857 }
858
859 ## Return the resulting parse-tree
860 my $ptree = (pop @seq_stack)->parse_tree;
861 return $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree;
862}
863
864##---------------------------------------------------------------------------
865
866=head1 B<interpolate()>
867
868 $textblock = $parser->interpolate($text, $line_num);
869
870This method translates all text (including any embedded interior sequences)
871in the given text string C<$text> and returns the interpolated result. The
872parameter C<$line_num> is the line number corresponding to the beginning
873of C<$text>.
874
875B<interpolate()> merely invokes a private method to recursively expand
876nested interior sequences in bottom-up order (innermost sequences are
877expanded first). If there is a need to expand nested sequences in
878some alternate order, use B<parse_text> instead.
879
880=cut
881
882sub interpolate {
883 my($self, $text, $line_num) = @_;
884 my %parse_opts = ( -expand_seq => 'interior_sequence' );
885 my $ptree = $self->parse_text( \%parse_opts, $text, $line_num );
886 return join "", $ptree->children();
887}
888
889##---------------------------------------------------------------------------
890
891=begin __PRIVATE__
892
893=head1 B<parse_paragraph()>
894
895 $parser->parse_paragraph($text, $line_num);
896
897This method takes the text of a POD paragraph to be processed, along
898with its corresponding line number, and invokes the appropriate method
899(one of B<command()>, B<verbatim()>, or B<textblock()>).
900
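As a rough illustration of the dispatch described above, the following
sketch classifies a paragraph in the same way the three callbacks are
chosen. The helper name C<classify_paragraph> is hypothetical and is not
part of B<Pod::Parser>; it ignores the cutting state and any preprocessing.

    ## Hypothetical helper, for illustration only
    sub classify_paragraph {
        my ($text) = @_;
        if ($text =~ /^(={1,2})(?=\S)/) {
            ## "=" or "==" followed by a non-space: a command paragraph
            my $rest = substr($text, length $1);
            my ($cmd) = split /(\s+)/, $rest, 2;
            return ('command', $cmd);
        }
        ## Leading whitespace marks a verbatim paragraph
        return ('verbatim') if $text =~ /^\s+/;
        ## Anything else is an ordinary block of text
        return ('textblock');
    }

With this sketch, a paragraph beginning with C<=head1 NAME> is classified
as a command named C<head1>, an indented paragraph as verbatim, and
everything else as a textblock.
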
664bb207 901For performance reasons, this method is invoked directly without any
902 dynamic lookup; hence subclasses may I<not> override it!
360aca43 903
904=end __PRIVATE__
905
906=cut
907
908sub parse_paragraph {
909 my ($self, $text, $line_num) = @_;
664bb207 910 local *myData = $self; ## alias to avoid deref-ing overhead
911 local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options
360aca43 912 local $_;
913
664bb207 914 ## See if we want to preprocess nonPOD paragraphs as well as POD ones.
e3237417 915 my $wantNonPods = $myOpts{'-want_nonPODs'};
916
917 ## Update cutting status
918 $myData{_CUTTING} = 0 if $text =~ /^={1,2}\S/;
664bb207 919
920 ## Perform any desired preprocessing if we wanted it this early
921 $wantNonPods and $text = $self->preprocess_paragraph($text, $line_num);
922
360aca43 923 ## Ignore up until next POD directive if we are cutting
e3237417 924 return if $myData{_CUTTING};
360aca43 925
926 ## Now we know this is a block of text in a POD section!
927
928 ##-----------------------------------------------------------------
929 ## This is a hook (hack ;-) for Pod::Select to do its thing without
930 ## having to override methods, but also without Pod::Parser assuming
931 ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS
932 ## field exists then we assume there is an is_selected() method for
933 ## us to invoke (calling $self->can('is_selected') could verify this
934 ## but that is more overhead than I want to incur))
935 ##-----------------------------------------------------------------
936
937 ## Ignore this block if it isnt in one of the selected sections
938 if (exists $myData{_SELECTED_SECTIONS}) {
939 $self->is_selected($text) or return ($myData{_CUTTING} = 1);
940 }
941
664bb207 942 ## If we havent already, perform any desired preprocessing and
943 ## then re-check the "cutting" state
944 unless ($wantNonPods) {
945 $text = $self->preprocess_paragraph($text, $line_num);
946 return 1 unless ((defined $text) and (length $text));
947 return 1 if ($myData{_CUTTING});
948 }
360aca43 949
950 ## Look for one of the three types of paragraphs
951 my ($pfx, $cmd, $arg, $sep) = ('', '', '', '');
952 my $pod_para = undef;
953 if ($text =~ /^(={1,2})(?=\S)/) {
954 ## Looks like a command paragraph. Capture the command prefix used
955 ## ("=" or "=="), as well as the command-name, its paragraph text,
956 ## and whatever sequence of characters was used to separate them
957 $pfx = $1;
958 $_ = substr($text, length $pfx);
d23ed1f2 959 ($cmd, $sep, $text) = split /(\s+)/, $_, 2;
360aca43 960 ## If this is a "cut" directive then we dont need to do anything
961 ## except return to "cutting" mode.
962 if ($cmd eq 'cut') {
963 $myData{_CUTTING} = 1;
664bb207 964 return unless $myOpts{'-process_cut_cmd'};
360aca43 965 }
966 }
967 ## Save the attributes indicating how the command was specified.
968 $pod_para = new Pod::Paragraph(
969 -name => $cmd,
970 -text => $text,
971 -prefix => $pfx,
972 -separator => $sep,
973 -file => $myData{_INFILE},
974 -line => $line_num
975 );
976 # ## Invoke appropriate callbacks
977 # if (exists $myData{_CALLBACKS}) {
978 # ## Look through the callback list, invoke callbacks,
979 # ## then see if we need to do the default actions
980 # ## (invoke_callbacks will return true if we do).
981 # return 1 unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para);
982 # }
983 if (length $cmd) {
984 ## A command paragraph
985 $self->command($cmd, $text, $line_num, $pod_para);
986 }
987 elsif ($text =~ /^\s+/) {
988 ## Indented text - must be a verbatim paragraph
989 $self->verbatim($text, $line_num, $pod_para);
990 }
991 else {
992 ## Looks like an ordinary block of text
993 $self->textblock($text, $line_num, $pod_para);
994 }
995 return 1;
996}
997
998##---------------------------------------------------------------------------
999
1000=head1 B<parse_from_filehandle()>
1001
1002 $parser->parse_from_filehandle($in_fh,$out_fh);
1003
1004This method takes an input filehandle (which is assumed to already be
1005opened for reading) and reads the entire input stream looking for blocks
1006(paragraphs) of POD documentation to be processed. If no first argument
1007is given the default input filehandle C<STDIN> is used.
1008
1009The C<$in_fh> parameter may be any object that provides a B<getline()>
1010method to retrieve a single line of input text (hence, an appropriate
1011wrapper object could be used to parse PODs from a single string or an
1012array of strings).
1013
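A minimal sketch of such a wrapper follows. The package name
C<My::StringLines> is hypothetical; the only thing assumed about the
interface is a B<getline()> method that returns the next line of input,
or C<undef> once the input is exhausted.

    package My::StringLines;    ## hypothetical name

    sub new {
        my ($class, $string) = @_;
        ## Split into lines, keeping each line terminator
        my @lines = ($string =~ /([^\n]*\n|[^\n]+)/g);
        return bless { lines => \@lines }, $class;
    }

    sub getline {
        my $self = shift;
        return shift @{ $self->{lines} };    ## undef once exhausted
    }

    package main;

    my $src = My::StringLines->new("=head1 NAME\n\nfoo - bar\n");
    ## $parser->parse_from_filehandle($src);

Because the class name is neither C<FileHandle> nor of the form
C<IO::...>, the object should be read through its B<getline()> method
rather than with the C<E<lt>$fhE<gt>> operator.
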
1014Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled
1015into paragraphs or "blocks" (which are separated by lines containing
1016nothing but whitespace). For each block of POD documentation
1017encountered it will invoke a method to parse the given paragraph.
1018
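The assembly of lines into paragraphs can be sketched roughly as follows.
This is a simplified illustration rather than the actual implementation:
the function name C<split_paragraphs> is hypothetical, and both the
special handling of one-line C<==> commands and the preprocessing of
lines are omitted.

    ## Hypothetical helper, for illustration only
    sub split_paragraphs {
        my @lines = @_;
        my @paragraphs;
        my $paragraph = '';
        foreach my $line (@lines) {
            $paragraph .= $line;
            ## A line of nothing but whitespace ends the paragraph
            if ($line =~ /^[^\S\r\n]*[\r\n]*$/) {
                push @paragraphs, $paragraph;
                $paragraph = '';
            }
        }
        ## Keep a final paragraph that has no trailing blank line
        push @paragraphs, $paragraph if length $paragraph;
        return @paragraphs;
    }

Note that in this sketch each paragraph keeps the blank line that
terminated it, since the line is appended before the check is made.
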
1019If a second argument is given then it should correspond to a filehandle where
1020output should be sent (otherwise the default output filehandle is
1021C<STDOUT> if no output filehandle is currently in use).
1022
1023B<NOTE:> For performance reasons, this method caches the input stream at
1024the top of the stack in a local variable. Any attempts by clients to
1025 change the stack contents while this method is executing
1026 I<will not affect> the input stream used by the current
1027invocation of this method.
1028
1029This method does I<not> usually need to be overridden by subclasses.
1030
1031=cut
1032
1033sub parse_from_filehandle {
1034 my $self = shift;
1035 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
1036 my ($in_fh, $out_fh) = @_;
22641bdf 1037 $in_fh = \*STDIN unless ($in_fh);
a5317591 1038 local *myData = $self; ## alias to avoid deref-ing overhead
1039 local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options
360aca43 1040 local $_;
1041
1042 ## Put this stream at the top of the stack and do beginning-of-input
1043 ## processing. NOTE that $in_fh might be reset during this process.
1044 my $topstream = $self->_push_input_stream($in_fh, $out_fh);
1045 (exists $opts{-cutting}) and $self->cutting( $opts{-cutting} );
1046
1047 ## Initialize line/paragraph
1048 my ($textline, $paragraph) = ('', '');
1049 my ($nlines, $plines) = (0, 0);
1050
1051 ## Use <$fh> instead of $fh->getline where possible (for speed)
1052 $_ = ref $in_fh;
1053 my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/ or tied $in_fh);
1054
1055 ## Read paragraphs line-by-line
1056 while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) {
1057 $textline = $self->preprocess_line($textline, ++$nlines);
1058 next unless ((defined $textline) && (length $textline));
1059 $_ = $paragraph; ## save previous contents
1060
1061 if ((! length $paragraph) && ($textline =~ /^==/)) {
1062 ## '==' denotes a one-line command paragraph
1063 $paragraph = $textline;
1064 $plines = 1;
1065 $textline = '';
1066 } else {
1067 ## Append this line to the current paragraph
1068 $paragraph .= $textline;
1069 ++$plines;
1070 }
1071
66aff6dd 1072 ## See if this line is blank and ends the current paragraph.
360aca43 1073 ## If it isnt, then keep iterating until it is.
a5317591 1074 next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/)
1075 && (length $paragraph));
66aff6dd 1076
1077 ## Issue a warning about any non-empty blank lines
a5317591 1078 if (length($1) > 1 and $myOpts{'-warnings'} and ! $myData{_CUTTING}) {
1079 my $errorsub = $self->errorsub();
1080 my $file = $self->input_file();
1081 $file = VMS::Filespec::unixify($file) if $^O eq 'VMS';
1082 my $errmsg = "*** WARNING: line containing nothing but whitespace".
1083 " in paragraph at line $nlines in file $file\n";
1084 (ref $errorsub) and &{$errorsub}($errmsg)
1085 or (defined $errorsub) and $self->$errorsub($errmsg)
1086 or warn($errmsg);
1087 }
360aca43 1088
1089 ## Now process the paragraph
1090 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1);
1091 $paragraph = '';
1092 $plines = 0;
1093 }
1094 ## Dont forget about the last paragraph in the file
1095 if (length $paragraph) {
1096 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1)
1097 }
1098
1099 ## Now pop the input stream off the top of the input stack.
1100 $self->_pop_input_stream();
1101}
1102
1103##---------------------------------------------------------------------------
1104
1105=head1 B<parse_from_file()>
1106
1107 $parser->parse_from_file($filename,$outfile);
1108
1109This method takes a filename and does the following:
1110
1111=over 2
1112
1113=item *
1114
1115 opens the input file for reading and the output file for writing
1116(creating the appropriate filehandles)
1117
1118=item *
1119
1120invokes the B<parse_from_filehandle()> method passing it the
1121corresponding input and output filehandles.
1122
1123=item *
1124
1125closes the input and output files.
1126
1127=back
1128
1129If the special input filename "-" or "<&STDIN" is given then the STDIN
1130filehandle is used for input (and no open or close is performed). If no
1131input filename is specified then "-" is implied.
1132
1133If a second argument is given then it should be the name of the desired
1134output file. If the special output filename "-" or ">&STDOUT" is given
1135then the STDOUT filehandle is used for output (and no open or close is
1136performed). If the special output filename ">&STDERR" is given then the
1137STDERR filehandle is used for output (and no open or close is
1138performed). If no output filehandle is currently in use and no output
1139filename is specified, then "-" is implied.
1140
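The special output names can be recognized with patterns along the
following lines. The helper C<classify_output_name> is hypothetical and
only illustrates the matching; it does not open or close anything.

    ## Hypothetical helper, for illustration only
    sub classify_output_name {
        my ($name) = @_;
        return 'stdout' if $name eq '-' or $name =~ /^>&?(?:STDOUT|1)$/i;
        return 'stderr' if $name =~ /^>&(?:STDERR|2)$/i;
        return 'file';    ## anything else is treated as a filename
    }

Any name that matches neither pattern is taken to be an ordinary
filename to be opened for writing.
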
1141This method does I<not> usually need to be overridden by subclasses.
1142
1143=cut
1144
1145sub parse_from_file {
1146 my $self = shift;
1147 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : ();
1148 my ($infile, $outfile) = @_;
475d79b5 1149 my ($in_fh, $out_fh);
360aca43 1150 my ($close_input, $close_output) = (0, 0);
1151 local *myData = $self;
1152 local $_;
1153
1154 ## Is $infile a filename or a (possibly implied) filehandle
1155 $infile = '-' unless ((defined $infile) && (length $infile));
1156 if (($infile eq '-') || ($infile =~ /^<&(STDIN|0)$/i)) {
1157 ## Not a filename, just a string implying STDIN
1158 $myData{_INFILE} = "<standard input>";
1159 $in_fh = \*STDIN;
1160 }
1161 elsif (ref $infile) {
1162 ## Must be a filehandle-ref (or else assume its a ref to an object
1163 ## that supports the common IO read operations).
1164 $myData{_INFILE} = ${$infile};
1165 $in_fh = $infile;
1166 }
1167 else {
1168 ## We have a filename, open it for reading
1169 $myData{_INFILE} = $infile;
475d79b5 1170 open($in_fh, "< $infile") or
360aca43 1171 croak "Can't open $infile for reading: $!\n";
1172 $close_input = 1;
1173 }
1174
1175 ## NOTE: we need to be *very* careful when "defaulting" the output
1176 ## file. We only want to use a default if this is the beginning of
1177 ## the entire document (but *not* if this is an included file). We
1178 ## determine this by seeing if the input stream stack has been set-up
1179 ## already
1180 ##
1181 unless ((defined $outfile) && (length $outfile)) {
1182 (defined $myData{_TOP_STREAM}) && ($out_fh = $myData{_OUTPUT})
1183 || ($outfile = '-');
1184 }
1185 ## Is $outfile a filename or a (possibly implied) filehandle
1186 if ((defined $outfile) && (length $outfile)) {
1187 if (($outfile eq '-') || ($outfile =~ /^>&?(?:STDOUT|1)$/i)) {
1188 ## Not a filename, just a string implying STDOUT
1189 $myData{_OUTFILE} = "<standard output>";
1190 $out_fh = \*STDOUT;
1191 }
1192 elsif ($outfile =~ /^>&(STDERR|2)$/i) {
1193 ## Not a filename, just a string implying STDERR
1194 $myData{_OUTFILE} = "<standard error>";
1195 $out_fh = \*STDERR;
1196 }
1197 elsif (ref $outfile) {
1198 ## Must be a filehandle-ref (or else assume its a ref to an
1199 ## object that supports the common IO write operations).
1200 $myData{_OUTFILE} = ${$outfile};
1201 $out_fh = $outfile;
1202 }
1203 else {
1204 ## We have a filename, open it for writing
1205 $myData{_OUTFILE} = $outfile;
475d79b5 1206 open($out_fh, "> $outfile") or
360aca43 1207 croak "Can't open $outfile for writing: $!\n";
1208 $close_output = 1;
1209 }
1210 }
1211
1212 ## Whew! That was a lot of work to set up reasonably robust behavior
1213 ## in the case of a non-filename for reading and writing. Now we just
1214 ## have to parse the input and close the handles when we're finished.
1215 $self->parse_from_filehandle(\%opts, $in_fh, $out_fh);
1216
1217 $close_input and
1218 close($in_fh) || croak "Can't close $infile after reading: $!\n";
1219 $close_output and
1220 close($out_fh) || croak "Can't close $outfile after writing: $!\n";
1221}
1222
1223#############################################################################
1224
1225=head1 ACCESSOR METHODS
1226
1227Clients of B<Pod::Parser> should use the following methods to access
1228instance data fields:
1229
1230=cut
1231
1232##---------------------------------------------------------------------------
1233
664bb207 1234=head1 B<errorsub()>
1235
1236 $parser->errorsub("method_name");
1237 $parser->errorsub(\&warn_user);
1238 $parser->errorsub(sub { print STDERR @_ });
1239
1240Specifies the method or subroutine to use when printing error messages
1241about POD syntax. The supplied method/subroutine I<must> return TRUE upon
1242successful printing of the message. If C<undef> is given, then the B<warn>
1243builtin is used to issue error messages (this is the default behavior).
1244
1245 my $errorsub = $parser->errorsub();
1246 my $errmsg = "This is an error message!\n";
1247 (ref $errorsub) and &{$errorsub}($errmsg)
e3237417 1248 or (defined $errorsub) and $parser->$errorsub($errmsg)
664bb207 1249 or warn($errmsg);
1250
1251Returns a method name, or else a reference to the user-supplied subroutine
1252used to print error messages. Returns C<undef> if the B<warn> builtin
1253is used to issue error messages (this is the default behavior).
1254
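The following sketch shows why the handler must return a true value: a
false return falls through to the remaining alternatives, and eventually
to B<warn>. The package C<My::Reporter> and its B<collect()> and
B<report()> methods are hypothetical stand-ins for a parser subclass.

    package My::Reporter;    ## hypothetical stand-in for a parser subclass

    sub new {
        my $class = shift;
        return bless { _ERRORSUB => undef, _LOG => [] }, $class;
    }

    sub errorsub {
        return (@_ > 1) ? ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB};
    }

    ## A handler invoked by method name; note the true return value
    sub collect {
        my ($self, $errmsg) = @_;
        push @{ $self->{_LOG} }, $errmsg;
        return 1;
    }

    sub report {
        my ($self, $errmsg) = @_;
        my $errorsub = $self->errorsub();
        (ref $errorsub) and &{$errorsub}($errmsg)
            or (defined $errorsub) and $self->$errorsub($errmsg)
            or warn($errmsg);
    }

    package main;

    my $reporter = My::Reporter->new();
    $reporter->errorsub('collect');
    $reporter->report("*** ERROR: example message\n");

Here the message is stored by B<collect()> and nothing is printed,
because the handler reports success.
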
1255=cut
1256
1257sub errorsub {
1258 return (@_ > 1) ? ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB};
1259}
1260
1261##---------------------------------------------------------------------------
1262
360aca43 1263=head1 B<cutting()>
1264
1265 $boolean = $parser->cutting();
1266
1267Returns the current C<cutting> state: a boolean-valued scalar which
1268evaluates to true if text from the input file is currently being "cut"
1269(meaning it is I<not> considered part of the POD document).
1270
1271 $parser->cutting($boolean);
1272
1273Sets the current C<cutting> state to the given value and returns the
1274result.
1275
1276=cut
1277
1278sub cutting {
1279 return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING};
1280}
1281
1282##---------------------------------------------------------------------------
1283
664bb207 1284##---------------------------------------------------------------------------
1285
1286=head1 B<parseopts()>
1287
1288When invoked with no additional arguments, B<parseopts> returns a hashtable
1289of all the current parsing options.
1290
1291 ## See if we are parsing non-POD sections as well as POD ones
1292 my %opts = $parser->parseopts();
1293 $opts{'-want_nonPODs'} and print "-want_nonPODs\n";
1294
1295When invoked using a single string, B<parseopts> treats the string as the
1296name of a parse-option and returns its corresponding value if it exists
1297(returns C<undef> if it doesn't).
1298
1299 ## Did we ask to see '=cut' paragraphs?
1300 my $want_cut = $parser->parseopts('-process_cut_cmd');
1301 $want_cut and print "-process_cut_cmd\n";
1302
1303When invoked with multiple arguments, B<parseopts> treats them as
1304key/value pairs and the specified parse-option names are set to the
1305given values. Any unspecified parse-options are unaffected.
1306
1307 ## Set them back to the default
a5317591 1308 $parser->parseopts(-warnings => 0);
664bb207 1309
1310When passed a single hash-ref, B<parseopts> uses that hash to completely
1311 reset the existing parse-options; all previous parse-option values
1312are lost.
1313
1314 ## Reset all options to default
1315 $parser->parseopts( { } );
1316
a5317591 1317See L<"PARSING OPTIONS"> for more information on the name and meaning of each
664bb207 1318parse-option currently recognized.
1319
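The four calling conventions can be exercised together as in the sketch
below. To keep it self-contained, C<My::Opts> is a hypothetical,
simplified stand-in that mimics the behavior described above using an
ordinary hash reference; it is not the actual implementation.

    package My::Opts;    ## hypothetical, simplified stand-in

    sub new {
        my $class = shift;
        return bless { _PARSEOPTS => {} }, $class;
    }

    sub parseopts {
        my $self = shift;
        my $opts = ($self->{_PARSEOPTS} ||= {});
        return %$opts if (@_ == 0);              ## no args: all options
        if (@_ == 1) {
            my $arg = shift;
            return ref($arg)
                ? ($self->{_PARSEOPTS} = $arg)   ## hash-ref: full reset
                : $opts->{$arg};                 ## name: look up value
        }
        return $self->{_PARSEOPTS} = { %$opts, @_ };   ## pairs: merge
    }

    package main;

    my $p = My::Opts->new();
    $p->parseopts(-warnings => 1, -want_nonPODs => 1);  ## set two options
    $p->parseopts(-warnings => 0);                      ## others unaffected
    my $want = $p->parseopts('-want_nonPODs');          ## still set
    $p->parseopts( { } );                               ## everything cleared

The point of the sketch is the difference between the last two forms:
key/value pairs are merged into the existing options, whereas a hash-ref
replaces them wholesale.
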
1320=cut
1321
1322sub parseopts {
1323 local *myData = shift;
1324 local *myOpts = ($myData{_PARSEOPTS} ||= {});
1325 return %myOpts if (@_ == 0);
1326 if (@_ == 1) {
1327 local $_ = shift;
1328 return ref($_) ? $myData{_PARSEOPTS} = $_ : $myOpts{$_};
1329 }
1330 my @newOpts = (%myOpts, @_);
1331 $myData{_PARSEOPTS} = { @newOpts };
1332}
1333
1334##---------------------------------------------------------------------------
1335
360aca43 1336=head1 B<output_file()>
1337
1338 $fname = $parser->output_file();
1339
1340Returns the name of the output file being written.
1341
1342=cut
1343
1344sub output_file {
1345 return $_[0]->{_OUTFILE};
1346}
1347
1348##---------------------------------------------------------------------------
1349
1350=head1 B<output_handle()>
1351
1352 $fhandle = $parser->output_handle();
1353
1354Returns the output filehandle object.
1355
1356=cut
1357
1358sub output_handle {
1359 return $_[0]->{_OUTPUT};
1360}
1361
1362##---------------------------------------------------------------------------
1363
1364=head1 B<input_file()>
1365
1366 $fname = $parser->input_file();
1367
1368Returns the name of the input file being read.
1369
1370=cut
1371
1372sub input_file {
1373 return $_[0]->{_INFILE};
1374}
1375
1376##---------------------------------------------------------------------------
1377
1378=head1 B<input_handle()>
1379
1380 $fhandle = $parser->input_handle();
1381
1382Returns the current input filehandle object.
1383
1384=cut
1385
1386sub input_handle {
1387 return $_[0]->{_INPUT};
1388}
1389
1390##---------------------------------------------------------------------------
1391
1392=begin __PRIVATE__
1393
1394=head1 B<input_streams()>
1395
1396 $listref = $parser->input_streams();
1397
1398Returns a reference to an array which corresponds to the stack of all
1399the input streams that are currently in the middle of being parsed.
1400
1401While parsing an input stream, it is possible to invoke
1402B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input
1403stream and then return to parsing the previous input stream. Each input
1404stream to be parsed is pushed onto the end of this input stack
1405before any of its input is read. The input stream that is currently
1406being parsed is always at the end (or top) of the input stack. When an
1407input stream has been exhausted, it is popped off the end of the
1408input stack.
1409
1410 Each element on this input stack is a reference to a C<Pod::InputSource>
1411object. Please see L<Pod::InputObjects> for more details.
1412
1413This method might be invoked when printing diagnostic messages, for example,
1414 to obtain the name and line number of all the input files that are currently
1415being processed.
1416
1417=end __PRIVATE__
1418
1419=cut
1420
1421sub input_streams {
1422 return $_[0]->{_INPUT_STREAMS};
1423}
1424
1425##---------------------------------------------------------------------------
1426
1427=begin __PRIVATE__
1428
1429=head1 B<top_stream()>
1430
1431 $hashref = $parser->top_stream();
1432
1433Returns a reference to the hash-table that represents the element
1434that is currently at the top (end) of the input stream stack
1435 (see L<"input_streams()">). The return value will be C<undef>
1436if the input stack is empty.
1437
1438This method might be used when printing diagnostic messages, for example,
1439to obtain the name and line number of the current input file.
1440
1441=end __PRIVATE__
1442
1443=cut
1444
1445sub top_stream {
1446 return $_[0]->{_TOP_STREAM} || undef;
1447}
1448
1449#############################################################################
1450
1451=head1 PRIVATE METHODS AND DATA
1452
1453B<Pod::Parser> makes use of several internal methods and data fields
1454which clients should not need to see or use. For the sake of avoiding
1455name collisions for client data and methods, these methods and fields
1456are briefly discussed here. Determined hackers may obtain further
1457information about them by reading the B<Pod::Parser> source code.
1458
1459Private data fields are stored in the hash-object whose reference is
1460returned by the B<new()> constructor for this class. The names of all
1461private methods and data-fields used by B<Pod::Parser> begin with a
1462prefix of "_" and match the regular expression C</^_\w+$/>.
1463
1464=cut
1465
1466##---------------------------------------------------------------------------
1467
1468=begin _PRIVATE_
1469
1470=head1 B<_push_input_stream()>
1471
1472 $hashref = $parser->_push_input_stream($in_fh,$out_fh);
1473
1474This method will push the given input stream on the input stack and
1475perform any necessary beginning-of-document or beginning-of-file
1476processing. The argument C<$in_fh> is the input stream filehandle to
1477push, and C<$out_fh> is the corresponding output filehandle to use (if
1478it is not given or is undefined, then the current output stream is used,
1479 which defaults to standard output if it doesn't exist yet).
1480
1481 The value returned will be a reference to the hash-table that represents
1482the new top of the input stream stack. I<Please Note> that it is
1483possible for this method to use default values for the input and output
1484 file handles. If this happens, you will need to look at the C<_INPUT>
1485 and C<_OUTPUT> instance data members to determine their new values.
1486
1487=end _PRIVATE_
1488
1489=cut
1490
1491sub _push_input_stream {
1492 my ($self, $in_fh, $out_fh) = @_;
1493 local *myData = $self;
1494
1495 ## Initialize stuff for the entire document if this is *not*
1496 ## an included file.
1497 ##
1498 ## NOTE: we need to be *very* careful when "defaulting" the output
1499 ## filehandle. We only want to use a default value if this is the
1500 ## beginning of the entire document (but *not* if this is an included
1501 ## file).
1502 unless (defined $myData{_TOP_STREAM}) {
1503 $out_fh = \*STDOUT unless (defined $out_fh);
1504 $myData{_CUTTING} = 1; ## current "cutting" state
1505 $myData{_INPUT_STREAMS} = []; ## stack of all input streams
1506 }
1507
1508 ## Initialize input indicators
1509 $myData{_OUTFILE} = '(unknown)' unless (defined $myData{_OUTFILE});
1510 $myData{_OUTPUT} = $out_fh if (defined $out_fh);
1511 $in_fh = \*STDIN unless (defined $in_fh);
1512 $myData{_INFILE} = '(unknown)' unless (defined $myData{_INFILE});
1513 $myData{_INPUT} = $in_fh;
1514 my $input_top = $myData{_TOP_STREAM}
1515 = new Pod::InputSource(
1516 -name => $myData{_INFILE},
1517 -handle => $in_fh,
1518 -was_cutting => $myData{_CUTTING}
1519 );
1520 local *input_stack = $myData{_INPUT_STREAMS};
1521 push(@input_stack, $input_top);
1522
1523 ## Perform beginning-of-document and/or beginning-of-input processing
1524 $self->begin_pod() if (@input_stack == 1);
1525 $self->begin_input();
1526
1527 return $input_top;
1528}
1529
1530##---------------------------------------------------------------------------
1531
1532=begin _PRIVATE_
1533
1534=head1 B<_pop_input_stream()>
1535
1536 $hashref = $parser->_pop_input_stream();
1537
1538This takes no arguments. It will perform any necessary end-of-file or
1539end-of-document processing and then pop the current input stream from
1540the top of the input stack.
1541
1542 The value returned will be a reference to the hash-table that represents
1543the new top of the input stream stack.
1544
1545=end _PRIVATE_
1546
1547=cut
1548
1549sub _pop_input_stream {
1550 my ($self) = @_;
1551 local *myData = $self;
1552 local *input_stack = $myData{_INPUT_STREAMS};
1553
1554 ## Perform end-of-input and/or end-of-document processing
1555 $self->end_input() if (@input_stack > 0);
1556 $self->end_pod() if (@input_stack == 1);
1557
1558 ## Restore cutting state to whatever it was before we started
1559 ## parsing this file.
1560 my $old_top = pop(@input_stack);
1561 $myData{_CUTTING} = $old_top->was_cutting();
1562
1563 ## Dont forget to reset the input indicators
1564 my $input_top = undef;
1565 if (@input_stack > 0) {
1566 $input_top = $myData{_TOP_STREAM} = $input_stack[-1];
1567 $myData{_INFILE} = $input_top->name();
1568 $myData{_INPUT} = $input_top->handle();
1569 } else {
1570 delete $myData{_TOP_STREAM};
1571 delete $myData{_INPUT_STREAMS};
1572 }
1573
1574 return $input_top;
1575}
1576
1577#############################################################################
1578
664bb207 1579=head1 TREE-BASED PARSING
1580
1581 If straightforward stream-based parsing won't meet your needs (as is
1582likely the case for tasks such as translating PODs into structured
1583markup languages like HTML and XML) then you may need to take the
1584tree-based approach. Rather than doing everything in one pass and
1585calling the B<interpolate()> method to expand sequences into text, it
1586may be desirable to instead create a parse-tree using the B<parse_text()>
1587method to return a tree-like structure which may contain an ordered list
1588 of children (each of which may be a text-string, or a similar
1589tree-like structure).
1590
1591Pay special attention to L<"METHODS FOR PARSING AND PROCESSING"> and
1592to the objects described in L<Pod::InputObjects>. The former describes
1593the gory details and parameters for how to customize and extend the
1594parsing behavior of B<Pod::Parser>. B<Pod::InputObjects> provides
1595several objects that may all be used interchangeably as parse-trees. The
1596most obvious one is the B<Pod::ParseTree> object. It defines the basic
1597 interface and functionality that anything acting as a POD parse-tree
1598 should provide. A B<Pod::ParseTree> is defined such that each "node" may be a
1599text-string, or a reference to another parse-tree. Each B<Pod::Paragraph>
1600object and each B<Pod::InteriorSequence> object also supports the basic
1601parse-tree interface.
1602
1603The B<parse_text()> method takes a given paragraph of text, and
1604returns a parse-tree that contains one or more children, each of which
1605may be a text-string, or an InteriorSequence object. There are also
1606callback-options that may be passed to B<parse_text()> to customize
1607the way it expands or transforms interior-sequences, as well as the
1608returned result. These callbacks can be used to create a parse-tree
1609with custom-made objects (which may or may not support the parse-tree
1610interface, depending on how you choose to do it).
1611
1612If you wish to turn an entire POD document into a parse-tree, that process
1613is fairly straightforward. The B<parse_text()> method is the key to doing
1614this successfully. Every paragraph-callback (i.e. the polymorphic methods
1615for B<command()>, B<verbatim()>, and B<textblock()> paragraphs) takes
1616a B<Pod::Paragraph> object as an argument. Each paragraph object has a
1617B<parse_tree()> method that can be used to get or set a corresponding
1618parse-tree. So for each of those paragraph-callback methods, simply call
1619B<parse_text()> with the options you desire, and then use the returned
1620parse-tree to assign to the given paragraph object.
1621
1622That gives you a parse-tree for each paragraph - so now all you need is
1623an ordered list of paragraphs. You can maintain that yourself as a data
1624element in the object/hash. The most straightforward way would be simply
1625to use an array-ref, with the desired set of custom "options" for each
1626invocation of B<parse_text>. Let's assume the desired option-set is
1627given by the hash C<%options>. Then we might do something like the
1628following:
1629
1630 package MyPodParserTree;
1631
1632 @ISA = qw( Pod::Parser );
1633
1634 ...
1635
1636 sub begin_pod {
1637 my $self = shift;
1638 $self->{'-paragraphs'} = []; ## initialize paragraph list
1639 }
1640
1641 sub command {
1642 my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
1643 my $ptree = $parser->parse_text({%options}, $paragraph, ...);
1644 $pod_para->parse_tree( $ptree );
1645 push @{ $parser->{'-paragraphs'} }, $pod_para;
1646 }
1647
1648 sub verbatim {
1649 my ($parser, $paragraph, $line_num, $pod_para) = @_;
1650 push @{ $parser->{'-paragraphs'} }, $pod_para;
1651 }
1652
1653 sub textblock {
1654 my ($parser, $paragraph, $line_num, $pod_para) = @_;
1655 my $ptree = $parser->parse_text({%options}, $paragraph, ...);
1656 $pod_para->parse_tree( $ptree );
1657 push @{ $parser->{'-paragraphs'} }, $pod_para;
1658 }
1659
1660 ...
1661
1662 package main;
1663 ...
1664 my $parser = new MyPodParserTree(...);
1665 $parser->parse_from_file(...);
1666 my $paragraphs_ref = $parser->{'-paragraphs'};
1667
1668Of course, in this module-author's humble opinion, I'd be more inclined to
1669use the existing B<Pod::ParseTree> object than a simple array. That way
1670everything in it, paragraphs and sequences, all respond to the same core
1671interface for all parse-tree nodes. The result would look something like:
1672
    package MyPodParserTree2;

    ...

    sub begin_pod {
        my $self = shift;
        $self->{'-ptree'} = Pod::ParseTree->new;  ## initialize parse-tree
    }

    sub parse_tree {
        ## convenience method to get/set the parse-tree for the entire POD
        (@_ > 1)  and  $_[0]->{'-ptree'} = $_[1];
        return $_[0]->{'-ptree'};
    }

    sub command {
        my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
        my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
        $pod_para->parse_tree( $ptree );
        $parser->parse_tree()->append( $pod_para );
    }

    sub verbatim {
        my ($parser, $paragraph, $line_num, $pod_para) = @_;
        $parser->parse_tree()->append( $pod_para );
    }

    sub textblock {
        my ($parser, $paragraph, $line_num, $pod_para) = @_;
        my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
        $pod_para->parse_tree( $ptree );
        $parser->parse_tree()->append( $pod_para );
    }

    ...

    package main;
    ...
    my $parser = MyPodParserTree2->new(...);
    $parser->parse_from_file(...);
    my $ptree = $parser->parse_tree;
    ...

Now you have the entire POD document as one great big parse-tree. You
can even use the B<-expand_seq> option to B<parse_text> to insert
entirely different kinds of objects. Just don't expect B<Pod::Parser>
to know what to do with them after that; handling them is up to your
code. Alternatively, you can insert any object you like so long as
it conforms to the B<Pod::ParseTree> interface.

One could use this to create subclasses of B<Pod::Paragraph> and
B<Pod::InteriorSequence> for specific commands (or to create your own
custom node-types in the parse-tree) and add some kind of B<emit()>
method to each custom node/subclass object in the tree. Then all you'd
need to do is recursively walk the tree in the desired order, processing
the children (most likely from left to right) by formatting them if
they are text-strings, or by calling their B<emit()> method if they
are objects/references.
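
The recursive walk described above might be sketched roughly as follows.
This is only an illustration under stated assumptions: C<My::Node> and
C<My::Bold> are hypothetical stand-in classes invented for this example,
not part of B<Pod::InputObjects>, and "formatting" a text-string here
simply means passing it through unchanged. A real translator would more
likely hang B<emit()> on its own subclasses of the parse-tree node classes.

```perl
use strict;
use warnings;

## Hypothetical stand-in for a parse-tree node: it keeps an ordered list
## of children, each either a plain text-string or another node object.
package My::Node;

sub new {
    my ($class, @children) = @_;
    return bless { '-children' => [@children] }, $class;
}

sub children { return @{ $_[0]->{'-children'} }; }

## Default emit(): walk the children from left to right. Text-strings
## are passed through as-is; objects/references emit themselves.
sub emit {
    my $self = shift;
    return join '', map { ref($_) ? $_->emit() : $_ } $self->children();
}

## Hypothetical custom node-type, e.g. for a B<...> interior sequence.
package My::Bold;
our @ISA = ('My::Node');

## Wrap whatever the children emit in a pair of asterisks.
sub emit {
    my $self = shift;
    return '*' . $self->SUPER::emit() . '*';
}

package main;

my $tree = My::Node->new(
    'Walk ', My::Bold->new('every ', My::Node->new('node')), ' once.',
);
my $output = $tree->emit();
print "$output\n";    ## prints "Walk *every node* once."
```

Because every node answers to the same B<emit()> method, the walker never
needs to know which node-types exist; adding a new kind of node only means
adding another subclass that overrides B<emit()>.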

=head1 SEE ALSO

L<Pod::InputObjects>, L<Pod::Select>

B<Pod::InputObjects> defines POD input objects corresponding to
command paragraphs, parse-trees, and interior-sequences.

B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability
to selectively include and/or exclude sections of a POD document from being
translated based upon the current heading, subheading, subsubheading, etc.

=for __PRIVATE__
B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users
the ability to employ I<callback functions> instead of, or in addition
to, overriding methods of the base class.

=for __PRIVATE__
B<Pod::Select> and B<Pod::Callbacks> do not override any
methods nor do they define any new methods with the same name. Because
of this, they may I<both> be used (in combination) as a base class of
the same subclass in order to combine their functionality without
causing any namespace clashes due to multiple inheritance.

=head1 AUTHOR

Brad Appleton E<lt>bradapp@enteract.comE<gt>

Based on code for B<Pod::Text> written by
Tom Christiansen E<lt>tchrist@mox.perl.comE<gt>

=cut

1;