1 | |
2 | =head1 NAME |
3 | |
4 | Pod::Simple::Subclassing -- write a formatter as a Pod::Simple subclass |
5 | |
6 | =head1 SYNOPSIS |
7 | |
8 | package Pod::SomeFormatter; |
9 | use Pod::Simple; |
10 | @ISA = qw(Pod::Simple); |
11 | $VERSION = '1.01'; |
12 | use strict; |
13 | |
14 | sub _handle_element_start { |
15 | my($parser, $element_name, $attr_hash_r) = @_; |
16 | ... |
17 | } |
18 | |
19 | sub _handle_element_end { |
20 | my($parser, $element_name) = @_; |
21 | ... |
22 | } |
23 | |
24 | sub _handle_text { |
25 | my($parser, $text) = @_; |
26 | ... |
27 | } |
28 | 1; |
29 | |
30 | =head1 DESCRIPTION |
31 | |
32 | This document is about using Pod::Simple to write a Pod processor, |
33 | generally a Pod formatter. If you just want to know about using an |
34 | existing Pod formatter, instead see its documentation and see also the |
35 | docs in L<Pod::Simple>. |
36 | |
37 | The zeroth step in writing a Pod formatter is to make sure that there |
38 | isn't already a decent one in CPAN. See L<http://search.cpan.org/>, and |
39 | run a search on the name of the format you want to render to. Also |
40 | consider joining the Pod People list |
41 | L<http://lists.perl.org/showlist.cgi?name=pod-people> and asking whether |
42 | anyone has a formatter for that format -- maybe someone cobbled one |
43 | together but just hasn't released it. |
44 | |
45 | The first step in writing a Pod processor is to read L<perlpodspec>, |
46 | which contains information on writing a Pod parser (which has been |
47 | largely taken care of by Pod::Simple), but also a lot of requirements |
48 | and recommendations for writing a formatter. |
49 | |
50 | The second step is to actually learn the format you're planning to |
51 | format to -- or at least as much as you need to know to represent Pod, |
52 | which probably isn't much. |
53 | |
54 | The third step is to pick which of Pod::Simple's interfaces you want to |
55 | use -- the basic interface via Pod::Simple or L<Pod::Simple::Methody> is |
56 | event-based (sort of like L<HTML::Parser>'s interface, or sort of like |
57 | L<XML::Parser>'s "Handlers" interface), but L<Pod::Simple::PullParser> |
58 | provides a token-stream interface, sort of like L<HTML::TokeParser>'s |
59 | interface; L<Pod::Simple::SimpleTree> provides a simple tree interface, |
60 | rather like XML::Parser's "Tree" interface. Users familiar with |
61 | XML-handling will find one of these styles relatively familiar; but if |
62 | you would be even more at home with XML, there are classes that produce |
63 | an XML representation of the Pod stream, notably |
64 | L<Pod::Simple::XMLOutStream>; you can feed the output of such a class to |
65 | whatever XML parsing system you are most at home with. |
66 | |
67 | The last step is to write your code based on how the events (or tokens, |
68 | or tree-nodes, or the XML, or however you're parsing) will map to |
69 | constructs in the output format. Also be sure to consider how to escape |
70 | text nodes containing arbitrary text, and also what to do with text |
71 | nodes that represent preformatted text (from verbatim sections). |
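As a rough illustration of that last step, here is a minimal sketch of an event-based formatter that maps a few Pod elements to HTML-like tags and escapes text nodes. The class name C<Local::TinyHTML>, the C<%TAG> mapping, and the C<tiny_out> hash key are all hypothetical names chosen for this example; a real formatter would cover many more elements and would treat Verbatim text separately.

```perl
use strict;
use warnings;

{
    package Local::TinyHTML;          # hypothetical class name
    use parent 'Pod::Simple';

    # Hypothetical mapping from Pod element names to output tags.
    my %TAG = (head1 => 'h1', Para => 'p', B => 'b', I => 'i', C => 'code');

    sub _handle_element_start {
        my ($parser, $element_name, $attr_hash_r) = @_;
        my $tag = $TAG{$element_name} or return;   # skip unmapped elements
        $parser->{'tiny_out'} .= "<$tag>";
    }

    sub _handle_element_end {
        my ($parser, $element_name) = @_;
        my $tag = $TAG{$element_name} or return;
        $parser->{'tiny_out'} .= "</$tag>";
        # Put each block-level element on its own output line.
        $parser->{'tiny_out'} .= "\n" if $element_name =~ /^(?:head1|Para)$/;
    }

    sub _handle_text {
        my ($parser, $text) = @_;
        # Escape the characters that are special in the output format.
        $text =~ s/&/&amp;/g;
        $text =~ s/</&lt;/g;
        $text =~ s/>/&gt;/g;
        $parser->{'tiny_out'} .= $text;
    }
}

my $parser = Local::TinyHTML->new;
$parser->{'tiny_out'} = '';           # hypothetical key for collected output
$parser->parse_string_document("=head1 NAME\n\nA E<lt> B & I<italic>\n");
my $html = $parser->{'tiny_out'};
print $html;
```

Note that the escaping happens in C<_handle_text>, after the parser has already resolved EE<lt>...E<gt> codes, so the formatter only ever sees plain characters.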
72 | |
73 | |
74 | |
75 | =head1 Events |
76 | |
77 | TODO intro... mention that events are supplied for implicits, like for |
78 | missing >'s |
79 | |
80 | |
81 | In the following section, we use XML to represent the event structure |
82 | associated with a particular construct. That is, TODO |
83 | |
84 | =over |
85 | |
86 | =item C<< $parser->_handle_element_start( I<element_name>, I<attr_hashref> ) >> |
87 | |
88 | =item C<< $parser->_handle_element_end( I<element_name> ) >> |
89 | |
90 | =item C<< $parser->_handle_text( I<text_string> ) >> |
91 | |
92 | =back |
93 | |
94 | TODO describe |
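One way to get a feel for these three callbacks is to record every call as it happens. The following is only a sketch: the class name C<Local::EventTrace> and the C<local_trace> hash key are hypothetical, and the trace format is arbitrary.

```perl
use strict;
use warnings;

{
    package Local::EventTrace;        # hypothetical class name
    use parent 'Pod::Simple';

    sub _handle_element_start {
        my ($parser, $element_name, $attr_hash_r) = @_;
        my $note = "start $element_name";
        # start_line is present on block-level elements such as Para.
        $note .= " (start_line $attr_hash_r->{'start_line'})"
            if defined $attr_hash_r->{'start_line'};
        push @{ $parser->{'local_trace'} }, $note;
    }

    sub _handle_element_end {
        my ($parser, $element_name) = @_;
        push @{ $parser->{'local_trace'} }, "end $element_name";
    }

    sub _handle_text {
        my ($parser, $text) = @_;
        push @{ $parser->{'local_trace'} }, "text [$text]";
    }
}

my $tracer = Local::EventTrace->new;
$tracer->{'local_trace'} = [];        # hypothetical key for the event log
$tracer->parse_string_document("=pod\n\nHello B<world>\n");
my @trace = @{ $tracer->{'local_trace'} };
print "$_\n" for @trace;
```

Running this against your own Pod is a quick way to see which elements and attributes your formatter will have to handle.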
95 | |
96 | |
97 | =over |
98 | |
99 | =item events with an element_name of Document |
100 | |
101 | Parsing a document produces this event structure: |
102 | |
103 | <Document start_line="543"> |
104 | ...all events... |
105 | </Document> |
106 | |
107 | The value of the I<start_line> attribute will be the line number of the first |
108 | Pod directive in the document. |
109 | |
110 | If there is no Pod in the given document, then the |
111 | event structure will be this: |
112 | |
113 | <Document contentless="1" start_line="543"> |
114 | </Document> |
115 | |
116 | In that case, the value of the I<start_line> attribute will not be meaningful; |
117 | under current implementations, it will probably be the line number of the |
118 | last line in the file. |
119 | |
120 | =item events with an element_name of Para |
121 | |
122 | Parsing a plain (non-verbatim, non-directive, non-data) paragraph in |
123 | a Pod document produces this event structure: |
124 | |
125 | <Para start_line="543"> |
126 | ...all events in this paragraph... |
127 | </Para> |
128 | |
129 | The value of the I<start_line> attribute will be the line number of the start |
130 | of the paragraph. |
131 | |
132 | For example, parsing this paragraph of Pod: |
133 | |
134 | The value of the I<start_line> attribute will be the |
135 | line number of the start of the paragraph. |
136 | |
137 | produces this event structure: |
138 | |
139 | <Para start_line="129"> |
140 | The value of the |
141 | <I> |
142 | start_line |
143 | </I> |
144 | attribute will be the line number of the start of the |
145 | paragraph. |
146 | </Para> |
147 | |
148 | =item events with an element_name of B, C, F, or I. |
149 | |
150 | Parsing a BE<lt>...E<gt> formatting code (or of course any of its |
151 | semantically identical syntactic variants |
152 | S<BE<lt>E<lt> ... E<gt>E<gt>>, |
153 | or S<BE<lt>E<lt>E<lt>E<lt> ... E<gt>E<gt>E<gt>E<gt>>, etc.) |
154 | produces this event structure: |
155 | |
156 | <B> |
157 | ...stuff... |
158 | </B> |
159 | |
160 | Currently, there are no attributes conveyed. |
161 | |
162 | Parsing C, F, or I codes produces the same structure, with only a |
163 | different element name. |
164 | |
165 | If your parser object has been set to accept other formatting codes, |
166 | then they will be presented like these B/C/F/I codes -- i.e., without |
167 | any attributes. |
168 | |
169 | =item events with an element_name of S |
170 | |
171 | Normally, parsing an SE<lt>...E<gt> sequence produces this event |
172 | structure, just as if it were a B/C/F/I code: |
173 | |
174 | <S> |
175 | ...stuff... |
176 | </S> |
177 | |
178 | However, Pod::Simple (and presumably all derived parsers) offers the |
179 | C<nbsp_for_S> option which, if enabled, will suppress all S events, and |
180 | instead change all spaces in the content to non-breaking spaces. This is |
181 | intended for formatters that output to a format that has no code that |
182 | means the same as SE<lt>...E<gt>, but which has a code/character that |
183 | means non-breaking space. |
184 | |
185 | =item events with an element_name of X |
186 | |
187 | Normally, parsing an XE<lt>...E<gt> sequence produces this event |
188 | structure, just as if it were a B/C/F/I code: |
189 | |
190 | <X> |
191 | ...stuff... |
192 | </X> |
193 | |
194 | However, Pod::Simple (and presumably all derived parsers) offers the |
195 | C<nix_X_codes> option which, if enabled, will suppress all X events |
196 | and ignore their content. For formatters/processors that don't use |
197 | X events, this is presumably quite useful. |
198 | |
199 | |
200 | =item events with an element_name of L |
201 | |
202 | Because the LE<lt>...E<gt> is the most complex construct in the |
203 | language, it should not surprise you that the events it generates are |
204 | the most complex in the language. Most of the complexity is hidden away in |
205 | the attribute values, so for those of you writing a Pod formatter that |
206 | produces a non-hypertextual format, you can just ignore the attributes |
207 | and treat an L event structure like a formatting element that |
208 | (presumably) doesn't actually produce a change in formatting. That is, |
209 | the content of the L event structure (as opposed to its |
210 | attributes) is always what text should be displayed. |
211 | |
212 | There are, at first glance, three kinds of L links: URL, man, and pod. |
213 | |
214 | When a LE<lt>I<some_url>E<gt> code is parsed, it produces this event |
215 | structure: |
216 | |
217 | <L content-implicit="yes" to="that_url" type="url"> |
218 | that_url |
219 | </L> |
220 | |
221 | The C<type="url"> attribute is always specified for this type of |
222 | L code. |
223 | |
224 | For example, this Pod source: |
225 | |
226 | L<http://www.perl.com/CPAN/authors/> |
227 | |
228 | produces this event structure: |
229 | |
230 | <L content-implicit="yes" to="http://www.perl.com/CPAN/authors/" type="url"> |
231 | http://www.perl.com/CPAN/authors/ |
232 | </L> |
233 | |
234 | When a LE<lt>I<manpage(section)>E<gt> code is parsed (and these are |
235 | fairly rare and not terribly useful), it produces this event structure: |
236 | |
237 | <L content-implicit="yes" to="manpage(section)" type="man"> |
238 | manpage(section) |
239 | </L> |
240 | |
241 | The C<type="man"> attribute is always specified for this type of |
242 | L code. |
243 | |
244 | For example, this Pod source: |
245 | |
246 | L<crontab(5)> |
247 | |
248 | produces this event structure: |
249 | |
250 | <L content-implicit="yes" to="crontab(5)" type="man"> |
251 | crontab(5) |
252 | </L> |
253 | |
254 | In the rare cases where a man page link has a section specified, that text appears |
255 | in a I<section> attribute. For example, this Pod source: |
256 | |
257 | L<crontab(5)/"ENVIRONMENT"> |
258 | |
259 | will produce this event structure: |
260 | |
261 | <L content-implicit="yes" section="ENVIRONMENT" to="crontab(5)" type="man"> |
262 | "ENVIRONMENT" in crontab(5) |
263 | </L> |
264 | |
265 | In the rare case where the Pod document has code like |
266 | LE<lt>I<sometext>|I<manpage(section)>E<gt>, then the I<sometext> will appear |
267 | as the content of the element, the I<manpage(section)> text will appear |
268 | only as the value of the I<to> attribute, and there will be no |
269 | C<content-implicit="yes"> attribute (whose presence means that the Pod parser |
270 | had to infer what text should appear as the link text -- as opposed to |
271 | cases where that attribute is absent, which means that the Pod parser did |
272 | I<not> have to infer the link text, because that L code explicitly specified |
273 | some link text.) |
274 | |
275 | For example, this Pod source: |
276 | |
277 | L<hell itself!|crontab(5)> |
278 | |
279 | will produce this event structure: |
280 | |
281 | <L to="crontab(5)" type="man"> |
282 | hell itself! |
283 | </L> |
284 | |
285 | The last type of L structure is for links to/within Pod documents. It is |
286 | the most complex because it can have a I<to> attribute, I<or> a |
287 | I<section> attribute, or both. The C<type="pod"> attribute is always |
288 | specified for this type of L code. |
289 | |
290 | In the most common case, a simple LE<lt>podpageE<gt> code |
291 | produces this event structure: |
292 | |
293 | <L content-implicit="yes" to="podpage" type="pod"> |
294 | podpage |
295 | </L> |
296 | |
297 | For example, this Pod source: |
298 | |
299 | L<Net::Ping> |
300 | |
301 | produces this event structure: |
302 | |
303 | <L content-implicit="yes" to="Net::Ping" type="pod"> |
304 | Net::Ping |
305 | </L> |
306 | |
307 | In cases where there is link-text explicitly specified, it |
308 | is to be found in the content of the element (and not the |
309 | attributes), just as with the LE<lt>I<sometext>|I<manpage(section)>E<gt> |
310 | case discussed above. For example, this Pod source: |
311 | |
312 | L<Perl Error Messages|perldiag> |
313 | |
314 | produces this event structure: |
315 | |
316 | <L to="perldiag" type="pod"> |
317 | Perl Error Messages |
318 | </L> |
319 | |
320 | In cases of links to a section in the current Pod document, |
321 | there is a I<section> attribute instead of a I<to> attribute. |
322 | For example, this Pod source: |
323 | |
324 | L</"Member Data"> |
325 | |
326 | produces this event structure: |
327 | |
328 | <L content-implicit="yes" section="Member Data" type="pod"> |
329 | "Member Data" |
330 | </L> |
331 | |
332 | As another example, this Pod source: |
333 | |
334 | L<the various attributes|/"Member Data"> |
335 | |
336 | produces this event structure: |
337 | |
338 | <L section="Member Data" type="pod"> |
339 | the various attributes |
340 | </L> |
341 | |
342 | In cases of links to a section in a different Pod document, |
343 | there are both a I<section> attribute and a I<to> attribute. |
344 | For example, this Pod source: |
345 | |
346 | L<perlsyn/"Basic BLOCKs and Switch Statements"> |
347 | |
348 | produces this event structure: |
349 | |
350 | <L content-implicit="yes" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod"> |
351 | "Basic BLOCKs and Switch Statements" in perlsyn |
352 | </L> |
353 | |
354 | As another example, this Pod source: |
355 | |
356 | L<SWITCH statements|perlsyn/"Basic BLOCKs and Switch Statements"> |
357 | |
358 | produces this event structure: |
359 | |
360 | <L section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod"> |
361 | SWITCH statements |
362 | </L> |
363 | |
364 | Incidentally, note that we do not distinguish between these syntaxes: |
365 | |
366 | L</"Member Data"> |
367 | L<"Member Data"> |
368 | L</Member Data> |
369 | L<Member Data> [deprecated syntax] |
370 | |
371 | That is, they all produce the same event structure, namely: |
372 | |
373 | <L content-implicit="yes" section="Member Data" type="pod"> |
374 | "Member Data" |
375 | </L> |
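To make the attribute handling above more concrete, here is a small sketch that records the I<type>, I<to>, and I<section> attributes of each L event, along with whether the link text was inferred. The class name C<Local::LinkDemo> and the C<local_links> key are hypothetical. The I<to> and I<section> values are stringified explicitly because they may be objects rather than plain strings.

```perl
use strict;
use warnings;

{
    package Local::LinkDemo;          # hypothetical class name
    use parent 'Pod::Simple';

    sub _handle_element_start {
        my ($parser, $element_name, $attr) = @_;
        return unless $element_name eq 'L';
        # Force string context on to/section; either may be absent.
        my $to      = defined $attr->{'to'}      ? "$attr->{'to'}"      : '';
        my $section = defined $attr->{'section'} ? "$attr->{'section'}" : '';
        my $text    = $attr->{'content-implicit'} ? 'implicit' : 'explicit';
        push @{ $parser->{'local_links'} },
            join '|', $attr->{'type'}, $to, $section, $text;
    }

    sub _handle_element_end { }
    sub _handle_text        { }
}

my $pod = join '',
    "=pod\n\n",
    "L<Net::Ping>, L<Perl Error Messages|perldiag>, ",
    "L</\"Member Data\">, L<http://www.perl.com/CPAN/authors/>\n";

my $parser = Local::LinkDemo->new;
$parser->{'local_links'} = [];        # hypothetical key for collected links
$parser->parse_string_document($pod);
my @links = @{ $parser->{'local_links'} };
print "$_\n" for @links;
```

A hypertext formatter would build a target from I<type>, I<to>, and I<section> at this point; a plain-text formatter can ignore the attributes entirely and just emit the element's content.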
376 | |
377 | =item events with an element_name of E or Z |
378 | |
379 | While there are Pod codes EE<lt>...E<gt> and ZE<lt>E<gt>, these |
380 | I<do not> produce any E or Z events -- that is, there are no such |
381 | events as E or Z. |
382 | |
383 | =item events with an element_name of Verbatim |
384 | |
385 | When a Pod verbatim paragraph (AKA "codeblock") is parsed, it |
386 | produces this event structure: |
387 | |
388 | <Verbatim start_line="543" xml:space="preserve"> |
389 | ...text... |
390 | </Verbatim> |
391 | |
392 | The value of the I<start_line> attribute will be the line number of the |
393 | first line of this verbatim block. The I<xml:space> attribute is always |
394 | present, and always has the value "preserve". |
395 | |
396 | The text content will have tabs already expanded. |
397 | |
398 | |
399 | =item events with an element_name of head1 .. head4 |
400 | |
401 | When a "=head1 ..." directive is parsed, it produces this event |
402 | structure: |
403 | |
404 | <head1> |
405 | ...stuff... |
406 | </head1> |
407 | |
408 | For example, a directive consisting of this: |
409 | |
410 | =head1 Options to C<new> et al. |
411 | |
412 | will produce this event structure: |
413 | |
414 | <head1 start_line="543"> |
415 | Options to |
416 | <C> |
417 | new |
418 | </C> |
419 | et al. |
420 | </head1> |
421 | |
422 | "=head2" thru "=head4" directives are the same, except for the element |
423 | names in the event structure. |
424 | |
425 | =item events with an element_name of over-bullet |
426 | |
427 | When an "=over ... Z<>=back" block is parsed where the items are |
428 | a bulleted list, it will produce this event structure: |
429 | |
430 | <over-bullet indent="4" start_line="543"> |
431 | <item-bullet start_line="545"> |
432 | ...Stuff... |
433 | </item-bullet> |
434 | ...more item-bullets... |
435 | </over-bullet> |
436 | |
437 | The value of the I<indent> attribute is whatever value is after the |
438 | "=over" directive, as in "=over 8". If no such value is specified |
439 | in the directive, then the I<indent> attribute has the value "4". |
440 | |
441 | For example, this Pod source: |
442 | |
443 | =over |
444 | |
445 | =item * |
446 | |
447 | Stuff |
448 | |
449 | =item * |
450 | |
451 | Bar I<baz>! |
452 | |
453 | =back |
454 | |
455 | produces this event structure: |
456 | |
457 | <over-bullet indent="4" start_line="10"> |
458 | <item-bullet start_line="12"> |
459 | Stuff |
460 | </item-bullet> |
461 | <item-bullet start_line="14"> |
462 | Bar <I>baz</I>! |
463 | </item-bullet> |
464 | </over-bullet> |
465 | |
466 | =item events with an element_name of over-number |
467 | |
468 | When an "=over ... Z<>=back" block is parsed where the items are |
469 | a numbered list, it will produce this event structure: |
470 | |
471 | <over-number indent="4" start_line="543"> |
472 | <item-number number="1" start_line="545"> |
473 | ...Stuff... |
474 | </item-number> |
475 | ...more item-number... |
476 | </over-number> |
477 | |
478 | This is like the "over-bullet" event structure; but note that the contents |
479 | are "item-number" instead of "item-bullet", and note that they will have |
480 | a "number" attribute, which some formatters/processors may ignore |
481 | (since, for example, there's no need for it in HTML when producing |
482 | an "<OL><LI>...</LI>...</OL>" structure), but which any processor may use. |
483 | |
484 | Note that the values for the I<number> attributes of "item-number" |
485 | elements in a given "over-number" area I<will> start at 1 and go up by |
486 | one each time. If the Pod source doesn't follow that order (even though |
487 | it really should!), whatever numbers it has will be ignored (with |
488 | the correct values being put in the I<number> attributes), and an error |
489 | message might be issued to the user. |
490 | |
491 | =item events with an element_name of over-text |
492 | |
493 | These events are somewhat unlike the other over-* |
494 | structures, as far as what their contents are. When |
495 | an "=over ... Z<>=back" block is parsed where the items are |
496 | a list of text "subheadings", it will produce this event structure: |
497 | |
498 | <over-text indent="4" start_line="543"> |
499 | <item-text> |
500 | ...stuff... |
501 | </item-text> |
502 | ...stuff (generally Para or Verbatim elements)... |
503 | <item-text> |
504 | ...more item-text and/or stuff... |
505 | </over-text> |
506 | |
507 | The I<indent> attribute is as with the other over-* events. |
508 | |
509 | For example, this Pod source: |
510 | |
511 | =over |
512 | |
513 | =item Foo |
514 | |
515 | Stuff |
516 | |
517 | =item Bar I<baz>! |
518 | |
519 | Quux |
520 | |
521 | =back |
522 | |
523 | produces this event structure: |
524 | |
525 | <over-text indent="4" start_line="20"> |
526 | <item-text start_line="22"> |
527 | Foo |
528 | </item-text> |
529 | <Para start_line="24"> |
530 | Stuff |
531 | </Para> |
532 | <item-text start_line="26"> |
533 | Bar |
534 | <I> |
535 | baz |
536 | </I> |
537 | ! |
538 | </item-text> |
539 | <Para start_line="28"> |
540 | Quux |
541 | </Para> |
542 | </over-text> |
543 | |
544 | |
545 | |
546 | =item events with an element_name of over-block |
547 | |
548 | These events are somewhat unlike the other over-* |
549 | structures, as far as what their contents are. When |
550 | an "=over ... Z<>=back" block is parsed where there are no items, |
551 | it will produce this event structure: |
552 | |
553 | <over-block indent="4" start_line="543"> |
554 | ...stuff (generally Para or Verbatim elements)... |
555 | </over-block> |
556 | |
557 | The I<indent> attribute is as with the other over-* events. |
558 | |
559 | For example, this Pod source: |
560 | |
561 | =over |
562 | |
563 | For cutting off our trade with all parts of the world |
564 | |
565 | For transporting us beyond seas to be tried for pretended offenses |
566 | |
567 | He is at this time transporting large armies of foreign mercenaries to |
568 | complete the works of death, desolation and tyranny, already begun with |
569 | circumstances of cruelty and perfidy scarcely paralleled in the most |
570 | barbarous ages, and totally unworthy the head of a civilized nation. |
571 | |
572 | =back |
573 | |
574 | will produce this event structure: |
575 | |
576 | <over-block indent="4" start_line="2"> |
577 | <Para start_line="4"> |
578 | For cutting off our trade with all parts of the world |
579 | </Para> |
580 | <Para start_line="6"> |
581 | For transporting us beyond seas to be tried for pretended offenses |
582 | </Para> |
583 | <Para start_line="8"> |
584 | He is at this time transporting large armies of [...more text...] |
585 | </Para> |
586 | </over-block> |
587 | |
588 | =item events with an element_name of item-bullet |
589 | |
590 | See L</"events with an element_name of over-bullet">, above. |
591 | |
592 | =item events with an element_name of item-number |
593 | |
594 | See L</"events with an element_name of over-number">, above. |
595 | |
596 | =item events with an element_name of item-text |
597 | |
598 | See L</"events with an element_name of over-text">, above. |
599 | |
600 | =item events with an element_name of for |
601 | |
602 | TODO... |
603 | |
604 | =item events with an element_name of Data |
605 | |
606 | TODO... |
607 | |
608 | =back |
609 | |
610 | |
611 | |
612 | =head1 More Pod::Simple Methods |
613 | |
614 | Pod::Simple provides a lot of methods that aren't generally interesting |
615 | to the end user of an existing Pod formatter, but some of which you |
616 | might find useful in writing a Pod formatter. They are listed below. The |
617 | first several methods (the accept_* methods) are for declaring the |
618 | capabilities of your parser, notably what C<=for I<targetname>> sections |
619 | it's interested in, and what extra NE<lt>...E<gt> codes it accepts beyond |
620 | the ones described in L<perlpod>. |
621 | |
622 | =over |
623 | |
624 | =item C<< $parser->accept_targets( I<SOMEVALUE> ) >> |
625 | |
626 | As the parser sees sections like: |
627 | |
628 | =for html <img src="fig1.jpg"> |
629 | |
630 | or |
631 | |
632 | =begin html |
633 | |
634 | <img src="fig1.jpg"> |
635 | |
636 | =end html |
637 | |
638 | ...the parser will ignore these sections unless your subclass has |
639 | specified that it wants to see sections targeted to "html" (or whatever |
640 | the formatter name is). |
641 | |
642 | If you want to process all sections, even if they're not targeted for you, |
643 | call this before you start parsing: |
644 | |
645 | $parser->accept_targets('*'); |
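The effect of accepting a target can be seen by parsing the same document twice, once with and once without the call. This is a sketch only: the class name C<Local::DataCollector>, the helper C<data_seen>, and the C<local_*> hash keys are hypothetical. It relies on the content of an accepted "=for html" section arriving as text inside an element named Data.

```perl
use strict;
use warnings;

{
    package Local::DataCollector;     # hypothetical class name
    use parent 'Pod::Simple';

    sub _handle_element_start {
        my ($parser, $element_name) = @_;
        $parser->{'local_in_data'} = 1 if $element_name eq 'Data';
    }

    sub _handle_element_end {
        my ($parser, $element_name) = @_;
        $parser->{'local_in_data'} = 0 if $element_name eq 'Data';
    }

    sub _handle_text {
        my ($parser, $text) = @_;
        # Keep only text that arrives inside a Data element.
        push @{ $parser->{'local_data'} }, $text if $parser->{'local_in_data'};
    }
}

my $pod = qq{=pod\n\nIntro.\n\n=for html <img src="fig1.jpg">\n\n=cut\n};

# Hypothetical helper: parse $pod, optionally accepting some targets first.
sub data_seen {
    my (@targets) = @_;
    my $parser = Local::DataCollector->new;
    $parser->{'local_data'} = [];
    $parser->accept_targets(@targets) if @targets;
    $parser->parse_string_document($pod);
    return @{ $parser->{'local_data'} };
}

my @ignored  = data_seen();           # no targets accepted
my @accepted = data_seen('html');     # "html" sections are delivered
print scalar(@ignored), " vs ", scalar(@accepted), " data chunks\n";
```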
646 | |
647 | =item C<< $parser->accept_targets_as_text( I<SOMEVALUE> ) >> |
648 | |
649 | This is like accept_targets, except that it specifies also that the |
650 | content of sections for this target should be treated as Pod text even |
651 | if the target name in "=for I<targetname>" doesn't start with a ":". |
652 | |
653 | At time of writing, I don't think you'll need to use this. |
654 | |
655 | |
656 | =item C<< $parser->accept_codes( I<Codename>, I<Codename>... ) >> |
657 | |
658 | This tells the parser that you accept additional formatting codes, |
659 | beyond just the standard ones (I B C L F S X, plus the two weird ones |
660 | you don't actually see in the parse tree, Z and E). For example, to also |
661 | accept codes "N", "R", and "W": |
662 | |
663 | $parser->accept_codes( qw( N R W ) ); |
664 | |
665 | B<TODO: document how this interacts with =extend, and long element names> |
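As a small sketch of what accepting an extra code looks like from the subclass's side, the following records the element names of all start events after accepting "N". The class name C<Local::CodeNames> and the C<local_starts> key are hypothetical.

```perl
use strict;
use warnings;

{
    package Local::CodeNames;         # hypothetical class name
    use parent 'Pod::Simple';

    sub _handle_element_start {
        my ($parser, $element_name) = @_;
        push @{ $parser->{'local_starts'} }, $element_name;
    }

    sub _handle_element_end { }
    sub _handle_text        { }
}

my $parser = Local::CodeNames->new;
$parser->{'local_starts'} = [];       # hypothetical key for element names
$parser->accept_codes('N');           # must be called before parsing
$parser->parse_string_document("=pod\n\nSee N<a note> here.\n");
my @starts = @{ $parser->{'local_starts'} };
print join(' ', @starts), "\n";
```

The accepted code then arrives like any B/C/F/I code, so the subclass decides what it means.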
666 | |
667 | |
668 | =item C<< $parser->accept_directive_as_data( I<directive_name> ) >> |
669 | |
670 | =item C<< $parser->accept_directive_as_verbatim( I<directive_name> ) >> |
671 | |
672 | =item C<< $parser->accept_directive_as_processed( I<directive_name> ) >> |
673 | |
674 | In the unlikely situation that you need to tell the parser that you will |
675 | accept additional directives ("=foo" things), you need to first set the |
676 | parser to treat its content as data (i.e., not really processed at |
677 | all), or as verbatim (mostly just expanding tabs), or as processed text |
678 | (parsing formatting codes like BE<lt>...E<gt>). |
679 | |
680 | For example, to accept a new directive "=method", you'd presumably |
681 | use: |
682 | |
683 | $parser->accept_directive_as_processed("method"); |
684 | |
685 | so that you could have Pod lines like: |
686 | |
687 | =method I<$whatever> thing B<um> |
688 | |
689 | Making up your own directives breaks compatibility with other Pod |
690 | formatters, in a way that using "=for I<target> ..." lines doesn't; |
691 | however, you may find this useful if you're making a Pod superset |
692 | format where you don't need to worry about compatibility. |
693 | |
694 | |
695 | =item C<< $parser->nbsp_for_S( I<BOOLEAN> ); >> |
696 | |
697 | Setting this attribute to a true value (and by default it is false) will |
698 | turn "SE<lt>...E<gt>" sequences into sequences of words separated by |
699 | C<\xA0> (non-breaking space) characters. For example, it will take this: |
700 | |
701 | I like S<Dutch apple pie>, don't you? |
702 | |
703 | and treat it as if it were: |
704 | |
705 | I like DutchE<nbsp>appleE<nbsp>pie, don't you? |
706 | |
707 | This is handy for output formats that don't have anything quite like an |
708 | "SE<lt>...E<gt>" code, but which do have a code for non-breaking space. |
709 | |
710 | There is currently no method for going the other way; but I can |
711 | probably provide one upon request. |
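The following sketch shows the effect on the events a subclass receives. The class name C<Local::NbspDemo> and the C<local_*> hash keys are hypothetical. The text events are concatenated so that the result does not depend on how the text happens to be split.

```perl
use strict;
use warnings;

{
    package Local::NbspDemo;          # hypothetical class name
    use parent 'Pod::Simple';

    sub _handle_element_start {
        my ($parser, $element_name) = @_;
        $parser->{'local_saw_S'}++ if $element_name eq 'S';
    }

    sub _handle_element_end { }

    sub _handle_text {
        my ($parser, $text) = @_;
        $parser->{'local_text'} .= $text;
    }
}

my $parser = Local::NbspDemo->new;
$parser->{'local_text'}  = '';        # hypothetical keys for this sketch
$parser->{'local_saw_S'} = 0;
$parser->nbsp_for_S(1);
$parser->parse_string_document(
    "=pod\n\nI like S<Dutch apple pie>, don't you?\n");

my $text  = $parser->{'local_text'};
my $saw_S = $parser->{'local_saw_S'};
printf "S events: %d\n", $saw_S;
```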
712 | |
713 | |
714 | =item C<< $parser->version_report() >> |
715 | |
716 | This returns a string reporting the $VERSION value from your module (and |
717 | its classname) as well as the $VERSION value of Pod::Simple. Note that |
718 | L<perlpodspec> requires output formats (wherever possible) to note |
719 | this detail in a comment in the output format. For example, for |
720 | some kind of SGML output format: |
721 | |
722 | print OUT "<!-- \n", $parser->version_report, "\n -->"; |
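A self-contained sketch of the same idea, using a hypothetical subclass named C<Local::VersionDemo> and printing to STDOUT rather than to an OUT filehandle:

```perl
use strict;
use warnings;

{
    package Local::VersionDemo;       # hypothetical class name
    use parent 'Pod::Simple';
    our $VERSION = '1.01';
}

my $report = Local::VersionDemo->new->version_report;

# Emit the report as a comment in an SGML-like output format.
print "<!-- \n", $report, "\n -->\n";
```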
723 | |
724 | |
725 | =item C<< $parser->pod_para_count() >> |
726 | |
727 | This returns the count of Pod paragraphs seen so far. |
728 | |
729 | |
730 | =item C<< $parser->line_count() >> |
731 | |
732 | This is the current line number being parsed. But you might find the |
733 | "start_line" event attribute more accurate, when it is present. |
734 | |
735 | |
736 | =item C<< $parser->nix_X_codes( I<SOMEVALUE> ) >> |
737 | |
738 | This attribute, when set to a true value (and it is false by default) |
739 | ignores any "XE<lt>...E<gt>" sequences in the document being parsed. |
740 | Many formats don't actually use the content of these codes, so have |
741 | no reason to process them. |
742 | |
743 | |
744 | =item C<< $parser->merge_text( I<SOMEVALUE> ) >> |
745 | |
746 | This attribute, when set to a true value (and it is false by default) |
747 | makes sure that only one event (or token, or node) will be created |
748 | for any single contiguous sequence of text. For example, consider |
749 | this somewhat contrived example: |
750 | |
751 | I just LOVE Z<>hotE<32>apple pie! |
752 | |
753 | When that is parsed and events are about to be called on it, it may |
754 | actually seem to be four different text events, one right after another: |
755 | one event for "I just LOVE ", one for "hot", one for " ", and one for |
756 | "apple pie!". But if you have merge_text on, then you're guaranteed |
757 | that it will be fired as one text event: "I just LOVE hot apple pie!". |
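A sketch of checking this with the paragraph above: the class name C<Local::TextEvents> and the C<local_texts> key are hypothetical.

```perl
use strict;
use warnings;

{
    package Local::TextEvents;        # hypothetical class name
    use parent 'Pod::Simple';

    sub _handle_element_start { }
    sub _handle_element_end   { }

    sub _handle_text {
        my ($parser, $text) = @_;
        push @{ $parser->{'local_texts'} }, $text;   # one entry per event
    }
}

my $parser = Local::TextEvents->new;
$parser->{'local_texts'} = [];        # hypothetical key for text events
$parser->merge_text(1);
$parser->parse_string_document(
    "=pod\n\nI just LOVE Z<>hotE<32>apple pie!\n");
my @texts = @{ $parser->{'local_texts'} };
print scalar(@texts), " text event(s)\n";
```

With the attribute on, a C<_handle_text> method can safely treat each call as a complete run of text, which simplifies things like word wrapping.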
758 | |
759 | |
760 | =item C<< $parser->code_handler( I<CODE_REF> ) >> |
761 | |
762 | This specifies code that should be called when a code line is seen |
763 | (i.e., a line outside of the Pod). Normally this is undef, meaning |
764 | that no code should be called. If you provide a routine, it should |
765 | start out like this: |
766 | |
767 | sub get_code_line { # or whatever you'll call it |
768 | my($line, $line_number, $parser) = @_; |
769 | ... |
770 | } |
771 | |
772 | Note, however, that sometimes the Pod events aren't processed in exactly |
773 | the same order as the code lines are -- i.e., if you have a file with |
774 | Pod, then code, then more Pod, sometimes the code will be processed (via |
775 | whatever you have code_handler call) before all of the preceding Pod |
776 | has been processed. |
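For instance, a handler that simply collects the non-Pod lines might be sketched as follows. The class name C<Local::Quiet> and the variable names are hypothetical; the handlers are left empty because only the code lines are of interest here.

```perl
use strict;
use warnings;

{
    package Local::Quiet;             # hypothetical class name
    use parent 'Pod::Simple';

    # Pod events are ignored in this sketch.
    sub _handle_element_start { }
    sub _handle_element_end   { }
    sub _handle_text          { }
}

my @code_lines;
my $parser = Local::Quiet->new;
$parser->code_handler(sub {
    my ($line, $line_number, $parser) = @_;
    push @code_lines, [ $line_number, $line ];
});

$parser->parse_string_document(
    qq{print "hi";\n\n=pod\n\nSome docs.\n\n=cut\n\nprint "bye";\n});

print "$_->[0]: $_->[1]\n" for @code_lines;
```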
777 | |
778 | |
779 | =item C<< $parser->cut_handler( I<CODE_REF> ) >> |
780 | |
781 | This is just like the code_handler attribute, except that it's for |
782 | "=cut" lines, not code lines. The same caveats apply. "=cut" lines are |
783 | unlikely to be interesting, but this is included for completeness. |
784 | |
785 | |
786 | =item C<< $parser->whine( I<linenumber>, I<complaint string> ) >> |
787 | |
788 | This notes a problem in the Pod, which will be reported in the "Pod |
789 | Errors" section of the document and/or sent to STDERR, depending on the |
790 | values of the attributes C<no_whining>, C<no_errata_section>, and |
791 | C<complain_stderr>. |
792 | |
793 | =item C<< $parser->scream( I<linenumber>, I<complaint string> ) >> |
794 | |
795 | This notes an error like C<whine> does, except that it is not |
796 | suppressible with C<no_whining>. This should be used only for very |
797 | serious errors. |
798 | |
799 | |
800 | =item C<< $parser->source_dead(1) >> |
801 | |
802 | This aborts parsing of the current document, by switching on the flag |
803 | that indicates that EOF has been seen. In particularly drastic cases, |
804 | you might want to do this. It's rather nicer than just calling |
805 | C<die>! |
806 | |
807 | =item C<< $parser->hide_line_numbers( I<SOMEVALUE> ) >> |
808 | |
809 | Some subclasses that indiscriminately dump event attributes (well, |
810 | except for ones beginning with "~") can use this object attribute to |
811 | refrain from dumping the "start_line" attribute. |
812 | |
813 | =item C<< $parser->no_whining( I<SOMEVALUE> ) >> |
814 | |
815 | This attribute, if set to true, will suppress reports of non-fatal |
816 | error messages. The default value is false, meaning that complaints |
817 | I<are> reported. How they get reported depends on the values of |
818 | the attributes C<no_errata_section> and C<complain_stderr>. |
819 | |
820 | =item C<< $parser->no_errata_section( I<SOMEVALUE> ) >> |
821 | |
822 | This attribute, if set to true, will suppress generation of an errata |
823 | section. The default value is false -- i.e., an errata section will be |
824 | generated. |
825 | |
826 | =item C<< $parser->complain_stderr( I<SOMEVALUE> ) >> |
827 | |
828 | This attribute, if set to true, will send complaints to STDERR. The |
829 | default value is false -- i.e., complaints do not go to STDERR. |
830 | |
831 | =item C<< $parser->bare_output( I<SOMEVALUE> ) >> |
832 | |
833 | Some formatter subclasses use this as a flag for whether output should |
834 | have prologue and epilogue code omitted. For example, setting this to |
835 | true for an HTML formatter class should omit the |
836 | "<html><head><title>...</title><body>..." prologue and the |
837 | "</body></html>" epilogue. |
838 | |
839 | If you want to set this to true, you should probably also set |
840 | C<no_whining> or at least C<no_errata_section> to true. |
841 | |
842 | =item C<< $parser->preserve_whitespace( I<SOMEVALUE> ) >> |
843 | |
844 | If you set this attribute to a true value, the parser will try to |
845 | preserve whitespace in the output. This means that such formatting |
846 | conventions as two spaces after periods will be preserved by the parser. |
847 | This is primarily useful for output formats that treat whitespace as |
848 | significant (such as text or *roff, but not HTML). |
849 | |
850 | =back |
851 | |
852 | |
853 | =head1 SEE ALSO |
854 | |
855 | L<Pod::Simple> -- event-based Pod-parsing framework |
856 | |
857 | L<Pod::Simple::Methody> -- like Pod::Simple, but each sort of event |
858 | calls its own method (like C<start_head3>) |
859 | |
860 | L<Pod::Simple::PullParser> -- a Pod-parsing framework like Pod::Simple, |
861 | but with a token-stream interface |
862 | |
863 | L<Pod::Simple::SimpleTree> -- a Pod-parsing framework like Pod::Simple, |
864 | but with a tree interface |
865 | |
866 | L<Pod::Simple::Checker> -- a simple Pod::Simple subclass that reads |
867 | documents, and then makes a plaintext report of any errors found in the |
868 | document |
869 | |
870 | L<Pod::Simple::DumpAsXML> -- for dumping Pod documents as tidily |
871 | indented XML, showing each event on its own line |
872 | |
873 | L<Pod::Simple::XMLOutStream> -- dumps a Pod document as XML (without |
874 | introducing extra whitespace as Pod::Simple::DumpAsXML does). |
875 | |
876 | L<Pod::Simple::DumpAsText> -- for dumping Pod documents as tidily |
877 | indented text, showing each event on its own line |
878 | |
879 | L<Pod::Simple::LinkSection> -- class for objects representing the values |
880 | of the TODO and TODO attributes of LE<lt>...E<gt> elements |
881 | |
882 | L<Pod::Escapes> -- the module that Pod::Simple uses for evaluating |
883 | EE<lt>...E<gt> content |
884 | |
885 | L<Pod::Simple::Text> -- a simple plaintext formatter for Pod |
886 | |
887 | L<Pod::Simple::TextContent> -- like Pod::Simple::Text, but |
888 | makes no effort to indent or wrap the text being formatted |
889 | |
890 | L<perlpod|perlpod> |
891 | |
892 | L<perlpodspec|perlpodspec> |
893 | |
894 | L<perldoc> |
895 | |
896 | |
897 | =head1 COPYRIGHT AND DISCLAIMERS |
898 | |
899 | Copyright (c) 2002 Sean M. Burke. All rights reserved. |
900 | |
901 | This library is free software; you can redistribute it and/or modify it |
902 | under the same terms as Perl itself. |
903 | |
904 | This program is distributed in the hope that it will be useful, but |
905 | without any warranty; without even the implied warranty of |
906 | merchantability or fitness for a particular purpose. |
907 | |
908 | =head1 AUTHOR |
909 | |
910 | Sean M. Burke C<sburke@cpan.org> |
911 | |
912 | |
913 | =for notes |
914 | Hm, my old podchecker version (1.2) says: |
915 | *** WARNING: node 'http://search.cpan.org/' contains non-escaped | or / at line 38 in file Subclassing.pod |
916 | *** WARNING: node 'http://lists.perl.org/showlist.cgi?name=pod-people' contains non-escaped | or / at line 41 in file Subclassing.pod |
917 | Yes, L<...> is hard. |
918 | |
919 | |
920 | =cut |
921 | |
922 | |