1 .\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.3
4 .\" ========================================================================
5 .de Sh \" Subsection heading
13 .de Sp \" Vertical space (when we can't use .PP)
17 .de Vb \" Begin verbatim text
22 .de Ve \" End verbatim text
26 .\" Set up some character translations and predefined strings. \*(-- will
27 .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
28 .\" double quote, and \*(R" will give a right double quote. | will give a
29 .\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used to
30 .\" do unbreakable dashes and therefore won't be available. \*(C` and \*(C'
31 .\" expand to `' in nroff, nothing in troff, for use with C<>.
33 .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
37 . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
38 . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
51 .\" If the F register is turned on, we'll generate index entries on stderr for
52 .\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
53 .\" entries marked with X<> in POD. Of course, you'll have to process the
54 .\" output yourself in some meaningful fashion.
57 . tm Index:\\$1\t\\n%\t"\\$2"
63 .\" For nroff, turn off justification. Always turn off hyphenation; it makes
64 .\" way too many mistakes in technical documents.
68 .\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
69 .\" Fear. Run. Save yourself. No user-serviceable parts.
70 . \" fudge factors for nroff and troff
79 . ds #H ((1u-(\\\\n(.fu%2u))*.13m)
85 . \" simple accents for nroff and troff
95 . ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
96 . ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
97 . ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
98 . ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
99 . ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
100 . ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
102 . \" troff and (daisy-wheel) nroff accents
103 .ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
104 .ds 8 \h'\*(#H'\(*b\h'-\*(#H'
105 .ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
106 .ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
107 .ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
108 .ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
109 .ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
110 .ds ae a\h'-(\w'a'u*4/10)'e
111 .ds Ae A\h'-(\w'A'u*4/10)'E
112 . \" corrections for vroff
113 .if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
114 .if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
115 . \" for low resolution devices (crt and lpr)
116 .if \n(.H>23 .if \n(.V>19 \
129 .\" ========================================================================
131 .IX Title "HTML::HeadParser 3"
132 .TH HTML::HeadParser 3 "2009-08-13" "perl v5.8.7" "User Contributed Perl Documentation"
134 HTML::HeadParser \- Parse the <HEAD> section of an HTML document
136 .IX Header "SYNOPSIS"
138 \& require HTML::HeadParser;
139 \& $p = HTML::HeadParser\->new;
140 \& $p\->parse($text) and print "not finished";
144 \& $p\->header('Title') # to access <title>....</title>
145 \& $p\->header('Content\-Base') # to access <base href="http://...">
146 \& $p\->header('Foo') # to access <meta http\-equiv="Foo" content="...">
147 \& $p\->header('X\-Meta\-Author') # to access <meta name="author" content="...">
148 \& $p\->header('X\-Meta\-Charset') # to access <meta charset="...">
151 .IX Header "DESCRIPTION"
152 The \f(CW\*(C`HTML::HeadParser\*(C'\fR is a specialized (and lightweight)
153 \&\f(CW\*(C`HTML::Parser\*(C'\fR that will only parse the <\s-1HEAD\s0>...</HEAD>
154 section of an \s-1HTML\s0 document. The \fIparse()\fR method
155 will return a \s-1FALSE\s0 value as soon as a <\s-1BODY\s0> element or body
156 text is found, and should not be called again after this.
158 Note that the \f(CW\*(C`HTML::HeadParser\*(C'\fR might get confused if raw undecoded
159 \&\s-1UTF\-8\s0 is passed to the \fIparse()\fR method. Make sure the strings are
160 properly decoded before passing them on.
162 The \f(CW\*(C`HTML::HeadParser\*(C'\fR keeps a reference to a header object, and the
163 parser will update this header object as the various elements of the
164 <\s-1HEAD\s0> section of the \s-1HTML\s0 document are recognized. The following
165 header fields are affected:
166 .IP "Content\-Base:" 4
167 .IX Item "Content-Base:"
168 The \fIContent-Base\fR header is initialized from the <base
169 href=\*(L"...\*(R"> element.
172 The \fITitle\fR header is initialized from the <title>...</title>
176 The \fIIsindex\fR header will be added if there is an <isindex>
177 element in the <head>. The header value is initialized from the
178 \&\fIprompt\fR attribute if it is present. If no \fIprompt\fR attribute is
179 given, the value will be '?'.
180 .IP "X\-Meta\-Foo:" 4
181 .IX Item "X-Meta-Foo:"
182 All <meta> elements containing a \f(CW\*(C`name\*(C'\fR attribute will result in
183 headers named with the prefix \f(CW\*(C`X\-Meta\-\*(C'\fR followed by the value of the
184 \&\f(CW\*(C`name\*(C'\fR attribute. The value of the
185 \&\f(CW\*(C`content\*(C'\fR attribute is used as the pushed header value.
187 <meta> elements containing a \f(CW\*(C`http\-equiv\*(C'\fR attribute will result
188 in headers as above, but without the \f(CW\*(C`X\-Meta\-\*(C'\fR prefix in the
191 <meta> elements containing a \f(CW\*(C`charset\*(C'\fR attribute will result in
192 an \f(CW\*(C`X\-Meta\-Charset\*(C'\fR header, using the value of the \f(CW\*(C`charset\*(C'\fR
193 attribute as the pushed header value.
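.PP
As a minimal sketch of the mapping described above, the following parses a small document and prints the header fields it produces. The sample document, its URL and its author name are hypothetical and serve only as an illustration.

```perl
use strict;
use warnings;
use HTML::HeadParser;

# Hypothetical sample document; the names and URL are illustrative only.
my $html = <<'EOT';
<html><head>
<title>Sample page</title>
<base href="http://www.example.com/docs/">
<meta name="author" content="A. N. Other">
<meta http-equiv="Content-Language" content="en">
</head>
<body><p>Body text.</p></body></html>
EOT

my $p = HTML::HeadParser->new;
$p->parse($html);

# <title> sets Title, <base href> sets Content-Base,
# <meta name="author"> sets X-Meta-Author, and
# <meta http-equiv="Content-Language"> sets Content-Language.
for my $field ('Title', 'Content-Base', 'X-Meta-Author', 'Content-Language') {
    my $value = $p->header($field);
    print "$field: ", (defined $value ? $value : '(not set)'), "\n";
}
```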
196 The following methods (in addition to those provided by the
197 superclass) are available:
198 .IP "$hp = HTML::HeadParser\->new" 4
199 .IX Item "$hp = HTML::HeadParser->new"
201 .ie n .IP "$hp = HTML::HeadParser\->new( $header )" 4
202 .el .IP "$hp = HTML::HeadParser\->new( \f(CW$header\fR )" 4
203 .IX Item "$hp = HTML::HeadParser->new( $header )"
205 The object constructor. The optional \f(CW$header\fR argument should be a
206 reference to an object that implements the \fIheader()\fR and \fIpush_header()\fR
207 methods as defined by the \f(CW\*(C`HTTP::Headers\*(C'\fR class. Normally it will be
208 of some class that inherits from or delegates to the \f(CW\*(C`HTTP::Headers\*(C'\fR class.
210 If no \f(CW$header\fR is given, \f(CW\*(C`HTML::HeadParser\*(C'\fR will create an
211 \&\f(CW\*(C`HTTP::Headers\*(C'\fR object by itself (initially empty).
212 .IP "$hp\->header;" 4
213 .IX Item "$hp->header;"
214 Returns a reference to the header object.
215 .ie n .IP "$hp\->header( $key )" 4
216 .el .IP "$hp\->header( \f(CW$key\fR )" 4
217 .IX Item "$hp->header( $key )"
218 Returns a header value. It is just a shorter way to write
219 \&\f(CW\*(C`$hp\->header\->header($key)\*(C'\fR.
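.PP
The two forms of the accessor can be compared in a short sketch, assuming that \fIlibwww-perl\fR is installed so that HTTP::Headers is available. The input string is a hypothetical example, and Scalar::Util is used here only to compare object identity.

```perl
use strict;
use warnings;
use HTTP::Headers;
use HTML::HeadParser;
use Scalar::Util qw(refaddr);

# Supply our own header object so the parser updates it directly.
my $h = HTTP::Headers->new;
my $p = HTML::HeadParser->new($h);

# Hypothetical input; any document with a <title> would do.
$p->parse('<head><title>Shared headers</title></head><body>text</body>');

# Without an argument, header() returns the header object itself.
print "same object\n" if refaddr($p->header) == refaddr($h);

# With a key, header() is shorthand for a lookup on that object.
print $p->header('Title'), "\n";
print $p->header->header('Title'), "\n";
```

Both lookups read the same field from the same object, so either form can be used, whichever is more convenient at the call site.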
223 \& $h = HTTP::Headers\->new;
224 \& $p = HTML::HeadParser\->new($h);
225 \& $p\->parse(<<EOT);
226 \& <title>Stupid example</title>
227 \& <base href="http://www.linpro.no/lwp/">
228 \& Normal text starts here.
231 \& print $h\->title; # should print "Stupid example"
234 .IX Header "SEE ALSO"
235 HTML::Parser, HTTP::Headers
237 The \f(CW\*(C`HTTP::Headers\*(C'\fR class is distributed as part of the
238 \&\fIlibwww-perl\fR package. If you don't have that distribution installed
239 you need to provide the \f(CW$header\fR argument to the \f(CW\*(C`HTML::HeadParser\*(C'\fR
240 constructor with your own object that implements the documented
243 .IX Header "COPYRIGHT"
244 Copyright 1996\-2001 Gisle Aas. All rights reserved.
246 This library is free software; you can redistribute it and/or
247 modify it under the same terms as Perl itself.