=head1 NAME

lwptut -- An LWP Tutorial

=head1 DESCRIPTION

LWP (short for "Library for WWW in Perl") is a very popular group of
Perl modules for accessing data on the Web. Like most Perl
module-distributions, each of LWP's component modules comes with
documentation that is a complete reference to its interface. However,
there are so many modules in LWP that it's hard to know where to start
looking for information on how to do even the simplest, most common
things.

Really introducing you to using LWP would require a whole book -- a book
that just happens to exist, called I<Perl & LWP>. But this article
should give you a taste of how you can go about some common tasks with
LWP.


=head2 Getting documents with LWP::Simple

If you just want to get what's at a particular URL, the simplest way
to do it is with LWP::Simple's functions.

In a Perl program, you can call its C<get($url)> function. It will try
getting that URL's content. If it works, then it'll return the
content; but if there's some error, it'll return undef.

  my $url = 'http://freshair.npr.org/dayFA.cfm?todayDate=current';
    # Just an example: the URL for the most recent /Fresh Air/ show

  use LWP::Simple;
  my $content = get $url;
  die "Couldn't get $url" unless defined $content;

  # Then go do things with $content, like this:

  if($content =~ m/jazz/i) {
    print "They're talking about jazz today on Fresh Air!\n";
  }
  else {
    print "Fresh Air is apparently jazzless today.\n";
  }

The handiest variant on C<get> is C<getprint>, which is useful in Perl
one-liners. If it can get the page whose URL you provide, it sends it
to STDOUT; otherwise it complains to STDERR.

  % perl -MLWP::Simple -e "getprint 'http://cpan.org/RECENT'"

That is the URL of a plain text file that lists new files in CPAN in
the past two weeks. You can easily make it part of a tidy little
shell command, like this one that mails you the list of new
C<Acme::> modules:

  % perl -MLWP::Simple -e "getprint 'http://cpan.org/RECENT'"  \
     | grep "/by-module/Acme" | mail -s "New Acme modules! Joy!" $USER

There are other useful functions in LWP::Simple, including one function
for running a HEAD request on a URL (useful for checking links, or
getting the last-revised time of a URL), and two functions for
saving/mirroring a URL to a local file. See L<the LWP::Simple
documentation|LWP::Simple> for the full details, or chapter 2 of I<Perl
& LWP> for more examples.
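
As a rough sketch of those other functions, the following uses C<head>
and C<getstore> against a C<file:> URL so that it runs without network
access. The temporary file and its contents are made up for illustration;
with a real site you would pass an C<http:> URL instead.

```perl
use strict;
use warnings;
use LWP::Simple qw(head getstore is_success);
use File::Temp qw(tempdir);
use URI::file;

# Create a small local file so the example needs no network access.
# (The file name and contents are purely illustrative.)
my $dir    = tempdir( CLEANUP => 1 );
my $source = "$dir/source.txt";
open my $out, '>', $source or die "Can't write $source: $!";
print {$out} "Hello from LWP::Simple\n";
close $out;

my $url = URI::file->new($source)->as_string;   # a file: URL

# In list context, head() returns header-derived values on success;
# the first two are the content type and the document length.
my ($type, $length) = head($url);
print "Type: $type, length: $length\n";

# getstore() saves the document to a file and returns the status code.
my $copy   = "$dir/copy.txt";
my $status = getstore( $url, $copy );
print "Saved a copy to $copy\n" if is_success($status);
```

C<mirror> takes the same arguments as C<getstore>, but only fetches the
document again if it is newer than the local copy.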



=for comment
 ##########################################################################



=head2 The Basics of the LWP Class Model

LWP::Simple's functions are handy for simple cases, but they don't
support cookies or authorization, don't support setting header
lines in the HTTP request, and generally don't support reading header
lines in the HTTP response (notably the full HTTP error message, in case
of an error). To get at all those features, you'll have to use the full
LWP class model.

While LWP consists of dozens of classes, the main two that you have to
understand are L<LWP::UserAgent> and L<HTTP::Response>. LWP::UserAgent
is a class for "virtual browsers" which you use for performing requests,
and L<HTTP::Response> is a class for the responses (or error messages)
that you get back from those requests.

The basic idiom is C<< $response = $browser->get($url) >>, or more fully
illustrated:

  # Early in your program:

  use LWP 5.64; # Loads all important LWP classes, and makes
                #  sure your version is reasonably recent.

  my $browser = LWP::UserAgent->new;

  ...

  # Then later, whenever you need to make a get request:
  my $url = 'http://freshair.npr.org/dayFA.cfm?todayDate=current';

  my $response = $browser->get( $url );
  die "Can't get $url -- ", $response->status_line
   unless $response->is_success;

  die "Hey, I was expecting HTML, not ", $response->content_type
   unless $response->content_type eq 'text/html';
     # or whatever content-type you're equipped to deal with

  # Otherwise, process the content somehow:

  if($response->decoded_content =~ m/jazz/i) {
    print "They're talking about jazz today on Fresh Air!\n";
  }
  else {
    print "Fresh Air is apparently jazzless today.\n";
  }

There are two objects involved: C<$browser>, which holds an object of
class LWP::UserAgent, and then the C<$response> object, which is of
class HTTP::Response. You really need only one browser object per
program; but every time you make a request, you get back a new
HTTP::Response object, which will have some interesting attributes:

=over

=item *

A status code indicating success or failure
(which you can test with C<< $response->is_success >>).

=item *

An HTTP status line that is hopefully informative if there's failure
(which you can see with C<< $response->status_line >>,
returning something like "404 Not Found").

=item *

A MIME content-type like "text/html", "image/gif",
"application/xml", etc., which you can see with
C<< $response->content_type >>.

=item *

The actual content of the response, in C<< $response->decoded_content >>.
If the response is HTML, that's where the HTML source will be; if
it's a GIF, then C<< $response->decoded_content >> will be the binary
GIF data.

=item *

And dozens of other convenient and more specific methods that are
documented in the docs for L<HTTP::Response>, and its superclasses
L<HTTP::Message> and L<HTTP::Headers>.

=back
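
To get a feel for those attributes without making a request, you can
build an HTTP::Response object by hand. This is only a sketch: the
status, header, and body below are invented stand-ins for what
C<< $browser->get($url) >> would hand back.

```perl
use strict;
use warnings;
use HTTP::Response;

# A hand-built response standing in for a real server reply.
my $response = HTTP::Response->new(
  404, 'Not Found',
  [ 'Content-Type' => 'text/html; charset=UTF-8' ],
  '<html><body>No such page</body></html>',
);

# Call content_type in scalar context to get just the MIME type,
# without the charset parameter.
my $type = $response->content_type;
my $body = $response->decoded_content;

print "Success?     ", ( $response->is_success ? "yes" : "no" ), "\n";
print "Status line: ", $response->status_line, "\n";
print "Type:        $type\n";
print "Body length: ", length($body), "\n";
```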



=for comment
 ##########################################################################



=head2 Adding Other HTTP Request Headers

The most commonly used syntax for requests is C<< $response =
$browser->get($url) >>, but in truth, you can add extra HTTP header
lines to the request by adding a list of key-value pairs after the URL,
like so:

  $response = $browser->get( $url, $key1, $value1, $key2, $value2, ... );

For example, here's how to send some more Netscape-like headers, in case
you're dealing with a site that would otherwise reject your request:

  my @ns_headers = (
   'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
   'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*',
   'Accept-Charset' => 'iso-8859-1,*,utf-8',
   'Accept-Language' => 'en-US',
  );

  ...

  $response = $browser->get($url, @ns_headers);

If you weren't reusing that array, you could just go ahead and do this:

  $response = $browser->get($url,
   'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
   'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*',
   'Accept-Charset' => 'iso-8859-1,*,utf-8',
   'Accept-Language' => 'en-US',
  );

If you were only ever changing the 'User-Agent' line, you could just change
the C<$browser> object's default line from "libwww-perl/5.65" (or the like)
to whatever you like, using the LWP::UserAgent C<agent> method:

  $browser->agent('Mozilla/4.76 [en] (Win98; U)');
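
If you want to see what such a request looks like before it goes out,
one option is to build it with the C<GET> function from
HTTP::Request::Common, which accepts the same URL-plus-pairs argument
list. The sketch below only constructs and prints the request; the
C<www.example.com> URL is a placeholder.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(GET);

# Build (but don't send) a GET request carrying extra header lines.
my $request = GET( 'http://www.example.com/',
  'User-Agent'      => 'Mozilla/4.76 [en] (Win98; U)',
  'Accept-Language' => 'en-US',
);

# Dump the request line and headers as they would be sent.
print $request->as_string;
```

A request object built this way can be sent with
C<< $browser->request($request) >>.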



=for comment
 ##########################################################################



=head2 Enabling Cookies

A default LWP::UserAgent object acts like a browser with its cookies
support turned off. There are various ways of turning it on, by setting
its C<cookie_jar> attribute. A "cookie jar" is an object representing
a little database of all
the HTTP cookies that a browser can know about. It can correspond to a
file on disk (the way Netscape uses its F<cookies.txt> file), or it can
be just an in-memory object that starts out empty, and whose collection of
cookies will disappear once the program is finished running.

To give a browser an in-memory empty cookie jar, you set its C<cookie_jar>
attribute like so:

  $browser->cookie_jar({});

To give it a copy that will be read from a file on disk, and will be saved
to it when the program is finished running, set the C<cookie_jar> attribute
like this:

  use HTTP::Cookies;
  $browser->cookie_jar( HTTP::Cookies->new(
    'file' => '/some/where/cookies.lwp',
        # where to read/write cookies
    'autosave' => 1,
        # save it to disk when done
  ));

That file will be in an LWP-specific format. If you want to access the
cookies in your Netscape cookies file, you can use the
HTTP::Cookies::Netscape class:

  use HTTP::Cookies;
    # yes, loads HTTP::Cookies::Netscape too

  $browser->cookie_jar( HTTP::Cookies::Netscape->new(
    'file' => 'c:/Program Files/Netscape/Users/DIR-NAME-HERE/cookies.txt',
        # where to read cookies
  ));

You could add an C<< 'autosave' => 1 >> line as further above, but at
time of writing, it's uncertain whether Netscape might discard some of
the cookies you could be writing back to disk.
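
To illustrate what a cookie jar does behind the scenes, here is a small
offline sketch that feeds a hand-built response to an in-memory
HTTP::Cookies object and then lets the jar decorate a later request.
The host, paths, and cookie value are invented; a browser object with a
C<cookie_jar> performs these two steps for you on every request.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

my $jar = HTTP::Cookies->new;   # in-memory jar, nothing saved to disk

# Pretend the server answered a login request by setting a cookie.
my $login = HTTP::Request->new( GET => 'http://www.example.com/login' );
my $reply = HTTP::Response->new( 200, 'OK',
  [ 'Set-Cookie' => 'session=abc123; path=/' ] );
$reply->request($login);        # the jar needs to know which URL was asked

$jar->extract_cookies($reply);  # remember any cookies in the response

# A later request to the same host gets the cookie added automatically.
my $next = HTTP::Request->new( GET => 'http://www.example.com/account' );
$jar->add_cookie_header($next);

print "Cookie header: ", $next->header('Cookie'), "\n";
```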



=for comment
 ##########################################################################



=head2 Posting Form Data

Many HTML forms send data to their server using an HTTP POST request, which
you can send with this syntax:

  $response = $browser->post( $url,
    [
      formkey1 => value1,
      formkey2 => value2,
      ...
    ],
  );

Or if you need to send HTTP headers:

  $response = $browser->post( $url,
    [
      formkey1 => value1,
      formkey2 => value2,
      ...
    ],
    headerkey1 => value1,
    headerkey2 => value2,
  );

For example, the following program makes a search request to AltaVista
(by sending some form data via an HTTP POST request), and extracts from
the HTML the report of the number of matches:

  use strict;
  use warnings;
  use LWP 5.64;
  my $browser = LWP::UserAgent->new;

  my $word = 'tarragon';

  my $url = 'http://www.altavista.com/sites/search/web';
  my $response = $browser->post( $url,
    [ 'q' => $word,  # the Altavista query string
      'pg' => 'q', 'avkw' => 'tgz', 'kl' => 'XX',
    ]
  );
  die "$url error: ", $response->status_line
   unless $response->is_success;
  die "Weird content type at $url -- ", $response->content_type
   unless $response->content_is_html;

  if( $response->decoded_content =~ m{AltaVista found ([0-9,]+) results} ) {
    # The substring will be like "AltaVista found 2,345 results"
    print "$word: $1\n";
  }
  else {
    print "Couldn't find the match-string in the response\n";
  }
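
To see how the form pairs end up encoded in the request body, you can
build the request with the C<POST> function from HTTP::Request::Common
and inspect it without sending anything. This is a sketch with a
placeholder URL and made-up form values.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Build (but don't send) a form-style POST request.
my $request = POST( 'http://www.example.com/search',
  [ 'q'  => 'tarragon & thyme',   # note the space and ampersand
    'pg' => 'q',
  ],
);

my $type = $request->content_type;   # scalar context: just the MIME type
my $body = $request->content;

print "Method:       ", $request->method, "\n";
print "Content-Type: $type\n";
print "Body:         $body\n";
```

The body shows that spaces and other special characters in form values
are URL-escaped for you, so you pass the raw values to C<post>.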



=for comment
 ##########################################################################



=head2 Sending GET Form Data

Some HTML forms convey their form data not by sending the data
in an HTTP POST request, but by making a normal GET request with
the data stuck on the end of the URL. For example, if you went to
C<imdb.com> and ran a search on "Blade Runner", the URL you'd see
in your browser window would be:

  http://us.imdb.com/Tsearch?title=Blade%20Runner&restrict=Movies+and+TV

To run the same search with LWP, you'd use this idiom, which involves
the URI class:

  use URI;
  my $url = URI->new( 'http://us.imdb.com/Tsearch' );
    # makes an object representing the URL

  $url->query_form(  # And here the form data pairs:
    'title'    => 'Blade Runner',
    'restrict' => 'Movies and TV',
  );

  my $response = $browser->get($url);

See chapter 5 of I<Perl & LWP> for a longer discussion of HTML forms
and of form data, and chapters 6 through 9 for a longer discussion of
extracting data from HTML.
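
If you are curious what URL that idiom produces, you can print the URI
object before requesting it. The following sketch does only that, and
also reads the pairs back out with C<query_form>:

```perl
use strict;
use warnings;
use URI;

my $url = URI->new('http://us.imdb.com/Tsearch');
$url->query_form(
  'title'    => 'Blade Runner',
  'restrict' => 'Movies and TV',
);

# The object stringifies to the full URL, with the values escaped.
print "$url\n";

# Called without arguments, query_form returns the decoded pairs.
my %form = $url->query_form;
print "title is '$form{title}'\n";
```

Note that C<query_form> encodes spaces as "+" rather than "%20"; both
forms are understood by servers in a query string.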



=head2 Absolutizing URLs

The URI class that we just mentioned above provides all sorts of methods
for accessing and modifying parts of URLs (such as asking what sort of
URL it is with C<< $url->scheme >>, and asking what host it refers to
with C<< $url->host >>, and so on, as described in L<the docs for the URI
class|URI>). However, the methods of most immediate interest
are the C<query_form> method seen above, and now the C<new_abs> method
for taking a probably-relative URL string (like "../foo.html") and getting
back an absolute URL (like "http://www.perl.com/stuff/foo.html"), as
shown here:

  use URI;
  $abs = URI->new_abs($maybe_relative, $base);

For example, consider this program that matches URLs in the HTML
list of new modules in CPAN:

  use strict;
  use warnings;
  use LWP;
  my $browser = LWP::UserAgent->new;

  my $url = 'http://www.cpan.org/RECENT.html';
  my $response = $browser->get($url);
  die "Can't get $url -- ", $response->status_line
   unless $response->is_success;

  my $html = $response->decoded_content;
  while( $html =~ m/<A HREF=\"(.*?)\"/g ) {
    print "$1\n";
  }

When run, it emits output that starts out something like this:

  MIRRORING.FROM
  RECENT
  RECENT.html
  authors/00whois.html
  authors/01mailrc.txt.gz
  authors/id/A/AA/AASSAD/CHECKSUMS
  ...

However, if you actually want to have those be absolute URLs, you
can use the URI module's C<new_abs> method, by changing the C<while>
loop to this:

  while( $html =~ m/<A HREF=\"(.*?)\"/g ) {
    print URI->new_abs( $1, $response->base ) ,"\n";
  }

(The C<< $response->base >> method from L<HTTP::Response|HTTP::Response>
is for returning what URL
should be used for resolving relative URLs -- it's usually just
the same as the URL that you requested.)

That program then emits nicely absolute URLs:

  http://www.cpan.org/MIRRORING.FROM
  http://www.cpan.org/RECENT
  http://www.cpan.org/RECENT.html
  http://www.cpan.org/authors/00whois.html
  http://www.cpan.org/authors/01mailrc.txt.gz
  http://www.cpan.org/authors/id/A/AA/AASSAD/CHECKSUMS
  ...

See chapter 4 of I<Perl & LWP> for a longer discussion of URI objects.

Of course, using a regexp to match hrefs is a bit simplistic, and for
more robust programs, you'll probably want to use an HTML-parsing module
like L<HTML::LinkExtor> or L<HTML::TokeParser> or even maybe
L<HTML::TreeBuilder>.
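
As one possible starting point, here is a sketch of the same
link-listing task done with HTML::LinkExtor instead of a regexp. The
HTML snippet is invented so that the example runs offline; in a real
program you would pass C<< $response->decoded_content >> and
C<< $response->base >> instead.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Stand-ins for $response->base and $response->decoded_content.
my $base = 'http://www.cpan.org/RECENT.html';
my $html = <<'END_HTML';
<a href="authors/00whois.html">Authors</a>
<A HREF="RECENT">Recent files</A>
<img src="/misc/logo.png">
END_HTML

# With no callback (undef), the parser just collects the links;
# giving it a base URL makes it absolutize them as it goes.
my $parser = HTML::LinkExtor->new( undef, $base );
$parser->parse($html);
$parser->eof;

my @urls;
for my $link ( $parser->links ) {
  my ( $tag, %attr ) = @$link;   # e.g. ('a', href => URI object)
  next unless $tag eq 'a' and $attr{href};
  push @urls, "$attr{href}";
}
print "$_\n" for @urls;
```

Unlike the regexp, this copes with lowercase tags, other attribute
orders, and different quoting styles, and it skips the image link
because only C<a> elements are kept.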




=for comment
 ##########################################################################

=head2 Other Browser Attributes

LWP::UserAgent objects have many attributes for controlling how they
work. Here are a few notable ones:

=over

=item *

C<< $browser->timeout(15); >>

This sets this browser object to give up on requests that don't answer
within 15 seconds.


=item *

C<< $browser->protocols_allowed( [ 'http', 'gopher'] ); >>

This sets this browser object to not speak any protocols other than HTTP
and gopher. If it tries accessing any other kind of URL (like an "ftp:"
or "mailto:" or "news:" URL), then it won't actually try connecting, but
instead will immediately return an error code 500, with a message like
"Access to 'ftp' URIs has been disabled".


=item *

C<< use LWP::ConnCache; $browser->conn_cache(LWP::ConnCache->new()); >>

This tells the browser object to try using the HTTP/1.1 "Keep-Alive"
feature, which speeds up requests by reusing the same socket connection
for multiple requests to the same server.


=item *

C<< $browser->agent( 'SomeName/1.23 (more info here maybe)' ) >>

This changes how the browser object will identify itself in
the default "User-Agent" line in its HTTP requests. By default,
it'll send "libwww-perl/I<versionnumber>", like
"libwww-perl/5.65". You can change that to something more descriptive
like this:

  $browser->agent( 'SomeName/3.14 (contact@robotplexus.int)' );

Or if need be, you can go in disguise, like this:

  $browser->agent( 'Mozilla/4.0 (compatible; MSIE 5.12; Mac_PowerPC)' );


=item *

C<< push @{ $browser->requests_redirectable }, 'POST'; >>

This tells this browser to obey redirection responses to POST requests
(like most modern interactive browsers), even though the HTTP RFC says
that should not normally be done.


=back


For more options and information, see L<the full documentation for
LWP::UserAgent|LWP::UserAgent>.
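
The following sketch sets a few of these attributes and then
demonstrates the C<protocols_allowed> behavior described above. It
assumes that a disallowed scheme is refused before any connection is
attempted, as the text says, so it should run offline; the
C<ftp.example.com> URL is a placeholder.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $browser = LWP::UserAgent->new;
$browser->timeout(15);
$browser->agent('SomeName/1.23');
$browser->protocols_allowed( [ 'http', 'https' ] );

# Each attribute method also works as a getter when called bare.
print "Timeout: ", $browser->timeout, " seconds\n";
print "Agent:   ", $browser->agent, "\n";

# "ftp" is not in the allowed list, so this is refused locally.
my $response = $browser->get('ftp://ftp.example.com/README');
print "FTP attempt: ", $response->status_line, "\n";
```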



=for comment
 ##########################################################################



=head2 Writing Polite Robots

If you want to make sure that your LWP-based program respects F<robots.txt>
files and doesn't make too many requests too fast, you can use the LWP::RobotUA
class instead of the LWP::UserAgent class.

The LWP::RobotUA class is just like LWP::UserAgent, and you can use it like so:

  use LWP::RobotUA;
  my $browser = LWP::RobotUA->new('YourSuperBot/1.34', 'you@yoursite.com');
    # Your bot's name and your email address

  my $response = $browser->get($url);

But LWP::RobotUA adds these features:


=over

=item *

If the F<robots.txt> on C<$url>'s server forbids you from accessing
C<$url>, then the C<$browser> object (assuming it's of class LWP::RobotUA)
won't actually request it, but instead will give you back (in C<$response>)
a 403 error with a message "Forbidden by robots.txt". That is, if you have
this line:

  die "$url -- ", $response->status_line, "\nAborted"
   unless $response->is_success;

then the program would die with an error message like this:

  http://whatever.site.int/pith/x.html -- 403 Forbidden by robots.txt
  Aborted at whateverprogram.pl line 1234

=item *

If this C<$browser> object sees that the last time it talked to
C<$url>'s server was too recently, then it will pause (via C<sleep>) to
avoid making too many requests too often. By default it pauses for
one minute -- but you can control that with the C<<
$browser->delay( I<minutes> ) >> attribute.

For example, this code:

  $browser->delay( 7/60 );

...means that this browser will pause when it needs to avoid talking to
any given server more than once every 7 seconds.

=back

For more options and information, see L<the full documentation for
LWP::RobotUA|LWP::RobotUA>.
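
LWP::RobotUA relies on the WWW::RobotRules class to interpret
F<robots.txt> files, and you can use that class directly to see how a
given file would be applied. The sketch below parses an invented
F<robots.txt> for a placeholder host, without any network access.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# The rules object is keyed to the robot's name.
my $rules = WWW::RobotRules->new('YourSuperBot/1.34');

# An invented robots.txt, as if fetched from www.example.com.
my $robots_txt = <<'END_ROBOTS';
User-agent: *
Disallow: /private/
END_ROBOTS

$rules->parse( 'http://www.example.com/robots.txt', $robots_txt );

for my $url ( 'http://www.example.com/index.html',
              'http://www.example.com/private/notes.html' ) {
  my $verdict = $rules->allowed($url) ? 'allowed' : 'forbidden';
  print "$url is $verdict\n";
}
```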





=for comment
 ##########################################################################

=head2 Using Proxies

In some cases, you will want to (or will have to) use proxies for
accessing certain sites and/or using certain protocols. This is most
commonly the case when your LWP program is running (or could be running)
on a machine that is behind a firewall.

To make a browser object use proxies that are defined in the usual
environment variables (C<HTTP_PROXY>, etc.), just call the C<env_proxy>
method on a user-agent object before you go making any requests on it.
Specifically:

  use LWP::UserAgent;
  my $browser = LWP::UserAgent->new;

  # And before you go making any requests:
  $browser->env_proxy;

For more information on proxy parameters, see L<the LWP::UserAgent
documentation|LWP::UserAgent>, specifically the C<proxy>, C<env_proxy>,
and C<no_proxy> methods.
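
If you would rather set the proxy in the program than rely on the
environment, the C<proxy> and C<no_proxy> methods can be used roughly
like this. The proxy address and the exempted domains are placeholders.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $browser = LWP::UserAgent->new;

# Send http: and ftp: requests through one (hypothetical) proxy...
$browser->proxy( [ 'http', 'ftp' ], 'http://proxy.example.com:8080/' );

# ...except for requests to these domains, which go direct.
$browser->no_proxy( 'localhost', 'intranet.example.com' );

# With just a scheme, proxy() reports the current setting.
print "HTTP proxy: ", $browser->proxy('http'), "\n";
```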



=for comment
 ##########################################################################

=head2 HTTP Authentication

Many web sites restrict access to documents by using "HTTP
Authentication". This isn't just any form of "enter your password"
restriction, but is a specific mechanism where the HTTP server sends the
browser an HTTP code that says "That document is part of a protected
'realm', and you can access it only if you re-request it and add some
special authorization headers to your request".

For example, the Unicode.org admins stop email-harvesting bots from
harvesting the contents of their mailing list archives, by protecting
them with HTTP Authentication, and then publicly stating the username
and password (at C<http://www.unicode.org/mail-arch/>) -- namely
username "unicode-ml" and password "unicode".

Consider this URL, which is part of the protected
area of the web site:

  http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html

If you access that with a browser, you'll get a prompt
like
"Enter username and password for 'Unicode-MailList-Archives' at server
'www.unicode.org'".

In LWP, if you just request that URL, like this:

  use LWP;
  my $browser = LWP::UserAgent->new;

  my $url =
   'http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html';
  my $response = $browser->get($url);

  die "Error: ", $response->header('WWW-Authenticate') || 'Error accessing',
    #  (the 'WWW-Authenticate' header carries the realm name)
    "\n ", $response->status_line, "\n at $url\n Aborting"
   unless $response->is_success;

Then you'll get this error:

  Error: Basic realm="Unicode-MailList-Archives"
   401 Authorization Required
   at http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html
   Aborting at auth1.pl line 9.  [or wherever]

...because the C<$browser> doesn't know the username and password
for that realm ("Unicode-MailList-Archives") at that host
("www.unicode.org"). The simplest way to let the browser know about this
is to use the C<credentials> method to let it know about a username and
password that it can try using for that realm at that host. The syntax is:

  $browser->credentials(
    'servername:portnumber',
    'realm-name',
    'username' => 'password'
  );

In most cases, the port number is 80, the default TCP/IP port for HTTP; and
you usually call the C<credentials> method before you make any requests.
For example:

  $browser->credentials(
    'reports.mybazouki.com:80',
    'web_server_usage_reports',
    'plinky' => 'banjo123'
  );

So if we add the following to the program above, right after the C<<
$browser = LWP::UserAgent->new; >> line...

  $browser->credentials(  # add this to our $browser 's "key ring"
    'www.unicode.org:80',
    'Unicode-MailList-Archives',
    'unicode-ml' => 'unicode'
  );

...then when we run it, the request succeeds, instead of causing the
C<die> to be called.
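
To make the "special authorization headers" a little less mysterious,
this offline sketch stores the credentials, reads them back (calling
C<credentials> with only the host and realm, which recent versions of
LWP::UserAgent support), and shows the Basic authorization header that
they translate into. The browser normally adds this header for you when
it retries after a 401 response.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use MIME::Base64 qw(encode_base64);

my $browser = LWP::UserAgent->new;
$browser->credentials(
  'www.unicode.org:80',
  'Unicode-MailList-Archives',
  'unicode-ml' => 'unicode'
);

# In list context, the two-argument form returns what was stored.
my ($user, $pass) = $browser->credentials(
  'www.unicode.org:80', 'Unicode-MailList-Archives' );

# Build a request by hand and attach a Basic authorization header.
my $request = HTTP::Request->new(
  GET => 'http://www.unicode.org/mail-arch/' );
$request->authorization_basic( $user, $pass );

print "Authorization: ", $request->header('Authorization'), "\n";
```

The header value is just "Basic" followed by the Base64 encoding of
"username:password", which is why Basic authentication should not be
mistaken for encryption.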



=for comment
 ##########################################################################

=head2 Accessing HTTPS URLs

When you access an HTTPS URL, it'll work for you just like an HTTP URL
would -- if your LWP installation has HTTPS support (via an appropriate
Secure Sockets Layer library). For example:

  use LWP;
  my $url = 'https://www.paypal.com/';   # Yes, HTTPS!
  my $browser = LWP::UserAgent->new;
  my $response = $browser->get($url);
  die "Error at $url\n ", $response->status_line, "\n Aborting"
   unless $response->is_success;
  print "Whee, it worked!  I got that ",
   $response->content_type, " document!\n";

If your LWP installation doesn't have HTTPS support set up, then the
response will be unsuccessful, and you'll get this error message:

  Error at https://www.paypal.com/
   501 Protocol scheme 'https' is not supported
   Aborting at paypal.pl line 7.   [or whatever program and line]

If your LWP installation I<does> have HTTPS support installed, then the
response should be successful, and you should be able to consult
C<$response> just like with any normal HTTP response.

For information about installing HTTPS support for your LWP
installation, see the helpful F<README.SSL> file that comes in the
libwww-perl distribution.


=for comment
 ##########################################################################



=head2 Getting Large Documents

When you're requesting a large (or at least potentially large) document,
a problem with the normal way of using the request methods (like C<<
$response = $browser->get($url) >>) is that the response object in
memory will have to hold the whole document -- I<in memory>. If the
response is a thirty megabyte file, this is likely to be quite an
imposition on this process's memory usage.

A notable alternative is to have LWP save the content to a file on disk,
instead of saving it up in memory. This is the syntax to use:

  $response = $browser->get($url,
    ':content_file' => $filespec,
  );

For example,

  $response = $browser->get('http://search.cpan.org/',
    ':content_file' => '/tmp/sco.html'
  );

When you use this C<:content_file> option, the C<$response> will have
all the normal header lines, but C<< $response->content >> will be
empty.

Note that this ":content_file" option isn't supported under older
versions of LWP, so you should consider adding C<use LWP 5.66;> to check
the LWP version, if you think your program might run on systems with
older versions.

If you need to be compatible with older LWP versions, then use
this syntax, which does the same thing:

  use HTTP::Request::Common;
  $response = $browser->request( GET($url), $filespec );
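
Here is a sketch of the C<:content_file> option in action. To stay
runnable without network access it fetches a C<file:> URL pointing at a
temporary file created on the spot; the file names are illustrative, and
with a real site you would pass an C<http:> URL instead.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempdir);
use URI::file;

# Make a stand-in "large" document in a temporary directory.
my $dir    = tempdir( CLEANUP => 1 );
my $source = "$dir/big.txt";
open my $out, '>', $source or die "Can't write $source: $!";
print {$out} "line $_\n" for 1 .. 1000;
close $out;

my $url      = URI::file->new($source)->as_string;
my $filespec = "$dir/saved.txt";

my $browser  = LWP::UserAgent->new;
my $response = $browser->get( $url, ':content_file' => $filespec );
die "Can't get $url -- ", $response->status_line
 unless $response->is_success;

# The data went to disk, not into the response object.
my $on_disk   = -s $filespec;
my $in_memory = length $response->content;
print "On disk: $on_disk bytes; in memory: $in_memory bytes\n";
```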


=for comment
 ##########################################################################


=head1 SEE ALSO

Remember, this article is just the most rudimentary introduction to
LWP -- to learn more about LWP and LWP-related tasks, you really
must read from the following:

=over

=item *

L<LWP::Simple> -- simple functions for getting/heading/mirroring URLs

=item *

L<LWP> -- overview of the libwww-perl modules

=item *

L<LWP::UserAgent> -- the class for objects that represent "virtual browsers"

=item *

L<HTTP::Response> -- the class for objects that represent the response to
an LWP request, as in C<< $response = $browser->get(...) >>

=item *

L<HTTP::Message> and L<HTTP::Headers> -- classes that provide more methods
to HTTP::Response.

=item *

L<URI> -- class for objects that represent absolute or relative URLs

=item *

L<URI::Escape> -- functions for URL-escaping and URL-unescaping strings
(like turning "this & that" to and from "this%20%26%20that").

=item *

L<HTML::Entities> -- functions for HTML-escaping and HTML-unescaping strings
(like turning "C. & E. BrontE<euml>" to and from "C. &amp; E. Bront&euml;")

=item *

L<HTML::TokeParser> and L<HTML::TreeBuilder> -- classes for parsing HTML

=item *

L<HTML::LinkExtor> -- class for finding links in HTML documents

=item *

The book I<Perl & LWP> by Sean M. Burke. O'Reilly & Associates, 2002.
ISBN: 0-596-00178-9. C<http://www.oreilly.com/catalog/perllwp/>

=back


=head1 COPYRIGHT

Copyright 2002, Sean M. Burke. You can redistribute this document and/or
modify it, but only under the same terms as Perl itself.

=head1 AUTHOR

Sean M. Burke C<sburke@cpan.org>

=for comment
 ##########################################################################

=cut

# End of Pod