From: Michael G. Schwern
Date: Wed, 10 Nov 1999 17:21:46 +0000 (-0500)
Subject: [DOCPATCH 5.005_62 perlfaq9.pod] Mention HTML::FormatText
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=7d7e76cf99e05858505bf53b2c9d61ea589e2cb6;p=p5sagit%2Fp5-mst-13.2.git

[DOCPATCH 5.005_62 perlfaq9.pod] Mention HTML::FormatText

To: perl5-porters@perl.org, pod-people@perl.org
Cc: tchrist@mox.perl.com, gnat@frii.com
Message-ID: <19991110172146.A23527@athens.aocn.com>

p4raw-id: //depot/cfgperl@4569
---

diff --git a/pod/perlfaq9.pod b/pod/perlfaq9.pod
index 3da9bc1..7fc0cdc 100644
--- a/pod/perlfaq9.pod
+++ b/pod/perlfaq9.pod
@@ -77,7 +77,9 @@ stamp prepended.
 =head2 How do I remove HTML from a string?
 
 The most correct way (albeit not the fastest) is to use HTML::Parser
-from CPAN (part of the HTML-Tree package on CPAN).
+from CPAN (part of the HTML-Tree package on CPAN). Another correct
+way is to use HTML::FormatText which not only removes HTML but also
+attempts to do a little simple formatting of the resulting plain text.
 
 Many folks attempt a simple-minded regular expression approach, like
 C<s/E<lt>.*?E<gt>//g>, but that fails in many cases because the tags