X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlfaq6.pod;h=d21a11157b11c80dea2ae6361fda4836226ff05d;hb=3bf198a5e20d135d4136d3233d58cf49a70772d9;hp=589d89e495b3dc5d156558f12ec449508ed89111;hpb=68dc074516a6859e3424b48d1647bcb08b1a1a7d;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod
index 589d89e..d21a111 100644
--- a/pod/perlfaq6.pod
+++ b/pod/perlfaq6.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq6 - Regexps ($Revision: 1.14 $)
+perlfaq6 - Regexps ($Revision: 1.17 $, $Date: 1997/04/24 22:44:10 $)
 
 =head1 DESCRIPTION
 
@@ -11,7 +11,7 @@ with regular expressions, but those answers are found elsewhere in
 this document (in the section on Data and the Networking one on
 networking, to be precise).
 
-=head2 How can I hope to use regular expressions without creating illegible and unmaintainable code?  
+=head2 How can I hope to use regular expressions without creating illegible and unmaintainable code?
 
 Three techniques can make regular expressions maintainable and
 understandable.
@@ -96,8 +96,8 @@ record read in.
     while ( <> ) {
 	while ( /\b(\w\S+)(\s+\1)+\b/gi ) {
 	    print "Duplicate $1 at paragraph $.\n";
-	} 
-    } 
+	}
+    }
 
 Here's code that finds sentences that begin with "From " (which would
 be mangled by many mailers):
@@ -138,11 +138,32 @@ on matching balanced text.
 $/ must be a string, not a regular expression.  Awk has to be better
 for something. :-)
 
-Actually, you could do this if you don't mind reading the whole file into 
+Actually, you could do this if you don't mind reading the whole file
+into memory:
 
     undef $/;
     @records = split /your_pattern/, <FH>;
 
+The Net::Telnet module (available from CPAN) has the capability to
+wait for a pattern in the input stream, or time out if it doesn't
+appear within a certain time.
+
+    ## Create a file with three lines.
+    open FH, ">file" or die "can't create file: $!";
+    print FH "The first line\nThe second line\nThe third line\n";
+    close FH;
+
+    ## Get a read/write filehandle to it.
+    $fh = new FileHandle "+<file";
+
+    ## Attach it to a "stream" object.
+    use Net::Telnet;
+    $file = new Net::Telnet (-fhopen => $fh);
+
+    ## Search for the second line and print out the third.
+    $file->waitfor('/second line\n/');
+    print $file->getline;
+
 =head2 How do I substitute case insensitively on the LHS, but preserving case on the RHS?
 
 It depends on what you mean by "preserving case".  The following
@@ -197,10 +218,10 @@ See L<perllocale>.
 =head2 How can I match a locale-smart version of C</[a-zA-Z]/>?
 
 One alphabetic character would be C</[^\W\d_]/>, no matter what locale
-you're in.  Non-alphabetics would be C</[\W\d_]/> (assuming you don't 
+you're in.  Non-alphabetics would be C</[\W\d_]/> (assuming you don't
 consider an underscore a letter).
 
-=head2 How can I quote a variable to use in a regexp?  
+=head2 How can I quote a variable to use in a regexp?
 
 The Perl parser will expand $variable and @variable references in
 regular expressions unless the delimiter is a single quote.  Remember,
@@ -308,10 +329,10 @@ Use the split function:
 	foreach $word ( split ) { 
 	    # do something with $word here
 	} 
-    } 
+    }
 
-Note that this isn't really a word in the English sense; it's just 
-chunks of consecutive non-whitespace characters.  
+Note that this isn't really a word in the English sense; it's just
+chunks of consecutive non-whitespace characters.
 
 To work with only alphanumeric sequences, you might consider
 
@@ -324,25 +345,25 @@ To work with only alphanumeric sequences, you might consider
 =head2 How can I print out a word-frequency or line-frequency summary?
 
 To do this, you have to parse out each word in the input stream.  We'll
-pretend that by word you mean chunk of alphabetics, hyphens, or 
-apostrophes, rather than the non-whitespace chunk idea of a word given 
+pretend that by word you mean a chunk of alphabetics, hyphens, or
+apostrophes, rather than the non-whitespace chunk idea of a word given
 in the previous question:
 
     while (<>) {
 	while ( /(\b[^\W_\d][\w'-]+\b)/g ) {   # misses "`sheep'"
 	    $seen{$1}++;
-	} 
-    } 
+	}
+    }
     while ( ($word, $count) = each %seen ) {
 	print "$count $word\n";
-    } 
+    }
 
 If you wanted to do the same thing for lines, you wouldn't need a
 regular expression:
 
     while (<>) { 
 	$seen{$_}++;
-    } 
+    }
     while ( ($line, $count) = each %seen ) {
 	print "$count $line";
     }
@@ -475,7 +496,7 @@ Of course, that could have been written as
     while (<>) {
       chomp;
       PARSER: {
-	   if ( /\G( \d+\b    )/gx  { 
+	   if ( /\G( \d+\b    )/gx ) {
 		print "number: $1\n";
 		redo PARSER;
 	   }
@@ -518,7 +539,7 @@ side-effects, and side-effects can be mystifying.  There's no void
 grep() that's not better written as a C<for> (well, C<foreach>,
 technically) loop.
 
-=head2 How can I match strings with multi-byte characters?
+=head2 How can I match strings with multibyte characters?
 
 This is hard, and there's no good way.  Perl does not directly support
 wide characters.  It pretends that a byte and a character are
@@ -526,23 +547,24 @@ synonymous.  The following set of approaches was offered by Jeffrey
 Friedl, whose article in issue #5 of The Perl Journal talks about this
 very matter.
 
-Let's suppose you have some weird Martian encoding where pairs of ASCII
-uppercase letters encode single Martian letters (i.e. the two bytes
-"CV" make a single Martian letter, as do the two bytes "SG", "VS",
-"XX", etc.). Other bytes represent single characters, just like ASCII.
+Let's suppose you have some weird Martian encoding where pairs of
+ASCII uppercase letters encode single Martian letters (e.g. the two
+bytes "CV" make a single Martian letter, as do the two bytes "SG",
+"VS", "XX", etc.). Other bytes represent single characters, just like
+ASCII.
 
-So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine
-characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.
+So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the
+nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.
 
 Now, say you want to search for the single character C</GX/>. Perl
-doesn't know about Martian, so it'll find the two bytes "GX" in the
-"I am CVSGXX!"  string, even though that character isn't there: it just
-looks like it is because "SG" is next to "XX", but there's no real "GX".
-This is a big problem.
+doesn't know about Martian, so it'll find the two bytes "GX" in the "I
+am CVSGXX!"  string, even though that character isn't there: it just
+looks like it is because "SG" is next to "XX", but there's no real
+"GX".  This is a big problem.
 
 Here are a few ways, all painful, to deal with it:
 
-   $martian =~ s/([A-Z][A-Z])/ $1 /g; # Make sure adjacent ``maritan'' bytes
+   $martian =~ s/([A-Z][A-Z])/ $1 /g; # Make sure adjacent ``martian'' bytes
                                       # are no longer adjacent.
    print "found GX!\n" if $martian =~ /GX/;
 
@@ -558,7 +580,7 @@ Or like this:
 Or like this:
 
    while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) {  # \G probably unneeded
-       print "found GX!\n", last if $1 eq 'GX';	
+       print "found GX!\n", last if $1 eq 'GX';
    }
 
 Or like this:
@@ -566,7 +588,7 @@ Or like this:
    die "sorry, Perl doesn't (yet) have Martian support )-:\n";
 
 In addition, a sample program which converts half-width to full-width
-katakana (in Shift-JIS or EUC encoding) is available from CPAN as 
+katakana (in Shift-JIS or EUC encoding) is available from CPAN as
 
 =for Tom make it so
 
@@ -578,3 +600,4 @@ all mixed.
 
 Copyright (c) 1997 Tom Christiansen and Nathan Torkington.
 All rights reserved.  See L<perlfaq> for distribution information.
+