From: Jarkko Hietaniemi
Date: Mon, 11 Aug 2003 17:08:29 +0000 (+0000)
Subject: [PATCH] [@20616] perlreref.pod incorrectly describes \c
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=6d014f17427cba892ebb4fb2b45f28cc737c7c9e;p=p5sagit%2Fp5-mst-13.2.git

[PATCH] [@20616] perlreref.pod incorrectly describes \c
From: merlyn@stonehenge.com (Randal L. Schwartz)
Date: 11 Aug 2003 09:45:29 -0700
Message-ID: <86isp4kus6.fsf@blue.stonehenge.com>

Subject: [PATCH] perlreref.pod tweaks
From: Ronald J Kimball
Date: Mon, 11 Aug 2003 13:19:51 -0400
Message-ID: <20030811171951.GA332851@linguist.thayer.dartmouth.edu>

Plus a note about {,n} not being a quantifier.

p4raw-id: //depot/perl@20620
---

diff --git a/pod/perlreref.pod b/pod/perlreref.pod
index 08cd227..9f083c7 100644
--- a/pod/perlreref.pod
+++ b/pod/perlreref.pod
@@ -6,7 +6,7 @@ perlreref - Perl Regular Expressions Reference

 This is a quick reference to Perl's regular expressions.
 For full information see L and L, as well
-as the L section in this document.
+as the L section in this document.

 =head1 OPERATORS

@@ -15,6 +15,12 @@ as the L section in this document.

        $var =~ /foo/;

+  !~ determines to which variable the regex is applied,
+     and negates the result of the match; it returns
+     false if the match succeeds, and true if it fails.
+
+       $var !~ /foo/;
+
   m/pattern/igmsoxc searches a string for a pattern match,
      applying the given options.

@@ -24,9 +30,9 @@ as the L section in this document.
         s  match as a Single line - . matches \n
         o  compile pattern Once
         x  eXtended legibility - free whitespace and comments
-        c  don't reset pos on fails when using /g
+        c  don't reset pos on failed matches when using /g

-     If 'pattern' is an empty string, the last I match
+     If 'pattern' is an empty string, the last I matched
      regex is used. Delimiters other than '/' may be used for both this
      operator and the following ones.

@@ -44,11 +50,11 @@ as the L section in this document.
      as a double quoted string unless a single-quote (') is the delimiter.

   ?pattern? is like m/pattern/ but matches only once. No alternate
-      delimiters can be used. Must be reset with 'reset'.
+      delimiters can be used. Must be reset with L.

 =head1 SYNTAX

-  \       Escapes the character(s) immediately following it
+  \       Escapes the character immediately following it
   .       Matches any single character except a newline (unless /s is used)
   ^       Matches at the beginning of the string (or line, if /m is used)
   $       Matches at the end of the string (or line, if /m is used)
@@ -59,7 +65,7 @@ as the L section in this document.
   [...]   Matches any one of the characters contained within the brackets
   (...)   Groups subexpressions for capturing to $1, $2...
   (?:...) Groups subexpressions without capturing (cluster)
-  |       Matches either the expression preceding or following it
+  |       Matches either the subexpression preceding or following it
   \1, \2 ...  The text from the Nth group

 =head2 ESCAPE SEQUENCES
@@ -78,8 +84,8 @@ These work as in normal strings.
   \cx      Control-x
   \N{name} A named character

-  \l  Lowercase until next character
-  \u  Uppercase until next character
+  \l  Lowercase next character
+  \u  Uppercase next character
   \L  Lowercase until \E
   \U  Uppercase until \E
   \Q  Disable pattern metacharacters until \E
@@ -94,17 +100,17 @@ This one works differently from normal strings:
   [amy]    Match 'a', 'm' or 'y'
   [f-j]    Dash specifies "range"
   [f-j-]   Dash escaped or at start or end means 'dash'
-  [^f-j]   Caret indicates "match char any _except_ these"
+  [^f-j]   Caret indicates "match any character _except_ these"

 The following work within or without a character class:

   \d      A digit, same as [0-9]
   \D      A nondigit, same as [^0-9]
-  \w      A word character (alphanumeric), same as [a-zA-Z_0-9]
-  \W      A non-word character, [^a-zA-Z_0-9]
+  \w      A word character (alphanumeric), same as [a-zA-Z0-9_]
+  \W      A non-word character, [^a-zA-Z0-9_]
   \s      A whitespace character, same as [ \t\n\r\f]
   \S      A non-whitespace character, [^ \t\n\r\f]
-  \C      Match a byte (with Unicode. '.' matches char)
+  \C      Match a byte (with Unicode, '.' matches char)
   \pP     Match P-named (Unicode) property
   \p{...} Match Unicode property with long name
   \PP     Match non-P
@@ -142,34 +148,34 @@ All are zero-width assertions.
   ^  Match string start (or line, if /m is used)
   $  Match string end (or line, if /m is used) or before newline
   \b Match word boundary (between \w and \W)
-  \B Match except at word boundary
+  \B Match except at word boundary (between \w and \w or \W and \W)
   \A Match string start (regardless of /m)
-  \Z Match string end (preceding optional newline)
+  \Z Match string end (before optional newline)
   \z Match absolute string end
   \G Match where previous m//g left off
-  \c Suppresses resetting of search position when used with /g.
-     Without \c, search pattern is reset to the beginning of the string

 =head2 QUANTIFIERS

-Quantifiers are greedy by default --- match the B leftmost.
+Quantifiers are greedy by default -- match the B leftmost.

   Maximal Minimal Allowed range
   ------- ------- -------------
   {n,m}   {n,m}?  Must occur at least n times but no more than m times
   {n,}    {n,}?   Must occur at least n times
-  {n}     {n}?    Must match exactly n times
+  {n}     {n}?    Must occur exactly n times
   *       *?      0 or more times (same as {0,})
   +       +?      1 or more times (same as {1,})
   ?       ??      0 or 1 time (same as {0,1})

+There is no quantifier {,n} -- that gets understood as a literal string.
+
 =head2 EXTENDED CONSTRUCTS

   (?#text)         A comment
-  (?imxs-imsx:...) Enable/disable option (as per m//)
+  (?imxs-imsx:...) Enable/disable option (as per m// modifiers)
   (?=...)          Zero-width positive lookahead assertion
   (?!...)          Zero-width negative lookahead assertion
-  (?<...)          Zero-width positive lookbehind assertion
+  (?<=...)         Zero-width positive lookbehind assertion
   (?<!...)         Zero-width negative lookbehind assertion
   (?>...)          Grab what we can, prohibit backtracking
   (?{ code })      Embedded code, return value becomes $^R
@@ -195,17 +201,17 @@ See also L.
   $+    Last parenthesized pattern match
   $^N   Holds the most recently closed capture
   $^R   Holds the result of the last (?{...}) expr
-  @-    Offsets of starts of groups. [0] holds start of whole match
-  @+    Offsets of ends of groups. [0] holds end of whole match
+  @-    Offsets of starts of groups. $-[0] holds start of whole match
+  @+    Offsets of ends of groups. $+[0] holds end of whole match

-Capture groups are numbered according to their I paren.
+Captured groups are numbered according to their I paren.

 =head1 FUNCTIONS

   lc          Lowercase a string
   lcfirst     Lowercase first char of a string
   uc          Uppercase a string
-  ucfirst     Titlecase first char of a string
+  ucfirst     Uppercase first char of a string
   pos         Return or set current match position
   quotemeta   Quote metacharacters
   reset       Reset ?pattern? status
@@ -283,3 +289,5 @@ Jim Cromie, and Jeffrey Goff for useful advice.
+
+=cut
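
A few of the behaviours this patch documents can be checked with a short script. What follows is a minimal sketch and not part of the commit: the helper names (binding_results, case_escapes, match_offsets) and the sample strings are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply /foo/ to a string with both binding operators.
# =~ is true when the match succeeds; !~ negates that result.
sub binding_results {
    my ($var) = @_;
    my $matched     = ($var =~ /foo/) ? 1 : 0;
    my $not_matched = ($var !~ /foo/) ? 1 : 0;
    return ($matched, $not_matched);
}

# Hypothetical helper: \u and \l change only the next character,
# while \U and \L stay in effect until \E.
sub case_escapes {
    my ($word) = @_;
    return ("\u$word", "\l$word", "\U$word\E", "\L$word\E");
}

# Hypothetical helper: $-[0] and $+[0] hold the start and end offsets
# of the whole match. Returns an empty list if /foo/ does not match.
sub match_offsets {
    my ($text) = @_;
    return unless $text =~ /foo/;
    return ($-[0], $+[0]);
}

print join(",", binding_results("seafood")), "\n";   # 1,0
print join(",", case_escapes("perl")), "\n";         # Perl,perl,PERL,perl
print join(",", match_offsets("seafood")), "\n";     # 3,6
```

The {,n} note is deliberately left out of the sketch, since how that sequence is parsed depends on the perl version in use.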