As a list operator:
for (101 .. 200) { print; } # print $_ 100 times
- @foo = @foo[$[ .. $#foo]; # an expensive no-op
+ @foo = @foo[0 .. $#foo]; # an expensive no-op
@foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items
The range operator (in a list context) makes use of the magical
If "/" is the delimiter then the initial C<m> is optional. With the C<m>
you can use any pair of non-alphanumeric, non-whitespace characters as
delimiters. This is particularly useful for matching Unix path names
-that contain "/", to avoid LTS (leaning toothpick syndrome).
+that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is
+the delimiter, then the match-only-once rule of C<?PATTERN?> applies.
PATTERN may contain variables, which will be interpolated (and the
pattern recompiled) every time the pattern search is evaluated. (Note
strings, as if there were parentheses around the whole pattern.
In a scalar context, C<m//g> iterates through the string, returning TRUE
-each time it matches, and FALSE when it eventually runs out of
-matches. (In other words, it remembers where it left off last time and
-restarts the search at that point. You can actually find the current
-match position of a string or set it using the pos() function--see
-L<perlfunc/pos>.) Note that you can use this feature to stack C<m//g>
-matches or intermix C<m//g> matches with C<m/\G.../g>. Note that
-the C<\G> zero-width assertion is not supported without the C</g>
-modifier; currently, without C</g>, C<\G> behaves just like C<\A>, but
-that's accidental and may change in the future.
-
-If you modify the string in any way, the match position is reset to the
-beginning. Examples:
+each time it matches, and FALSE when it eventually runs out of matches.
+(In other words, it remembers where it left off last time and restarts
+the search at that point. You can actually find the current match
+position of a string or set it using the pos() function; see
+L<perlfunc/pos>.) A failed match normally resets the search position to
+the beginning of the string, but you can avoid that by adding the C</c>
+modifier (e.g. C<m//gc>). Modifying the target string also resets the
+search position.
+
+You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
+zero-width assertion that matches the exact position where the previous
+C<m//g>, if any, left off. The C<\G> assertion is not supported without
+the C</g> modifier; currently, without C</g>, C<\G> behaves just like
+C<\A>, but that's accidental and may change in the future.
+
+Examples:
# list context
($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);
}
print "$sentences\n";
- # using m//g with \G
+ # using m//gc with \G
$_ = "ppooqppqq";
while ($i++ < 2) {
print "1: '";
- print $1 while /(o)/g; print "', pos=", pos, "\n";
+ print $1 while /(o)/gc; print "', pos=", pos, "\n";
print "2: '";
- print $1 if /\G(q)/g; print "', pos=", pos, "\n";
+ print $1 if /\G(q)/gc; print "', pos=", pos, "\n";
print "3: '";
- print $1 while /(p)/g; print "', pos=", pos, "\n";
+ print $1 while /(p)/gc; print "', pos=", pos, "\n";
}
The last example should print:
2: 'q', pos=8
3: '', pos=8
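The effect of C</c> on the search position can also be seen in isolation. The following is a minimal sketch, not part of the patch itself; the variable names (C<$str>, C<$after_plain>, C<$after_c>, C<$next>) are made up for illustration. It assumes only the behaviour described above: a failed C<m//g> resets pos(), while a failed C<m//gc> keeps it.

```perl
use strict;
use warnings;

my $str = "aab";
my $ok;

# Without /c: a failed m//g resets the search position.
$ok = $str =~ /a/g;              # matches at offset 0; pos($str) is now 1
$ok = $str =~ /z/g;              # fails; pos($str) is reset (undef)
my $after_plain = pos($str);

# With /c: a failed match leaves the search position alone.
$ok = $str =~ /a/g;              # starts over, matches at offset 0; pos is 1
$ok = $str =~ /z/gc;             # fails, but pos($str) stays at 1
my $after_c = pos($str);

# \G picks up exactly where the last successful m//g left off.
my $next = ($str =~ /\G(a)/gc) ? $1 : undef;

printf "after failed /g:  %s\n", defined($after_plain) ? $after_plain : 'undef';
printf "after failed /gc: %s\n", $after_c;
printf "next via \\G:      %s\n", defined($next) ? $next : 'undef';
```

Each match is assigned to a scalar so that C<m//g> is evaluated in scalar context, which is the context in which it iterates and maintains pos().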
-A useful idiom for C<lex>-like scanners is C</\G.../g>. You can
+A useful idiom for C<lex>-like scanners is C</\G.../gc>. You can
combine several regexps like this to process a string part-by-part,
-doing different actions depending on which regexp matched. The next
-regexp would step in at the place the previous one left off.
+doing different actions depending on which regexp matched. Each
+regexp tries to match where the previous one leaves off.
$_ = <<'EOL';
$url = new URI::URL "http://www/"; die if $url eq "xXx";
EOL
LOOP:
{
- print(" digits"), redo LOOP if /\G\d+\b[,.;]?\s*/g;
- print(" lowercase"), redo LOOP if /\G[a-z]+\b[,.;]?\s*/g;
- print(" UPPERCASE"), redo LOOP if /\G[A-Z]+\b[,.;]?\s*/g;
- print(" Capitalized"), redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/g;
- print(" MiXeD"), redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/g;
- print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/g;
- print(" line-noise"), redo LOOP if /\G[^A-Za-z0-9]+/g;
+ print(" digits"), redo LOOP if /\G\d+\b[,.;]?\s*/gc;
+ print(" lowercase"), redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
+ print(" UPPERCASE"), redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
+ print(" Capitalized"), redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
+ print(" MiXeD"), redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
+ print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
+ print(" line-noise"), redo LOOP if /\G[^A-Za-z0-9]+/gc;
print ". That's all!\n";
}
$today = qx{ date };
+How the string gets evaluated depends entirely on the command
+interpreter on your system. On most platforms, you will have to
+protect shell metacharacters if you want them treated literally.
+On some platforms (notably DOS-like ones), the shell may not be
+capable of dealing with multiline commands, so putting newlines in
+the string may not get you what you want. You may be able to evaluate
+multiple commands in a single line by separating them with the command
+separator character, if your shell supports that (e.g. C<;> on many Unix
+shells; C<&> on the Windows NT C<cmd> shell).
+
+Beware that some command shells may place restrictions on the length
+of the command line. You must ensure your strings don't exceed this
+limit after any necessary interpolations. See the platform-specific
+release notes for more details about your particular environment.
+
+Also realize that using this operator can easily lead to unportable
+programs.
+
See L<"I/O Operators"> for more discussion.
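As a rough illustration of protecting a shell metacharacter, the sketch below quotes an argument that contains C<;> so that the command interpreter passes it through literally. It is an assumption-laden example rather than a recommendation: it runs the current perl binary via C<$^X> purely to avoid depending on any other external command, it assumes that path contains no spaces, and it assumes the shell treats double quotes as a quoting mechanism (true of common Unix shells and of the Windows NT C<cmd> shell). The variable names are hypothetical.

```perl
use strict;
use warnings;

# $^X interpolates to the path of the running perl interpreter.
# The double quotes keep the shell from treating ";" as a command
# separator; without them a Unix shell would split the command in two.
my $output = qx{$^X -e "print q(a;b)"};

# $? holds the wait status of the command; the exit code is in the high byte.
my $exit_code = $? >> 8;

print "output:    $output\n";
print "exit code: $exit_code\n";
```

Quoting rules differ between shells, so a string that is safe on one platform may still need different protection on another.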
=item qw/STRING/
is equivalent to the following Perl-like pseudo code:
- unshift(@ARGV, '-') if $#ARGV < $[;
+ unshift(@ARGV, '-') unless @ARGV;
while ($ARGV = shift) {
open(ARGV, $ARGV);
while (<ARGV>) {