=head2 The Arrow Operator
-"C<-E<gt>>" is an infix dereference operator, just as it is in C
+"C<< -> >>" is an infix dereference operator, just as it is in C
and C++. If the right side is either a C<[...]>, C<{...}>, or a
C<(...)> subscript, then the left side must be either a hard or
symbolic reference to an array, a hash, or a subroutine respectively.
a numeric context, you get a normal increment. If, however, the
variable has been used in only string contexts since it was set, and
has a value that is not the empty string and matches the pattern
-C</^[a-zA-Z]*[0-9]*$/>, the increment is done as a string, preserving each
+C</^[a-zA-Z]*[0-9]*\z/>, the increment is done as a string, preserving each
character within its range, with carry:
print ++($foo = '99'); # prints '100'
is returned. One effect of these rules is that C<-bareword> is equivalent
to C<"-bareword">.
-Unary "~" performs bitwise negation, i.e., 1's complement. For example,
-C<0666 &~ 027> is 0640. (See also L<Integer Arithmetic> and L<Bitwise
-String Operators>.)
+Unary "~" performs bitwise negation, i.e., 1's complement. For
+example, C<0666 & ~027> is 0640. (See also L<Integer Arithmetic> and
+L<Bitwise String Operators>.) Note that the width of the result is
+platform-dependent: ~0 is 32 bits wide on a 32-bit platform, but 64
+bits wide on a 64-bit platform, so if you are expecting a certain bit
+width, remember to use the & operator to mask off the excess bits.
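As a minimal sketch of the masking advice above (the variable names are illustrative and not part of the original text), the following keeps only the low 32 bits of C<~0>, so the result should not depend on how your perl was built:

```perl
use strict;
use warnings;

# ~0 is all ones in the native integer width (32 or 64 bits,
# depending on the build).
my $all_ones = ~0;

# Masking with & keeps only the low 32 bits, whatever the native width.
my $low32 = ~0 & 0xFFFFFFFF;

# The permission example from the text: clear the bits that are set in 027.
my $perm = 0666 & ~027;

printf "%u %u %o\n", $all_ones, $low32, $perm;
```

Only the first number printed varies between platforms; the masked value and the octal permission stay the same.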
Unary "+" has no effect whatsoever, even on strings. It is useful
syntactically for separating a function name from a parenthesized expression
of operation work on some other string. The right argument is a search
pattern, substitution, or transliteration. The left argument is what is
supposed to be searched, substituted, or transliterated instead of the default
-$_. The return value indicates the success of the operation. (If the
-right argument is an expression rather than a search pattern,
+$_. When used in scalar context, the return value generally indicates the
+success of the operation. Behavior in list context depends on the particular
+operator. See L</"Regexp Quote-Like Operators"> for details.
+
+If the right argument is an expression rather than a search pattern,
substitution, or transliteration, it is interpreted as a search pattern at run
-time. This can be is less efficient than an explicit search, because the
-pattern must be compiled every time the expression is evaluated).
+time. This can be less efficient than an explicit search, because the
+pattern must be compiled every time the expression is evaluated.
Binary "!~" is just like "=~" except the return value is negated in
the logical sense.
C<$a>. If C<$b> is negative, then C<$a % $b> is C<$a> minus the
smallest multiple of C<$b> that is not less than C<$a> (i.e. the
result will be less than or equal to zero).
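A short sketch of how the sign of the right operand decides the sign of the result when C<use integer> is not in scope (the variable names are illustrative only):

```perl
use strict;
use warnings;

# Right operand positive: the result lies in 0 .. $b-1.
my $pos_mod = -7 % 3;     # -7 minus the largest multiple of 3 not greater than -7 (-9)

# Right operand negative: the result lies in $b+1 .. 0.
my $neg_mod = 7 % -3;     # 7 minus the smallest multiple of -3 not less than 7 (9)

print "$pos_mod $neg_mod\n";
```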
-Note than when C<use integer> is in scope, "%" give you direct access
+Note that when C<use integer> is in scope, "%" gives you direct access
to the modulus operator as implemented by your C compiler. This
operator is not as well defined for negative operands, but it will
execute faster.
the number of bits specified by the right argument. Arguments should
be integers. (See also L<Integer Arithmetic>.)
+Note that both "<<" and ">>" in Perl are implemented directly using
+"<<" and ">>" in C. If C<use integer> (see L<Integer Arithmetic>) is
+in force then signed C integers are used, else unsigned C integers are
+used. Either way, the implementation isn't going to generate results
+larger than the size of the integer type Perl was built with (32 bits
+or 64 bits).
+
+The result of overflowing the range of the integers is undefined
+because it is undefined also in C. In other words, using 32-bit
+integers, C<< 1 << 32 >> is undefined. Shifting by a negative number
+of bits is also undefined.
+
=head2 Named Unary Operators
The various named unary operators are treated as functions with one
If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
-just like a normal function call. Examples:
+just like a normal function call. For example,
+because named unary operators are higher precedence than ||:
chdir $foo || die; # (chdir $foo) || die
chdir($foo) || die; # (chdir $foo) || die
chdir ($foo) || die; # (chdir $foo) || die
chdir +($foo) || die; # (chdir $foo) || die
-but, because * is higher precedence than ||:
+but, because * is higher precedence than named operators:
chdir $foo * 20; # chdir ($foo * 20)
chdir($foo) * 20; # (chdir $foo) * 20
=head2 Relational Operators
-Binary "E<lt>" returns true if the left argument is numerically less than
+Binary "<" returns true if the left argument is numerically less than
the right argument.
-Binary "E<gt>" returns true if the left argument is numerically greater
+Binary ">" returns true if the left argument is numerically greater
than the right argument.
-Binary "E<lt>=" returns true if the left argument is numerically less than
+Binary "<=" returns true if the left argument is numerically less than
or equal to the right argument.
-Binary "E<gt>=" returns true if the left argument is numerically greater
+Binary ">=" returns true if the left argument is numerically greater
than or equal to the right argument.
Binary "lt" returns true if the left argument is stringwise less than
Binary "!=" returns true if the left argument is numerically not equal
to the right argument.
-Binary "E<lt>=E<gt>" returns -1, 0, or 1 depending on whether the left
+Binary "<=>" returns -1, 0, or 1 depending on whether the left
argument is numerically less than, equal to, or greater than the right
-argument.
+argument. If your platform supports NaNs (not-a-numbers) as numeric
+values, using them with "<=>" returns undef. NaN is not "<", "==", ">",
+"<=" or ">=" anything (even NaN), so those 5 return false. NaN != NaN
+returns true, as does NaN != anything else. If your platform doesn't
+support NaNs then NaN is just a string with numeric value 0.
+
+ perl -le '$a = NaN; print "No NaN support here" if $a == $a'
+ perl -le '$a = NaN; print "NaN support here" if $a != $a'
Binary "eq" returns true if the left argument is stringwise equal to
the right argument.
Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.
-Binary "cmp" returns -1, 0, or 1 depending on whether the left argument is stringwise
-less than, equal to, or greater than the right argument.
+Binary "cmp" returns -1, 0, or 1 depending on whether the left
+argument is stringwise less than, equal to, or greater than the right
+argument.
"lt", "le", "ge", "gt" and "cmp" use the collation (sort) order specified
by the current locale if C<use locale> is in effect. See L<perllocale>.
unlink("alpha", "beta", "gamma")
|| (gripe(), next LINE);
-Use "or" for assignment is unlikely to do what you want; see below.
+Using "or" for assignment is unlikely to do what you want; see below.
=head2 Range Operators
In list context, it's just the list argument separator, and inserts
both its arguments into the list.
-The =E<gt> digraph is mostly just a synonym for the comma operator. It's useful for
+The => digraph is mostly just a synonym for the comma operator. It's useful for
documenting arguments that come in pairs. As of release 5.001, it also forces
any word to the left of it to be interpreted as a string.
Customary Generic Meaning Interpolates
'' q{} Literal no
"" qq{} Literal yes
- `` qx{} Command yes (unless '' is delimiter)
+ `` qx{} Command yes*
qw{} Word list no
- // m{} Pattern match yes (unless '' is delimiter)
- qr{} Pattern yes (unless '' is delimiter)
- s{}{} Substitution yes (unless '' is delimiter)
+ // m{} Pattern match yes*
+ qr{} Pattern yes*
+ s{}{} Substitution yes*
tr{}{} Transliteration no (but see below)
+ * unless the delimiter is ''.
+
Non-bracketing delimiters use the same character fore and aft, but the four
sorts of brackets (round, angle, square, curly) will all nest, which means
that
q{foo{bar}baz}
-
+
is the same as
'foo{bar}baz'
$s = q{ if($a eq "}") ... }; # WRONG
-is a syntax error. The C<Text::Balanced> module on CPAN is able to do this
-properly.
+is a syntax error. The C<Text::Balanced> module (from CPAN, and
+starting from Perl 5.8 part of the standard distribution) is able
+to do this properly.
There can be whitespace between the operator and the quoting
characters, except when C<#> is being used as the quoting character.
s {foo} # Replace foo
{bar} # with bar.
-For constructs that do interpolate, variables beginning with "C<$>"
-or "C<@>" are interpolated, as are the following escape sequences. Within
-a transliteration, the first eleven of these sequences may be used.
+The following escape sequences are available in constructs that interpolate
+and in transliterations.
\t tab (HT, TAB)
\n newline (NL)
\c[ control char (ESC)
\N{name} named char
+The following escape sequences are available in constructs that interpolate
+but not in transliterations.
+
\l lowercase next char
\u uppercase next char
\L lowercase till \E
printing C<"\n"> may emit no actual data. In general, use C<"\n"> when
you mean a "newline" for your system, but use the literal ASCII when you
need an exact character. For example, most networking protocols expect
-and prefer a CR+LF (C<"\012\015"> or C<"\cJ\cM">) for line terminators,
+and prefer a CR+LF (C<"\015\012"> or C<"\cM\cJ">) for line terminators,
and although they often accept just C<"\012">, they seldom tolerate just
C<"\015">. If you get in the habit of using C<"\n"> for networking,
you may be burned some day.
+For constructs that do interpolate, variables beginning with "C<$>"
+or "C<@>" are interpolated. Subscripted variables such as C<$a[3]> or
+C<< $href->{key}[0] >> are also interpolated, as are array and hash slices.
+But method calls such as C<< $obj->meth >> are not.
+
+Interpolating an array or slice interpolates the elements in order,
+separated by the value of C<$">, so is equivalent to interpolating
+C<join $", @array>. "Punctuation" arrays such as C<@+> are only
+interpolated if the name is enclosed in braces C<@{+}>.
+
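The interpolation rules above can be sketched as follows. The data and variable names are made up for illustration:

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my $href  = { key => ['first', 'second'] };

# Array elements are joined with $" (a single space by default).
my $default = "@array";

my $custom;
{
    local $" = '-';          # change the separator for this block only
    $custom = "@array";      # same as join($", @array)
}

# Subscripts and arrow chains on a variable are interpolated.
my $element = "$array[1] $href->{key}[0]";

# Slices are interpolated like arrays.
my $slice = "@array[0,1]";

print "$default|$custom|$element|$slice\n";
```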
You cannot include a literal C<$> or C<@> within a C<\Q> sequence.
An unescaped C<$> or C<@> interpolates the corresponding variable,
while escaping will cause the literal string C<\$> to be inserted.
reset if eof; # clear ?? status for next file
}
-This usage is vaguely depreciated, which means it just might possibly
+This usage is vaguely deprecated, which means it just might possibly
be removed in some distant future version of Perl, perhaps somewhere
around the year 2168.
PATTERN may contain variables, which will be interpolated (and the
pattern recompiled) every time the pattern search is evaluated, except
-for when the delimiter is a single quote. (Note that C<$)> and C<$|>
-might not be interpolated because they look like end-of-string tests.)
+for when the delimiter is a single quote. (Note that C<$(>, C<$)>, and
+C<$|> are not interpolated because they look like end-of-string tests.)
If you want such a pattern to be compiled only once, add a C</o> after
the trailing delimiter. This avoids expensive run-time recompilations,
and is useful when the value you are interpolating won't change over
the life of the script. However, mentioning C</o> constitutes a promise
that you won't change the variables in the pattern. If you change them,
-Perl won't even notice. See also L<qr//>.
+Perl won't even notice. See also L<"qr/STRING/imosx">.
If the PATTERN evaluates to the empty string, the last
I<successfully> matched regular expression is used instead.
You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
zero-width assertion that matches the exact position where the previous
-C<m//g>, if any, left off. The C<\G> assertion is not supported without
-the C</g> modifier. (Currently, without C</g>, C<\G> behaves just like
-C<\A>, but that's accidental and may change in the future.)
+C<m//g>, if any, left off. Without the C</g> modifier, the C<\G> assertion
+still anchors at pos(), but the match is of course only attempted once.
+Using C<\G> without C</g> on a target string that has not previously had a
+C</g> match applied to it is the same as using the C<\A> assertion to match
+the beginning of the string.
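A small sketch of C<\G> without C</g>, assuming a reasonably modern perl (5.6.0 or later); the strings and variable names are illustrative:

```perl
use strict;
use warnings;

my $str = "aXbX";

# A /g match in scalar context leaves pos() just after the first X.
$str =~ /X/g;
my $pos_after_g = pos($str);

# Without /g, \G still anchors at pos(), so this succeeds only because
# the character at that offset is "b".
my $anchored = ($str =~ /\Gb/) ? 1 : 0;

# On a string that has never had a /g match, \G behaves like \A,
# so a "b" that is not at the start cannot match.
my $fresh = "ab";
my $like_start = ($fresh =~ /\Gb/) ? 1 : 0;

print "$pos_after_g $anchored $like_start\n";
```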
Examples:
($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);
# scalar context
- $/ = ""; $* = 1; # $* deprecated in modern perls
+ $/ = "";
while (defined($paragraph = <>)) {
while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
$sentences++;
print "3: '";
print $1 while /(p)/gc; print "', pos=", pos, "\n";
}
+ print "Final: '$1', pos=",pos,"\n" if /\G(.)/;
The last example should print:
1: '', pos=7
2: 'q', pos=8
3: '', pos=8
+ Final: 'q', pos=8
+
+Notice that the final match matched C<q> instead of C<p>, which a match
+without the C<\G> anchor would have done. Also note that the final match
+did not update C<pos> -- C<pos> is only updated on a C</g> match. If the
+final match did indeed match C<p>, it's a good bet that you're running an
+older (pre-5.6.0) Perl.
A useful idiom for C<lex>-like scanners is C</\G.../gc>. You can
combine several regexps like this to process a string part-by-part,
=item qr/STRING/imosx
-This operators quotes--and compiles--its I<STRING> as a regular
+This operator quotes (and possibly compiles) its I<STRING> as a regular
expression. I<STRING> is interpolated the same way as I<PATTERN>
in C<m/PATTERN/>. If "'" is used as the delimiter, no interpolation
is done. Returns a Perl value which may be used instead of the
=item `STRING`
-A string which is (possibly) interpolated and then executed as a system
-command with C</bin/sh> or its equivalent. Shell wildcards, pipes,
-and redirections will be honored. The collected standard output of the
-command is returned; standard error is unaffected. In scalar context,
-it comes back as a single (potentially multi-line) string. In list
-context, returns a list of lines (however you've defined lines with $/
-or $INPUT_RECORD_SEPARATOR).
+A string which is (possibly) interpolated and then executed as a
+system command with C</bin/sh> or its equivalent. Shell wildcards,
+pipes, and redirections will be honored. The collected standard
+output of the command is returned; standard error is unaffected. In
+scalar context, it comes back as a single (potentially multi-line)
+string, or undef if the command failed. In list context, returns a
+list of lines (however you've defined lines with $/ or
+$INPUT_RECORD_SEPARATOR), or an empty list if the command failed.
Because backticks do not affect standard error, use shell file descriptor
syntax (assuming the shell supports this) if you care to address this.
separator character, if your shell supports that (e.g. C<;> on many Unix
shells; C<&> on the Windows NT C<cmd> shell).
+Beginning with v5.6.0, Perl will attempt to flush all files opened for
+output before starting the child process, but this may not be supported
+on some platforms (see L<perlport>). To be safe, you may need to set
+C<$|> ($AUTOFLUSH in English) or call the C<autoflush()> method of
+C<IO::Handle> on any open handles.
+
Beware that some command shells may place restrictions on the length
of the command line. You must ensure your strings don't exceed this
limit after any necessary interpolations. See the platform-specific
A common mistake is to try to separate the words with comma or to
put comments into a multi-line C<qw>-string. For this reason, the
-B<-w> switch (that is, the C<$^W> variable) produces warnings if
-the STRING contains the "," or the "#" character.
+C<use warnings> pragma and the B<-w> switch (that is, the C<$^W> variable)
+produce warnings if the STRING contains the "," or the "#" character.
=item s/PATTERN/REPLACEMENT/egimosx
text is not evaluated as a command. If the
PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
pair of quotes, which may or may not be bracketing quotes, e.g.,
-C<s(foo)(bar)> or C<sE<lt>fooE<gt>/bar/>. A C</e> will cause the
-replacement portion to be interpreted as a full-fledged Perl expression
-and eval()ed right then and there. It is, however, syntax checked at
-compile-time.
+C<s(foo)(bar)> or C<< s<foo>/bar/ >>. A C</e> will cause the
+replacement portion to be treated as a full-fledged Perl expression
+and evaluated right then and there. It is, however, syntax checked at
+compile-time. A second C<e> modifier will cause the replacement portion
+to be C<eval>ed before being run as a Perl expression.
Examples:
# symbolic dereferencing
s/\$(\w+)/${$1}/g;
- # /e's can even nest; this will expand
- # any embedded scalar variable (including lexicals) in $_
+ # Add one to the value of any numbers in the string
+ s/(\d+)/1 + $1/eg;
+
+ # This will expand any embedded scalar variable
+ # (including lexicals) in $_ : First $1 is interpolated
+ # to the variable name, and then evaluated
s/(\$\w+)/$1/eeg;
# Delete (most) C comments.
s/([^ ]*) *([^ ]*)/$2 $1/; # reverse 1st two fields
Note the use of $ instead of \ in the last example. Unlike
-B<sed>, we use the \E<lt>I<digit>E<gt> form in only the left hand side.
-Anywhere else it's $E<lt>I<digit>E<gt>.
+B<sed>, we use the \<I<digit>> form in only the left hand side.
+Anywhere else it's $<I<digit>>.
Occasionally, you can't use just a C</g> to get all the changes
to occur that you might want. Here are two common cases:
# expand tabs to 8-column spacing
1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
-=item tr/SEARCHLIST/REPLACEMENTLIST/cdsUC
+=item tr/SEARCHLIST/REPLACEMENTLIST/cds
-=item y/SEARCHLIST/REPLACEMENTLIST/cdsUC
+=item y/SEARCHLIST/REPLACEMENTLIST/cds
Transliterates all occurrences of the characters found in the search list
with the corresponding character in the replacement list. It returns
its own pair of quotes, which may or may not be bracketing quotes,
e.g., C<tr[A-Z][a-z]> or C<tr(+\-*/)/ABCD/>.
+Note that C<tr> does B<not> do regular expression character classes
+such as C<\d> or C<[:lower:]>. The C<tr> operator is not equivalent to
+the tr(1) utility. If you want to map strings between lower/upper
+cases, see L<perlfunc/lc> and L<perlfunc/uc>, and in general consider
+using the C<s> operator if you need regular expressions.
+
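To illustrate the division of labour suggested above, here is a hedged sketch (the sample string and variable names are invented) that uses C<tr> for a plain character range, C<lc> for case mapping, and C<s> where a regular expression class is needed:

```perl
use strict;
use warnings;

my $text = "Room 101, Floor 7";

# tr with an empty replacement list counts the characters in the range.
my $digit_count = ($text =~ tr/0-9//);

# Case mapping is better left to lc/uc than to tr ranges.
my $lower = lc $text;

# A regular expression class such as \d needs s///, not tr.
(my $masked = $text) =~ s/\d/#/g;

print "$digit_count|$lower|$masked\n";
```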
Note also that the whole range idea is rather unportable between
character sets--and even within character sets they may cause results
you probably didn't expect. A sound principle is to use only ranges
c Complement the SEARCHLIST.
d Delete found but unreplaced characters.
s Squash duplicate replaced characters.
- U Translate to/from UTF-8.
- C Translate to/from 8-bit char (octet).
If the C</c> modifier is specified, the SEARCHLIST character set
is complemented. If the C</d> modifier is specified, any characters
This latter is useful for counting characters in a class or for
squashing character sequences in a class.
-The first C</U> or C</C> modifier applies to the left side of the translation.
-The second one applies to the right side. If present, these modifiers override
-the current utf8 state.
-
Examples:
$ARGV[1] =~ tr/A-Z/a-z/; # canonicalize to lower case
tr [\200-\377]
[\000-\177]; # delete 8th bit
- tr/\0-\xFF//CU; # change Latin-1 to Unicode
- tr/\0-\x{FF}//UC; # change Unicode to Latin-1
-
If multiple transliterations are given for a character, only the
first one is used:
quoting constructs, Perl performs different numbers of passes, from
one to five, but these passes are always performed in the same order.
-=over
+=over 4
=item Finding the end
The first pass is finding the end of the quoted construct, whether
it be a multicharacter delimiter C<"\nEOF\n"> in the C<<<EOF>
construct, a C</> that terminates a C<qq//> construct, a C<]> which
-terminates C<qq[]> construct, or a C<E<gt>> which terminates a
-fileglob started with C<E<lt>>.
+terminates C<qq[]> construct, or a C<< > >> which terminates a
+fileglob started with C<< < >>.
When searching for single-character non-pairing delimiters, such
as C</>, combinations of C<\\> and C<\/> are skipped. However,
The next step is interpolation in the text obtained, which is now
delimiter-independent. There are four different cases.
-=over
+=over 4
=item C<<<'EOF'>, C<m''>, C<s'''>, C<tr///>, C<y///>
The only interpolation is removal of C<\> from pairs C<\\>.
-=item C<"">, C<``>, C<qq//>, C<qx//>, C<<file*globE<gt>>
+=item C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob> >>
C<\Q>, C<\U>, C<\u>, C<\L>, C<\l> (possibly paired with C<\E>) are
converted to corresponding Perl constructs. Thus, C<"$foo\Qbaz$bar">
may be closer to the conjectural I<intention> of the writer of C<"\Q\t\E">.
Interpolated scalars and arrays are converted internally to the C<join> and
-C<.> catentation operations. Thus, C<"$foo XXX '@arr'"> becomes:
+C<.> catenation operations. Thus, C<"$foo XXX '@arr'"> becomes:
$foo . " XXX '" . (join $", @arr) . "'";
Note also that the interpolation code needs to make a decision on
where the interpolated scalar ends. For instance, whether
-C<"a $b -E<gt> {c}"> really means:
+C<< "a $b -> {c}" >> really means:
"a " . $b . " -> {c}";
It is at this step that C<\1> is begrudgingly converted to C<$1> in
the replacement text of C<s///> to correct the incorrigible
I<sed> hackers who haven't picked up the saner idiom yet. A warning
-is emitted if the B<-w> command-line flag (that is, the C<$^W> variable)
-was set.
+is emitted if the C<use warnings> pragma or the B<-w> command-line flag
+(that is, the C<$^W> variable) was set.
The lack of processing of C<\\> creates specific restrictions on
the post-processed text. If the delimiter is C</>, one cannot get
It is possible to inspect both the string given to RE engine and the
resulting finite automaton. See the arguments C<debug>/C<debugcolor>
in the C<use L<re>> pragma, as well as Perl's B<-Dr> command-line
-switch documented in L<perlrun/Switches>.
+switch documented in L<perlrun/"Command Switches">.
=item Optimization of regular expressions
A string enclosed by backticks (grave accents) first undergoes
double-quote interpolation. It is then interpreted as an external
command, and the output of that command is the value of the
-pseudo-literal, j
-string consisting of all output is returned. In list context, a
-list of values is returned, one per line of output. (You can set
-C<$/> to use a different line terminator.) The command is executed
-each time the pseudo-literal is evaluated. The status value of the
-command is returned in C<$?> (see L<perlvar> for the interpretation
-of C<$?>). Unlike in B<csh>, no translation is done on the return
-data--newlines remain newlines. Unlike in any of the shells, single
-quotes do not hide variable names in the command from interpretation.
-To pass a literal dollar-sign through to the shell you need to hide
-it with a backslash. The generalized form of backticks is C<qx//>.
-(Because backticks always undergo shell expansion as well, see
-L<perlsec> for security concerns.)
+backtick string, like in a shell. In scalar context, a single string
+consisting of all output is returned. In list context, a list of
+values is returned, one per line of output. (You can set C<$/> to use
+a different line terminator.) The command is executed each time the
+pseudo-literal is evaluated. The status value of the command is
+returned in C<$?> (see L<perlvar> for the interpretation of C<$?>).
+Unlike in B<csh>, no translation is done on the return data--newlines
+remain newlines. Unlike in any of the shells, single quotes do not
+hide variable names in the command from interpretation. To pass a
+literal dollar-sign through to the shell you need to hide it with a
+backslash. The generalized form of backticks is C<qx//>. (Because
+backticks always undergo shell expansion as well, see L<perlsec> for
+security concerns.)
In scalar context, evaluating a filehandle in angle brackets yields
the next line from that file (the newline, if any, included), or
the value is automatically assigned to the global variable $_,
destroying whatever was there previously. (This may seem like an
odd thing to you, but you'll use the construct in almost every Perl
-script you write.) The $_ variables is not implicitly localized.
+script you write.) The $_ variable is not implicitly localized.
You'll have to put a C<local $_;> before the loop if you want that
to happen.
while (($_ = <STDIN>) ne '0') { ... }
while (<STDIN>) { last unless $_; ... }
-In other boolean contexts, C<E<lt>I<filehandle>E<gt>> without an
-explicit C<defined> test or comparison elicit a warning if the B<-w>
+In other boolean contexts, C<< <I<filehandle>> >> without an
+explicit C<defined> test or comparison elicits a warning if the
+C<use warnings> pragma or the B<-w>
command-line switch (the C<$^W> variable) is in effect.
The filehandles STDIN, STDOUT, and STDERR are predefined. (The
the open() function, amongst others. See L<perlopentut> and
L<perlfunc/open> for details on this.
-If a E<lt>FILEHANDLEE<gt> is used in a context that is looking for
+If a <FILEHANDLE> is used in a context that is looking for
a list, a list comprising all input lines is returned, one line per
list element. It's easy to grow to a rather large data space this
way, so use with care.
-E<lt>FILEHANDLEE<gt> may also be spelled C<readline(*FILEHANDLE)>.
+<FILEHANDLE> may also be spelled C<readline(*FILEHANDLE)>.
See L<perlfunc/readline>.
-The null filehandle E<lt>E<gt> is special: it can be used to emulate the
-behavior of B<sed> and B<awk>. Input from E<lt>E<gt> comes either from
+The null filehandle <> is special: it can be used to emulate the
+behavior of B<sed> and B<awk>. Input from <> comes either from
standard input, or from each file listed on the command line. Here's
-how it works: the first time E<lt>E<gt> is evaluated, the @ARGV array is
+how it works: the first time <> is evaluated, the @ARGV array is
checked, and if it is empty, C<$ARGV[0]> is set to "-", which when opened
gives you standard input. The @ARGV array is then processed as a list
of filenames. The loop
except that it isn't so cumbersome to say, and will actually work.
It really does shift the @ARGV array and put the current filename
into the $ARGV variable. It also uses filehandle I<ARGV>
-internally--E<lt>E<gt> is just a synonym for E<lt>ARGVE<gt>, which
+internally--<> is just a synonym for <ARGV>, which
is magical. (The pseudo code above doesn't work because it treats
-E<lt>ARGVE<gt> as non-magical.)
+<ARGV> as non-magical.)
-You can modify @ARGV before the first E<lt>E<gt> as long as the array ends up
+You can modify @ARGV before the first <> as long as the array ends up
containing the list of filenames you really want. Line numbers (C<$.>)
continue as though the input were one big happy file. See the example
in L<perlfunc/eof> for how to reset line numbers on each file.
# ... # code for each line
}
-The E<lt>E<gt> symbol will return C<undef> for end-of-file only once.
+The <> symbol will return C<undef> for end-of-file only once.
If you call it again after this, it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will read input from STDIN.
-If angle brackets contain is a simple scalar variable (e.g.,
-E<lt>$fooE<gt>), then that variable contains the name of the
+If what the angle brackets contain is a simple scalar variable (e.g.,
+<$foo>), then that variable contains the name of the
filehandle to input from, or its typeglob, or a reference to the
same. For example:
reference, it is interpreted as a filename pattern to be globbed, and
either a list of filenames or the next filename in the list is returned,
depending on context. This distinction is determined on syntactic
-grounds alone. That means C<E<lt>$xE<gt>> is always a readline() from
-an indirect handle, but C<E<lt>$hash{key}E<gt>> is always a glob().
+grounds alone. That means C<< <$x> >> is always a readline() from
+an indirect handle, but C<< <$hash{key}> >> is always a glob().
That's because $x is a simple scalar variable, but C<$hash{key}> is
not--it's a hash element.
One level of double-quote interpretation is done first, but you can't
-say C<E<lt>$fooE<gt>> because that's an indirect filehandle as explained
+say C<< <$foo> >> because that's an indirect filehandle as explained
in the previous paragraph. (In older versions of Perl, programmers
would insert curly brackets to force interpretation as a filename glob:
-C<E<lt>${foo}E<gt>>. These days, it's considered cleaner to call the
+C<< <${foo}> >>. These days, it's considered cleaner to call the
internal function directly as C<glob($foo)>, which is probably the right
way to have done it in the first place.) For example:
chmod 0644, $_;
}
-is equivalent to
+is roughly equivalent to:
open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
while (<FOO>) {
- chop;
+ chomp;
chmod 0644, $_;
}
-In fact, it's currently implemented that way, but this is expected
-to be made completely internal in the near future. (Which means
-it will not work on filenames with spaces in them unless you have
-csh(1) on your machine.) Of course, the shortest way to do the
-above is:
+except that the globbing is actually done internally using the standard
+C<File::Glob> extension. Of course, the shortest way to do the above is:
chmod 0644, <*.c>;
-Because globbing currently invokes a shell, it's often faster to
-call readdir() yourself and do your own grep() on the filenames.
-Furthermore, due to its current implementation of using a shell,
-the glob() routine may get "Arg list too long" errors (unless you've
-installed tcsh(1L) as F</bin/csh> or hacked your F<config.sh>).
-
A (file)glob evaluates its (embedded) argument only when it is
starting a new list. All values must be read before it will start
over. In list context, this isn't important because you automatically
get them all anyway. However, in scalar context the operator returns
-the next value each time it's called, or C
+the next value each time it's called, or C<undef> when the list has
run out. As with filehandle reads, an automatic C<defined> is
generated when the glob occurs in the test part of a C<while>,
because legal glob returns (e.g. a file called F<0>) would otherwise
because the latter will alternate between returning a filename and
returning false.
-It you're trying to do variable interpolation, it's definitely better
+If you're trying to do variable interpolation, it's definitely better
to use the glob() function, because the older notation can cause people
to become confused with the indirect filehandle notation.
or so.
Used on numbers, the bitwise operators ("&", "|", "^", "~", "<<",
-and ">>") always produce integral results. (But see also L<Bitwise
-String Operators>.) However, C<use integer> still has meaning for
+and ">>") always produce integral results. (But see also
+L<Bitwise String Operators>.) However, C<use integer> still has meaning for
them. By default, their results are interpreted as unsigned integers, but
if C<use integer> is in effect, their results are interpreted
as signed integers. For example, C<~0> usually evaluates to a large
The standard Math::BigInt and Math::BigFloat modules provide
variable-precision arithmetic and overloaded operators, although
-they're currently pretty slow. At the cost of some space and
+they're currently pretty slow. At the cost of some space and
considerable speed, they avoid the normal pitfalls associated with
limited-precision representations.
# prints +15241578780673678515622620750190521
-The non-standard modules SSLeay::BN and Math::Pari provide
-equivalent functionality (and much more) with a substantial
-performance savings.
+There are several modules that let you calculate with unlimited or fixed
+precision (bound only by memory and CPU time). There are also
+some non-standard modules that provide faster implementations via
+external C libraries.
+
+Here is a short, but incomplete summary:
+
+ Math::Fraction big, unlimited fractions like 9973 / 12967
+ Math::String treat string sequences like numbers
+ Math::FixedPrecision calculate with a fixed precision
+ Math::Currency for currency calculations
+ Bit::Vector manipulate bit vectors fast (uses C)
+ Math::BigIntFast Bit::Vector wrapper for big numbers
+ Math::Pari provides access to the Pari C library
+ Math::BigInteger uses an external C library
+ Math::Cephes uses external Cephes C library (no big numbers)
+ Math::Cephes::Fraction fractions via the Cephes library
+ Math::GMP another one using an external C library
+
+Choose wisely.
+
+=cut