=back
-Perl also has environment variables that control how Perl handles
-language-specific data. Please consult L<perllocale>.
+Perl also has environment variables that control how Perl handles data
+specific to particular natural languages. See L<perllocale>.
Apart from these, Perl uses no other environment variables, except
to make them available to the script being executed, and to child
before the line is processed, so a single list expression could produce
multiple list elements. The expressions may be spread out to more than
one line if enclosed in braces. If so, the opening brace must be the first
-token on the first line.
+token on the first line. If an expression evaluates to a number with a
+decimal part, and if the corresponding picture specifies that the decimal
+part should appear in the output (that is, any picture except multiple "#"
+characters B<without> an embedded "."), the character used for the decimal
+point is B<always> determined by the current LC_NUMERIC locale. For
+example, if the run-time environment specifies a German locale, "," is
+used instead of the default ".". See L<perllocale> and L<"WARNINGS">
+for more information.
Picture fields that begin with ^ rather than @ are treated specially.
With a # field, the field is blanked out if the value is undefined. For
END
print $string;
-=head1 WARNING
+=head1 WARNINGS
Lexical variables (declared with "my") are not visible within a
format unless the format is declared within the scope of the lexical
variable. (They weren't visible at all before version 5.001.) Furthermore,
lexical aliases will not be compiled correctly: see
L<perlfunc/my> for other issues.
+
+Formats are the only part of Perl that unconditionally uses information
+from a program's locale; if a program's environment specifies an
+LC_NUMERIC locale, it is always used to specify the decimal point
+character in formatted output. Perl ignores all other aspects of locale
+handling unless the C<use locale> pragma is in effect. Formatted output
+cannot be controlled by C<use locale> because the pragma is tied to the
+block structure of the program, and, for historical reasons, formats
+exist outside that block structure. See L<perllocale> for further
+discussion of locale handling.
Returns an lowercased version of EXPR. This is the internal function
implementing the \L escape in double-quoted strings.
-Should respect any POSIX setlocale() settings.
+Respects the current LC_CTYPE locale if C<use locale> is in force. See L<perllocale>.
If EXPR is omitted, uses $_.
Returns the value of EXPR with the first character lowercased. This is
the internal function implementing the \l escape in double-quoted strings.
-Should respect any POSIX setlocale() settings.
+Respects the current LC_CTYPE locale if C<use locale> is in force. See L<perllocale>.
If EXPR is omitted, uses $_.
=item printf FORMAT, LIST
-Equivalent to a "print FILEHANDLE sprintf(FORMAT, LIST)". The first argument
-of the list will be interpreted as the printf format.
+Equivalent to C<print FILEHANDLE sprintf(FORMAT, LIST)>. The first argument
+of the list will be interpreted as the printf format. If C<use locale> is
+in effect, the character used for the decimal point in formatted real numbers
+is affected by the LC_NUMERIC locale. See L<perllocale>.
=item prototype FUNCTION
=item quotemeta
-Returns the value of EXPR with with all regular expression
-metacharacters backslashed. This is the internal function implementing
+Returns the value of EXPR with all non-alphanumeric
+characters backslashed. (That is, all characters not matching
+C</[A-Za-z_0-9]/> will be preceded by a backslash in the
+returned string, regardless of any locale settings.)
+This is the internal function implementing
the \Q escape in double-quoted strings.
If EXPR is omitted, uses $_.
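The revised quotemeta description can be illustrated with a minimal sketch. The sample string and the variable names are assumptions chosen for illustration and are not part of the patch:

```perl
use strict;
use warnings;

# Illustrative input; any ASCII string would do.
my $raw    = 'a.b*c_1';
my $quoted = quotemeta($raw);

# '.' and '*' gain a backslash; letters, digits and '_' are left alone.
print "$quoted\n";    # a\.b\*c_1

# Interpolated into a pattern, the quoted string matches the original literally.
print "literal match\n" if $raw =~ /^$quoted$/;
```

Because the rule is defined in terms of C</[A-Za-z_0-9]/> rather than the locale, the result for this input should be the same whether or not C<use locale> is in effect.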
$b (see example below). They are passed by reference, so don't
modify $a and $b. And don't try to declare them as lexicals either.
+When C<use locale> is in effect, C<sort LIST> sorts LIST according to the
+current collation locale. See L<perllocale>.
+
Examples:
# sort lexically
language. See L<sprintf(3)> or L<printf(3)> on your system for details.
(The * character for an indirectly specified length is not
supported, but you can get the same effect by interpolating a variable
-into the pattern.) Some C libraries' implementations of sprintf() can
+into the pattern.) If C<use locale> is
+in effect, the character used for the decimal point in formatted real numbers
+is affected by the LC_NUMERIC locale. See L<perllocale>.
+Some C libraries' implementations of sprintf() can
dump core when fed ludicrous arguments.
=item sqrt EXPR
Returns an uppercased version of EXPR. This is the internal function
implementing the \U escape in double-quoted strings.
-Should respect any POSIX setlocale() settings.
+Respects the current LC_CTYPE locale if C<use locale> is in force. See L<perllocale>.
If EXPR is omitted, uses $_.
Returns the value of EXPR with the first character uppercased. This is
the internal function implementing the \u escape in double-quoted strings.
-Should respect any POSIX setlocale() settings.
+Respects the current LC_CTYPE locale if C<use locale> is in force. See L<perllocale>.
If EXPR is omitted, uses $_.
Binary "cmp" returns -1, 0, or 1 depending on whether the left argument is stringwise
less than, equal to, or greater than the right argument.
+"lt", "le", "ge", "gt" and "cmp" use the collation (sort) order specified
+by the current locale if C<use locale> is in effect. See L<perllocale>.
+
=head2 Bitwise And
Binary "&" returns its operators ANDed together bit by bit.
\E end case modification
\Q quote regexp metacharacters till \E
+If C<use locale> is in effect, the case map used by C<\l>, C<\L>, C<\u>
+and C<\U> is taken from the current locale. See L<perllocale>.
+
Patterns are subject to an additional level of interpretation as a
regular expression. This is done as a second pass, after variables are
interpolated, so that regular expressions may be incorporated into the
C<=~> need not be an lvalue--it may be the result of an expression
evaluation, but remember the C<=~> binds rather tightly.) See also
L<perlre>.
+See L<perllocale> for discussion of additional considerations which apply
+when C<use locale> is in effect.
Options are:
the variable is interpolated, use the C</o> option. If the pattern
evaluates to a null string, the last successfully executed regular
expression is used instead. See L<perlre> for further explanation on these.
+See L<perllocale> for discussion of additional considerations which apply
+when C<use locale> is in effect.
Options are:
Do case-insensitive pattern matching.
+If C<use locale> is in effect, the case map is taken from the current
+locale. See L<perllocale>.
+
=item m
Treat string as multiple lines. That is, change "^" and "$" from matching
\E end case modification (think vi)
\Q quote regexp metacharacters till \E
+If C<use locale> is in effect, the case map used by C<\l>, C<\L>, C<\u>
+and C<\U> is taken from the current locale. See L<perllocale>.
+
In addition, Perl defines the following:
\w Match a "word" character (alphanumeric plus "_")
\D Match a non-digit character
Note that C<\w> matches a single alphanumeric character, not a whole
-word. To match a word you'd need to say C<\w+>. You may use C<\w>,
-C<\W>, C<\s>, C<\S>, C<\d>, and C<\D> within character classes (though not
-as either end of a range).
+word. To match a word you'd need to say C<\w+>. If C<use locale> is in
+effect, the set of alphabetic characters matched by C<\w> is taken
+from the current locale. See L<perllocale>. You may use C<\w>, C<\W>,
+C<\s>, C<\S>, C<\d>, and C<\D> within character classes (though not as
+either end of a range).
Perl defines the following zero-width assertions:
You may not use data derived from outside your program to affect something
else outside your program--at least, not by accident. All command-line
-arguments, environment variables, and file input are marked as "tainted".
-Tainted data may not be used directly or indirectly in any command that
-invokes a sub-shell, nor in any command that modifies files, directories,
-or processes. Any variable set within an expression that has previously
-referenced a tainted value itself becomes tainted, even if it is logically
-impossible for the tainted value to influence the variable. Because
-taintedness is associated with each scalar value, some elements of an
-array can be tainted and others not.
+arguments, environment variables, locale information (see L<perllocale>),
+and file input are marked as "tainted". Tainted data may not be used
+directly or indirectly in any command that invokes a sub-shell, nor in any
+command that modifies files, directories, or processes. Any variable set
+within an expression that has previously referenced a tainted value itself
+becomes tainted, even if it is logically impossible for the tainted value
+to influence the variable. Because taintedness is associated with each
+scalar value, some elements of an array can be tainted and others not.
For example:
Perl presumes that if you reference a substring using $1, $2, etc., that
you knew what you were doing when you wrote the pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
-entire mechanism. It's better to verify that the variable has only
-good characters (for certain values of "good") rather than checking
-whether it has any bad characters. That's because it's far too easy to
-miss bad characters that you never thought of.
+entire mechanism. It's better to verify that the variable has only good
+characters (for certain values of "good") rather than checking whether it
+has any bad characters. That's because it's far too easy to miss bad
+characters that you never thought of.
Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.
+The example does not untaint $data if C<use locale> is in effect,
+because the characters matched by C<\w> are determined by the locale.
+Perl considers locale definitions untrustworthy because they
+contain data from outside the program. If you are writing a
+locale-aware program, and want to launder data with a regular expression
+containing C<\w>, put C<no locale> ahead of the expression in the same
+block. See L<perllocale/SECURITY> for further discussion and examples.
+
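The advice about C<no locale> might look like the following in a locale-aware program. This is a sketch only: the helper name C<launder> and the sample inputs are hypothetical, and the pattern is the "word characters, hyphen, at sign, dot" test discussed above.

```perl
use strict;
use warnings;
use locale;    # the program as a whole is locale-aware

# Hypothetical helper: returns the laundered value, or undef if $data
# contains anything other than word characters, '-', '@' or '.'.
sub launder {
    my ($data) = @_;
    no locale;    # within this block, \w is not taken from the locale
    return $data =~ /^([-\@\w.]+)$/ ? $1 : undef;
}

for my $input ('user@example.com', 'rm -rf /') {
    my $clean = launder($input);
    print defined $clean ? "ok: $clean\n" : "rejected: $input\n";
}
```

Placing C<no locale> inside the subroutine keeps the exemption as narrow as possible, so the rest of the program keeps its locale-aware behavior. Tainting itself only comes into play when the script runs under taint checks.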
=head2 Cleaning Up Your Path
For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a