=head1 NAME
-perlfaq4 - Data Manipulation ($Revision: 1.67 $, $Date: 2005/08/10 15:55:49 $)
+perlfaq4 - Data Manipulation ($Revision: 1.71 $, $Date: 2005/11/23 07:46:45 $)
=head1 DESCRIPTION
The localtime function returns the day of the year. Without an
argument localtime uses the current time.
- $day_of_year = (localtime)[7];
+ $day_of_year = (localtime)[7];
The POSIX module can also format a date as the day of the year or
week of the year.
=head2 How can I tell whether a certain element is contained in a list or array?
+(portions of this answer contributed by Anno Siegel)
+
Hearing the word "in" is an I<in>dication that you probably should have
used a hash, not a list or array, to store your data. Hashes are
designed to answer this question quickly and efficiently. Arrays aren't.
Now check whether C<vec($read,$n,1)> is true for some C<$n>.
-Please do not use
+These methods guarantee fast individual tests but require a re-organization
+of the original list or array. They only pay off if you have to test
+multiple values against the same array.
- ($is_there) = grep $_ eq $whatever, @array;
+If you are testing only once, the standard module List::Util exports
+the function C<first> for this purpose. It works by stopping once it
+finds the element. It's written in C for speed, and its Perl equivalant
+looks like this subroutine:
-or worse yet
+ sub first (&@) {
+ my $code = shift;
+ foreach (@_) {
+ return $_ if &{$code}();
+ }
+ undef;
+ }
- ($is_there) = grep /$whatever/, @array;
+If speed is of little concern, the common idiom uses grep in scalar context
+(which returns the number of items that passed its condition) to traverse the
+entire list. This does have the benefit of telling you how many matches it
+found, though.
-These are slow (checks every element even if the first matches),
-inefficient (same reason), and potentially buggy (what if there are
-regex characters in $whatever?). If you're only testing once, then
-use:
+ my $is_there = grep $_ eq $whatever, @array;
- $is_there = 0;
- foreach $elt (@array) {
- if ($elt eq $elt_to_find) {
- $is_there = 1;
- last;
- }
- }
- if ($is_there) { ... }
+If you want to actually extract the matching elements, simply use grep in
+list context.
+ my @matches = grep $_ eq $whatever, @array;
+
=head2 How do I compute the difference of two arrays? How do I compute the intersection of two arrays?
Use a hash. Here's code to do both and more. It assumes that
=head2 How do I sort a hash (optionally by value instead of key)?
-Internally, hashes are stored in a way that prevents you from imposing
-an order on key-value pairs. Instead, you have to sort a list of the
-keys or values:
+(contributed by brian d foy)
- @keys = sort keys %hash; # sorted by key
- @keys = sort {
- $hash{$a} cmp $hash{$b}
- } keys %hash; # and by value
+To sort a hash, start with the keys. In this example, we give the list of
+keys to the sort function which then compares them ASCIIbetically (which
+might be affected by your locale settings). The output list has the keys
+in ASCIIbetical order. Once we have the keys, we can go through them to
+create a report which lists the keys in ASCIIbetical order.
-Here we'll do a reverse numeric sort by value, and if two keys are
-identical, sort by length of key, or if that fails, by straight ASCII
-comparison of the keys (well, possibly modified by your locale--see
-L<perllocale>).
+ my @keys = sort { $a cmp $b } keys %hash;
+
+ foreach my $key ( @keys )
+ {
+ printf "%-20s %6d\n", $key, $hash{$value};
+ }
- @keys = sort {
- $hash{$b} <=> $hash{$a}
- ||
- length($b) <=> length($a)
- ||
- $a cmp $b
- } keys %hash;
+We could get more fancy in the C<sort()> block though. Instead of
+comparing the keys, we can compute a value with them and use that
+value as the comparison.
+
+For instance, to make our report order case-insensitive, we use
+the C<\L> sequence in a double-quoted string to make everything
+lowercase. The C<sort()> block then compares the lowercased
+values to determine in which order to put the keys.
+
+ my @keys = sort { "\L$a" cmp "\L$b" } keys %hash;
+
+Note: if the computation is expensive or the hash has many elements,
+you may want to look at the Schwartzian Transform to cache the
+computation results.
+
+If we want to sort by the hash value instead, we use the hash key
+to look it up. We still get out a list of keys, but this time they
+are ordered by their value.
+
+ my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash;
+
+From there we can get more complex. If the hash values are the same,
+we can provide a secondary sort on the hash key.
+
+ my @keys = sort {
+ $hash{$a} <=> $hash{$b}
+ or
+ "\L$a" cmp "\L$b"
+ } keys %hash;
=head2 How can I always keep my hash sorted?
=head2 How can I use a reference as a hash key?
-You can't do this directly, but you could use the standard Tie::RefHash
-module distributed with Perl.
+(contributed by brian d foy)
+
+Hash keys are strings, so you can't really use a reference as the key.
+When you try to do that, perl turns the reference into its stringified
+form (for instance, C<HASH(0xDEADBEEF)>). From there you can't get back
+the reference from the stringified form, at least without doing some
+extra work on your own. Also remember that hash keys must be unique, but
+two different variables can store the same reference (and those variables
+can change later).
+
+The Tie::RefHash module, which is distributed with perl, might be what
+you want. It handles that extra work.
=head1 Data: Misc