This is patch.2b1f to perl5.002beta1.
(Don't convert references into strings though, or you'll break their
referenceness.)

=head1 NAME

perlref - Perl references and nested data structures

=head1 DESCRIPTION

In Perl 4 it was difficult to represent complex data structures, because
all references had to be symbolic, and even that was difficult to do when
you wanted to refer to a variable rather than a symbol table entry. Perl
5 not only makes it easier to use symbolic references to variables, but
lets you have "hard" references to any piece of data. Any scalar may hold
a hard reference. Since arrays and hashes contain scalars, you can now
easily build arrays of arrays, arrays of hashes, hashes of arrays, arrays
of hashes of functions, and so on.

Hard references are smart--they keep track of reference counts for you,
automatically freeing the thing referred to when its reference count
goes to zero. If that thing happens to be an object, the object is
destructed. See L<perlobj> for more about objects. (In a sense,
everything in Perl is an object, but we usually reserve the word for
references to objects that have been officially "blessed" into a class package.)
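
As a rough sketch of what that means in practice, the following uses a
hypothetical class called Noisy (it is not part of Perl) whose DESTROY
method records when an object is freed. The names are made up for
illustration; the point is only that the object survives until the last
reference to it goes away:

    package Noisy;                  # hypothetical class, for illustration only
    sub new {
        my ($class, $name) = @_;
        return bless { name => $name }, $class;
    }
    sub DESTROY {
        my $self = shift;
        push @Noisy::log, "freed $self->{name}";
    }

    package main;
    my $first  = Noisy->new("thing");       # one reference to the object
    my $second = $first;                    # two references
    undef $first;                           # one left, so nothing is freed yet
    my $freed_early = scalar @Noisy::log;   # 0
    undef $second;                          # none left, so DESTROY runs now
    print "$_\n" for @Noisy::log;           # prints "freed thing"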

A symbolic reference contains the name of a variable, just as a
symbolic link in the filesystem merely contains the name of a file.
The C<*glob> notation is a kind of symbolic reference. Hard references
are more like hard links in the filesystem: merely another way
of getting at the same underlying object, irrespective of its name.

"Hard" references are easy to use in Perl. There is just one
overriding principle: Perl does no implicit referencing or
dereferencing. When a scalar is holding a reference, it always behaves
as a scalar. It doesn't magically start being an array or a hash
unless you tell it so explicitly by dereferencing it.
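
A small illustration of that principle, using made-up variable names:
the scalar that holds the reference stringifies like any scalar, and you
only get at the array by dereferencing it yourself.

    my @colors   = ("red", "green", "blue");
    my $arrayref = \@colors;

    my $as_scalar = "$arrayref";        # a string that looks like "ARRAY(0x...)"
    my $count     = scalar @$arrayref;  # 3, because we dereferenced explicitly

    print "scalar view: $as_scalar\n";
    print "element count: $count\n";    # prints "element count: 3"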

References can be constructed in several ways.

=over 4

=item 1.

By using the backslash operator on a variable, subroutine, or value.
(This works much like the & (address-of) operator in C.) Note
that this typically creates I<ANOTHER> reference to a variable, since
there's already a reference to the variable in the symbol table. But
the symbol table reference might go away, and you'll still have the
reference that the backslash returned. Here are some examples:

    $scalarref = \$foo;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;

=item 2.

A reference to an anonymous array can be constructed using square
brackets:

    $arrayref = [1, 2, ['a', 'b', 'c']];

Here we've constructed a reference to an anonymous array of three elements
whose final element is itself a reference to another anonymous array of three
elements. (The multidimensional syntax described later can be used to
access this. For example, after the above, $arrayref->[2][1] would have
the value "b".)

=item 3.

A reference to an anonymous hash can be constructed using curly
brackets:

    $hashref = {
        'Adam'  => 'Eve',
        'Clyde' => 'Bonnie',
    };

Anonymous hash and array constructors can be intermixed freely to
produce as complicated a structure as you want. The multidimensional
syntax described below works for these too. The values above are
literals, but variables and expressions would work just as well, because
assignment operators in Perl (even within local() or my()) are executable
statements, not compile-time declarations.
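
For instance, here is a sketch of a mixed structure whose values are
computed at run time. The field names (name, created, langs, sizes) are
invented for the example:

    my @langs = ("perl", "c");
    my $now   = time;

    my $record = {
        name    => "example",                           # a literal
        created => $now,                                # a variable
        langs   => [ @langs, "awk" ],                   # a list expression
        sizes   => { map { $_ => length($_) } @langs }, # a computed hash
    };

    print $record->{langs}[2], "\n";        # prints "awk"
    print $record->{sizes}{perl}, "\n";     # prints 4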

Because curly brackets (braces) are used for several other things
including BLOCKs, you may occasionally have to disambiguate braces at the
beginning of a statement by putting a C<+> or a C<return> in front so
that Perl realizes the opening brace isn't starting a BLOCK. The economy and
mnemonic value of using curlies is deemed worth this occasional extra
hassle.

For example, if you wanted a function to make a new hash and return a
reference to it, you have these options:

    sub hashem {        { @_ } }   # silently wrong
    sub hashem {       +{ @_ } }   # ok
    sub hashem { return { @_ } }   # ok

=item 4.

A reference to an anonymous subroutine can be constructed by using
C<sub> without a subname:

    $coderef = sub { print "Boink!\n" };

Note the presence of the semicolon. Except for the fact that the code
inside isn't executed immediately, a C<sub {}> is not so much a
declaration as it is an operator, like C<do{}> or C<eval{}>. (However, no
matter how many times you execute that line (unless you're in an
C<eval("...")>), C<$coderef> will still have a reference to the I<SAME>
anonymous subroutine.)

Anonymous subroutines act as closures with respect to my() variables,
that is, variables visible lexically within the current scope. Closure
is a notion out of the Lisp world that says if you define an anonymous
function in a particular lexical context, it pretends to run in that
context even when it's called outside of the context.

In human terms, it's a funny way of passing arguments to a subroutine when
you define it as well as when you call it. It's useful for setting up
little bits of code to run later, such as callbacks. You can even
do object-oriented stuff with it, though Perl provides a different
mechanism to do that already--see L<perlobj>.

You can also think of closure as a way to write a subroutine template without
using eval. (In fact, in version 5.000, eval was the I<only> way to get
closures. You may wish to use "require 5.001" if you use closures.)
129
130Here's a small example of how closures works:
131
132 sub newprint {
133 my $x = shift;
134 return sub { my $y = shift; print "$x, $y!\n"; };
a0d0e21e 135 }
748a9306 136 $h = newprint("Howdy");
137 $g = newprint("Greetings");
138
139 # Time passes...
140
141 &$h("world");
142 &$g("earthlings");
a0d0e21e 143
748a9306 144This prints
145
146 Howdy, world!
147 Greetings, earthlings!
148
149Note particularly that $x continues to refer to the value passed into
150newprint() *despite* the fact that the "my $x" has seemingly gone out of
151scope by the time the anonymous subroutine runs. That's what closure
152is all about.
153
154This only applies to lexical variables, by the way. Dynamic variables
155continue to work as they have always worked. Closure is not something
156that most Perl programmers need trouble themselves about to begin with.
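
To make the lexical/dynamic distinction concrete, here is a sketch with
hypothetical names (make_reporter, $lexical, $dynamic), assuming a perl
recent enough to have C<our>. The my() variable is captured by the
closure; the local() value is gone again by the time the closure runs:

    our $dynamic = "global";                # package variable

    sub make_reporter {
        my $lexical = shift;                # captured by the closure
        local $dynamic = "localized";       # restored when make_reporter returns
        return sub { return "$lexical / $dynamic" };
    }

    my $report = make_reporter("captured");
    my $result = &$report();
    print "$result\n";                      # prints "captured / global"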

=item 5.

References are often returned by special subroutines called constructors.
Perl objects are just references to a special kind of object that happens to know
which package it's associated with. Constructors are just special
subroutines that know how to create that association. They do so by
starting with an ordinary reference, and it remains an ordinary reference
even while it's also being an object. Constructors are customarily
named new(), but don't have to be:

    $objref = new Doggie (Tail => 'short', Ears => 'long');
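
The Doggie class itself isn't shown here, so the following is only a
guess at what a minimal constructor for it might look like. It starts
with an ordinary hash reference and uses bless() to create the
association with the package:

    package Doggie;

    sub new {
        my $class = shift;
        my $self  = { @_ };             # an ordinary reference to a hash
        return bless $self, $class;     # now it also knows its package
    }

    package main;

    my $objref = Doggie->new(Tail => 'short', Ears => 'long');
    print ref($objref), "\n";           # prints "Doggie"
    print $objref->{Tail}, "\n";        # prints "short"; still a hash reference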

=item 6.

References of the appropriate type can spring into existence if you
dereference them in a context that assumes they exist. Since we haven't
talked about dereferencing yet, we can't show you any examples yet.

=back

That's it for creating references. By now you're probably dying to
know how to use references to get back to your long-lost data. There
are several basic methods.

=over 4

=item 1.

Anywhere you'd put an identifier as part of a variable or subroutine
name, you can replace the identifier with a simple scalar variable
containing a reference of the correct type:

    $bar = $$scalarref;
    push(@$arrayref, $filename);
    $$arrayref[0] = "January";
    $$hashref{"KEY"} = "VALUE";
    &$coderef(1,2,3);

It's important to understand that we are specifically I<NOT> dereferencing
C<$arrayref[0]> or C<$hashref{"KEY"}> there. The dereference of the
scalar variable happens I<BEFORE> it does any key lookups. Anything more
complicated than a simple scalar variable must use methods 2 or 3 below.
However, a "simple scalar" includes an identifier that itself uses method
1 recursively. Therefore, the following prints "howdy".

    $refrefref = \\\"howdy";
    print $$$$refrefref;

=item 2.

Anywhere you'd put an identifier as part of a variable or subroutine
name, you can replace the identifier with a BLOCK returning a reference
of the correct type. In other words, the previous examples could be
written like this:

    $bar = ${$scalarref};
    push(@{$arrayref}, $filename);
    ${$arrayref}[0] = "January";
    ${$hashref}{"KEY"} = "VALUE";
    &{$coderef}(1,2,3);

Admittedly, it's a little silly to use the curlies in this case, but
the BLOCK can contain any arbitrary expression, in particular,
subscripted expressions:

    &{ $dispatch{$index} }(1,2,3);      # call correct routine
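
The %dispatch hash isn't defined above, so here is one way it might be
set up. The keys and the little subroutines are invented for the example:

    my %dispatch = (                    # hypothetical table of operations
        add => sub { return $_[0] + $_[1] },
        mul => sub { return $_[0] * $_[1] },
    );

    my $index  = "mul";
    my $answer = &{ $dispatch{$index} }(6, 7);  # call correct routine
    print "$answer\n";                          # prints 42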

Because the curlies can be omitted for the simple case of C<$$x>,
people often make the mistake of viewing the dereferencing symbols as
proper operators, and wonder about their precedence. If they were,
though, you could use parens instead of braces. That's not the case.
Consider the difference below; case 0 is a short-hand version of case 1,
I<NOT> case 2:

    $$hashref{"KEY"}   = "VALUE";       # CASE 0
    ${$hashref}{"KEY"} = "VALUE";       # CASE 1
    ${$hashref{"KEY"}} = "VALUE";       # CASE 2
    ${$hashref->{"KEY"}} = "VALUE";     # CASE 3

Case 2 is also deceptive in that you're accessing a variable
called %hashref, not dereferencing through $hashref to the hash
it's presumably referencing. That would be case 3.
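
If that's hard to picture, this sketch sets up both a scalar $hashref and
an unrelated hash %hashref (the contents are made up) and shows that
cases 1 and 2 store into different places:

    my $target;
    my %hashref = (KEY => \$target);    # a hash that happens to be named %hashref
    my $hashref = { KEY => "old" };     # an unrelated scalar holding a hash reference

    ${$hashref}{"KEY"} = "via scalar";  # CASE 1: element of the hash $hashref points to
    ${$hashref{"KEY"}} = "via hash";    # CASE 2: the scalar that $hashref{"KEY"} points to

    print $hashref->{KEY}, "\n";        # prints "via scalar"
    print "$target\n";                  # prints "via hash"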

=item 3.

The case of individual array elements arises often enough that it gets
cumbersome to use method 2. As a form of syntactic sugar, the two
element assignments shown under method 2 can be written:

    $arrayref->[0] = "January";
    $hashref->{"KEY"} = "VALUE";

The left side of the arrow can be any expression returning a reference,
including a previous dereference. Note that C<$array[$x]> is I<NOT> the
same thing as C<$array-E<gt>[$x]> here:

    $array[$x]->{"foo"}->[0] = "January";

This is one of the cases we mentioned earlier in which references could
spring into existence when in an lvalue context. Before this
statement, C<$array[$x]> may have been undefined. If so, it's
automatically defined with a hash reference so that we can look up
C<{"foo"}> in it. Likewise C<$array[$x]-E<gt>{"foo"}> will automatically get
defined with an array reference so that we can look up C<[0]> in it.
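
You can watch this happen with ref(). The sketch below assumes an empty
@array and an arbitrary index:

    my @array;
    my $x = 3;

    my $before = defined $array[$x] ? "defined" : "undefined";
    $array[$x]->{"foo"}->[0] = "January";   # creates the hash and the array

    print "before: $before\n";                      # prints "before: undefined"
    print "outer: ", ref($array[$x]), "\n";         # prints "outer: HASH"
    print "inner: ", ref($array[$x]->{"foo"}), "\n";    # prints "inner: ARRAY"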

One more thing here. The arrow is optional I<BETWEEN> bracket
subscripts, so you can shrink that statement down to

    $array[$x]{"foo"}[0] = "January";

Which, in the degenerate case of using only ordinary arrays, gives you
multidimensional arrays just like C's:

    $score[$x][$y][$z] += 42;

Well, okay, not entirely like C's arrays, actually. C doesn't know how
to grow its arrays on demand. Perl does.
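
Here is a quick sketch of that growth, with arbitrary indices. Nothing
is declared about the size of @score in advance:

    my @score;                          # starts out empty
    my ($x, $y, $z) = (2, 0, 5);

    $score[$x][$y][$z] += 42;           # intermediate arrays are created as needed

    print scalar(@score), "\n";             # prints 3 (indices 0..2)
    print scalar(@{ $score[2][0] }), "\n";  # prints 6 (indices 0..5)
    print $score[2][0][5], "\n";            # prints 42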

=item 4.

If a reference happens to be a reference to an object, then there are
probably methods to access the things referred to, and you should probably
stick to those methods unless you're in the class package that defines the
object's methods. In other words, be nice, and don't violate the object's
encapsulation without a very good reason. Perl does not enforce
encapsulation. We are not totalitarians here. We do expect some basic
civility though.

=back

The ref() operator may be used to determine what type of thing the
reference is pointing to. See L<perlfunc>.
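
As a short illustration, these are the strings ref() returns for a few
common kinds of unblessed reference (L<perlfunc> has the full story):

    my @kinds = (
        ref(\"text"),           # SCALAR
        ref([1, 2, 3]),         # ARRAY
        ref({ a => 1 }),        # HASH
        ref(sub { 1 }),         # CODE
        ref(\\"text"),          # REF (a reference to a reference)
        ref("plain string"),    # empty string: not a reference at all
    );
    print join(",", @kinds), "\n";  # prints "SCALAR,ARRAY,HASH,CODE,REF,"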

The bless() operator may be used to associate a reference with a package
functioning as an object class. See L<perlobj>.

A type glob may be dereferenced the same way a reference can, since
the dereference syntax always indicates the kind of reference desired.
So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.
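
A tiny check of that claim, assuming a package variable $foo declared
with C<our>:

    our $foo = "original";

    ${*foo} = "set through the glob";   # assign via the type glob
    my $seen_via_ref = ${\$foo};        # read via a hard reference

    print "$seen_via_ref\n";            # prints "set through the glob"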

Here's a trick for interpolating a subroutine call into a string:

    print "My sub returned ${\mysub(1,2,3)}\n";

The way it works is that when the C<${...}> is seen in the double-quoted
string, it's evaluated as a block. The block executes the call to
C<mysub(1,2,3)>, and then takes a reference to that. So the whole block
returns a reference to a scalar, which is then dereferenced by C<${...}>
and stuck into the double-quoted string.
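
To try it out you need some mysub() to call. The one below is just a
stand-in that adds its three arguments:

    sub mysub { return $_[0] + $_[1] + $_[2] }  # hypothetical stand-in

    my $message = "My sub returned ${\mysub(1,2,3)}\n";
    print $message;                     # prints "My sub returned 6"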

=head2 Symbolic references

We said that references spring into existence as necessary if they are
undefined, but we didn't say what happens if a value used as a
reference is already defined, but I<ISN'T> a hard reference. If you
use it as a reference in this case, it'll be treated as a symbolic
reference. That is, the value of the scalar is taken to be the I<NAME>
of a variable, rather than a direct link to a (possibly) anonymous
value.

People frequently expect it to work like this. So it does.

    $name = "foo";
    $$name = 1;                 # Sets $foo
    ${$name} = 2;               # Sets $foo
    ${$name x 2} = 3;           # Sets $foofoo
    $name->[0] = 4;             # Sets $foo[0]
    @$name = ();                # Clears @foo
    &$name();                   # Calls &foo() (as in Perl 4)
    $pack = "THAT";
    ${"${pack}::$name"} = 5;    # Sets $THAT::foo without eval

This is very powerful, and slightly dangerous, in that it's possible
to intend (with the utmost sincerity) to use a hard reference, and
accidentally use a symbolic reference instead. To protect against
that, you can say

    use strict 'refs';

and then only hard references will be allowed for the rest of the enclosing
block. An inner block may countermand that with

    no strict 'refs';
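
Here is a sketch of both pragmas at work, assuming a package variable
$foo declared with C<our>. The same assignment succeeds in one block and
is refused at run time in the other:

    our $foo;
    my $name  = "foo";
    my $error = "";

    {
        no strict 'refs';
        $$name = 1;                     # symbolic reference, sets $foo
    }
    {
        use strict 'refs';
        eval { $$name = 2 };            # refused; eval traps the error
        $error = $@;
    }
    print "foo is still $foo\n";        # prints "foo is still 1"
    print "error: $error";              # the message mentions "strict refs"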

Only package variables are visible to symbolic references. Lexical
variables (declared with my()) aren't in a symbol table, and thus are
invisible to this mechanism. For example:

    local($value) = 10;
    $ref = "value";             # symbolic: names the package variable $value
    {
        my $value = 20;
        print $$ref;
    }

This will still print 10, not 20. Remember that local() affects package
variables, which are all "global" to the package.

=head2 Not-so-symbolic references

A new feature contributing to readability in 5.001 is that the brackets
around a symbolic reference behave more like quotes, just as they
always have within a string. That is,

    $push = "pop on ";
    print "${push}over";

has always meant to print "pop on over", despite the fact that push is
a reserved word. This has been generalized to work the same outside
of quotes, so that

    print ${push} . "over";

and even

    print ${ push } . "over";

will have the same effect. (This would have been a syntax error in
5.000, though Perl 4 allowed it in the spaceless form.) Note that this
construct is I<not> considered to be a symbolic reference when you're
using strict refs:

    use strict 'refs';
    ${ bareword };      # Okay, means $bareword.
    ${ "bareword" };    # Error, symbolic reference.
384Similarly, because of all the subscripting that is done using single
385words, we've applied the same rule to any bareword that is used for
386subscripting a hash. So now, instead of writing
387
388 $array{ "aaa" }{ "bbb" }{ "ccc" }
389
390you can just write
391
392 $array{ aaa }{ bbb }{ ccc }
393
394and not worry about whether the subscripts are reserved words. In the
395rare event that you do wish to do something like
396
397 $array{ shift }
398
399you can force interpretation as a reserved word by adding anything that
400makes it more than a bareword:
401
402 $array{ shift() }
403 $array{ +shift }
404 $array{ shift @_ }
405
406The B<-w> switch will warn you if it interprets a reserved word as a string.
407But it will no longer warn you about using lowercase words, since the
408string is effectively quoted.
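
The difference is easy to see in a small test. In this sketch the hash
contents and the lookup() subroutine are invented; one subscript is the
bareword and the other is a real call to shift:

    my %array = (
        shift => "keyed by the word",
        first => "keyed by the argument",
    );

    sub lookup {
        my $literal = $array{ shift };      # bareword: the string "shift"
        my $called  = $array{ shift() };    # function call: first element of @_
        return ($literal, $called);
    }

    my ($literal, $called) = lookup("first");
    print "$literal\n";     # prints "keyed by the word"
    print "$called\n";      # prints "keyed by the argument"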

=head2 WARNING

You may not (usefully) use a reference as the key to a hash. It will be
converted into a string:

    $x{ \$a } = $a;

If you try to dereference the key, it won't do a hard dereference, and
you won't accomplish what you're attempting.
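
The sketch below shows the key losing its referenceness. It also shows
one possible workaround, which is our suggestion rather than anything
prescribed above: keep the reference as the I<value>, where it stays a
real reference.

    my $list = [1, 2, 3];
    my %x;
    $x{$list} = $list;              # key becomes a string; value stays a reference

    my ($key) = keys %x;            # a plain string that looks like "ARRAY(0x...)"
    my $kind  = ref($key);          # empty string: the key is not a reference
    my $first = $x{$key}->[0];      # the stored value can still be dereferenced

    print "ref() of key is '$kind'\n";  # prints "ref() of key is ''"
    print "first element is $first\n";  # prints "first element is 1"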
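
=head2 Further Reading

Besides the obvious documents, source code can be instructive.
Some rather pathological examples of the use of references can be found
in the F<t/op/ref.t> regression test in the Perl source directory.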
422Besides the obvious documents, source code can be instructive.
423Some rather pathological examples of the use of references can be found
424in the F<t/op/ref.t> regression test in the Perl source directory.