Commit | Line | Data |
a0d0e21e |
1 | =head1 NAME |
d74e8afc |
2 | X<reference> X<pointer> X<data structure> X<structure> X<struct> |
a0d0e21e |
3 | |
4 | perlref - Perl references and nested data structures |
5 | |
a1e2a320 |
6 | =head1 NOTE |
7 | |
8 | This is complete documentation about all aspects of references. |
9 | For a shorter, tutorial introduction to just the essential features, |
10 | see L<perlreftut>. |
11 | |
a0d0e21e |
12 | =head1 DESCRIPTION |
13 | |
cb1a09d0 |
14 | Before release 5 of Perl it was difficult to represent complex data |
5a964f20 |
15 | structures, because all references had to be symbolic--and even then |
16 | it was difficult to refer to a variable instead of a symbol table entry. |
17 | Perl now not only makes it easier to use symbolic references to variables, |
18 | but also lets you have "hard" references to any piece of data or code. |
19 | Any scalar may hold a hard reference. Because arrays and hashes contain |
20 | scalars, you can now easily build arrays of arrays, arrays of hashes, |
21 | hashes of arrays, arrays of hashes of functions, and so on. |
a0d0e21e |
22 | |
23 | Hard references are smart--they keep track of reference counts for you, |
2d24ed35 |
24 | automatically freeing the thing referred to when its reference count goes |
7c2ea1c7 |
25 | to zero. (Reference counts for values in self-referential or |
2d24ed35 |
26 | cyclic data structures may not go to zero without a little help; see |
7b8d334a |
27 | L<perlobj/"Two-Phased Garbage Collection"> for a detailed explanation.) |
2d24ed35 |
28 | If that thing happens to be an object, the object is destructed. See |
29 | L<perlobj> for more about objects. (In a sense, everything in Perl is an |
30 | object, but we usually reserve the word for references to objects that |
31 | have been officially "blessed" into a class package.) |
32 | |
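The reference-counting behaviour described above can be watched directly. The following is a minimal sketch, not taken from the Perl sources: the C<Tracker> class and its C<@log> array are hypothetical names invented for this illustration. It shows that the referent is destroyed only when the last hard reference to it goes away.

```perl
use strict;
use warnings;

# Hypothetical class used only for this illustration.
package Tracker;
our @log;                                  # records each destruction
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { my $self = shift; push @log, "destroyed $self->{name}" }

package main;

{
    my $first  = Tracker->new("a");
    my $second = $first;                   # second hard reference, count is 2
    undef $first;                          # count drops to 1, nothing is freed
    print scalar(@Tracker::log), "\n";     # 0
}                                          # $second leaves scope, count reaches 0
print "@Tracker::log\n";                   # destroyed a
```

A cyclic structure would never reach a count of zero this way, which is why the parenthetical above points to L<perlobj>.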
33 | Symbolic references are names of variables or other objects, just as a |
54310121 |
34 | symbolic link in a Unix filesystem contains merely the name of a file. |
d1be9408 |
35 | The C<*glob> notation is something of a symbolic reference. (Symbolic |
2d24ed35 |
36 | references are sometimes called "soft references", but please don't call |
37 | them that; references are confusing enough without useless synonyms.) |
d74e8afc |
38 | X<reference, symbolic> X<reference, soft> |
39 | X<symbolic reference> X<soft reference> |
2d24ed35 |
40 | |
54310121 |
41 | In contrast, hard references are more like hard links in a Unix file |
2d24ed35 |
42 | system: They are used to access an underlying object without concern for |
43 | what its (other) name is. When the word "reference" is used without an |
5a964f20 |
44 | adjective, as in the following paragraph, it is usually talking about a |
2d24ed35 |
45 | hard reference. |
d74e8afc |
46 | X<reference, hard> X<hard reference> |
2d24ed35 |
47 | |
48 | References are easy to use in Perl. There is just one overriding |
49 | principle: Perl does no implicit referencing or dereferencing. When a |
50 | scalar is holding a reference, it always behaves as a simple scalar. It |
51 | doesn't magically start being an array or hash or subroutine; you have to |
52 | tell it explicitly to do so, by dereferencing it. |
a0d0e21e |
53 | |
5a964f20 |
54 | =head2 Making References |
d74e8afc |
55 | X<reference, creation> X<referencing> |
5a964f20 |
56 | |
57 | References can be created in several ways. |
a0d0e21e |
58 | |
59 | =over 4 |
60 | |
61 | =item 1. |
d74e8afc |
62 | X<\> X<backslash> |
a0d0e21e |
63 | |
64 | By using the backslash operator on a variable, subroutine, or value. |
7c2ea1c7 |
65 | (This works much like the & (address-of) operator in C.) |
66 | This typically creates I<another> reference to a variable, because |
a0d0e21e |
67 | there's already a reference to the variable in the symbol table. But |
68 | the symbol table reference might go away, and you'll still have the |
69 | reference that the backslash returned. Here are some examples: |
70 | |
71 | $scalarref = \$foo; |
72 | $arrayref = \@ARGV; |
73 | $hashref = \%ENV; |
74 | $coderef = \&handler; |
55497cff |
75 | $globref = \*foo; |
cb1a09d0 |
76 | |
5a964f20 |
77 | It isn't possible to create a true reference to an IO handle (filehandle |
78 | or dirhandle) using the backslash operator. The most you can get is a |
79 | reference to a typeglob, which is actually a complete symbol table entry. |
80 | But see the explanation of the C<*foo{THING}> syntax below. However, |
81 | you can still use type globs and globrefs as though they were IO handles. |
a0d0e21e |
82 | |
83 | =item 2. |
d74e8afc |
84 | X<array, anonymous> X<[> X<[]> X<square bracket> |
85 | X<bracket, square> X<arrayref> X<array reference> X<reference, array> |
a0d0e21e |
86 | |
5a964f20 |
87 | A reference to an anonymous array can be created using square |
a0d0e21e |
88 | brackets: |
89 | |
90 | $arrayref = [1, 2, ['a', 'b', 'c']]; |
91 | |
5a964f20 |
92 | Here we've created a reference to an anonymous array of three elements |
54310121 |
93 | whose final element is itself a reference to another anonymous array of three |
a0d0e21e |
94 | elements. (The multidimensional syntax described later can be used to |
c47ff5f1 |
95 | access this. For example, after the above, C<< $arrayref->[2][1] >> would have |
a0d0e21e |
96 | the value "b".) |
97 | |
7c2ea1c7 |
98 | Taking a reference to an enumerated list is not the same |
cb1a09d0 |
99 | as using square brackets--instead it's the same as creating |
100 | a list of references! |
101 | |
54310121 |
102 | @list = (\$a, \@b, \%c); |
58e0a6ae |
103 | @list = \($a, @b, %c); # same thing! |
104 | |
54310121 |
105 | As a special case, C<\(@foo)> returns a list of references to the contents |
b6429b1b |
106 | of C<@foo>, not a reference to C<@foo> itself. Likewise for C<%foo>, |
107 | except that the key references are to copies (since the keys are just |
108 | strings rather than full-fledged scalars). |
cb1a09d0 |
109 | |
a0d0e21e |
110 | =item 3. |
d74e8afc |
111 | X<hash, anonymous> X<{> X<{}> X<curly bracket> |
112 | X<bracket, curly> X<brace> X<hashref> X<hash reference> X<reference, hash> |
a0d0e21e |
113 | |
5a964f20 |
114 | A reference to an anonymous hash can be created using curly |
a0d0e21e |
115 | brackets: |
116 | |
117 | $hashref = { |
118 | 'Adam' => 'Eve', |
119 | 'Clyde' => 'Bonnie', |
120 | }; |
121 | |
5a964f20 |
122 | Anonymous hash and array composers like these can be intermixed freely to |
a0d0e21e |
123 | produce as complicated a structure as you want. The multidimensional |
124 | syntax described below works for these too. The values above are |
125 | literals, but variables and expressions would work just as well, because |
126 | assignment operators in Perl (even within local() or my()) are executable |
127 | statements, not compile-time declarations. |
128 | |
129 | Because curly brackets (braces) are used for several other things |
130 | including BLOCKs, you may occasionally have to disambiguate braces at the |
131 | beginning of a statement by putting a C<+> or a C<return> in front so |
132 | that Perl realizes the opening brace isn't starting a BLOCK. The economy and |
133 | mnemonic value of using curlies is deemed worth this occasional extra |
134 | hassle. |
135 | |
136 | For example, if you wanted a function to make a new hash and return a |
137 | reference to it, you have these options: |
138 | |
139 | sub hashem { { @_ } } # silently wrong |
140 | sub hashem { +{ @_ } } # ok |
141 | sub hashem { return { @_ } } # ok |
142 | |
ebc58f1a |
143 | On the other hand, if you want the other meaning, you can do this: |
144 | |
145 | sub showem { { @_ } } # ambiguous (currently ok, but may change) |
146 | sub showem { {; @_ } } # ok |
147 | sub showem { { return @_ } } # ok |
148 | |
7c2ea1c7 |
149 | The leading C<+{> and C<{;> always serve to disambiguate |
ebc58f1a |
150 | the expression to mean either the HASH reference, or the BLOCK. |
151 | |
a0d0e21e |
152 | =item 4. |
d74e8afc |
153 | X<subroutine, anonymous> X<subroutine, reference> X<reference, subroutine> |
154 | X<scope, lexical> X<closure> X<lexical> X<lexical scope> |
a0d0e21e |
155 | |
5a964f20 |
156 | A reference to an anonymous subroutine can be created by using |
a0d0e21e |
157 | C<sub> without a subname: |
158 | |
159 | $coderef = sub { print "Boink!\n" }; |
160 | |
7c2ea1c7 |
161 | Note the semicolon. Except for the code |
162 | inside not being immediately executed, a C<sub {}> is not so much a |
a0d0e21e |
163 | declaration as it is an operator, like C<do{}> or C<eval{}>. (However, no |
5a964f20 |
164 | matter how many times you execute that particular line (unless you're in an |
19799a22 |
165 | C<eval("...")>), $coderef will still have a reference to the I<same> |
a0d0e21e |
166 | anonymous subroutine.) |
167 | |
748a9306 |
168 | Anonymous subroutines act as closures with respect to my() variables, |
7c2ea1c7 |
169 | that is, variables lexically visible within the current scope. Closure |
748a9306 |
170 | is a notion out of the Lisp world that says if you define an anonymous |
171 | function in a particular lexical context, it pretends to run in that |
7c2ea1c7 |
172 | context even when it's called outside the context. |
748a9306 |
173 | |
174 | In human terms, it's a funny way of passing arguments to a subroutine when |
175 | you define it as well as when you call it. It's useful for setting up |
176 | little bits of code to run later, such as callbacks. You can even |
54310121 |
177 | do object-oriented stuff with it, though Perl already provides a different |
178 | mechanism to do that--see L<perlobj>. |
748a9306 |
179 | |
7c2ea1c7 |
180 | You might also think of closure as a way to write a subroutine |
181 | template without using eval(). Here's a small example of how |
182 | closures work: |
748a9306 |
183 | |
184 | sub newprint { |
185 | my $x = shift; |
186 | return sub { my $y = shift; print "$x, $y!\n"; }; |
a0d0e21e |
187 | } |
748a9306 |
188 | $h = newprint("Howdy"); |
189 | $g = newprint("Greetings"); |
190 | |
191 | # Time passes... |
192 | |
193 | &$h("world"); |
194 | &$g("earthlings"); |
a0d0e21e |
195 | |
748a9306 |
196 | This prints |
197 | |
198 | Howdy, world! |
199 | Greetings, earthlings! |
200 | |
7c2ea1c7 |
201 | Note particularly that $x continues to refer to the value passed |
202 | into newprint() I<despite> "my $x" having gone out of scope by the |
203 | time the anonymous subroutine runs. That's what a closure is all |
204 | about. |
748a9306 |
205 | |
5a964f20 |
206 | This applies only to lexical variables, by the way. Dynamic variables |
748a9306 |
207 | continue to work as they have always worked. Closure is not something |
208 | that most Perl programmers need trouble themselves about to begin with. |
a0d0e21e |
209 | |
210 | =item 5. |
d74e8afc |
211 | X<constructor> X<new> |
a0d0e21e |
212 | |
63acfd00 |
213 | References are often returned by special subroutines called constructors. Perl |
214 | objects are just references to a special type of object that happens to know |
215 | which package it's associated with. Constructors are just special subroutines |
216 | that know how to create that association. They do so by starting with an |
217 | ordinary reference, and it remains an ordinary reference even while it's also |
218 | being an object. Constructors are often named C<new()>. You I<can> call them |
219 | indirectly: |
220 | |
221 | $objref = new Doggie( Tail => 'short', Ears => 'long' ); |
222 | |
223 | But that can produce ambiguous syntax in certain cases, so it's often |
224 | better to use the direct method invocation approach: |
5a964f20 |
225 | |
226 | $objref = Doggie->new(Tail => 'short', Ears => 'long'); |
227 | |
228 | use Term::Cap; |
229 | $terminal = Term::Cap->Tgetent( { OSPEED => 9600 }); |
230 | |
231 | use Tk; |
232 | $main = MainWindow->new(); |
233 | $menubar = $main->Frame(-relief => "raised", |
234 | -borderwidth => 2); |
235 | |
a0d0e21e |
236 | =item 6. |
d74e8afc |
237 | X<autovivification> |
a0d0e21e |
238 | |
239 | References of the appropriate type can spring into existence if you |
5f05dabc |
240 | dereference them in a context that assumes they exist. Because we haven't |
a0d0e21e |
241 | talked about dereferencing yet, we can't show you any examples yet. |
242 | |
cb1a09d0 |
243 | =item 7. |
d74e8afc |
244 | X<*foo{THING}> X<*> |
cb1a09d0 |
245 | |
55497cff |
246 | A reference can be created by using a special syntax, lovingly known as |
247 | the *foo{THING} syntax. *foo{THING} returns a reference to the THING |
248 | slot in *foo (which is the symbol table entry which holds everything |
249 | known as foo). |
cb1a09d0 |
250 | |
55497cff |
251 | $scalarref = *foo{SCALAR}; |
252 | $arrayref = *ARGV{ARRAY}; |
253 | $hashref = *ENV{HASH}; |
254 | $coderef = *handler{CODE}; |
36477c24 |
255 | $ioref = *STDIN{IO}; |
55497cff |
256 | $globref = *foo{GLOB}; |
c0bd1adc |
257 | $formatref = *foo{FORMAT}; |
55497cff |
258 | |
7c2ea1c7 |
259 | All of these are self-explanatory except for C<*foo{IO}>. It returns |
260 | the IO handle, used for file handles (L<perlfunc/open>), sockets |
261 | (L<perlfunc/socket> and L<perlfunc/socketpair>), and directory |
262 | handles (L<perlfunc/opendir>). For compatibility with previous |
39b99f21 |
263 | versions of Perl, C<*foo{FILEHANDLE}> is a synonym for C<*foo{IO}>, though it |
264 | is deprecated as of 5.8.0. If deprecation warnings are in effect, it will warn |
265 | of its use. |
55497cff |
266 | |
7c2ea1c7 |
267 | C<*foo{THING}> returns undef if that particular THING hasn't been used yet, |
268 | except in the case of scalars. C<*foo{SCALAR}> returns a reference to an |
5f05dabc |
269 | anonymous scalar if $foo hasn't been used yet. This might change in a |
270 | future release. |
271 | |
7c2ea1c7 |
272 | C<*foo{IO}> is an alternative to the C<*HANDLE> mechanism given in |
5a964f20 |
273 | L<perldata/"Typeglobs and Filehandles"> for passing filehandles |
274 | into or out of subroutines, or storing into larger data structures. |
275 | Its disadvantage is that it won't create a new filehandle for you. |
7c2ea1c7 |
276 | Its advantage is that you have less risk of clobbering more than |
277 | you want to with a typeglob assignment. (It still conflates file |
278 | and directory handles, though.) However, if you assign the incoming |
279 | value to a scalar instead of a typeglob as we do in the examples |
280 | below, there's no risk of that happening. |
36477c24 |
281 | |
7c2ea1c7 |
282 | splutter(*STDOUT); # pass the whole glob |
283 | splutter(*STDOUT{IO}); # pass both file and dir handles |
5a964f20 |
284 | |
cb1a09d0 |
285 | sub splutter { |
286 | my $fh = shift; |
287 | print $fh "her um well a hmmm\n"; |
288 | } |
289 | |
7c2ea1c7 |
290 | $rec = get_rec(*STDIN); # pass the whole glob |
291 | $rec = get_rec(*STDIN{IO}); # pass both file and dir handles |
5a964f20 |
292 | |
cb1a09d0 |
293 | sub get_rec { |
294 | my $fh = shift; |
295 | return scalar <$fh>; |
296 | } |
297 | |
a0d0e21e |
298 | =back |
299 | |
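As a rough recap of items 1 through 3, the sketch below contrasts a reference to an existing array, a reference to an anonymous copy, and the list of element references that C<\(@foo)> produces. The variable names are made up for the example, and the dereferencing syntax it uses is the subject of the next section.

```perl
use strict;
use warnings;

my @foo = (10, 20, 30);

my $alias = \@foo;        # reference to @foo itself
my $copy  = [ @foo ];     # reference to a new anonymous array holding copies
my @each  = \(@foo);      # list of references, one per element of @foo

${ $each[0] } = 11;       # writes through to $foo[0]
$copy->[1]    = 99;       # changes only the copy

print "@foo\n";                                      # 11 20 30
print scalar(@each), "\n";                           # 3
print $alias == \@foo ? "same\n" : "different\n";    # same

# The leading + makes the braces an anonymous hash composer, not a BLOCK.
sub hashem { +{ @_ } }
my $h = hashem(Adam => 'Eve');
print ref($h), " $h->{Adam}\n";                      # HASH Eve
```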
5a964f20 |
300 | =head2 Using References |
d74e8afc |
301 | X<reference, use> X<dereferencing> X<dereference> |
5a964f20 |
302 | |
a0d0e21e |
303 | That's it for creating references. By now you're probably dying to |
304 | know how to use references to get back to your long-lost data. There |
305 | are several basic methods. |
306 | |
307 | =over 4 |
308 | |
309 | =item 1. |
310 | |
6309d9d9 |
311 | Anywhere you'd put an identifier (or chain of identifiers) as part |
312 | of a variable or subroutine name, you can replace the identifier with |
313 | a simple scalar variable containing a reference of the correct type: |
a0d0e21e |
314 | |
315 | $bar = $$scalarref; |
316 | push(@$arrayref, $filename); |
317 | $$arrayref[0] = "January"; |
318 | $$hashref{"KEY"} = "VALUE"; |
319 | &$coderef(1,2,3); |
cb1a09d0 |
320 | print $globref "output\n"; |
a0d0e21e |
321 | |
19799a22 |
322 | It's important to understand that we are specifically I<not> dereferencing |
a0d0e21e |
323 | C<$arrayref[0]> or C<$hashref{"KEY"}> there. The dereference of the |
19799a22 |
324 | scalar variable happens I<before> it does any key lookups. Anything more |
a0d0e21e |
325 | complicated than a simple scalar variable must use methods 2 or 3 below. |
326 | However, a "simple scalar" includes an identifier that itself uses method |
327 | 1 recursively. Therefore, the following prints "howdy". |
328 | |
329 | $refrefref = \\\"howdy"; |
330 | print $$$$refrefref; |
331 | |
332 | =item 2. |
d74e8afc |
333 | X<${}> X<@{}> X<%{}> |
a0d0e21e |
334 | |
6309d9d9 |
335 | Anywhere you'd put an identifier (or chain of identifiers) as part of a |
336 | variable or subroutine name, you can replace the identifier with a |
337 | BLOCK returning a reference of the correct type. In other words, the |
338 | previous examples could be written like this: |
a0d0e21e |
339 | |
340 | $bar = ${$scalarref}; |
341 | push(@{$arrayref}, $filename); |
342 | ${$arrayref}[0] = "January"; |
343 | ${$hashref}{"KEY"} = "VALUE"; |
344 | &{$coderef}(1,2,3); |
36477c24 |
345 | $globref->print("output\n"); # iff IO::Handle is loaded |
a0d0e21e |
346 | |
347 | Admittedly, it's a little silly to use the curlies in this case, but |
348 | the BLOCK can contain any arbitrary expression, in particular, |
349 | subscripted expressions: |
350 | |
54310121 |
351 | &{ $dispatch{$index} }(1,2,3); # call correct routine |
a0d0e21e |
352 | |
353 | Because of being able to omit the curlies for the simple case of C<$$x>, |
354 | people often make the mistake of viewing the dereferencing symbols as |
355 | proper operators, and wonder about their precedence. If they were, |
5f05dabc |
356 | though, you could use parentheses instead of braces. That's not the case. |
a0d0e21e |
357 | Consider the difference below; case 0 is a short-hand version of case 1, |
19799a22 |
358 | I<not> case 2: |
a0d0e21e |
359 | |
360 | $$hashref{"KEY"} = "VALUE"; # CASE 0 |
361 | ${$hashref}{"KEY"} = "VALUE"; # CASE 1 |
362 | ${$hashref{"KEY"}} = "VALUE"; # CASE 2 |
363 | ${$hashref->{"KEY"}} = "VALUE"; # CASE 3 |
364 | |
365 | Case 2 is also deceptive in that you're accessing a variable |
366 | called %hashref, not dereferencing through $hashref to the hash |
367 | it's presumably referencing. That would be case 3. |
368 | |
369 | =item 3. |
d74e8afc |
370 | X<autovivification> X<< -> >> X<arrow> |
a0d0e21e |
371 | |
6da72b64 |
372 | Subroutine calls and lookups of individual array elements arise often |
373 | enough that it gets cumbersome to use method 2. As a form of |
374 | syntactic sugar, the examples for method 2 may be written: |
a0d0e21e |
375 | |
6da72b64 |
376 | $arrayref->[0] = "January"; # Array element |
377 | $hashref->{"KEY"} = "VALUE"; # Hash element |
378 | $coderef->(1,2,3); # Subroutine call |
a0d0e21e |
379 | |
6da72b64 |
380 | The left side of the arrow can be any expression returning a reference, |
19799a22 |
381 | including a previous dereference. Note that C<$array[$x]> is I<not> the |
c47ff5f1 |
382 | same thing as C<< $array->[$x] >> here: |
a0d0e21e |
383 | |
384 | $array[$x]->{"foo"}->[0] = "January"; |
385 | |
386 | This is one of the cases we mentioned earlier in which references could |
387 | spring into existence when in an lvalue context. Before this |
388 | statement, C<$array[$x]> may have been undefined. If so, it's |
389 | automatically defined with a hash reference so that we can look up |
c47ff5f1 |
390 | C<{"foo"}> in it. Likewise C<< $array[$x]->{"foo"} >> will automatically get |
a0d0e21e |
391 | defined with an array reference so that we can look up C<[0]> in it. |
5a964f20 |
392 | This process is called I<autovivification>. |
a0d0e21e |
393 | |
19799a22 |
394 | One more thing here. The arrow is optional I<between> bracket |
a0d0e21e |
395 | subscripts, so you can shrink the above down to |
396 | |
397 | $array[$x]{"foo"}[0] = "January"; |
398 | |
399 | Which, in the degenerate case of using only ordinary arrays, gives you |
400 | multidimensional arrays just like C's: |
401 | |
402 | $score[$x][$y][$z] += 42; |
403 | |
404 | Well, okay, not entirely like C's arrays, actually. C doesn't know how |
405 | to grow its arrays on demand. Perl does. |
406 | |
407 | =item 4. |
d74e8afc |
408 | X<encapsulation> |
a0d0e21e |
409 | |
410 | If a reference happens to be a reference to an object, then there are |
411 | probably methods to access the things referred to, and you should probably |
412 | stick to those methods unless you're in the class package that defines the |
413 | object's methods. In other words, be nice, and don't violate the object's |
414 | encapsulation without a very good reason. Perl does not enforce |
415 | encapsulation. We are not totalitarians here. We do expect some basic |
416 | civility though. |
417 | |
418 | =back |
419 | |
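To tie methods 1 through 3 together, here is a small sketch (the variable names are illustrative only) showing that the three spellings reach the same element, and that the intermediate hash and array references come into existence through autovivification.

```perl
use strict;
use warnings;

my @array;                              # starts empty
$array[2]{"foo"}[0] = "January";        # intermediate references autovivify

print ref($array[2]), "\n";             # HASH
print ref($array[2]{"foo"}), "\n";      # ARRAY

# Three spellings of the same element lookup through a reference.
my $ref = \@array;
my @spellings = (
    $$ref[2]{"foo"}[0],                 # method 1: scalar variable in place of the name
    ${$ref}[2]{"foo"}[0],               # method 2: BLOCK returning a reference
    $ref->[2]{"foo"}[0],                # method 3: arrow
);
print "@spellings\n";                   # January January January

# Elements that were skipped over are not autovivified.
print defined $array[0] ? "defined" : "undef", "\n";   # undef
```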
7c2ea1c7 |
420 | Using a string or number as a reference produces a symbolic reference, |
421 | as explained above. Using a reference as a number produces an |
422 | integer representing its storage location in memory. The only |
423 | useful thing to be done with this is to compare two references |
424 | numerically to see whether they refer to the same location. |
d74e8afc |
425 | X<reference, numeric context> |
7c2ea1c7 |
426 | |
427 | if ($ref1 == $ref2) { # cheap numeric compare of references |
428 | print "refs 1 and 2 refer to the same thing\n"; |
429 | } |
430 | |
431 | Using a reference as a string produces both its referent's type, |
432 | including any package blessing as described in L<perlobj>, as well |
433 | as the numeric address expressed in hex. The ref() operator returns |
434 | just the type of thing the reference is pointing to, without the |
435 | address. See L<perlfunc/ref> for details and examples of its use. |
d74e8afc |
436 | X<reference, string context> |
a0d0e21e |
437 | |
5a964f20 |
438 | The bless() operator may be used to associate the object a reference |
439 | points to with a package functioning as an object class. See L<perlobj>. |
a0d0e21e |
440 | |
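The numeric and string behaviour just described can be checked with a short sketch. The variable names are invented for the example, and the C<Doggie> class name is simply borrowed from the constructor example earlier; no such class needs to be defined for C<bless> to work.

```perl
use strict;
use warnings;

my @data  = (1, 2, 3);
my $ref1  = \@data;
my $ref2  = \@data;
my $other = [1, 2, 3];                       # same contents, different array

print "same\n"      if $ref1 == $ref2;       # printed: both point at @data
print "different\n" if $ref1 != $other;      # printed: distinct storage

print ref($ref1), "\n";                      # ARRAY
print "plain form\n"
    if "$ref1" =~ /^ARRAY\(0x[0-9a-f]+\)$/;  # type plus hex address

my $obj = bless {}, 'Doggie';
print ref($obj), "\n";                       # Doggie
print "blessed form\n"
    if "$obj" =~ /^Doggie=HASH\(0x[0-9a-f]+\)$/;
```

The exact address differs from run to run, which is why the sketch matches it with a pattern instead of printing it.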
5f05dabc |
441 | A typeglob may be dereferenced the same way a reference can, because |
7c2ea1c7 |
442 | the dereference syntax always indicates the type of reference desired. |
a0d0e21e |
443 | So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable. |
444 | |
445 | Here's a trick for interpolating a subroutine call into a string: |
446 | |
cb1a09d0 |
447 | print "My sub returned @{[mysub(1,2,3)]} that time.\n"; |
448 | |
449 | The way it works is that when the C<@{...}> is seen in the double-quoted |
450 | string, it's evaluated as a block. The block creates a reference to an |
451 | anonymous array containing the results of the call to C<mysub(1,2,3)>. So |
452 | the whole block returns a reference to an array, which is then |
453 | dereferenced by C<@{...}> and stuck into the double-quoted string. This |
454 | chicanery is also useful for arbitrary expressions: |
a0d0e21e |
455 | |
184e9718 |
456 | print "That yields @{[$n + 5]} widgets\n"; |
a0d0e21e |
457 | |
458 | =head2 Symbolic references |
d74e8afc |
459 | X<reference, symbolic> X<reference, soft> |
460 | X<symbolic reference> X<soft reference> |
a0d0e21e |
461 | |
462 | We said that references spring into existence as necessary if they are |
463 | undefined, but we didn't say what happens if a value used as a |
19799a22 |
464 | reference is already defined, but I<isn't> a hard reference. If you |
7c2ea1c7 |
465 | use it as a reference, it'll be treated as a symbolic |
19799a22 |
466 | reference. That is, the value of the scalar is taken to be the I<name> |
a0d0e21e |
467 | of a variable, rather than a direct link to a (possibly) anonymous |
468 | value. |
469 | |
470 | People frequently expect it to work like this. So it does. |
471 | |
472 | $name = "foo"; |
473 | $$name = 1; # Sets $foo |
474 | ${$name} = 2; # Sets $foo |
475 | ${$name x 2} = 3; # Sets $foofoo |
476 | $name->[0] = 4; # Sets $foo[0] |
477 | @$name = (); # Clears @foo |
478 | &$name(); # Calls &foo() (as in Perl 4) |
479 | $pack = "THAT"; |
480 | ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval |
481 | |
7c2ea1c7 |
482 | This is powerful, and slightly dangerous, in that it's possible |
a0d0e21e |
483 | to intend (with the utmost sincerity) to use a hard reference, and |
484 | accidentally use a symbolic reference instead. To protect against |
485 | that, you can say |
486 | |
487 | use strict 'refs'; |
488 | |
489 | and then only hard references will be allowed for the rest of the enclosing |
54310121 |
490 | block. An inner block may countermand that with |
a0d0e21e |
491 | |
492 | no strict 'refs'; |
493 | |
5a964f20 |
494 | Only package variables (globals, even if localized) are visible to |
495 | symbolic references. Lexical variables (declared with my()) aren't in |
496 | a symbol table, and thus are invisible to this mechanism. For example: |
a0d0e21e |
497 | |
5a964f20 |
498 | local $value = 10; |
b0c35547 |
499 | $ref = "value"; |
a0d0e21e |
500 | { |
501 | my $value = 20; |
502 | print $$ref; |
54310121 |
503 | } |
a0d0e21e |
504 | |
505 | This will still print 10, not 20. Remember that local() affects package |
506 | variables, which are all "global" to the package. |
507 | |
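The effect of C<use strict 'refs'> can be seen in a small sketch like the following, where the variable names are assumptions made for the example. The symbolic dereference succeeds inside the block that turns the stricture off and dies at run time outside it.

```perl
use strict;
use warnings;

our $foo  = "package variable";
my  $name = "foo";

{
    no strict 'refs';             # symbolic references allowed in this block only
    my $value = $$name;           # looks up $main::foo by name
    print "$value\n";             # package variable
}

# Under strict refs the same dereference is a run-time error.
my $ok = eval { my $v = $$name; 1 };
print "caught: $@" unless $ok;
```

Wrapping the failing dereference in C<eval> is only there so the sketch can report the error and keep running.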
748a9306 |
508 | =head2 Not-so-symbolic references |
509 | |
a6006777 |
510 | A new feature contributing to readability in perl version 5.001 is that the |
511 | brackets around a symbolic reference behave more like quotes, just as they |
748a9306 |
512 | always have within a string. That is, |
513 | |
514 | $push = "pop on "; |
515 | print "${push}over"; |
516 | |
7c2ea1c7 |
517 | has always meant to print "pop on over", even though push is |
748a9306 |
518 | a reserved word. This has been generalized to work the same outside |
519 | of quotes, so that |
520 | |
521 | print ${push} . "over"; |
522 | |
523 | and even |
524 | |
525 | print ${ push } . "over"; |
526 | |
527 | will have the same effect. (This would have been a syntax error in |
7c2ea1c7 |
528 | Perl 5.000, though Perl 4 allowed it in the spaceless form.) This |
748a9306 |
529 | construct is I<not> considered to be a symbolic reference when you're |
530 | using strict refs: |
531 | |
532 | use strict 'refs'; |
533 | ${ bareword }; # Okay, means $bareword. |
534 | ${ "bareword" }; # Error, symbolic reference. |
535 | |
536 | Similarly, because of all the subscripting that is done using single |
537 | words, we've applied the same rule to any bareword that is used for |
538 | subscripting a hash. So now, instead of writing |
539 | |
540 | $array{ "aaa" }{ "bbb" }{ "ccc" } |
541 | |
5f05dabc |
542 | you can write just |
748a9306 |
543 | |
544 | $array{ aaa }{ bbb }{ ccc } |
545 | |
546 | and not worry about whether the subscripts are reserved words. In the |
547 | rare event that you do wish to do something like |
548 | |
549 | $array{ shift } |
550 | |
551 | you can force interpretation as a reserved word by adding anything that |
552 | makes it more than a bareword: |
553 | |
554 | $array{ shift() } |
555 | $array{ +shift } |
556 | $array{ shift @_ } |
557 | |
9f1b1f2d |
558 | The C<use warnings> pragma or the B<-w> switch will warn you if it |
559 | interprets a reserved word as a string. |
5f05dabc |
560 | But it will no longer warn you about using lowercase words, because the |
748a9306 |
561 | string is effectively quoted. |
562 | |
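The difference between a bareword subscript and a function call in a subscript can be sketched as follows. The hash, array, and subroutine names here are hypothetical and exist only for the demonstration.

```perl
use strict;
use warnings;

my %count = (
    shift            => "bareword key",     # => quotes the word on its left
    "from the array" => "function result",
);

sub by_string   { return $count{ shift } }      # key is the string "shift"
sub by_function { return $count{ shift() } }    # key is the first argument

my @queue = ("from the array");
print by_string(@queue), "\n";      # bareword key
print by_function(@queue), "\n";    # function result
```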
49399b3f |
563 | =head2 Pseudo-hashes: Using an array as a hash |
d74e8afc |
564 | X<pseudo-hash> X<pseudo hash> X<pseudohash> |
49399b3f |
565 | |
6d822dc4 |
566 | Pseudo-hashes have been removed from Perl. The 'fields' pragma |
567 | remains available. |
e0478e5a |
568 | |
5a964f20 |
569 | =head2 Function Templates |
d74e8afc |
570 | X<scope, lexical> X<closure> X<lexical> X<lexical scope> |
571 | X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local> |
5a964f20 |
572 | |
b5c19bd7 |
573 | As explained above, an anonymous function with access to the lexical |
574 | variables visible when that function was compiled creates a closure. It |
575 | retains access to those variables even though it doesn't get run until |
576 | later, such as in a signal handler or a Tk callback. |
5a964f20 |
577 | |
578 | Using a closure as a function template allows us to generate many functions |
c2611fb3 |
579 | that act similarly. Suppose you wanted functions named after the colors |
5a964f20 |
580 | that generated HTML font changes for the various colors: |
581 | |
582 | print "Be ", red("careful"), " with that ", green("light"); |
583 | |
7c2ea1c7 |
584 | The red() and green() functions would be similar. To create these, |
5a964f20 |
585 | we'll assign a closure to a typeglob of the name of the function we're |
586 | trying to build. |
587 | |
588 | @colors = qw(red blue green yellow orange purple violet); |
589 | for my $name (@colors) { |
590 | no strict 'refs'; # allow symbol table manipulation |
591 | *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" }; |
592 | } |
593 | |
594 | Now all those different functions appear to exist independently. You can |
595 | call red(), RED(), blue(), BLUE(), green(), etc. This technique saves on |
596 | both compile time and memory use, and is less error-prone as well, since |
597 | syntax checks happen at compile time. It's critical that any variables in |
598 | the anonymous subroutine be lexicals in order to create a proper closure. |
599 | That's the reason for the C<my> on the loop iteration variable. |
600 | |
601 | This is one of the only places where giving a prototype to a closure makes |
602 | much sense. If you wanted to impose scalar context on the arguments of |
603 | these functions (probably not a wise idea for this particular example), |
604 | you could have written it this way instead: |
605 | |
606 | *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" }; |
607 | |
608 | However, since prototype checking happens at compile time, the assignment |
609 | above happens too late to be of much use. You could address this by |
610 | putting the whole loop of assignments within a BEGIN block, forcing it |
611 | to occur during compilation. |
612 | |
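One way the BEGIN-block suggestion might look is sketched below. It assumes a shortened colour list and is only meant to show that the prototype is honoured once the assignments happen during compilation.

```perl
use strict;
use warnings;

BEGIN {
    for my $name (qw(red blue green)) {
        no strict 'refs';                 # allow symbol table manipulation
        *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };
    }
}

my @words = ("stop", "go");

# The ($) prototype is already known here, so @words is evaluated in
# scalar context and the function receives the element count.
print red(@words), "\n";           # <FONT COLOR='red'>2</FONT>
print prototype(\&red), "\n";      # $
```

Without the BEGIN block, the call to C<red(@words)> would be compiled before the prototype existed, and the array would be flattened into the argument list as usual.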
613 | Access to lexicals that change over time--like those in the C<for> loop |
614 | above--only works with closures, not general subroutines. In the general |
615 | case, then, named subroutines do not nest properly, although anonymous |
b5c19bd7 |
616 | ones do. This is because named subroutines are created (and capture any |
617 | outer lexicals) only once at compile time, whereas anonymous subroutines |
618 | get to capture each time you execute the 'sub' operator. If you are |
619 | accustomed to using nested subroutines in other programming languages with |
620 | their own private variables, you'll have to work at it a bit in Perl. The |
621 | intuitive coding of this type of thing incurs mysterious warnings about |
b432a672 |
622 | "will not stay shared". For example, this won't work: |
5a964f20 |
623 | |
624 | sub outer { |
625 | my $x = $_[0] + 35; |
626 | sub inner { return $x * 19 } # WRONG |
627 | return $x + inner(); |
b432a672 |
628 | } |
5a964f20 |
629 | |
630 | A work-around is the following: |
631 | |
632 | sub outer { |
633 | my $x = $_[0] + 35; |
634 | local *inner = sub { return $x * 19 }; |
635 | return $x + inner(); |
b432a672 |
636 | } |
5a964f20 |
637 | |
638 | Now inner() can only be called from within outer(), because of the |
639 | temporary assignment of the closure (anonymous subroutine). But when |
640 | it is called, it has normal access to the lexical variable $x from the scope |
641 | of outer(). |
642 | |
643 | This has the interesting effect of creating a function local to another |
644 | function, something not normally supported in Perl. |
645 | |
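A different arrangement from the typeglob assignment shown above, offered here only as a sketch, is to keep the closure in a lexical scalar and call it through the arrow. The C<$inner> name is an assumption of this example; nothing is installed in the symbol table, so the helper is invisible outside C<outer()>.

```perl
use strict;
use warnings;

sub outer {
    my $x = $_[0] + 35;
    my $inner = sub { return $x * 19 };   # closure held in a lexical
    return $x + $inner->();
}

print outer(1), "\n";    # 720  (36 + 36 * 19)
print outer(2), "\n";    # 740  (37 + 37 * 19)
```

Because a fresh anonymous subroutine is created on every call, each invocation of C<outer()> sees its own C<$x>.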
cb1a09d0 |
646 | =head1 WARNING |
d74e8afc |
647 | X<reference, string context> X<reference, use as hash key> |
748a9306 |
648 | |
649 | You may not (usefully) use a reference as the key to a hash. It will be |
650 | converted into a string: |
651 | |
652 | $x{ \$a } = $a; |
653 | |
54310121 |
654 | If you try to dereference the key, it won't do a hard dereference, and |
184e9718 |
655 | you won't accomplish what you're attempting. You might want to do something |
cb1a09d0 |
656 | more like |
748a9306 |
657 | |
cb1a09d0 |
658 | $r = \@a; |
659 | $x{ $r } = $r; |
660 | |
661 | And then at least you can use the values(), which will be |
662 | real refs, instead of the keys(), which won't. |
663 | |
5a964f20 |
664 | The standard Tie::RefHash module provides a convenient workaround to this. |
665 | |
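The following sketch contrasts an ordinary hash with one tied to Tie::RefHash. The variable names are made up for the example. With the plain hash the key comes back as a string, whereas the tied hash hands back a reference that can still be dereferenced.

```perl
use strict;
use warnings;
use Tie::RefHash;

my @a = (1, 2, 3);
my $r = \@a;

# Ordinary hash: the reference is stringified when used as a key.
my %plain;
$plain{$r} = "value";
my ($plain_key) = keys %plain;
print ref($plain_key) ? "ref" : "string", "\n";   # string

# Tied hash: keys() returns the original reference.
tie my %by_ref, 'Tie::RefHash';
$by_ref{$r} = "value";
my ($real_key) = keys %by_ref;
print ref($real_key), "\n";                       # ARRAY

push @$real_key, 4;                               # the key is still usable
print "@a\n";                                     # 1 2 3 4
```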
cb1a09d0 |
666 | =head1 SEE ALSO |
a0d0e21e |
667 | |
668 | Besides the obvious documents, source code can be instructive. |
7c2ea1c7 |
669 | Some pathological examples of the use of references can be found |
a0d0e21e |
670 | in the F<t/op/ref.t> regression test in the Perl source directory. |
cb1a09d0 |
671 | |
672 | See also L<perldsc> and L<perllol> for how to use references to create |
5a964f20 |
673 | complex data structures, and L<perltoot>, L<perlobj>, and L<perlbot> |
674 | for how to use them to create objects. |