=head1 NAME

perlLoL - Manipulating Lists of Lists in Perl

=head1 DESCRIPTION

=head1 Declaration and Access of Lists of Lists

The simplest thing to build is a list of lists (sometimes called an array
of arrays). It's reasonably easy to understand, and almost everything
that applies here will also be applicable later on with the fancier data
structures.

A list of lists, or an array of arrays if you prefer, is just a regular
old array @LoL that you can get at with two subscripts, like
C<$LoL[3][2]>. Here's a declaration of the array:

    # assign to our array a list of list references
    @LoL = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
    );

    print $LoL[2][2];
  bart

Now you should be very careful that the outer bracket type
is a round one, that is, parentheses. That's because you're assigning to
an array, so you need parentheses. If you wanted there I<not> to be an @LoL,
but rather just a reference to it, you could do something more like this:

    # assign a reference to list of list references
    $ref_to_LoL = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "homer", "bart", "marge", "maggie", ],
        [ "george", "jane", "elroy", "judy", ],
    ];

    print $ref_to_LoL->[2][2];

Notice that the outer bracket type has changed, and so our access syntax
has also changed. That's because unlike C, in Perl you can't freely
interchange arrays and references thereto. $ref_to_LoL is a reference to an
array, whereas @LoL is an array proper. Likewise, C<$LoL[2]> is not an
array, but an array ref. So how come you can write these:

    $LoL[2][2]
    $ref_to_LoL->[2][2]

instead of having to write these:

    $LoL[2]->[2]
    $ref_to_LoL->[2]->[2]

Well, that's because the rule is that on adjacent brackets only (whether
square or curly), you are free to omit the pointer dereferencing arrow.
But you cannot do so for the very first one if it's a scalar containing
a reference, which means that $ref_to_LoL always needs it.

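The arrow rule is easy to check for yourself. Here is a minimal sketch, not part of the original text, that reuses the sample data from above. It assumes, purely for illustration, that $ref_to_LoL is made to point at @LoL with a backslash rather than being built with its own anonymous array.

```perl
use strict;
use warnings;

# Same data as above, held once as an array and once as a reference.
my @LoL = (
    [ "fred", "barney" ],
    [ "george", "jane", "elroy" ],
    [ "homer", "marge", "bart" ],
);
my $ref_to_LoL = \@LoL;     # assumption: a reference to the same array

# Between adjacent subscripts the arrow is optional ...
print "same element\n" if $LoL[2][2] eq $LoL[2]->[2];

# ... but the first arrow after a scalar holding a reference is required.
print $ref_to_LoL->[2][2], "\n";      # bart
print ${$ref_to_LoL}[2][2], "\n";     # bart, the longhand spelling
```

Writing C<$ref_to_LoL[2][2]> instead would refer to an element of a different variable, the array @ref_to_LoL, which is why the first arrow cannot be dropped.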
=head1 Growing Your Own

That's all well and good for declaration of a fixed data structure,
but what if you wanted to add new elements on the fly, or build
it up entirely from scratch?

First, let's look at reading it in from a file. This is something like
adding a row at a time. We'll assume that there's a flat file in which
each line is a row and each word an element. If you're trying to develop an
@LoL array containing all these, here's the right way to do that:

    while (<>) {
        @tmp = split;
        push @LoL, [ @tmp ];
    }

You might also have loaded that from a function:

    for $i ( 1 .. 10 ) {
        $LoL[$i] = [ somefunc($i) ];
    }

Or you might have had a temporary variable sitting around with the
list in it:

    for $i ( 1 .. 10 ) {
        @tmp = somefunc($i);
        $LoL[$i] = [ @tmp ];
    }

It's very important that you make sure to use the C<[]> list reference
constructor. That's because this will be very wrong:

    $LoL[$i] = @tmp;

You see, assigning a named array like that to a scalar just counts the
number of elements in @tmp, which probably isn't what you want.

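To see the difference concretely, here is a small sketch that puts the wrong and the right assignment side by side. The sample names in @tmp are made up for the illustration.

```perl
use strict;
use warnings;

my @tmp = ( "fred", "barney", "wilma" );   # illustrative data
my @LoL;

$LoL[0] = @tmp;        # scalar context: stores the count, 3
$LoL[1] = [ @tmp ];    # stores a reference to a fresh copy of @tmp

print "row 0 holds: $LoL[0]\n";         # row 0 holds: 3
print "row 1 holds: @{ $LoL[1] }\n";    # row 1 holds: fred barney wilma

# Because [ @tmp ] copies, later changes to @tmp leave the row alone.
@tmp = ();
print scalar @{ $LoL[1] }, "\n";        # 3
```

The copy made by C<[ @tmp ]> is what lets you reuse @tmp on every pass through a loop without the rows stepping on each other.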
If you are running under C<use strict>, you'll have to add some
declarations to make it happy:

    use strict;
    my(@LoL, @tmp);
    while (<>) {
        @tmp = split;
        push @LoL, [ @tmp ];
    }

Of course, you don't need the temporary array to have a name at all:

    while (<>) {
        push @LoL, [ split ];
    }

You also don't have to use push(). You could just make a direct assignment
if you knew where you wanted to put it:

    my (@LoL, $i, $line);
    for $i ( 0 .. 10 ) {
        $line = <>;
        $LoL[$i] = [ split ' ', $line ];
    }

or even just

    my (@LoL, $i);
    for $i ( 0 .. 10 ) {
        $LoL[$i] = [ split ' ', <> ];
    }

You should in general be leery of using a function that could return
a list in a scalar context without saying so explicitly.
This would be clearer to the casual reader:

    my (@LoL, $i);
    for $i ( 0 .. 10 ) {
        $LoL[$i] = [ split ' ', scalar(<>) ];
    }

If you wanted to have a $ref_to_LoL variable as a reference to an array,
you'd have to do something like this:

    while (<>) {
        push @$ref_to_LoL, [ split ];
    }

Actually, if you were using strict, you'd have to declare not only
$ref_to_LoL as you had to declare @LoL, but you'd I<also> have to
initialize it to a reference to an empty list. (This was a bug in
perl version 5.001m that's been fixed for the 5.002 release.)

    my $ref_to_LoL = [];
    while (<>) {
        push @$ref_to_LoL, [ split ];
    }

Ok, now you can add new rows. What about adding new columns? If you're
dealing with just matrices, it's often easiest to use simple assignment:

    for $x (1 .. 10) {
        for $y (1 .. 10) {
            $LoL[$x][$y] = func($x, $y);
        }
    }

    for $x ( 3, 7, 9 ) {
        $LoL[$x][20] += func2($x);
    }

It doesn't matter whether those elements are already
there or not: it'll gladly create them for you, setting
intervening elements to C<undef> as need be.

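If you'd like to watch that happen, here is a tiny sketch of the behavior just described. It starts from an empty @LoL and assigns to a single far-off element; the indices and the value are arbitrary choices for the illustration.

```perl
use strict;
use warnings;

my @LoL;                    # starts out empty

$LoL[2][3] = "dino";        # rows 0..2 and columns 0..3 spring into being

print scalar(@LoL), "\n";                                # 3
print scalar(@{ $LoL[2] }), "\n";                        # 4
print defined($LoL[0])    ? "defined" : "undef", "\n";   # undef
print defined($LoL[2][0]) ? "defined" : "undef", "\n";   # undef
print $LoL[2][3], "\n";                                  # dino
```

Only the row you actually touched becomes an array reference; the rows skipped over on the way are plain C<undef> until you assign into them.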
If you wanted just to append to a row, you'd have
to do something a bit funnier looking:

    # add new columns to an existing row
    push @{ $LoL[0] }, "wilma", "betty";

Notice that I I<couldn't> say just:

    push $LoL[0], "wilma", "betty";  # WRONG!

In fact, that wouldn't even compile. How come? Because the argument
to push() must be a real array, not just a reference to such.

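Here is a short sketch of the working form in a complete program. The two starting rows are sample data for the illustration, and the pop() call is added only to show that the same dereferencing idiom applies to the other array functions.

```perl
use strict;
use warnings;

my @LoL = (
    [ "fred", "barney" ],
    [ "george", "jane", "elroy" ],
);

# Dereference the row so push() sees a real array.
push @{ $LoL[0] }, "wilma", "betty";
print "@{ $LoL[0] }\n";              # fred barney wilma betty

# The same idiom works for the other array functions.
my $last = pop @{ $LoL[1] };
print "$last\n";                     # elroy
print scalar @{ $LoL[1] }, "\n";     # 2
```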
=head1 Access and Printing

Now it's time to print your data structure out. How
are you going to do that? Well, if you want only one
of the elements, it's trivial:

    print $LoL[0][0];

If you want to print the whole thing, though, you can't
say

    print @LoL;  # WRONG

because you'll get just references listed, and perl will never
automatically dereference things for you. Instead, you have to
roll yourself a loop or two. This prints the whole structure,
using the shell-style for() construct to loop across the outer
set of subscripts.

    for $aref ( @LoL ) {
        print "\t [ @$aref ],\n";
    }

If you wanted to keep track of subscripts, you might do this:

    for $i ( 0 .. $#LoL ) {
        print "\t elt $i is [ @{$LoL[$i]} ],\n";
    }

or maybe even this. Notice the inner loop.

    for $i ( 0 .. $#LoL ) {
        for $j ( 0 .. $#{$LoL[$i]} ) {
            print "elt $i $j is $LoL[$i][$j]\n";
        }
    }

As you can see, it's getting a bit complicated. That's why
sometimes it's easier to take a temporary on your way through:

    for $i ( 0 .. $#LoL ) {
        $aref = $LoL[$i];
        for $j ( 0 .. $#{$aref} ) {
            print "elt $i $j is $LoL[$i][$j]\n";
        }
    }

Hmm... that's still a bit ugly. How about this:

    for $i ( 0 .. $#LoL ) {
        $aref = $LoL[$i];
        $n = @$aref - 1;
        for $j ( 0 .. $n ) {
            print "elt $i $j is $LoL[$i][$j]\n";
        }
    }

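For something you can run as is under C<use strict>, here is one more variation on the loops above. The two rows are sample data, and using join() for the separator is just one possible choice, not something the text above prescribes.

```perl
use strict;
use warnings;

my @LoL = (
    [ "fred", "barney" ],
    [ "homer", "marge", "bart" ],
);

# print @LoL would show something like ARRAY(0x...) for each row,
# so dereference each row before printing it.
for my $i ( 0 .. $#LoL ) {
    my $aref = $LoL[$i];
    print "row $i: ", join(", ", @$aref), "\n";
}
# row 0: fred, barney
# row 1: homer, marge, bart
```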
=head1 Slices

If you want to get at a slice (part of a row) in a multidimensional
array, you're going to have to do some fancy subscripting. That's
because while we have a nice synonym for single elements via the
pointer arrow for dereferencing, no such convenience exists for slices.
(Remember, of course, that you can always write a loop to do a slice
operation.)

Here's how to do one operation using a loop. We'll assume an @LoL
variable as before.

    @part = ();
    $x = 4;
    for ($y = 7; $y < 13; $y++) {
        push @part, $LoL[$x][$y];
    }

That same loop could be replaced with a slice operation:

    @part = @{ $LoL[4] } [ 7..12 ];

but as you might well imagine, this is pretty rough on the reader.

Ah, but what if you wanted a I<two-dimensional slice>, such as having
$x run from 4 to 8 and $y run from 7 to 12? Hmm... here's the simple way:

    @newLoL = ();
    for ($startx = $x = 4; $x <= 8; $x++) {
        for ($starty = $y = 7; $y <= 12; $y++) {
            $newLoL[$x - $startx][$y - $starty] = $LoL[$x][$y];
        }
    }

We can reduce some of the looping through slices:

    for ($x = 4; $x <= 8; $x++) {
        push @newLoL, [ @{ $LoL[$x] } [ 7..12 ] ];
    }

If you were into Schwartzian Transforms, you would probably
have selected map for that:

    @newLoL = map { [ @{ $LoL[$_] } [ 7..12 ] ] } 4 .. 8;

Although if your manager accused you of seeking job security (or rapid
insecurity) through inscrutable code, it would be hard to argue. :-)
If I were you, I'd put that in a function:

    @newLoL = splice_2D( \@LoL, 4 => 8, 7 => 12 );
    sub splice_2D {
        my $lrr = shift;        # ref to list of list refs!
        my ($x_lo, $x_hi,
            $y_lo, $y_hi) = @_;

        return map {
            [ @{ $lrr->[$_] } [ $y_lo .. $y_hi ] ]
        } $x_lo .. $x_hi;
    }

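To try splice_2D() without a big matrix lying around, here is a self-contained sketch. The 4x5 grid of coordinate labels and the bounds passed in are made up for the illustration; the function body is the one shown above.

```perl
use strict;
use warnings;

# A 4x5 grid whose cells record their own coordinates, e.g. "r2c3".
my @LoL = map {
    my $r = $_;
    [ map { sprintf "r%dc%d", $r, $_ } 0 .. 4 ]
} 0 .. 3;

sub splice_2D {
    my $lrr = shift;                      # ref to list of list refs
    my ($x_lo, $x_hi, $y_lo, $y_hi) = @_;
    return map {
        [ @{ $lrr->[$_] } [ $y_lo .. $y_hi ] ]
    } $x_lo .. $x_hi;
}

# Rows 1..2, columns 2..4.
my @newLoL = splice_2D( \@LoL, 1 => 2, 2 => 4 );
print "@$_\n" for @newLoL;
# r1c2 r1c3 r1c4
# r2c2 r2c3 r2c4
```

Each row of the result is a fresh anonymous array, so changing @newLoL afterwards leaves the original @LoL untouched.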
=head1 SEE ALSO

perldata(1), perlref(1), perldsc(1)

=head1 AUTHOR

Tom Christiansen <F<tchrist@perl.com>>

Last update: Sat Oct 7 19:35:26 MDT 1995