Commit | Line | Data |
e98cedbf |
1 | package Devel::Size; |
2 | |
e98cedbf |
3 | use strict; |
a6ea0805 |
4 | use vars qw($VERSION @ISA @EXPORT @EXPORT_OK $AUTOLOAD %EXPORT_TAGS); |
e98cedbf |
5 | |
6 | require Exporter; |
7 | require DynaLoader; |
8 | |
a6ea0805 |
9 | @ISA = qw(Exporter DynaLoader); |
e98cedbf |
10 | |
11 | # Items to export into the caller's namespace by default. Note: do not export |
12 | # names by default without a very good reason. Use EXPORT_OK instead. |
13 | # Do not simply export all your public functions/methods/constants. |
14 | |
15 | # This allows the declaration: use Devel::Size ':all'; |
16 | # If you do not need this, moving things directly into @EXPORT or @EXPORT_OK |
17 | # will save memory. |
a6ea0805 |
18 | %EXPORT_TAGS = ( 'all' => [ qw( |
0bff12d8 |
19 | size total_size |
e98cedbf |
20 | ) ] ); |
21 | |
a6ea0805 |
22 | @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } ); |
e98cedbf |
23 | |
a6ea0805 |
24 | @EXPORT = qw( |
e98cedbf |
25 | |
26 | ); |
966a1570 |
27 | $VERSION = '0.56'; |
e98cedbf |
28 | |
29 | bootstrap Devel::Size $VERSION; |
30 | |
31 | # Preloaded methods go here. |
32 | |
33 | 1; |
34 | __END__ |
e98cedbf |
35 | |
36 | =head1 NAME |
37 | |
0bff12d8 |
38 | Devel::Size - Perl extension for finding the memory usage of Perl variables |
e98cedbf |
39 | |
40 | =head1 SYNOPSIS |
41 | |
0bff12d8 |
42 | use Devel::Size qw(size total_size); |
e98cedbf |
43 | |
0bff12d8 |
44 | my $size = size("A string"); |
45 | |
46 | my @foo = (1, 2, 3, 4, 5); |
47 | my $other_size = size(\@foo); |
48 | |
49 | my $foo = {a => [1, 2, 3], |
5c2e1b12 |
50 | b => {a => [1, 3, 4]} |
51 | }; |
0bff12d8 |
52 | my $total_size = total_size($foo); |
5c2e1b12 |
53 | |
e98cedbf |
54 | =head1 DESCRIPTION |
55 | |
0bff12d8 |
56 | This module reports the size, in bytes, of Perl variables. Call the |
57 | functions with a reference to the variable you want measured. A |
58 | plain scalar may also be passed directly, in which case the size of |
59 | that scalar is returned. A hash or an array must always be passed by |
60 | reference. |
61 | |
62 | =head1 FUNCTIONS |
63 | |
64 | =head2 size($ref) |
e98cedbf |
65 | |
5c2e1b12 |
66 | The C<size> function returns the amount of memory the variable |
0bff12d8 |
67 | uses. If the variable is a hash or an array, it only reports |
68 | the amount used by the structure, I<not> the contents. |
69 | |
70 | =head2 total_size($ref) |
5c2e1b12 |
71 | |
0bff12d8 |
72 | The C<total_size> function will traverse the variable and look |
73 | at the sizes of contents. Any references contained in the variable |
74 | will also be followed, so this function can be used to get the |
75 | total size of a multidimensional data structure. At the moment |
76 | there is no way to get the size of an array or a hash and its |
77 | elements without using this function. |
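The difference between the two functions can be seen with a short script. This is only a sketch: the data in C<@data> is made up for illustration, and the byte counts it prints will vary with your perl version and platform.

```perl
use strict;
use warnings;
use Devel::Size qw(size total_size);

# Illustrative data only: an array holding two array refs and a string.
my @data = ([1, 2, 3], [4, 5, 6], "a fairly long string value");

# size() looks at the array structure itself, not at what it holds.
my $shallow = size(\@data);

# total_size() also follows the references and adds up the contents.
my $deep = total_size(\@data);

printf "size:       %d bytes (array structure only)\n", $shallow;
printf "total_size: %d bytes (structure plus contents)\n", $deep;
```

On any platform C<total_size> should report the larger figure here, since it includes the three elements as well as the array that holds them.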
5c2e1b12 |
78 | |
b98fcdb9 |
79 | =head1 EXPORT |
e98cedbf |
80 | |
0bff12d8 |
81 | None by default, but optionally C<size> and C<total_size>. |
e98cedbf |
82 | |
b98fcdb9 |
83 | =head1 UNDERSTANDING MEMORY ALLOCATION |
84 | |
85 | Please note that the following discussion of memory allocation in perl |
86 | is based on the perl 5.8.0 sources. While this is generally |
87 | applicable to all versions of perl, some of the gory details are |
88 | omitted. It also makes some presumptions on how your system memory |
89 | allocator works so, while it will be generally correct, it may not |
90 | exactly reflect your system. (Generally the only issue is the size of |
91 | the constant values we'll talk about, not their existence.) |
92 | |
93 | =head2 The C library |
94 | |
95 | It's important first to understand how your OS and libraries handle |
96 | memory. When the perl interpreter needs some memory, it asks the C |
97 | runtime library for it, using the C<malloc()> call. C<malloc> has one |
98 | parameter, the size of the memory allocation you want, and returns a |
99 | pointer to that memory. C<malloc> also makes sure that the pointer it |
100 | returns to you is properly aligned. When you're done with the memory |
101 | you hand it back to the library with the C<free()> call. C<free> has |
102 | one parameter, the pointer that C<malloc> returned. There are a couple of interesting ramifications to this. |
103 | |
104 | Because C<malloc> has to return an aligned pointer, it will round up the |
105 | memory allocation to make sure that the memory it returns is aligned |
106 | right. What that alignment is depends on your CPU, OS, and compiler |
107 | settings, but things are generally aligned to either a 4 or 8 byte |
108 | boundary. That means that if you ask for 1 byte, C<malloc> will |
109 | silently round up to either 4 or 8 bytes, though it doesn't tell the |
110 | program making the request, so the extra memory can't be used. |
111 | |
112 | Since C<free> isn't given the size of the memory chunk you're |
113 | freeing, it has to track it another way. Most libraries do this by |
114 | tacking on a length field just before the memory it hands to your |
115 | program. (It's put before the beginning rather than after the end |
116 | because it's less likely to get mangled by program bugs.) This size |
117 | field is the size of your platform integer, generally either 4 or 8 |
118 | bytes. |
119 | |
120 | So, if you asked for 1 byte, C<malloc> would build something like this: |
121 | |
122 | +------------------+ |
123 | | 4 byte length | |
124 | +------------------+ <----- the pointer malloc returns |
125 | | your 1 byte | |
126 | +------------------+ |
127 | | 3 bytes padding | |
128 | +------------------+ |
129 | |
130 | As you can see, you asked for 1 byte but C<malloc> used 8. If your |
131 | integers were 8 bytes rather than 4, C<malloc> would have used 16 bytes |
132 | to satisfy your 1 byte request. |
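The arithmetic above can be sketched in a few lines of Perl. The helper C<malloc_footprint> is hypothetical and only models the simple layout in the diagram: a length field the size of a platform integer, followed by the payload rounded up to that same boundary. Real allocators differ in the details.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Devel::Size): estimate how many bytes
# one malloc() request really uses under the simple model described above.
sub malloc_footprint {
    my ($request, $int_size) = @_;

    # Round the payload up to the next multiple of the integer size.
    my $padded = int(($request + $int_size - 1) / $int_size) * $int_size;

    # Add the length field that sits just before the returned pointer.
    return $int_size + $padded;
}

for my $int_size (4, 8) {
    printf "%d byte integers: a 1 byte request uses %d bytes\n",
        $int_size, malloc_footprint(1, $int_size);
}
```

With these assumptions the script reports 8 bytes for 4 byte integers and 16 bytes for 8 byte integers, matching the figures above.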
133 | |
134 | The C memory allocation system also keeps a list of free memory |
135 | chunks, so it can recycle freed memory. For performance reasons, some |
136 | C memory allocation systems put a limit to the number of free |
137 | segments that are on the free list, or only search through a small |
138 | number of memory chunks waiting to be recycled before just |
139 | allocating more memory from the system. |
140 | |
141 | The memory allocation system tries to keep as few chunks on the free |
142 | list as possible. It does this by trying to notice if there are two |
143 | adjacent chunks of memory on the free list and, if there are, |
144 | coalescing them into a single larger chunk. This works pretty well, |
145 | but there are ways to have a lot of memory on the free list yet still |
146 | not have anything that can be allocated. If a program allocates one |
147 | million eight-byte chunks, for example, then frees every other chunk, |
148 | there will be four million bytes of memory on the free list, but none |
149 | of that memory can be handed out to satisfy a request for 10 |
150 | bytes. This is what's referred to as a fragmented free list, and can |
151 | be one reason why your program could have a lot of free memory yet |
152 | still not be able to allocate more, or have a huge process size and |
153 | still have almost no memory actually allocated to the program running. |
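The fragmentation example can be modelled with a toy script. This is an assumption-laden sketch, not a picture of any real allocator: the heap is represented as a string with one character per eight-byte chunk, and the per-chunk length field is ignored.

```perl
use strict;
use warnings;

my $chunk_size = 8;
my $chunks     = 1_000_000;

# Toy heap: 'U' is a chunk still in use, 'F' is a freed chunk.
# Freeing every other chunk leaves the pattern UFUFUF...
my $heap = 'UF' x ($chunks / 2);

# tr/// with an empty replacement list counts matches without changing $heap.
my $free_bytes = ($heap =~ tr/F//) * $chunk_size;

# Only adjacent free chunks can be coalesced, so the largest block that
# can be handed out is the longest run of 'F' times the chunk size.
my $longest_run = 0;
while ($heap =~ /(F+)/g) {
    my $run = length($1);
    $longest_run = $run if $run > $longest_run;
}
my $largest_block = $longest_run * $chunk_size;

printf "free: %d bytes, largest contiguous block: %d bytes\n",
    $free_bytes, $largest_block;
```

The model shows four million free bytes but no free block larger than 8 bytes, so a 10 byte request cannot be satisfied from the free list.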
154 | |
155 | =head2 Perl |
156 | |
157 | Perl's memory allocation scheme is a bit convoluted, and more complex |
158 | than can really be addressed here, but there is one common spot where perl's |
159 | memory allocation is unintuitive, and that's for hash keys. |
160 | |
161 | When you have a hash, each entry has a structure that points to the |
162 | key and the value for that entry. The value is just a pointer to the |
163 | scalar in the entry, and doesn't take up any special amount of |
164 | memory. The key structure holds the hash value for the key, the key |
165 | length, and the key string. (The entry and key structures are |
166 | separate so perl can potentially share keys across multiple hashes.) |
167 | |
168 | The entry structure has three pointers in it, and takes up either 12 |
169 | or 24 bytes, depending on whether you're on a 32 bit or 64 bit |
170 | system. Since these structures are of fixed size, perl can keep a big |
171 | pool of them internally (generally called an arena) so it doesn't |
172 | have to allocate memory for each one. |
173 | |
174 | The key structure, though, is of variable length because the key |
175 | string is of variable length, so perl has to ask the system for a |
176 | memory allocation for each key. The base size of this structure is |
177 | 8 or 16 bytes (once again, depending on whether you're on a 32 bit or |
178 | 64 bit system) plus the string length plus two bytes. |
179 | |
180 | Since this memory has to be allocated from the system there's the |
181 | malloc size-field overhead (4 or 8 bytes) plus the alignment bytes (0 |
182 | to 7, depending on your system and the key length) that get added on |
183 | to the chunk perl requests. If the key is only 1 |
184 | character, and you're on a 32 bit system, the allocation will be 16 |
185 | bytes. If the key is 7 characters then the allocation is 24 bytes on |
186 | a 32 bit system. If you're on a 64 bit system the numbers get even |
187 | larger. |
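The key allocation figures can be reproduced with a small calculation. The helper C<key_allocation> is hypothetical, and it assumes that the malloc length field and the alignment boundary are both one machine word (4 bytes on a 32 bit system, 8 on a 64 bit one) and that the base key structure is two words, which matches the 8 and 16 byte figures given above. Your allocator may round differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Devel::Size): estimate the bytes malloc
# sets aside for one hash key, under the assumptions stated above.
sub key_allocation {
    my ($key_length, $word_size) = @_;

    # Base key structure (8 or 16 bytes) plus the string plus two bytes.
    my $requested = 2 * $word_size + $key_length + 2;

    # Round the request up to the alignment boundary.
    my $padded = int(($requested + $word_size - 1) / $word_size) * $word_size;

    # Add the malloc length field.
    return $word_size + $padded;
}

for my $word_size (4, 8) {
    for my $key_length (1, 7) {
        printf "%d bit system, %d character key: %d bytes\n",
            $word_size * 8, $key_length,
            key_allocation($key_length, $word_size);
    }
}
```

On the 32 bit settings this gives 16 bytes for a 1 character key and 24 bytes for a 7 character key, as described above. The fixed-size entry structure (three pointers) is not included, since perl takes it from an arena rather than from a separate allocation.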
188 | |
189 | This does mean that hashes eat up a I<lot> of memory, both in memory |
190 | Devel::Size can track (the memory actually in the structures and |
191 | strings) and that it can't (the malloc alignment and length overhead). |
192 | |
193 | =head1 DANGERS |
194 | |
195 | Devel::Size, because of the way it works, can consume a |
196 | considerable amount of memory as it runs. It will use five |
197 | pointers, two integers, and two bytes worth of storage, plus |
198 | potential alignment and bucket overhead, per thing it looks at. This |
199 | memory is released at the end, but it may fragment your free pool, |
200 | and will definitely expand your process' memory footprint. |
201 | |
e98cedbf |
202 | =head1 BUGS |
203 | |
fea63ffa |
204 | Doesn't currently walk all the bits for code refs, formats, and |
6a9ad7ec |
205 | IO. Those throw a warning, but a minimum size for them is returned. |
e98cedbf |
206 | |
b98fcdb9 |
207 | Devel::Size only counts the memory that perl actually allocates. It |
208 | doesn't count 'dark' memory--memory that is lost due to fragmented free lists, |
209 | allocation alignments, or C library overhead. |
210 | |
e98cedbf |
211 | =head1 AUTHOR |
212 | |
213 | Dan Sugalski dan@sidhe.org |
214 | |
215 | =head1 SEE ALSO |
216 | |
217 | perl(1). |
218 | |
219 | =cut |