package Devel::Size;

use strict;
use vars qw($VERSION @ISA @EXPORT_OK %EXPORT_TAGS $warn $dangle);

require 5.005;
require Exporter;
require XSLoader;

@ISA = qw(Exporter);

@EXPORT_OK = qw(size total_size);

# This allows declaration use Devel::Size ':all';
%EXPORT_TAGS = ( 'all' => \@EXPORT_OK );

$VERSION = '0.76_50';

XSLoader::load( __PACKAGE__);

$warn = 1;
$dangle = 0; ## Set true to enable warnings about dangling pointers

1;
__END__

=pod

=head1 NAME

Devel::Size - Perl extension for finding the memory usage of Perl variables

=head1 SYNOPSIS

    use Devel::Size qw(size total_size);

    my $size = size("A string");

    my @foo = (1, 2, 3, 4, 5);
    my $other_size = size(\@foo);

    my $foo = {a => [1, 2, 3],
               b => {a => [1, 3, 4]}
              };
    my $total_size = total_size($foo);

=head1 DESCRIPTION

This module figures out the real size of Perl variables in bytes, as
accurately as possible.

Call the functions with a reference to the variable whose size you
want. A plain scalar may be passed directly, in which case the size of
that scalar is returned. A hash or an array must be passed by
reference.

=head1 FUNCTIONS

=head2 size($ref)

The C<size> function returns the amount of memory the variable
uses. If the variable is a hash or an array, it only reports
the amount used by the structure, I<not> the contents.

=head2 total_size($ref)

The C<total_size> function will traverse the variable and look
at the sizes of contents. Any references contained in the variable
will also be followed, so this function can be used to get the
total size of a multidimensional data structure. At the moment
there is no way to get the size of an array or a hash and its
elements without using this function.
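
The difference between the two functions can be seen with a small
sketch. The array C<@strings> and its contents are made up for
illustration, and the byte counts printed will vary with the perl
version and platform, so none are shown here.

```perl
use strict;
use warnings;
use Devel::Size qw(size total_size);

# Illustrative data: an array holding three 1000-character strings.
my @strings = ('x' x 1000) x 3;

my $shallow = size(\@strings);        # the array structure only
my $deep    = total_size(\@strings);  # the structure plus its elements

printf "size: %d bytes, total_size: %d bytes\n", $shallow, $deep;
```

The second figure should be the larger one, since it also counts the
three string buffers.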

=head1 EXPORT

None by default, but C<size> and C<total_size> may be imported on
request, either by name or together via the C<:all> tag.

=head1 UNDERSTANDING MEMORY ALLOCATION

Please note that the following discussion of memory allocation in perl
is based on the perl 5.8.0 sources. While this is generally
applicable to all versions of perl, some of the gory details are
omitted. It also makes some presumptions on how your system memory
allocator works so, while it will be generally correct, it may not
exactly reflect your system. (Generally the only issue is the size of
the constant values we'll talk about, not their existence.)

=head2 The C library

It's important first to understand how your OS and libraries handle
memory. When the perl interpreter needs some memory, it asks the C
runtime library for it, using the C<malloc()> call. C<malloc> has one
parameter, the size of the memory allocation you want, and returns a
pointer to that memory. C<malloc> also makes sure that the pointer it
returns to you is properly aligned. When you're done with the memory
you hand it back to the library with the C<free()> call. C<free> has
one parameter, the pointer that C<malloc> returned.
There are a couple of interesting ramifications to this.

Because C<malloc> has to return an aligned pointer, it will round up the
memory allocation to make sure that the memory it returns is correctly
aligned. What that alignment is depends on your CPU, OS, and compiler
settings, but things are generally aligned to either a 4 or 8 byte
boundary. That means that if you ask for 1 byte, C<malloc> will
silently round up to either 4 or 8 bytes. It doesn't tell the
program making the request, so the extra memory can't be used.

Since C<free> isn't given the size of the memory chunk you're
freeing, it has to track it another way. Most libraries do this by
tacking on a length field just before the memory they hand to your
program. (It's put before the beginning rather than after the end
because it's less likely to get mangled by program bugs.) This size
field is the size of your platform integer, generally either 4 or 8
bytes.

So, if you asked for 1 byte, C<malloc> would build something like this:

    +------------------+
    | 4 byte length    |
    +------------------+ <----- the pointer malloc returns
    | your 1 byte      |
    +------------------+
    | 3 bytes padding  |
    +------------------+

As you can see, you asked for 1 byte but C<malloc> used 8. If your
integers were 8 bytes rather than 4, C<malloc> would have used 16 bytes
to satisfy your 1 byte request.
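
The arithmetic above can be written down as a rough sketch. The
helper C<malloc_footprint> is hypothetical and is not part of
Devel::Size. It assumes the simplified model described here: a length
field of one platform integer, and a payload rounded up to a multiple
of that same size. Real allocators may behave differently.

```perl
use strict;
use warnings;

# Hypothetical helper: estimate how many bytes malloc consumes for a
# request, assuming a length field of $word bytes in front of the block
# and a payload rounded up to a multiple of $word.
sub malloc_footprint {
    my ($request, $word) = @_;
    my $payload = int(($request + $word - 1) / $word) * $word;
    return $word + $payload;
}

printf "1 byte, 4-byte integers: %d bytes\n", malloc_footprint(1, 4);
printf "1 byte, 8-byte integers: %d bytes\n", malloc_footprint(1, 8);
```

Under these assumptions the two lines report 8 and 16 bytes, matching
the figures in the text.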

The C memory allocation system also keeps a list of free memory
chunks, so it can recycle freed memory. For performance reasons, some
C memory allocation systems put a limit on the number of free
segments that are on the free list, or only search through a small
number of memory chunks waiting to be recycled before just
allocating more memory from the system.

The memory allocation system tries to keep as few chunks on the free
list as possible. It does this by trying to notice if there are two
adjacent chunks of memory on the free list and, if there are,
coalescing them into a single larger chunk. This works pretty well,
but there are ways to have a lot of memory on the free list yet still
not have anything that can be allocated. If a program allocates one
million eight-byte chunks, for example, then frees every other chunk,
there will be four million bytes of memory on the free list, but none
of that memory can be handed out to satisfy a request for 10
bytes. This is what's referred to as a fragmented free list, and can
be one reason why your program could have a lot of free memory yet
still not be able to allocate more, or have a huge process size and
still have almost no memory actually allocated to the program running.
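
The fragmentation example can be reproduced with a toy model. This is
only a sketch: it treats the heap as a row of adjacent eight-byte
blocks and ignores the length fields and padding discussed earlier.
All variable names are made up for illustration.

```perl
use strict;
use warnings;

# Toy model of the example above: a heap of $chunks adjacent blocks of
# $chunk_size bytes each, where every other block has been freed.
my ($chunks, $chunk_size) = (1_000_000, 8);

my $free_bytes   = 0;  # total bytes on the free list
my $largest_free = 0;  # largest contiguous free run, in bytes
my $run          = 0;  # current contiguous free run, in bytes

for my $i (0 .. $chunks - 1) {
    if ($i % 2) {                 # odd-numbered blocks were freed
        $free_bytes += $chunk_size;
        $run        += $chunk_size;
        $largest_free = $run if $run > $largest_free;
    }
    else {                        # even-numbered blocks are still in use
        $run = 0;
    }
}

printf "free: %d bytes, largest free chunk: %d bytes\n",
    $free_bytes, $largest_free;
```

In this model four million bytes are free, yet no free run is longer
than eight bytes, because no two freed blocks are adjacent and so
nothing can be coalesced. A 10 byte request therefore cannot be met
from the free list.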

=head2 Perl

Perl's memory allocation scheme is a bit convoluted, and more complex
than can really be addressed here, but there is one common spot where Perl's
memory allocation is unintuitive, and that's for hash keys.

When you have a hash, each entry has a structure that points to the
key and the value for that entry. The value is just a pointer to the
scalar in the entry, and doesn't take up any special amount of
memory. The key structure holds the hash value for the key, the key
length, and the key string. (The entry and key structures are
separate so perl can potentially share keys across multiple hashes.)

The entry structure has three pointers in it, and takes up either 12
or 24 bytes, depending on whether you're on a 32 bit or 64 bit
system. Since these structures are of fixed size, perl can keep a big
pool of them internally (generally called an arena) so it doesn't
have to allocate memory for each one.

The key structure, though, is of variable length because the key
string is of variable length, so perl has to ask the system for a
memory allocation for each key. The base size of this structure is
8 or 16 bytes (once again, depending on whether you're on a 32 bit or
64 bit system) plus the string length plus two bytes.

Since this memory has to be allocated from the system there's the
malloc size-field overhead (4 or 8 bytes) plus the alignment bytes (0
to 7, depending on your system and the key length)
that get added on to the chunk perl requests. If the key is only 1
character, and you're on a 32 bit system, the allocation will be 16
bytes. If the key is 7 characters then the allocation is 24 bytes on
a 32 bit system. If you're on a 64 bit system the numbers get even
larger.
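
The figures for hash keys can be worked through with a short sketch.
The helper C<hash_key_allocation> is hypothetical and only encodes the
numbers given in this section. It additionally assumes that the
allocator aligns to the size of the platform integer, which may not
hold on your system.

```perl
use strict;
use warnings;

# Hypothetical helper following the figures above. $word is 4 on a
# 32 bit system and 8 on a 64 bit system; the key structure base size
# is taken to be twice that (8 or 16 bytes).
sub hash_key_allocation {
    my ($key_length, $word) = @_;
    my $request = 2 * $word + $key_length + 2;   # what perl asks for
    my $aligned = int(($request + $word - 1) / $word) * $word;
    return $word + $aligned;                     # plus malloc's size field
}

printf "%d-character key: %d bytes\n", $_, hash_key_allocation($_, 4)
    for 1, 7;
```

For a 1 character key on a 32 bit system, perl requests 8 + 1 + 2 = 11
bytes, which is padded to 12 and prefixed with the 4 byte size field,
giving 16 bytes. A 7 character key requests 17 bytes, padded to 20,
giving 24 bytes.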

=head1 DANGERS

Since version 0.72, Devel::Size uses a new pointer tracking mechanism
that consumes far less memory than was previously the case. It does this
by using a bit vector where 1 bit represents each 4- or 8-byte aligned pointer
(32- or 64-bit platform dependent) that could exist. Further, it segments
that bit vector and only allocates each chunk when an address is seen within
that chunk. Since version 0.73, chunks are allocated in blocks of 2**16 bits
(i.e. 8K bytes), accessed via a 256-way tree. The tree is 2 levels deep on a
32 bit system, 6 levels deep on a 64 bit system. This avoids having to make any
assumptions about address layout on 64 bit systems or trade-offs about sizes
to allocate. It assumes that the addresses of allocated pointers are reasonably
contiguous, so that relevant parts of the tree stay in the CPU cache.
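
The idea of a segmented bit vector can be sketched in plain Perl. This
is not the module's implementation, which is written in C and uses a
256-way tree. As a simplification, the sketch keeps the chunks in an
ordinary hash. The names C<seen_before>, C<ALIGN_SHIFT> and
C<CHUNK_BITS> are made up for illustration, and 8-byte pointer
alignment is assumed.

```perl
use strict;
use warnings;

use constant ALIGN_SHIFT => 3;   # assumes 8-byte aligned pointers (64 bit)
use constant CHUNK_BITS  => 16;  # 2**16 bits per chunk, as described above

my %chunks;  # chunk number => bit string, created on first use

# Record an address; returns true if it had been seen before.
sub seen_before {
    my ($address) = @_;
    my $slot  = $address >> ALIGN_SHIFT;         # one bit per aligned pointer
    my $chunk = $slot >> CHUNK_BITS;
    my $bit   = $slot & ((1 << CHUNK_BITS) - 1);

    # Allocate the 8K byte chunk only when an address falls inside it.
    $chunks{$chunk} = "\0" x (1 << (CHUNK_BITS - 3))
        unless exists $chunks{$chunk};

    my $was_set = vec($chunks{$chunk}, $bit, 1);
    vec($chunks{$chunk}, $bit, 1) = 1;
    return $was_set;
}

my @repeats = grep { seen_before($_) } 0x1000, 0x1008, 0x1000, 0x4000_0000;
printf "%d repeated address(es), %d chunk(s) allocated\n",
    scalar @repeats, scalar keys %chunks;
```

Only the third address is a repeat, and the four addresses fall into
two chunks, so just two 8K byte bit strings are allocated no matter how
far apart the addresses lie.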

Besides saving a lot of memory, this change means that Devel::Size
runs significantly faster than previous versions.

=head1 Messages: texts originating from this module.

=head2 Errors

=over 4

=item "Devel::Size: Unknown variable type"

The thing (or something contained within it) that you gave to
total_size() was unrecognisable as a Perl entity.

=back

=head2 Warnings

These messages warn you that for some types, the sizes calculated may not include
everything that could be associated with those types. The differences are usually
insignificant for most uses of this module.

These may be disabled by setting

    $Devel::Size::warn = 0

=over 4

=item "Devel::Size: Calculated sizes for CVs are incomplete"

=item "Devel::Size: Calculated sizes for FMs are incomplete"

=item "Devel::Size: Calculated sizes for compiled regexes are incompatible, and probably always will be"

=back

=head2 New warnings since 0.72

Devel::Size has always been vulnerable to trapping when traversing Perl's
internal data structures, if it encounters uninitialised (dangling) pointers.

MSVC provides exception handling able to deal with this possibility, and when
built with MSVC Devel::Size will now attempt to ignore (or log) such pointers
and continue. These messages are mainly of interest to Devel::Size and core
developers, and so are disabled by default.

They may be enabled by setting

    $Devel::Size::dangle = 1

=over 4

=item "Devel::Size: Can't determine class of operator OPx_XXXX, assuming BASEOP\n"

=item "Devel::Size: Encountered bad magic at: 0xXXXXXXXX"

=item "Devel::Size: Encountered dangling pointer in opcode at: 0xXXXXXXXX"

=item "Devel::Size: Encountered invalid pointer: 0xXXXXXXXX"

=back

=head1 BUGS

Doesn't currently walk all the bits for code refs, formats, and
IO. Those throw a warning, but a minimum size for them is returned.

Devel::Size only counts the memory that perl actually allocates. It
doesn't count 'dark' memory--memory that is lost due to fragmented free lists,
allocation alignments, or C library overhead.

=head1 AUTHOR

Dan Sugalski dan@sidhe.org

Small portion taken from the B module as shipped with perl 5.6.2.

Previously maintained by Tels <http://bloodgate.com>

New pointer tracking & exception handling for 0.72 by BrowserUK

Currently maintained by Nicholas Clark

=head1 COPYRIGHT

Copyright (C) 2005 Dan Sugalski, Copyright (C) 2007-2008 Tels

This module is free software; you can redistribute it and/or modify it
under the same terms as Perl v5.8.8.

=head1 SEE ALSO

perl(1), L<Devel::Size::Report>.

=cut