X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=Size.pm;h=f1a3866eebb2fd039ccc162bc4827ba9643e0c64;hb=966a1570c7ce7e07dfb937f4d5ae0ab35a029496;hp=77568947392096a1899394f4b612bc30766225c7;hpb=4ab42718926473a80fab54000ffbdb7fa22732e6;p=p5sagit%2FDevel-Size.git

diff --git a/Size.pm b/Size.pm
index 7756894..f1a3866 100644
--- a/Size.pm
+++ b/Size.pm
@@ -16,7 +16,7 @@ require DynaLoader;
 # If you do not need this, moving things directly into @EXPORT or @EXPORT_OK
 # will save memory.
 %EXPORT_TAGS = ( 'all' => [ qw(
-	size
+	size total_size
 ) ] );

 @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
@@ -24,7 +24,7 @@ require DynaLoader;
 @EXPORT = qw(
 );
-$VERSION = '0.03';
+$VERSION = '0.56';

 bootstrap Devel::Size $VERSION;

@@ -32,37 +32,181 @@ bootstrap Devel::Size $VERSION;

 1;
 __END__
-# Below is stub documentation for your module. You better edit it!

 =head1 NAME

-Devel::Size - Perl extension for finding the memory usage of perl variables
+Devel::Size - Perl extension for finding the memory usage of Perl variables

 =head1 SYNOPSIS

-  use Devel::Size qw(size);
-  $size = size("abcde");
-  $other_size = size(\@foo);
+  use Devel::Size qw(size total_size);

-=head1 DESCRIPTION
+  my $size = size("A string");
+
+  my @foo = (1, 2, 3, 4, 5);
+  my $other_size = size(\@foo);

-This module figures out the real sizes of perl variables. Call it with
-a reference to the variable you want the size of. If you pass in a
-plain scalar it returns the size of that scalar. (Just be careful if
-you're asking for the size of a reference, as it'll follow the
-reference if you don't reference it first)
+  my $foo = {a => [1, 2, 3],
+             b => {a => [1, 3, 4]}
+            };
+  my $total_size = total_size($foo);

-=head2 EXPORT
+=head1 DESCRIPTION

-None by default.
+This module figures out the real sizes of Perl variables in bytes.
+Call functions with a reference to the variable you want the size
+of. If the variable is a plain scalar it returns the size of
+the scalar.
If the variable is a hash or an array, use a reference
+when calling.
+
+=head1 FUNCTIONS
+
+=head2 size($ref)
+
+The C<size> function returns the amount of memory the variable
+uses. If the variable is a hash or an array, it only reports
+the amount used by the structure, I<not> the contents.
+
+=head2 total_size($ref)
+
+The C<total_size> function will traverse the variable and look
+at the sizes of contents. Any references contained in the variable
+will also be followed, so this function can be used to get the
+total size of a multidimensional data structure. At the moment
+there is no way to get the size of an array or a hash and its
+elements without using this function.
+
+=head1 EXPORT
+
+None by default, but optionally C<size> and C<total_size>.
+
+=head1 UNDERSTANDING MEMORY ALLOCATION
+
+Please note that the following discussion of memory allocation in perl
+is based on the perl 5.8.0 sources. While this is generally
+applicable to all versions of perl, some of the gory details are
+omitted. It also makes some presumptions on how your system memory
+allocator works so, while it will be generally correct, it may not
+exactly reflect your system. (Generally the only issue is the size of
+the constant values we'll talk about, not their existence.)
+
+=head2 The C library
+
+It's important first to understand how your OS and libraries handle
+memory. When the perl interpreter needs some memory, it asks the C
+runtime library for it, using the C<malloc> call. C<malloc> has one
+parameter, the size of the memory allocation you want, and returns a
+pointer to that memory. C<malloc> also makes sure that the pointer it
+returns to you is properly aligned. When you're done with the memory
+you hand it back to the library with the C<free> call. C<free> has
+one parameter, the pointer that C<malloc> returned. There are a couple of interesting ramifications to this.
+
+Because malloc has to return an aligned pointer, it will round up the
+memory allocation to make sure that the memory it returns is aligned
+right.
What that alignment is depends on your CPU, OS, and compiler
+settings, but things are generally aligned to either a 4 or 8 byte
+boundary. That means that if you ask for 1 byte, C<malloc> will
+silently round up to either 4 or 8 bytes, though it doesn't tell the
+program making the request, so the extra memory can't be used.
+
+Since C<free> isn't given the size of the memory chunk you're
+freeing, it has to track it another way. Most libraries do this by
+tacking on a length field just before the memory they hand to your
+program. (It's put before the beginning rather than after the end
+because it's less likely to get mangled by program bugs.) This size
+field is the size of your platform integer, generally either 4 or 8
+bytes.
+
+So, if you asked for 1 byte, malloc would build something like this:
+
+     +------------------+
+     | 4 byte length    |
+     +------------------+ <----- the pointer malloc returns
+     | your 1 byte      |
+     +------------------+
+     | 3 bytes padding  |
+     +------------------+
+
+As you can see, you asked for 1 byte but C<malloc> used 8. If your
+integers were 8 bytes rather than 4, C<malloc> would have used 16 bytes
+to satisfy your 1 byte request.
+
+The C memory allocation system also keeps a list of free memory
+chunks, so it can recycle freed memory. For performance reasons, some
+C memory allocation systems put a limit to the number of free
+segments that are on the free list, or only search through a small
+number of memory chunks waiting to be recycled before just
+allocating more memory from the system.
+
+The memory allocation system tries to keep as few chunks on the free
+list as possible. It does this by trying to notice if there are two
+adjacent chunks of memory on the free list and, if there are,
+coalescing them into a single larger chunk. This works pretty well,
+but there are ways to have a lot of memory on the free list yet still
+not have anything that can be allocated.
If a program allocates one
+million eight-byte chunks, for example, then frees every other chunk,
+there will be four million bytes of memory on the free list, but none
+of that memory can be handed out to satisfy a request for 10
+bytes. This is what's referred to as a fragmented free list, and can
+be one reason why your program could have a lot of free memory yet
+still not be able to allocate more, or have a huge process size and
+still have almost no memory actually allocated to the program running.
+
+=head2 Perl
+
+Perl's memory allocation scheme is a bit convoluted, and more complex
+than can really be addressed here, but there is one common spot where perl's
+memory allocation is unintuitive, and that's for hash keys.
+
+When you have a hash, each entry has a structure that points to the
+key and the value for that entry. The value is just a pointer to the
+scalar in the entry, and doesn't take up any special amount of
+memory. The key structure holds the hash value for the key, the key
+length, and the key string. (The entry and key structures are
+separate so perl can potentially share keys across multiple hashes)
+
+The entry structure has three pointers in it, and takes up either 12
+or 24 bytes, depending on whether you're on a 32 bit or 64 bit
+system. Since these structures are of fixed size, perl can keep a big
+pool of them internally (generally called an arena) so it doesn't
+have to allocate memory for each one.
+
+The key structure, though, is of variable length because the key
+string is of variable length, so perl has to ask the system for a
+memory allocation for each key. The base size of this structure is
+8 or 16 bytes (once again, depending on whether you're on a 32 bit or
+64 bit system) plus the string length plus two bytes.
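
The key-size arithmetic above can be sketched in a few lines of Perl. This is only an illustration: C<estimate_key_allocation> is a hypothetical helper, not part of Devel::Size, and it assumes that the base size is two machine words (8 or 16 bytes), that the allocator prepends one word-sized length field and rounds up to the given alignment as described in the C library section, and that keys are single-byte strings. Real allocators may behave differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Devel::Size): estimate what one hash
# key costs, under the assumptions stated in the text.
#   $key   - the hash key string (assumed to be single-byte characters)
#   $word  - pointer/integer size in bytes: 4 on 32 bit, 8 on 64 bit
#   $align - assumed malloc alignment in bytes (4 or 8)
sub estimate_key_allocation {
    my ($key, $word, $align) = @_;

    # Base size of the key structure (8 or 16 bytes), plus the string
    # length, plus two bytes.
    my $requested = 2 * $word + length($key) + 2;

    # Add the allocator's length field, then round up to the alignment.
    my $with_header = $requested + $word;
    my $chunk = int(($with_header + $align - 1) / $align) * $align;

    return ($requested, $chunk);
}

for my $key ('a', 'abcdefg') {
    my ($requested, $chunk) = estimate_key_allocation($key, 4, 8);
    printf "%d-char key: perl requests %d bytes, malloc uses about %d\n",
        length($key), $requested, $chunk;
}
```

With a 4 byte word and 8 byte alignment, a 1 character key comes out as an 11 byte request in a 16 byte chunk, and a 7 character key as a 17 byte request in a 24 byte chunk.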
+
+Since this memory has to be allocated from the system there's the
+malloc size-field overhead (4 or 8 bytes) plus the alignment bytes (0
+to 7, depending on your system and the key length)
+that get added on to the chunk perl requests. If the key is only 1
+character, and you're on a 32 bit system, the allocation will be 16
+bytes. If the key is 7 characters then the allocation is 24 bytes on
+a 32 bit system. If you're on a 64 bit system the numbers get even
+larger.
+
+This does mean that hashes eat up a I<lot> of memory, both in memory
+Devel::Size can track (the memory actually in the structures and
+strings) and memory that it can't (the malloc alignment and length overhead).
+
+=head1 DANGERS
+
+Devel::Size, because of the way it works, can consume a
+considerable amount of memory as it runs. It will use five
+pointers, two integers, and two bytes worth of storage, plus
+potential alignment and bucket overhead, per thing it looks at. This
+memory is released at the end, but it may fragment your free pool,
+and will definitely expand your process' memory footprint.

 =head1 BUGS

-Only does plain scalars, hashes, and arrays. No sizes for globs or code refs. Yet.
+Doesn't currently walk all the bits for code refs, formats, and
+IO. Those throw a warning, but a minimum size for them is returned.

-Also, this module currently only returns the size used by the variable
-itself, I<not> the contents of arrays or hashes, nor does it follow
-references past one level. That's for later.
+Devel::Size only counts the memory that perl actually allocates. It
+doesn't count 'dark' memory--memory that is lost due to fragmented free lists,
+allocation alignments, or C library overhead.

 =head1 AUTHOR