PPI::Normal - Normalize Perl Documents

Perl Documents, as created by PPI, are typically filled with all sorts of
mess such as whitespace and comments and other things that don't affect
the actual meaning of the code.

In addition, because there is more than one way to do most things, and the
syntax of Perl itself is quite flexible, there are many ways in which the
"same" code can look quite different.

PPI::Normal attempts to resolve this by providing a variety of mechanisms
and algorithms to "normalize" Perl Documents, and determine a sort of base
form for them (although this base form will be a memory structure, and
not something that can be turned back into Perl source code).

The process itself is quite complex, and so for convenience and
extensibility it has been separated into a number of layers. At a later
point, it will be possible to write Plugin classes to insert additional
normalization steps into the various layers.

In addition, you can choose to do the normalization only as deep as a
particular layer, depending on how aggressively you want the normalization
to be.
use List::MoreUtils ();
use PPI::Util '_Document';
use PPI::Document::Normalized ();

use vars qw{$VERSION %LAYER};

# Registered function store

#####################################################################

=head2 register $function => $layer, ...

The C<register> method is used by normalization method providers to
tell the normalization engine which functions need to be run, and
in which layer they apply.

Provide a set of key/value pairs, where the key is the full name of the
function (in string form), and the value is the layer (see the description
of the layers elsewhere in this document) in which it should be run.

Returns true if all functions are registered. Note that a bad function
name or a bad layer raises an exception (via C<Carp::croak>) rather than
returning C<undef>.
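As a rough illustration of the calling convention described above, the following sketch registers one extra function in layer 2. The package C<My::Normal::Extra> and the function C<remove_end_sections> are hypothetical names invented for this example; only C<register> itself comes from this module, and C<prune> is the standard L<PPI::Node> method.

```perl
package My::Normal::Extra;    # hypothetical package, for illustration only

use strict;
use warnings;
use PPI;
use PPI::Normal ();

# Hypothetical transformation function. It receives the PPI::Document
# that is being normalized and modifies it in place.
sub remove_end_sections {
    my $Document = shift;

    # Remove any __END__ sections from the document
    $Document->prune('PPI::Statement::End');

    return 1;
}

# The key is the fully-qualified function name as a string,
# the value is the layer in which the function should run.
PPI::Normal->register(
    'My::Normal::Extra::remove_end_sections' => 2,
);
```

Because the registry is a package-level store, a registration like this applies to every document normalized afterwards in the same process, so it is best done once at load time.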
defined $function and defined &{"$function"}
    or Carp::croak("Bad function name provided to PPI::Normal");

# Has it already been added to any layer?
if ( List::MoreUtils::any { $_ eq $function } map { @$_ } values %LAYER ) {

# Check the layer to add it to
defined $layer and $layer =~ /^(?:1|2)$/
    or Carp::croak("Bad layer provided to PPI::Normal");

# Add to the layer data store
push @{ $LAYER{$layer} }, $function;
# With the registration mechanism in place, load in the main set of
# normalization methods to initialize the store.
use PPI::Normal::Standard;

#####################################################################
# Constructor and Accessors
  my $level_1 = PPI::Normal->new;
  my $level_2 = PPI::Normal->new(2);

Creates a new normalization object, to which Document objects
can be passed to be normalized.

Of course, what you probably REALLY want is just to call
L<PPI::Document>'s C<normalized> method.

Takes an optional single parameter of the normalization layer
to use, which at this time can be either "1" or "2".

Returns a new C<PPI::Normal> object, or C<undef> on error.
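The constructor behaviour described above can be sketched as follows. This is only a minimal usage example; the variable names are arbitrary.

```perl
use strict;
use warnings;
use PPI;
use PPI::Normal ();

# Explicitly ask for layer 2 normalization
my $normal = PPI::Normal->new(2)
    or die "Failed to create a PPI::Normal object";
print "Using layer ", $normal->layer, "\n";

# Anything other than 1 or 2 is rejected with undef
my $bad = PPI::Normal->new(3);
print "Layer 3 is not accepted\n" unless defined $bad;
```

Checking the return value matters here, because an unsupported layer yields C<undef> rather than an exception.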
=begin testing new after PPI::Document 12

# Check we actually set the layer at creation
my $layer_1 = PPI::Normal->new;
isa_ok( $layer_1, 'PPI::Normal' );
is( $layer_1->layer, 1, '->new creates a layer 1' );
my $layer_1a = PPI::Normal->new(1);
isa_ok( $layer_1a, 'PPI::Normal' );
is( $layer_1a->layer, 1, '->new(1) creates a layer 1' );
my $layer_2 = PPI::Normal->new(2);
isa_ok( $layer_2, 'PPI::Normal' );
is( $layer_2->layer, 2, '->new(2) creates a layer 2' );

is( PPI::Normal->new(3), undef, '->new only allows up to layer 2' );
is( PPI::Normal->new(undef), undef, '->new(evil) returns undef' );
is( PPI::Normal->new("foo"), undef, '->new(evil) returns undef' );
is( PPI::Normal->new(\"foo"), undef, '->new(evil) returns undef' );
is( PPI::Normal->new([]), undef, '->new(evil) returns undef' );
is( PPI::Normal->new({}), undef, '->new(evil) returns undef' );
(defined $_[0] and ! ref $_[0] and $_[0] =~ /^[12]$/) ? shift : return undef

The C<layer> accessor returns the normalization layer of the object.

sub layer { $_[0]->{layer} }

#####################################################################
The C<process> method takes anything that can be converted to a
L<PPI::Document> (object, SCALAR ref, filename), loads it and
applies the normalization process to the document.

Returns a L<PPI::Document::Normalized> object, or C<undef> on error.
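A minimal sketch of calling C<process> on two pieces of source that differ only in whitespace and a trailing comment is shown below. The comparison uses the C<equal> method of L<PPI::Document::Normalized>; the variable names are arbitrary.

```perl
use strict;
use warnings;
use PPI;
use PPI::Normal ();

my $normal = PPI::Normal->new(1)
    or die "Failed to create a PPI::Normal object";

# Source is passed as SCALAR references here
my $first  = $normal->process( \'print "Hello World!\n";' );
my $second = $normal->process( \'  print "Hello World!\n";  # comment' );
die "Normalization failed" unless $first and $second;

# Compare the two normalized documents
if ( $first->equal($second) ) {
    print "The two documents normalize to the same form\n";
} else {
    print "The two documents differ after normalization\n";
}
```

When passing an existing L<PPI::Document> object rather than a SCALAR ref, it is probably safest to pass a clone, since the registered functions are run directly against the document they are given.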
=begin testing process after new 15

my $doc1 = PPI::Document->new(\'print "Hello World!\n";');
isa_ok( $doc1, 'PPI::Document' );
my $doc2 = \'print "Hello World!\n";';
my $doc3 = \' print "Hello World!\n"; # comment';
my $doc4 = \'print "Hello World!\n"';

# Normalize them at layer 1
my $layer1 = PPI::Normal->new(1);
isa_ok( $layer1, 'PPI::Normal' );
my $nor11 = $layer1->process($doc1->clone);
my $nor12 = $layer1->process($doc2);
my $nor13 = $layer1->process($doc3);
isa_ok( $nor11, 'PPI::Document::Normalized' );
isa_ok( $nor12, 'PPI::Document::Normalized' );
isa_ok( $nor13, 'PPI::Document::Normalized' );

# At layer 1 the first 3 should be the same
is_deeply( { %$nor11 }, { %$nor12 }, 'Layer 1: 1 and 2 match' );
is_deeply( { %$nor11 }, { %$nor13 }, 'Layer 1: 1 and 3 match' );

# Normalize them at layer 2
my $layer2 = PPI::Normal->new(2);
isa_ok( $layer2, 'PPI::Normal' );
my $nor21 = $layer2->process($doc1);
my $nor22 = $layer2->process($doc2);
my $nor23 = $layer2->process($doc3);
my $nor24 = $layer2->process($doc4);
isa_ok( $nor21, 'PPI::Document::Normalized' );
isa_ok( $nor22, 'PPI::Document::Normalized' );
isa_ok( $nor23, 'PPI::Document::Normalized' );
isa_ok( $nor24, 'PPI::Document::Normalized' );

# At layer 2 all 4 should be the same, including the one
# without the trailing statement separator
is_deeply( { %$nor21 }, { %$nor22 }, 'Layer 2: 1 and 2 match' );
is_deeply( { %$nor21 }, { %$nor23 }, 'Layer 2: 1 and 3 match' );
is_deeply( { %$nor21 }, { %$nor24 }, 'Layer 2: 1 and 4 match' );
my $self = ref $_[0] ? shift : shift->new;

# PPI::Normal objects are reusable, but not re-entrant
return undef if $self->{Document};

# Get or create the document
$self->{Document} = _Document(shift) or return undef;

# Work out what functions we need to call
foreach ( 1 .. $self->layer ) {
    push @functions, @{ $LAYER{$_} };

# Execute each function
foreach my $function ( @functions ) {
    &{"$function"}( $self->{Document} );

# Create the normalized Document object
my $Normalized = PPI::Document::Normalized->new(
    Document => $self->{Document},
    functions => \@functions,

delete $self->{Document};
The following normalization layers are implemented. When writing
plugins, you should register each transformation function with the
appropriate layer.

=head2 Layer 1 - Insignificant Data Removal

The basic step common to all normalization, layer 1 scans through the
Document and removes all whitespace, comments, POD, and anything else
that returns false for its C<significant> method.

It also checks each Element and removes known-useless sub-element
metadata such as the Element's physical position in the file.

=head2 Layer 2 - Significant Element Removal

After the removal of the insignificant data, layer 2 removes larger, more
complex, and superficially "significant" elements that can be removed
for the purposes of normalization.

Examples from this layer include pragmas, now-useless statement
separators (since the PDOM tree is holding statement elements), and
several other minor bits and pieces.
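To see how the choice of layer changes the result, the following sketch normalizes two snippets that differ only in the trailing statement separator, once per layer, and reports whether they compare as equal. It is an illustration rather than a specification of the output; the variable names are arbitrary and C<equal> is the comparison method of L<PPI::Document::Normalized>.

```perl
use strict;
use warnings;
use PPI;
use PPI::Normal ();

# Two snippets that differ only in the trailing statement separator
my $with    = 'print "Hello World!\n";';
my $without = 'print "Hello World!\n"';

my %same;
foreach my $layer ( 1, 2 ) {
    my $normal = PPI::Normal->new($layer)      or die "Bad layer $layer";
    my $left   = $normal->process( \$with )    or die "Failed to normalize";
    my $right  = $normal->process( \$without ) or die "Failed to normalize";

    # Record and report whether the two normalized forms match
    $same{$layer} = $left->equal($right) ? 1 : 0;
    print "Layer $layer: ", ( $same{$layer} ? 'equal' : 'different' ), "\n";
}
```

Since layer 2 is described as removing now-useless statement separators, the two snippets would be expected to match at that layer.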
=head2 Layer 3 - TO BE COMPLETED

This version of the forward-port of the Perl::Compare functionality
to the 0.900+ API of PPI only implements Layer 1 and 2 at this time.

- Write the other 4-5 layers :)

See the L<support section|PPI/SUPPORT> in the main module.

Adam Kennedy E<lt>adamk@cpan.orgE<gt>

Copyright 2005 Adam Kennedy.

This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the
LICENSE file included with this module.