use warnings FATAL => 'all';

use HTML::Zoom::ZConfig;
use HTML::Zoom::ReadFH;
use HTML::Zoom::Transform;
use HTML::Zoom::TransformBuilder;

  my ($class, $args) = @_;
  $new->{zconfig} = HTML::Zoom::ZConfig->new($args->{zconfig}||{});

sub zconfig { shift->_self_or_new->{zconfig} }

  ref($_[0]) ? $_[0] : $_[0]->new

  bless({ %{$_[0]}, %{$_[1]} }, ref($_[0]));

  my $self = shift->_self_or_new;
    initial_events => shift,

  my $self = shift->_self_or_new;
  $self->from_events($self->zconfig->parser->html_to_events($_[0]))

  my $self = shift->_self_or_new;
  $self->from_html(do { local (@ARGV, $/) = ($filename); <> });

  die "No events to build from - forgot to call from_html?"
    unless $self->{initial_events};
  my $sutils = $self->zconfig->stream_utils;
  my $stream = $sutils->stream_from_array(@{$self->{initial_events}});
  $stream = $_->apply_to_stream($stream) for @{$self->{transforms}||[]};

  HTML::Zoom::ReadFH->from_zoom(shift);

  [ $self->zconfig->stream_utils->stream_to_array($self->to_stream) ];

  my ($self, $code) = @_;

  $self->zconfig->producer->html_from_stream($self->to_stream);

  ref($self)->new($self)->from_html($self->to_html);

  my $self = shift->_self_or_new;
      @{$self->{transforms}||[]},

  my $self = shift->_self_or_new;
  my ($selector, $filter) = @_;
  $self->with_transform(
    HTML::Zoom::Transform->new({
      zconfig => $self->zconfig,
      selector => $selector,
      filters => [ $filter ]

  my $self = shift->_self_or_new;
  return HTML::Zoom::TransformBuilder->new({
    zconfig => $self->zconfig,
    selector => $selector,

# There's a bug waiting to happen here: if you do something like
#
#   $zoom->select('.foo')
#        ->remove_attribute(class => 'foo')
#        ->then
#        ->well_anything_really
#
# the second action won't execute because the selector doesn't match anymore.
# Ideally we'd merge the match subs instead, but that's more complex to
# implement so I'm deferring it for the moment.

  die "Can't call ->then without a previous transform"
    unless $self->{transforms};
  $self->select($self->{transforms}->[-1]->selector);

=head1 NAME

HTML::Zoom - selector based streaming template engine

=head1 SYNOPSIS

  my $template = <<HTML;
      <title>Hello people</title>
      <h1 id="greeting">Placeholder</h1>
          <p>Name: <span class="name">Bob</span></p>
          <p>Age: <span class="age">23</span></p>
        <hr class="between" />
  HTML

  my $output = HTML::Zoom
    ->from_html($template)
    ->select('title, #greeting')->replace_content('Hello world & dog!')
    ->select('#list')->repeat_content(
        [
          sub {
            $_->select('.name')->replace_content('Matt')
              ->select('.age')->replace_content('26')
          },
          sub {
            $_->select('.name')->replace_content('Mark')
              ->select('.age')->replace_content('0x29')
          },
          sub {
            $_->select('.name')->replace_content('Epitaph')
              ->select('.age')->replace_content('<redacted>')
          },
        ],
        { repeat_between => '.between' }
      )
    ->to_html;

will produce:

      <title>Hello world &amp; dog!</title>
      <h1 id="greeting">Hello world &amp; dog!</h1>
          <p>Name: <span class="name">Matt</span></p>
          <p>Age: <span class="age">26</span></p>
        <hr class="between" />
          <p>Name: <span class="name">Mark</span></p>
          <p>Age: <span class="age">0x29</span></p>
        <hr class="between" />
          <p>Name: <span class="name">Epitaph</span></p>
          <p>Age: <span class="age">&lt;redacted&gt;</span></p>

  is($output, $expect, 'Synopsis code works ok');

=head1 DANGER WILL ROBINSON

This is a 0.9 release. That means that I'm fairly happy the API isn't going
to change in surprising and upsetting ways before 1.0 and a real compatibility
freeze. But it also means that if it turns out there's a mistake the size of
a politician's ego in the API design that I haven't spotted yet there may be
a bit of breakage between here and 1.0. Hopefully not though. Appendages
crossed and all that.

Worse still, the rest of the distribution isn't documented yet. I'm sorry.
I suck. But lots of people have been asking me to ship this, docs or no, so
having got this class itself at least somewhat documented I figured now was
a good time to cut a first real release.

=head1 DESCRIPTION

HTML::Zoom is a lazy, stream oriented, streaming capable, mostly functional,
CSS selector based semantic templating engine for HTML and HTML-like
document formats.

Which is, on the whole, a bit of a mouthful. So let me step back a moment
and explain why you care enough to understand what I mean:

HTML::Zoom is the cure for JQuery envy. When your javascript guy pushes a
piece of data into a document by doing:

  $('.username').text(username);

in HTML::Zoom one can write

  $zoom->select('.username')->replace_content($username);

which is, I hope, almost as clear, hampered only by the fact that Zoom can't
assume a global document and therefore has nothing quite so simple as the
$() function to get the initial selection.

L<HTML::Zoom::SelectorParser> implements a subset of the JQuery selector
specification, and will continue to track that rather than the W3C standards
for the foreseeable future on grounds of pragmatism. Also on the grounds that
their spec is written in EN_US rather than EN_W3C, and I read the former much
better.

I am happy to admit that it's very, very much a subset at the moment - see the
L<HTML::Zoom::SelectorParser> POD for what's currently there, and expect more
and more to be supported over time as we need it and patch it in.

=head2 CLEAN TEMPLATES

HTML::Zoom is the cure for messy templates. How many times have you looked at
templates like this:

  <form action="/somewhere">
  [% FOREACH field IN fields %]
    <label for="[% field.id %]">[% field.label %]</label>
    <input name="[% field.name %]" type="[% field.type %]" value="[% field.value %]" />
  [% END %]
  </form>

and despaired of the fact that neither the HTML structure nor the logic is
remotely easy to read? Fortunately, with HTML::Zoom we can separate the two
cleanly:

  <form class="myform" action="/somewhere">
    <label />
    <input />
  </form>

  $zoom->select('.myform')->repeat_content([
    map { my $field = $_; sub {
      $_->select('label')
        ->add_attribute( for => $field->{id} )
        ->then
        ->replace_content( $field->{label} )
        ->select('input')
        ->add_attribute( name => $field->{name} )
        ->then
        ->add_attribute( type => $field->{type} )
        ->then
        ->add_attribute( value => $field->{value} )
    } } @fields
  ]);

This is, admittedly, very much not shorter. However, it makes it extremely
clear what's happening and therefore less hassle to maintain. Especially
because it allows the designer to fiddle with the HTML without cutting
himself on sharp ELSE clauses, and the developer to add available data to
the template without getting angle bracket cuts on sensitive parts.

Better still, HTML::Zoom knows that it's inserting content into HTML and
can escape it for you - the example template should really have been:

  <form action="/somewhere">
  [% FOREACH field IN fields %]
    <label for="[% field.id | html %]">[% field.label | html %]</label>
    <input name="[% field.name | html %]" type="[% field.type | html %]" value="[% field.value | html %]" />
  [% END %]
  </form>

and frankly I'll take slightly more code any day over *that* crawling horror.

(addendum: I pick on L<Template Toolkit|Template> here specifically because
it's the template system I hate the least - for text templating, I don't
honestly think I'll ever like anything except the next version of Template
Toolkit better - but HTML isn't text. Zoom knows that. Do you?)

=head2 PUTTING THE FUN INTO FUNCTIONAL

The principle of HTML::Zoom is to provide a reusable, functional container
object that lets you build up a set of transforms to be applied; every method
call you make on a zoom object returns a new object, so it's safe to call
methods on one somebody else gave you without worrying about altering its
state (with the notable exception of ->next for stream objects, which I'll
come to later).

So:

  my $z2 = $z1->select('.name')->replace_content($name);

  my $z3 = $z2->select('.title')->replace_content('Ms.');

each time produces a new Zoom object. If you want to package up a set of
transforms to re-use, HTML::Zoom provides an 'apply' method:

  my $add_name = sub { $_->select('.name')->replace_content($name) };

  my $same_as_z2 = $z1->apply($add_name);
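
To make the "every call returns a new object" idea concrete, here is a minimal
sketch in plain Perl. C<Sketch::Container> and all of its methods are
hypothetical stand-ins rather than HTML::Zoom's real classes; the sketch only
assumes that each call copies the object's fields into a fresh hash, and that
C<apply> aliases the object to C<$_> before invoking the callback.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Zoom-like container (not HTML::Zoom's own code).
package Sketch::Container;

sub new {
  my ($class, $args) = @_;
  return bless({ transforms => [], %{ $args || {} } }, $class);
}

# Copy the receiver's fields, overlay the changes, return a fresh object.
sub _with {
  my ($self, $changes) = @_;
  return bless({ %$self, %$changes }, ref($self));
}

# Append a transform without touching the receiver's own list.
sub with_transform {
  my ($self, $transform) = @_;
  return $self->_with({
    transforms => [ @{ $self->{transforms} }, $transform ],
  });
}

# Alias the object to $_ for the duration of the callback.
sub apply {
  my ($self, $code) = @_;
  local $_ = $self;
  return $self->$code;
}

sub transform_count { scalar @{ $_[0]{transforms} } }

package main;

my $z1 = Sketch::Container->new;
my $z2 = $z1->with_transform('set name');
my $z3 = $z2->apply(sub { $_->with_transform('set title') });

# Earlier objects keep their own (shorter) transform lists.
printf "%d %d %d\n", map { $_->transform_count } $z1, $z2, $z3;
```

Because nothing is ever modified in place, C<$z1> can be handed to other code
and reused as a base for any number of derived objects.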

=head2 LAZINESS IS A VIRTUE

HTML::Zoom does its best to defer doing anything until it's absolutely
required. The only point at which it descends into state is when you force
it to create a stream, directly by:

  my $stream = $zoom->to_stream;

  while (my $evt = $stream->next) {
    # handle zoom event here
  }

or indirectly via:

  my $final_html = $zoom->to_html;

  my $fh = $zoom->to_fh;

  while (my $chunk = $fh->getline) {
    ...
  }

Better still, the $fh returned doesn't create its stream until the first
call to getline, which means that until you call that and force it to be
stateful you can get back to the original stateless Zoom object via:

  my $zoom = $fh->to_zoom;

which is exceedingly handy for filtering L<Plack> PSGI responses, among other
things.

Because HTML::Zoom doesn't try to evaluate everything up front, you can
generally put things together in whatever order is most appropriate. This
means that:

  my $start = HTML::Zoom->from_html($html);

  my $zoom = $start->select('div')->replace_content('THIS IS A DIV!');

and:

  my $start = HTML::Zoom->select('div')->replace_content('THIS IS A DIV!');

  my $zoom = $start->from_html($html);

will produce equivalent final $zoom objects, thus proving that there can be
more than one way to do it without one of them being a
L<bait and switch|Switch>.
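
The lazy filehandle behaviour described in this section can be sketched in a
few lines of plain Perl. Everything here is hypothetical: C<Sketch::LazyFH>,
C<to_source> and the C<$make_stream> callback are illustrative names, and the
decision to die once reading has started is an assumption about sensible
behaviour rather than a statement of what L<HTML::Zoom::ReadFH> does.

```perl
use strict;
use warnings;

package Sketch::LazyFH;

# $source stands in for the stateless object; $make_stream turns it into a
# stateful iterator (a closure returning the next chunk, or undef when done).
sub new {
  my ($class, $source, $make_stream) = @_;
  return bless({ source => $source, make_stream => $make_stream }, $class);
}

sub getline {
  my ($self) = @_;
  # The stream is only built on the first call.
  $self->{stream} ||= $self->{make_stream}->($self->{source});
  return $self->{stream}->();
}

# Hand back the stateless source, but only while no stream exists yet.
sub to_source {
  my ($self) = @_;
  die "Already started reading, cannot go back to the source\n"
    if $self->{stream};
  return $self->{source};
}

package main;

my $built = 0;
my $fh = Sketch::LazyFH->new(
  [ '<p>a</p>', '<p>b</p>' ],
  sub {
    my @chunks = @{ $_[0] };
    $built++;
    return sub { shift @chunks };
  },
);

my $before    = $built;            # nothing has been built yet
my $source    = $fh->to_source;    # allowed: no state exists so far
my $first     = $fh->getline;      # builds the stream on demand
my $went_back = eval { $fh->to_source; 1 } ? 1 : 0;   # too late by now

print "$before $built $went_back\n";
```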

=head2 STOCKTON TO DARLINGTON UNDER STREAM POWER

HTML::Zoom's execution always happens in terms of streams under the hood;
that is, the basic pattern for doing anything is:

  my $stream = get_stream_from_somewhere();

  while (my ($evt) = $stream->next) {
    # do something with the event
  }

More importantly, all selectors and filters are also built as stream
operations, so a selector and filter pair is effectively:

  my $next_evt = $self->parent_stream->next;
  if ($self->selector_matches($next_evt)) {
    return $self->apply_filter_to($next_evt);
  }
  return $next_evt;

Internally, things are marginally more complicated than that, but not enough
that you as a user should normally need to care.
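
For readers who want to see the wrapping pattern run, here is a small
self-contained sketch using closures as streams. The helper names
(C<stream_from_list>, C<filtered_stream>) and the event hash keys (C<type>,
C<name>, C<raw>) are assumptions made for illustration, not the library's
documented internals.

```perl
use strict;
use warnings;

# Build a stream from a list: each call returns the next event, or an
# empty list once the stream is exhausted.
sub stream_from_list {
  my @events = @_;
  return sub { @events ? shift @events : () };
}

# Wrap a parent stream: events matching $selector pass through $filter,
# everything else is handed on untouched.
sub filtered_stream {
  my ($parent, $selector, $filter) = @_;
  return sub {
    my ($evt) = $parent->() or return;
    return $selector->($evt) ? $filter->($evt) : $evt;
  };
}

my $stream = filtered_stream(
  stream_from_list(
    { type => 'OPEN',  name => 'div' },
    { type => 'TEXT',  raw  => 'hello' },
    { type => 'CLOSE', name => 'div' },
  ),
  sub { $_[0]{type} eq 'TEXT' },                    # "selector"
  sub { +{ %{ $_[0] }, raw => uc $_[0]{raw} } },    # "filter"
);

my @seen;
# List assignment in boolean context counts the values returned, so the
# loop ends when the stream returns an empty list.
while (my ($evt) = $stream->()) {
  push @seen, $evt->{type} eq 'TEXT'
    ? $evt->{raw}
    : "$evt->{type}:$evt->{name}";
}
print join(' ', @seen), "\n";
```

Stacking several C<filtered_stream> calls around one another gives one filter
layer per selector/filter pair, which is the shape the text describes.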

In fact, an HTML::Zoom object is mostly just a container for the relevant
information from which to build the final stream that does the real work. A
stream built from a Zoom object is a stream of events from parsing the
initial HTML, wrapped in a filter stream per selector/filter pair provided
as described above.

The upshot of this is that the application of filters works just as well on
streams as on the original Zoom object - in fact, when you run a
L<repeat_content|HTML::Zoom::FilterBuilder/repeat_content> operation your
subroutines are applied to the stream for that element of the repeat, rather
than constructing a new zoom per repeat element.

More concretely:

  $_->select('div')->replace_content('I AM A DIV!');

works on both HTML::Zoom objects themselves and HTML::Zoom stream objects and
shares enough of the implementation that you can generally forget the
difference - barring the fact that a stream already has state attached so
things like to_fh are no longer available.

=head2 POP! GOES THE WEASEL

... and by Weasel, I mean layout.

HTML::Zoom's filehandle object supports an additional event key, 'flush',
that is transparent to the rest of the system but tells the filehandle
object to end a getline operation at that point and return the HTML so far.

This means that in an environment where streaming output is available, such
as a number of the L<Plack> PSGI handlers, you can add the flush key to an
event in order to ensure that the HTML generated so far is flushed through
to the browser right now. This is especially useful when you know you're
about to call a web service or run a potentially slow database query: it
ensures that at least the header/layout of your page renders now, improving
perceived responsiveness while your application waits around for the
data it needs.

This is currently exposed by the 'flush_before' option to the collect filter,
which incidentally also underlies the replace and repeat filters, so to
indicate we want this behaviour to happen before a query is executed we can
write something like:

  $zoom->select('.item')->repeat(sub {
    if (my $row = $db_thing->next) {
      return sub { $_->select('.item-name')->replace_content($row->name) }
    } else {
      return
    }
  }, { flush_before => 1 });

which should have the desired effect given a sufficiently lazy $db_thing (for
example a L<DBIx::Class::ResultSet> object).
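
As a rough illustration of what a flush marker buys you, the sketch below
builds a getline-style closure that stops accumulating HTML as soon as it has
consumed an event carrying a C<flush> key. The C<make_getline> helper and the
C<raw> key are hypothetical, and whether the flagged event itself belongs to
the flushed chunk is an assumption made here for simplicity.

```perl
use strict;
use warnings;

# Return a getline-style closure over a list of events. Each call
# concatenates HTML until it has consumed an event marked with 'flush'
# or the events run out; it returns undef once nothing is left.
sub make_getline {
  my @events = @_;
  return sub {
    return unless @events;
    my $html = '';
    while (my $evt = shift @events) {
      $html .= $evt->{raw};
      last if $evt->{flush};
    }
    return $html;
  };
}

my $getline = make_getline(
  { raw => '<html><body><h1>Report</h1>', flush => 1 },
  { raw => '<ul><li>slow row</li></ul>' },
  { raw => '</body></html>' },
);

my @chunks;
while (defined(my $chunk = $getline->())) {
  push @chunks, $chunk;   # a PSGI writer would send each chunk immediately
}
print scalar(@chunks), " chunks\n";
```

The header comes back as a chunk of its own, so it can be sent to the browser
before the slower part of the page is produced.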

=head2 A FISTFUL OF OBJECTS

At the core of an HTML::Zoom system lurks an L<HTML::Zoom::ZConfig> object,
whose purpose is to hang on to the various bits and pieces that things need
so that there's a common way of accessing shared functionality.

Were I a computer scientist I would probably call this an "Inversion of
Control" object - which you'd be welcome to google to learn more about, or
you can just imagine a computer scientist being suspended upside down over
a pit. Either way works for me, I'm a pure maths grad.

The ZConfig object hangs on to one each of the following for you:

=over 4

=item * An HTML parser, normally L<HTML::Zoom::Parser::BuiltIn>

=item * An HTML producer (emitter), normally L<HTML::Zoom::Producer::BuiltIn>

=item * An object to build event filters, normally L<HTML::Zoom::FilterBuilder>

=item * An object to parse CSS selectors, normally L<HTML::Zoom::SelectorParser>

=item * An object to build streams, normally L<HTML::Zoom::StreamUtils>

=back

In theory you could replace any of these with anything you like, but in
practice you're probably best restricting yourself to subclasses, or at
least things that manage to look like the original if you squint a bit.

If you do something more clever than that, or find yourself overriding things
in your ZConfig a lot, please please tell us about it via one of the means
mentioned under L</SUPPORT>.

=head2 SEMANTIC DIDACTIC

Some will argue that overloading CSS selectors to do data stuff is a terrible
idea, and possibly even a step towards the "Concrete Javascript" pattern
(which I abhor) or Smalltalk's Morphic (which I ignore, except for the part
where it keeps reminding me of the late, great Tony Hart's plasticine friend).

To which I say, "eh", "meh", and possibly also "feh". If it really upsets
you, either use extra classes for this (and remove them afterwards) or
use special fake elements or, well, honestly, just use something different.
L<Template::Semantic> provides a similar idea to zoom except using XPath
and XML::LibXML transforms rather than a lightweight streaming approach -
maybe you'd like that better. Or maybe you really did want
L<Template Toolkit|Template> after all. It is still damn good at what it does,
after all.

So far, however, I've found that for new sites the designers I'm working with
generally want to produce nice semantic HTML with classes that represent the
nature of the data rather than the structure of the layout, so sharing them
as a common interface works really well for us.

In the absence of any evidence that overloading CSS selectors has killed
children or unexpectedly set fire to grandmothers - and given microformats
have been around for a while there's been plenty of opportunity for
octogenarian combustion - I'd suggest you give it a try and see if you like it.

=head2 GET THEE TO A SUMMARY!

HTML::Zoom is a lazy, stream oriented, streaming capable, mostly functional,
CSS selector based semantic templating engine for HTML and HTML-like
document formats.

But I said that already. Although hopefully by now you have some idea what I
meant when I said it. If you didn't have any idea the first time. I mean, I'm
not trying to call you stupid or anything. Just saying that maybe it wasn't
totally obvious without the explanation. Or something.

Maybe we should just move on to the method docs.

=head1 METHODS

=head2 new

  my $zoom = HTML::Zoom->new;

  my $zoom = HTML::Zoom->new({ zconfig => $zconfig });

Create a new empty Zoom object. You can optionally pass an
L<HTML::Zoom::ZConfig> instance if you're trying to override one or more of
the default components.

This method isn't often used directly since several other methods can also
act as constructors, notably L</select> and L</from_html>.

=head2 zconfig

  my $zconfig = $zoom->zconfig;

Retrieve the L<HTML::Zoom::ZConfig> instance used by this Zoom object. You
shouldn't usually need to call this yourself.

=head2 from_html

  my $zoom = HTML::Zoom->from_html($html);

  my $z2 = $z1->from_html($html);

Parses the HTML using the current zconfig's parser object and returns a new
zoom instance with that as the source HTML to be transformed.

=head2 from_file

  my $zoom = HTML::Zoom->from_file($file);

  my $z2 = $z1->from_file($file);

Convenience method - slurps the contents of $file and calls from_html with it.
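
The slurp itself relies on a compact Perl idiom that is worth spelling out.
The sketch below wraps it in a hypothetical C<slurp_file> helper (the name is
mine, not part of the module) and exercises it against a temporary file:
localising C<@ARGV> makes the diamond operator read from the named file, and
localising C<$/> leaves it undefined so the read is not split into lines.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Slurp a whole file via the diamond operator. In the list assignment
# @ARGV takes $filename and $/ is left undef, which turns off line
# splitting; both are restored when the do block exits.
sub slurp_file {
  my ($filename) = @_;
  return do { local (@ARGV, $/) = ($filename); <> };
}

# Write a two-line document to a temporary file, then read it back whole.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "<p>one</p>\n<p>two</p>\n";
close $fh or die "close failed: $!";

my $html = slurp_file($path);
print length($html), " characters read\n";
```

Calling the helper in scalar context, as above, returns the file as a single
string regardless of how many lines it contains.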

=head2 to_stream

  my $stream = $zoom->to_stream;

  while (my ($evt) = $stream->next) {
    ...
  }

Creates a stream, starting with a stream of the events from the HTML supplied
via L</from_html> and then wrapping it in turn with each selector+filter pair
that has been applied to the zoom object.

=head2 to_fh

  my $fh = $zoom->to_fh;

  call_something_expecting_a_filehandle($fh);

Returns an L<HTML::Zoom::ReadFH> instance that will create a stream the first
time its getline method is called and then return all HTML up to the next
event with 'flush' set.

You can pass this filehandle to compliant PSGI handlers (and probably most
web frameworks).

=head2 run

  $zoom->run;

Runs the zoom object's transforms without doing anything with the results.

Normally used to get side effects of a zoom run - for example when using
L<HTML::Zoom::FilterBuilder/collect> to slurp events for scraping or layout.

=head2 apply

  my $z2 = $z1->apply(sub {
    $_->select('div')->replace_content('I AM A DIV!')
  });

Sets $_ to the zoom object and then runs the provided code. Basically syntactic
sugar; the following is entirely equivalent:

  my $sub = sub {
    shift->select('div')->replace_content('I AM A DIV!')
  };

  my $z2 = $sub->($z1);

=head2 to_html

  my $html = $zoom->to_html;

Runs the zoom processing and returns the resulting HTML.

=head2 memoize

  my $z2 = $z1->memoize;

Creates a new zoom whose source HTML is the result of the original zoom's
processing. Effectively syntactic sugar for:

  my $z2 = HTML::Zoom->from_html($z1->to_html);

but preserves your L<HTML::Zoom::ZConfig> object.

=head2 with_filter

  my $zoom = HTML::Zoom->with_filter(
    'div', $filter_builder->replace_content('I AM A DIV!')
  );

  my $z2 = $z1->with_filter(
    'div', $filter_builder->replace_content('I AM A DIV!')
  );

A lower-level interface than L</select> for adding filters to your zoom object.

In normal usage, you probably don't need to call this yourself.

=head2 select

  my $zoom = HTML::Zoom->select('div')->replace_content('I AM A DIV!');

  my $z2 = $z1->select('div')->replace_content('I AM A DIV!');

Returns an intermediary object of the class L<HTML::Zoom::TransformBuilder>
on which methods of your L<HTML::Zoom::FilterBuilder> object can be called.

In normal usage you should generally always put the pair of method calls
together; the intermediary object isn't designed or expected to stick around.

=head2 then

  my $z2 = $z1->select('div')->add_attribute(class => 'spoon')
                             ->then
                             ->replace_content('I AM A DIV!');

Re-runs the previous select to allow you to chain actions together on the
same selector.

=head1 AUTHORS

=over 4

=item * Matt S. Trout

=back

=head1 LICENSE

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.