package HTML::Zoom;

use strict;
use warnings FATAL => 'all';

use HTML::Zoom::ZConfig;
use HTML::Zoom::MatchWithoutFilter;
use HTML::Zoom::ReadFH;
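
sub new {
  my ($class, $args) = @_;
  my $new = {};
  $new->{zconfig} = HTML::Zoom::ZConfig->new($args->{zconfig}||{});
  bless($new, $class);
}

sub zconfig { shift->_self_or_new->{zconfig} }

# Allow methods to be called on either the class or an instance.
sub _self_or_new {
  ref($_[0]) ? $_[0] : $_[0]->new
}

# Return a modified shallow copy rather than mutating the invocant.
sub _with {
  bless({ %{$_[0]}, %{$_[1]} }, ref($_[0]));
}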
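
sub from_html {
  my $self = shift->_self_or_new;
  $self->_with({
    initial_events => $self->zconfig->parser->html_to_events($_[0])
  });
}

sub from_file {
  my $self = shift->_self_or_new;
  my $filename = shift;
  # Slurp the whole file by localising @ARGV and $/ around <>.
  $self->from_html(do { local (@ARGV, $/) = ($filename); <> });
}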

# Build the event stream: start from the parsed events, then wrap it
# once per selector/filter pair, in the order the pairs were added.
sub to_stream {
  my $self = shift;
  die "No events to build from - forgot to call from_html?"
    unless $self->{initial_events};
  my $sutils = $self->zconfig->stream_utils;
  my $stream = $sutils->stream_from_array(@{$self->{initial_events}});
  foreach my $filter_spec (@{$self->{filters}||[]}) {
    $stream = $sutils->wrap_with_filter($stream, @{$filter_spec});
  }
  $stream
}
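
sub to_fh {
  HTML::Zoom::ReadFH->from_zoom(shift);
}

sub run {
  my $self = shift;
  $self->zconfig->stream_utils->stream_to_array($self->to_stream);
  return
}

sub apply {
  my ($self, $code) = @_;
  local $_ = $self;
  $self->$code;
}

sub to_html {
  my $self = shift;
  $self->zconfig->producer->html_from_stream($self->to_stream);
}

sub memoize {
  my $self = shift;
  ref($self)->new($self)->from_html($self->to_html);
}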

sub with_filter {
  my $self = shift->_self_or_new;
  my ($selector, $filter) = @_;
  my $match = $self->parse_selector($selector);
  $self->_with({
    filters => [ @{$self->{filters}||[]}, [ $match, $filter ] ]
  });
}

sub select {
  my $self = shift->_self_or_new;
  my ($selector) = @_;
  my $match = $self->parse_selector($selector);
  return HTML::Zoom::MatchWithoutFilter->construct(
    $self, $match, $self->zconfig->filter_builder,
  );
}

# There's a bug waiting to happen here: if you do something like
#
#   $zoom->select('.foo')
#        ->remove_attribute(class => 'foo')
#        ->then
#        ->well_anything_really
#
# the second action won't execute because it doesn't match anymore.
# Ideally instead we'd merge the match subs but that's more complex to
# implement so I'm deferring it for the moment.

sub then {
  my $self = shift;
  die "Can't call ->then without a previous filter"
    unless $self->{filters};
  $self->select($self->{filters}->[-1][0]);
}

sub parse_selector {
  my ($self, $selector) = @_;
  return $selector if ref($selector); # already a match sub
  $self->zconfig->selector_parser->parse_selector($selector);
}

1;

=head1 NAME

HTML::Zoom - selector based streaming template engine

=head1 SYNOPSIS

  use HTML::Zoom;

  my $template = <<HTML;
  <html>
    <head>
      <title>Hello people</title>
    </head>
    <body>
      <h1 id="greeting">Placeholder</h1>
      <div id="list">
        <span>
          <p>Name: <span class="name">Bob</span></p>
          <p>Age: <span class="age">23</span></p>
        </span>
        <hr class="between" />
      </div>
    </body>
  </html>
  HTML

  my $output = HTML::Zoom
    ->from_html($template)
    ->select('title, #greeting')->replace_content('Hello world & dog!')
    ->select('#list')->repeat_content(
        [
          sub {
            $_->select('.name')->replace_content('Matt')
              ->select('.age')->replace_content('26')
          },
          sub {
            $_->select('.name')->replace_content('Mark')
              ->select('.age')->replace_content('0x29')
          },
          sub {
            $_->select('.name')->replace_content('Epitaph')
              ->select('.age')->replace_content('<redacted>')
          },
        ],
        { repeat_between => '.between' }
      )
    ->to_html;

will produce:

=begin testinfo

  my $expect = <<HTML;

=end testinfo

  <html>
    <head>
      <title>Hello world &amp; dog!</title>
    </head>
    <body>
      <h1 id="greeting">Hello world &amp; dog!</h1>
      <div id="list">
        <span>
          <p>Name: <span class="name">Matt</span></p>
          <p>Age: <span class="age">26</span></p>
        </span>
        <hr class="between" />
        <span>
          <p>Name: <span class="name">Mark</span></p>
          <p>Age: <span class="age">0x29</span></p>
        </span>
        <hr class="between" />
        <span>
          <p>Name: <span class="name">Epitaph</span></p>
          <p>Age: <span class="age">&lt;redacted&gt;</span></p>
        </span>
      </div>
    </body>
  </html>

=begin testinfo

  HTML
  is($output, $expect, 'Synopsis code works ok');

=end testinfo

=head1 DANGER WILL ROBINSON

This is a 0.9 release. That means that I'm fairly happy the API isn't going
to change in surprising and upsetting ways before 1.0 and a real compatibility
freeze. But it also means that if it turns out there's a mistake the size of
a politician's ego in the API design that I haven't spotted yet there may be
a bit of breakage between here and 1.0. Hopefully not though. Appendages
crossed and all that.

Worse still, the rest of the distribution isn't documented yet. I'm sorry.
I suck. But lots of people have been asking me to ship this, docs or no, so
having got this class itself at least somewhat documented I figured now was
a good time to cut a first real release.

=head1 DESCRIPTION

HTML::Zoom is a lazy, stream oriented, streaming capable, mostly functional,
CSS selector based semantic templating engine for HTML and HTML-like
document formats.

Which is, on the whole, a bit of a mouthful. So let me step back a moment
and explain why you care enough to understand what I mean:

=head2 JQUERY ENVY

HTML::Zoom is the cure for jQuery envy. When your JavaScript guy pushes a
piece of data into a document by doing:

  $('.username').text(username);

In HTML::Zoom one can write

  $zoom->select('.username')->replace_content($username);

which is, I hope, almost as clear, hampered only by the fact that Zoom can't
assume a global document and therefore has nothing quite so simple as the
$() function to get the initial selection.

L<HTML::Zoom::SelectorParser> implements a subset of the jQuery selector
specification, and will continue to track that rather than the W3C standards
for the foreseeable future on grounds of pragmatism. Also on the grounds that
their spec is written in EN_US rather than EN_W3C, and I read the former much
better.

I am happy to admit that it's very, very much a subset at the moment - see the
L<HTML::Zoom::SelectorParser> POD for what's currently there, and expect more
and more to be supported over time as we need it and patch it in.

=head2 CLEAN TEMPLATES

HTML::Zoom is the cure for messy templates. How many times have you looked at
templates like this:

  <form action="/somewhere">
  [% FOREACH field IN fields %]
    <label for="[% field.id %]">[% field.label %]</label>
    <input name="[% field.name %]" type="[% field.type %]" value="[% field.value %]" />
  [% END %]
  </form>

and despaired of the fact that neither the HTML structure nor the logic are
remotely easy to read? Fortunately, with HTML::Zoom we can separate the two
cleanly:

  <form class="myform" action="/somewhere">
    <label />
    <input />
  </form>

  $zoom->select('.myform')->repeat_content([
    map { my $field = $_; sub {

     $_->select('label')
       ->add_attribute( for => $field->{id} )
       ->then
       ->replace_content( $field->{label} )

       ->select('input')
       ->add_attribute( name => $field->{name} )
       ->then
       ->add_attribute( type => $field->{type} )
       ->then
       ->add_attribute( value => $field->{value} )

    } } @fields
  ]);

This is, admittedly, very much not shorter. However, it makes it extremely
clear what's happening and therefore less hassle to maintain. Especially
because it allows the designer to fiddle with the HTML without cutting
himself on sharp ELSE clauses, and the developer to add available data to
the template without getting angle bracket cuts on sensitive parts.

Better still, HTML::Zoom knows that it's inserting content into HTML and
can escape it for you - the example template should really have been:

  <form action="/somewhere">
  [% FOREACH field IN fields %]
    <label for="[% field.id | html %]">[% field.label | html %]</label>
    <input name="[% field.name | html %]" type="[% field.type | html %]" value="[% field.value | html %]" />
  [% END %]
  </form>

and frankly I'll take slightly more code any day over *that* crawling horror.

(addendum: I pick on L<Template Toolkit|Template> here specifically because
it's the template system I hate the least - for text templating, I don't
honestly think I'll ever like anything except the next version of Template
Toolkit better - but HTML isn't text. Zoom knows that. Do you?)

=head2 PUTTING THE FUN INTO FUNCTIONAL

The principle of HTML::Zoom is to provide a reusable, functional container
object that lets you build up a set of transforms to be applied; every method
call you make on a zoom object returns a new object, so it's safe to do so
on one somebody else gave you without worrying about altering state (with
the notable exception of ->next for stream objects, which I'll come to later).

So:

  my $z2 = $z1->select('.name')->replace_content($name);

  my $z3 = $z2->select('.title')->replace_content('Ms.');

each time produces a new Zoom object. If you want to package up a set of
transforms to re-use, HTML::Zoom provides an 'apply' method:

  my $add_name = sub { $_->select('.name')->replace_content($name) };

  my $same_as_z2 = $z1->apply($add_name);
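
To make the "every call returns a new object" idea concrete, here is a minimal
sketch in plain Perl. ToyZoom and its with_filter method are hypothetical
stand-ins invented for illustration and are not part of HTML::Zoom; the sketch
only shows the copy-instead-of-mutate pattern and how an apply-style method
can set $_ before running your code.

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT HTML::Zoom itself: it only illustrates the
# "every method returns a new object" pattern.
package ToyZoom;

sub new { my ($class, %args) = @_; bless({ filters => [], %args }, $class) }

# Build a modified shallow copy instead of mutating $self.
sub _with {
  my ($self, $changes) = @_;
  bless({ %$self, %$changes }, ref($self));
}

# Append a filter name to a fresh copy of the filter list.
sub with_filter {
  my ($self, $name) = @_;
  $self->_with({ filters => [ @{ $self->{filters} }, $name ] });
}

# Set $_ to the object, run the code, and return whatever it returns.
sub apply {
  my ($self, $code) = @_;
  local $_ = $self;
  $self->$code;
}

package main;

my $z1 = ToyZoom->new;
my $z2 = $z1->with_filter('name');
my $z3 = $z1->apply(sub { $_->with_filter('name') });

print scalar(@{ $z1->{filters} }), "\n";     # $z1 is untouched: prints 0
print join(',', @{ $z2->{filters} }), "\n";  # prints name
print join(',', @{ $z3->{filters} }), "\n";  # prints name
```

Because nothing is ever modified in place, handing $z1 to other code is safe:
whatever that code builds, your own copy stays exactly as it was.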

=head2 LAZINESS IS A VIRTUE

HTML::Zoom does its best to defer doing anything until it's absolutely
required. The only point at which it descends into state is when you force
it to create a stream, directly by:

  my $stream = $zoom->to_stream;

  while (my ($evt) = $stream->next) {
    # handle zoom event here
  }

or indirectly via:

  my $final_html = $zoom->to_html;

  my $fh = $zoom->to_fh;

  while (defined(my $chunk = $fh->getline)) {
    ...
  }

Better still, the $fh returned doesn't create its stream until the first
call to getline, which means that until you call that and force it to be
stateful you can get back to the original stateless Zoom object via:

  my $zoom = $fh->to_zoom;

which is exceedingly handy for filtering L<Plack> PSGI responses, among other
things.

Because HTML::Zoom doesn't try and evaluate everything up front, you can
generally put things together in whatever order is most appropriate. This
means that:

  my $start = HTML::Zoom->from_html($html);

  my $zoom = $start->select('div')->replace_content('THIS IS A DIV!');

and:

  my $start = HTML::Zoom->select('div')->replace_content('THIS IS A DIV!');

  my $zoom = $start->from_html($html);

will produce equivalent final $zoom objects, thus proving that there can be
more than one way to do it without one of them being a
L<bait and switch|Switch>.

=head2 STOCKTON TO DARLINGTON UNDER STREAM POWER

HTML::Zoom's execution always happens in terms of streams under the hood
- that is, the basic pattern for doing anything is -

  my $stream = get_stream_from_somewhere();

  while (my ($evt) = $stream->next) {
    # do something with the event
  }

More importantly, all selectors and filters are also built as stream
operations, so a selector and filter pair is effectively:

  sub next {
    my ($self) = @_;
    my $next_evt = $self->parent_stream->next;
    if ($self->selector_matches($next_evt)) {
      return $self->apply_filter_to($next_evt);
    } else {
      return $next_evt;
    }
  }

Internally, things are marginally more complicated than that, but not enough
that you as a user should normally need to care.
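
As a rough, self-contained sketch of that wrapping idea, the following models
a stream as a closure. The helper names stream_from_array and wrap_with_filter
are borrowed from the calls made on the stream utils object, but these
implementations, and the event keys C<type>, C<name> and C<raw>, are
assumptions made up for illustration rather than the real internals.

```perl
use strict;
use warnings;

# Illustrative stand-ins only; HTML::Zoom's real stream classes differ.

# A stream is modelled as a closure returning the next event,
# or an empty list once it is exhausted.
sub stream_from_array {
  my @events = @_;
  return sub { @events ? shift(@events) : () };
}

# Wrap a parent stream: events for which $match returns true are passed
# through $filter, everything else is handed on unchanged.
sub wrap_with_filter {
  my ($parent, $match, $filter) = @_;
  return sub {
    my ($evt) = $parent->() or return;
    return $match->($evt) ? $filter->($evt) : $evt;
  };
}

my $stream = stream_from_array(
  { type => 'OPEN',  name => 'div'   },
  { type => 'TEXT',  raw  => 'hello' },
  { type => 'CLOSE', name => 'div'   },
);

# One selector/filter pair: upper-case every text event.
$stream = wrap_with_filter(
  $stream,
  sub { $_[0]{type} eq 'TEXT' },
  sub { +{ %{ $_[0] }, raw => uc $_[0]{raw} } },
);

my @seen;
while (my ($evt) = $stream->()) {
  push @seen, $evt->{type} eq 'TEXT' ? $evt->{raw} : "$evt->{type}:$evt->{name}";
}
print join(' ', @seen), "\n";   # prints: OPEN:div HELLO CLOSE:div
```

Nothing is pulled from the parent stream until somebody asks the outermost
wrapper for an event, which is where the laziness comes from.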

In fact, an HTML::Zoom object is mostly just a container for the relevant
information from which to build the final stream that does the real work. A
stream built from a Zoom object is a stream of events from parsing the
initial HTML, wrapped in a filter stream per selector/filter pair provided
in turn.

The upshot of this is that the application of filters works just as well on
streams as on the original Zoom object - in fact, when you run a
L</repeat_content> operation your subroutines are applied to the stream for
that element of the repeat, rather than constructing a new zoom per repeat
element as well.

More concretely:

  $_->select('div')->replace_content('I AM A DIV!');

works on both HTML::Zoom objects themselves and HTML::Zoom stream objects and
shares enough of the implementation that you can generally forget the
difference - barring the fact that a stream already has state attached so
things like to_fh are no longer available.

=head2 POP! GOES THE WEASEL

... and by Weasel, I mean layout.

HTML::Zoom's filehandle object supports an additional event key, 'flush',
that is transparent to the rest of the system but indicates to the filehandle
object to end a getline operation at that point and return the HTML so far.

This means that in an environment where streaming output is available, such
as a number of the L<Plack> PSGI handlers, you can add the flush key to an
event in order to ensure that the HTML generated so far is flushed through
to the browser right now. This can be especially useful if you know you're
about to call a web service, run a potentially slow database query or
similar: it ensures that at least the header/layout of your page renders now,
improving perceived responsiveness while your application waits around for
the data.

This is currently exposed by the 'flush_before' option to the collect filter,
which incidentally also underlies the replace and repeat filters, so to
indicate we want this behaviour to happen before a query is executed we can
write something like:

  $zoom->select('.item')->repeat(sub {
    if (my $row = $db_thing->next) {
      return sub { $_->select('.item-name')->replace_content($row->name) }
    } else {
      return
    }
  }, { flush_before => 1 });

which should have the desired effect given a sufficiently lazy $db_thing (for
example a L<DBIx::Class::ResultSet> object).
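
To illustrate how a flush marker can split output into chunks, here is a
small sketch in plain Perl. Everything in it is an assumption for the sake of
the example: make_getline is a hypothetical helper, events are reduced to
hashes carrying an C<html> string, and a C<flush> flag is taken to mean "end
the current chunk before this event". The real filehandle class works on
parser events and may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the real HTML::Zoom::ReadFH: each event is a
# hash with an 'html' string and an optional 'flush' flag.
sub make_getline {
  my @events = @_;
  return sub {
    return undef unless @events;   # end of stream
    my $out = '';
    while (@events) {
      # A flush marker on the next event ends the chunk before that event,
      # provided there is already something to return.
      last if $events[0]{flush} && length $out;
      $out .= shift(@events)->{html};
    }
    return $out;
  };
}

my $getline = make_getline(
  { html => '<html><body>' },
  { html => '<h1>Layout</h1>' },
  { html => '<ul>', flush => 1 },   # imagine the slow query starts here
  { html => '<li>row</li>' },
  { html => '</ul></body></html>' },
);

my @chunks;
while (defined(my $chunk = $getline->())) {
  push @chunks, $chunk;
}

print scalar(@chunks), " chunks\n";   # prints: 2 chunks
print "$_\n" for @chunks;
```

The first chunk contains only the layout, so a streaming server can send it
to the browser before the second, slower chunk has been produced.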

=head2 A FISTFUL OF OBJECTS

At the core of an HTML::Zoom system lurks an L<HTML::Zoom::ZConfig> object,
whose purpose is to hang on to the various bits and pieces that things need
so that there's a common way of accessing shared functionality.

Were I a computer scientist I would probably call this an "Inversion of
Control" object - which you'd be welcome to google to learn more about, or
you can just imagine a computer scientist being suspended upside down over
a pit. Either way works for me, I'm a pure maths grad.

The ZConfig object hangs on to one each of the following for you:

=over 4

=item * An HTML parser, normally L<HTML::Zoom::Parser::BuiltIn>

=item * An HTML producer (emitter), normally L<HTML::Zoom::Producer::BuiltIn>

=item * An object to build event filters, normally L<HTML::Zoom::FilterBuilder>

=item * An object to parse CSS selectors, normally L<HTML::Zoom::SelectorParser>

=item * An object to build streams, normally L<HTML::Zoom::StreamUtils>

=back

In theory you could replace any of these with anything you like, but in
practice you're probably best restricting yourself to subclasses, or at
least things that manage to look like the original if you squint a bit.

If you do something more clever than that, or find yourself overriding things
in your ZConfig a lot, please please tell us about it via one of the means
mentioned under L</SUPPORT>.

=head2 SEMANTIC DIDACTIC

Some will argue that overloading CSS selectors to do data stuff is a terrible
idea, and possibly even a step towards the "Concrete Javascript" pattern
(which I abhor) or Smalltalk's Morphic (which I ignore, except for the part
where it keeps reminding me of the late, great Tony Hart's plasticine friend).

To which I say, "eh", "meh", and possibly also "feh". If it really upsets
you, either use extra classes for this (and remove them afterwards) or
use special fake elements or, well, honestly, just use something different.
L<Template::Semantic> provides a similar idea to zoom except using XPath
and XML::LibXML transforms rather than a lightweight streaming approach -
maybe you'd like that better. Or maybe you really did want
L<Template Toolkit|Template> after all. It is still damn good at what it does,
after all.

So far, however, I've found that for new sites the designers I'm working with
generally want to produce nice semantic HTML with classes that represent the
nature of the data rather than the structure of the layout, so sharing them
as a common interface works really well for us.

In the absence of any evidence that overloading CSS selectors has killed
children or unexpectedly set fire to grandmothers - and given microformats
have been around for a while there's been plenty of opportunity for
octogenarian combustion - I'd suggest you give it a try and see if you like it.

=head2 GET THEE TO A SUMMARY!

HTML::Zoom is a lazy, stream oriented, streaming capable, mostly functional,
CSS selector based semantic templating engine for HTML and HTML-like
document formats.

But I said that already. Although hopefully by now you have some idea what I
meant when I said it. If you didn't have any idea the first time. I mean, I'm
not trying to call you stupid or anything. Just saying that maybe it wasn't
totally obvious without the explanation. Or something.

Maybe we should just move on to the method docs.

=head1 METHODS

=head2 new

  my $zoom = HTML::Zoom->new;

  my $zoom = HTML::Zoom->new({ zconfig => $zconfig });

Create a new empty Zoom object. You can optionally pass an
L<HTML::Zoom::ZConfig> instance if you're trying to override one or more of
the default components.

This method isn't often used directly since several other methods can also
act as constructors, notably L</select> and L</from_html>.

=head2 zconfig

  my $zconfig = $zoom->zconfig;

Retrieve the L<HTML::Zoom::ZConfig> instance used by this Zoom object. You
shouldn't usually need to call this yourself.

=head2 from_html

  my $zoom = HTML::Zoom->from_html($html);

  my $z2 = $z1->from_html($html);

Parses the HTML using the current zconfig's parser object and returns a new
zoom instance with that as the source HTML to be transformed.

=head2 from_file

  my $zoom = HTML::Zoom->from_file($file);

  my $z2 = $z1->from_file($file);

Convenience method - slurps the contents of $file and calls from_html with it.
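
The slurp relies on a compact Perl idiom: localising @ARGV and $/ together
and then reading from the magic <> operator. The standalone sketch below
shows the idiom on its own; slurp_file is a hypothetical helper name used
only for this example, and the temporary file exists purely so the snippet
can run by itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper demonstrating the slurp idiom in isolation.
sub slurp_file {
  my ($filename) = @_;
  # The list assignment puts $filename into @ARGV and leaves $/ undefined,
  # so a single <> read returns the entire file. Both variables are
  # restored when the do block exits.
  return do { local (@ARGV, $/) = ($filename); <> };
}

# Create a small file to read back.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "<p>one</p>\n<p>two</p>\n";
close $fh or die "close failed: $!";

my $html = slurp_file($path);
print length($html), "\n";   # prints 22
```

Since <> opens whatever is in @ARGV, the filename is treated the same way a
command line argument would be, which is worth bearing in mind if the name
comes from an untrusted source.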

=head2 to_stream

  my $stream = $zoom->to_stream;

  while (my ($evt) = $stream->next) {
    # handle zoom event here
  }

Creates a stream, starting with a stream of the events from the HTML supplied
via L</from_html> and then wrapping it in turn with each selector+filter pair
that has been applied to the zoom object.

=head2 to_fh

  my $fh = $zoom->to_fh;

  call_something_expecting_a_filehandle($fh);

Returns an L<HTML::Zoom::ReadFH> instance that will create a stream the first
time its getline method is called and then return all HTML up to the next
event with 'flush' set.

You can pass this filehandle to compliant PSGI handlers (and probably most
web frameworks).

=head2 run

  $zoom->run;

Runs the zoom object's transforms without doing anything with the results.

Normally used to get side effects of a zoom run - for example when using
L<HTML::Zoom::FilterBuilder/collect> to slurp events for scraping or layout.

=head2 apply

  my $z2 = $z1->apply(sub {
    $_->select('div')->replace_content('I AM A DIV!')
  });

Sets $_ to the zoom object and then runs the provided code. Basically syntax
sugar, the following is entirely equivalent:

  my $sub = sub {
    shift->select('div')->replace_content('I AM A DIV!')
  };

  my $z2 = $sub->($z1);

=head2 to_html

  my $html = $zoom->to_html;

Runs the zoom processing and returns the resulting HTML.

=head2 memoize

  my $z2 = $z1->memoize;

Creates a new zoom whose source HTML is the results of the original zoom's
processing. Effectively syntax sugar for:

  my $z2 = HTML::Zoom->from_html($z1->to_html);

but preserves your L<HTML::Zoom::ZConfig> object.

=head2 with_filter

  my $zoom = HTML::Zoom->with_filter(
    'div', $filter_builder->replace_content('I AM A DIV!')
  );

  my $z2 = $z1->with_filter(
    'div', $filter_builder->replace_content('I AM A DIV!')
  );

Lower level interface than L</select> to adding filters to your zoom object.

In normal usage, you probably don't need to call this yourself.

=head2 select

  my $zoom = HTML::Zoom->select('div')->replace_content('I AM A DIV!');

  my $z2 = $z1->select('div')->replace_content('I AM A DIV!');

Returns an intermediary object of the class L<HTML::Zoom::MatchWithoutFilter>
on which methods of your L<HTML::Zoom::FilterBuilder> object can be called.

In normal usage you should generally always put the pair of method calls
together; the intermediary object isn't designed or expected to stick around.

=head2 then

  my $z2 = $z1->select('div')->add_attribute(class => 'spoon')
                             ->then
                             ->replace_content('I AM A DIV!');

Re-runs the previous select to allow you to chain actions together on the
same selector.

=head2 parse_selector

  my $matcher = $zoom->parse_selector('div');

Used by L</select> and L</with_filter> to invoke the current
L<HTML::Zoom::SelectorParser> object to create a matcher object (currently
a coderef but this is an implementation detail) for that selector.

In normal usage, you probably don't need to call this yourself.