jQuery-like selectors for PHP DOMDocument

You can use the Symfony DomCrawler component, which lets you traverse the DOM with CSS selectors: https://packagist.org/packages/symfony/dom-crawler


I created a library that allows you to crawl HTML5 and XML documents just like you do with jQuery.

You can find the library on GitHub.

It should allow you to do exactly what you want!

Under the hood, it uses symfony/DomCrawler to convert CSS selectors to XPath expressions. It always reuses the same DOMDocument, even when passing one object to another, to keep performance decent.
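
To give a rough idea of what converting a CSS selector to XPath means, here is a minimal sketch using only PHP's built-in DOM extension. The XPath expression is hand-written for the selector div.foo and is an assumption for illustration; the expression the library actually generates may look different. The sample HTML is made up.

```php
<?php
// Illustrative only: a hand-written XPath equivalent of the CSS selector "div.foo".
// The sample markup is hypothetical.
$html = '<html><body>'
      . '<div class="foo bar">A</div>'
      . '<div class="foobar">B</div>'
      . '<p class="foo">C</p>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// "div.foo" means: div elements whose class list contains the token "foo".
// Padding the class value with spaces avoids matching "foobar".
$expression = "//div[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]";
$matches = $xpath->query($expression);

foreach ($matches as $node) {
    echo $node->textContent, PHP_EOL;
}
```

Only the first div matches here: the second one has the class foobar rather than foo, and the p element is not a div.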

The library also includes its own zero-config autoloader for PSR-0-compatible libraries. The included example should work out of the box without any additional configuration.


Example use:

namespace PowerTools;

// Get file content
$htmlcode = file_get_contents( 'https://github.com' );

// Create a DOM_Query instance from the HTML string
$H = new DOM_Query( $htmlcode );

// Create a DOM_Query instance from an existing DOM_Query selection
$H = new DOM_Query( $H->select('body') );

// Passing a string (CSS selector)
$s = $H->select( 'div.foo' );

// Passing an element object (DOM Element)
$s = $H->select( $documentBody );

// Passing a DOM Query object
$s = $H->select( $H->select('p + p') );

// Select the body tag
$body = $H->select('body');

// Combine different classes as one selector to get all site blocks
$siteblocks = $body->select('.site-header, .masthead, .site-body, .site-footer');

// Chain your methods just like you would with jQuery
$siteblocks->select('button')->add('span')->addClass('icon icon-printer');

// Use a lambda function to set the text of all site blocks
$siteblocks->text(function( $i, $val) {
    return $i . " - " . $val->attr('class');
});

// Append the following HTML to all site blocks
$siteblocks->append('<div class="site-center"></div>');

// Use a child selector to select the site's footer
$sitefooter = $body->select('.site-footer > .site-center');

// Set some attributes for the site's footer
$sitefooter->attr(array('id' => 'aweeesome', 'data-val' => 'see'));

// Use a lambda function to set the attributes of all site blocks
$siteblocks->attr('data-val', function( $i, $val) {
    return $i . " - " . $val->attr('class') . " - photo by Kelly Clark";
});

// Select the parent of the site's footer
$sitefooterparent = $sitefooter->parent();

// Remove the class attribute from all <i> tags within the parent of the site's footer
$sitefooterparent->select('i')->removeAttr('class');

// Wrap the site's footer within two new elements
$sitefooter->wrap('<section><div class="footer-wrapper"></div></section>');

[...]

Supported methods:
  • [x] $ (1)
  • [x] $.parseHTML
  • [x] $.parseXML
  • [x] $.parseJSON
  • [x] $selection.add
  • [x] $selection.addClass
  • [x] $selection.after
  • [x] $selection.append
  • [x] $selection.attr
  • [x] $selection.before
  • [x] $selection.children
  • [x] $selection.closest
  • [x] $selection.contents
  • [x] $selection.detach
  • [x] $selection.each
  • [x] $selection.eq
  • [x] $selection.empty (2)
  • [x] $selection.find
  • [x] $selection.first
  • [x] $selection.get
  • [x] $selection.insertAfter
  • [x] $selection.insertBefore
  • [x] $selection.last
  • [x] $selection.parent
  • [x] $selection.parents
  • [x] $selection.remove
  • [x] $selection.removeAttr
  • [x] $selection.removeClass
  • [x] $selection.text
  • [x] $selection.wrap

  1. Renamed to 'select', for obvious reasons
  2. Renamed to 'void', since 'empty' is a reserved word in PHP

If you want to manipulate the DOM à la jQuery, phpQuery is the library for you.

http://code.google.com/p/phpquery/

A simple example of what you can do with it.

// almost everything can be a chain
$li = null;
$doc['ul > li']
        ->addClass('my-new-class')
        ->filter(':last')
                ->addClass('last-li');

Take a look at PHP's DOMXPath class. It uses XPath, so you'll need to read up on XPath syntax if you're unfamiliar with it. There's some documentation on MSDN, or you can read the W3C spec if you're particularly brave.

To solve your example problem: //cube[@currency] is an XPath query that selects all cube elements in the document that have a currency attribute. Using it with the DOMXPath class would look like this:

$xpath = new DOMXPath($myDomDocument);
$cubesWithCurrencies = $xpath->query('//cube[@currency]');

$cubesWithCurrencies is now a DOMNodeList that you can iterate over.
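
As a minimal end-to-end sketch, the snippet below loads a small XML string, runs the same query, and collects the results into an array. The sample XML, the rate attribute, and the values are hypothetical and only there to make the example self-contained; your real document will differ.

```php
<?php
// Hypothetical sample data; only the cube element and currency attribute
// come from the query above, the rest is made up for illustration.
$xml = <<<XML
<rates>
    <cube time="2024-01-02">
        <cube currency="USD" rate="1.0956"/>
        <cube currency="JPY" rate="155.33"/>
    </cube>
</rates>
XML;

$myDomDocument = new DOMDocument();
$myDomDocument->loadXML($xml);

$xpath = new DOMXPath($myDomDocument);
$cubesWithCurrencies = $xpath->query('//cube[@currency]');

$rates = [];
foreach ($cubesWithCurrencies as $cube) {
    // Each matched node is a DOMElement, so getAttribute() is available.
    $rates[$cube->getAttribute('currency')] = (float) $cube->getAttribute('rate');
}

print_r($rates);
```

The outer cube element is skipped because it has no currency attribute. Note that XPath is case-sensitive, and if your real document declares a default XML namespace, an unprefixed name such as cube will not match; in that case you would likely need to register a prefix with DOMXPath::registerNamespace() and use it in the query.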