How to use org.w3c.dom.NodeList with Java 8 Stream API?

Considering the introduction of default methods to ensure backwards compatibility, I fail to understand why this interface doesn't have a stream() method.

The interface was defined before Java 8 existed. Since Stream did not exist prior to Java 8, it follows that NodeList could not support it.

How do I use a NodeList in combination with the Stream API?

You can't. At least, not directly.

If it's discouraged to do so, what are the reasons for that?

It is not "discouraged". Rather, it is not supported.

The primary reason is history. See above.

It is possible that the people responsible for specifying the org.w3c.dom APIs for Java (i.e. the W3 Consortium) will bring out a new edition of the APIs that is friendlier to Java 8. However, that would introduce a number of new compatibility issues. The new edition of the APIs would not be binary compatible with the current ones, and would not be compatible with pre-Java 8 JVMs.


However, it is more complicated than just getting the W3 Consortium to update the APIs.

The DOM APIs are defined in CORBA IDL, and the Java APIs are "generated" by applying the CORBA Java mapping to the IDL. This mapping is specified by the OMG ... not the W3 Consortium. So creating a "Java 8 Stream friendly" version of the org.w3c.dom APIs would entail either getting the OMG to update the CORBA Java mapping to be Stream aware (which would be problematic from a CORBA compatibility perspective, at least) or breaking the connection between the Java API and CORBA.

Unfortunately, finding out what (if anything) is happening in the OMG world on refreshing the IDL to Java mapping is difficult ... unless you work for an OMG member organization, etcetera. I don't.


With Java 8 you can use Stream.iterate, like this:

    Stream.iterate(0, i -> i + 1)
          .limit(nodeList.getLength())
          .map(nodeList::item).forEach...

Java 9 added an overload of iterate that takes a stop condition, so limit is no longer needed. The original two-argument version is still available:

    Stream.iterate(0, i -> i < nodeList.getLength(), i -> i + 1)
          .map(nodeList::item).forEach...

Both versions of iterate are unchanged as of Java 14.
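
To make the two iterate variants concrete, here is a minimal, self-contained sketch. The class name, the `items` helper and the `<item>` tag are made up for illustration, and the three-argument overload needs Java 9 or later.

```java
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IterateOverNodeList {

    // Hypothetical helper: parses the XML string and returns all <item> elements.
    static NodeList items(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("item");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Java 8: iterate is unbounded, so limit() cuts it off at the list length.
    static List<String> textsJava8(NodeList nodeList) {
        return Stream.iterate(0, i -> i + 1)
                .limit(nodeList.getLength())
                .map(nodeList::item)
                .map(Node::getTextContent)
                .collect(Collectors.toList());
    }

    // Java 9+: the three-argument iterate carries its own stop condition.
    static List<String> textsJava9(NodeList nodeList) {
        return Stream.iterate(0, i -> i < nodeList.getLength(), i -> i + 1)
                .map(nodeList::item)
                .map(Node::getTextContent)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        NodeList nodeList = items("<root><item>a</item><item>b</item><item>c</item></root>");
        System.out.println(textsJava8(nodeList)); // prints [a, b, c]
        System.out.println(textsJava9(nodeList)); // prints [a, b, c]
    }
}
```

Both variants box every index into an Integer; IntStream.range avoids that, which may matter for large lists.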


Here is an example where a stream is used to find a specific NodeList element:

import java.util.stream.IntStream;
import java.util.stream.Stream;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

private String elementOfInterest;       // id of element
private String elementOfInterestValue;  // value of element

public boolean searchForElementOfInterest(Document doc)
{
        NodeList nList = doc.getElementsByTagName("entity");

        // NodeList has no stream() method, so build the stream from its indices
        Stream<Node> nodeStream = IntStream.range(0, nList.getLength()).mapToObj(nList::item);
        // anyMatch stops at the first hit; the stream is kept sequential because
        // DOM implementations are not guaranteed to be thread-safe
        return nodeStream.anyMatch(this::isElementOfInterest);
}

private boolean isElementOfInterest(Node nNode)
{
        if (nNode == null || nNode.getNodeType() != Node.ELEMENT_NODE)
                return false;
        Element eElement = (Element) nNode;
        // item(0) returns null when the entity has no such child element
        Node id = eElement.getElementsByTagName("id").item(0);
        Node data = eElement.getElementsByTagName("data").item(0);
        return id != null && data != null
                && id.getTextContent().equals(elementOfInterest)
                && data.getTextContent().equals(elementOfInterestValue);
}

The DOM is a strange beast. The API is defined in a language-independent way by the W3C and then mapped into various programming languages, so Java can't add anything Java-specific to the core DOM interfaces that wasn't part of the DOM spec in the first place.

So while you can't use a NodeList as a stream, you can easily create a stream from a NodeList, using e.g.

Stream<Node> nodeStream = IntStream.range(0, nodeList.getLength())
                                   .mapToObj(nodeList::item);
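
If you need this in more than one place, the one-liner can be wrapped in a small utility. The following is only a sketch: the class `NodeListStreams` and its methods are hypothetical names, not part of any library.

```java
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListStreams {

    // Hypothetical utility: a sequential stream over the nodes of a NodeList.
    static Stream<Node> stream(NodeList nodeList) {
        return IntStream.range(0, nodeList.getLength()).mapToObj(nodeList::item);
    }

    // Example use: names of the element children of the document root.
    static List<String> childElementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return stream(doc.getDocumentElement().getChildNodes())
                    .filter(Element.class::isInstance) // skip text nodes, comments, etc.
                    .map(Node::getNodeName)
                    .collect(Collectors.toList());
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The text node between <a/> and <b/> is filtered out.
        System.out.println(childElementNames("<root><a/>text<b/></root>")); // prints [a, b]
    }
}
```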

However, there is one big caveat - a DOM NodeList is live, and reflects changes to the original DOM tree since the list was created. If you add or remove elements in the DOM tree they may magically appear or disappear from existing NodeLists, and this may cause strange effects if this happens mid-iteration. If you want a "dead" node list you'll need to copy it into an array or list, as you are doing already.
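
The difference between a live list and a copied one can be seen in a short experiment. This is a sketch that assumes the JDK's built-in DOM parser and a document with at least one `<item>` element; the class and method names are made up for the example.

```java
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LiveNodeListDemo {

    // Returns { live size before removal, live size after removal, snapshot size }.
    // Assumes the XML contains at least one <item> element.
    static int[] sizesAfterRemoval(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList live = doc.getElementsByTagName("item");
            int liveBefore = live.getLength();

            // Copy the nodes into an ordinary List: a "dead" snapshot.
            List<Node> snapshot = IntStream.range(0, live.getLength())
                    .mapToObj(live::item)
                    .collect(Collectors.toList());

            // Detach the first <item> from the tree.
            Node first = live.item(0);
            first.getParentNode().removeChild(first);

            // The live list follows the tree; the snapshot keeps its contents.
            return new int[] { liveBefore, live.getLength(), snapshot.size() };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterRemoval("<root><item/><item/><item/></root>");
        System.out.println("live before: " + sizes[0]);
        System.out.println("live after:  " + sizes[1]);
        System.out.println("snapshot:    " + sizes[2]);
    }
}
```

Because a stream built from a NodeList is lazy, the safest pattern is to collect the snapshot first and only then modify the tree.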
