find words in html page with javascript

To find the element that word exists in, you'd have to traverse the entire tree looking in just the text nodes, applying the same test as above. Once you find the word in a text node, return the parent of that node.

var word = "foo",
    queue = [document.body],
    curr
;
while (curr = queue.pop()) {
    if (!curr.textContent.match(word)) continue;
    for (var i = 0; i < curr.childNodes.length; ++i) {
        switch (curr.childNodes[i].nodeType) {
            case Node.TEXT_NODE : // 3
                if (curr.childNodes[i].textContent.match(word)) {
                    console.log("Found!");
                    console.log(curr);
                    // you might want to end your search here.
                }
                break;
            case Node.ELEMENT_NODE : // 1
                queue.push(curr.childNodes[i]);
                break;
        }
    }
}

this works in Firefox, no promises for IE.

What it does is start with the body element and check to see if the word exists inside that element. If it doesn't, then that's it, and the search stops there. If it is in the body element, then it loops through all the immediate children of the body. If it finds a text node, then see if the word is in that text node. If it finds an element, then push that into the queue. Keep on going until you've either found the word or there's no more elements to search.


You can try using XPath, it's fast and accurate

http://www.w3schools.com/Xpath/xpath_examples.asp

Also if XPath is a bit more complicated, then you can try any javascript library like jQuery that hides the boilerplate code and makes it easier to express about what you want found.

Also, as from IE8 and the next Firefox 3.5 , there is also Selectors API implemented. All you need to do is use CSS to express what to search for.


You can iterate through DOM elements, looking for a substring within them. Neither fast nor elegant, but for small HTML might work well enough.

I'd try something recursive, like: (code not tested)

findText(node, text) {
  if(node.childNodes.length==0) {//leaf node
   if(node.textContent.indexOf(text)== -1) return [];
   return [node];
  }
  var matchingNodes = new Array();
  for(child in node.childNodes) {
    matchingNodes.concat(findText(child, text));
  }
  return matchingNodes;
}