Drupal - Finding nodes that have not been indexed

Whether a node has been indexed is recorded in the search_dataset table. This table stores each item's keyword blob along with its sid, the search item ID (for node search, that is the nid). Joining it against the node table should let you see which nodes aren't indexed.
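As a rough sketch of that join, the snippet below runs the query against a throwaway in-memory SQLite database so it can be tried without touching a real site. The table layouts are deliberately simplified, and the 'node_search' value for search_dataset.type is an assumption based on Drupal 8+ (older versions may use a different value), so check your own schema before running the SQL against a real database. The helper names find_unindexed and build_demo_db are made up for illustration.

```python
import sqlite3

# Nodes that have no matching row in search_dataset are the unindexed ones.
# The LEFT JOIN keeps every node; the IS NULL filter keeps only those
# for which no search_dataset row was found.
UNINDEXED_SQL = """
SELECT n.nid
FROM node AS n
LEFT JOIN search_dataset AS sd
  ON sd.sid = n.nid AND sd.type = :search_type
WHERE sd.sid IS NULL
ORDER BY n.nid
"""


def find_unindexed(conn, search_type="node_search"):
    """Return the nids that have no search_dataset row of the given type.

    search_type is an assumption: adjust it to whatever your site stores
    in search_dataset.type for node search.
    """
    rows = conn.execute(UNINDEXED_SQL, {"search_type": search_type}).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create a tiny stand-in for the two tables (simplified columns only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE search_dataset (sid INTEGER, type TEXT, data TEXT);
        INSERT INTO node VALUES (1, 'page'), (2, 'article'), (3, 'article');
        -- Node 2 is deliberately left without an index row.
        INSERT INTO search_dataset VALUES
          (1, 'node_search', ' hello '),
          (3, 'node_search', ' world ');
        """
    )
    return conn


if __name__ == "__main__":
    print(find_unindexed(build_demo_db()))
```

On a real site you would run just the SELECT (with your table prefix, if any) from your database client or drush's SQL shell. The type condition sits in the ON clause rather than the WHERE clause so that rows belonging to other search types don't make a node look indexed.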

From what it sounds like, you've already spotted the problem node(s), so it's just a matter of confirmation. Temporarily exclude the node(s) from being indexed (e.g. by hacking NodeSearch::indexNode()) to confirm they are the problem, then find out what content in the node is blocking the indexer.

Tags:

Search

Cron