Find documents with an empty string value in Elasticsearch

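I found a solution in https://github.com/elastic/elasticsearch/issues/7515 and it works without reindexing: require that the field exists, then exclude every document whose field matches the wildcard *.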

PUT t/t/1
{
  "textContent": ""
}

PUT t/t/2
{
  "textContent": "foo"
}

GET t/t/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "textContent"
          }
        }
      ],
      "must_not": [
        {
          "wildcard": {
            "textContent": "*"
          }
        }
      ]
    }
  }
}
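
If the query is built from application code, the same body can be generated for any field. A minimal Python sketch; the helper name empty_string_query is my own and not part of any Elasticsearch client:

```python
import json

def empty_string_query(field):
    """Build a search body matching documents whose `field` exists but is empty."""
    return {
        "query": {
            "bool": {
                # The field must be present in the document...
                "must": [{"exists": {"field": field}}],
                # ...but must not have any indexed term matching the wildcard.
                "must_not": [{"wildcard": {field: "*"}}],
            }
        }
    }

body = empty_string_query("textContent")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET t/t/_search.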

For those of you using Elasticsearch 5.2 or above and still stuck: the easiest way is to reindex your data with the field mapped as the keyword type. After that, searches for empty values work, like this:

{
  "query": {
    "term": {"MY_FIELD_TO_SEARCH": ""}
  }
}

Actually, once I reindexed my data and reran the query, it worked =)

The problem was that my field was mapped as type text and NOT keyword. I created a new index with the field mapped as keyword and reindexed into it:

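# Create the new index.
curl -X PUT https://username:PASSWORD@HOST:9200/mycoolindex

# Map the field as keyword before any data goes in.
curl -X PUT https://username:PASSWORD@HOST:9200/mycoolindex/_mapping/mycooltype -H 'Content-Type: application/json' -d '{
  "properties": {
    "MY_FIELD_TO_SEARCH": {
      "type": "keyword"
    }
  }
}'

# Copy the documents over. _reindex is a POST endpoint, not PUT.
curl -X POST https://username:PASSWORD@HOST:9200/_reindex -H 'Content-Type: application/json' -d '{
  "source": {
    "index": "oldindex"
  },
  "dest": {
    "index": "mycoolindex"
  }
}'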
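
Hand-typed JSON inside a shell string is easy to get subtly wrong (a stray comma or a missing brace), so one option is to generate the two bodies from Python dicts and paste the output into the curl calls. A sketch using the same index and field names as above; the helper names are hypothetical:

```python
import json

def keyword_mapping(field):
    """Mapping body that stores `field` as a single untokenized keyword term."""
    return {"properties": {field: {"type": "keyword"}}}

def reindex_body(source_index, dest_index):
    """Body for the _reindex endpoint: copy documents from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}

# json.dumps always emits well-formed JSON, so the output is safe to paste.
mapping_json = json.dumps(keyword_mapping("MY_FIELD_TO_SEARCH"))
reindex_json = json.dumps(reindex_body("oldindex", "mycoolindex"))

print(mapping_json)
print(reindex_json)
```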

I hope this helps someone who was as stuck as I was finding those empty values.


If you are using the default analyzer (standard), an empty string gives it nothing to analyze: no tokens are produced, so there is no term in the index to match. You therefore need to index the field verbatim (not analyzed). Here is an example:

Add a mapping that indexes the field untokenized. If you also need a tokenized copy of the field, you can use a Multi Field type.

PUT http://localhost:9200/test/_mapping/demo
{
  "demo": {
    "properties": {
      "_content": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
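
If the tokenized copy is needed too, the multi-field variant might look like the sketch below. It is written as a Python dict that prints the JSON body. The sub-field name raw is my assumption (any name works), and the syntax is the old string / not_analyzed style used in this answer, not the text / keyword types of 5.x and later.

```python
import json

def multi_field_mapping(doc_type, field, raw_name="raw"):
    """Analyzed main field plus an untokenized sub-field (pre-5.x mapping syntax)."""
    return {
        doc_type: {
            "properties": {
                field: {
                    # Main field: analyzed as usual, for full-text search.
                    "type": "string",
                    "fields": {
                        # Sub-field: indexed verbatim, so "" is kept as a term.
                        raw_name: {"type": "string", "index": "not_analyzed"}
                    },
                }
            }
        }
    }

mapping = multi_field_mapping("demo", "_content")
print(json.dumps(mapping, indent=2))
```

With a mapping like this, the term filter would target _content.raw instead of _content.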

Next, index a couple of documents.

POST http://localhost:9200/test/demo/1
{
  "_content": ""
}

POST http://localhost:9200/test/demo/2
{
  "_content": "some content"
}

Execute a search:

POST http://localhost:9200/test/demo/_search
{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "_content": ""
        }
      }
    }
  }
}

Returns the document with the empty string.

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.30685282,
        "hits": [
            {
                "_index": "test",
                "_type": "demo",
                "_id": "1",
                "_score": 0.30685282,
                "_source": {
                    "_content": ""
                }
            }
        ]
    }
}
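
To see why the mapping matters, here is a toy model of the two indexing modes in Python. The regex tokenizer is only a crude stand-in for the standard analyzer, and the inverted index is a plain dict, so treat it as an illustration of the idea and not of how Lucene stores terms:

```python
import re
from collections import defaultdict

def analyzed_terms(text):
    # Crude stand-in for the standard analyzer: lowercase, keep runs of word characters.
    return re.findall(r"\w+", text.lower())

def verbatim_terms(text):
    # not_analyzed: the whole value is indexed as one term, even when it is "".
    return [text]

def build_index(docs, to_terms):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in to_terms(text):
            index[term].add(doc_id)
    return index

docs = {1: "", 2: "some content"}
analyzed = build_index(docs, analyzed_terms)
verbatim = build_index(docs, verbatim_terms)

# Look up the empty-string term in each index.
print(sorted(analyzed.get("", set())))  # []  (no tokens were produced for doc 1)
print(sorted(verbatim.get("", set())))  # [1] (the empty term points at doc 1)
```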

Even with the default analyzer you can do this kind of search: use a script filter, which is slower but can handle the empty string:

curl -XPOST 'http://localhost:9200/test/demo/_search' -d '
{
 "query": {
   "filtered": {
     "filter": {
       "script": {
         "script": "_source._content.length() == 0"
       }
     }
   }
 }
}'

It will return the document with an empty string as _content, without any special mapping.

As pointed out by @js_gandalf, this is deprecated for ES > 5.0 (the filtered query was deprecated in 2.0 and removed in 5.0). Instead, use query -> bool -> filter -> script, as described in https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
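
A sketch of that bool / filter / script shape, built in Python so the body can be printed and pasted. The Painless expression is my assumption: it reads doc values, so it presumes the field has them (for example a keyword field), because the Groovy-style _source._content expression does not carry over to Painless as-is. Recent versions name the script key source, while early 5.x releases used inline, hence the script_key parameter.

```python
import json

def empty_string_script_query(field, script_key="source"):
    """Search body that filters with a script instead of an indexed term."""
    # Assumption: `field` has doc values. Guard against documents with no
    # value first, then check that the stored string has length zero.
    expression = "doc['%s'].size() > 0 && doc['%s'].value.length() == 0" % (field, field)
    script = {"lang": "painless", script_key: expression}
    return {"query": {"bool": {"filter": {"script": {"script": script}}}}}

body = empty_string_script_query("MY_FIELD_TO_SEARCH")
print(json.dumps(body, indent=2))
```

On a field that is already mapped as keyword, the plain term query is simpler and faster; the script form is only worth it when a term lookup is not possible.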