What does _doc represent in Elasticsearch?

As of Elasticsearch 8.x, only _doc is supported, and it is just an endpoint name, not a document type.

In 7.0, _doc represents the endpoint name instead of the document type. The _doc component is a permanent part of the path for the document index, get, and delete APIs going forward, and will not be removed in 8.0.

Elasticsearch 8.x: Specifying types in requests is no longer supported. The include_type_name parameter is removed.
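To make the "endpoint name" point concrete, here is a minimal sketch of what the typeless document paths look like. The helper doc_endpoint and the index name twitter are only illustrative names, not part of any Elasticsearch client; the sketch just builds the URL paths and does not talk to a cluster.

```python
from urllib.parse import quote


def doc_endpoint(index, doc_id=None):
    """Build the typeless document path: /{index}/_doc or /{index}/_doc/{id}.

    Hypothetical helper for illustration only. "_doc" here is a fixed path
    segment (the endpoint name), not a user-chosen mapping type.
    """
    base = f"/{quote(index, safe='')}/_doc"
    if doc_id is None:
        return base
    return f"{base}/{quote(str(doc_id), safe='')}"


# The index, get, and delete APIs all share the same _doc path segment.
requests_7x = [
    ("PUT", doc_endpoint("twitter", 1)),     # index a document with an explicit id
    ("GET", doc_endpoint("twitter", 1)),     # fetch it
    ("DELETE", doc_endpoint("twitter", 1)),  # delete it
    ("POST", doc_endpoint("twitter")),       # index with an auto-generated id
]
```

In other words, you no longer pick the segment after the index name; it is always _doc for these APIs.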

Schedule For Removal of Mapping Types


_doc is a mapping type, which by the way is now deprecated.

A mapping type used to be a separate collection inside the same index. E.g. a twitter index could have a mapping of type user for storing all users, and a mapping of type tweet to store all tweets. Both of these types still belong to the same index, so you could search inside multiple types in the same index.

Since Elasticsearch announced that mapping types would be deprecated (for several reasons), v6 users are forced to use ONLY one mapping type per index, i.e. you can have either user or tweet inside the twitter index, but not both. They further recommended being consistent and using _doc as the name of the mapping type. It can literally be any string (dog, cat, etc.); _doc is just the recommended name because in v7 the mapping type is going away completely. So if every index in Elasticsearch has only one mapping type, it is easier to migrate to v7: you just remove the mapping type, and all documents then sit directly under the index.
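As a rough illustration of "just remove the mapping type", the sketch below takes a v6-style mappings object with a single type level and returns the typeless form. The function name strip_mapping_type and the sample field names are made up for this example, and it assumes a typeless mapping can be recognized by a top-level properties key, which is a simplification.

```python
def strip_mapping_type(mappings):
    """Return typeless mappings from a v6-style mapping with one type.

    Hypothetical helper. Assumes a mapping is already typeless if it has a
    top-level "properties" key, and otherwise expects exactly one type name
    (e.g. "_doc", "tweet") wrapping the real mapping body.
    """
    if "properties" in mappings:
        return mappings  # already typeless, nothing to unwrap
    if len(mappings) != 1:
        raise ValueError("expected exactly one mapping type per index")
    # Drop the single type level and keep what was underneath it.
    (_type_name, body), = mappings.items()
    return body


# v6 style: the type name (here "_doc") sits between "mappings" and "properties".
typed = {"_doc": {"properties": {"user": {"type": "keyword"}}}}
typeless = strip_mapping_type(typed)
```

The type name itself carries no information once there is only one per index, which is why it can be dropped without touching the field definitions.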


I believe these two use cases are not using the _doc terminology for the same purpose:

  1. The keyword _doc for sorting is new in Elasticsearch 2 and is a replacement for the old scan and scroll way to efficiently paginate deep into the results of a query. There is no actual _doc field in the documents.

  2. The _doc syntax proposed for the _source portion of a search (or get, update, etc.) request was not implemented as shown at the beginning of that git discussion; the fielddata_fields field is used instead. It has nothing to do with the usage of _doc in sorting.
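For the first case, a search body that sorts by _doc looks roughly like the sketch below. The helper name scroll_search_body and the default page size are assumptions for illustration; the body is meant to be sent with a scroll request, and the sketch only builds the dictionary.

```python
def scroll_search_body(query=None, page_size=1000):
    """Build a search body that sorts by _doc (index order).

    Hypothetical helper. "_doc" in "sort" is a special sort key, not a field
    stored in the documents, so no relevance scoring or field sort is needed.
    """
    return {
        "size": page_size,
        "query": query if query is not None else {"match_all": {}},
        "sort": ["_doc"],
    }


body = scroll_search_body()
```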

In the scripting documentation you'll find a section about document field data, which is extremely fast to read because it is stored in memory. It is accessed using a similar-looking doc syntax, which might add to the confusion.
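For comparison, this is roughly what the doc syntax looks like inside a script. The sketch assumes a numeric field called likes and a script field called doubled, both made up for this example, and uses Painless as the script language; it only builds the request body.

```python
def script_field_body(field, name="doubled"):
    """Build a search body with a script field that reads a value via doc[...].

    Hypothetical helper. doc['<field>'] inside a script reads per-document
    field data; it is unrelated to the _doc endpoint and to sorting by _doc.
    """
    return {
        "query": {"match_all": {}},
        "script_fields": {
            name: {
                "script": {
                    "lang": "painless",
                    "source": f"doc['{field}'].value * 2",
                }
            }
        },
    }


script_body = script_field_body("likes")
```

Note the difference in spelling: doc (no underscore) is a variable available to scripts, while _doc is either the endpoint name or the sort key.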