How to add an elasticsearch index during docker build

A simple way to do this is a multi-stage build with the Dockerfile below.

Build the image with docker build -t elasticsearch-custom:latest .

# Stage 1: start elasticsearch, run the insert script, keep the data in /tmp/data
FROM elasticsearch:5.5.1 AS esbuilder
COPY script.sh path/to/insert/script.sh
RUN apt-get update \
    && apt-get install -y procps httping \
    && /docker-entrypoint.sh elasticsearch -d -E path.data=/tmp/data \
    && while ! httping -qc1 http://localhost:9200 ; do sleep 1 ; done \
    && path/to/insert/script.sh \
    && apt-get clean

# Stage 2: clean image that only receives the prepared data folder
FROM elasticsearch:5.5.1
COPY --from=esbuilder /tmp/data/ /usr/share/elasticsearch/data/

Then start a container with docker run -t -d elasticsearch-custom:latest
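
The Dockerfile above does not show what path/to/insert/script.sh contains. As a minimal sketch, assuming elasticsearch 5.x on localhost:9200 and curl being available in the image, it might look like the following. The index name my_index, the type doc, the document fields, the ES_URL variable, the function names and the load argument are all hypothetical and only there for illustration.

```shell
#!/bin/sh
# Hypothetical insert script: index, type and document values are placeholders.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print one bulk-API action line for indexing a document.
# Arguments: index name, type name, document id.
bulk_action() {
    printf '{"index":{"_index":"%s","_type":"%s","_id":"%s"}}\n' "$1" "$2" "$3"
}

# Print a small bulk payload: each action line is followed by the document source.
build_payload() {
    bulk_action my_index doc 1
    printf '%s\n' '{"title":"first document"}'
    bulk_action my_index doc 2
    printf '%s\n' '{"title":"second document"}'
}

# Create the index, then send the payload to the _bulk endpoint.
# curl -f turns HTTP error responses into a non-zero exit status.
load_data() {
    curl -fsS -XPUT "$ES_URL/my_index" \
        -H 'Content-Type: application/json' \
        -d '{"settings":{"number_of_shards":1,"number_of_replicas":0}}' \
    && build_payload | curl -fsS -XPOST "$ES_URL/_bulk" \
        -H 'Content-Type: application/json' \
        --data-binary @-
}

# Only talk to elasticsearch when explicitly asked to: ./script.sh load
if [ "${1:-}" = "load" ]; then
    load_data
fi
```

With this sketch the RUN step would call path/to/insert/script.sh load. Keeping the payload generation in separate functions makes it possible to inspect the generated lines without a running elasticsearch.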


I've had a similar problem.

I wanted to create a docker container with preloaded data (via some scripts and json files in the repo). The data inside elasticsearch was not going to change during the execution and I wanted as few build steps as possible (ideally only docker-compose up -d).

One option would be to do it manually once and store the elasticsearch data folder (with a docker volume) in the repository. But then I would have duplicate data, and I would have to manually check in a new version of the data folder every time the data changes.

The solution

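  1. Make elasticsearch write data to a folder that is not declared as a volume in elasticsearch's official Dockerfile. Docker's classic builder discards changes that a build step makes inside a declared volume, so data written to the default data folder during the build would not end up in the image.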

RUN mkdir /data \
    && chown -R elasticsearch:elasticsearch /data \
    && echo 'es.path.data: /data' >> config/elasticsearch.yml \
    && echo 'path.data: /data' >> config/elasticsearch.yml

(The folder must be created with the right ownership so that the elasticsearch user can write to it.)

  2. Download wait-for-it

ADD https://raw.githubusercontent.com/vishnubob/wait-for-it/e1f115e4ca285c3c24e847c4dd4be955e0ed51c2/wait-for-it.sh /utils/wait-for-it.sh

This script waits until elasticsearch is up before our insert commands run.
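
If downloading a script at build time is not desirable, the same waiting behaviour can be approximated with a small polling loop. The following is a minimal sketch, not the wait-for-it implementation: wait_until is a hypothetical helper that retries an arbitrary command a fixed number of times, and the curl call in the usage comment assumes curl is installed in the image.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 0 on success, 1 on giving up.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Do not sleep after the last failed attempt.
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (not executed here): poll elasticsearch once per second, 60 tries at most.
# wait_until 60 1 curl -fs -o /dev/null http://localhost:9200
```

Unlike wait-for-it with -t 0, this sketch gives up after a bounded number of attempts, so a build cannot hang forever if elasticsearch never starts.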

  3. Insert data into elasticsearch

RUN /docker-entrypoint.sh elasticsearch -p /tmp/epid & \
    /bin/bash /utils/wait-for-it.sh -t 0 localhost:9200 -- path/to/insert/script.sh; \
    kill $(cat /tmp/epid) && wait $(cat /tmp/epid); \
    exit 0;

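This single RUN command starts elasticsearch in the background during the build (writing its PID to /tmp/epid), waits without a timeout (-t 0) for port 9200, runs the insert script, and then shuts elasticsearch down. The trailing exit 0 makes the build step succeed regardless of the exit status of the preceding commands, so a failed insert will not stop the build. The image is left as it was, except for elasticsearch's data folder, which is now initialized.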
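
The same start, insert, stop pattern can also be written as a small shell function, which may be easier to follow. This is a sketch under assumptions rather than a drop-in replacement: run_then_stop is a hypothetical name, it tracks the background process through $! instead of a PID file, and it returns the exit status of the task instead of always exiting with 0.

```shell
#!/bin/sh
# run_then_stop TASK SERVICE [ARGS...]
# Hypothetical helper: start SERVICE in the background, run TASK (a shell
# command string) while it is up, then stop SERVICE and wait for it to exit.
# Returns the exit status of TASK.
run_then_stop() {
    task=$1
    shift

    # Start the service in the background and remember its PID.
    "$@" &
    service_pid=$!

    # Run the task and keep its exit status instead of discarding it.
    task_status=0
    sh -c "$task" || task_status=$?

    # Send SIGTERM to the service and wait until it has exited.
    # Assumption: the signal reaches the real service process, i.e. any
    # wrapper script in between replaces itself with exec.
    kill "$service_pid" 2>/dev/null || true
    wait "$service_pid" 2>/dev/null || true

    return "$task_status"
}

# Example (not executed here), using the paths from the RUN command above:
# run_then_stop "/bin/bash /utils/wait-for-it.sh -t 0 localhost:9200 -- path/to/insert/script.sh" \
#     /docker-entrypoint.sh elasticsearch
```

Returning the task status means a failing insert script fails the build step, which is a deliberate difference from the one-liner.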

Summary

FROM elasticsearch

RUN mkdir /data \
    && chown -R elasticsearch:elasticsearch /data \
    && echo 'es.path.data: /data' >> config/elasticsearch.yml \
    && echo 'path.data: /data' >> config/elasticsearch.yml

ADD https://raw.githubusercontent.com/vishnubob/wait-for-it/e1f115e4ca285c3c24e847c4dd4be955e0ed51c2/wait-for-it.sh /utils/wait-for-it.sh

# Copy the files you may need and your insert script

RUN /docker-entrypoint.sh elasticsearch -p /tmp/epid & \
    /bin/bash /utils/wait-for-it.sh -t 0 localhost:9200 -- path/to/insert/script.sh; \
    kill $(cat /tmp/epid) && wait $(cat /tmp/epid); \
    exit 0;

And that's it! When you run this image, elasticsearch starts with the data, indexes, etc. already loaded.