How to run a Python Jupyter notebook daily, automatically

Update
Recently I came across papermill, a tool for executing and parameterizing notebooks.

https://github.com/nteract/papermill

papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1

This seems better than nbconvert because you can pass parameters. You still have to trigger this command with a scheduler; the old answer below shows an example with cron on Ubuntu.
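For example, a crontab entry that runs the papermill command every day at 05:10 could look like this (the paths, environment name, and parameter values are illustrative):

```shell
# Run papermill daily at 05:10. Use the absolute path to the binary,
# because cron starts with a minimal PATH.
10 5 * * * /opt/anaconda/envs/yourenv/bin/papermill /path/to/input.ipynb /path/to/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
```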


Old Answer

jupyter nbconvert --execute

can execute a Jupyter notebook. Embedded in a cron job, this will do what you want.

Example setup on Ubuntu:

Create yourscript.sh with the following content and make it executable (chmod +x yourscript.sh):

/opt/anaconda/envs/yourenv/bin/jupyter nbconvert \
                      --execute \
                      --to notebook \
                      --output /path/to/yournotebook-output.ipynb \
                      /path/to/yournotebook.ipynb

There are other output formats besides --to notebook. I like this option because afterwards you have a fully executed notebook as a "log" file.

I recommend running your notebook from a virtual environment, so that future package updates do not break your script. Do not forget to install nbconvert into that environment.

Now create a cron job that runs every day, e.g. at 5:10 AM, by typing crontab -e in your terminal and adding this line:

10 5 * * * /path/to/yourscript.sh
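Since cron discards a job's output by default, it can help to redirect the script's stdout and stderr to a log file so you can debug failed runs (the log path here is just an example):

```shell
# Same schedule, but append all output to a log file.
10 5 * * * /path/to/yourscript.sh >> /path/to/yournotebook-cron.log 2>&1
```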

Try the SeekWell Chrome Extension. It lets you schedule notebooks to run weekly, daily, hourly, or every 5 minutes, right from Jupyter Notebooks. You can also send DataFrames directly to Sheets or Slack if you like.

There is a demo video and more info on the extension's Chrome Web Store page.

*Disclosure: I'm a SeekWell co-founder.*


If you want higher quality, it is better to combine this with Airflow. I packaged them in a Docker image: https://github.com/michaelchanwahyan/datalab.

It is done by modifying the open-source package nbparameterize to pass in arguments such as execution_date. Graphs can be generated on the fly, and the output can be updated and saved inside the notebook.

When it is executed:

  • the notebook is read and the parameters are injected
  • the notebook is executed and the output overwrites the original path
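To illustrate the injection step in principle (this is a simplified, stdlib-only sketch, not the actual nbparameterize API): an .ipynb file is plain JSON, so a parameter cell can be prepended before the notebook is executed.

```python
import json

def inject_parameters(nb_path, out_path, params):
    """Read a notebook (.ipynb is JSON), prepend a code cell that
    assigns the given parameters, and write the result to out_path."""
    with open(nb_path) as f:
        nb = json.load(f)
    # Build a code cell whose source looks like: execution_date = '2024-01-01'
    cell = {
        "cell_type": "code",
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": [f"{k} = {v!r}\n" for k, v in params.items()],
    }
    nb["cells"].insert(0, cell)
    with open(out_path, "w") as f:
        json.dump(nb, f)
```

The patched notebook can then be executed with jupyter nbconvert --execute (or papermill), as in the answers above.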

Besides, the image also installs and configures common tools such as Spark, Keras, and TensorFlow.