Set the Python path for Spark workers

A standard way of setting environment variables, including PYSPARK_PYTHON, is the conf/spark-env.sh file. Spark ships with a template (conf/spark-env.sh.template) that explains the most common options.

It is a normal bash script, so you can use it the same way you would use .bashrc.
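For example, a minimal conf/spark-env.sh might look like this (a sketch only; the Anaconda paths are illustrative, so substitute the interpreters actually installed on your cluster):

    #!/usr/bin/env bash
    # Python interpreter used by the executors (workers)
    export PYSPARK_PYTHON=/usr/local/anaconda2/bin/python
    # Python interpreter used by the driver; falls back to PYSPARK_PYTHON if unset
    export PYSPARK_DRIVER_PYTHON=/home/user1/anaconda2/bin/python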

You'll find more details in the Spark Configuration Guide.


With the following invocation you can change the Python path for the current job only; it also lets you use different Python paths for the driver and the executors:

    PYSPARK_DRIVER_PYTHON=/home/user1/anaconda2/bin/python \
    PYSPARK_PYTHON=/usr/local/anaconda2/bin/python \
    pyspark --master ..
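If you're on Spark 2.1 or later, the same settings can also be passed as configuration properties (spark.pyspark.python and spark.pyspark.driver.python), which is convenient with spark-submit. A sketch, reusing the interpreter paths from above; my_job.py is a placeholder for your application:

    spark-submit \
        --conf spark.pyspark.driver.python=/home/user1/anaconda2/bin/python \
        --conf spark.pyspark.python=/usr/local/anaconda2/bin/python \
        my_job.py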