Spark context 'sc' not defined

Just a little improvement. Add the following at the top of your Python script file.

#!/usr/bin/env python
from pyspark import SparkContext, SparkConf

# create the SparkContext explicitly so that 'sc' is defined
conf = SparkConf().setAppName("my-app")   # the app name is just a placeholder
sc = SparkContext(conf=conf)

# your code starts here
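
For example, a minimal script body (the numbers below are just placeholder data) could then use sc directly:

# example job using the SparkContext created above
rdd = sc.parallelize([1, 2, 3, 4])
print(rdd.map(lambda x: x * x).sum())   # prints 30
sc.stop()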

You have to create an instance of SparkContext, like the following:

Import:

from pyspark import SparkContext

and then:

sc = SparkContext.getOrCreate()

NB: sc = SparkContext.getOrCreate() works better than sc = SparkContext(), because getOrCreate() returns the already-running SparkContext if there is one, instead of failing when you try to create a second context.
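
A minimal sketch of the difference, assuming a plain local PySpark installation:

from pyspark import SparkContext

sc1 = SparkContext.getOrCreate()   # creates a new context
sc2 = SparkContext.getOrCreate()   # returns the same context, no error
print(sc1 is sc2)                  # True

# Calling SparkContext() here instead would raise
# "ValueError: Cannot run multiple SparkContexts at once".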


You need to do the following after you have pyspark on your path:

from pyspark import SparkContext
sc = SparkContext()

One solution is to add pyspark-shell to the shell environment variable PYSPARK_SUBMIT_ARGS:

export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"

There is a change in python/pyspark/java_gateway.py which requires PYSPARK_SUBMIT_ARGS to include pyspark-shell if a PYSPARK_SUBMIT_ARGS variable is set by a user.
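
If you prefer to keep everything in the script, a rough equivalent (assuming PySpark reads the variable from the environment when it launches the JVM gateway) is to set it with os.environ before creating the context:

import os

# must be set before the SparkContext starts the JVM gateway;
# note the trailing 'pyspark-shell'
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

from pyspark import SparkContext
sc = SparkContext()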