java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession

If you're running from inside IntelliJ IDEA and you've marked your Spark library as "provided", like so: "org.apache.spark" %% "spark-sql" % "3.0.1" % "provided", then you need to edit your Run/Debug configuration and check the "Include dependencies with Provided scope" box.
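
For reference, here is a minimal build.sbt sketch of that setup (the Scala and Spark versions are just placeholders, adjust them to your project):

    // build.sbt -- hypothetical minimal project, adjust names and versions
    scalaVersion := "2.12.12"

    // "provided" keeps spark-sql out of the packaged jar; IntelliJ's Run
    // configuration also leaves it off the classpath unless the
    // "Include dependencies with Provided scope" box is checked,
    // which is what causes the NoClassDefFoundError above.
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.1" % "provided"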


You are probably deploying your application to a cluster that runs a lower Spark version.

Please check the Spark version on your cluster - it should be the same as the version in your pom.xml. Please also note that all Spark dependencies should be marked as provided when you use spark-submit to deploy the application.
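
One quick way to confirm the cluster version is to print it from spark-shell on the cluster (spark here is the SparkSession that spark-shell creates for you):

    // run inside spark-shell on the cluster; compare the output with pom.xml
    println(spark.version)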


I was facing this issue while running from the IntelliJ editor. I had marked the Spark jars as provided in pom.xml, see below:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.4.0</version>
        <scope>provided</scope>
    </dependency>

On removing the provided scope, the error was gone.

When the Spark jars are marked as provided, they are only supplied at runtime if the application is launched with spark-submit or if the Spark jars are already on the classpath.
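
As a small sanity check, a minimal app like the sketch below (the object and app names are made up) fails with this exact NoClassDefFoundError at startup when the Spark classes are missing from the runtime classpath, and prints the version when they are present:

    import org.apache.spark.sql.SparkSession

    // Hypothetical sanity-check app: the JVM throws
    // NoClassDefFoundError: org/apache/spark/sql/SparkSession here
    // if spark-sql is "provided" and nothing else puts it on the classpath.
    object SparkClasspathCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("classpath-check")
          .master("local[*]") // local master, only for this check
          .getOrCreate()
        println(s"Spark version on the classpath: ${spark.version}")
        spark.stop()
      }
    }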


When submitting with spark-submit, check that the Spark dependency version in your pom.xml matches the Spark version you are submitting with.

This may be because you have two Spark versions on the same machine.


If you want to have different Spark installations on your machine, you can create different soft links and use the exact Spark version against which you have built your project:

spark1-submit -> /Users/test/sparks/spark-1.6.2-bin-hadoop2.6/bin/spark-submit

spark2-submit -> /Users/test/sparks/spark-2.1.1-bin-hadoop2.7/bin/spark-submit

Here is a link from the Cloudera community about running multiple Spark versions: https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Multiple-Spark-version-on-the-same-cluster/td-p/39880