How to create SparkSession with Hive support (fails with "Hive classes are not found")?

Add the following dependency to your Maven project.

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>2.0.0</version>
</dependency>
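
For context, a minimal Scala sketch of the call the question is about (the master setting and app name below are placeholders for illustration only):

import org.apache.spark.sql.SparkSession

object HiveSupportExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")              // placeholder: local mode for illustration
      .appName("spark-hive-example")   // placeholder app name
      .enableHiveSupport()             // fails with "Hive classes are not found" when spark-hive is missing at runtime
      .getOrCreate()

    spark.sql("SHOW TABLES").show()
    spark.stop()
  }
}

With spark-hive (and its transitive dependencies) on the runtime classpath, enableHiveSupport() succeeds; without them, you get the error from the question.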

tl;dr You have to make sure that Spark SQL's spark-hive dependency and all of its transitive dependencies are available at runtime on the CLASSPATH of a Spark SQL application (not just at build time, which is only needed for compilation).
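
One common way to end up with the classes at compile time but not at runtime is a "provided"-style scope. A build.sbt sketch, assuming an sbt project (the Maven analogue is <scope>provided</scope> on the dependency above):

// Default (compile) scope: spark-hive is on both the compile-time and the runtime
// classpath, so enableHiveSupport() can find the Hive classes, e.g. under `sbt run`.
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0"

// "provided" scope compiles fine but leaves the classes off the runtime classpath
// unless the launch environment (e.g. a Hive-enabled Spark distribution) supplies them:
// libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0" % "provided"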


In other words, you have to have the org.apache.spark.sql.hive.HiveSessionStateBuilder and org.apache.hadoop.hive.conf.HiveConf classes on the CLASSPATH of the Spark application (which has little to do with sbt or Maven).

The former, HiveSessionStateBuilder, is part of the spark-hive dependency (including all of its transitive dependencies).

The latter, HiveConf, is part of the hive-exec dependency (which is a transitive dependency of the spark-hive dependency above).
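
A quick way to verify this at runtime, e.g. pasted into spark-shell or any Scala REPL (just a diagnostic sketch, not part of Spark's API):

import scala.util.Try

val requiredClasses = Seq(
  "org.apache.spark.sql.hive.HiveSessionStateBuilder", // from spark-hive
  "org.apache.hadoop.hive.conf.HiveConf"               // from hive-exec (transitive)
)

requiredClasses.foreach { name =>
  val status = if (Try(Class.forName(name)).isSuccess) "found" else "MISSING"
  println(s"$name -> $status")
}

If either line prints MISSING, fixing the build file alone is not enough; the corresponding jar has to be on the classpath of the running application.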


I've looked into the source code and found that, besides HiveSessionState (in spark-hive), another class, HiveConf, is also needed to instantiate a SparkSession. HiveConf is not contained in the spark-hive*.jar; you may find it in the Hive-related jars and put it on your classpath.
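
If you are unsure which jar (if any) currently provides HiveConf, a small Scala check like the following can help (again just a sketch; run it from the same classloader/classpath your application uses):

val hiveConfLocation = Option(
  getClass.getClassLoader.getResource("org/apache/hadoop/hive/conf/HiveConf.class")
)
println(hiveConfLocation.map(_.toString).getOrElse("HiveConf is not on the classpath"))

The printed URL shows which jar the class is loaded from; an empty result means a hive-exec (or equivalent Hive) jar still needs to be added to the classpath.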