How to get rid of derby.log, metastore_db from Spark Shell

The use of hive.metastore.warehouse.dir is deprecated since Spark 2.0.0 in favor of spark.sql.warehouse.dir; see the docs.

As hinted by this answer, the real culprit for both the metastore_db directory and the derby.log file being created in every working subdirectory is the derby.system.home property, which defaults to . (the current working directory).

Thus, a default location for both can be specified by adding the following line to spark-defaults.conf:

spark.driver.extraJavaOptions -Dderby.system.home=/tmp/derby

where /tmp/derby can be replaced by the directory of your choice.
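
As a rough sketch of that step, the snippet below creates the Derby directory and appends the line to spark-defaults.conf. The variable names DERBY_DIR, CONF_DIR and CONF_FILE are made up for illustration, and the fallback to ./conf when neither SPARK_CONF_DIR nor SPARK_HOME is set is only an assumption so the snippet runs anywhere; point it at your real conf directory.

```shell
# Hypothetical locations; adjust to your installation.
DERBY_DIR="${DERBY_DIR:-/tmp/derby}"
CONF_DIR="${SPARK_CONF_DIR:-${SPARK_HOME:-.}/conf}"
CONF_FILE="$CONF_DIR/spark-defaults.conf"

# Make sure both the Derby directory and the conf file exist.
mkdir -p "$DERBY_DIR" "$CONF_DIR"
touch "$CONF_FILE"

if grep -q '^spark\.driver\.extraJavaOptions' "$CONF_FILE"; then
  # A second line for the same key would override the first, so do not append blindly.
  echo "spark.driver.extraJavaOptions is already set in $CONF_FILE;" \
       "add -Dderby.system.home=$DERBY_DIR to that line by hand" >&2
else
  echo "spark.driver.extraJavaOptions -Dderby.system.home=$DERBY_DIR" >> "$CONF_FILE"
fi
```

If the key is already present, the snippet leaves the file alone, since the existing options would need to be merged manually.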


In spark-shell the context/session is already created, and you won't want to stop and recreate it with a new configuration each time, so setting the location in code is not practical. To avoid the metastore_db directory in that case, set its location in a hive-site.xml file and copy that file into the Spark conf directory.
A sample hive-site.xml that places metastore_db under /tmp (refer to my answer here):

<configuration>
   <property>
     <name>javax.jdo.option.ConnectionURL</name>
     <value>jdbc:derby:;databaseName=/tmp/metastore_db;create=true</value>
     <description>JDBC connect string for a JDBC metastore</description>
   </property>
   <property>
     <name>javax.jdo.option.ConnectionDriverName</name>
     <value>org.apache.derby.jdbc.EmbeddedDriver</value>
     <description>Driver class name for a JDBC metastore</description>
   </property>
   <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/tmp/</value>
      <description>location of default database for the warehouse</description>
   </property>
</configuration>
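
Copying the file into place might look like the sketch below. It assumes the sample above was saved as hive-site.xml in the current directory, and CONF_DIR is a made-up variable that falls back to ./conf when neither SPARK_CONF_DIR nor SPARK_HOME is set.

```shell
# Hypothetical layout: hive-site.xml sits in the current directory.
CONF_DIR="${SPARK_CONF_DIR:-${SPARK_HOME:-.}/conf}"
mkdir -p "$CONF_DIR"

if [ -f hive-site.xml ]; then
  # Spark picks up hive-site.xml from its conf directory at startup.
  cp hive-site.xml "$CONF_DIR/hive-site.xml"
  echo "installed $CONF_DIR/hive-site.xml"
else
  echo "hive-site.xml not found in $(pwd); nothing copied" >&2
fi
```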

After that, start spark-shell as follows to keep derby.log out of the working directory as well:

$ spark-shell --conf "spark.driver.extraJavaOptions=-Dderby.stream.error.file=/tmp/derby.log"

Try setting derby.system.home to some other directory as a system property before starting the Spark shell. Derby will create new databases there. The default value of this property is . (the current working directory).
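
One way to pass that system property for a single session is the --driver-java-options flag of spark-shell. A minimal sketch, where DERBY_DIR and DRIVER_OPTS are made-up variable names and /tmp/derby is just an example location:

```shell
# Hypothetical directory for Derby's files; pick any writable location.
DERBY_DIR="${DERBY_DIR:-/tmp/derby}"
mkdir -p "$DERBY_DIR"

# JVM system property handed to the driver, which hosts the embedded Derby instance.
DRIVER_OPTS="-Dderby.system.home=$DERBY_DIR"

if command -v spark-shell >/dev/null 2>&1; then
  spark-shell --driver-java-options "$DRIVER_OPTS"
else
  echo "spark-shell not found on PATH; would run:"
  echo "  spark-shell --driver-java-options \"$DRIVER_OPTS\""
fi
```

Unlike the spark-defaults.conf approach, this affects only the session started by that command.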

Reference: https://db.apache.org/derby/integrate/plugin_help/properties.html