"Bad substitution" when submitting spark job to yarn-cluster

This error is caused by the hdp.version property not being substituted correctly. You have to set hdp.version in a file called java-opts under $SPARK_HOME/conf.

You also have to set

spark.driver.extraJavaOptions -Dhdp.version=XXX 
spark.yarn.am.extraJavaOptions -Dhdp.version=XXX

in spark-defaults.conf under $SPARK_HOME/conf, where XXX is your installed HDP version.
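If you would rather not edit spark-defaults.conf, the same two properties can be passed per job with --conf on the spark-submit command line. A minimal sketch, assuming Spark 1.x yarn-cluster syntax; the class name and jar here are placeholders for your own application:

spark-submit --master yarn-cluster \
  --conf "spark.driver.extraJavaOptions=-Dhdp.version=XXX" \
  --conf "spark.yarn.am.extraJavaOptions=-Dhdp.version=XXX" \
  --class com.example.MyApp \
  my-app.jar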


In more detail, if you are using Spark with HDP, you have to do the following:

Add these entries to $SPARK_HOME/conf/spark-defaults.conf:

spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

(replace 2.2.0.0-2041 with your installed HDP version)

Create a file called java-opts in $SPARK_HOME/conf and add the installed HDP version to that file like this:

-Dhdp.version=2.2.0.0-2041
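Equivalently, you can create that file from the shell. A minimal sketch, assuming $SPARK_HOME is set and using the example version above:

# write the java-opts file with your HDP version (replace as needed)
echo "-Dhdp.version=2.2.0.0-2041" > $SPARK_HOME/conf/java-opts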

To figure out which HDP version is installed, run this command on the cluster:

hdp-select status hadoop-client
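It prints the currently selected version, with output along these lines (the exact version string will differ on your cluster):

hadoop-client - 2.2.0.0-2041

Use the version string on the right as the value for hdp.version in the steps above.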