Spark: check your cluster UI to ensure that workers are registered

Check how many cores your worker nodes have: your application can't use more than that. For example, suppose you have two worker nodes with 4 cores each, and 2 applications to run. Then you can give each application 4 cores to run its job.

You can set it like this in the code:

// Cap the total number of cores this application may claim on the cluster
SparkConf sparkConf = new SparkConf().setAppName("JianSheJieDuan")
                          .set("spark.cores.max", "4");

It works for me.
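
If you prefer not to hard-code the limit, the same property can also be passed on the command line at submit time; a minimal sketch, where the main class and jar names are placeholders for your own:

spark-submit --conf spark.cores.max=4 --class "YourMainClass" your-app.jar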


I have done configuration and performance tuning for many Spark clusters, and this is a very common message to see when you are first configuring a cluster to handle your workloads.

This is unequivocally due to insufficient resources for the job to launch. The job is requesting one of the following (a sketch of how to scale the request down follows the list):

  • more memory per worker than is allocated to it (1GB)
  • more CPUs than are available on the cluster
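
Either request can be scaled down at submit time so the job fits what the workers actually advertise in the cluster UI. A minimal sketch, where the memory and core values, class name, and jar name are placeholders for your own (--total-executor-cores applies to the standalone cluster manager):

spark-submit --executor-memory 512m --total-executor-cores 2 --class "YourApp" your-app.jar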

Finally figured out what the answer is.

When deploying a Spark program on a YARN cluster, the master URL is just yarn.

So in the program, the Spark context setup should just look like:

val conf = new SparkConf().setAppName("SimpleApp")
val sc = new SparkContext(conf)

Then the Eclipse project should be built with Maven, the generated jar copied to the cluster, and the job launched with the following command:

spark-submit --master yarn --class "SimpleApp" Recommender_2-0.0.1-SNAPSHOT.jar

This means that running it from Eclipse directly would not work.