Why does a job fail with "No space left on device", but df says otherwise?

By default, Spark stores intermediate data in the /tmp directory. If you do have free space on another device, you can change this location by creating the file SPARK_HOME/conf/spark-defaults.conf and adding the line below, where SPARK_HOME is the root directory of your Spark installation.

spark.local.dir                     SOME/DIR/WHERE/YOU/HAVE/SPACE
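
To decide which directory to use, it can help to compare free space across candidate locations. The following is a minimal Python sketch that uses only the standard library. The helper names free_bytes and pick_roomiest are hypothetical, and the candidate list is a placeholder that you would replace with directories on your own machines.

```python
import shutil
import tempfile


def free_bytes(path):
    """Return the free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def pick_roomiest(candidates):
    """Return the candidate directory whose filesystem has the most free space."""
    return max(candidates, key=free_bytes)


if __name__ == "__main__":
    # Placeholder candidates: the system temp directory and the current directory.
    candidates = [tempfile.gettempdir(), "."]
    best = pick_roomiest(candidates)
    # Print a line in the same shape as the spark-defaults.conf entry above.
    print(f"spark.local.dir                     {best}")
```

This only reports free bytes, so a directory that looks roomy here can still be short on inodes.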

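You also need to monitor df -i, which shows how many inodes are in use. A file system that has run out of inodes reports "No space left on device" even when df still shows free blocks.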
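
The same inode figures that df -i prints can be read programmatically. This is a minimal sketch for Unix-like systems that uses os.statvfs from the Python standard library. The function name inode_usage is hypothetical, and some file systems report a total of zero inodes, which the sketch simply passes through.

```python
import os


def inode_usage(path):
    """Return (used, total) inode counts for the filesystem that holds `path`.

    Relies on os.statvfs, which is available on Unix-like systems only.
    """
    st = os.statvfs(path)
    total = st.f_files          # total number of inodes on the filesystem
    used = total - st.f_ffree   # total minus free inodes
    return used, total


if __name__ == "__main__":
    used, total = inode_usage(".")
    if total:
        print(f"inodes used: {used} of {total} ({100.0 * used / total:.1f}%)")
    else:
        print("this filesystem does not report an inode total")
```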

On each machine, Spark creates M * R temporary files for the shuffle, where M is the number of map tasks and R is the number of reduce tasks.
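
To see how quickly this product grows, here is a small worked example in Python. The function name shuffle_file_count and the task counts are illustrative assumptions, not values taken from any real job.

```python
def shuffle_file_count(map_tasks, reduce_tasks):
    """Return the M * R count of temporary shuffle files described above."""
    return map_tasks * reduce_tasks


if __name__ == "__main__":
    # Illustrative task counts only.
    for m, r in [(100, 100), (1000, 1000), (500, 250)]:
        print(f"M={m}, R={r}: {shuffle_file_count(m, r)} temporary files")
```

With 1000 map tasks and 1000 reduce tasks the count is already one million files, and halving both M and R cuts it by a factor of four. Compare the result with the inode total that df -i reports.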


If you do see that disks are running out of inodes, you can fix the problem in the following ways:

  • Decrease the number of partitions (see coalesce with shuffle = false).
  • Reduce the number of files to O(R) by “consolidating files”. Because different file systems behave differently, it is recommended that you read up on spark.shuffle.consolidateFiles and see https://spark-project.atlassian.net/secure/attachment/10600/Consolidating%20Shuffle%20Files%20in%20Spark.pdf.
  • Sometimes you may simply need your DevOps team to increase the number of inodes the file system supports.


Note: file consolidation was removed from Spark in version 1.6; see https://issues.apache.org/jira/browse/SPARK-9808.

