Stopping a Running Spark Application

If you wish to kill an application that is failing repeatedly, you may do so through:

./bin/spark-class org.apache.spark.deploy.Client kill <master url> <driver ID>

You can find the driver ID through the standalone Master web UI at http://<master url>:8080.

From the Spark documentation


Revisiting this because I wasn't able to use the existing answer without debugging a few things.

My goal was to programmatically kill a driver that runs persistently once a day, deploy any updates to the code, and then restart it, so I wouldn't know my driver ID ahead of time. It took me some time to figure out that you can only kill drivers that were submitted with the --deploy-mode cluster option. It also took me some time to realize that an application ID and a driver ID are two different things: while you can easily correlate an application name with an application ID, I have yet to find a way to discover the driver ID through the API endpoints and correlate it to either an application name or the class you are running. So while spark-class org.apache.spark.deploy.Client kill <master url> <driver ID> works, you need to make sure you are deploying your driver in cluster mode and are using the driver ID, not the application ID.
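One practical way to avoid mixing the two up: in standalone mode the two ID families have visibly different shapes. As a small sketch (the regexes below are my own assumption about the ID format, based on IDs of the form driver-yyyyMMddHHmmss-NNNN and app-yyyyMMddHHmmss-NNNN that the standalone master generates; verify against your own cluster's IDs):

```python
import re

# Assumed standalone ID shapes (the concrete timestamps are made-up examples):
#   application ID: app-20170101000000-0000
#   driver ID:      driver-20170101000000-0000
DRIVER_ID_RE = re.compile(r'^driver-\d{14}-\d{4}$')
APP_ID_RE = re.compile(r'^app-\d{14}-\d{4}$')

def is_driver_id(candidate):
    """True only for strings shaped like standalone driver IDs,
    which is what the kill command expects."""
    return DRIVER_ID_RE.match(candidate) is not None
```

A guard like this before issuing a kill makes it obvious when you have accidentally grabbed an application ID instead.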

Additionally, Spark provides a REST submission endpoint by default at http://<spark master>:6066/v1/submissions, and you can POST to http://<spark master>:6066/v1/submissions/kill/<driver ID> to kill your driver.
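A minimal way to hit that kill endpoint from Python with only the standard library (the hostname and helper names here are placeholders of mine, not anything Spark defines; 6066 is the REST server's default port):

```python
import json
import urllib.request

def kill_url(master, driver_id, port=6066):
    """Build the kill URL on Spark's standalone REST submission server."""
    return "http://{}:{}/v1/submissions/kill/{}".format(master, port, driver_id)

def kill_driver(master, driver_id):
    """POST an empty body to the kill endpoint and return the parsed JSON
    reply, which reports whether the master accepted the kill request."""
    req = urllib.request.Request(
        kill_url(master, driver_id), data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Check the response body rather than just the HTTP status: the master can answer 200 while still declining the kill (for example, for an unknown driver ID).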

Since I wasn't able to find the driver ID that correlated to a specific job from any API endpoint, I wrote a Python web scraper that reads the basic Spark master web page on port 8080 and then kills the driver through the endpoint on port 6066. I'd prefer to get this data in a supported way, but this is the best solution I could find.

#!/usr/bin/env python3

import sys
import re
import json
import requests
from selenium import webdriver

# Fully qualified class names to kill are passed as command-line arguments.
classes_to_kill = sys.argv[1:]
spark_master = 'masterurl'  # replace with your Spark master hostname

# PhantomJS renders the master web UI so we can scrape the Running Drivers table.
driver = webdriver.PhantomJS()
driver.get("http://" + spark_master + ":8080/")

for running_driver in driver.find_elements_by_xpath("//*/div/h4[contains(text(), 'Running Drivers')]"):
    for driver_id in running_driver.find_elements_by_xpath("..//table/tbody/tr/td[contains(text(), 'driver-')]"):
        for class_to_kill in classes_to_kill:
            # Match rows whose main-class column equals the class we want to kill.
            right_class = driver_id.find_elements_by_xpath("../td[text()='" + class_to_kill + "']")
            if len(right_class) > 0:
                driver_to_kill = re.search(r'^driver-\S+', driver_id.text).group(0)
                print("Killing " + driver_to_kill)
                # Kill the driver through the REST submission server on port 6066.
                result = requests.post("http://" + spark_master + ":6066/v1/submissions/kill/" + driver_to_kill)
                print(json.dumps(json.loads(result.text), indent=4))

driver.quit()
