Spark 2.2: java.lang.IllegalArgumentException: Illegal pattern component: XXX

I found the answer.

The default timestampFormat is yyyy-MM-dd'T'HH:mm:ss.SSSXXX, and the XXX component in it is what triggers the IllegalArgumentException. The format needs to be set explicitly when you write the DataFrame out.

The fix is to replace XXX with ZZ, which still includes the time zone:

.option("timestampFormat", "yyyy/MM/dd HH:mm:ss ZZ")
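For context, here is a minimal sketch of where that option goes in a full write call. It assumes PySpark and a CSV output, neither of which the answer specifies; the helper name write_csv_with_timestamp_format and its parameters are made up for illustration.

```python
# Pattern taken from the answer above: ZZ replaces the XXX component.
TIMESTAMP_FORMAT = "yyyy/MM/dd HH:mm:ss ZZ"


def write_csv_with_timestamp_format(df, path, timestamp_format=TIMESTAMP_FORMAT):
    """Write `df` as CSV with an explicit timestampFormat (hypothetical helper).

    `df` is expected to be a Spark DataFrame; the CSV output is an assumption,
    the same .option(...) call can be chained before other writer methods.
    """
    # Set the option on the DataFrameWriter before triggering the write.
    df.write.option("timestampFormat", timestamp_format).csv(path)
```

With an existing DataFrame df, the call would look like write_csv_with_timestamp_format(df, "/tmp/out"), where the output path is only a placeholder.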

Ensure you are using the correct version of commons-lang3.

Using commons-lang3-3.5.jar fixed the original error. I didn't check the source code to see why, but it is not surprising, since the original exception is thrown from org.apache.commons.lang3.time.FastDatePrinter.parsePattern. I also noticed the file /usr/lib/spark/jars/commons-lang3-3.5.jar (on an EMR cluster instance), which also suggests that 3.5 is the consistent version to use.
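If you want to check which commons-lang3 jars sit in a given directory, something like the following might help. It is only a sketch using the Python standard library; the function name find_commons_lang3_jars is hypothetical, and the jar directory is whatever applies on your machine.

```python
import glob
import os


def find_commons_lang3_jars(jars_dir):
    """Return the sorted file names of commons-lang3 jars found in `jars_dir`."""
    # Match versioned jars such as commons-lang3-3.5.jar.
    pattern = os.path.join(jars_dir, "commons-lang3-*.jar")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))
```

On the EMR instance mentioned above, the directory to pass would be /usr/lib/spark/jars. If more than one version shows up, or a different version is bundled into your application jar, that mismatch is a plausible place to start looking.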