How to convert a string column with milliseconds to a timestamp with milliseconds in Spark 2.1 using Scala?

A UDF with SimpleDateFormat works. The idea is taken from Ram Ghadiyaram's link to a UDF-based approach.

import java.text.SimpleDateFormat
import java.sql.Timestamp
import org.apache.spark.sql.functions.udf
import scala.util.{Try, Success, Failure}

val getTimestamp: (String => Option[Timestamp]) = s => s match {
  case "" => None
  case _ => {
    // SimpleDateFormat is not thread-safe, so create a fresh instance per call
    val format = new SimpleDateFormat("MM/dd/yyyy HH:mm:ss.SSS")
    // Return None instead of failing the job when the string does not parse
    Try(new Timestamp(format.parse(s).getTime)) match {
      case Success(t) => Some(t)
      case Failure(_) => None
    }
  }
}

val getTimestampUDF = udf(getTimestamp)
// toDF and $ need import spark.implicits._ outside of spark-shell
val tdf = Seq((1L, "05/26/2016 01:01:01.601"), (2L, "#$@#@#")).toDF("id", "dts")
val tts = getTimestampUDF($"dts")
tdf.withColumn("ts", tts).show(2, false)

with output:

+---+-----------------------+-----------------------+
|id |dts                    |ts                     |
+---+-----------------------+-----------------------+
|1  |05/26/2016 01:01:01.601|2016-05-26 01:01:01.601|
|2  |#$@#@#                 |null                   |
+---+-----------------------+-----------------------+

There is an easier way than writing a UDF. Just parse the millisecond part of the string and add it to the unix timestamp (the following code works with PySpark and should be very close to the Scala equivalent):

from pyspark.sql.functions import unix_timestamp, substring

timeFmt = "yyyy/MM/dd HH:mm:ss.SSS"
# unix_timestamp returns whole seconds, so re-attach the milliseconds
# taken from the last three characters of the string
df = df.withColumn('ux_t', unix_timestamp(df.t, format=timeFmt) + substring(df.t, -3, 3).cast('float')/1000)

Result: '2017/03/05 14:02:41.865' is converted to 1488722561.865
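
For completeness, a minimal Scala sketch of the same idea (assuming a DataFrame `df` with a string column `t`, as in the PySpark snippet) might look like this:

import org.apache.spark.sql.functions.{col, substring, unix_timestamp}

val timeFmt = "yyyy/MM/dd HH:mm:ss.SSS"
// unix_timestamp returns whole seconds, so re-attach the milliseconds
// parsed from the last three characters of the string
val withTs = df.withColumn(
  "ux_t",
  unix_timestamp(col("t"), timeFmt) + substring(col("t"), -3, 3).cast("float") / 1000
)

Note that this yields a fractional number of seconds (a double), not a Timestamp column; cast it to "timestamp" if you need that type.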