spark sql transformation of string to timestamp

Explorer

I am running this query through beeline, and it successfully converts my string value to a timestamp.

select cast(regexp_replace(createdatetime,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})Z','$1-$2-$3 $4:$5:$6.$7') as timestamp) as thetime from L2_view where load_dt='20170908' and createdatetime is not null;

When I run the same query through spark sqlContext I get nulls.

sqlContext.sql("select cast(regexp_replace(createdatetime,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})Z','$1-$2-$3 $4:$5:$6.$7') as timestamp) as thetime from L2_view where load_dt='20170908' and createdatetime is not null").show

+-------+
|thetime|
+-------+
+-------+

Can you explain why this happens? I am running Spark 1.6.

1 ACCEPTED SOLUTION


@kenny creed

You are using regexp_replace in Spark, which returns a string; to get a timestamp you have to cast the result. Below is an example that might help solve your problem:

import org.apache.spark.sql.functions._   // unix_timestamp, current_timestamp, current_date
import org.apache.spark.sql.types.TimestampType
import sqlContext.implicits._             // enables the $"column" syntax

// Parse the date string into epoch seconds, then cast the result to a timestamp.
val res = df.select($"id", $"date", unix_timestamp($"date", "yyyy/MM/dd HH:mm:ss").cast(TimestampType).as("timestamp"), current_timestamp(), current_date())

Hope it helps!
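
For the ISO 8601 strings in the question (e.g. 2017-09-08T12:34:56.789Z) you could also skip the regex and parse the string directly with unix_timestamp. Below is a minimal sketch along those lines, assuming the view's rows are in a DataFrame named df (a placeholder name, as is withTs) and reusing createdatetime and thetime from the question; note that unix_timestamp resolves to whole seconds, so the .SSS milliseconds are discarded:

import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType
import sqlContext.implicits._

// Parse the ISO 8601 string with a SimpleDateFormat pattern, quoting the
// literal T and Z, then cast the epoch seconds to a timestamp.
val withTs = df.select(
  $"createdatetime",
  unix_timestamp($"createdatetime", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
    .cast(TimestampType)
    .as("thetime"))

If the milliseconds matter, the regexp_replace-plus-cast approach from the question is the one that should keep them.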

2 REPLIES

Explorer

Perfect, thanks!