Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

spark sql transformation of string to timestamp

avatar
New Member

I am running this query through beeline and it transforms my string value from string to timestamp successfully.

select cast(regexp_replace(createdatetime,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})Z','$1-$2-$3 $4:$5:$6.$7') as timestamp) as thetime from L2_view where load_dt='20170908' and createdatetime is not null;

When I run the same query through spark sqlContext I get nulls.

sqlContext.sql("select cast(regexp_replace(createdatetime,'(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})Z','$1-$2-$3 $4:$5:$6.$7') as timestamp) as thetime from L2_view where load_dt='20170908' and createdatetime is not null").show

+-------+ |thetime| +-------+ +-------+

Can you explain why this happens? I am running Spark 1.6.

1 ACCEPTED SOLUTION

avatar

@kenny creed

you are using regexp_replace in spark which gives you string datatype. In spark you have to use cast to convert it. Given below an example which might help in solving your problem:

Hope it helps!

val res = df.select($"id", $"date", unix_timestamp($"date", "yyyy/MM/dd HH:mm:ss").cast(TimestampType).as("timestamp"), current_timestamp(), current_date())

View solution in original post

2 REPLIES 2

avatar

@kenny creed

you are using regexp_replace in spark which gives you string datatype. In spark you have to use cast to convert it. Given below an example which might help in solving your problem:

Hope it helps!

val res = df.select($"id", $"date", unix_timestamp($"date", "yyyy/MM/dd HH:mm:ss").cast(TimestampType).as("timestamp"), current_timestamp(), current_date())

avatar
New Member

Perfect, thanks!