Convert milliseconds DataFrame column into Unix timestamp using Spark Scala?
- Labels: Apache Spark
Created 11-09-2017 02:48 PM
The events CSV file looks like this:
|display_id|          uuid|document_id|timestamp|platformgeo_location|
|         1|cb8c55702adb93|     379743|       61|                   3|
|         2|79a85fa78311b9|    1794259|       81|                   2|
|         3|822932ce3d8757|    1179111|      182|                   2|
|         4|85281d0a49f7ac|    1777797|      234|                   2|
This is my Spark Scala code:
- import org.joda.time._
- case class flight(display_id: Int, uuid: String, document_id: Int, timestamp: String, platformgeo_location: String)
- val streamdf = sc.textFile("/FileStore/tables/y6ak4fzq1504260076447/events.csv").map(_.split(",")).map(x => flight(x(0).toInt, x(1).toString, x(2).toInt, x(3).toString, x(4).toString)).toDF()
- streamdf.show()
- streamdf.registerTempTable("event1")
- val result = sqlContext.sql("select * from event1 limit 10")
- val addP = (p: Int) => udf( (x: Int) => x + p )
- val stamp = streamdf.withColumn("timestamp", addP(1465876799)($"timestamp")).toDF()
- stamp.show()
- stamp.registerTempTable("stamp")
- new org.joda.time.DateTime(1465876799L * 1000)
- val df = sqlContext.sql("select from_unixtime(timestamp,'YYYY-MM-dd') as 'ts' from stamp")
When I execute the last command (the from_unixtime query that defines val df), I get a type mismatch error.
How can I resolve this problem? Please help me out.
Thanks in advance
swathi.T
Created 11-10-2017 01:36 AM
Since your timestamp field is typed as a string, cast it to bigint or int as your requirements dictate; then from_unixtime will work.
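In the DataFrame API the same cast looks roughly like this (a minimal sketch, assuming the stamp DataFrame from your code above and Spark 1.5+ SQL functions; the ts column name is just illustrative):

import org.apache.spark.sql.functions.{col, from_unixtime}

// Cast the string timestamp column to bigint, then format it as a date string.
val withTs = stamp.withColumn("ts", from_unixtime(col("timestamp").cast("bigint"), "yyyy-MM-dd"))
withTs.select("timestamp", "ts").show()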
Here are the possible outputs for your timestamp value 1465876799; you can check them in the hive (or) beeline shell.
hive> select from_unixtime(1465876799, 'yyyy-MM-dd');
2016-06-13
hive> select from_unixtime(CAST(1465876799000 as int), 'yyyy-MM-dd');
2010-12-21
hive> select from_unixtime(CAST(1465876799000 as bigint), 'yyyy-MM-dd');
48421-10-14
hive> select from_unixtime(CAST(1465876799000/1000 as BIGINT), 'yyyy-MM-dd');
2016-06-13
Error:-
hive> select from_unixtime(CAST(1465876799000 as string), 'yyyy-MM-dd');
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments ''yyyy-MM-dd'': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string, string). Possible choices: _FUNC_(bigint) _FUNC_(bigint, string) _FUNC_(int) _FUNC_(int, string)
As you can see above, casting 1465876799000 as string gives an error; the only accepted argument types are bigint and int.
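If your column actually holds milliseconds, the same divide-by-1000 trick works from Spark SQL as well. A sketch, reusing the stamp table from your code (the nested casts are only there because the column is currently a string):

// Convert a milliseconds value to seconds, then format it as a date.
val fromMillis = sqlContext.sql(
  "select from_unixtime(cast(cast(timestamp as bigint)/1000 as bigint), 'yyyy-MM-dd') as ts from stamp")
fromMillis.show()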
Possible query for your case:
val df = sqlContext.sql("select from_unixtime(cast(timestamp as bigint), 'yyyy-MM-dd') as ts from stamp")
(or)
change the data type in the case class:
case class flight(display_id: Int, uuid: String, document_id: Int, timestamp: BigInt, platformgeo_location: String)
val df = sqlContext.sql("select from_unixtime(timestamp, 'yyyy-MM-dd') as ts from stamp")
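A hedged sketch of that case-class route (it reuses the path and names from your question, omits the addP offset step for brevity, and uses Long for the timestamp field instead of BigInt, since Spark maps Long to BIGINT, which from_unixtime accepts directly):

import sqlContext.implicits._

// Timestamp parsed as Long so the column is already a BIGINT in SQL.
case class FlightTs(display_id: Int, uuid: String, document_id: Int, timestamp: Long, platformgeo_location: String)

val tsDf = sc.textFile("/FileStore/tables/y6ak4fzq1504260076447/events.csv")
  .map(_.split(","))
  .map(x => FlightTs(x(0).toInt, x(1), x(2).toInt, x(3).toLong, x(4)))
  .toDF()
tsDf.registerTempTable("stamp")
val df = sqlContext.sql("select from_unixtime(timestamp, 'yyyy-MM-dd') as ts from stamp")
df.show()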
I have listed all the possible outputs above, tested in the hive shell with the int and bigint data types.
You can pick whichever best fits your case.
