Write a column as a date with a custom format in Java Spark

I'm using Java-Spark.

I have the following table in a Dataset object:

creationDate
15/06/2018 09:15:28

I select this column, rename it, and cast it to a date:

Dataset<Row> ds = dataframe.select(new Column("creationDate").as("mydate").cast("date"));

And I write it with:

ds.write().mode(mode).save(hdfsDirectory);

I also tried:

ds.write().option("dateFormat","dd/MM/yyyy HH:mm:ss").mode(mode).save(hdfsDirectory);

But when I look at my table, the column mydate is null.

How can I write my date into my Hive table? I know the default date format is yyyy-MM-dd, but my text is in dd/MM/yyyy format and I can't change it.

Any suggestions?

Thanks.

2 REPLIES

@Is Ta

null means the conversion failed. I think this is because your creationDate value is actually a timestamp in dd/MM/yyyy HH:mm:ss form, not a date, so the plain cast to date cannot parse it. The following code is Scala Spark, as I'm not very used to Java Spark; hopefully you can adapt it for Java:

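// Imports for to_timestamp, date_format and the $"..." column syntax (spark is the SparkSession)
import org.apache.spark.sql.functions.{date_format, to_timestamp}
import spark.implicits._
// dataframe is the original DataFrame containing the creationDate column
val ds = dataframe.withColumn("timestamp", to_timestamp($"creationDate", "dd/MM/yyyy HH:mm:ss"))
val result = ds.withColumn("date_formatted", date_format($"timestamp", "dd/MM/yyyy HH:mm:ss"))
result.show()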
This is an example of the output (in my test the input column was named input_date rather than creationDate):
+-------------------+-------------------+-------------------+
|         input_date|          timestamp|     date_formatted|
+-------------------+-------------------+-------------------+
|15/06/2018 09:15:28|2018-06-15 09:15:28|15/06/2018 09:15:28|
|03/06/1982 09:15:28|1982-06-03 09:15:28|03/06/1982 09:15:28|
+-------------------+-------------------+-------------------+
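
For the Java API, a rough equivalent of the Scala snippet might look like the sketch below. The class name CreationDateJob, the local master setting and the sample row are assumptions made for illustration only; to_timestamp, date_format and to_date are the standard functions in org.apache.spark.sql.functions (the two-argument to_timestamp needs Spark 2.2 or later). The extra mydate column is optional and only shows one way to get a real DATE value next to the string column.

```java
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.date_format;
import static org.apache.spark.sql.functions.to_date;
import static org.apache.spark.sql.functions.to_timestamp;

// Hypothetical class name; only creationDate and mydate come from the question.
class CreationDateJob {
    // Pattern of the source text, e.g. "15/06/2018 09:15:28".
    static final String PATTERN = "dd/MM/yyyy HH:mm:ss";

    static Dataset<Row> convert(Dataset<Row> dataframe) {
        return dataframe
                // Parse the text with an explicit pattern instead of cast("date").
                .withColumn("timestamp", to_timestamp(col("creationDate"), PATTERN))
                // String column in the original layout, as in the Scala snippet.
                .withColumn("date_formatted", date_format(col("timestamp"), PATTERN))
                // Optional: a real DATE column derived from the parsed timestamp.
                .withColumn("mydate", to_date(col("timestamp")));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("creation-date-example")
                .master("local[*]") // assumption: local run for illustration
                .getOrCreate();

        // Single sample row taken from the question.
        Dataset<Row> dataframe = spark
                .createDataset(Arrays.asList("15/06/2018 09:15:28"), Encoders.STRING())
                .toDF("creationDate");

        convert(dataframe).show(false);
        spark.stop();
    }
}
```

In the original job, the result of CreationDateJob.convert(dataframe) would be passed to write().mode(mode).save(hdfsDirectory) as before. Rows whose text does not match the pattern will typically still come out as null.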

The result is also saved correctly when you write it to a file, since the date_formatted column is a string.
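
Since everything hinges on the pattern matching the text, it can be worth checking the pattern in plain Java before running the Spark job. The sketch below uses only java.time; the class and method names are made up for this example, and Spark's own parser is not guaranteed to behave identically to java.time in every version, so treat it as a sanity check rather than proof.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper class, not part of the original post.
class CreationDatePattern {
    // Pattern matching the source text, e.g. "15/06/2018 09:15:28".
    static final DateTimeFormatter SOURCE_FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss");

    // Returns true if the text can be parsed with the given formatter.
    static boolean parses(String text, DateTimeFormatter formatter) {
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String creationDate = "15/06/2018 09:15:28";

        // The ISO formatter expects text like 2018-06-15T09:15:28, so this prints false.
        System.out.println(parses(creationDate, DateTimeFormatter.ISO_LOCAL_DATE_TIME));

        // The explicit pattern matches the text, so this prints true.
        System.out.println(parses(creationDate, SOURCE_FORMAT));

        // Once parsed, the date part prints in ISO form: 2018-06-15
        LocalDateTime parsed = LocalDateTime.parse(creationDate, SOURCE_FORMAT);
        System.out.println(parsed.toLocalDate());
    }
}
```

The first check mirrors what happens with a plain cast: the text is not in the expected default layout, so parsing fails. The second shows that the same text parses once the pattern is given explicitly.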

HTH

@Is Ta

Please let me know if the above helped.

Thanks!