11-27-2017 04:45 AM
I'm attemtping to import a csv file into a table into Hadoop.
The file has imported but the fields with a timestamp field have returned as null for all rows (no nulls in original file).
I've currently formatted the columns in the csv file as custom format yyyy-mm-dd hh:dd:ss
and set the field in the Hadoop table as a timestamp. Presumably the excel format isn't being recognised.
How can I resolve this?
11-27-2017 02:29 PM - edited 11-27-2017 02:39 PM
The standard timestamp format is"yyyy-MM-dd HH:mm:ss[.SSSSSS]". Is there any warning printed when selecting from this table? Could you share a sample row in the text file?
11-28-2017 06:04 AM
Example of the field is
None of the fields have any second values. They're all 00.
Within the file it's formatted as dd/mm/yyyy hh:mm.
I've tried changing it to yyyy-mm-dd hh:mm:ss but when I do that, save and re-open it's reverted to the original format.
I can upload into a text field (and it appears as 27/01/2016 11:25) but then if I cast the field as a timestamp it then returns null values.
11-28-2017 11:27 AM - edited 11-28-2017 11:29 AM
You can create a table with a text column first, and then use to_timestamp to convert it into a table with a timestamp column.
create table timestamp_table as select column_1, column_2, to_timestamp(column_3, "dd/MM/yyyy HH:mm") from text_table;
For documentations on timestamp-related functions, please visit https://impala.apache.org/docs/build/html/topics/impala_datetime_functions.html