PutHiveStreaming with null timestamp values

New Contributor

The target field in Hive is of type timestamp, and the source JSON contains either a proper timestamp, an empty string (""), or null in that field. I convert the source JSON to Avro before using the PutHiveStreaming processor. Records with a properly formatted timestamp are written to my Hive target table successfully, but those with an empty string or null fail with the error: Illegal format. Timestamp format should be "YYYY-MM-DD HH:MM:SS[.fffffffff]". I know it works if I default the value to some date when it is null or empty, but I do not want that. I want the value to be null in my target table when it is null or empty in the source. How can I achieve this?

2 REPLIES

Re: PutHiveStreaming with null timestamp values

New Contributor

Hi John,

I saw your response on Stack Overflow:


"We got around this problem by adjusting the schema before calling puthivestreaming, though I think it was better if puthivestreaming handled empty string values for timestamp fields. Anyways, here is the sample avro schema that am using { "type": "record", "name": "Test_Schema", "fields":[ {"name" :"id", "type" : ["string","null"]}, {"name" :"record_date", "type" : ["string","null"]}, {"name" :"operation", "type" : ["string","null"]} ] }"

How did you adjust the schema?

Cheers.

Re: PutHiveStreaming with null timestamp values

New Contributor

Hello Jackson,

This is how we did it.

Step 1:

Create an UpdateAttribute processor with the following three attributes: schemaStart, schemaEndTags, and recordDateSchemaElement.

schemaStart ---> { "type": "record", "name": "Test_Schema", "fields":[ {"name" :"id", "type" : ["string","null"]}, {"name" :"operation", "type" : ["string","null"]}

schemaEndTags ---> ]}

recordDateSchemaElement ---> ${record_date:isEmpty():ifElse('', ', {"name" :"record_date", "type" : ["string","null"]}')}

Step 2:

Add another UpdateAttribute processor, connected downstream of the one above, with this attribute:

completeSchema ---> ${allAttributes("schemaStart", "recordDateSchemaElement","schemaEndTags" ):join(" ")}

Now completeSchema will contain the "record_date" element only if record_date is NOT empty.
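
To check the logic outside NiFi, the two steps can be sketched in Python. This is only an illustration of the string assembly, not NiFi code: the function names build_schema, is_empty and field_names are made up for this example, and it assumes record_date is already available as a flowfile attribute (for instance, extracted from the JSON earlier in the flow). is_empty imitates the Expression Language isEmpty() function, which is true for null, empty, or whitespace-only values.

```python
import json

# Same fragments as the schemaStart and schemaEndTags attributes above.
SCHEMA_START = (
    '{ "type": "record", "name": "Test_Schema", "fields":[ '
    '{"name" :"id", "type" : ["string","null"]}, '
    '{"name" :"operation", "type" : ["string","null"]}'
)
SCHEMA_END_TAGS = "]}"
RECORD_DATE_ELEMENT = ', {"name" :"record_date", "type" : ["string","null"]}'


def is_empty(value):
    """Imitate isEmpty(): true for None, empty, or whitespace-only."""
    return value is None or value.strip() == ""


def build_schema(record_date):
    """Imitate recordDateSchemaElement plus the join(" ") in step 2."""
    element = "" if is_empty(record_date) else RECORD_DATE_ELEMENT
    return " ".join([SCHEMA_START, element, SCHEMA_END_TAGS])


def field_names(schema_text):
    """Parse the assembled schema as JSON and list its field names."""
    return [field["name"] for field in json.loads(schema_text)["fields"]]


if __name__ == "__main__":
    print(field_names(build_schema("2018-01-15 10:30:00")))
    print(field_names(build_schema("")))
    print(field_names(build_schema(None)))
```

With a real timestamp the schema lists id, operation and record_date; with "" or null it lists only id and operation. In both cases the assembled string parses as valid JSON, because the leading comma is part of the optional element.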

Hope this helps.