Support Questions

Find answers, ask questions, and share your expertise
Announcements
Welcome to the upgraded Community! Read this blog to see What’s New!

Best Practice - JSON to Avro, data type preserving

avatar
Super Collaborator

Hi All, What is the best approach to convert a JSON to AVRO preserving the source datatypes. My source JSON has a field with timestamp (value would look like 2017-01-26T00:00:00-05:00) which I need to eventually insert to a hive table with column type timestamp. When I infer the schema, I get String for the timestamp field. Is there some pre-formatting that I can do on the timestamp field so that it gets inferred as timestamp field. Current flow is as below - JSON>>AVRO(infer/manually add schema)>>Streaming Insert to hive

21 REPLIES 21

avatar
Contributor

Hi @Arun A K

This is a known issue where the datatypes are not preserved. https://issues.apache.org/jira/browse/NIFI-2624 which talks about Oracle/SQL datatypes not being preserved. You should also check out https://gist.github.com/ijokarumawak/69b29fa7b11c2ada656823db614af373

As mentioned by @Karthik Narayanan, best approach would be to use Record Oriented processors.

Labels