Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

nifi convertavrotoorc decimal datatype issue

Solved Go to solution
Highlighted

nifi convertavrotoorc decimal datatype issue

Explorer

Hi, I am using nifi to load data to hive from sqlserver. I am using convertavrotoOrc to load data in orc format via puthdfs and puthiveql. Sqlserver datatypes of decimal are converted into binary datatype by processor and data is not readable. How I can correct this.

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted

Re: nifi convertavrotoorc decimal datatype issue

Super Guru

ConvertAvroToORC is in the Hive bundle which uses Avro 1.7.7, which does not support logical types such as decimal. This is discussed in NIFI-5079, where it was decided to add support via a PutORC processor in the upcoming Hive 3 bundle (slated for NiFi 1.7.0 and HDF 3.2).

If you are using a version of NiFi prior to 1.6.0, then upgrading may help solve the original conversion issue when fetching from SQLServer (via NIFI-4846). However if you're converting to ORC you will still run into the issue above. A workaround might be to store the Avro directly to HDFS and put the Hive table atop the Avro data vs ORC data.

View solution in original post

4 REPLIES 4
Highlighted

Re: nifi convertavrotoorc decimal datatype issue

Super Guru

ConvertAvroToORC is in the Hive bundle which uses Avro 1.7.7, which does not support logical types such as decimal. This is discussed in NIFI-5079, where it was decided to add support via a PutORC processor in the upcoming Hive 3 bundle (slated for NiFi 1.7.0 and HDF 3.2).

If you are using a version of NiFi prior to 1.6.0, then upgrading may help solve the original conversion issue when fetching from SQLServer (via NIFI-4846). However if you're converting to ORC you will still run into the issue above. A workaround might be to store the Avro directly to HDFS and put the Hive table atop the Avro data vs ORC data.

View solution in original post

Highlighted

Re: nifi convertavrotoorc decimal datatype issue

Super Collaborator

@Matt Burgess,

if we follow your approach , we can then use INSERT INTO SELECT to load from Avro table to ORC table.? reason being in some cases like ACID in Hive only works in ORC format.

Highlighted

Re: nifi convertavrotoorc decimal datatype issue

Super Guru

Yes I think that will work. Also if you convert from decimal to a different type then you should be able to use PutHiveStreaming, although that isn't always as performant as it could be. In the upcoming Hive 3 bundle, there is a new Streaming API and PutHive3Streaming should be much faster (and Avro logical types are supported).

Re: nifi convertavrotoorc decimal datatype issue

Explorer

Thanks @Matt Burgess and @Saikrishna Tarapareddy. I converted upstream decimal to string to get it correctly in incremental ACID mode into hive table.

Don't have an account?
Coming from Hortonworks? Activate your account here