
NiFi PutHiveStreaming processor error

Super Collaborator

Attachments: db-hive-temp.xml, nifi-db-hive.jpg

Please see the attached error. I am feeding it JSON data but it is complaining about Avro format.

1 ACCEPTED SOLUTION

Master Guru
@Sami Ahmad

The output of the QueryDatabaseTable processor is always in Avro format, so you can use the PutHiveStreaming processor directly after QueryDatabaseTable.

PutHiveStreaming expects its incoming data to be in Avro format, and that is exactly what QueryDatabaseTable gives us, so no conversion step is needed.

Flow:

1. QueryDatabaseTable
2. PutHiveStreaming
3. LogAttribute

Please refer to the links below regarding table creation for the PutHiveStreaming processor:

https://community.hortonworks.com/questions/59411/how-to-use-puthivestreaming.html
https://community.hortonworks.com/articles/52856/stream-data-into-hive-like-a-king-using-nifi.html
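
Also note that PutHiveStreaming writes through Hive's streaming ingest API, which only works against ACID tables, so transactions have to be enabled on the Hive side as well. As a rough sketch (these properties are normally set in hive-site.xml rather than per-session; values here are illustrative, check your HDP version):

SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;
-- hive.enforce.bucketing is only needed on Hive 1.x; bucketing is always enforced in Hive 2+
SET hive.enforce.bucketing=true;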


15 REPLIES

Super Collaborator

Attachment: db-record.zip

I think it's a data issue; I have uploaded a queue entry, please take a look.

The table DDL is as follows:

CREATE TABLE purchase_acct_orc (
  acct_num BIGINT,
  pur_id BIGINT,
  pur_det_id BIGINT,
  product_pur_product_code STRING,
  prod_amt FLOAT,
  accttype_acct_type_code STRING,
  acctstat_acct_status_code STRING,
  emp_emp_code STRING,
  plaza_plaza_id STRING,
  purstat_pur_status_code STRING
)
PARTITIONED BY (pur_trans_date TIMESTAMP)
CLUSTERED BY (acct_num) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES ("transactional"="true");

Super Collaborator

Why do I have "LogAttribute" with the QueryDatabaseTable processor? It is showing a red block in the corner, which normally indicates an issue, I think.

From QueryDatabaseTable there are two relationships, one going to PutHiveStreaming and one going to LogAttribute. Why do I need two relationships for success? (Please see the attached screenshot.)

Attachment: capture.jpg

Super Collaborator

Attachment: puhive-after-querydatabase.jpg

I was told by the HW engineer who did the initial setup that the PutHiveStreaming processor is not able to write into the Hive table because the data is not in JSON format, and that I need to convert it to JSON first. Is that correct?

Please see the attached error, which occurs when I put PutHiveStreaming after the QueryDatabaseTable processor.

Master Guru

What is the Hive table DDL?

It needs to be bucketed and stored as ORC.

See my example:

https://community.hortonworks.com/articles/174538/apache-deep-learning-101-using-apache-mxnet-with-h...
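
In short, the target table should look roughly like this (a sketch only, with illustrative table and column names):

CREATE TABLE my_stream_target (
  id BIGINT,
  name STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ("transactional"="true");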

Master Guru

The incoming data should be Avro with a schema.

Get a document or example code from an engineer.
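
For reference, QueryDatabaseTable embeds the schema in the Avro it emits. For your table it would look something like the sketch below (hand-written, showing only the first few fields using the column names from your DDL; the record name, field case, and exact types in the generated schema may differ):

{
  "type": "record",
  "name": "purchase_acct_orc",
  "fields": [
    { "name": "acct_num", "type": ["null", "long"] },
    { "name": "pur_id", "type": ["null", "long"] },
    { "name": "prod_amt", "type": ["null", "float"] },
    { "name": "product_pur_product_code", "type": ["null", "string"] }
  ]
}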

Super Collaborator

My Hive table is bucketed and in ORC format.

So why am I getting an error, and what does this error mean?