I'm working with NiFi 1.5.0. I downloaded Apache NiFi and installed it on one of my HDP cluster nodes. I'm running the test against a 3-node HDP 2.6 cluster.
I'm following a tutorial from the following link:
Everything works fine until the PutHiveStreaming step, where I get an error. It complains that it cannot connect to the Hive endpoint and the table. The table in Hive is created as a managed table, not as an external table. I can telnet to the Hive metastore server and port from my NiFi node. What could be the root cause of this issue?
nifi-error2.png
PutHiveStreaming worked with the HDF build of NiFi without any problem. I downloaded
nifi-184.108.40.206.0.0.0-453-bin.tar.gz from HDF, installed it on a node as a standalone application, and the file streamed to the Hive table without any issues. Thanks for your help.
Is your cluster kerberized? Can you access Hive from the command line of that machine? That article is a bit old. Can you post any log messages?
The HDF version of NiFi is configured just for this, and the addition of Ambari makes everything easier.
The NiFi user may not have permissions on the /apps/hive/warehouse directory.
What are its permissions? What is your Hive local scheme?
Does the NiFi user have HDFS read/write permissions?
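One way to check this, as a rough sketch: the Hive shell's dfs command passes its arguments to the HDFS client, so the directory's owner, group, and mode can be listed without leaving Hive. The path is the warehouse directory mentioned above; your cluster may use a different location.

```sql
-- List the warehouse directory itself (-d) rather than its contents,
-- to see its owner, group, and permission bits.
dfs -ls -d /apps/hive/warehouse;

-- List the table directories underneath it with their owners and modes.
dfs -ls /apps/hive/warehouse;
```

Compare the owner and mode shown with the OS user that runs the NiFi process on that node.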
Thanks Timothy. This could be the reason; I haven't created a NiFi user. How do I create one?
The table (OLYMPICS) is in the default schema. Here is the DDL for the Hive table.
-- created Hive table
CREATE TABLE OLYMPICS(CITY STRING, EDITION INT, SPORT STRING, SUB_SPORT STRING, ATHLETE STRING, COUNTRY STRING, GENDER STRING, EVENT STRING, EVENT_GENDER STRING, MEDAL STRING)
CLUSTERED BY (EDITION) INTO 3 BUCKETS
ROW FORMAT DELIMITED
STORED AS ORC
In order for Hive Streaming to work, the following has to be in place:
The table needs to be created properly, with the right permissions.
Example table DDL:
CREATE TABLE `inception`(
  uuid STRING,
  top1pct STRING, top1 STRING,
  top2pct STRING, top2 STRING,
  top3pct STRING, top3 STRING,
  top4pct STRING, top4 STRING,
  top5pct STRING, top5 STRING,
  imagefilename STRING,
  runtime STRING)
CLUSTERED BY (top1) INTO 3 BUCKETS
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
TBLPROPERTIES ('transactional'='true')
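Applied to the OLYMPICS table from the question, the same pattern might look like the sketch below. The DDL posted in the question has no 'transactional' property, so that is worth checking first. The sketch assumes ACID transactions are enabled on the Hive side and that the existing table is bucketed and stored as ORC. The name OLYMPICS_ACID is only a placeholder for illustration; the columns are copied from the question.

```sql
-- Check whether the existing table already carries 'transactional'='true'.
SHOW TBLPROPERTIES OLYMPICS;

-- Alternative 1 (assumes the table is already bucketed ORC):
-- mark it transactional in place. This cannot be switched back later.
ALTER TABLE OLYMPICS SET TBLPROPERTIES ('transactional'='true');

-- Alternative 2: create a fresh transactional table.
-- OLYMPICS_ACID is a hypothetical name used here only as an example.
CREATE TABLE OLYMPICS_ACID (
  CITY STRING, EDITION INT, SPORT STRING, SUB_SPORT STRING,
  ATHLETE STRING, COUNTRY STRING, GENDER STRING, EVENT STRING,
  EVENT_GENDER STRING, MEDAL STRING)
CLUSTERED BY (EDITION) INTO 3 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
```

Use one alternative or the other, not both. If you create a new table, PutHiveStreaming has to be pointed at the new table name.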