Created 02-26-2018 08:03 PM
I'm working with NiFi 1.5.0. I downloaded Apache NiFi and installed it on one of my HDP cluster nodes. I'm running the test against a 3-node HDP 2.6 cluster.
I'm following the tutorial at this link:
https://community.hortonworks.com/articles/52856/stream-data-into-hive-like-a-king-using-nifi.html
Everything works fine, but I get an error at the PutHiveStreaming step. It's basically complaining about an error connecting to the Hive endpoint and the table. The table in Hive is created as a managed table, not as an external table. I can telnet to the Hive metastore server and port from my NiFi node. What would be the root cause of this issue?
Error attached: nifi-error2.png
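For readers who want to script the telnet check mentioned above, here is a minimal sketch in Python. The function name `can_connect` is made up for illustration, and the host and port you pass in are your own values. Keep in mind that it only shows the TCP port is reachable; as this post shows, PutHiveStreaming can still fail when the port is open.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This is only a reachability test, roughly what `telnet host port`
    tells you. It does not speak the metastore protocol.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes it again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

A call would look like `can_connect("metastore.example.com", 9083)`, where the hostname is a placeholder and 9083 is the usual default metastore Thrift port; check `hive.metastore.uris` in your hive-site.xml for the real values.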
Created 02-27-2018 03:52 AM
I believe you'll want to run the HDF build of NiFi which has libraries tailored to work with HDP Hive.
Created 03-05-2018 08:03 PM
PutHiveStreaming worked with the HDF build of NiFi without any problem. I downloaded nifi-1.2.0.3.0.0.0-453-bin.tar.gz from HDF, installed it on a node as a standalone application, and the file streamed to the Hive table without any issues. Thanks for your help.
Created 02-27-2018 06:14 PM
Is your cluster Kerberized? Can you access Hive from the command line of that machine? That article is a bit old. Can you post any log messages?
The HDF version of NiFi is configured for exactly this, and the addition of Ambari makes everything easier.
Created 02-27-2018 08:28 PM
Created 03-05-2018 08:04 PM
Thanks, Tim. The NiFi flow worked with the HDF version of NiFi.
Created 02-27-2018 06:46 PM
The NiFi user may not have permissions on the /apps/hive/warehouse directory.
What are its permissions? What is your Hive schema?
Does the NiFi user have HDFS read/write permissions?
Created 02-27-2018 08:00 PM
Thanks, Timothy. This could be the reason; I haven't created a NiFi user. How do I create a NiFi user?
The table (OLYMPICS) is in the default schema. Here is the DDL for the Hive table:
--created hive table
CREATE TABLE
OLYMPICS(CITY STRING,EDITION INT,SPORT STRING,SUB_SPORT STRING,ATHLETE STRING,COUNTRY STRING,GENDER STRING,EVENT STRING,EVENT_GENDER STRING,MEDAL STRING)
CLUSTERED BY (EDITION)INTO 3 BUCKETS
ROW FORMAT DELIMITED
STORED AS ORC
LOCATION '/tmp/olympics'
TBLPROPERTIES('transactional'='true');
Created 02-27-2018 06:48 PM
In order for Hive Streaming to work the following has to be in place:
Created 02-27-2018 06:59 PM
The table needs to be created properly, with the right permissions.
Example table DDL:
CREATE TABLE `inception`(
  uuid STRING, top1pct STRING, top1 STRING, top2pct STRING, top2 STRING,
  top3pct STRING, top3 STRING, top4pct STRING, top4 STRING,
  top5pct STRING, top5 STRING, imagefilename STRING, runtime STRING)
CLUSTERED BY (top1) INTO 3 BUCKETS
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
TBLPROPERTIES ('transactional'='true')
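Both DDL statements in this thread share three table-level traits: the table is bucketed, stored as ORC, and marked transactional. As far as I know, Hive Streaming in this Hive generation expects all three. The Python sketch below is a rough way to sanity-check a CREATE TABLE statement for them. The function name `check_streaming_ddl` is made up for illustration, the regexes only look at the DDL text, and nothing here verifies server-side settings, HDFS permissions, or library versions.

```python
import re


def check_streaming_ddl(ddl: str) -> dict:
    """Report which streaming-related clauses appear in a CREATE TABLE string.

    Text-level check only: it does not contact Hive and does not validate
    that the statement is otherwise correct.
    """
    # Collapse newlines and repeated spaces so multi-line DDL matches too.
    text = " ".join(ddl.split())
    return {
        # CLUSTERED BY (...) INTO n BUCKETS
        "bucketed": bool(
            re.search(r"CLUSTERED\s+BY\s*\(.+?\)\s*INTO\s+\d+\s+BUCKETS", text, re.I)
        ),
        # Either the STORED AS ORC shorthand or the explicit ORC classes.
        "orc": bool(
            re.search(r"STORED\s+AS\s+ORC\b|OrcSerde|OrcOutputFormat", text, re.I)
        ),
        # TBLPROPERTIES('transactional'='true')
        "transactional": bool(
            re.search(r"'transactional'\s*=\s*'true'", text, re.I)
        ),
    }
```

Pass the DDL as a string, for example the output of `SHOW CREATE TABLE`, and look for any key that comes back `False`. A result with all three `True` only means the table definition looks plausible; the fix that actually worked in this thread was switching to the HDF build of NiFi.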