putHive3Streaming Error

New Contributor

I built a flow to load CSV data into Hive. First I put the files on HDFS and then created a Hive external table on top of them. That worked.

Now I'm trying to use PutHive3Streaming to insert CSV data into a Hive internal (managed) table, but I'm getting errors:

Table DDL (I removed some details) for putHive3Streaming

CREATE TABLE <TABLE NAME>(
        <field> STRING,
        ...)
    CLUSTERED BY (message) INTO 5 BUCKETS
    STORED as ORC
    TBLPROPERTIES("transactional"="true")

putHive3Streaming config

92981-erro1.png

Controller Service Config

92982-erro2.png

Error message

2018-10-23 14:26:25,263 WARN [Timer-Driven Process Thread-3] o.a.h.streaming.HiveStreamingConnection Unable to validate the table for connection: { metastore-uri: thrift://<HOST>, database: <DB>, table: <TABLE>, partitioned-table: false, dynamic-partitioning: false, username: nifi, secure-mode: false, record-writer: HiveRecordWriter, agent-info: NiFi PutHive3Streaming [a125ea2d-0166-1000-ffff-ffffd045c94a] thread 74[Timer-Driven Process Thread-3] }
org.apache.hadoop.hive.metastore.api.MetaException: org.apache.hadoop.security.AccessControlException: Permission denied: user=nifi, access=EXECUTE, inode="/warehouse/tablespace/managed/hive":hive:hadoop:drwx------
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:315)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:242)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:606)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1799)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1817)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:674)
    at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:114)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3106)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1154)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:966)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result.read(ThriftHiveMetastore.java)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:2079)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:2066)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570)
    at sun.reflect.GeneratedMethodAccessor828.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
    at com.sun.proxy.$Proxy207.getTable(Unknown Source)
    at org.apache.hive.streaming.HiveStreamingConnection.validateTable(HiveStreamingConnection.java:401)
    at org.apache.hive.streaming.HiveStreamingConnection.<init>(HiveStreamingConnection.java:202)
    at org.apache.hive.streaming.HiveStreamingConnection.<init>(HiveStreamingConnection.java:118)
    at org.apache.hive.streaming.HiveStreamingConnection$Builder.connect(HiveStreamingConnection.java:332)
    at org.apache.nifi.processors.hive.PutHive3Streaming.makeStreamingConnection(PutHive3Streaming.java:508)
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:413)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-10-23 14:26:25,264 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=a125ea2d-0166-1000-ffff-ffffd045c94a] PutHive3Streaming[id=a125ea2d-0166-1000-ffff-ffffd045c94a] failed to process session due to java.lang.NullPointerException; Processor Administratively Yielded for 1 sec: java.lang.NullPointerException
java.lang.NullPointerException: null
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:447)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-10-23 14:26:25,264 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding PutHive3Streaming[id=a125ea2d-0166-1000-ffff-ffffd045c94a] due to uncaught Exception: java.lang.NullPointerException
java.lang.NullPointerException: null
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:447)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

1 ACCEPTED SOLUTION

Master Guru
@Carlos Cardoso

There is an AccessControlException in the logs you shared:

org.apache.hadoop.hive.metastore.api.MetaException: org.apache.hadoop.security.AccessControlException: Permission denied: user=nifi, access=EXECUTE, inode="/warehouse/tablespace/managed/hive":hive:hadoop:drwx------

Make sure the nifi user has the appropriate permissions on the directory "/warehouse/tablespace/managed/hive", then try to ingest data into the table again.
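
To see why the NameNode rejects the request, here is a minimal Python sketch of a plain permission-bit check applied to the values from the error message (owner hive, group hadoop, mode drwx------). It is only an illustration: the function name `is_allowed` and the group memberships passed in are hypothetical, and the model ignores HDFS ACLs, Ranger policies and superuser handling, all of which can change the real outcome.

```python
# Simplified model of a permission-bit check (assumption: no ACLs,
# no Ranger policies, no superuser; real HDFS authorization does more).
READ, WRITE, EXECUTE = "r", "w", "x"


def is_allowed(mode, owner, group, user, user_groups, access):
    """Return True if `user` gets `access` under the permission bits in `mode`.

    `mode` is a listing-style string such as "drwx------"; a trailing "+"
    (ACL marker) is ignored by this sketch.
    """
    bits = mode[1:10]  # drop the leading file-type character
    owner_bits, group_bits, other_bits = bits[0:3], bits[3:6], bits[6:9]
    if user == owner:
        applicable = owner_bits
    elif group in user_groups:
        applicable = group_bits
    else:
        applicable = other_bits
    return access in applicable


# Values taken from the error message; the group sets are illustrative.
mode, owner, group = "drwx------", "hive", "hadoop"
print(is_allowed(mode, owner, group, "hive", {"hadoop"}, EXECUTE))
print(is_allowed(mode, owner, group, "nifi", {"nifi"}, EXECUTE))
print(is_allowed(mode, owner, group, "nifi", {"nifi", "hadoop"}, EXECUTE))
```

With mode drwx------ only the owner class has any bits set, so in this model the nifi user is refused EXECUTE (directory traversal) whether or not it belongs to the hadoop group. That matches the "access=EXECUTE" denial in the log.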

If the answer helped resolve your issue, click the Accept button below to accept it. That helps other Community users find solutions quickly for these kinds of issues.

New Contributor

Does nifi need permission on this directory? This is an internal table managed by Hive.

I created a policy in Ranger that gives nifi permission to insert into this table. I can insert into the table as user nifi from a Hive client, but it does not work with the PutHive3Streaming processor.

@Shu

Master Guru

Even though it's managed by Hive, unless Hive is performing tasks as the Hive user (which here it is not), I believe Shu is right and you'll likely need an HDFS policy in Ranger to allow user nifi to access the hive warehouse.

New Contributor

You are right. I applied the policy and everything is working now. Thanks @Shu @Matt Burgess

Cloudera Employee

I have the same problem.

I set the permissions to 777, so all users have full access.

[nifi@hdp-srv2 ~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive/
Found 3 items
drwxrwxrwx+  - hive hadoop          0 2019-05-27 14:53 /warehouse/tablespace/managed/hive/information_schema.db
drwxrwxrwx+  - hive hadoop          0 2019-05-28 13:45 /warehouse/tablespace/managed/hive/sensor_data
drwxrwxrwx+  - hive hadoop          0 2019-05-27 14:53 /warehouse/tablespace/managed/hive/sys.db

Error still happens.

Caused by: org.apache.hadoop.hive.metastore.api.MetaException: java.security.AccessControlException: Permission denied: user=nifi, access=READ, inode="/warehouse/tablespace/managed/hive/sensor_data":hive:hadoop:drwxrwxrwx