Created on 10-23-2018 08:29 PM - edited 08-17-2019 08:23 PM
I built a flow to load CSV data into Hive. First I put the files on HDFS and then created a Hive external table. That worked.
Now I'm trying to use PutHive3Streaming to insert CSV data into a Hive internal (managed) table, but I'm getting errors:
Table DDL for PutHive3Streaming (I removed some details)
CREATE TABLE <TABLE NAME>( <field> STRING, ...) CLUSTERED BY (message) INTO 5 BUCKETS STORED as ORC TBLPROPERTIES("transactional"="true")
putHive3Streaming config
Controller Service Config
Error message
2018-10-23 14:26:25,263 WARN [Timer-Driven Process Thread-3] o.a.h.streaming.HiveStreamingConnection Unable to validate the table for connection: { metastore-uri: thrift://<HOST>, database: <DB>, table: <TABLE>, partitioned-table: false, dynamic-partitioning: false, username: nifi, secure-mode: false, record-writer: HiveRecordWriter, agent-info: NiFi PutHive3Streaming [a125ea2d-0166-1000-ffff-ffffd045c94a] thread 74[Timer-Driven Process Thread-3] }
org.apache.hadoop.hive.metastore.api.MetaException: org.apache.hadoop.security.AccessControlException: Permission denied: user=nifi, access=EXECUTE, inode="/warehouse/tablespace/managed/hive":hive:hadoop:drwx------
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:315)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:242)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:606)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1799)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1817)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:674)
    at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:114)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3106)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1154)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:966)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result.read(ThriftHiveMetastore.java)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:2079)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:2066)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570)
    at sun.reflect.GeneratedMethodAccessor828.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
    at com.sun.proxy.$Proxy207.getTable(Unknown Source)
    at org.apache.hive.streaming.HiveStreamingConnection.validateTable(HiveStreamingConnection.java:401)
    at org.apache.hive.streaming.HiveStreamingConnection.<init>(HiveStreamingConnection.java:202)
    at org.apache.hive.streaming.HiveStreamingConnection.<init>(HiveStreamingConnection.java:118)
    at org.apache.hive.streaming.HiveStreamingConnection$Builder.connect(HiveStreamingConnection.java:332)
    at org.apache.nifi.processors.hive.PutHive3Streaming.makeStreamingConnection(PutHive3Streaming.java:508)
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:413)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-10-23 14:26:25,264 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=a125ea2d-0166-1000-ffff-ffffd045c94a] PutHive3Streaming[id=a125ea2d-0166-1000-ffff-ffffd045c94a] failed to process session due to java.lang.NullPointerException; Processor Administratively Yielded for 1 sec: java.lang.NullPointerException
java.lang.NullPointerException: null
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:447)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-10-23 14:26:25,264 WARN [Timer-Driven Process Thread-3] o.a.n.controller.tasks.ConnectableTask Administratively Yielding PutHive3Streaming[id=a125ea2d-0166-1000-ffff-ffffd045c94a] due to uncaught Exception: java.lang.NullPointerException
java.lang.NullPointerException: null
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:447)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Created 10-23-2018 09:47 PM
There is an AccessControlException in the logs you shared:
org.apache.hadoop.hive.metastore.api.MetaException: org.apache.hadoop.security.AccessControlException: Permission denied: user=nifi, access=EXECUTE, inode="/warehouse/tablespace/managed/hive":hive:hadoop:drwx------
Make sure the nifi user has appropriate permissions on the directory "/warehouse/tablespace/managed/hive", then try to ingest data into the table again.
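To see why the NameNode rejected the request, here is a minimal sketch of a plain POSIX-style permission check applied to the values from the error message (owner hive, group hadoop, mode drwx------). The function name `posix_allows` and the assumption that nifi is not a member of the hadoop group are illustrative only. Real HDFS authorization can also involve ACL entries and Ranger policies, which this sketch ignores.

```python
def posix_allows(mode, owner, group, user, user_groups, access):
    """Return True if `user` is granted `access` ('r', 'w' or 'x').

    `mode` is an ls-style string such as 'drwx------'. Only the nine
    owner/group/other bits are considered (no ACLs, no superuser, no Ranger).
    """
    bits = mode[1:10]  # skip the leading file-type character
    if user == owner:
        triad = bits[0:3]   # owner bits
    elif group in user_groups:
        triad = bits[3:6]   # group bits
    else:
        triad = bits[6:9]   # "other" bits
    return access in triad


# Values taken from the log line; the group membership of nifi is assumed.
print(posix_allows("drwx------", "hive", "hadoop", "nifi", {"nifi"}, "x"))  # prints False
```

Under these assumptions nifi falls into the "other" class, whose bits are `---`, so the EXECUTE (traverse) access needed to reach anything below the directory is denied.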
-
If the answer helped resolve your issue, click the Accept button below to accept it. That helps Community users find solutions quickly for these kinds of issues.
Created 10-24-2018 12:39 AM
Does nifi need permission on this directory? This is an internal table managed by Hive.
I created a policy in Ranger that gives nifi permission to insert into this table. I can insert into this table as user nifi from a Hive client, but it does not work with the PutHive3Streaming processor.
Created 10-24-2018 05:41 PM
Even though the table is managed by Hive, unless Hive is performing the tasks as the hive user (which is not the case here), I believe Shu is right and you'll likely need an HDFS policy in Ranger to allow user nifi to access the Hive warehouse.
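As a rough illustration of what such an HDFS policy might contain, the sketch below builds a policy document in the shape I understand Ranger's public REST policy model to use (`service`, `name`, `resources`, `policyItems`). The helper name `hdfs_policy`, the service name `mycluster_hadoop`, the policy name, and the choice of read/write/execute accesses are all assumptions for illustration. Check the field names and the accesses you actually need against your Ranger version, or simply create the policy in the Ranger UI.

```python
import json


def hdfs_policy(service, name, path, user):
    """Build a dict describing an HDFS path policy for one user.

    The structure mirrors Ranger's policy JSON as I understand it;
    treat it as a sketch, not a verified payload.
    """
    access_types = ("read", "write", "execute")  # assumed set of accesses
    return {
        "service": service,  # name of the HDFS service defined in Ranger
        "name": name,
        "isEnabled": True,
        "resources": {
            "path": {"values": [path], "isRecursive": True},
        },
        "policyItems": [
            {
                "users": [user],
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


policy = hdfs_policy(
    service="mycluster_hadoop",           # hypothetical service name
    name="nifi_hive_warehouse_access",    # hypothetical policy name
    path="/warehouse/tablespace/managed/hive",
    user="nifi",
)
print(json.dumps(policy, indent=2))
```

The path and user come from the error message in this thread; everything else is a placeholder to adapt.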
Created 10-26-2018 12:47 PM
You are right. I applied the policy and everything is working now. Thanks @Shu @Matt Burgess
Created 05-28-2019 03:40 PM
I have the same problem.
I set the permission to 777 for all users.
[nifi@hdp-srv2 ~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive/
Found 3 items
drwxrwxrwx+  - hive hadoop          0 2019-05-27 14:53 /warehouse/tablespace/managed/hive/information_schema.db
drwxrwxrwx+  - hive hadoop          0 2019-05-28 13:45 /warehouse/tablespace/managed/hive/sensor_data
drwxrwxrwx+  - hive hadoop          0 2019-05-27 14:53 /warehouse/tablespace/managed/hive/sys.db
The error still happens.
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: java.security.AccessControlException: Permission denied: user=nifi, access=READ, inode="/warehouse/tablespace/managed/hive/sensor_data":hive:hadoop:drwxrwxrwx