Support Questions

Hive: INSERT OVERWRITE does not work



Rising Star

I cannot get the following query to run using the PutHiveQL processor; it fails with a permission-denied exception. I see that the processor exhibits the same behavior as Beeline.

However, I have been able to run the query from the Hive CLI, and it writes to a file as expected. So we know the Hive shell is an option, but can you tell me if there is any specific setting that causes this behavior in Beeline (HiveServer2) and prevents writing to the local filesystem?

insert overwrite local directory '/tmp' select current_date from dual

Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [xxxxx] does not have [WRITE] privilege on [/tmp] (state=42000,code=40000)
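One thing worth checking first (a sketch, assuming HiveServer2 runs on a Linux node): `INSERT OVERWRITE LOCAL DIRECTORY` writes on the HiveServer2 host, not on the client machine, so local permissions there matter — though the error above looks like it comes from an authorizer such as Ranger rather than from the OS:

```shell
# On the HiveServer2 node (not the client): check local /tmp permissions.
# A healthy /tmp is world-writable with the sticky bit set (drwxrwxrwt).
ls -ld /tmp
```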

1 Accepted Solution

Re: Hive: INSERT OVERWRITE does not work

Expert Contributor

Regardless of the reason you've decided to use PutHiveQL for a SELECT statement (I would use SelectHiveQL and then a PutFile processor to store the result of the SELECT)...

But... back to the use case. Both PutHiveQL and Beeline are JDBC-based clients. Using "INSERT OVERWRITE LOCAL DIRECTORY" doesn't really make sense here, because you never know which LOCAL directory (i.e., on which node) will be in context. If temp files still need to be created, you can go with:

1. PutHiveQL (insert overwrite directory, not local).

2. GetHDFS

... continue your processes/transformations with FlowFiles as usual.

Please note that the HDFS files will be deleted after GetHDFS finishes (by default, GetHDFS removes the source files it picks up).
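From the command line, the two steps above might look like the following sketch (the JDBC URL, HDFS path, and local path are placeholders; in NiFi the same statement goes into PutHiveQL and the fetch is done by GetHDFS):

```shell
# Step 1 (PutHiveQL equivalent): write to an HDFS directory, not a LOCAL one.
beeline -u "jdbc:hive2://hiveserver2-host:10000/default" \
  -e "INSERT OVERWRITE DIRECTORY '/tmp/hive_export' SELECT current_date FROM dual;"

# Step 2 (GetHDFS equivalent): fetch the result files out of HDFS.
hdfs dfs -get /tmp/hive_export /path/on/local/disk
```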

10 Replies

Re: Hive: INSERT OVERWRITE does not work

It seems the nifi user doesn't have permission to write into the /tmp directory. You have two options: 1. Change the permissions on the /tmp folder to allow everyone to write to it. 2. If you have configured Ranger, make sure the resource-based policy for HDFS allows the nifi user access to all paths, or to the specific paths you want to write to.
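For option 1, this is plain POSIX permissions. A minimal sketch, demonstrated on a throwaway directory rather than the real /tmp (on HDFS the equivalent command would be `hdfs dfs -chmod`):

```shell
# Create a scratch directory and give it the same mode /tmp normally has:
# 1777 = read/write/execute for everyone, plus the sticky bit.
mkdir -p /tmp/demo_scratch
chmod 1777 /tmp/demo_scratch
ls -ld /tmp/demo_scratch
```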

[Screenshot attached]

Re: Hive: INSERT OVERWRITE does not work

Rising Star

Ranger is specific to HDFS. I am referring to the issue of Hive writing to the local filesystem; see the Hive statement in my summary above. Also, the /tmp directory is accessible by any user.

Re: Hive: INSERT OVERWRITE does not work

Super Guru
@milind pandit is right. If you do not have Ranger enabled, add the nifi user to the Linux group that owns /tmp; this is ACL security. If you have Ranger enabled, don't do this.

Re: Hive: INSERT OVERWRITE does not work

Rising Star

Please see my response above.

Re: Hive: INSERT OVERWRITE does not work

Super Guru

Did you give the hive user access to the local filesystem directory /tmp?

Re: Hive: INSERT OVERWRITE does not work

Rising Star

What I understand is that Hive temporarily uses /tmp on HDFS to write into, and then copies the result over to the local directory. So, in Ranger, recursive access to the /tmp HDFS folder has been granted. But the issue still persists with NiFi.

Re: Hive: INSERT OVERWRITE does not work

Super Guru

Do you have Ranger audit enabled? If so, please share what the log shows when nifi tries to hit /tmp.



Re: Hive: INSERT OVERWRITE does not work

Rising Star

My use case requires writing the output in a delimited format, and INSERT OVERWRITE LOCAL DIRECTORY fits this perfectly. I wish there were a way to apply a custom delimiter to the content retrieved through the SelectHiveQL processor; since there isn't, I couldn't opt for it.
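For reference, the delimiter can be set directly in the statement itself (supported since Hive 0.11 via ROW FORMAT DELIMITED on directory inserts); a sketch via Beeline, with a placeholder JDBC URL and output path:

```shell
# JDBC URL and output path are placeholders.
beeline -u "jdbc:hive2://hiveserver2-host:10000/default" -e "
  INSERT OVERWRITE LOCAL DIRECTORY '/tmp/hive_export'
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  SELECT current_date FROM dual;"
```

Via Beeline this of course still hits the same permission error discussed above; the point is only the ROW FORMAT DELIMITED clause for custom delimiters.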

I agree that writing to HDFS instead is a good option. I will attempt to modify the process, but I still wonder why writing to the local filesystem through NiFi doesn't work.