
Hive with Google Cloud Storage

I have installed a Hadoop 2.6.5 cluster in GCP using VM instances, installed the GCS connector, and pointed HDFS to use a GS bucket. I added the below 2 entries in core-site.xml:

google.cloud.auth.service.account.json.keyfile=<Path-to-the-JSON-file> 
fs.gs.working.dir=/
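
For reference, these are the same two entries written out in the standard core-site.xml property format (the keyfile path is just a placeholder):

<!-- core-site.xml -->
<property>
  <name>google.cloud.auth.service.account.json.keyfile</name>
  <value>/path/to/service-account-key.json</value>
</property>
<property>
  <name>fs.gs.working.dir</name>
  <value>/</value>
</property>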

Running hadoop fs -ls / works fine, but when I create a Hive table

CREATE EXTERNAL TABLE test1256(name string,id  int)   LOCATION   'gs://bucket/';

I get the following error:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.security.AccessControlException: Permission denied: user=hdpuser1, path="gs://bucket/":hive:hive:drwx------) (state=08S01,code=1)

Apart from changes to core-site.xml, are there any changes to be made in hive-site.xml as well?

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_cloud-data-access/content/authentication...

1 ACCEPTED SOLUTION

Mentor

@sudi ts

Do you have access to the GCP IAM console? When treating a service account as a resource, you can grant permission to a user to access that service account. You can grant the Owner, Editor, Viewer, or Service Account User role to a user to access the service account.


12 REPLIES

Mentor

@sudi ts

You need to copy the connector into the hadoop-client and hive-client locations, otherwise you will hit an error:

cp gcs-connector-latest-hadoop2.jar /usr/hdp/current/hadoop-client/lib/ 
cp gcs-connector-latest-hadoop2.jar /usr/hdp/current/hive-client/lib 

The below command should run successfully

$ hdfs dfs -ls gs://bucket/ 

This should run fine; the issue you are having is with permissions for hdpuser1, which you will need to correct by running:

$ hdfs dfs -chown hdpuser1 gs://bucket/ 

Now your create table should work while logged in as hdpuser1:

CREATE EXTERNAL TABLE test1256(name string,id int) LOCATION 'gs://bucket/'; 

Please let me know. If this answer addressed your question, please take a moment to log in and click the "Accept" link on the answer.

Hi,

Thanks a lot for the info, but I'm still facing the same issue.

I created the user in AD and have a valid ticket; the hdfs command works for accessing GCS, but I cannot create an external Hive table.

Mentor

@sudi ts

Can you share the latest error?

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.security.AccessControlException: Permission denied: user=hdpuser1, path="gs://bucket/":hive:hive:drwx------) (state=08S01,code=1)

hdpuser1 is an AD user; using the same user I can execute

$ hdfs dfs -ls gs://bucket/

but when I try to create an external table using beeline, it fails.

Mentor

@sudi ts

This is clearly a permission issue: "Permission denied: user=hdpuser1, path="gs://bucket/":hive:hive:drwx------"

Have you tried using ACLs?

gsutil acl ch -u hdpuser1:WRITE gs://bucket/

And retry

@Geoffrey Shelton Okot

I did try, but it still fails.

CommandException: hdpuser1:WRITE is not a valid ACL change
hdpuser1 is not a valid scope type

The GCS bucket has Storage Admin rights granted to the service account.

hadoop fs -ls gs://bucket/ still works fine.

Mentor

@sudi ts

Do you have access to the GCP IAM console? When treating a service account as a resource, you can grant permission to a user to access that service account. You can grant the Owner, Editor, Viewer, or Service Account User role to a user to access the service account.
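
For example, something along these lines should work from anywhere gcloud is set up with sufficient IAM rights (the service-account email and the user are placeholders):

# Grant a user the Service Account User role on the service account
gcloud iam service-accounts add-iam-policy-binding my-sa@my-project.iam.gserviceaccount.com \
    --member="user:hdpuser1@example.com" \
    --role="roles/iam.serviceAccountUser"

Alternatively, you can grant the user a role directly on the bucket, e.g. gsutil iam ch user:hdpuser1@example.com:objectAdmin gs://bucket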

@Geoffrey Shelton Okot

I was able to create a Hive external table pointing to GCS as the storage, but it only works as the hive superuser, not as a normal Hive user. Meaning, hdpuser1 cannot create the Hive table (it fails with the above error), but if I execute su - hive it works.

I am not sure how to rectify this.

Cloudera Employee

Hi @sudi ts

Can you share some more information about this deployment?

- Is doAs enabled (hive.server2.enable.doAs)? (See the property sketch at the end of this reply.)

- What is the authorization mechanism? Is the Ranger Authorizer being used?

If you can pull a stack trace from the HiveServer2 logs, that'll be very useful.

HDP-2.6.5 ships with the Google connector, so there's no need to replace any jars. The GS connectivity is working, given that you can create this table when logged in as the hive user and can list files via hadoop fs -ls.

Cloud storage access control is generally handled via cloud-provider constructs such as IAM roles; Hadoop's view of file owners and permissions doesn't capture this. The user returned by hadoop fs -ls will typically be the logged-in user, and the permissions don't indicate much.
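
For reference, the doAs setting mentioned above is a hive-site.xml property along these lines; whether it should be true or false depends on your authorization setup (Ranger-based setups typically run with doAs disabled):

<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>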

Hi @sseth

The issue is resolved after adding the following property in core-site.xml:

fs.gs.reported.permissions=777

Normal users can now access Hive and create external tables pointing to the GCS location.
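
In standard core-site.xml property form that is:

<property>
  <name>fs.gs.reported.permissions</name>
  <value>777</value>
</property>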

@sseth

I have downloaded the latest jar

https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar

I tried creating the external table and it's failing with the following error:

FAILED: HiveAccessControlException Permission denied: user [abcd] does not have [READ] privilege on [gs://hdp-opt1/forhive/languages] (state=42000,code=40000)

I have enabled the Hive plugin and set the reported permissions to 777 in core-site.xml.

Were there any changes made to the jar? I also see a few properties have changed in this link:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_cloud-data-access/content/gcp-cluster-co...

Is it mandatory to use the JSON key if my VM instance already has the required permissions to talk to GCS?

New Contributor

@sudi ts Were you able to resolve this issue?
