Support Questions

Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

Sqoop create table error

When importing data from Postgres using Sqoop, I get the following error:

 

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:User user1 does not have privileges for CREATETABLE)

 

However when I use beeline, I am able to create tables using user1. However, if I start the old hive-cli I get the exact same error as above. Is there a way to make sqoop use beeline vs the old hive to create the tables? 

 

On CDH 5.11.1, with Sentry enabled. 

6 REPLIES 6

Champion

@vpatel-517787377

 

Can you share your sqoop script (or) confirm you are using "--hive-database dbname" ?

 

because if you sqoop import without above condition, it may try to create & import in default DB where user1 won't have create privilege.

 

but with beeline, you would have used dbname when you login where user1 have access to create/import table

beeline -u "jdbc:hive2://hostname:10000/dbname;principal...."

 

so you may feel that beeline works differently than normal hive 

Yes, I am using -hive-database. Here is the command:
sqoop-import --connect jdbc:postgresql://mymachine:5432/db --connection-manager org.apache.sqoop.manager.PGBulkloadManager --table std_clnt_employee --num-mappers 8 --split-by col1 --fields-terminated-by '\t' --target-dir /tmp/sqoop_import --hive-import --create-hive-table --hive-database=default --hive-table=test_pg --direct --verbose -- --schema pg_schema
Here is where it fails:
17/06/28 16:12:46 INFO hive.HiveImport: Loading uploaded data into Hive
17/06/28 16:12:46 DEBUG hive.HiveImport: Using in-process Hive instance.
17/06/28 16:12:46 DEBUG util.SubprocessSecurityManager: Installing subprocess security manager
17/06/28 16:12:46 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
17/06/28 16:12:46 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
17/06/28 16:12:46 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
17/06/28 16:12:46 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/jars/hive-common-1.1.0-cdh5.11.1.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:User user1 does not have privileges for CREATETABLE)

I can reproduce this error like this:
hive> create table test ( a int) ;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:User user1 does not have privileges for CREATETABLE)

However the same command works in beeline :
0: jdbc:hive2://namenode1.corpdom.com:10000/d> create table test ( a int);
INFO : Compiling command(queryId=hive_20170628161616_9a2d989d-2abc-468a-862c-47fe6ea630e6): create table test ( a int)
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20170628161616_9a2d989d-2abc-468a-862c-47fe6ea630e6); Time taken: 0.21 seconds
INFO : Executing command(queryId=hive_20170628161616_9a2d989d-2abc-468a-862c-47fe6ea630e6): create table test ( a int)
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20170628161616_9a2d989d-2abc-468a-862c-47fe6ea630e6); Time taken: 0.368 seconds
INFO : OK
No rows affected (0.807 seconds)

We have sentry enabled, but I'm not sure why the permission model would be different.

Champion

@vpatel-517787377

 

 

Couple of recommendations,

1. you are using default db... if possible, use a different DB instead of default, it may help you

(or)

2. Follow below steps to understand the privilege that user1 has on default DB

Run this command in Linux: and get the groups that user belongs to...

$ id <userid>


Login to beeline and run the following commands:
# The below command will show the Roles that belongs to group. Get the group_name from above cmd

beeline> SHOW ROLE GRANT GROUP group_name; 

 

# The below command will show the Roles that has access on databases. Get role_name from above cmd

beeline> SHOW GRANT ROLE role_name; 

You can also do this Hue if you have access. This will help you to identify the user1 access on a particular DB. 

 

Champion
Does Sqoop (version 1) work with Sentry? I suspect that it doesn't and you should be using Sqoop2. This wiki covers integrating the two and seems to require that the user running the sqoop2 process have Sentry god access.

https://cwiki.apache.org/confluence/display/SENTRY/Sqoop2+integration+with+Sentry

I don't the issue is with Sqoop 1. We have another cluster (with an older version of CDH) where sqoop 1 works with Sentry.
Besides, since hive cli doesn't work which maybe the primary reason. Will update the thread with ROLE information.

Thanks for your help. It turned out that the data nodes did not have the group configured correctly for user1 and that was causing the issue. After that was configured correctly and restarted it worked fine. 

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.