
Sqoop connection to Kerberos authenticated RDBMS as part of sqoop-export

New Contributor

While doing a sqoop-export to load data from HDFS to an Oracle database hosted on the cloud, I am facing the issue below.

 

ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: ORA-01017: invalid username/password; logon denied

java.sql.SQLException: ORA-01017: invalid username/password; logon denied

 

I have already done a kinit username@domainname before initializing the sqoop command, and kinit has generated a valid Kerberos ticket. Even though I do this and run the command from the location where the Kerberos user's keytab file is located, it still gives me the above error. I can see in the sqoop.properties file that the parameter org.apache.sqoop.security.authentication.type=KERBEROS is commented out. Is this the reason for the above issue? How can we get sqoop to export data from HDFS to a Kerberos-authenticated DB?
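For reference, the sequence I run before the sqoop command looks roughly like this (the keytab path and principal below are placeholders for my actual values):

$ kinit -kt /path/to/user.keytab username@DOMAINNAME
$ klist

klist shows a valid ticket for username@DOMAINNAME, so the TGT itself looks fine.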

1 ACCEPTED SOLUTION

Master Guru

@RjsChw,

 

Ah, now I get what you are trying to do (auth to Oracle db with Kerberos).

I'm sad to say that is not currently a feature in Sqoop. 

The feature request has been entered upstream via SQOOP-3446.

 


5 REPLIES

Master Guru

@RjsChw,

 

I am not a Sqoop expert, but the error you are getting is not coming from Hadoop services; rather, the login to your Oracle database is failing.  In this case, I believe you should review how you are supplying credentials to authenticate to the Oracle DB and verify they are correct.
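For example, a quick sanity check is to pass the Oracle credentials explicitly and let -P prompt for the password so it never appears on the command line. This is only a sketch; the hostname, port, service name, user, table and export directory below are placeholders:

$ sqoop export \
    --connect "jdbc:oracle:thin:@hostname:port/SERVICENAME" \
    --username ORACLE_USER -P \
    --table TARGET_TABLE \
    --export-dir /user/someuser/data

If this fails with the same ORA-01017, the problem is purely the Oracle credentials and not anything on the Hadoop side.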

Master Mentor

@RjsChw 

The error you are encountering is Oracle related: you should validate the username and password against the Oracle database. If you have an Oracle client installed on your laptop, try one of the variants below (there are many others). Better still, let your DBA give you the username/password for the database you are trying to export to.

Variant 1
sqlplus /nolog
SQL> connect user/password@dbname

Variant 2
sqlplus user@orcl

Variant 3
sqlplus user/password@hostname:port/sid

-----------

ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: ORA-01017: invalid username/password; logon denied
java.sql.SQLException: ORA-01017: invalid username/password; logon denied

 

Having said that, Sqoop on the command line will display your password in clear text, which is not secure. Below is a way to encrypt your password so that your sqoop jobs are protected from prying eyes; to do that you will use the hadoop credential command to encrypt your Oracle credential.

 

In the example below I am creating a password alias for my fictitious testDB and including the database name in the alias so it is easy to identify among hundreds of DBs. I am using a MySQL database here, but the alias name doesn't matter (i.e. oracle.testDB.alias or db2.testDB.alias works just as well); what matters is the password, which must match the password of the Oracle/MySQL/other database user.

 

Encrypting SQOOP password

When generating the jceks file you MUST provide a path to your HDFS home; create one before executing this command:

$ hadoop credential create mysql.testDB.alias -provider jceks://hdfs/user/george/mysql.testDB.password.jceks
Enter password: [database_password]
Enter password again: [database_password]
mysql.testDB.alias has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.

Validating the encrypted password creation

The encrypted password jceks file MUST be written to your HDFS home:
$ hdfs dfs -ls /user/george
Found 1 items
-rwx------ 3 george hdfs 503 2018-09-02 01:40 /user/george/mysql.testDB.password.jceks

 

Running sqoop with the jceks alias

Assumption: my MySQL test database is running on host pomme.cloudera.com, port 3306.

$ sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/george/mysql.testDB.password.jceks --driver com.mysql.jdbc.Driver --connect jdbc:mysql://pomme.cloudera.com:3306/test --username george --password-alias mysql.testDB.alias --table "customer" --target-dir /user/george/test


Success output

Warning: /usr/hdp/2.6.2.0-205/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/09/02 02:08:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.2.0-205
18/09/02 02:08:06 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
18/09/02 02:08:06 INFO manager.SqlManager: Using default fetchSize of 1000
18/09/02 02:08:06 INFO tool.CodeGenTool: Beginning code generation
.......... some text removed here..............
18/09/02 02:08:18 INFO mapreduce.Job: The url to track the job: http://pomme.cloudera.com:8088/proxy/application_1535835049607_0002/
18/09/02 02:08:18 INFO mapreduce.Job: Running job: job_1535835049607_0002
18/09/02 02:08:55 INFO mapreduce.Job: Job job_1535835049607_0002 running in uber mode : false
Total megabyte-milliseconds taken by all map tasks=917431296
Map-Reduce Framework
Map input records=2170
Map output records=2170
Input split bytes=396
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=944
CPU time spent (ms)=11690
Physical memory (bytes) snapshot=669270016
Virtual memory (bytes) snapshot=18275794944
Total committed heap usage (bytes)=331350016
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=243892
18/09/02 02:11:48 INFO mapreduce.ImportJobBase: Transferred 238.1758 KB in 218.8164 seconds (1.0885 KB/sec)
18/09/02 02:11:48 INFO mapreduce.ImportJobBase: Retrieved 2170 records.

 

Sqoop import output in HDFS

Check the import was successful:
$ hdfs dfs -ls /user/george/test
Found 5 items
-rw-r--r-- 3 george hdfs 0 2018-09-02 02:11 /user/george/test/_SUCCESS
-rw-r--r-- 3 george hdfs 60298 2018-09-02 02:10 /user/george/test/part-m-00000
-rw-r--r-- 3 george hdfs 60894 2018-09-02 02:10 /user/george/test/part-m-00001
-rw-r--r-- 3 george hdfs 62050 2018-09-02 02:11 /user/george/test/part-m-00002
-rw-r--r-- 3 george hdfs 60650 2018-09-02 02:11 /user/george/test/part-m-00003

Check the values in the splits:

$ hdfs dfs -cat /user/george/test/part-m-00000
1,Julian Stuart,sagittis.felis@sedhendrerit.com,Suspendisse Tristique Neque Associates,9230 Metus. Av.,Pemberton,Mexico
2,Ferris Fulton,condimentum@morbitristique.co.uk,Nunc Ltd,256-788 At Avenue,Northampton,China
3,Byron Irwin,adipiscing.Mauris@DonecnibhQuisque.edu,Nascetur Ridiculus Foundation,4042 Non, St.,Gattatico,Lithuania
..........................some text removed ................
18,Peter Middleton,purus.Nullam.scelerisque@egetdictumplacerat.com,Erat In Consectetuer Associates,1618 Donec St.,Grand Island,Thailand

 

Voila 

 

 

New Contributor

@bgooley @Shelton 

So, I am able to connect to the Kerberos-authenticated DB via SQL Developer/SQL*Plus.

 

Steps followed:

1. kinit -k -t <keytabfile> <user>@<Realm/domain>

2. A ticket is created by the above command. Point the credential file location in SQL Developer to this ticket file location.

3. Provide host, port and DB service name. Click the checkbox that mentions Kerberos authentication.

4. Voila, SQL Developer connects to the database (without passing username/password)
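For the SQL*Plus case, the rough equivalent looks like the sketch below. Note the sqlnet.ora settings are an assumption about my Oracle client setup (the Kerberos adapter), not something taken from this thread, so adjust them to your environment:

# sqlnet.ora on the client (assumed Kerberos adapter configuration)
SQLNET.AUTHENTICATION_SERVICES=(BEQ,KERBEROS5)
SQLNET.KERBEROS5_CC_NAME=/tmp/krb5cc_cache

$ kinit -k -t <keytabfile> <user>@<Realm/domain>
$ sqlplus /@//hostname:port/DBSERVICENAME

The "/" in place of user/password tells SQL*Plus to authenticate externally, i.e. with the Kerberos ticket obtained by kinit.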

 

But a similar set of steps fails in Sqoop, since there is no option in Sqoop to specify that the DB has to be connected to using Kerberos authentication (there was a checkbox for this in SQL Developer). Is there any way to do the same in Sqoop?

 

My sqoop command is as below:

 

sqoop export -Dmapred.job.queue.name=<processingqueue> --connect "jdbc:oracle:thin:@hostname:portnumber/DBNAME" --table <targettable> --hcatalog-table <tableinhive> --hcatalog-partition-keys <partitioncolumn> --hcatalog-partition-values <partitionvalue> --hcatalog-database <dbname>

 

The above command works fine when I provide a username and password for an on-premise database, but it fails for a database with Kerberos authentication (where you are not supposed to provide any username/password).
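For comparison, the on-premise variant that works looks roughly like this (the username is a placeholder, and --password-alias reuses the jceks approach described above instead of a clear-text password):

sqoop export -Dhadoop.security.credential.provider.path=jceks://hdfs/user/<myuser>/oracle.testDB.password.jceks -Dmapred.job.queue.name=<processingqueue> --connect "jdbc:oracle:thin:@hostname:portnumber/DBNAME" --username <oracleuser> --password-alias oracle.testDB.alias --table <targettable> --hcatalog-table <tableinhive> --hcatalog-partition-keys <partitioncolumn> --hcatalog-partition-values <partitionvalue> --hcatalog-database <dbname>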

Master Guru

@RjsChw,

 

Ah, now I get what you are trying to do (auth to Oracle db with Kerberos).

I'm sad to say that is not currently a feature in Sqoop. 

The feature request has been entered upstream via SQOOP-3446.

 

New Contributor

@bgooley Ah, alright! Looking forward to seeing the feature rolled out soon. Thanks!