Created on 03-30-2018 03:35 AM - edited 09-16-2022 06:02 AM
Hello everyone!
I have a Kerberos based cluster where I need to schedule several Oozie workflows.
Normal Sqoop and shell actions work fine, but I am having problems with Sqoop actions that import data into Hive tables (--hive-import). I've tried to use HCat credentials in the workflow, but I get the following error in the Oozie web console:
JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException
This is the workflow:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="wf-hive">
    <credentials>
        <credential name='hcat_credentials' type='hcat'>
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-xxxxx.yyyy.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/_HOST@XXXXX.LOCAL</value>
            </property>
        </credential>
    </credentials>
    <start to="HIVE_IMPORT"/>
    <action name="HIVE_IMPORT" cred="hcat_credentials">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="hdfs://trmas-6b8bc78c.xxxxx.yyyy:8020/user/myuser/mytable"/>
            </prepare>
            <job-xml>hdfs://trmas-6b8bc78c.xxxxx.yyyy:8020/user/myuser/hive-site.xml</job-xml>
            <arg>import</arg>
            <arg>--connection-manager</arg>
            <arg>com.cloudera.connector.teradata.TeradataManager</arg>
            <arg>--connect</arg>
            <arg>jdbc:teradata://x.y.z.w/DATABASE=DLT01V</arg>
            <arg>--username</arg>
            <arg>username</arg>
            <arg>--password</arg>
            <arg>password</arg>
            <arg>--query</arg>
            <arg>SELECT * from mytable WHERE $CONDITIONS</arg>
            <arg>--split-by</arg>
            <arg>__temp__</arg>
            <arg>--hive-import</arg>
            <arg>--hive-table</arg>
            <arg>dbtest.mytable</arg>
            <arg>--num-mappers</arg>
            <arg>1</arg>
            <arg>--target-dir</arg>
            <arg>hdfs://trmas-6b8bc78c.xxxxx.yyyy:8020/user/myuser/mytable</arg>
        </sqoop>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>
    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="END_NODE"/>
</workflow-app>
I am not sure how to set the hcat.metastore.uri and hcat.metastore.principal properties. I used the values of hive.metastore.uris and hive.metastore.kerberos.principal, respectively, from hive-site.xml; is this correct?
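For context, this is the mapping I assumed, shown as a hive-site.xml fragment (the hostname and realm below are illustrative placeholders, not my actual values):

```xml
<!-- hive-site.xml (illustrative values only) -->
<property>
    <!-- copied into hcat.metastore.uri in the Oozie credential -->
    <name>hive.metastore.uris</name>
    <value>thrift://trmas-xxxxx.yyyy.local:9083</value>
</property>
<property>
    <!-- copied into hcat.metastore.principal in the Oozie credential -->
    <name>hive.metastore.kerberos.principal</name>
    <value>hive/_HOST@XXXXX.LOCAL</value>
</property>
```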
Unfortunately, I can't use Sqoop2 and its support for HiveServer2 authentication, because the Cloudera Teradata connector does not support Sqoop2 yet; that is why I think I need to authenticate with HCat credentials.
Can anyone provide any help on this one?