
Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Re: Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Contributor

I used your workflow.xml and tested importing the data from SQL Server, and I am getting the same error:

12779 [main] INFO  hive.metastore  - Trying to connect to metastore with URI thrift://CaLDEV02:9083
12826 [main] WARN  hive.metastore  - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)

Here is the workflow.xml

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-wf">
  <global>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <job-xml>${SqoopSiteXml}</job-xml>
    <job-xml>${HiveSiteXml}</job-xml>
    <configuration>
      <property>
        <name>mapred.job.queue.name</name>
        <value>${queueName}</value>
      </property>
    </configuration>
  </global>
  <credentials>
    <credential name='hive_auth' type='hcat'>
      <property>
        <name>hcat.metastore.uri</name>
        <value>thrift://caldev02:9083</value>
      </property>
      <property>
        <name>hcat.metastore.principal</name>
        <value>hive/_HOST@MYDOMAIN.COM</value>
      </property>
    </credential>
  </credentials>

<start to="import-sqoop"/>

  <action name='import-sqoop' cred="hive_auth">
    <sqoop xmlns='uri:oozie:sqoop-action:0.4'>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>mapred.compress.map.output</name>
          <value>true</value>
        </property>
      </configuration>
      <arg>import</arg>
      <arg>--connect</arg>
      <arg>jdbc:sqlserver://MyID;database=TestDatabase</arg>
      <arg>--table</arg>
      <arg>Account</arg>
      <arg>--username</arg>
      <arg>hadoopuser</arg>
      <arg>--password</arg>
      <arg>myPassword</arg>
      <arg>--hcatalog-table</arg>
      <arg>Account_orc</arg>
      <arg>--create-hcatalog-table</arg>
      <arg>--hcatalog-home</arg>
      <arg>/usr/hdp/current/hive-webhcat</arg>
      <arg>--hcatalog-storage-stanza</arg>
      <arg>stored as orc tblproperties ("orc.compress"="SNAPPY")</arg>
      <arg>-m</arg>
      <arg>1</arg>
    </sqoop>
    <ok to="end"/>
    <error to="kill"/>
  </action>
  <kill name="kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

job.properties

nameNode = hdfs://MyNameNode:8020

jobTracker = ResourceManagerServer:8050
SqoopSiteXml = /user/myuser/Application/oozie/lib/sqoop-oozie-site.xml
HiveSiteXml = /user/myuser/Application/oozie/lib/hive-oozie-site.xml
HcatalogHome = /usr/hdp/current/hive-webhcat
oozie.wf.application.path = /user/mdrxsqoop/Application/oozie/Test_workFlow
oozie.libpath = /user/myuser/Application/oozie/lib
queueName = default
oozie.use.system.libpath = true
oozie.wf.validate.ForkJoin = false
oozie.service.WorkflowAppService.system.libpath = /user/oozie/share/lib/lib_20160902204711
oozie.action.sharelib.for.sqoop = hcatalog,sqoop,hive
hive.metastore.sasl.enabled=true
hive.metastore.kerberos.principal=hive/_HOST@MyDOMAIN.COM
user.name=myuser@MyDOMAIN.COM

Thanks

ram

Re: Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Contributor

Hi, one more thing I noticed: Oozie is not passing the domain name to the Hive metastore. Please see the log entries below.

Oozie-triggered execution entry from the hivemetastore.log file:

2016-09-28 01:24:48,800 DEBUG [pool-7-thread-4]: security.UserGroupInformation (UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction as:mysusersqoop (auth:PROXY) via hive/caldevn02@MYDOMIAN.COM (auth:KERBEROS) from:org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)

Entry from the hivemetastore.log file when running sqoop job --exec myjob directly:

security.UserGroupInformation (UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction as:myusersqoop@MYDOMIAN.COM(auth:PROXY) via hive/caldevn02@MYDOMIAN.COM (auth:KERBEROS) from:org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)

How can I set up Oozie to pass the domain (realm) for the user executing the Oozie job?
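For context, my understanding is that this proxy path depends on the impersonation entries in core-site.xml; the property names below are the standard Hadoop ones, but the host and group values are only placeholders, not my cluster's actual settings.

  <!-- core-site.xml: standard Hadoop impersonation entries (placeholder values) -->
  <property>
    <name>hadoop.proxyuser.oozie.hosts</name>
    <value>oozie-server-host.example.com</value>
  </property>
  <property>
    <name>hadoop.proxyuser.oozie.groups</name>
    <value>*</value>
  </property>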

Thank you for your help.

Ram

Re: Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Expert Contributor

@Ramakrishna Pratapa

I have used the following job.properties and workflow.xml, and the Sqoop import works for me in a secure environment.

In workflow.xml, I have specified hive-site.xml in a <file> element; that file was copied from /etc/hive/conf/hive-site.xml (see the sketch under workflow.xml below).

job.properties

nameNode=hdfs://machine-2-1.openstacklocal:8020
jobTracker=machine-2-1.openstacklocal:8050
queueName=default
oozie.use.system.libpath=true
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.wf.application.path=${nameNode}/user/ambari-qa/sqoop-import/
oozie.action.sharelib.for.sqoop=sqoop,oozie,hive,hcatalog
mapreduce.job.user.name=ambari-qa
user.name=ambari-qa

workflow.xml
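A minimal sketch of what such a workflow can look like, with hive-site.xml shipped to the action via a <file> element as described above; the connect string, table, and credentials are placeholders rather than the original values:

<workflow-app xmlns="uri:oozie:workflow:0.4" name="sqoop-import-wf">
  <start to="sqoop-import"/>
  <action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <arg>import</arg>
      <arg>--connect</arg>
      <arg>jdbc:sqlserver://db-host.example.com;database=TestDb</arg>
      <arg>--username</arg>
      <arg>someuser</arg>
      <arg>--table</arg>
      <arg>some_table</arg>
      <arg>-m</arg>
      <arg>1</arg>
      <!-- hive-site.xml copied from /etc/hive/conf/hive-site.xml, uploaded next to
           the workflow, and shipped to the action's working directory -->
      <file>hive-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Sqoop import failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>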

Re: Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Contributor

Thank you Peeyush. I will test this today and send you the results. One more thing I noticed: when I started Oozie, I see the following warnings in the oozie-error.log file:

2016-09-26 11:12:35,459 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.falcon.hosts]
2016-09-26 11:12:35,460 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.hue.hosts]
2016-09-26 11:12:35,460 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.falcon.groups]
2016-09-26 11:12:35,460 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.knox.hosts]
2016-09-26 11:12:35,461 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.knox.groups]
2016-09-26 11:12:35,461 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.AuthorizationService.security.enabled]
2016-09-26 11:12:35,461 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.hue.groups]
2016-09-26 11:12:35,469 WARN Services:523 - SERVER[] System ID [oozie-oozi] exceeds maximum length [10], trimming
2016-09-26 11:12:36,983 WARN Services:523 - SERVER[] Previous services singleton active, destroying it
2016-09-26 11:12:37,045 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.falcon.hosts]
2016-09-26 11:12:37,046 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.hue.hosts]
2016-09-26 11:12:37,049 WARN ConfigurationService:523 - SERVER[] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.falcon.groups]

I am thinking this may be causing the error in my environment, whereas in the unsecured cluster I am not seeing these warnings. The secured cluster is on HDP 2.5, whereas the unsecured cluster is on HDP 2.4.
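For reference, those ProxyUserService properties live in oozie-site.xml; their usual form is sketched below, with the user name, host, and group values as placeholders, not my cluster's actual entries.

  <!-- oozie-site.xml: proxyuser entries of the kind the warnings refer to (placeholder values) -->
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
    <value>*</value>
  </property>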

Stay tuned; I will post the results.

Thank you for your help.

Thanks

Ram

Re: Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Contributor

Hi, I have updated the job.properties file and the workflow to add hive-site.xml as you suggested, and I am still getting the same error.

9207 [Thread-19] INFO  org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities  - Time taken: 2.336 seconds
9207 [Thread-19] INFO  org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities  - Time taken: 2.336 seconds
12722 [main] INFO  hive.metastore  - Trying to connect to metastore with URI thrift://caldev-hdp-n02.txhubdw.com:9083
12767 [main] WARN  hive.metastore  - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)

The following is from the hivemetastore.log file:

2016-09-26 16:13:20,843 DEBUG [pool-7-thread-199]: security.UserGroupInformation (UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction as:hive/caldev02.MYEXAMPLE.COM (auth:KERBEROS) from:org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:606)
2016-09-26 16:13:20,843 DEBUG [pool-7-thread-199]: transport.TSaslServerTransport (TSaslServerTransport.java:getTransport(213)) - transport map does not contain key
2016-09-26 16:13:20,843 DEBUG [pool-7-thread-199]: transport.TSaslTransport (TSaslTransport.java:open(261)) - opening transport org.apache.thrift.transport.TSaslServerTransport@5b91a6a9
2016-09-26 16:13:20,844 DEBUG [pool-7-thread-199]: transport.TSaslTransport (TSaslTransport.java:sendSaslMessage(162)) - SERVER: Writing message with status ERROR and payload length 19
2016-09-26 16:13:20,844 DEBUG [pool-7-thread-199]: transport.TSaslServerTransport (TSaslServerTransport.java:getTransport(218)) - failed to open server transport
org.apache.thrift.transport.TTransportException: Invalid status -128
	at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:609)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:606)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:360)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1704)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:606)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2016-09-26 16:13:20,844 ERROR [pool-7-thread-199]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status -128
	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:609)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:606)
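For reference, the client-side Kerberos settings I expect the action to pick up from hive-site.xml are the standard ones sketched below; the principal value here is a placeholder, not my cluster's actual entry.

  <!-- hive-site.xml: client-side Kerberos settings for the metastore (placeholder principal) -->
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>hive/_HOST@EXAMPLE.COM</value>
  </property>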

Please let me know

Thanks

ram

Re: Oozie Sqoop action fails in kerberized cluster - HDP 2.5

Expert Contributor

Did you find any solution to the problem?

I have the same case.