Member since
10-01-2017
4
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2959 | 10-22-2018 01:41 PM |
10-22-2018
01:41 PM
The issue is that Spark couldn't read hive-site.xml and hence providing the properties in SparkSession builder fixed the issue. spark_session = (SparkSession
.builder
.config("hive.metastore.uris", "thrift://server.country.cloud.company.com:9083")
.config("hive.metastore.client.socket.timeout", "300")
.config("hive.metastore.warehouse.dir", "/user/hive/warehouse")
.config("hive.warehouse.subdir.inherit.perms", "true")
.config("hive.execution.engine", "mr")
.config("hive.metastore.execute.setugi", "true")
.config("hive.support.concurrency", "true")
.config("hive.zookeeper.quorum", "<<REDACTED>>")
.config("hive.zookeeper.client.port", "2181")
.config("hive.zookeeper.namespace", "hive_zookeeper_namespace_hive")
.config("hive.cluster.delegation.token.store.class", "org.apache.hadoop.hive.thrift.MemoryTokenStore")
.config("hive.server2.enable.doAs", "false")
.config("hive.metastore.sasl.enabled", "true")
.config("hive.server2.authentication", "kerberos")
.config("hive.metastore.kerberos.principal", "<<REDACTED>>")
.config("hive.server2.authentication.kerberos.principal", "<<REDACTED>>")
.config("hive.server2.use.SSL", "true")
.config("hive.exec.dynamic.partition", "true")
.config("hive.exec.dynamic.partition.mode", "nonstrict")
.enableHiveSupport()
.appName('app_name')
.getOrCreate())
... View more
10-22-2018
11:52 AM
Environment: Cloudera Enterprise 5.10.2.
We have a python spark job that is submitted using spark2-submit from the edge node and is working as expected. The spark job queries the hive tables, and it works in both client and cluster mode.
However, when it's submitted via a shell action from the oozie workflow, it fails to connect to the metastore. Actually, it opens a connection and then looses it. Any lead on how to fix would be helpful.
=======================================================================================
18/10/22 10:52:23 INFO hive.metastore: Trying to connect to metastore with URI thrift://correcturl.country.cloud.company.com:9083
18/10/22 10:52:23 INFO hive.metastore: Opened a connection to metastore, current connections: 1
18/10/22 10:52:23 WARN hive.metastore: set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3741)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3727)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:437)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:235)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1490)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2935)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2954)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3179)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:210)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:197)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:307)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:268)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:243)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply$mcV$sp(HiveCredentialProvider.scala:91)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply(HiveCredentialProvider.scala:90)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply(HiveCredentialProvider.scala:90)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anon$1.run(HiveCredentialProvider.scala:123)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1907)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider.doAsRealUser(HiveCredentialProvider.scala:122)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider.obtainCredentials(HiveCredentialProvider.scala:90)
at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager$$anonfun$obtainCredentials$2.apply(ConfigurableCredentialManager.scala:80)
at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager$$anonfun$obtainCredentials$2.apply(ConfigurableCredentialManager.scala:78)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager.obtainCredentials(ConfigurableCredentialManager.scala:78)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:398)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:874)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:171)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
18/10/22 10:52:23 INFO hive.metastore: Connected to metastore.
18/10/22 10:52:23 WARN metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_all_functions(ThriftHiveMetastore.java:3339)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_all_functions(ThriftHiveMetastore.java:3327)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllFunctions(HiveMetaStoreClient.java:2060)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
at com.sun.proxy.$Proxy18.getAllFunctions(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:1998)
at com.sun.proxy.$Proxy18.getAllFunctions(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3179)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:210)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:197)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:307)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:268)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:243)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply$mcV$sp(HiveCredentialProvider.scala:91)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply(HiveCredentialProvider.scala:90)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply(HiveCredentialProvider.scala:90)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anon$1.run(HiveCredentialProvider.scala:123)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1907)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider.doAsRealUser(HiveCredentialProvider.scala:122)
at org.apache.spark.deploy.yarn.security.HiveCredentialProvider.obtainCredentials(HiveCredentialProvider.scala:90)
at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager$$anonfun$obtainCredentials$2.apply(ConfigurableCredentialManager.scala:80)
at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager$$anonfun$obtainCredentials$2.apply(ConfigurableCredentialManager.scala:78)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager.obtainCredentials(ConfigurableCredentialManager.scala:78)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:398)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:874)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:171)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
18/10/22 10:52:24 INFO hive.metastore: Closed a connection to metastore, current connections: 0
18/10/22 10:52:24 INFO hive.metastore: Trying to connect to metastore with URI thrift://correcturl.country.cloud.company.com:9083
18/10/22 10:52:24 INFO hive.metastore: Opened a connection to metastore, current connections: 1
18/10/22 10:52:24 WARN hive.metastore: set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Oozie
-
Apache Spark