New Contributor
Posts: 3
Registered: ‎02-27-2017

Issue with PySpark and Hue

Hi All,

 

We are using Hue 3.11 on CentOS 7, connecting to a Hortonworks (HDP 2.5.3) cluster. While running the PySpark commands below from the Hue UI:

 

from pyspark.sql import HiveContext
from pyspark import SparkConf, SparkContext

sqlContext = HiveContext(sc)
sqlContext.sql("show tables")

 

we get the following error:

 

org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) ... 29 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed at
 

Note: we are using a Kerberized cluster, and impersonation is enabled for Hive on the HDP cluster.
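Since the session runs for a long time inside Hue, it can be worth letting Spark authenticate with its own keytab instead of depending on a ticket cache that may expire. A minimal sketch, assuming the Spark 1.x property names shipped with HDP 2.5 (the principal and keytab path below are placeholders, not values from this cluster):

```python
# Sketch: build the Spark properties that let a long-running session log in
# to Kerberos itself rather than relying on an existing ticket cache.
# The principal and keytab path are placeholders -- substitute your own.
def kerberos_spark_props(principal, keytab):
    # spark.yarn.principal / spark.yarn.keytab are the Spark 1.x names;
    # Spark 3.x renamed them to spark.kerberos.principal / spark.kerberos.keytab.
    return {
        "spark.yarn.principal": principal,
        "spark.yarn.keytab": keytab,
    }

props = kerberos_spark_props("hue/gateway.example.com@EXAMPLE.COM",
                             "/etc/security/keytabs/hue.service.keytab")
```

These properties can then be applied with `SparkConf().setAll(props.items())`, or set in the session's Spark configuration in Hue.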

 

Detailed error log:

 

Traceback (most recent call last):
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 580, in sql
    return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 683, in _ssql_ctx
    self._scala_HiveContext = self._get_hive_ctx()
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 692, in _get_hive_ctx
    return self._jvm.HiveContext(self._jsc.sc())
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:249)
    at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:345)
    at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255)
    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:459)
    at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:233)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:236)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    ... 23 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    ... 29 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:249)
    at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:345)
    at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255)
    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:459)
    at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:233)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:236)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:466)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    ... 34 more

 

 

Explorer
Posts: 12
Registered: ‎11-14-2016

Re: Issue with PySpark and Hue

Hi Arathod,

 

Were you able to solve this? I'm currently running into the same issue.

 

Thanks.

New Contributor
Posts: 2
Registered: ‎05-31-2017

Re: Issue with PySpark and Hue

Is the cluster Kerberized? If so, the "GSS initiate failed" error usually means there is no valid Kerberos ticket for the user running the script.
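As a quick check from the host running the script: with MIT Kerberos client tools, `klist -s` exits 0 only when the default credential cache holds a valid, non-expired ticket. A small sketch of that check (assumes `klist` is installed and on PATH):

```python
import subprocess

def has_valid_ticket():
    # 'klist -s' is silent and exits 0 only when the default credential
    # cache contains a valid (non-expired) ticket; non-zero otherwise.
    try:
        return subprocess.call(["klist", "-s"]) == 0
    except OSError:
        # klist not installed or not on PATH
        return False
```

If this returns False, running `kinit` (or `kinit -kt <keytab> <principal>`) for the session's user should be the first thing to try before re-running the PySpark snippet.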

Explorer
Posts: 12
Registered: ‎11-14-2016

Re: Issue with PySpark and Hue

Yes, it's a Kerberized cluster. Do you have hue set up as a superuser (proxy user) for Spark?
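For reference, "superuser" here usually means Hadoop impersonation: the hue service account must be allowed to act on behalf of end users via proxyuser entries in core-site.xml. A sketch of the usual settings (the wildcard values are illustrative; production setups typically restrict them to specific hosts and groups):

```xml
<!-- core-site.xml: allow the 'hue' service account to impersonate end users.
     Wildcards are illustrative; restrict hosts/groups in production. -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
```

With impersonation enabled on the Hive side as well, the metastore then has to trust the proxying principal, which is another place a GSS failure can originate.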
