Spark-shell org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTranspo

Hi All,

 

Installed an 8-node cluster with Sentry, Kerberos, Spark 1.6, and Spark 2.3.0.cloudera2.

Java 1.8.0_121

 

CDH 5: 5.15.0-1.cdh5.15.0.p0.21

Spark2 parcel: 2.3.0.cloudera2-1.cdh5.13.3.p0.316101

CSD: SPARK2_ON_YARN-2.3.0.cloudera2.jar

 

Issues:

1) Invoking spark-shell throws the error below (not working).

2) In spark2-shell, a Hive query returns no results and throws the error below. Neither shell works.

1. Spark-shell:


      ____              __

     / __/__  ___ _____/ /__

    _\ \/ _ \/ _ `/ __/  '_/

   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0

      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121)

Type in expressions to have them evaluated.

Type :help for more information.

org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset

        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:220)

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset                                      

        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3658)                                                                                                             

        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:231)                                                                                                              

        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:215)                                                                                                      

        ... 71 more                                                                                                                                                                            

Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset                                                                                        

        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)                                                                                                    

        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)       

 

java.lang.NullPointerException                                                                                                                                                                

        at org.apache.spark.sql.SQLContext$.createListenerAndUI(SQLContext.scala:1387)                                                                                                          

        at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)                                                                                                                

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)    

 

        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)                                                                                                                        

                                                                                                                                                                                               

<console>:16: error: not found: value sqlContext                                                                                                                                              

         import sqlContext.implicits._                                                                                                                                                          

                ^                                                                                                                                                                              

<console>:16: error: not found: value sqlContext                                                                                                                                              

         import sqlContext.sql

 

         ________________________________________________________________________________

   

2) Spark2-shell:

scala> val hiveDF = spark.sql("select * from ...")
org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453)
        at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)

        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
        at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
19/03/18 22:42:32 WARN metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.

 

           ___________________________________________________________________________

 

Hive Metastore log (hadoop-cmf-hive-HIVEMETASTORE):

2019-03-18 09:15:00,655 ERROR org.apache.thrift.server.TThreadPoolServer: [pool-5-thread-10]: Error occurred during processing of message.

java.lang.RuntimeException: org.apache.hadoop.security.authorize.AuthorizationException: User: hive/FQDN@REALM is not allowed to impersonate PRINCIPAL@REALM

        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:757)

        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

        at java.lang.Thread.run(Thread.java:745)

Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: hive/FQDN@REALM is not allowed to impersonate PRINCIPAL@REALM

        at org.apache.hadoop.security.authorize.DefaultImpersonationProvider.authorize(DefaultImpersonationProvider.java:123)

        at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:102)

        at org.apache.hadoop.security.authorize.ProxyUsers.authorize(ProxyUsers.java:116)

        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:727)

        ... 4 more

 

 

I'd appreciate your help in advance.

 

Kind Regards

R


Re: Spark-shell org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTra

Cloudera Employee

Hello R,

The following error suggests that the Hive Metastore is not allowing hive/FQDN@REALM to impersonate PRINCIPAL@REALM:

 

Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: hive/FQDN@REALM is not allowed to impersonate PRINCIPAL@REALM

A very common reason for this is restricting "hadoop.proxyuser.hive.groups" per the steps in link [1]. If the user running the Spark jobs needs this access, add that user's group to hadoop.proxyuser.hive.groups, or add the username itself to hadoop.proxyuser.hive.users.
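As a sketch of what that looks like in core-site.xml (in CDH these properties are usually set through the HDFS service's safety valve in Cloudera Manager rather than edited by hand, and the group name "spark-users" and username "sparkuser" below are placeholders, not values from this thread):

```xml
<!-- Allow the hive service user to impersonate members of these groups.     -->
<!-- Replace "spark-users" with the actual group of the user running Spark.  -->
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>hive,hue,spark-users</value>
</property>

<!-- Alternatively, allow specific usernames to be impersonated.             -->
<!-- Replace "sparkuser" with the actual username.                           -->
<property>
  <name>hadoop.proxyuser.hive.users</name>
  <value>sparkuser</value>
</property>
```

After changing these values, the affected services need to be restarted for the new proxy-user settings to take effect.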

 

 

[1] https://www.cloudera.com/documentation/enterprise/5-15-x/topics/sg_sentry_service_config.html#concep...