Created 09-16-2016 07:02 AM
Hi,
I am running a Spark program on a secured (Kerberized) cluster. The program creates a SQLContext and builds a DataFrame over a Phoenix table.
When I run the program in local mode with the --master option set to local[2], it works completely fine; however, when I run the same program with the master option set to yarn-client, I get the exception below:
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=5, exceptions:
Fri Sep 16 12:14:10 IST 2016, RpcRetryingCaller{globalStartTime=1474008247898, pause=100, retries=5}, org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
    at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4083)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:528)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:550)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:810)
    ... 50 more
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1540)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1560)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1711)
    at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124)
    ... 54 more
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58152)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1571)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1509)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1531)
    ... 58 more
Caused by: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:779)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:887)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
    ... 63 more
Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:679)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:637)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:745)
    ... 67 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
    at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:611)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:156)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:737)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:734)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
    ... 67 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
    ... 76 more
Please find below the program and the command I am using:
val sparkConf = new SparkConf()
  .setAppName(appName)
  .set("spark.kryo.registrationRequired", "true") // always use Kryo
CustomKryoRegistrator.register(sparkConf)

val sc = new SparkContext(sparkConf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")

val df = sqlContext.read.format("org.apache.phoenix.spark")
  .option("table", table_name)
  .option("zkUrl", "demo-qa2-dn03,demo-qa2-dn01,demo-qa2-dn02")
  .load()
df.show()
Command:
spark-submit \
  --jars $(echo ./lib/*.jar | tr ' ' ','),$(echo ./conf/*.* | tr ' ' ','),/usr/hdp/2.4.2.0-258/hbase/lib/hbase-client-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-common-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-server-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-hadoop-compat-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-thin-client.jar,/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-core-4.4.0.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-spark-4.4.0.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/phoenix-4.4.0.2.4.2.0-258-client.jar \
  --driver-class-path $(echo ./lib/*.jar | tr ' ' ','),$(echo ./conf/*.* | tr ' ' ','),/usr/hdp/2.4.2.0-258/hbase/lib/hbase-client-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-common-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-server-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-hadoop-compat-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-spark-4.4.0.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-core-4.4.0.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-thin-client.jar,/usr/hdp/2.4.2.0-258/hbase/lib/phoenix-4.4.0.2.4.2.0-258-client.jar \
  --master yarn-client \
  --class com.xyz.demo.dq.DataQualityApplicationHandler \
  tr-dq-16.7.0.0.0.jar org ss1 Phoenix tr-dq-job.properties QUALITY
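For reference, the innermost cause above ("Failed to find any Kerberos tgt ... Consider 'kinit'") can be sanity-checked on the submitting host as shown in the sketch below; the principal and keytab names are placeholders, not values from this cluster.

# Minimal sketch: confirm the user running spark-submit has a valid, non-expired TGT
# in its ticket cache (placeholder principal/keytab; substitute your own).
klist
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
klist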
Created 09-16-2016 07:31 AM
Maybe you are hitting the following bug:
https://issues.apache.org/jira/browse/PHOENIX-2817
Would you mind trying the workaround mentioned at the end of the ticket:
For people waiting on this fix there is a very simple workaround, provided that you use the default ZooKeeper port and path. It's as simple as listing only the server names, "server1,server2", so the plugin builds the URL correctly: jdbc:phoenix:server1,server2:2181:/hbase. The delegation tokens set up by spark-submit then take care of security, so Phoenix doesn't need to do anything with principals or keytabs. The thing I find a bit confusing is that for other tools the ZooKeeper quorum URL includes the port and the path, while for Phoenix the zk quorum property is just the server list.
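To make the workaround concrete, below is a minimal sketch of trying it from the edge node with spark-shell: only the ZooKeeper server names go into zkUrl (default port 2181 and znode /hbase assumed), and Kerberos is left to the delegation tokens that spark-shell/spark-submit set up. MY_TABLE is a placeholder, and the Phoenix client jar path is the HDP 2.4.2 one used elsewhere in this thread.

# Sketch only: read a Phoenix table with zkUrl set to the server names alone.
spark-shell --master yarn-client \
  --jars /usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-client.jar <<'EOF'
// server names only in zkUrl, no :2181:/hbase suffix; MY_TABLE is a placeholder
val df = sqlContext.read.format("org.apache.phoenix.spark").option("table", "MY_TABLE").option("zkUrl", "demo-qa2-dn03,demo-qa2-dn01,demo-qa2-dn02").load()
df.show()
EOF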
Created 09-16-2016 07:36 AM
Yes, I have also gone through this JIRA, but I was unable to understand the workaround provided.
What I understood from the workaround is that instead of providing the full ZooKeeper URL, I should provide only a comma-separated list of ZooKeeper hosts/IPs.
Created 09-16-2016 07:48 AM
And along with that, just make sure hbase-site.xml is on Spark's classpath.
OR(try)
Created 09-16-2016 08:03 AM
@Ankit Singhal I added hbase-site.xml to the Spark conf directory on all nodes and restarted the Spark service, but it didn't work. Also, hbase-site.xml is already present in my classpath.
Created 09-16-2016 09:59 AM
Then it's a different issue you are facing. Can you follow this thread?
http://search-hadoop.com/m/9UY0h2etOtv1p28Si2&subj=Phoenix+Spark+JDBC+Kerberos+
Created 09-16-2016 01:13 PM
I also tried upgrading my Phoenix to 4.8, but it didn't work.
Created 09-16-2016 01:21 PM
Have you also tried what @Josh Elser mentioned on the following thread, to get to the root cause of the problem? http://search-hadoop.com/m/9UY0h2etOtv1p28Si2&subj=Phoenix+Spark+JDBC+Kerberos+
Created 09-20-2016 10:01 AM
After adding the HBase jars to spark.driver.extraClassPath, my job is working fine.
spark-submit \
  --jars /usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-client.jar \
  --conf "spark.driver.extraClassPath=/usr/hdp/2.4.2.0-258/hbase/lib/hbase-common-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-client-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-server-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/htrace-core-3.1.0-incubating.jar:/usr/hdp/2.4.2.0-258/hbase/lib/guava-12.0.1.jar" \
  --master yarn-client \
  --principal ctadmin@EXAMPLE.COM \
  --keytab /etc/security/keytabs/ctadmin.keytab \
  --class com.dq.DataQualityApplicationHandler \
  tr-dq-16.7.0.0.0.jar org QUALITY
Root cause: when spark-submit detects a YARN deployment mode, org.apache.spark.deploy.yarn.Client is used for application submission. While obtaining the HBASE_DELEGATION_TOKEN, this YARN client could not find the HBase jars on its classpath. Hence the classpath (SPARK_CLASSPATH) should be configured with the required HBase jars so that the YARN client can obtain the delegation token.
Note: starting with Spark 1.5, spark.driver.extraClassPath needs to be set instead of exporting SPARK_CLASSPATH.
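For anyone who would rather not repeat the long --conf flag on every submission, below is a minimal sketch of the same setting applied via spark-defaults.conf on the submitting node. The /etc/spark/conf location and the HDP 2.4.2 jar paths are taken from this thread's environment and may differ on yours.

# Append (or merge into any existing) spark.driver.extraClassPath entry so every
# spark-submit from this node picks up the HBase jars needed for the delegation token.
cat >> /etc/spark/conf/spark-defaults.conf <<'EOF'
spark.driver.extraClassPath /usr/hdp/2.4.2.0-258/hbase/lib/hbase-common-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-client-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-server-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/htrace-core-3.1.0-incubating.jar:/usr/hdp/2.4.2.0-258/hbase/lib/guava-12.0.1.jar
EOF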
Created 06-20-2017 04:45 PM
I always get:
org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: host1/192.168.35.45:60020
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:708)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:907)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl
spark-submit \
  --master yarn-cluster \
  --keytab "/my.keytab" \
  --principal "myv@IO-INT.COM" \
  --driver-class-path "/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-common-1.2.0-cdh5.10.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.2.0-cdh5.10.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.2.0-cdh5.10.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-server-1.2.0-cdh5.10.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-spark-1.2.0-cdh5.10.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/guava-12.0.1.jar:/etc/hbase/conf/:/etc/hbase/conf/hbase-site.xml:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/" \
  spark-services-0.0.1-SNAPSHOT.jar