Member since: 07-21-2016
Posts: 15
Kudos Received: 8
Solutions: 0
08-12-2016
06:16 PM
Setting SPARK_HOME in hadoop-env.sh solved the issue. For others who run into the same problem, just add the following line to /usr/hdp/your_version_number/hadoop/conf/hadoop-env.sh:

export SPARK_HOME=/usr/hdp/current/spark-client
08-11-2016
02:05 PM
No, the arguments are passed correctly. This is how my application accepts them, since I am using org.apache.commons.cli.BasicParser. I verified it multiple times by printing the arguments inside the application; there is nothing wrong there. Thanks for your help.
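For reference, a minimal sketch of this kind of parsing with BasicParser; the option names mirror the <arg> elements in the workflow below, but the class name and option descriptions are illustrative, not the actual application code:

import org.apache.commons.cli.BasicParser;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public class ArgCheck {
    public static void main(String[] args) throws ParseException {
        // Option names mirror the <arg> elements passed by the Oozie spark action
        Options options = new Options();
        options.addOption("logtype", true, "type of log to mine");
        options.addOption("inputfile", true, "HDFS path of the input file");
        options.addOption("configfile", true, "path of the properties file");
        options.addOption("mode", true, "run mode, e.g. test");
        CommandLine cmd = new BasicParser().parse(options, args);
        // Print what actually arrived, to rule out argument-passing problems
        System.out.println("logtype    = " + cmd.getOptionValue("logtype"));
        System.out.println("inputfile  = " + cmd.getOptionValue("inputfile"));
        System.out.println("configfile = " + cmd.getOptionValue("configfile"));
        System.out.println("mode       = " + cmd.getOptionValue("mode"));
    }
}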
08-11-2016
01:26 PM
${master} is set to yarn-cluster. Here is the workflow:

<workflow-app name="${wf_name}" xmlns="uri:oozie:workflow:0.4">
    <start to="spark"/>
    <action name="spark">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${job_tracker}</job-tracker>
            <name-node>${name_node}</name-node>
            <master>${master}</master>
            <mode>cluster</mode>
            <name>logminer</name>
            <class>logminer.main.LogMinerMain</class>
            <jar>${filesystem}/${baseLoc}/oozie/lib/argus-logminer-1.0.jar</jar>
            <spark-opts>--driver-memory 4G --executor-memory 4G --num-executors 3 --executor-cores 5</spark-opts>
            <arg>-logtype</arg> <arg>adraw</arg>
            <arg>-inputfile</arg> <arg>/user/inputfile-march-3.txt</arg>
            <arg>-configfile</arg> <arg>${filesystem}/${baseLoc}/oozie/logminer.properties</arg>
            <arg>-mode</arg> <arg>test</arg>
        </spark>
        <ok to="success_email"/>
        <error to="fail_email"/>
    </action>
    <action name="success_email">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailTo}</to>
            <cc>${emailCC}</cc>
            <subject>${wf_name}: Successful run at ${wf:id()}</subject>
            <body>The workflow [${wf:id()}] ran successfully.</body>
        </email>
        <ok to="end"/>
        <error to="fail_email"/>
    </action>
    <action name="fail_email">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailTo}</to>
            <cc>${emailCC}</cc>
            <subject>${wf_name}: Failed at ${wf:id()}</subject>
            <body>The workflow [${wf:id()}] failed at [${wf:lastErrorNode()}] with the following message: ${wf:errorMessage(wf:lastErrorNode())}</body>
        </email>
        <ok to="fail"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
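For context, a hypothetical job.properties showing the variables this workflow references; every value below is an invented placeholder (host names, paths, addresses), not taken from the real cluster:

# Hypothetical placeholder values for the variables referenced in workflow.xml
wf_name=logminer-wf
job_tracker=resourcemanager.example.com:8050
name_node=hdfs://namenode.example.com:8020
master=yarn-cluster
filesystem=hdfs://namenode.example.com:8020
baseLoc=user/someuser
emailTo=you@example.com
emailCC=team@example.com
oozie.wf.application.path=${filesystem}/${baseLoc}/oozie
oozie.use.system.libpath=true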
08-10-2016
08:15 PM
Spark: 1.3.1
HDP: 2.3.0.0-2557

I don't see any SPARK_HOME variable in my shell. But here is the list of jars from /usr/hdp/current/spark-client/lib:

datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar
spark-1.3.1.2.3.0.0-2557-yarn-shuffle.jar
spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
spark-examples-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
08-10-2016
07:14 PM
Hi, I am trying to launch a Spark application that works perfectly well from the shell, but the executors fail when it is launched from Oozie. On the executor (slave) side, I see the following:

Error: Could not find or load main class org.apache.spark.executor.CoarseGrainedExecutorBackend

On the driver side I see the stack trace below, but there is no real null pointer problem in my code; the code works fine when I launch Spark directly from the shell. It has something to do with the executors.

[Driver] ERROR logminer.main.LogMinerMain - null
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method) ~[?:1.8.0_66]
    at java.lang.Object.wait(Object.java:502) ~[?:1.8.0_66]
    at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:513) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1466) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1484) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1498) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1512) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.rdd.RDD.collect(RDD.scala:813) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:320) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:46) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at logminer.main.LogSparkTester.test(LogSparkTester.java:214) ~[__app__.jar:?]
    at logminer.main.LogMinerMain.testTrainOnHdfs(LogMinerMain.java:232) ~[__app__.jar:?]
    at com.telus.argus.logminer.main.LogMinerMain.main(LogMinerMain.java:159) [__app__.jar:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_66]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_66]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_66]
    at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_66]
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:484) [spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]

I am not sure how to solve this issue. I have put all the Spark-related jars in the lib folder for this Oozie job. Here is my directory structure on HDFS for this Oozie job:

oozie/
oozie/workflow.xml
oozie/job.properties
oozie/lib/argus-logminer-1.0.jar
oozie/lib/core-site.xml
oozie/lib/hdfs-site.xml
oozie/lib/kms-site.xml
oozie/lib/mapred-site.xml
oozie/lib/oozie-sharelib-spark-4.2.0.2.3.0.0-2557.jar
oozie/lib/spark-1.3.1.2.3.0.0-2557-yarn-shuffle.jar
oozie/lib/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
oozie/lib/yarn-site.xml

Does anyone know how to solve this? Any idea which jar contains the CoarseGrainedExecutorBackend class?
Labels:
- Apache Oozie
- Apache Spark
- Apache YARN
07-21-2016
08:25 PM
1 Kudo
Yes, I got the same Kerberos credential error that I posted above for loginUserFromKeytab(). When I shipped the files, the error changed slightly to: "can't get password from the keytab".
07-21-2016
07:25 PM
Sorry, I just corrected the code that worked for me: loginUserFromKeytab() didn't work, but loginUserFromKeytabAndReturnUGI() with doAs() did.
07-21-2016
06:44 PM
3 Kudos
I tried to ship the keytab file using the "--files" option and then read it using SparkFiles.get("xyz.keytab"). I also tried the following statement, but it didn't work: UserGroupInformation.loginUserFromKeytab("name@xyz.com", keyTab);

However, your suggestion about adding the ugi.doAs() function helped me resolve this issue. Here is the full code, in case anyone else runs into the same trouble:
// conf is the HBaseConfiguration shown in the original question below
UserGroupInformation.setConfiguration(conf);
String keyTab = "/etc/security/keytabs/somekeytab";
// loginUserFromKeytab() alone did not work; obtaining a UGI and wrapping
// the connection call in doAs() did
UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI("name@xyz.com", keyTab);
UserGroupInformation.setLoginUser(ugi);
ugi.doAs(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws IOException {
        connection = ConnectionFactory.createConnection(conf);
        return null;
    }
});
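A small variant of the same idea, assuming the same conf object: doAs() returns whatever run() returns, so the Connection can be handed back directly instead of being assigned to a field:

Connection connection = ugi.doAs(new PrivilegedExceptionAction<Connection>() {
    @Override
    public Connection run() throws IOException {
        return ConnectionFactory.createConnection(conf);
    }
});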
07-21-2016
03:53 PM
4 Kudos
Hi,
I am running a Spark application on a Kerberized HDP platform. The application connects to HBase and reads and writes data perfectly well in local mode on any node in the cluster. However, when I run it on the cluster with "--master yarn" and "--deploy-mode client" (or cluster), the Kerberos authentication fails. I have tried all sorts of things: doing kinit outside of the application on each node, and doing the Kerberos authentication inside the application as well, but none of it has worked so far. In local mode nothing has any issue and everything works when I do kinit outside and perform no authentication inside the application. In cluster mode, however, nothing works, whether I authenticate inside the application or outside of it. Here is an extract of the stack trace:

ERROR ipc.AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)

Below is the code that I use for authenticating inside the application:

Configuration conf = HBaseConfiguration.create();
conf.addResource(new Path(hbaseConfDir, "hbase-site.xml"));
conf.addResource(new Path(hadoopConfDir, "core-site.xml"));
conf.set("hbase.client.keyvalue.maxsize", "0");
conf.set("hbase.rpc.controllerfactory.class", "org.apache.hadoop.hbase.ipc.RpcControllerFactory");
// Kerberos-specific lines begin here
conf.set("hadoop.security.authentication", "kerberos");
conf.set("hbase.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(conf);
String keyTab = "/etc/security/keytabs/somekeytab";
UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI("name@xyz.com", keyTab);
UserGroupInformation.setLoginUser(ugi);
// Kerberos-specific lines end here
connection = ConnectionFactory.createConnection(conf);
logger.debug("HBase connected");

Adding or removing the Kerberos-specific lines marked above didn't really have any effect, other than that when they are present, kinit outside of the application is not needed. Please let me know how I can solve this problem; I have been banging my head against this issue for quite some time.
Labels:
- Apache HBase
- Apache Spark
- Apache YARN