How to inject local classpath of 3rd-party libs to Oozie Spark action?

Contributor

We have a Spark app that uses CLABS Phoenix to access HBase tables. It works from the command line, and I am trying to set it up as an Oozie action. However, I am having trouble adding the class paths in Oozie through the available Hue 3.9 GUI (CDH 5.7).

 

The previous related questions that I can find (such as this), as well as this blog post, all suggest making physical copies of the library jars and putting them in HDFS, in either (1) workflow/lib or (2) the Oozie sharelib dir. However, the Phoenix package has 70+ files (~210MB) and is already installed on the entire cluster. It seems inefficient and wasteful to upload all of that into HDFS and ship it around the network unnecessarily.

 

With spark-submit, we can pass in the path using "spark.driver.extraClassPath" and "spark.executor.extraClassPath". However, according to OOZIE-2277, this is not possible with Oozie. Setting them in <action><spark><configuration><property> has no effect; they are simply ignored:

 

Warning: Ignoring non-spark config property: "spark.executor.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/*:/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/lib/*:/opt/spark/lib/*"
Warning: Ignoring non-spark config property: "spark.driver.extraClassPath=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/*:/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/lib/*:/opt/spark/lib/*"
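
For comparison, the command-line form that works outside Oozie can be sketched as follows. This is only an illustration, not the actual command used here: the main class and application jar names are hypothetical placeholders, and only the three classpath entries are taken from the warnings above. The sketch builds the spark-submit argument list in Python so the pieces are easy to see.

```python
import shlex

# Node-local directories, copied from the warnings quoted above.
PHOENIX_CLASSPATH = ":".join([
    "/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/*",
    "/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/lib/*",
    "/opt/spark/lib/*",
])


def build_spark_submit(app_jar, main_class, classpath=PHOENIX_CLASSPATH):
    """Return a spark-submit argv that adds node-local jars to both classpaths."""
    return [
        "spark-submit",
        "--class", main_class,
        "--conf", "spark.driver.extraClassPath=" + classpath,
        "--conf", "spark.executor.extraClassPath=" + classpath,
        app_jar,
    ]


if __name__ == "__main__":
    # "phoenix-app.jar" and "com.example.PhoenixApp" are hypothetical names.
    argv = build_spark_submit("phoenix-app.jar", "com.example.PhoenixApp")
    # Quote each argument so the shell does not expand the "*" entries.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The wildcard entries are passed through unexpanded on purpose: the JVM expands `dir/*` classpath entries itself, so the shell should not touch them.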

 

The same log file shows that "spark.driver.extraClassPath" and "spark.executor.extraClassPath" are being populated from what looks like the Oozie sharelib contents. Is there a way to append to them through an environment variable or some other mechanism?

 

Thanks,

Miles

 

2 REPLIES

Re: How to inject local classpath of 3rd-party libs to Oozie Spark action?

Contributor

I tried uploading the Phoenix jars to a separate HDFS location and pointing oozie.libpath to it in the workflow definition. That caused the AM launch to fail:

 

2017-01-18 13:56:13,053 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskLog.createLogSyncer()Ljava/util/concurrent/ScheduledExecutorService;
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.<init>(MRAppMaster.java:258)

 

I would really prefer not to mess with the Oozie sharelib - it seems to be effectively considered part of the CDH installation. The blog didn't really explain how users should append 3rd-party content to it, and Phoenix is only used by a subset of workflows anyway.

 

Could the problem be with Hue-Oozie integration?  Very confusing area - appreciate any tips anyone has!

 

Re: How to inject local classpath of 3rd-party libs to Oozie Spark action?

Contributor

When I copied the Phoenix client jar to workflow/lib directory, Oozie included it in the Spark container.  However, AppMaster now fails to launch:

 

(syslog)

2017-01-18 22:52:30,735 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskLog.createLogSyncer()Ljava/util/concurrent/ScheduledExecutorService;
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.<init>(MRAppMaster.java:258)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.<init>(MRAppMaster.java:241)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1456)
2017-01-18 22:52:30,754 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

 

This looks like OOZIE-2389.  Using the workaround suggested therein, I was able to launch the Spark task, but org.apache.spark.deploy.SparkSubmit.main() failed immediately with no info.

 

I used phoenix-4.7.0-clabs-phoenix1.3.0-client.jar, not the *thin-client.jar, which doesn't contain the org.apache.phoenix.spark driver. Does it have any dependent jars that need to be copied along, or any version conflict with CDH 5.7.1?
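
One way to look into the version-conflict question is to list what the client jar actually bundles. The sketch below is a generic jar inspection helper, not something taken from Phoenix or Oozie; it rests on the unconfirmed assumption that the NoSuchMethodError on org.apache.hadoop.mapred.TaskLog could come from a second copy of that Hadoop class shadowing the cluster's own. The function name is made up for illustration.

```python
import sys
import zipfile


def classes_in_jar(jar_path, package_prefix):
    """List .class entries in a jar whose path starts with the given dotted prefix."""
    prefix = package_prefix.replace(".", "/")
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name for name in jar.namelist()
            if name.startswith(prefix) and name.endswith(".class")
        )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_jar.py /path/to/some-client.jar
    for pkg in ("org.apache.hadoop.mapred.TaskLog", "org.apache.phoenix.spark"):
        hits = classes_in_jar(sys.argv[1], pkg)
        print(f"{pkg}: {len(hits)} class file(s)")
```

If the first prefix reports any class files, the jar carries its own Hadoop classes, which would be worth comparing against the CDH version; if the second reports none, the jar lacks the Spark integration classes.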

 

 
