Cloudera 5.4.x oozie Custom Actions, how to use scala main classes?


Please forgive me if this appears twice; the forum claimed to discard my last message after a successful preview and an attempt to post it.

 

I have an Oozie custom action on the Cloudera QuickStart 5.4.x VM, with an associated main class. The main class is written in Scala (for interoperability with a framework), and I'm having some challenges during testing. Before going too deep into the details, I would like to ask the following questions:

 

  1. What is the recommended approach for the custom action executor to resolve the class name of the main class, particularly if the main class is written in Scala?
  2. What is the automated and recommended way to determine what (if any) dependencies must be provided with (i) the custom action and (ii) the main class?
  3. What is the recommended way to package the dependencies (e.g. put extra jars in the directory, use a shadow/fat jar)?

The details follow, using a simplified, sanitized version of what I'm dealing with that shows the salient features.

 

First, let's consider the custom action executor, which has the following structure. Please note the labeled problematic line, whose behavior at runtime depends on the presence or absence of dependencies.


package org.apache.oozie.action.hadoop;

// all the imports are shown
import org.apache.action.hadoop.MyScalaShellMain;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.oozie.action.ActionExecutorException;
import org.jdom.Element;
import org.jdom.Namespace;

import java.util.List;


public class MyShellActionExecutor extends JavaActionExecutor {
// stuff deleted
@Override
protected String getLauncherMain(Configuration launcherConf, Element actionXml) {
final String classNameToLaunch = MyScalaShellMain.class.getName(); // This line fails without installing additional dependencies
LOG.info("getLauncherMain, classNameToLaunch = \"" + classNameToLaunch + "\"");
return launcherConf.get(LauncherMapper.CONF_OOZIE_ACTION_MAIN_CLASS, MyScalaShellMain.class.getName());
}

// more stuff deleted
}

The MyScalaShellMain class is a Scala class with a companion object (to support the static main method), with the following structure:

 

package org.apache.action.hadoop

import java.io._
import java.util
import com.typesafe.scalalogging.Logging
import com.typesafe.scalalogging.slf4j.Logger
import org.slf4j.LoggerFactory

import scala.collection.JavaConverters._

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.util.Shell
import org.apache.oozie.action.hadoop._


import com.typesafe.scalalogging;


class MyScalaShellMain extends LauncherMain with Logging {
val logger = Logger(LoggerFactory.getLogger(getClass))

@throws(classOf[Exception])
protected def run(args: Array[String]) {
System.out.println(s"""run(args = ${args}) invoked""")
val actionConf: Configuration = loadActionConf
System.out.println("run, Before execute, actionConf = " + actionConf.toString)
val exitCode: Int = execute(actionConf)
System.out.println("run After Execute, exitcode = " + exitCode)
if (exitCode != 0) {
throw new MyLauncherMainException(1)
}
}
// stuff deleted
}

/**
* Companion object
*/
object MyScalaShellMain{
/**
* @param args Invoked from LauncherMapper:map()
* @throws Exception
*/
@throws(classOf[Exception])
def main(args: Array[String]): Unit = {
System.out.println("Starting main(args = " + args + ")")
System.out.println("The object's name is " + this.getClass.getName)
val mssm = new MyScalaShellMain
mssm.run(args)
//LauncherMain.run(classOf[MyScalaShellMain], args)
}

}
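
On question 1, the name to hand to the launcher is normally the plain class name, without a trailing `$`. As far as I know, scalac emits a static `main` forwarder on `MyScalaShellMain` when the companion object defines `main` and the class itself does not, so a Java-style lookup of `public static void main(String[])` should succeed. A minimal sketch to check that from Java follows; `MainMethodCheck` and `hasStaticMain` are hypothetical helper names, not part of Oozie.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper, not part of Oozie: checks for a JVM-style entry point.
final class MainMethodCheck {

    // True if the class exposes `public static void main(String[])`.
    static boolean hasStaticMain(Class<?> candidate) {
        try {
            // getMethod only returns public methods, including inherited ones.
            Method main = candidate.getMethod("main", String[].class);
            return Modifier.isStatic(main.getModifiers())
                    && main.getReturnType() == void.class;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments, with the jars on the classpath.
        for (String name : args) {
            Class<?> candidate = Class.forName(name);
            System.out.println(name + " has static main: " + hasStaticMain(candidate));
        }
    }
}
```

Running it with `org.apache.action.hadoop.MyScalaShellMain` as the argument, and the action jar plus the Scala jars on the classpath, would show whether the forwarder is present in a given build.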

If I build both classes into the same jar and install it in both the /var/lib/oozie directory and the /lib directory next to my workflow.xml as a shared lib, I get a ClassNotFoundException, rethrown as a NoClassDefFoundError, pointing to the indicated line in the custom action executor but complaining about com/typesafe/scalalogging/Logging (is a static method invoking the constructor here?).
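
A likely explanation for the error on the labeled line is class loading rather than a constructor call: evaluating `MyScalaShellMain.class` makes the JVM load that class, and loading a class also loads its superclass and superinterfaces, which here include the `Logging` trait. One way to sidestep this in the executor, sketched below under the assumption that the executor only needs the name and never the class itself, is to keep the fully qualified name as a string. `LauncherMainName`, `launcherMain`, and `loadable` are hypothetical names used for illustration.

```java
// Hypothetical sketch, not part of Oozie: refer to the main class by name only.
final class LauncherMainName {

    // Assumption: copied by hand from the package and class name in the Scala source.
    static final String MAIN_CLASS = "org.apache.action.hadoop.MyScalaShellMain";

    // Returns a configured override if present, otherwise the default name.
    // No class literal is used, so this JVM never has to load the main class.
    static String launcherMain(String configuredOverride) {
        return configuredOverride != null ? configuredOverride : MAIN_CLASS;
    }

    // Reports whether a class (and its supertypes) can be loaded in this JVM.
    static boolean loadable(String className) {
        try {
            Class.forName(className, false, LauncherMainName.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError for a missing supertype.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = launcherMain(null);
        System.out.println("launcher main = " + name);
        System.out.println("loadable in this JVM = " + loadable(name));
    }
}
```

With this shape, the server-side classpath would only need the executor and its own dependencies, while the main class and the Scala jars would only need to be visible to the launcher job.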

 

So I tried to resolve the dependencies by including the following jars in the /var/lib/oozie directory and the lib directory next to my workflow.xml as a shared lib:

  • scala-library-2.10.5.jar

  • scala-logging-slf4j_2.10-2.1.2.jar

  • scala-logging-api_2.10-2.1.2.jar

(The versions are selected to be compatible with the framework used for the production main).

 

Fortunately, the workflow now runs with a successful completion status; however, the logging of the Oozie workflow is no longer populated.

I think that is due to dependency conflicts, and I am not sure how to resolve them correctly.
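
One way to test the dependency-conflict suspicion is to print which jar each logging-related class is actually loaded from. The sketch below uses only standard JDK calls; `WhichJar` and `locate` are hypothetical names, and the default class names in `main` are guesses at what might conflict, not a confirmed diagnosis.

```java
import java.security.CodeSource;

// Hypothetical diagnostic, not part of Oozie: shows where a class was loaded from.
final class WhichJar {

    // Returns the jar or directory a class comes from, or a short note if unavailable.
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            // JDK core classes typically have no code source.
            return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Assumed candidates; pass other class names as arguments to override.
        String[] names = args.length > 0 ? args : new String[] {
            "org.slf4j.LoggerFactory", "org.apache.log4j.Logger", "scala.Predef"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Calling `WhichJar.locate(...)` from the main class and from the executor, and comparing the output in each environment, would show whether two different copies of a logging jar are in play.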

1 ACCEPTED SOLUTION

1. All classes related to the custom action need to be in /var/lib/oozie.
2. The main class and all of its dependencies need to be in the sharelib directory [1].

[1] http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/
