Error "tried to access method com.google.common.base.Stopwatch...." in Hue/Oozie Spark action.

This is frustrating because I had this working previously, but it no longer works correctly.

 

I'm running TeraGen/TeraSort/TeraValidate from the com.github.ehiggs.spark.terasort library as a training exercise.

 

I can usually execute TeraGen successfully, but on the TeraSort step, I get the error: 

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

If I move the TeraSort step above the TeraGen step, I can execute TeraSort, then TeraGen, then TeraSort again, but I get that error on TeraValidate.

 

Can anyone identify what I'm doing wrong?

 

The Hue/Oozie editor creates the following workflow.xml file:

 

<workflow-app name="TeraGen_-_TeraSort_-_TeraValidate" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-0883"/>
  <kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <action name="spark-f631">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master></master>
      <mode></mode>
      <name>TeraSort</name>
      <class>com.github.ehiggs.spark.terasort.TeraSort</class>
      <jar>/user/hue/oozie/workspaces/hue-oozie-1450123297.08/lib/spark-terasort.jar</jar>
      <spark-opts>--jars /user/hue/oozie/workspaces/hue-oozie-1450123297.08/lib/spark-terasort.jar</spark-opts>
      <arg>/user/davidw/terasort-benchmark.in</arg>
      <arg>/user/davidw/terasort-benchmark.out</arg>
    </spark>
    <ok to="spark-504c"/>
    <error to="Kill"/>
  </action>
  <action name="spark-0883">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <prepare>
        <delete path="${nameNode}/user/davidw/terasort-benchmark.in"/>
        <delete path="${nameNode}/user/davidw/terasort-benchmark.out"/>
        <delete path="${nameNode}/user/davidw/terasort-benchmark.validate"/>
      </prepare>
      <master></master>
      <mode></mode>
      <name>TeraGen</name>
      <class>com.github.ehiggs.spark.terasort.TeraGen</class>
      <jar>/user/hue/oozie/workspaces/hue-oozie-1450123297.08/lib/spark-terasort.jar</jar>
      <spark-opts>--jars /user/hue/oozie/workspaces/hue-oozie-1450123297.08/lib/spark-terasort.jar</spark-opts>
      <arg>1g</arg>
      <arg>/user/davidw/terasort-benchmark.in</arg>
    </spark>
    <ok to="spark-f631"/>
    <error to="Kill"/>
  </action>
  <action name="spark-504c">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master></master>
      <mode></mode>
      <name>TeraValidate</name>
      <class>com.github.ehiggs.spark.terasort.TeraValidate</class>
      <jar>/user/hue/oozie/workspaces/hue-oozie-1450123297.08/lib/spark-terasort.jar</jar>
      <spark-opts>--jars /user/hue/oozie/workspaces/hue-oozie-1450123297.08/lib/spark-terasort.jar</spark-opts>
      <arg>/user/davidw/terasort-benchmark.out</arg>
      <arg>/user/davidw/terasort-benchmark.validate</arg>
    </spark>
    <ok to="End"/>
    <error to="Kill"/>
  </action>
  <end name="End"/>
</workflow-app>

 

 

1 ACCEPTED SOLUTION

I was able to rebuild the Oozie job and make it work, although I don't know which change actually fixed it.

I built the job in sequence this time, so that the steps are listed in order in the XML file.

I also had each step reference the jar through the lib directory in the job's path (lib/spark-terasort.jar).

Explicit references had worked for me before, but they don't seem to be necessary.

I moved each prepare step to the action that needs it instead of putting them all on the first step.

I eliminated the output directory argument for TeraValidate because it doesn't seem to be used.

Finally, I let Hue/Oozie choose the defaults for Master and Mode (local[*] and client in the XML below). I experimented with YARN and cluster mode, but those didn't work.

 

My resulting XML (that works) looks like this:

 

<workflow-app name="TeraGen-TeraSort-TeraValidate" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-27f0"/>
  <kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <action name="spark-27f0">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <prepare>
        <delete path="${nameNode}/user/davidw/terasort-benchmark.in"/>
      </prepare>
      <master>local[*]</master>
      <mode>client</mode>
      <name>TeraGen</name>
      <class>com.github.ehiggs.spark.terasort.TeraGen</class>
      <jar>lib/spark-terasort.jar</jar>
      <arg>1g</arg>
      <arg>/user/davidw/terasort-benchmark.in</arg>
    </spark>
    <ok to="spark-94fc"/>
    <error to="Kill"/>
  </action>
  <action name="spark-94fc">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <prepare>
        <delete path="${nameNode}/user/davidw/terasort-benchmark.out"/>
      </prepare>
      <master>local[*]</master>
      <mode>client</mode>
      <name>TeraSort</name>
      <class>com.github.ehiggs.spark.terasort.TeraSort</class>
      <jar>lib/spark-terasort.jar</jar>
      <arg>/user/davidw/terasort-benchmark.in</arg>
      <arg>/user/davidw/terasort-benchmark.out</arg>
    </spark>
    <ok to="spark-bcf9"/>
    <error to="Kill"/>
  </action>
  <action name="spark-bcf9">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master>local[*]</master>
      <mode>client</mode>
      <name>TeraValidate</name>
      <class>com.github.ehiggs.spark.terasort.TeraValidate</class>
      <jar>lib/spark-terasort.jar</jar>
      <arg>/user/davidw/terasort-benchmark.out</arg>
    </spark>
    <ok to="End"/>
    <error to="Kill"/>
  </action>
  <end name="End"/>
</workflow-app>
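
This thread never pins down why the rebuilt workflow works. For what it's worth, this IllegalAccessError is commonly reported when a newer Guava ends up on the classpath ahead of the one Hadoop was built against: Hadoop 2.x's FileInputFormat calls the public Stopwatch constructor from an older Guava, and later Guava releases no longer expose that constructor. If that is what happened here (an assumption, not something confirmed in this thread), and if the application jar is built with Maven and bundles its own Guava (also an assumption), one possible mitigation is to relocate Guava inside the jar with the maven-shade-plugin. A minimal sketch of such a pom.xml fragment follows; the `shaded.` package prefix and the plugin version are illustrative choices, not taken from the spark-terasort project.

```xml
<!-- Hypothetical pom.xml fragment: goes inside <build><plugins> of the application's pom. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Illustrative version; use whichever release your build already standardizes on. -->
  <version>3.2.4</version>
  <executions>
    <execution>
      <!-- Run the shade goal when the jar is packaged. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite the bundled Guava classes (and references to them in the
               application code) into a private package, so they cannot shadow
               the Guava version that Hadoop's own classes expect. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <!-- "shaded." is an arbitrary prefix chosen for this sketch. -->
            <shadedPattern>shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, only the application's own references are rewritten, so org.apache.hadoop.mapreduce.lib.input.FileInputFormat would still resolve com.google.common.base.Stopwatch from the cluster's Guava. Whether this is needed at all depends on what the jar actually bundles, which is worth checking before changing the build.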
 
