
Spark from Oozie

New Contributor

Hi,

I'm trying to run a Spark action from Oozie, but I get the error below. I believe the ShareLib is set up correctly, and I have uploaded my jar to HDFS. I have no idea where the error is. Any clues?

  Log Length: 4246
  Using properties file: null
  Parsed arguments:
  master                  yarn
  deployMode              cluster
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               Csv2Parquet
  primaryResource         hdfs://cnsalbsrvcl12.lvtc.gsnet.corp/tmp/ExamplesSpark-0.0.1-SNAPSHOT.jar
  name                    MySpark
  childArgs               []
  jars                    null
  packages                null
  repositories            null
  verbose                 true
  
  Spark properties used, including those specified through
  --conf and those from the properties file null:
  
  
  
  Main class:
  org.apache.spark.deploy.yarn.Client
  Arguments:
  --name
  MySpark
  --jar
  hdfs://xxxxxx/tmp/ExamplesSpark-0.0.1-SNAPSHOT.jar
  --class
  Csv2Parquet
  System properties:
  SPARK_SUBMIT -> true
  spark.app.name -> MySpark
  spark.master -> yarn-cluster
  Classpath elements:
  
  
  
  Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application finished with failed status
  org.apache.spark.SparkException: Application finished with failed status
  at org.apache.spark.deploy.yarn.Client.run(Client.scala:626)
  at org.apache.spark.deploy.yarn.Client$.main(Client.scala:651)
  at org.apache.spark.deploy.yarn.Client.main(Client.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2 REPLIES

Re: Spark from Oozie

New Contributor
I checked all the logs in Oozie, YARN, and Spark, and I finally found the real error in the YARN logs: the output directory already existed in HDFS.
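For anyone who lands here with the same failure, one way to avoid it is to clear the output directory before the job runs. The following is only a minimal sketch, not what was actually used in this thread: it assumes the `hdfs` CLI is on the PATH, and the function names `run_cli` and `remove_output_dir` are made up for illustration. Oozie actions also accept a `<prepare>` block with a `<delete path="..."/>` element, which may be the simpler option inside a workflow.

```python
import subprocess
from typing import Callable, List

# A runner takes a command (list of strings) and returns its exit code.
Runner = Callable[[List[str]], int]


def run_cli(cmd: List[str]) -> int:
    """Run a command and return its exit code (assumes `hdfs` is on the PATH)."""
    return subprocess.run(cmd).returncode


def remove_output_dir(path: str, run: Runner = run_cli) -> bool:
    """Delete `path` from HDFS if it exists. Return True if it was removed."""
    # `hdfs dfs -test -e PATH` exits with 0 when the path exists.
    if run(["hdfs", "dfs", "-test", "-e", path]) != 0:
        return False
    # `-r` removes the directory recursively.
    if run(["hdfs", "dfs", "-rm", "-r", path]) != 0:
        raise RuntimeError(f"could not delete {path}")
    return True
```

Calling `remove_output_dir("/tmp/my-output")` before submitting the workflow would clear a leftover directory; that path is a placeholder, since the thread does not say which directory the job writes to. The `run` parameter exists only so the logic can be exercised without a cluster.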

Re: Spark from Oozie

Master Guru
Thank you for following up with the cause and resolution. Please feel free to mark your update as the solution to this thread, so that other users with similar problems who land here can figure it out too.