Error submitting workflow in HUE



I get an import error when I try to submit a workflow from Hue that contains a Spark action.

The log is below: 

 

 

ignalUtils  - Registered signal handler for TERM
2019-01-30 16:31:46,443 [main] INFO  org.apache.spark.util.SignalUtils  - Registered signal handler for HUP
2019-01-30 16:31:46,443 [main] INFO  org.apache.spark.util.SignalUtils  - Registered signal handler for INT
2019-01-30 16:31:47,059 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Preparing Local resources
2019-01-30 16:31:48,010 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - ApplicationAttemptId: appattempt_1548870193608_0036_000001
2019-01-30 16:31:48,019 [main] INFO  org.apache.spark.SecurityManager  - Changing view acls to: yarn,hdfs
2019-01-30 16:31:48,019 [main] INFO  org.apache.spark.SecurityManager  - Changing modify acls to: yarn,hdfs
2019-01-30 16:31:48,020 [main] INFO  org.apache.spark.SecurityManager  - Changing view acls groups to: 
2019-01-30 16:31:48,020 [main] INFO  org.apache.spark.SecurityManager  - Changing modify acls groups to: 
2019-01-30 16:31:48,021 [main] INFO  org.apache.spark.SecurityManager  - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
2019-01-30 16:31:48,044 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Starting the user application in a separate Thread
2019-01-30 16:31:48,048 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Waiting for spark context initialization...
Traceback (most recent call last):
  File "mover.py", line 7, in <module>
    import happybase
ImportError: No module named happybase
2019-01-30 16:31:48,169 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster  - User application exited with status 1
2019-01-30 16:31:48,172 [Driver] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Final app status: FAILED, exitCode: 1, (reason: User application exited with status 1)
2019-01-30 16:31:48,179 [main] ERROR org.apache.spark.deploy.yarn.ApplicationMaster  - Uncaught exception: 
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:454)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:296)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:223)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:223)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:223)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:802)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:801)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:222)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:835)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.SparkUserAppException: User application exited with 1
	at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:105)
	at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:701)
2019-01-30 16:31:48,191 [pool-4-thread-1] INFO  org.apache.spark.util.ShutdownHookManager  - Shutdown hook called

On my cluster I have a Python virtualenv environment with all my dependencies, and the cluster is configured following the Cloudera instructions for Spark at https://www.cloudera.com/documentation/enterprise/latest/topics/spark_python.html

When I use the spark-submit command from the console, I can run my app without any problems.

The problem only appears with Hue.
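Since the traceback shows that the interpreter running the driver cannot import happybase, one way to narrow this down might be to log which Python executable the job actually uses. The following is only a sketch: `describe_interpreter` is a hypothetical helper, not part of mover.py, and it assumes the check runs before the `import happybase` line so that its output reaches the YARN log.

```python
from __future__ import print_function

import os
import sys


def describe_interpreter(module_name="happybase"):
    """Describe the running interpreter and whether module_name is importable.

    Hypothetical diagnostic helper; the default module name comes from the
    ImportError in the log.
    """
    info = {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        # None if the variable was not propagated to this process.
        "pyspark_python": os.environ.get("PYSPARK_PYTHON"),
        "module": module_name,
    }
    try:
        __import__(module_name)
        info["importable"] = True
    except ImportError:
        info["importable"] = False
    return info


if __name__ == "__main__":
    # Write to stderr so the lines appear next to the traceback in the log.
    for key, value in sorted(describe_interpreter().items()):
        print("{0}={1}".format(key, value), file=sys.stderr)
```

If `executable` is not the virtualenv's Python under Hue but is when launched with spark-submit, that would suggest the PYSPARK_PYTHON settings are not reaching the driver in the Oozie case.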

 

While researching, I found this article, http://www.learn4master.com/big-data/pyspark/run-pyspark-on-oozie, and tried to do the same thing, without success.

My workflow code generated by Hue is: 

 

<workflow-app name="Copy by hour" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-c88a"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="spark-c88a" retry-max="1" retry-interval="1">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>spark.executorEnv.PYSPARK_PYTHON</name>
                    <value>/opt/env_cluster/bin/python2</value>
                </property>
                <property>
                    <name>spark.yarn.appMasterEnv.PYSPARK_PYTHON</name>
                    <value>/opt/env_cluster/bin/python2</value>
                </property>
            </configuration>
            <master>yarn</master>
            <mode>cluster</mode>
            <name>landing_to_daily</name>
            <jar>mover.py</jar>
            <arg>1</arg>
            <arg>-s</arg>
            <arg>eir_landing</arg>
            <arg>-d</arg>
            <arg>eir_daily</arg>
            <file>/user/spark/eir/apps/mover.py#mover.py</file>
        </spark>
        <ok to="End"/>
        <error to="email-77d4"/>
    </action>
    <action name="email-77d4">
        <email xmlns="uri:oozie:email-action:0.2">
            <to>prueba@mail.com</to>
            <subject>Error | Copy by hour</subject>
            <body>Error in Workflow landing to daily </body>
            <content_type>text/plain</content_type>
        </email>
        <ok to="Kill"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
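
To double-check which settings the generated workflow really passes to the Spark action, a small script along these lines could list them. It is a sketch only: `spark_action_properties` and `SAMPLE` are hypothetical names, and `SAMPLE` is a trimmed copy of the workflow above rather than the full file.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <spark> element of the workflow.
SPARK_NS = "uri:oozie:spark-action:0.2"


def spark_action_properties(workflow_xml):
    """Return {name: value} for every <property> inside a Spark action."""
    root = ET.fromstring(workflow_xml)
    properties = {}
    # Children of <spark> inherit its default namespace, so each tag
    # must be looked up with the namespace-qualified name.
    for prop in root.iter("{%s}property" % SPARK_NS):
        name = prop.findtext("{%s}name" % SPARK_NS)
        value = prop.findtext("{%s}value" % SPARK_NS)
        properties[name] = value
    return properties


# Trimmed copy of the workflow, kept only for illustration.
SAMPLE = """<workflow-app name="Copy by hour" xmlns="uri:oozie:workflow:0.5">
<action name="spark-c88a">
<spark xmlns="uri:oozie:spark-action:0.2">
<configuration>
<property>
<name>spark.executorEnv.PYSPARK_PYTHON</name>
<value>/opt/env_cluster/bin/python2</value>
</property>
<property>
<name>spark.yarn.appMasterEnv.PYSPARK_PYTHON</name>
<value>/opt/env_cluster/bin/python2</value>
</property>
</configuration>
</spark>
</action>
</workflow-app>"""

if __name__ == "__main__":
    for name, value in sorted(spark_action_properties(SAMPLE).items()):
        print("%s = %s" % (name, value))
```

This only confirms what the XML declares. It does not show whether those values actually take effect in the driver process on YARN.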