New Contributor
Posts: 1
Registered: ‎01-30-2019

Error submitting workflow in HUE

I get an ImportError when I try to submit a workflow containing a Spark action with Hue.

The log is below: 

... INFO  org.apache.spark.util.SignalUtils  - Registered signal handler for TERM
2019-01-30 16:31:46,443 [main] INFO  org.apache.spark.util.SignalUtils  - Registered signal handler for HUP
2019-01-30 16:31:46,443 [main] INFO  org.apache.spark.util.SignalUtils  - Registered signal handler for INT
2019-01-30 16:31:47,059 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Preparing Local resources
2019-01-30 16:31:48,010 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - ApplicationAttemptId: appattempt_1548870193608_0036_000001
2019-01-30 16:31:48,019 [main] INFO  org.apache.spark.SecurityManager  - Changing view acls to: yarn,hdfs
2019-01-30 16:31:48,019 [main] INFO  org.apache.spark.SecurityManager  - Changing modify acls to: yarn,hdfs
2019-01-30 16:31:48,020 [main] INFO  org.apache.spark.SecurityManager  - Changing view acls groups to: 
2019-01-30 16:31:48,020 [main] INFO  org.apache.spark.SecurityManager  - Changing modify acls groups to: 
2019-01-30 16:31:48,021 [main] INFO  org.apache.spark.SecurityManager  - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
2019-01-30 16:31:48,044 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Starting the user application in a separate Thread
2019-01-30 16:31:48,048 [main] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Waiting for spark context initialization...
Traceback (most recent call last):
  File "mover.py", line 7, in <module>
    import happybase
ImportError: No module named happybase
2019-01-30 16:31:48,169 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster  - User application exited with status 1
2019-01-30 16:31:48,172 [Driver] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Final app status: FAILED, exitCode: 1, (reason: User application exited with status 1)
2019-01-30 16:31:48,179 [main] ERROR org.apache.spark.deploy.yarn.ApplicationMaster  - Uncaught exception: 
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:454)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:296)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:223)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:223)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:223)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:802)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:801)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:222)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:835)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.SparkUserAppException: User application exited with 1
	at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:105)
	at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:701)
2019-01-30 16:31:48,191 [pool-4-thread-1] INFO  org.apache.spark.util.ShutdownHookManager  - Shutdown hook called
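The traceback shows that the Python interpreter launched inside the YARN container cannot see happybase. As a quick check (this is a hypothetical diagnostic, not part of the original mover.py; the function name is illustrative), something like the following can be placed at the top of the script to confirm which interpreter the container actually launched:

```python
import sys


def report_interpreter():
    """Print which Python the Spark container launched, to verify that the
    spark.yarn.appMasterEnv.PYSPARK_PYTHON / spark.executorEnv.PYSPARK_PYTHON
    settings actually took effect."""
    info = {
        # expected: /opt/env_cluster/bin/python2 if the workflow properties apply
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": sys.version.split()[0],
    }
    for key, value in sorted(info.items()):
        print("%s: %s" % (key, value))
    return info


if __name__ == "__main__":
    report_interpreter()
    try:
        import happybase  # noqa: F401 -- the import that fails under Oozie
    except ImportError as exc:
        print("happybase not importable from %s: %s" % (sys.executable, exc))
```

If the printed executable is the system Python rather than the virtualenv one, the environment variables from the workflow configuration are not reaching the container.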

On my cluster I have a Python virtualenv environment with all my dependencies, and the cluster is configured following Cloudera's instructions for Spark: https://www.cloudera.com/documentation/enterprise/latest/topics/spark_python.html

When I use the spark-submit command from the console, I can run my app without any problems.

The problem only appears with Hue.

While researching, I found this article http://www.learn4master.com/big-data/pyspark/run-pyspark-on-oozie and tried the same approach, without success.

The workflow XML generated by Hue is:

<workflow-app name="Copy by hour" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-c88a"/>
  <kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <action name="spark-c88a" retry-max="1" retry-interval="1">
    <spark xmlns="uri:oozie:spark-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>spark.executorEnv.PYSPARK_PYTHON</name>
          <value>/opt/env_cluster/bin/python2</value>
        </property>
        <property>
          <name>spark.yarn.appMasterEnv.PYSPARK_PYTHON</name>
          <value>/opt/env_cluster/bin/python2</value>
        </property>
      </configuration>
      <master>yarn</master>
      <mode>cluster</mode>
      <name>landing_to_daily</name>
      <jar>mover.py</jar>
      <arg>1</arg>
      <arg>-s</arg>
      <arg>eir_landing</arg>
      <arg>-d</arg>
      <arg>eir_daily</arg>
      <file>/user/spark/eir/apps/mover.py#mover.py</file>
    </spark>
    <ok to="End"/>
    <error to="email-77d4"/>
  </action>
  <action name="email-77d4">
    <email xmlns="uri:oozie:email-action:0.2">
      <to>prueba@mail.com</to>
      <subject>Error | Copy by hour</subject>
      <body>Error in Workflow landing to daily</body>
      <content_type>text/plain</content_type>
    </email>
    <ok to="Kill"/>
    <error to="Kill"/>
  </action>
  <end name="End"/>
</workflow-app>
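The workflow sets PYSPARK_PYTHON for both the application master and the executors, so one thing worth ruling out is that the virtualenv actually exists at that path on every NodeManager host: in yarn-cluster mode the driver can be scheduled on any node, and a node without the virtualenv falls back to a Python that lacks happybase. A hypothetical helper like this (VENV_PYTHON and can_import are illustrative names, not part of the workflow) can be run on each host to verify:

```python
import subprocess
import sys

# Path configured in the workflow above; must exist on every NodeManager host.
VENV_PYTHON = "/opt/env_cluster/bin/python2"


def can_import(python_bin, module):
    """Return True if `module` imports cleanly under the interpreter at
    `python_bin`, by running `python_bin -c "import <module>"` as a
    subprocess and checking its exit code."""
    result = subprocess.call([python_bin, "-c", "import %s" % module])
    return result == 0


if __name__ == "__main__":
    # Run on each cluster node, e.g.: python check_env.py
    print("happybase importable: %s" % can_import(VENV_PYTHON, "happybase"))
```

If the check fails on any node, the virtualenv needs to be replicated there (or distributed with the job), regardless of what the workflow configuration says.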