Member since: 06-13-2018
Posts: 4
Kudos Received: 0
Solutions: 0
06-22-2018 07:29 AM
I'm having trouble rerunning failed workflows run by coordinators in Oozie. When I click the rerun button I get: "RERUN action for could not be completed". Checking the Oozie logs I see:

org.apache.oozie.servlet.XServletException: E0302: Invalid parameter [Multiple app paths specified, only one is allowed]

If I access the failed Oozie URL from the logs (after stripping the CGI params) I can see the job it's trying to submit. Both of the following are defined:

"conf":"<configuration>
  <property>
    <name>oozie.wf.application.path</name>
    <value>/mypath/workflow/my_workflow.xml</value> ...

"appPath":"/mypath/workflow/my_workflow.xml"

Is this the "multiple app paths" the error is referring to? Has anyone else seen this? I'm not adding configuration anywhere else that would specify the workflow.
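In case it helps narrow this down, the rerun I'd expect the button to be doing, expressed against the Oozie web-services API, is roughly the sketch below. The host, port, coordinator job id, and action number are placeholders, and the parameter names are from my reading of the Oozie docs rather than anything I've verified here:

import requests

# Placeholders: Oozie endpoint, coordinator job id, and the failed action number.
OOZIE_URL = "http://my-oozie-host:11000/oozie"
COORD_JOB_ID = "0000123-180613000000000-oozie-oozi-C"
FAILED_ACTION = "4"

# Rerun a single coordinator action: "coord-rerun" with type=action reruns by
# action number; refresh=true re-checks input dependencies first.
resp = requests.put(
    OOZIE_URL + "/v1/job/" + COORD_JOB_ID,
    params={
        "action": "coord-rerun",
        "type": "action",
        "scope": FAILED_ACTION,
        "refresh": "true",
    },
)
resp.raise_for_status()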
Labels:
Apache Oozie
06-13-2018 09:26 PM
I've added some logs. Client mode produces the same error as cluster mode.
06-13-2018 03:49 PM
I'm running spark-submit using the following command:

PYSPARK_PYTHON=./ROOT/myspark/bin/python /usr/hdp/current/spark2-client/bin/spark-submit \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./ROOT/myspark/bin/python \
  --master=yarn \
  --deploy-mode=cluster \
  --driver-memory=4g \
  --archives=myspark.zip#ROOT \
  --num-executors=32 \
  --packages com.databricks:spark-avro_2.11:4.0.0 \
  foo.py

myspark.zip is a zipped conda environment. It was created using Python with the zipfile package; the files are stored without deflation. foo.py is my application code.

This normally works, but if myspark.zip is greater than 2 GB I get:

java.util.zip.ZipException: invalid CEN header (bad signature)

My Java version is jdk1.8.0_112. It looks like older versions of Java had this issue, but not my current one. I've written a test Java class using java.util.zip which is able to unzip myspark.zip OK. I've checked that all my processes use the above version of Java.

YARN logs on the console after the above command (I've tried both --deploy-mode=cluster and --deploy-mode=client):

18/06/13 16:00:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/13 16:00:23 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
18/06/13 16:00:23 INFO RMProxy: Connecting to ResourceManager at myhost2.myfirm.com/10.87.11.17:8050
18/06/13 16:00:23 INFO Client: Requesting a new application from cluster with 6 NodeManagers
18/06/13 16:00:23 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (221184 MB per container)
18/06/13 16:00:23 INFO Client: Will allocate AM container, with 18022 MB memory including 1638 MB overhead
18/06/13 16:00:23 INFO Client: Setting up container launch context for our AM
18/06/13 16:00:23 INFO Client: Setting up the launch environment for our AM container
18/06/13 16:00:23 INFO Client: Preparing resources for our AM container
18/06/13 16:00:24 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs://myhost.myfirm.com:8020/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz
18/06/13 16:00:24 INFO Client: Source and destination file systems are the same. Not copying hdfs://myhost.myfirm.com:8020/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz
18/06/13 16:00:24 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/com.databricks_spark-avro_2.11-4.0.0.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.slf4j_slf4j-api-1.7.5.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.apache.avro_avro-1.7.6.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.apache.avro_avro-1.7.6.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.codehaus.jackson_jackson-core-asl-1.9.13.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/com.thoughtworks.paranamer_paranamer-2.3.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.xerial.snappy_snappy-java-1.0.5.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.apache.commons_commons-compress-1.4.1.jar
18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.tukaani_xz-1.0.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.tukaani_xz-1.0.jar
18/06/13 16:00:26 INFO Client: Source and destination file systems are the same. Not copying hdfs:/user/myuser/release/alphagenspark.zip#ROOT
18/06/13 16:00:26 INFO Client: Uploading resource file:/my/script/dir/spark/alphagen/foo.py -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/foo.py
18/06/13 16:00:26 INFO Client: Uploading resource file:/usr/hdp/current/spark2-client/python/lib/pyspark.zip -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/pyspark.zip
18/06/13 16:00:26 INFO Client: Uploading resource file:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.4-src.zip -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/py4j-0.10.4-src.zip
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.apache.avro_avro-1.7.6.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar added multiple times to distributed cache.
18/06/13 16:00:26 WARN Client: Same path resource file:/home/myuser/.ivy2/jars/org.tukaani_xz-1.0.jar added multiple times to distributed cache.
18/06/13 16:00:27 INFO Client: Uploading resource file:/tmp/spark-6c26ae3b-7248-488f-bc33-9766251474bb/__spark_conf__4405623606341803690.zip -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/__spark_conf__.zip
18/06/13 16:00:27 INFO SecurityManager: Changing view acls to: myuser
18/06/13 16:00:27 INFO SecurityManager: Changing modify acls to: myuser
18/06/13 16:00:27 INFO SecurityManager: Changing view acls groups to:
18/06/13 16:00:27 INFO SecurityManager: Changing modify acls groups to:
18/06/13 16:00:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(myuser); groups with view permissions: Set(); users with modify permissions: Set(myuser); groups with modify permissions: Set()
18/06/13 16:00:27 INFO Client: Submitting application application_1528901858967_0019 to ResourceManager
18/06/13 16:00:27 INFO YarnClientImpl: Submitted application application_1528901858967_0019
18/06/13 16:00:28 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:28 INFO Client:
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1528923627110
     final status: UNDEFINED
     tracking URL: http://myhost2.myfirm.com:8088/proxy/application_1528901858967_0019/
     user: myuser
18/06/13 16:00:29 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:30 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:31 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:32 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:33 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:34 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:35 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:36 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:37 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:38 INFO Client: Application report for application_1528901858967_0019 (state: ACCEPTED)
18/06/13 16:00:39 INFO Client: Application report for application_1528901858967_0019 (state: FAILED)
18/06/13 16:00:39 INFO Client:
     client token: N/A
     diagnostics: Application application_1528901858967_0019 failed 2 times due to AM Container for appattempt_1528901858967_0019_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://myhost2.myfirm.com:8088/cluster/app/application_1528901858967_0019 Then click on links to logs of each attempt.
Diagnostics: java.util.zip.ZipException: invalid CEN header (bad signature)
Failing this attempt.
Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1528923627110
     final status: FAILED
     tracking URL: http://myhost2.myfirm.com:8088/cluster/app/application_1528901858967_0019
     user: myuser
18/06/13 16:00:39 INFO Client: Deleted staging directory hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019
Exception in thread "main" org.apache.spark.SparkException: Application application_1528901858967_0019 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:782)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/06/13 16:00:39 INFO ShutdownHookManager: Shutdown hook called
18/06/13 16:00:39 INFO ShutdownHookManager: Deleting directory /tmp/spark-6c26ae3b-7248-488f-bc33-9766251474bb

Has anyone seen this before?
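For completeness, myspark.zip is built along the lines of the simplified sketch below; the paths are placeholders and the real script walks the whole conda environment:

import os
import zipfile

ENV_DIR = "/path/to/conda/envs/myspark"   # placeholder for the real conda env
OUT_ZIP = "myspark.zip"

# Entries are stored without deflation (ZIP_STORED). allowZip64=True lets
# zipfile write ZIP64 extensions once the archive grows past the classic zip
# size limits (the flag defaults to False on Python 2's zipfile).
base = os.path.dirname(ENV_DIR)  # keep the leading "myspark/" in archive paths
with zipfile.ZipFile(OUT_ZIP, "w", compression=zipfile.ZIP_STORED, allowZip64=True) as zf:
    for root, _dirs, files in os.walk(ENV_DIR):
        for name in files:
            full_path = os.path.join(root, name)
            # Entries come out as myspark/bin/python etc., so after YARN unpacks
            # the archive as ROOT, PYSPARK_PYTHON=./ROOT/myspark/bin/python resolves.
            zf.write(full_path, os.path.relpath(full_path, base))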
Labels:
Apache Spark