Created on 08-08-2016 12:00 AM - edited 08-08-2016 04:18 AM
We are using an Oozie workflow with a Spark action in YARN mode on CDH 5.8.0. When a job starts, it spends a long time uploading the jars under '/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/'. This step takes approximately 5 minutes.
Here are several lines from the output log:
2016-08-08 19:10:30,312 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/hadoop-annotations.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-annotations.jar
2016-08-08 19:10:31,921 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/hadoop-auth.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-auth.jar
2016-08-08 19:10:32,911 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/hadoop-aws.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-aws.jar
...
2016-08-08 19:12:14,041 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/hadoop-hdfs-nfs.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-hdfs-nfs.jar
2016-08-08 19:12:14,916 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/hadoop-hdfs-tests.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-hdfs-tests.jar
2016-08-08 19:12:17,184 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/hadoop-hdfs.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-hdfs.jar
2016-08-08 19:12:20,331 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-hdfs/hadoop-hdfs-nfs-2.6.0-cdh5.8.0.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-hdfs-nfs-2.6.0-cdh5.8.0.jar
...
2016-08-08 19:12:40,483 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/hadoop-yarn-api.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-yarn-api.jar
2016-08-08 19:12:41,400 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-yarn-applications-distributedshell.jar
2016-08-08 19:12:42,386 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-yarn-applications-unmanaged-am-launcher.jar
2016-08-08 19:12:43,615 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/hadoop-yarn-client.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-yarn-client.jar
2016-08-08 19:12:44,632 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-yarn/hadoop-yarn-common.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-yarn-common.jar
...
2016-08-08 19:13:50,199 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/apacheds-i18n-2.0.0-M15.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/apacheds-i18n-2.0.0-M15.jar
2016-08-08 19:13:51,934 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/apacheds-kerberos-codec-2.0.0-M15.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/apacheds-kerberos-codec-2.0.0-M15.jar
2016-08-08 19:13:53,658 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/api-asn1-api-1.0.0-M20.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/api-asn1-api-1.0.0-M20.jar
2016-08-08 19:13:55,297 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/api-util-1.0.0-M20.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/api-util-1.0.0-M20.jar
2016-08-08 19:13:56,768 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/commons-beanutils-1.7.0.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/commons-beanutils-1.7.0.jar
...
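To see where the time goes, the gaps between consecutive "Uploading resource" entries can be measured. The following is only a rough sketch in Python, not part of Oozie or Spark: the names `LOG_RE`, `parse_uploads`, and `upload_gaps` are made up for illustration, and it assumes each log entry sits on its own line in the format shown above. The gap to the next entry is used as an approximation of how long each jar took to upload.

```python
import re
from datetime import datetime

# Matches the "Uploading resource" entries in the format pasted above.
# Group 1: timestamp, group 2: local source URI, group 3: HDFS destination.
LOG_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO .*?"
    r"Client: Uploading resource (\S+) -> (\S+)"
)


def parse_uploads(lines):
    """Return a list of (timestamp, jar_name) for every upload entry."""
    uploads = []
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            jar_name = match.group(2).rsplit("/", 1)[-1]
            uploads.append((ts, jar_name))
    return uploads


def upload_gaps(uploads):
    """Approximate each upload's duration as the gap to the next entry.

    The last entry has no successor, so it is not included.
    """
    return [
        (name, (next_ts - ts).total_seconds())
        for (ts, name), (next_ts, _) in zip(uploads, uploads[1:])
    ]


# The first three entries from the log above, used as sample input.
SAMPLE = [
    "2016-08-08 19:10:30,312 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/hadoop-annotations.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-annotations.jar",
    "2016-08-08 19:10:31,921 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/hadoop-auth.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-auth.jar",
    "2016-08-08 19:10:32,911 INFO [main] org.apache.spark.deploy.yarn.Client: Uploading resource file:/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop/hadoop-aws.jar -> hdfs://ns/user/hdfs/.sparkStaging/application_1469502027340_0471/hadoop-aws.jar",
]

if __name__ == "__main__":
    # Print the slowest uploads first.
    gaps = upload_gaps(parse_uploads(SAMPLE))
    for name, seconds in sorted(gaps, key=lambda g: g[1], reverse=True):
        print(f"{seconds:7.3f}s  {name}")
```

In the sample entries each jar takes roughly one to two seconds, so the total delay appears to come from the number of jars uploaded rather than from any single large file.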
Can we skip uploading the Hadoop jars to speed up the workflow?