Created 08-30-2016 08:15 AM
Sandbox HDP-2.5.0 Spark 2.0.0 - Spark Submit Yarn Cluster Mode -- Spark Shell LzoCodec not found
I have installed Spark 2.0.0 in Sandbox HDP-2.5.0, following Paul Hargis's great post.
Thanks Paul.
Spark-Submit in Yarn-Client mode works as per log here:
[root@sandbox ~]# cd /usr/hdp/current/spark2-client
[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/28 14:38:42 INFO spark.SparkContext: Running Spark version 2.0.0
16/08/28 14:38:42 INFO spark.SecurityManager: Changing view acls to: root
16/08/28 14:38:42 INFO spark.SecurityManager: Changing modify acls to: root
16/08/28 14:38:42 INFO spark.SecurityManager: Changing view acls groups to:
16/08/28 14:38:42 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/28 14:38:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/28 14:38:43 INFO util.Utils: Successfully started service 'sparkDriver' on port 36008.
16/08/28 14:38:43 INFO spark.SparkEnv: Registering MapOutputTracker
16/08/28 14:38:43 INFO spark.SparkEnv: Registering BlockManagerMaster
16/08/28 14:38:43 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-b5149ef4-928d-455e-bf83-2159e12f88f7
16/08/28 14:38:43 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB
16/08/28 14:38:43 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/08/28 14:38:43 INFO util.log: Logging initialized @2226ms
16/08/28 14:38:43 INFO server.Server: jetty-9.2.z-SNAPSHOT
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e1e5b02{/jobs,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@ae918c9{/jobs/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d5a39b7{/jobs/job,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5e83450d{/jobs/job/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7c2a88f4{/stages,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4c858adb{/stages/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@535f571c{/stages/stage,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@18501a07{/stages/stage/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@32dcce09{/stages/pool,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3e5acaf5{/stages/pool/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3ac2bace{/storage,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@46764885{/storage/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7f9337e6{/storage/rdd,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1a3b1e79{/storage/rdd/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1f4da763{/environment,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@232864a3{/environment/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@30e71b5d{/executors,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@14b58fc0{/executors/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bf090df{/executors/threadDump,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4eb72ecd{/executors/threadDump/json,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5c61bd1a{/static,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@14c62558{/,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5cbdbf0f{/api,null,AVAILABLE}
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2d4aa15a{/stages/stage/kill,null,AVAILABLE}
16/08/28 14:38:43 INFO server.ServerConnector: Started ServerConnector@51fcbb35{HTTP/1.1}{0.0.0.0:4041}
16/08/28 14:38:43 INFO server.Server: Started @2388ms
16/08/28 14:38:43 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4041
16/08/28 14:38:43 INFO spark.SparkContext: Added JAR file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar at spark://10.0.2.15:36008/jars/spark-examples_2.11-2.0.0.jar with timestamp 1472395123767
16/08/28 14:38:44 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers0.0.2.15:8050
16/08/28 14:38:44 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/28 14:38:44 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/08/28 14:38:44 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/28 14:38:44 INFO yarn.Client: Preparing resources for our AM container
16/08/28 14:38:44 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
020/user/root/.sparkStaging/application_1472394965674_0001/__spark_libs__6748274495232790272.zip419767250f0/__spark_libs__6748274495232790272.zip -> hdfs://sandbox.hortonworks.com:8
16/08/28 14:38:48 INFO yarn.Client: Uploading resource file:/tmp/spark-a10e8972-1076-4a61-a014-8419767250f0/__spark_conf__6530127439911581770.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0001/__spark_conf__.zip
16/08/28 14:38:48 INFO spark.SecurityManager: Changing modify acls to: root
16/08/28 14:38:48 INFO spark.SecurityManager: Changing view acls groups to:
16/08/28 14:38:48 INFO spark.SecurityManager: Changing modify acls groups to:
); users with modify permissions: Set(root); groups with modify permissions: Set()led; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(
16/08/28 14:38:48 INFO yarn.Client: Submitting application application_1472394965674_0001 to ResourceManager
16/08/28 14:38:48 INFO impl.YarnClientImpl: Submitted application application_1472394965674_0001
16/08/28 14:38:49 INFO yarn.Client: Application report for application_1472394965674_0001 (state: ACCEPTED)ation_1472394965674_0001 and attemptId None
16/08/28 14:38:49 INFO yarn.Client:
client token: N/A
ApplicationMaster host: N/As launched, waiting for AM container to Register with RM
ApplicationMaster RPC port: -1
queue: default
final status: UNDEFINED18
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0001/
user: root
16/08/28 14:38:51 INFO yarn.Client: Application report for application_1472394965674_0001 (state: ACCEPTED)
16/08/28 14:38:52 INFO yarn.Client: Application report for application_1472394965674_0001 (state: ACCEPTED)
16/08/28 14:38:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
PROXY_URI_BASES -> http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0001), /proxy/application_1472394965674_0001lter, Map(PROXY_HOSTS -> sandbox.hortonworks.com,
16/08/28 14:38:52 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/08/28 14:38:53 INFO yarn.Client: Application report for application_1472394965674_0001 (state: RUNNING)
16/08/28 client token: N/An.Client:
diagnostics: N/A
ApplicationMaster host: 10.0.2.15
queue: defaultter RPC port: 0
start time: 1472395128618
final status: UNDEFINED
user: rootRL: http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0001/
16/08/28 14:38:53 INFO cluster.YarnClientSchedulerBackend: Application application_1472394965674_0001 has started running.
16/08/28 14:38:53 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35756.
16/08/28 14:38:53 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.2.15, 35756)
16/08/28 14:38:53 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:35756 with 912.3 MB RAM, BlockManagerId(driver, 10.0.2.15, 35756)
16/08/28 14:38:53 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.2.15, 35756)
16/08/28 14:38:54 INFO scheduler.EventLoggingListener: Logging events to hdfs:///spark-history/application_1472394965674_0001
16/08/28 14:38:56 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.0.2.15:36932) with ID 1
16/08/28 14:38:56 INFO storage.BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:41061 with 912.3 MB RAM, BlockManagerId(1, sandbox.hortonworks.com, 4106
16/08/28 14:38:57 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.0.2.15:36936) with ID 2
16/08/28 14:38:57 INFO storage.BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:41746 with 912.3 MB RAM, BlockManagerId(2, sandbox.hortonworks.com, 41746)
16/08/28 14:38:57 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.ter reached minRegisteredResourcesRatio: 0.8
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@46a61277{/SQL,null,AVAILABLE}
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b4b5885{/SQL/json,null,AVAILABLE}
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2bcd7bea{/SQL/execution/json,null,AVAILABLE}
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@59bde227{/static/sql,null,AVAILABLE}
16/08/28 14:38:57 INFO internal.SharedState: Warehouse path is 'file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse'.
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
16/08/28 14:38:57 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 912.3 MB)
16/08/28 14:38:57 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1169.0 B, free 912.3 MB)
16/08/28 14:38:57 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012size: 1169.0 B, free: 912.3 MB)
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
16/08/28 14:38:57 INFO cluster.YarnScheduler: Adding task set 0.0 with 10 tasks
16/08/28 14:38:57 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, sandbox.hortonworks.com, partition 1, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:58 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: sandbox.hortonworks.com.
16/08/28 14:38:58 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:58 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox.hortonworks.com:41746 (size: 1169.0 B, free: 912.3 MB)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, sandbox.hortonworks.com, partition 2, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 2 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 3 on executor id: 2 hostname: sandbox.hortonworks.com.5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1084 ms on sandbox.hortonworks.com (1/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1061 ms on sandbox.hortonworks.com (2/10)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 4 on executor id: 1 hostname: sandbox.hortonworks.com.5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 88 ms on sandbox.hortonworks.com (3/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, sandbox.hortonworks.com, partition 5, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 101 ms on sandbox.hortonworks.com (4/10)works.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, sandbox.hortonworks.com, partition 6, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 6 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, sandbox.hortonworks.com, partition 7, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 7 on executor id: 2 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 48 ms on sandbox.hortonworks.com (6/10)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 8 on executor id: 1 hostname: sandbox.hortonworks.com.5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 48 ms on sandbox.hortonworks.com (7/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, sandbox.hortonworks.com, partition 9, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 40 ms on sandbox.hortonworks.com (8/10)nworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 38 ms on sandbox.hortonworks.com (9/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 31 ms on sandbox.hortonworks.com (10/10)
16/08/28 14:38:59 INFO scheduler.DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 1.293 s
16/08/28 14:38:59 INFO scheduler.DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 1.605653 s
Pi is roughly 3.1418151418151417
16/08/28 14:38:59 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2d4aa15a{/stages/stage/kill,null,UNAVAILABLE}
Spark-Submit in Yarn-cluster mode fails as per log here:
[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/28 14:41:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/28 14:41:08 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/28 14:41:08 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/28 14:41:09 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/28 14:41:09 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/28 14:41:09 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/28 14:41:09 INFO yarn.Client: Setting up container launch context for our AM
16/08/28 14:41:09 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/28 14:41:09 INFO yarn.Client: Preparing resources for our AM container
16/08/28 14:41:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/28 14:41:10 INFO yarn.Client: Uploading resource file:/tmp/spark-e72e7961-7ec9-4282-806d-9d95e2d7f0fc/__spark_libs__4204158628332382181.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002/__spark_libs__4204158628332382181.zip
16/08/28 14:41:11 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002/spark-examples_2.11-2.0.0.jar
16/08/28 14:41:12 INFO yarn.Client: Uploading resource file:/tmp/spark-e72e7961-7ec9-4282-806d-9d95e2d7f0fc/__spark_conf__2789110900476377363.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002/__spark_conf__.zip
16/08/28 14:41:12 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/28 14:41:12 INFO spark.SecurityManager: Changing view acls to: root
16/08/28 14:41:12 INFO spark.SecurityManager: Changing modify acls to: root
16/08/28 14:41:12 INFO spark.SecurityManager: Changing view acls groups to:
16/08/28 14:41:12 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/28 14:41:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/28 14:41:12 INFO yarn.Client: Submitting application application_1472394965674_0002 to ResourceManager
16/08/28 14:41:12 INFO impl.YarnClientImpl: Submitted application application_1472394965674_0002
16/08/28 14:41:13 INFO yarn.Client: Application report for application_1472394965674_0002 (state: ACCEPTED)
16/08/28 14:41:13 INFO yarn.Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1472395272580
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0002/
user: root
16/08/28 14:41:14 INFO yarn.Client: Application report for application_1472394965674_0002 (state: ACCEPTED)
16/08/28 14:41:15 INFO yarn.Client: Application report for application_1472394965674_0002 (state: FAILED)
16/08/28 14:41:15 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1472394965674_0002 failed 2 times due to AM Container for appattempt_1472394965674_0002_000002 exited with exitCode: 1
For more detailed output, check the application tracking page: http://sandbox.hortonworks.com:8088/cluster/app/application_1472394965674_0002 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e17_1472394965674_0002_02_000001
Exit code: 1
Exception message: /hadoop/yarn/local/usercache/root/appcache/application_1472394965674_0002/container_e17_1472394965674_0002_02_000001/launch_container.sh: line 25: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop- doop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /hadoop/yarn/local/usercache/root/appcache/application_1472394965674_0002/container_e17_1472394965674_0002_02_000001/launch_container.sh: line 25: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-f yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
at org.apache.hadoop.util.Shell.run(Shell.java:820)va:909)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1099)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81))
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.lang.Thread.run(Thread.java:745)or$Worker.run(ThreadPoolExecutor.java:615)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
start time: 1472395272580
final status: FAILED
tracking URL: http://sandbox.hortonworks.com:8088/cluster/app/application_1472394965674_0002
16/08/28 14:41:15 INFO yarn.Client: Deleting staging directory hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002
Exception in thread "main" org.apache.spark.SparkException: Application application_1472394965674_0002 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1132)
at org.apache.spark.deploy.yarn.Client.main(Client.scala):1175)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at java.lang.reflect.Method.invoke(Method.java:606)DelegatingMethodAccessorImpl.java:43)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)0)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/08/28 14:41:15 INFO util.ShutdownHookManager: Shutdown hook called
[root@sandbox spark2-client]# utdownHookManager: Deleting directory /tmp/spark-e72e7961-7ec9-4282-806d-9d95e2d7f0fc
Any help to resolve this would be appreciated.
In Spark-Shell mode, called with the following command:
[root@sandbox spark2-client]# ./bin/spark-shell --master yarn
I am encountering a LzoCodec not found error, as per log here:
[root@sandbox spark2-client]# ./bin/spark-shell --master yarn
Setting default log level to "WARN".
16/08/28 14:44:42 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/28 14:44:54 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://10.0.2.15:4041
Spark session available as 'spark'.ster = yarn, app id = application_1472394965674_0003).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
scala> val file = sc.textFile("/tmp/data")
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24
java.lang.RuntimeException: Error in configuring object)).map(word => (word, 1)).reduceByKey(_ + _)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:186).java:136)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:248)
at scala.Option.getOrElse(Option.scala:121)ions$2.apply(RDD.scala:246)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:248)D.scala:35)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:246)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:328)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)tions.scala:328)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
... 48 elided.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:327)
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:180)
... 83 morehe.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
... 85 morehe.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
scala>
Any help to resolve this would be appreciated.
Thanks.
Amit
Created 08-30-2016 01:01 PM
Resolution for the Spark Submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/ that sets hdp.version:
[root@sandbox conf]# cat java-opts
-Dhdp.version=2.5.0.0-817
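In case it helps, here is a minimal sketch of creating that file from the shell. The function name write_java_opts is hypothetical (it is not part of Spark or HDP), and the directory and version string in the usage comment are simply the ones from this sandbox; they would differ on another install.

```shell
# Hypothetical helper: write a java-opts file that sets hdp.version.
# $1 = Spark conf directory, $2 = HDP version string.
write_java_opts() {
  conf_dir="$1"
  hdp_version="$2"
  # Using '%s\n' keeps printf from parsing the leading "-D" as an option.
  printf '%s\n' "-Dhdp.version=${hdp_version}" > "${conf_dir}/java-opts"
}

# Usage on this sandbox (values taken from the post above):
#   write_java_opts /usr/hdp/current/spark2-client/conf 2.5.0.0-817
```

Passing the directory and version as arguments keeps the build number out of the helper itself, so the same function can be reused after an HDP upgrade.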
Spark Submit working example:
[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:02 INFO yarn.Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1472492701409
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
user: root
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:06 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.0.2.15
ApplicationMaster RPC port: 0
queue: default
start time: 1472492701409
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
user: root
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:09 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:10 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:11 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:12 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:13 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:14 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:15 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:16 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:17 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:18 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:19 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:20 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:21 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:22 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:23 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:24 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:25 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:26 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:27 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:28 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:29 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:30 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:31 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:32 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:33 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:34 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:35 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)
16/08/29 17:45:37 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.0.2.15
ApplicationMaster RPC port: 0
queue: default
start time: 1472492701409
final status: SUCCEEDED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
user: root
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b
[root@sandbox spark2-client]#
Resolution for Spark Shell issue (lzo-codec): add the following 2 lines in your spark-defaults.conf
spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
Spark Shell working example:
[root@sandbox spark2-client]# ./bin/spark-shell --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1 Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). 16/08/29 17:47:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 16/08/29 17:47:21 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect. Spark context Web UI available at http://10.0.2.15:4041 Spark context available as 'sc' (master = yarn, app id = application_1472397144295_0007). Spark session available as 'spark'. Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.0.0 /_/ Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101) Type in expressions to have them evaluated. Type :help for more information. scala> sc.getConf.getAll.foreach(println) (spark.eventLog.enabled,true) (spark.yarn.scheduler.heartbeat.interval-ms,5000) (hive.metastore.warehouse.dir,file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse) (spark.repl.class.outputDir,/tmp/spark-fa16d4d3-8ec8-4b0e-a1da-5a2dffe39d08/repl-5dd28f29-ae03-4965-a535-18a95173b173) (spark.yarn.am.extraJavaOptions,-Dhdp.version=2.5.0.0-817) (spark.yarn.containerLauncherMaxThreads,25) (spark.driver.extraJavaOptions,-Dhdp.version=2.5.0.0-817) (spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64) (spark.driver.appUIAddress,http://10.0.2.15:4041) (spark.driver.host,10.0.2.15) (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0007) (spark.yarn.preserve.staging.files,false) (spark.home,/usr/hdp/current/spark2-client) (spark.app.name,Spark shell) (spark.repl.class.uri,spark://10.0.2.15:37426/classes) (spark.ui.port,4041) 
(spark.yarn.max.executor.failures,3) (spark.submit.deployMode,client) (spark.yarn.executor.memoryOverhead,200) (spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) (spark.driver.extraClassPath,/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar) (spark.executor.memory,2g) (spark.yarn.driver.memoryOverhead,200) (spark.hadoop.yarn.timeline-service.enabled,false) (spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native) (spark.app.id,application_1472397144295_0007) (spark.executor.id,driver) (spark.yarn.queue,default) (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com) (spark.eventLog.dir,hdfs:///spark-history) (spark.master,yarn) (spark.driver.port,37426) (spark.yarn.submit.file.replication,3) (spark.sql.catalogImplementation,hive) (spark.driver.memory,2g) (spark.jars,) (spark.executor.cores,1) scala> val file = sc.textFile("/tmp/data") file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24 scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _) counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26 scala> counts.take(10) res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.se rver.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA. layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apac he.log4j.PatternLayout,1)) scala>
Created 08-30-2016 01:01 PM
Resolution for Spark Submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/
[root@sandbox conf]# cat java-opts
-Dhdp.version=2.5.0.0-817
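For anyone who wants to script this step, here is a minimal sketch of creating that file. The function name write_java_opts is a hypothetical helper and not part of Spark or HDP. It assumes you pass in the conf directory and the HDP build string yourself; on this sandbox the build string shows up in paths such as /usr/hdp/2.5.0.0-817/.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark or HDP): writes conf/java-opts
# containing a single -Dhdp.version=<build> line.
write_java_opts() {
  local conf_dir="$1"     # e.g. /usr/hdp/current/spark2-client/conf
  local hdp_version="$2"  # e.g. 2.5.0.0-817
  if [ -z "$conf_dir" ] || [ -z "$hdp_version" ]; then
    echo "usage: write_java_opts <conf_dir> <hdp_version>" >&2
    return 1
  fi
  # Overwrite rather than append, so rerunning leaves exactly one option line.
  printf -- '-Dhdp.version=%s\n' "$hdp_version" > "$conf_dir/java-opts"
}
```

Usage on the sandbox would look like: write_java_opts /usr/hdp/current/spark2-client/conf 2.5.0.0-817. Note that the sketch overwrites any existing java-opts file, so back it up first if you already keep other options there.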
Spark Submit working example:
[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:02 INFO yarn.Client:
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1472492701409
     final status: UNDEFINED
     tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
     user: root
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:06 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.0.2.15
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1472492701409
     final status: UNDEFINED
     tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
     user: root
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:09 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:10 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:11 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:12 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:13 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:14 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:15 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:16 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:17 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:18 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:19 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:20 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:21 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:22 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:23 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:24 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:25 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:26 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:27 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:28 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:29 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:30 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:31 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:32 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:33 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:34 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:35 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)
16/08/29 17:45:37 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.0.2.15
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1472492701409
     final status: SUCCEEDED
     tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
     user: root
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b
[root@sandbox spark2-client]#
Resolution for Spark Shell issue (lzo-codec): add the following 2 lines in your spark-defaults.conf
spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
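If you would rather not hard-code the jar version, the two lines above could be added with something like the sketch below. The function name add_lzo_settings is hypothetical (it is not part of Spark or HDP), and the sketch assumes the hadoop-lzo jar and the native directories sit under the Hadoop client lib directory, as they do in the paths above. It only appends a property if no line for that property exists yet.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark or HDP): appends the two LZO-related
# driver settings to spark-defaults.conf unless they are already present.
add_lzo_settings() {
  local conf_file="$1"   # e.g. /usr/hdp/current/spark2-client/conf/spark-defaults.conf
  local hadoop_lib="$2"  # e.g. /usr/hdp/current/hadoop-client/lib
  local jar="" candidate
  # Use the first hadoop-lzo jar found, so the version is not hard-coded.
  for candidate in "$hadoop_lib"/hadoop-lzo-*.jar; do
    if [ -f "$candidate" ]; then
      jar="$candidate"
      break
    fi
  done
  if [ -z "$jar" ]; then
    echo "no hadoop-lzo jar found under $hadoop_lib" >&2
    return 1
  fi
  touch "$conf_file"
  # Append each property only when no line for it exists yet.
  if ! grep -q '^spark\.driver\.extraClassPath[[:space:]]' "$conf_file"; then
    echo "spark.driver.extraClassPath $jar" >> "$conf_file"
  fi
  if ! grep -q '^spark\.driver\.extraLibraryPath[[:space:]]' "$conf_file"; then
    echo "spark.driver.extraLibraryPath $hadoop_lib/native:$hadoop_lib/native/Linux-amd64-64" >> "$conf_file"
  fi
}
```

On the sandbox the call would be: add_lzo_settings /usr/hdp/current/spark2-client/conf/spark-defaults.conf /usr/hdp/current/hadoop-client/lib. If spark.driver.extraClassPath is already set to something else, the sketch leaves it alone, and you would need to merge the jar path into the existing value by hand.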
Spark Shell working example:
[root@sandbox spark2-client]# ./bin/spark-shell --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/08/29 17:47:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:47:21 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://10.0.2.15:4041
Spark context available as 'sc' (master = yarn, app id = application_1472397144295_0007).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
scala> sc.getConf.getAll.foreach(println)
(spark.eventLog.enabled,true)
(spark.yarn.scheduler.heartbeat.interval-ms,5000)
(hive.metastore.warehouse.dir,file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse)
(spark.repl.class.outputDir,/tmp/spark-fa16d4d3-8ec8-4b0e-a1da-5a2dffe39d08/repl-5dd28f29-ae03-4965-a535-18a95173b173)
(spark.yarn.am.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
(spark.yarn.containerLauncherMaxThreads,25)
(spark.driver.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
(spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64)
(spark.driver.appUIAddress,http://10.0.2.15:4041)
(spark.driver.host,10.0.2.15)
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0007)
(spark.yarn.preserve.staging.files,false)
(spark.home,/usr/hdp/current/spark2-client)
(spark.app.name,Spark shell)
(spark.repl.class.uri,spark://10.0.2.15:37426/classes)
(spark.ui.port,4041)
(spark.yarn.max.executor.failures,3)
(spark.submit.deployMode,client)
(spark.yarn.executor.memoryOverhead,200)
(spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter)
(spark.driver.extraClassPath,/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar)
(spark.executor.memory,2g)
(spark.yarn.driver.memoryOverhead,200)
(spark.hadoop.yarn.timeline-service.enabled,false)
(spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native)
(spark.app.id,application_1472397144295_0007)
(spark.executor.id,driver)
(spark.yarn.queue,default)
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com)
(spark.eventLog.dir,hdfs:///spark-history)
(spark.master,yarn)
(spark.driver.port,37426)
(spark.yarn.submit.file.replication,3)
(spark.sql.catalogImplementation,hive)
(spark.driver.memory,2g)
(spark.jars,)
(spark.executor.cores,1)
scala> val file = sc.textFile("/tmp/data")
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24
scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26
scala> counts.take(10)
res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout,1))
scala>
Created 08-30-2016 03:43 PM
Yep, this worked for me as well. Thanks.