Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Sandbox HDP-2.5.0 Spark 2.0.0 - Spark Submit Yarn Cluster Mode -- Spark Shell LzoCodec not found

Rising Star


I have installed Spark 2.0.0 on the HDP-2.5.0 Sandbox following Paul Hargis's great post:

https://community.hortonworks.com/articles/53029/how-to-install-and-run-spark-20-on-hdp-25-sandbox.h...

Thanks Paul.

spark-submit in yarn-client mode works, as this log shows:

[root@sandbox ~]# cd /usr/hdp/current/spark2-client                                                                                                                                  
[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/28 14:38:42 INFO spark.SparkContext: Running Spark version 2.0.0                                                                                                               
16/08/28 14:38:42 INFO spark.SecurityManager: Changing view acls to: root                                                                                                            
16/08/28 14:38:42 INFO spark.SecurityManager: Changing modify acls to: root                                                                                                          
16/08/28 14:38:42 INFO spark.SecurityManager: Changing view acls groups to:                                                                                                          
16/08/28 14:38:42 INFO spark.SecurityManager: Changing modify acls groups to:                                                                                                        
16/08/28 14:38:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
16/08/28 14:38:43 INFO util.Utils: Successfully started service 'sparkDriver' on port 36008.                                                                                         
16/08/28 14:38:43 INFO spark.SparkEnv: Registering MapOutputTracker                                                                                                                  
16/08/28 14:38:43 INFO spark.SparkEnv: Registering BlockManagerMaster                                                                                                                
16/08/28 14:38:43 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-b5149ef4-928d-455e-bf83-2159e12f88f7                                                       
16/08/28 14:38:43 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB                                                                                                
16/08/28 14:38:43 INFO spark.SparkEnv: Registering OutputCommitCoordinator                                                                                                           
16/08/28 14:38:43 INFO util.log: Logging initialized @2226ms                                                                                                                         
16/08/28 14:38:43 INFO server.Server: jetty-9.2.z-SNAPSHOT                                                                                                                           
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e1e5b02{/jobs,null,AVAILABLE}                                                                  
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@ae918c9{/jobs/json,null,AVAILABLE}                                                              
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d5a39b7{/jobs/job,null,AVAILABLE}                                                              
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5e83450d{/jobs/job/json,null,AVAILABLE}                                                         
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7c2a88f4{/stages,null,AVAILABLE}                                                                
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4c858adb{/stages/json,null,AVAILABLE}                                                           
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@535f571c{/stages/stage,null,AVAILABLE}                                                          
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@18501a07{/stages/stage/json,null,AVAILABLE}                                                     
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@32dcce09{/stages/pool,null,AVAILABLE}                                                           
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3e5acaf5{/stages/pool/json,null,AVAILABLE}                                                      
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3ac2bace{/storage,null,AVAILABLE}                                                               
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@46764885{/storage/json,null,AVAILABLE}                                                          
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7f9337e6{/storage/rdd,null,AVAILABLE}                                                           
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1a3b1e79{/storage/rdd/json,null,AVAILABLE}                                                      
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1f4da763{/environment,null,AVAILABLE}                                                           
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@232864a3{/environment/json,null,AVAILABLE}                                                      
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@30e71b5d{/executors,null,AVAILABLE}                                                             
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@14b58fc0{/executors/json,null,AVAILABLE}                                                        
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bf090df{/executors/threadDump,null,AVAILABLE}                                                  
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4eb72ecd{/executors/threadDump/json,null,AVAILABLE}                                             
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5c61bd1a{/static,null,AVAILABLE}                                                                
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@14c62558{/,null,AVAILABLE}                                                                      
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5cbdbf0f{/api,null,AVAILABLE}                                                                   
16/08/28 14:38:43 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2d4aa15a{/stages/stage/kill,null,AVAILABLE}                                                     
16/08/28 14:38:43 INFO server.ServerConnector: Started ServerConnector@51fcbb35{HTTP/1.1}{0.0.0.0:4041}                                                                              
16/08/28 14:38:43 INFO server.Server: Started @2388ms                                                                                                                                
16/08/28 14:38:43 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4041
16/08/28 14:38:43 INFO spark.SparkContext: Added JAR file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar at spark://10.0.2.15:36008/jars/spark-examples_2.11-2.0.0.jar with timestamp 1472395123767
16/08/28 14:38:44 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/28 14:38:44 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/28 14:38:44 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)                       
16/08/28 14:38:44 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead                                                                         
16/08/28 14:38:44 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/28 14:38:44 INFO yarn.Client: Preparing resources for our AM container
16/08/28 14:38:44 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/28 14:38:44 INFO yarn.Client: Uploading resource file:/tmp/spark-a10e8972-1076-4a61-a014-8419767250f0/__spark_libs__6748274495232790272.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0001/__spark_libs__6748274495232790272.zip
16/08/28 14:38:48 INFO yarn.Client: Uploading resource file:/tmp/spark-a10e8972-1076-4a61-a014-8419767250f0/__spark_conf__6530127439911581770.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0001/__spark_conf__.zip
16/08/28 14:38:48 INFO spark.SecurityManager: Changing view acls to: root
16/08/28 14:38:48 INFO spark.SecurityManager: Changing modify acls to: root
16/08/28 14:38:48 INFO spark.SecurityManager: Changing view acls groups to: 
16/08/28 14:38:48 INFO spark.SecurityManager: Changing modify acls groups to: 
16/08/28 14:38:48 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
16/08/28 14:38:48 INFO yarn.Client: Submitting application application_1472394965674_0001 to ResourceManager
16/08/28 14:38:48 INFO impl.YarnClientImpl: Submitted application application_1472394965674_0001
16/08/28 14:38:49 INFO yarn.Client: Application report for application_1472394965674_0001 (state: ACCEPTED)
16/08/28 14:38:49 INFO yarn.Client: 
         client token: N/A
         diagnostics: AM container is launched, waiting for AM container to Register with RM
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1472395128618
         final status: UNDEFINED
         tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0001/
         user: root
16/08/28 14:38:51 INFO yarn.Client: Application report for application_1472394965674_0001 (state: ACCEPTED)
16/08/28 14:38:52 INFO yarn.Client: Application report for application_1472394965674_0001 (state: ACCEPTED)
16/08/28 14:38:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/08/28 14:38:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox.hortonworks.com, PROXY_URI_BASES -> http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0001), /proxy/application_1472394965674_0001
16/08/28 14:38:52 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/08/28 14:38:53 INFO yarn.Client: Application report for application_1472394965674_0001 (state: RUNNING)
16/08/28 14:38:53 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: 10.0.2.15
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1472395128618
         final status: UNDEFINED
         tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0001/
         user: root
16/08/28 14:38:53 INFO cluster.YarnClientSchedulerBackend: Application application_1472394965674_0001 has started running.
16/08/28 14:38:53 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35756.
16/08/28 14:38:53 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.2.15, 35756)
16/08/28 14:38:53 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:35756 with 912.3 MB RAM, BlockManagerId(driver, 10.0.2.15, 35756)
16/08/28 14:38:53 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.2.15, 35756)
16/08/28 14:38:54 INFO scheduler.EventLoggingListener: Logging events to hdfs:///spark-history/application_1472394965674_0001
16/08/28 14:38:56 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.0.2.15:36932) with ID 1
16/08/28 14:38:56 INFO storage.BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:41061 with 912.3 MB RAM, BlockManagerId(1, sandbox.hortonworks.com, 41061)
16/08/28 14:38:57 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.0.2.15:36936) with ID 2
16/08/28 14:38:57 INFO storage.BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:41746 with 912.3 MB RAM, BlockManagerId(2, sandbox.hortonworks.com, 41746)
16/08/28 14:38:57 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/08/28 14:38:57 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@46a61277{/SQL,null,AVAILABLE}
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b4b5885{/SQL/json,null,AVAILABLE}
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2bcd7bea{/SQL/execution/json,null,AVAILABLE}
16/08/28 14:38:57 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@59bde227{/static/sql,null,AVAILABLE}
16/08/28 14:38:57 INFO internal.SharedState: Warehouse path is 'file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse'.
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
16/08/28 14:38:57 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 912.3 MB)
16/08/28 14:38:57 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1169.0 B, free 912.3 MB)
16/08/28 14:38:57 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.2.15:35756 (size: 1169.0 B, free: 912.3 MB)
16/08/28 14:38:57 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/08/28 14:38:57 INFO scheduler.DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
16/08/28 14:38:57 INFO cluster.YarnScheduler: Adding task set 0.0 with 10 tasks
16/08/28 14:38:57 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, sandbox.hortonworks.com, partition 0, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:57 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, sandbox.hortonworks.com, partition 1, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:58 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: sandbox.hortonworks.com.
16/08/28 14:38:58 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:58 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox.hortonworks.com:41746 (size: 1169.0 B, free: 912.3 MB)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, sandbox.hortonworks.com, partition 2, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 2 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, sandbox.hortonworks.com, partition 3, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 3 on executor id: 2 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1084 ms on sandbox.hortonworks.com (1/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1061 ms on sandbox.hortonworks.com (2/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, sandbox.hortonworks.com, partition 4, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 4 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 88 ms on sandbox.hortonworks.com (3/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, sandbox.hortonworks.com, partition 5, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 101 ms on sandbox.hortonworks.com (4/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, sandbox.hortonworks.com, partition 6, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 6 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, sandbox.hortonworks.com, partition 7, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 7 on executor id: 2 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 48 ms on sandbox.hortonworks.com (6/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, sandbox.hortonworks.com, partition 8, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 8 on executor id: 1 hostname: sandbox.hortonworks.com.
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 48 ms on sandbox.hortonworks.com (7/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, sandbox.hortonworks.com, partition 9, PROCESS_LOCAL, 5411 bytes)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 40 ms on sandbox.hortonworks.com (8/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 38 ms on sandbox.hortonworks.com (9/10)
16/08/28 14:38:59 INFO scheduler.TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 31 ms on sandbox.hortonworks.com (10/10)
16/08/28 14:38:59 INFO scheduler.DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 1.293 s
16/08/28 14:38:59 INFO scheduler.DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 1.605653 s
Pi is roughly 3.1418151418151417
16/08/28 14:38:59 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2d4aa15a{/stages/stage/kill,null,UNAVAILABLE}

spark-submit in yarn-cluster mode fails, as this log shows:

[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/28 14:41:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/28 14:41:08 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/28 14:41:08 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/28 14:41:09 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/28 14:41:09 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/28 14:41:09 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/28 14:41:09 INFO yarn.Client: Setting up container launch context for our AM
16/08/28 14:41:09 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/28 14:41:09 INFO yarn.Client: Preparing resources for our AM container
16/08/28 14:41:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/28 14:41:10 INFO yarn.Client: Uploading resource file:/tmp/spark-e72e7961-7ec9-4282-806d-9d95e2d7f0fc/__spark_libs__4204158628332382181.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002/__spark_libs__4204158628332382181.zip
16/08/28 14:41:11 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002/spark-examples_2.11-2.0.0.jar
16/08/28 14:41:12 INFO yarn.Client: Uploading resource file:/tmp/spark-e72e7961-7ec9-4282-806d-9d95e2d7f0fc/__spark_conf__2789110900476377363.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002/__spark_conf__.zip
16/08/28 14:41:12 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/28 14:41:12 INFO spark.SecurityManager: Changing view acls to: root
16/08/28 14:41:12 INFO spark.SecurityManager: Changing modify acls to: root
16/08/28 14:41:12 INFO spark.SecurityManager: Changing view acls groups to: 
16/08/28 14:41:12 INFO spark.SecurityManager: Changing modify acls groups to: 
16/08/28 14:41:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
16/08/28 14:41:12 INFO yarn.Client: Submitting application application_1472394965674_0002 to ResourceManager
16/08/28 14:41:12 INFO impl.YarnClientImpl: Submitted application application_1472394965674_0002
16/08/28 14:41:13 INFO yarn.Client: Application report for application_1472394965674_0002 (state: ACCEPTED)
16/08/28 14:41:13 INFO yarn.Client: 
         client token: N/A
         diagnostics: AM container is launched, waiting for AM container to Register with RM
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1472395272580
         final status: UNDEFINED
         tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472394965674_0002/
         user: root
16/08/28 14:41:14 INFO yarn.Client: Application report for application_1472394965674_0002 (state: ACCEPTED)
16/08/28 14:41:15 INFO yarn.Client: Application report for application_1472394965674_0002 (state: FAILED)
16/08/28 14:41:15 INFO yarn.Client: 
         client token: N/A
         diagnostics: Application application_1472394965674_0002 failed 2 times due to AM Container for appattempt_1472394965674_0002_000002 exited with  exitCode: 1
For more detailed output, check the application tracking page: http://sandbox.hortonworks.com:8088/cluster/app/application_1472394965674_0002 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e17_1472394965674_0002_02_000001
Exit code: 1
Exception message: /hadoop/yarn/local/usercache/root/appcache/application_1472394965674_0002/container_e17_1472394965674_0002_02_000001/launch_container.sh: line 25: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /hadoop/yarn/local/usercache/root/appcache/application_1472394965674_0002/container_e17_1472394965674_0002_02_000001/launch_container.sh: line 25: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:909)
        at org.apache.hadoop.util.Shell.run(Shell.java:820)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1099)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         start time: 1472395272580
         final status: FAILED
 tracking URL: <a href="http://sandbox.hortonworks.com:8088/cluster/app/application_1472394965674_0002">http://sandbox.hortonworks.com:8088/cluster/app/application_1472394965674_0002</a> 
16/08/28 14:41:15 INFO yarn.Client: Deleting staging directory hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472394965674_0002
Exception in thread "main" org.apache.spark.SparkException: Application application_1472394965674_0002 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1132)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1175)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/08/28 14:41:15 INFO util.ShutdownHookManager: Shutdown hook called
16/08/28 14:41:15 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-e72e7961-7ec9-4282-806d-9d95e2d7f0fc
[root@sandbox spark2-client]#
Any help to resolve this would be appreciated.
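
From reading the launch_container.sh failure, my understanding is that the root cause is bash itself: the classpath still contains the literal Maven-style placeholder ${hdp.version}, and '.' is not a valid character in a shell parameter name, so bash aborts with "bad substitution". A minimal sketch that reproduces just that shell behavior, independent of Hadoop (the jar path below is simplified for illustration):

```shell
# bash cannot expand ${hdp.version}: '.' is illegal in a parameter name,
# so a path shaped like the one YARN writes into launch_container.sh
# fails before anything is even launched.
out=$(bash -c 'echo /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo.jar' 2>&1) || true
echo "$out"   # prints a "bad substitution" error from bash
```

This is why the fix has to supply a concrete value for hdp.version before the container script is generated.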

In Spark-Shell mode, launched with the following command:

[root@sandbox spark2-client]# ./bin/spark-shell --master yarn

I am encountering an LzoCodec not found error, as per the log here:

[root@sandbox spark2-client]# ./bin/spark-shell --master yarn
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/08/28 14:44:42 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/28 14:44:54 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at <a href="http://10.0.2.15:4041/">http://10.0.2.15:4041</a> 
Spark context available as 'sc' (master = yarn, app id = application_1472394965674_0003).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val file = sc.textFile("/tmp/data")
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24

scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
java.lang.RuntimeException: Error in configuring object

  at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
  at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
  at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:186)
  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:328)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:328)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
  at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:327)
  ... 48 elided
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
  ... 83 more
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:180)
  at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
  ... 83 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
  at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
  ... 85 more

scala>

Any help to resolve this would be appreciated.

Thanks.

Amit

1 ACCEPTED SOLUTION

avatar
Rising Star

Resolution for the Spark-Submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/ containing the HDP version:

[root@sandbox conf]# cat java-opts                                                              
-Dhdp.version=2.5.0.0-817
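
Creating the file can be scripted; a minimal sketch (the CONF_DIR default below is a placeholder so the snippet runs anywhere — on the sandbox set it to /usr/hdp/current/spark2-client/conf, and note that 2.5.0.0-817 is the build string on my sandbox image, so check yours):

```shell
# Write the java-opts file so the YARN AM / driver JVMs get a concrete
# hdp.version instead of the unexpanded ${hdp.version} placeholder.
# On the sandbox: CONF_DIR=/usr/hdp/current/spark2-client/conf
# Find your build string with: hdp-select status hadoop-client
CONF_DIR="${SPARK_CONF_DIR:-./conf}"
mkdir -p "$CONF_DIR"
printf -- '-Dhdp.version=%s\n' "2.5.0.0-817" > "$CONF_DIR/java-opts"
cat "$CONF_DIR/java-opts"
```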

Spark Submit working example:

[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable                        
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.                          
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050                                                             
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers                                                                          
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)             
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead                                                              
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM                                                                                         
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container                                                                                 
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container                                                                                               
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.                           
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode                                                                    
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root                                                                                                  
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root                                                                                                
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:                                                                                                
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:                                                                                              
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager                                                               
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006                                                                           
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:02 INFO yarn.Client:                                                                                                                                        
         client token: N/A                                                                                                                                                 
         diagnostics: AM container is launched, waiting for AM container to Register with RM                                                                               
         ApplicationMaster host: N/A                                                                                                                                       
         ApplicationMaster RPC port: -1                                                                                                                                    
         queue: default                                                                                                                                                    
         start time: 1472492701409                                                                                                                                         
         final status: UNDEFINED                                                                                                                                           
 tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/ 
         user: root                                                                                                                                                        
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:06 INFO yarn.Client:                                                                                                                                        
         client token: N/A                                                                                                                                                 
         diagnostics: N/A                                                                                                                                                  
         ApplicationMaster host: 10.0.2.15                                                                                                                                 
         ApplicationMaster RPC port: 0                                                                                                                                     
         queue: default                                                                                                                                                    
         start time: 1472492701409                                                                                                                                         
         final status: UNDEFINED                                                                                                                                           
 tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/ 
         user: root                                                                                                                                                        
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:09 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:10 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:11 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:12 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:13 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:14 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:15 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:16 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:17 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:18 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:19 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:20 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:21 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:22 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:23 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:24 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:25 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:26 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:27 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:28 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:29 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:30 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:31 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:32 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:33 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:34 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:35 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)                                                                
16/08/29 17:45:37 INFO yarn.Client:                                                                                                                                        
         client token: N/A                                                                                                                                                 
         diagnostics: N/A                                                                                                                                                  
         ApplicationMaster host: 10.0.2.15                                                                                                                                 
         ApplicationMaster RPC port: 0                                                                                                                                     
         queue: default                                                                                                                                                    
         start time: 1472492701409                                                                                                                                         
         final status: SUCCEEDED                                                                                                                                           
 tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/ 
         user: root                                                                                                                                                        
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called                                                                                                      
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b                                                        
[root@sandbox spark2-client]#                                                                                                                                              

Resolution for the Spark-Shell issue (LzoCodec not found): add the following two lines to your spark-defaults.conf in /usr/hdp/current/spark2-client/conf/:

spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar                                                                            
spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64                                            
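
The jar's version suffix tracks the HDP build, so rather than hard-coding it you can glob for whatever hadoop-lzo jar is installed. A sketch of appending the two lines (the CONF_DIR/LIB_DIR defaults are placeholders so the snippet runs anywhere; on the sandbox they are the /usr/hdp/current paths shown in the comments):

```shell
# Append the LZO settings to spark-defaults.conf. On the sandbox:
#   LIB_DIR=/usr/hdp/current/hadoop-client/lib
#   CONF_DIR=/usr/hdp/current/spark2-client/conf
LIB_DIR="${HADOOP_LIB_DIR:-/usr/hdp/current/hadoop-client/lib}"
CONF_DIR="${SPARK_CONF_DIR:-./conf}"
mkdir -p "$CONF_DIR"
# Pick up the installed hadoop-lzo build instead of hard-coding 0.6.0.2.5.0.0-817;
# fall back to the sandbox jar name if the glob finds nothing.
LZO_JAR=$(ls "$LIB_DIR"/hadoop-lzo-*.jar 2>/dev/null | head -n 1)
cat >> "$CONF_DIR/spark-defaults.conf" <<EOF
spark.driver.extraClassPath ${LZO_JAR:-$LIB_DIR/hadoop-lzo-0.6.0.2.5.0.0-817.jar}
spark.driver.extraLibraryPath $LIB_DIR/native:$LIB_DIR/native/Linux-amd64-64
EOF
grep extra "$CONF_DIR/spark-defaults.conf"
```

Restart spark-shell after the change so the driver JVM picks up the new classpath.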

Spark Shell working example:

[root@sandbox spark2-client]# ./bin/spark-shell --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1                              
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).                                                                                                                      
16/08/29 17:47:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.                           
16/08/29 17:47:21 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.                                                           
Spark context Web UI available at http://10.0.2.15:4041 
Spark context available as 'sc' (master = yarn, app id = application_1472397144295_0007).                                                                                  
Spark session available as 'spark'.                                                                                                                                        
Welcome to                                                                                                                                                                 
      ____              __                                                                                                                                                 
     / __/__  ___ _____/ /__                                                                                                                                               
    _\ \/ _ \/ _ `/ __/  '_/                                                                                                                                               
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0                                                                                                                                
      /_/                                                                                                                                                                  
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)                                                                                                      
Type in expressions to have them evaluated.                                                                                                                                
Type :help for more information.                                                                                                                                           

scala> sc.getConf.getAll.foreach(println)                                                                                                                                  
(spark.eventLog.enabled,true)                                                                                                                                              
(spark.yarn.scheduler.heartbeat.interval-ms,5000)                                                                                                                          
(hive.metastore.warehouse.dir,file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse)                                                                                            
(spark.repl.class.outputDir,/tmp/spark-fa16d4d3-8ec8-4b0e-a1da-5a2dffe39d08/repl-5dd28f29-ae03-4965-a535-18a95173b173)                                                     
(spark.yarn.am.extraJavaOptions,-Dhdp.version=2.5.0.0-817)                                                                                                                 
(spark.yarn.containerLauncherMaxThreads,25)                                                                                                                                
(spark.driver.extraJavaOptions,-Dhdp.version=2.5.0.0-817)                                                                                                                  
(spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64)                                         
(spark.driver.appUIAddress,http://10.0.2.15:4041) 
(spark.driver.host,10.0.2.15)                                                                                                                                              
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0007) 
(spark.yarn.preserve.staging.files,false)                                                                                                                                  
(spark.home,/usr/hdp/current/spark2-client)                                                                                                                                
(spark.app.name,Spark shell)                                                                                                                                               
(spark.repl.class.uri,spark://10.0.2.15:37426/classes)                                                                                                                     
(spark.ui.port,4041)                                                                                                                                                       
(spark.yarn.max.executor.failures,3)                                                                                                                                       
(spark.submit.deployMode,client)                                                                                                                                           
(spark.yarn.executor.memoryOverhead,200)                                                                                                                                   
(spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter)                                                                                              
(spark.driver.extraClassPath,/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar)                                                                          
(spark.executor.memory,2g)                                                                                                                                                 
(spark.yarn.driver.memoryOverhead,200)                                                                                                                                     
(spark.hadoop.yarn.timeline-service.enabled,false)                                                                                                                         
(spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native)                                                                                                
(spark.app.id,application_1472397144295_0007)                                                                                                                              
(spark.executor.id,driver)                                                                                                                                                 
(spark.yarn.queue,default)                                                                                                                                                 
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com)                                                               
(spark.eventLog.dir,hdfs:///spark-history)                                                                                                                                 
(spark.master,yarn)                                                                                                                                                        
(spark.driver.port,37426)                                                                                                                                                  
(spark.yarn.submit.file.replication,3)                                                                                                                                     
(spark.sql.catalogImplementation,hive)                                                                                                                                     
(spark.driver.memory,2g)                                                                                                                                                   
(spark.jars,)                                                                                                                                                              
(spark.executor.cores,1)                                                                                                                                                   

scala> val file = sc.textFile("/tmp/data")                                                                                                                                 
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24                                                                         

scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)                                                                        
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26                                                                            

scala> counts.take(10)                                                                                                                                                     
res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout,1))

scala>                                                                                                                                                                     
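As a quick cross-check of the counts above, the flatMap/map/reduceByKey pipeline can be approximated locally with standard shell tools (using a small sample file here rather than the HDFS /tmp/data):

```shell
# Local stand-in for the Spark word count: split each line on spaces,
# then count occurrences of each word, most frequent first.
printf 'hadoop spark hadoop\nspark hadoop\n' > sample.txt
tr ' ' '\n' < sample.txt | grep -v '^$' | sort | uniq -c | sort -rn
# → 3 hadoop / 2 spark (uniq -c output, most frequent first)
```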



2 Replies

avatar
Rising Star

Resolution for the Spark Submit (yarn cluster mode) issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/ containing the HDP version flag:

[root@sandbox conf]# cat java-opts                                                              
-Dhdp.version=2.5.0.0-817
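Creating the file can be scripted as below (a sketch: it writes to the current directory for demonstration, whereas on the sandbox CONF_DIR would be /usr/hdp/current/spark2-client/conf, and the version suffix 2.5.0.0-817 must match your installed HDP build):

```shell
# Sketch: create the java-opts file that Spark's YARN client reads at submit time.
# CONF_DIR is the current directory here; on the sandbox use
# /usr/hdp/current/spark2-client/conf instead.
CONF_DIR="."
HDP_VERSION="2.5.0.0-817"   # must match your installed HDP build
echo "-Dhdp.version=$HDP_VERSION" > "$CONF_DIR/java-opts"
cat "$CONF_DIR/java-opts"
```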

Spark Submit working example:

[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable                        
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.                          
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050                                                             
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers                                                                          
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)             
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead                                                              
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM                                                                                         
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container                                                                                 
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container                                                                                               
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.                           
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode                                                                    
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root                                                                                                  
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root                                                                                                
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:                                                                                                
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:                                                                                              
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager                                                               
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006                                                                           
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:02 INFO yarn.Client:                                                                                                                                        
         client token: N/A                                                                                                                                                 
         diagnostics: AM container is launched, waiting for AM container to Register with RM                                                                               
         ApplicationMaster host: N/A                                                                                                                                       
         ApplicationMaster RPC port: -1                                                                                                                                    
         queue: default                                                                                                                                                    
         start time: 1472492701409                                                                                                                                         
         final status: UNDEFINED                                                                                                                                           
 tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/ 
         user: root                                                                                                                                                        
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)                                                                
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:06 INFO yarn.Client:                                                                                                                                        
         client token: N/A                                                                                                                                                 
         diagnostics: N/A                                                                                                                                                  
         ApplicationMaster host: 10.0.2.15                                                                                                                                 
         ApplicationMaster RPC port: 0                                                                                                                                     
         queue: default                                                                                                                                                    
         start time: 1472492701409                                                                                                                                         
         final status: UNDEFINED                                                                                                                                           
 tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/ 
         user: root                                                                                                                                                        
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
[... identical "Application report ... (state: RUNNING)" lines repeated once per second through 17:45:35 ...]
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)                                                                 
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)                                                                
16/08/29 17:45:37 INFO yarn.Client:                                                                                                                                        
         client token: N/A                                                                                                                                                 
         diagnostics: N/A                                                                                                                                                  
         ApplicationMaster host: 10.0.2.15                                                                                                                                 
         ApplicationMaster RPC port: 0                                                                                                                                     
         queue: default                                                                                                                                                    
         start time: 1472492701409                                                                                                                                         
         final status: SUCCEEDED                                                                                                                                           
 tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/ 
         user: root                                                                                                                                                        
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called                                                                                                      
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b                                                        
[root@sandbox spark2-client]#                                                                                                                                              

(spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native)                                                                                                
(spark.app.id,application_1472397144295_0007)                                                                                                                              
(spark.executor.id,driver)                                                                                                                                                 
(spark.yarn.queue,default)                                                                                                                                                 
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com)                                                               
(spark.eventLog.dir,hdfs:///spark-history)                                                                                                                                 
(spark.master,yarn)                                                                                                                                                        
(spark.driver.port,37426)                                                                                                                                                  
(spark.yarn.submit.file.replication,3)                                                                                                                                     
(spark.sql.catalogImplementation,hive)                                                                                                                                     
(spark.driver.memory,2g)                                                                                                                                                   
(spark.jars,)                                                                                                                                                              
(spark.executor.cores,1)                                                                                                                                                   

scala> val file = sc.textFile("/tmp/data")                                                                                                                                 
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24                                                                         

scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)                                                                        
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26                                                                            

scala> counts.take(10)                                                                                                                                                     
res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout,1))

scala>                                                                                                                                                                     
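For anyone new to the RDD API, the flatMap/map/reduceByKey chain above is the classic word count. Its logic can be sketched in plain Scala collections (no Spark needed; `groupBy` plus a local sum stands in for the distributed shuffle that `reduceByKey` performs, and the sample lines here are made up for illustration):

```scala
// Word count over an in-memory collection, mirroring the RDD pipeline:
// flatMap(split) -> map(word -> 1) -> reduceByKey(_ + _)
val lines = Seq("hadoop spark", "spark shell")

val counts = lines
  .flatMap(_.split(" "))              // split each line into words
  .map(word => (word, 1))             // pair each word with a count of 1
  .groupBy(_._1)                      // local stand-in for the shuffle
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

println(counts)  // e.g. Map(hadoop -> 1, spark -> 2, shell -> 1)
```

On a real cluster the `reduceByKey` version is preferable to `groupBy`-then-sum because it combines counts on each executor before shuffling.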
Yep, this worked for me as well. Thanks.