
anandi · 08-28-2016

Sandbox HDP-2.5.0 TP, Spark 1.6.2: I am encountering the following errors: ERROR GPLNativeCodeLoader: Could not load native gpl library; ERROR LzoCodec: Cannot load native-lzo without native-hadoop

The errors appear while running a simple word count in spark-shell:

[root@sandbox ~]# cd $SPARK_HOME

[root@sandbox spark-client]# ./bin/spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m --jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar

The following code is submitted at the Spark CLI

val file = sc.textFile("/tmp/data")
val counts = file.flatMap(line => line.split(" ")).map(word =>(word,1)).
reduceByKey(_ + _)
counts.saveAsTextFile("/tmp/wordcount")

This yields the following error:

ERROR GPLNativeCodeLoader: Could not load native gpl library

ERROR LzoCodec: Cannot load native-lzo without native-hadoop

The same errors appear with or without the --jars parameter shown below:

--jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar

Full Log:

[root@sandbox ~]# cd $SPARK_HOME
[root@sandbox spark-client]# ./bin/spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m --jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
16/08/27 16:28:23 INFO SecurityManager: Changing view acls to: root
16/08/27 16:28:23 INFO SecurityManager: Changing modify acls to: root
16/08/27 16:28:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/27 16:28:23 INFO HttpServer: Starting HTTP Server
16/08/27 16:28:23 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/27 16:28:23 INFO AbstractConnector: Started SocketConnector@0.0.0.0:43011
16/08/27 16:28:23 INFO Utils: Successfully started service 'HTTP class server' on port 43011.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.2
      /_/
Using Scala version 2.10.5 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
16/08/27 16:28:26 INFO SparkContext: Running Spark version 1.6.2
16/08/27 16:28:26 INFO SecurityManager: Changing view acls to: root
16/08/27 16:28:26 INFO SecurityManager: Changing modify acls to: root
16/08/27 16:28:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/27 16:28:26 INFO Utils: Successfully started service 'sparkDriver' on port 45506.
16/08/27 16:28:27 INFO Slf4jLogger: Slf4jLogger started
16/08/27 16:28:27 INFO Remoting: Starting remoting
16/08/27 16:28:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.2.15:44829]
16/08/27 16:28:27 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 44829.
16/08/27 16:28:27 INFO SparkEnv: Registering MapOutputTracker
16/08/27 16:28:27 INFO SparkEnv: Registering BlockManagerMaster
16/08/27 16:28:27 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-0776b175-5dd7-49b9-adf7-f2cbd85a1e1b
16/08/27 16:28:27 INFO MemoryStore: MemoryStore started with capacity 143.6 MB
16/08/27 16:28:27 INFO SparkEnv: Registering OutputCommitCoordinator
16/08/27 16:28:27 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/27 16:28:27 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/08/27 16:28:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/08/27 16:28:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040
16/08/27 16:28:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-61ecb98e-989c-4396-9b30-032c4d5a2b90/httpd-857ce699-7db0-428c-9af5-1dca4ec5330d
16/08/27 16:28:27 INFO HttpServer: Starting HTTP Server
16/08/27 16:28:27 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/27 16:28:27 INFO AbstractConnector: Started SocketConnector@0.0.0.0:37515
16/08/27 16:28:27 INFO Utils: Successfully started service 'HTTP file server' on port 37515.
16/08/27 16:28:27 INFO SparkContext: Added JAR file:/usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar at http://10.0.2.15:37515/jars/hadoop-lzo-0.6.0.2.5.0.0-817.jar with timestamp 1472315307772
spark.yarn.driver.memoryOverhead is set but does not apply in client mode.
16/08/27 16:28:28 INFO TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/08/27 16:28:28 INFO RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/27 16:28:28 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/08/27 16:28:28 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2250 MB per container)
16/08/27 16:28:28 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/08/27 16:28:28 INFO Client: Setting up container launch context for our AM
16/08/27 16:28:28 INFO Client: Setting up the launch environment for our AM container
16/08/27 16:28:28 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
16/08/27 16:28:28 INFO Client: Preparing resources for our AM container
16/08/27 16:28:28 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
16/08/27 16:28:28 INFO Client: Source and destination file systems are the same. Not copying hdfs://sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
16/08/27 16:28:29 INFO Client: Uploading resource file:/tmp/spark-61ecb98e-989c-4396-9b30-032c4d5a2b90/__spark_conf__5084804354575467223.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472312154461_0006/__spark_conf__5084804354575467223.zip
16/08/27 16:28:29 INFO SecurityManager: Changing view acls to: root
16/08/27 16:28:29 INFO SecurityManager: Changing modify acls to: root
16/08/27 16:28:29 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/27 16:28:29 INFO Client: Submitting application 6 to ResourceManager
16/08/27 16:28:29 INFO YarnClientImpl: Submitted application application_1472312154461_0006
16/08/27 16:28:29 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1472312154461_0006 and attemptId None
16/08/27 16:28:30 INFO Client: Application report for application_1472312154461_0006 (state: ACCEPTED)
16/08/27 16:28:30 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1472315309252
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006/
user: root
16/08/27 16:28:31 INFO Client: Application report for application_1472312154461_0006 (state: ACCEPTED)
16/08/27 16:28:32 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/08/27 16:28:32 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox.hortonworks.com, PROXY_URI_BASES -> http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006), /proxy/application_1472312154461_0006
16/08/27 16:28:32 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/08/27 16:28:32 INFO Client: Application report for application_1472312154461_0006 (state: RUNNING)
16/08/27 16:28:32 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.0.2.15
ApplicationMaster RPC port: 0
queue: default
start time: 1472315309252
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006/
user: root
16/08/27 16:28:32 INFO YarnClientSchedulerBackend: Application application_1472312154461_0006 has started running.
16/08/27 16:28:32 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34124.
16/08/27 16:28:32 INFO NettyBlockTransferService: Server created on 34124
16/08/27 16:28:32 INFO BlockManagerMaster: Trying to register BlockManager
16/08/27 16:28:32 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:34124 with 143.6 MB RAM, BlockManagerId(driver, 10.0.2.15, 34124)
16/08/27 16:28:32 INFO BlockManagerMaster: Registered BlockManager
16/08/27 16:28:32 INFO EventLoggingListener: Logging events to hdfs:///spark-history/application_1472312154461_0006
16/08/27 16:28:36 INFO YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (sandbox.hortonworks.com:39728) with ID 1
16/08/27 16:28:36 INFO BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:38362 with 143.6 MB RAM, BlockManagerId(1, sandbox.hortonworks.com, 38362)
16/08/27 16:28:57 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/08/27 16:28:57 INFO SparkILoop: Created spark context..
Spark context available as sc.
16/08/27 16:28:58 INFO HiveContext: Initializing execution hive, version 1.2.1
16/08/27 16:28:58 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.5.0.0-817
16/08/27 16:28:58 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.5.0.0-817
16/08/27 16:28:58 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/08/27 16:28:58 INFO ObjectStore: ObjectStore, initialize called
16/08/27 16:28:58 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/08/27 16:28:58 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/08/27 16:28:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/08/27 16:28:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/08/27 16:29:00 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/08/27 16:29:01 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:01 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:02 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:02 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:02 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/08/27 16:29:02 INFO ObjectStore: Initialized ObjectStore
16/08/27 16:29:02 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/08/27 16:29:02 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/08/27 16:29:03 INFO HiveMetaStore: Added admin role in metastore
16/08/27 16:29:03 INFO HiveMetaStore: Added public role in metastore
16/08/27 16:29:03 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/08/27 16:29:03 INFO HiveMetaStore: 0: get_all_databases
16/08/27 16:29:03 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/08/27 16:29:03 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/08/27 16:29:03 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/08/27 16:29:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:03 INFO SessionState: Created local directory: /tmp/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec_resources
16/08/27 16:29:03 INFO SessionState: Created HDFS directory: /tmp/hive/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec
16/08/27 16:29:03 INFO SessionState: Created local directory: /tmp/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec
16/08/27 16:29:03 INFO SessionState: Created HDFS directory: /tmp/hive/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec/_tmp_space.db
16/08/27 16:29:03 INFO HiveContext: default warehouse location is /user/hive/warehouse
16/08/27 16:29:03 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/08/27 16:29:03 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.5.0.0-817
16/08/27 16:29:03 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.5.0.0-817
16/08/27 16:29:04 INFO metastore: Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
16/08/27 16:29:04 INFO metastore: Connected to metastore.
16/08/27 16:29:04 INFO SessionState: Created local directory: /tmp/83a1e2d3-8c24-4f12-9841-fab259a77514_resources
16/08/27 16:29:04 INFO SessionState: Created HDFS directory: /tmp/hive/root/83a1e2d3-8c24-4f12-9841-fab259a77514
16/08/27 16:29:04 INFO SessionState: Created local directory: /tmp/root/83a1e2d3-8c24-4f12-9841-fab259a77514
16/08/27 16:29:04 INFO SessionState: Created HDFS directory: /tmp/hive/root/83a1e2d3-8c24-4f12-9841-fab259a77514/_tmp_space.db
16/08/27 16:29:04 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
scala> val file = sc.textFile("/tmp/data")
16/08/27 16:29:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 234.8 KB, free 234.8 KB)
16/08/27 16:29:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.1 KB, free 262.9 KB)
16/08/27 16:29:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.2.15:34124 (size: 28.1 KB, free: 143.6 MB)
16/08/27 16:29:20 INFO SparkContext: Created broadcast 0 from textFile at <console>:27
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:27
scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
16/08/27 16:29:35 ERROR GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1889)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2112)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179)
at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:189)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:331)
at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:331)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:323)
at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:330)
at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:29)
at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:34)
at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:36)
at $line19.$read$iwC$iwC$iwC$iwC$iwC.<init>(<console>:38)
at $line19.$read$iwC$iwC$iwC$iwC.<init>(<console>:40)
at $line19.$read$iwC$iwC$iwC.<init>(<console>:42)
at $line19.$read$iwC$iwC.<init>(<console>:44)
at $line19.$read$iwC.<init>(<console>:46)
at $line19.$read.<init>(<console>:48)
at $line19.$read$.<init>(<console>:52)
at $line19.$read$.<clinit>(<console>)
at $line19.$eval$.<init>(<console>:7)
at $line19.$eval$.<clinit>(<console>)
at $line19.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/08/27 16:29:35 ERROR LzoCodec: Cannot load native-lzo without native-hadoop
16/08/27 16:29:35 INFO FileInputFormat: Total input paths to process : 1
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:29
scala>

Please help to fix this issue.

anandi · 08-30-2016

Resolved, using Spark 2.0.0.

Resolution for the spark-submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/ that sets the HDP version:

[root@sandbox conf]# cat java-opts
-Dhdp.version=2.5.0.0-817
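
A minimal shell sketch of creating that file, in case it helps. The write_java_opts helper and its arguments are illustrative names, not part of Spark or HDP, and the version string is the one from this sandbox; other builds will have a different one. Note that it overwrites any existing java-opts file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a java-opts file carrying the HDP version.
# Arguments: <conf dir> <hdp version>
write_java_opts() {
  local conf_dir="$1" hdp_version="$2"
  mkdir -p "$conf_dir"
  # Overwrite the file so that only one -Dhdp.version entry remains.
  printf -- '-Dhdp.version=%s\n' "$hdp_version" > "$conf_dir/java-opts"
}

# Usage on the sandbox described above (paths assumed from this post, run as root):
# write_java_opts /usr/hdp/current/spark2-client/conf 2.5.0.0-817
```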

Spark Submit working example:

[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:02 INFO yarn.Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1472492701409
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
user: root
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:06 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.0.2.15
ApplicationMaster RPC port: 0
queue: default
start time: 1472492701409
final status: UNDEFINED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
user: root
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:09 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:10 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:11 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:12 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:13 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:14 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:15 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:16 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:17 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:18 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:19 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:20 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:21 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:22 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:23 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:24 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:25 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:26 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:27 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:28 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:29 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:30 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:31 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:32 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:33 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:34 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:35 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)
16/08/29 17:45:37 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.0.2.15
ApplicationMaster RPC port: 0
queue: default
start time: 1472492701409
final status: SUCCEEDED
tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
user: root
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b
[root@sandbox spark2-client]#

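Resolution for the spark-shell issue (LZO codec): add the following two lines to your spark-defaults.conf. The first puts the hadoop-lzo jar on the driver classpath; the second adds the Hadoop native library directories to the driver's library path, which is where the JVM looks for the gplcompression library named in the UnsatisfiedLinkError.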

spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
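
The two lines can be added by hand; the sketch below is one possible way to script it so that re-running does not leave duplicate entries. set_spark_property is a hypothetical helper, not a Spark or HDP tool. It assumes the whitespace-separated "key value" layout shown above (not the key=value form), and the spark-defaults.conf location in the usage comment is assumed to be the same conf directory as java-opts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set a key in a Spark properties file exactly once.
# Arguments: <properties file> <key> <value>
set_spark_property() {
  local file="$1" key="$2" value="$3"
  touch "$file"
  # Keep every line whose first field is not this key (comments and
  # blank lines are preserved), then append the new key/value pair.
  awk -v k="$key" '$1 != k' "$file" > "$file.tmp"
  cat "$file.tmp" > "$file"
  rm -f "$file.tmp"
  printf '%s %s\n' "$key" "$value" >> "$file"
}

# Usage with the values from this post (conf path is an assumption):
# conf=/usr/hdp/current/spark2-client/conf/spark-defaults.conf
# set_spark_property "$conf" spark.driver.extraClassPath \
#   /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
# set_spark_property "$conf" spark.driver.extraLibraryPath \
#   /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
```

Restart spark-shell afterwards; sc.getConf.getAll should then list both properties, as in the session below.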

Spark Shell working example:

[root@sandbox spark2-client]# ./bin/spark-shell --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/08/29 17:47:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:47:21 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://10.0.2.15:4041
Spark context available as 'sc' (master = yarn, app id = application_1472397144295_0007).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
scala> sc.getConf.getAll.foreach(println)
(spark.eventLog.enabled,true)
(spark.yarn.scheduler.heartbeat.interval-ms,5000)
(hive.metastore.warehouse.dir,file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse)
(spark.repl.class.outputDir,/tmp/spark-fa16d4d3-8ec8-4b0e-a1da-5a2dffe39d08/repl-5dd28f29-ae03-4965-a535-18a95173b173)
(spark.yarn.am.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
(spark.yarn.containerLauncherMaxThreads,25)
(spark.driver.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
(spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64)
(spark.driver.appUIAddress,http://10.0.2.15:4041)
(spark.driver.host,10.0.2.15)
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0007)
(spark.yarn.preserve.staging.files,false)
(spark.home,/usr/hdp/current/spark2-client)
(spark.app.name,Spark shell)
(spark.repl.class.uri,spark://10.0.2.15:37426/classes)
(spark.ui.port,4041)
(spark.yarn.max.executor.failures,3)
(spark.submit.deployMode,client)
(spark.yarn.executor.memoryOverhead,200)
(spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter)
(spark.driver.extraClassPath,/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar)
(spark.executor.memory,2g)
(spark.yarn.driver.memoryOverhead,200)
(spark.hadoop.yarn.timeline-service.enabled,false)
(spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native)
(spark.app.id,application_1472397144295_0007)
(spark.executor.id,driver)
(spark.yarn.queue,default)
(spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com)
(spark.eventLog.dir,hdfs:///spark-history)
(spark.master,yarn)
(spark.driver.port,37426)
(spark.yarn.submit.file.replication,3)
(spark.sql.catalogImplementation,hive)
(spark.driver.memory,2g)
(spark.jars,)
(spark.executor.cores,1)
scala> val file = sc.textFile("/tmp/data")
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24
scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26
scala> counts.take(10)
res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout,1))
scala>
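
If the shell still logs the GPLNativeCodeLoader error after this change, one thing worth checking is whether the native LZO library (libgplcompression, which is what GPLNativeCodeLoader tries to load) is actually present in the directories listed in spark.driver.extraLibraryPath. The following is only a minimal sketch, assuming a bash shell; find_native_lzo is a hypothetical helper name, not an HDP or Spark tool.

```shell
#!/usr/bin/env bash
# find_native_lzo (hypothetical helper): given a colon-separated library
# path, print every directory in it that contains a libgplcompression* file.
# Returns 0 if at least one directory matched, 1 otherwise.
find_native_lzo() {
  local lib_path="$1" dir found=1
  local IFS=':'
  for dir in $lib_path; do
    # ls fails when the glob matches nothing, so this is a simple existence test.
    if ls "$dir"/libgplcompression* >/dev/null 2>&1; then
      echo "$dir"
      found=0
    fi
  done
  return "$found"
}

# Example call with the path used in this thread:
# find_native_lzo /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
```

If nothing is printed, the path setting alone probably will not help, because there is no native library for the JVM to pick up in those directories.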


anandi · ‎08-30-2016

Resolved for Spark 2.0.0.

Resolution for the spark-submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/ with the following content:

[root@sandbox conf]# cat java-opts
-Dhdp.version=2.5.0.0-817
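
For anyone who prefers to script this step, here is a rough sketch of creating the file from the command line. It assumes bash; write_java_opts is a hypothetical helper name, and the directory and version in the example call are simply the ones from this sandbox, so adjust them for your install.

```shell
#!/usr/bin/env bash
# write_java_opts (hypothetical helper): write a java-opts file holding
# -Dhdp.version=<version> into the given Spark conf directory.
write_java_opts() {
  local conf_dir="$1" hdp_version="$2"
  if [ ! -d "$conf_dir" ]; then
    echo "no such directory: $conf_dir" >&2
    return 1
  fi
  # Overwrites any existing java-opts file in that directory.
  printf '%s\n' "-Dhdp.version=$hdp_version" > "$conf_dir/java-opts"
}

# Example call with the values from this thread (needs write access to the conf dir):
# write_java_opts /usr/hdp/current/spark2-client/conf 2.5.0.0-817
```

Note that the sketch replaces an existing java-opts file rather than appending to it, so check first whether the file already holds other options.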

Spark Submit working example:

[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:02 INFO yarn.Client:
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1472492701409
     final status: UNDEFINED
     tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
     user: root
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:06 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.0.2.15
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1472492701409
     final status: UNDEFINED
     tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
     user: root
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:09 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:10 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:11 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:12 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:13 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:14 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:15 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:16 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:17 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:18 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:19 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:20 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:21 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:22 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:23 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:24 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:25 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:26 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:27 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:28 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:29 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:30 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:31 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:32 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:33 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:34 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:35 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)
16/08/29 17:45:37 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.0.2.15
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1472492701409
     final status: SUCCEEDED
     tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
     user: root
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b
[root@sandbox spark2-client]#
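
The client output is long, and the only line that really matters is the last "final status:" entry. A small filter can pull it out. This is just a sketch assuming bash and awk; final_status is a hypothetical helper name, and the pipeline in the comment assumes the client log goes to stderr, which is the usual default but may differ with a custom log4j setup.

```shell
#!/usr/bin/env bash
# final_status (hypothetical helper): read spark-submit client output on
# stdin and print the value of the last "final status:" line, if any.
final_status() {
  awk -F': *' '
    /^[[:space:]]*final status:/ { status = $2 }   # remember the most recent value
    END { if (status != "") print status }
  '
}

# Example usage, keeping the full log in a file as well:
# ./bin/spark-submit ... 2>&1 | tee submit.log | final_status
```

For the run above this would print SUCCEEDED, since that is the value on the last "final status:" line of the log.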

Resolution for the Spark Shell issue (lzo-codec): add the following two lines to your spark-defaults.conf:

spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
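
If you want to apply the two properties from a script instead of by hand, something along these lines could work. It is a sketch only, assuming bash and awk and a conf file that uses the whitespace-separated "key value" form shown above; set_conf_property is a hypothetical helper name, and the conf file path in the example call is an assumption based on the spark2-client conf directory mentioned earlier.

```shell
#!/usr/bin/env bash
# set_conf_property (hypothetical helper): set "key value" in a
# spark-defaults.conf style file, replacing any existing entry for that key.
# Limitation: entries written as key=value are not recognised as duplicates.
set_conf_property() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  # Keep every line whose first field is not the key (comments and blanks survive).
  if [ -f "$file" ]; then
    awk -v k="$key" '$1 != k' "$file" > "$tmp"
  fi
  printf '%s %s\n' "$key" "$value" >> "$tmp"
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example calls with the values from this thread (conf file path is an assumption):
# conf=/usr/hdp/current/spark2-client/conf/spark-defaults.conf
# set_conf_property "$conf" spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
# set_conf_property "$conf" spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
```

Running it twice leaves a single entry per key, so it is safe to repeat.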

09-01-2016

@anandi Thanks, the word count fix works great! However, I applied the fix by adding the properties in Ambari rather than editing the file, so they won't get overwritten if I make another change at that level. In my opinion, that is preferable to editing the config file directly.

Cloudera Community
