10-02-2014
12:56 PM
Investigating the nodes, it seems there were some /usr/lib/xxx folders left over (hadoop, hive, etc.). I removed them, along with a couple of leftover conf folders. Finally, I performed a full cluster restart and a clean_restart of the cloudera-scm-agents. Still seeing the same issue. 😞
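For anyone following along, the sweep was roughly the following (a sketch only -- the globs cover just the services I thought to check, and it assumes passwordless SSH between the nodes):

# Run from any node: look for package-era leftovers under /usr/lib and
# stale config directories that could shadow the parcel install.
for h in clouderamain.cluster.local clouderahost1.cluster.local clouderahost2.cluster.local; do
  echo "== $h =="
  ssh "$h" 'ls -d /usr/lib/hadoop* /usr/lib/hive* /usr/lib/spark* \
                  /etc/hadoop/conf* /etc/spark/conf* 2>/dev/null'
done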
10-02-2014
11:07 AM
Yes, I'm aware that the Spark (standalone) service isn't actually used with YARN, but CM doesn't let you add Gateways without at least one Master and one Worker. 😕 I have made sure those services are not running. I did some spot checking and everything appears to be in place -- nothing really jumps out. The only other thing I can think of is that this cluster was (long ago) installed from packages instead of parcels, on CDH 4. Maybe there's something leftover.
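If something is left over from the package install, one place it should surface is the alternatives system -- a rough spot-check (alternative names from memory; they may differ by CDH version):

# On each node: do hadoop-conf/spark-conf still point at package-era
# directories instead of the CM-deployed conf.cloudera.* ones?
alternatives --display hadoop-conf
alternatives --display spark-conf 2>/dev/null
# And confirm the hadoop on PATH resolves into the parcel, not /usr/lib:
readlink -f "$(command -v hadoop)"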
10-02-2014
10:15 AM
Thanks for the quick reply, Sean. I do not have any modifications to the Spark configuration, and I have removed and re-added the Spark service in CM to make sure of this. All binaries are directly from CDH. I have been able to run this successfully on other clusters; it's just this particular cluster where we're having the issue. I don't know that it matters, but here's my cluster layout (this is a lab/test cluster, so it's only 3 nodes):

clouderahost1.cluster.local (32 roles): Accumulo 1.6 Garbage Collector, Accumulo 1.6 Master, Accumulo 1.6 Monitor, Accumulo 1.6 Tracer, Accumulo 1.6 Tablet Server, Flume Agent, HBase Master, HBase RegionServer, HDFS DataNode, HDFS Failover Controller, HDFS HttpFS, HDFS JournalNode, HDFS NameNode, Hive Gateway, Hive Metastore Server, HiveServer2, Hue Server, Hue Kerberos Ticket Renewer, Impala Daemon, Impala StateStore, Cloudera Management Service Navigator Audit Server, Cloudera Management Service Navigator Metadata Server, Cloudera Management Service Reports Manager, Oozie Server, Sentry Server, Solr Server, Spark Worker, Sqoop 1 Client Gateway, YARN (MR2 Included) JobHistory Server, YARN (MR2 Included) NodeManager, YARN (MR2 Included) ResourceManager, ZooKeeper Server

clouderahost2.cluster.local (16 roles): Accumulo 1.6 Tablet Server, Flume Agent, HBase Thrift Server, HBase RegionServer, HDFS DataNode, HDFS Failover Controller, HDFS JournalNode, HDFS NameNode, Hive Gateway, Impala Catalog Server, Impala Daemon, Solr Server, Spark Worker, Sqoop 2 Server, YARN (MR2 Included) NodeManager, ZooKeeper Server

clouderamain.cluster.local (18 roles): Accumulo 1.6 Tablet Server, Flume Agent, HBase RegionServer, HDFS DataNode, HDFS JournalNode, Hive Gateway, Impala Daemon, Cloudera Management Service Activity Monitor, Cloudera Management Service Alert Publisher, Cloudera Management Service Event Server, Cloudera Management Service Host Monitor, Cloudera Management Service Service Monitor, Spark Gateway, Spark Master, Spark Worker, Sqoop 1 Client Gateway, YARN (MR2 Included) NodeManager, ZooKeeper Server
10-02-2014
10:05 AM
Hi all,

We are running Spark on a Kerberized CDH 5.1.3 cluster managed by CM 5.1.3. We are not able to execute some simple spark-shell commands:

[root@clouderamain ~]# source /etc/spark/conf/spark-env.sh
[root@clouderamain ~]# export SPARK_PRINT_LAUNCH_COMMAND=1
[root@clouderamain ~]# spark-shell --verbose --master yarn-client
scala> sc.setLocalProperty("yarn.nodemanager.delete.debug-delay-sec", "36000")
scala> val textFile = sc.textFile("salaries.java")
scala> textFile.count()

We get the following error upon execution:

WARN TaskSetManager: Loss was due to java.lang.NoClassDefFoundError
java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf

The SparkPi example runs without issue. Here are the full logs from our failing job:

[root@clouderamain lib]# spark-shell --verbose --master yarn-client
Spark Command: /usr/java/default/bin/java -cp ::/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/conf:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/assembly/lib/*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/examples/lib/*:/etc/hadoop/conf:/etc/hadoop/conf:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/bin/../lib/hadoop/../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/bin/../lib/hadoop/../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/bin/../lib/hadoop/../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/bin/../lib/hadoop/../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/bin/../lib/hadoop/../hadoop-yarn/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/lib/scala-library.jar:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/lib/scala-compiler.jar:/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/lib/jline.jar -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell --verbose --master yarn-client --class org.apache.spark.repl.Main
========================================
Using properties file: /opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/conf/spark-defaults.conf
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.eventLog.dir=/user/spark/applicationHistory
Adding default property: spark.master=spark://clouderamain.cluster.local:7077
Using properties file: /opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/conf/spark-defaults.conf
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.eventLog.dir=/user/spark/applicationHistory
Adding default property: spark.master=spark://clouderamain.cluster.local:7077
Parsed arguments:
master yarn-client
deployMode null
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile /opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/conf/spark-defaults.conf
driverMemory null
driverCores null
driverExtraClassPath null
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files null
pyFiles null
archives null
mainClass org.apache.spark.repl.Main
primaryResource spark-shell
name org.apache.spark.repl.Main
childArgs []
jars null
verbose true
Default properties from /opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/conf/spark-defaults.conf:
spark.eventLog.enabled -> true
spark.eventLog.dir -> /user/spark/applicationHistory
spark.master -> spark://clouderamain.cluster.local:7077
Using properties file: /opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/conf/spark-defaults.conf
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.eventLog.dir=/user/spark/applicationHistory
Adding default property: spark.master=spark://clouderamain.cluster.local:7077
Main class:
org.apache.spark.repl.Main
Arguments:
System properties:
spark.eventLog.enabled -> true
SPARK_SUBMIT -> true
spark.app.name -> org.apache.spark.repl.Main
spark.jars ->
spark.eventLog.dir -> /user/spark/applicationHistory
spark.master -> yarn-client
Classpath elements:
14/10/02 12:21:11 INFO SecurityManager: Changing view acls to: root
14/10/02 12:21:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/10/02 12:21:11 INFO HttpServer: Starting HTTP Server
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.0.0
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_55)
Type in expressions to have them evaluated.
Type :help for more information.
14/10/02 12:21:17 INFO SecurityManager: Changing view acls to: root
14/10/02 12:21:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/10/02 12:21:18 INFO Slf4jLogger: Slf4jLogger started
14/10/02 12:21:18 INFO Remoting: Starting remoting
14/10/02 12:21:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@clouderamain.cluster.local:40136]
14/10/02 12:21:18 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@clouderamain.cluster.local:40136]
14/10/02 12:21:18 INFO SparkEnv: Registering MapOutputTracker
14/10/02 12:21:18 INFO SparkEnv: Registering BlockManagerMaster
14/10/02 12:21:18 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20141002122118-5a5c
14/10/02 12:21:18 INFO MemoryStore: MemoryStore started with capacity 294.9 MB.
14/10/02 12:21:18 INFO ConnectionManager: Bound socket to port 44633 with id = ConnectionManagerId(clouderamain.cluster.local,44633)
14/10/02 12:21:18 INFO BlockManagerMaster: Trying to register BlockManager
14/10/02 12:21:18 INFO BlockManagerInfo: Registering block manager clouderamain.cluster.local:44633 with 294.9 MB RAM
14/10/02 12:21:18 INFO BlockManagerMaster: Registered BlockManager
14/10/02 12:21:19 INFO HttpServer: Starting HTTP Server
14/10/02 12:21:19 INFO HttpBroadcast: Broadcast server started at http://111.111.168.96:49465
14/10/02 12:21:19 INFO HttpFileServer: HTTP File server directory is /tmp/spark-468b7112-def1-42f7-ba3d-166cf09f919c
14/10/02 12:21:19 INFO HttpServer: Starting HTTP Server
14/10/02 12:21:19 INFO SparkUI: Started SparkUI at http://clouderamain.cluster.local:4040
14/10/02 12:21:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/10/02 12:21:23 INFO EventLoggingListener: Logging events to /user/spark/applicationHistory/spark-shell-1412266881060
--args is deprecated. Use --arg instead.
14/10/02 12:21:24 INFO RMProxy: Connecting to ResourceManager at clouderahost1.cluster.local/111.111.168.97:8032
14/10/02 12:21:24 INFO Client: Got Cluster metric info from ApplicationsManager (ASM), number of NodeManagers: 3
14/10/02 12:21:24 INFO Client: Queue info ... queueName: root.default, queueCurrentCapacity: 0.0, queueMaxCapacity: -1.0,
queueApplicationCount = 0, queueChildQueueCount = 0
14/10/02 12:21:24 INFO Client: Max mem capabililty of a single resource in this cluster 4096
14/10/02 12:21:24 INFO Client: Preparing Local resources
14/10/02 12:21:24 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 7996 for cloudera on ha-hdfs:nameservice1
14/10/02 12:21:24 INFO Client: Uploading hdfs://nameservice1:8020/user/spark/share/lib/spark-assembly.jar to hdfs://nameservice1/user/cloudera/.sparkStaging/application_1412014410679_0011/spark-assembly.jar
14/10/02 12:21:31 INFO Client: Setting up the launch environment
14/10/02 12:21:31 INFO Client: Setting up container launch context
14/10/02 12:21:32 INFO Client: Command for starting the Spark ApplicationMaster: List($JAVA_HOME/bin/java, -server, -Xmx512m, -Djava.io.tmpdir=$PWD/tmp, -Dspark.tachyonStore.folderName=\"spark-edeaaf88-58ab-4e56-b072-e63837686234\", -Dspark.eventLog.enabled=\"true\", -Dspark.yarn.secondary.jars=\"\", -Dspark.home=\"/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark\", -Dspark.repl.class.uri=\"http://111.111.168.96:58885\", -Dspark.driver.host=\"clouderamain.cluster.local\", -Dspark.driver.appUIHistoryAddress=\"\", -Dspark.app.name=\"Spark shell\", -Dspark.jars=\"\", -Dspark.fileserver.uri=\"http://111.111.168.96:33361\", -Dspark.eventLog.dir=\"/user/spark/applicationHistory\", -Dspark.master=\"yarn-client\", -Dspark.driver.port=\"40136\", -Dspark.httpBroadcast.uri=\"http://111.111.168.96:49465\", -Dlog4j.configuration=log4j-spark-container.properties, org.apache.spark.deploy.yarn.ExecutorLauncher, --class, notused, --jar , null, --args 'clouderamain.cluster.local:40136' , --executor-memory, 1024, --executor-cores, 1, --num-executors , 2, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/10/02 12:21:32 INFO Client: Submitting application to ASM
14/10/02 12:21:32 INFO YarnClientImpl: Submitted application application_1412014410679_0011
14/10/02 12:21:32 INFO YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1412266892078
yarnAppState: ACCEPTED
[... the same report, with yarnAppState: ACCEPTED, repeated once per second through 14/10/02 12:21:52 ...]
14/10/02 12:21:53 INFO YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: 0
appStartTime: 1412266892078
yarnAppState: RUNNING
14/10/02 12:21:55 INFO YarnClientClusterScheduler: YarnClientClusterScheduler.postStartHook done
14/10/02 12:21:55 INFO SparkILoop: Created spark context..
Spark context available as sc.
scala> 14/10/02 12:22:17 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@clouderahost1.cluster.local:56255/user/Executor#-475572516] with ID 2
14/10/02 12:22:18 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@clouderamain.cluster.local:51361/user/Executor#378528362] with ID 1
14/10/02 12:22:18 INFO BlockManagerInfo: Registering block manager clouderahost1.cluster.local:58034 with 589.2 MB RAM
14/10/02 12:22:18 INFO BlockManagerInfo: Registering block manager clouderamain.cluster.local:58125 with 589.2 MB RAM
scala> sc.setLocalProperty("yarn.nodemanager.delete.debug-delay-sec", "36000")
scala> val textFile = sc.textFile("salaries.java")
14/10/02 12:24:17 INFO MemoryStore: ensureFreeSpace(240695) called with curMem=0, maxMem=309225062
14/10/02 12:24:17 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 235.1 KB, free 294.7 MB)
textFile: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12
scala> textFile.count()
14/10/02 12:24:19 INFO FileInputFormat: Total input paths to process : 1
14/10/02 12:24:19 INFO SparkContext: Starting job: count at <console>:15
14/10/02 12:24:19 INFO DAGScheduler: Got job 0 (count at <console>:15) with 2 output partitions (allowLocal=false)
14/10/02 12:24:19 INFO DAGScheduler: Final stage: Stage 0(count at <console>:15)
14/10/02 12:24:19 INFO DAGScheduler: Parents of final stage: List()
14/10/02 12:24:19 INFO DAGScheduler: Missing parents: List()
14/10/02 12:24:19 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at textFile at <console>:12), which has no missing parents
14/10/02 12:24:19 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at textFile at <console>:12)
14/10/02 12:24:19 INFO YarnClientClusterScheduler: Adding task set 0.0 with 2 tasks
14/10/02 12:24:20 INFO RackResolver: Resolved clouderamain.cluster.local to /default
14/10/02 12:24:20 INFO RackResolver: Resolved clouderahost1.cluster.local to /default
14/10/02 12:24:20 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor 1: clouderamain.cluster.local (NODE_LOCAL)
14/10/02 12:24:20 INFO TaskSetManager: Serialized task 0.0:0 as 1711 bytes in 4 ms
14/10/02 12:24:20 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on executor 2: clouderahost1.cluster.local (NODE_LOCAL)
14/10/02 12:24:20 INFO TaskSetManager: Serialized task 0.0:1 as 1711 bytes in 0 ms
14/10/02 12:24:20 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
14/10/02 12:24:20 WARN TaskSetManager: Loss was due to java.lang.NoClassDefFoundError
java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2397)
at java.lang.Class.getDeclaredField(Class.java:1946)
at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61)
at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141)
at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
14/10/02 12:24:20 INFO TaskSetManager: Starting task 0.0:1 as TID 2 on executor 1: clouderamain.cluster.local (NODE_LOCAL)
14/10/02 12:24:20 INFO TaskSetManager: Serialized task 0.0:1 as 1711 bytes in 1 ms
14/10/02 12:24:20 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
14/10/02 12:24:20 INFO TaskSetManager: Loss was due to java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf [duplicate 1]
14/10/02 12:24:20 INFO TaskSetManager: Starting task 0.0:0 as TID 3 on executor 2: clouderahost1.cluster.local (NODE_LOCAL)
14/10/02 12:24:20 INFO TaskSetManager: Serialized task 0.0:0 as 1711 bytes in 1 ms
14/10/02 12:24:21 INFO YarnClientSchedulerBackend: Executor 1 disconnected, so removing it
14/10/02 12:24:21 ERROR YarnClientClusterScheduler: Lost executor 1 on clouderamain.cluster.local: remote Akka client disassociated
14/10/02 12:24:21 INFO TaskSetManager: Re-queueing tasks for 1 from TaskSet 0.0
14/10/02 12:24:21 WARN TaskSetManager: Lost TID 2 (task 0.0:1)
14/10/02 12:24:21 INFO YarnClientSchedulerBackend: Executor 2 disconnected, so removing it
14/10/02 12:24:21 ERROR YarnClientClusterScheduler: Lost executor 2 on clouderahost1.cluster.local: remote Akka client disassociated
14/10/02 12:24:21 INFO TaskSetManager: Re-queueing tasks for 2 from TaskSet 0.0
14/10/02 12:24:21 WARN TaskSetManager: Lost TID 3 (task 0.0:0)
14/10/02 12:24:21 INFO DAGScheduler: Executor lost: 1 (epoch 0)
14/10/02 12:24:21 INFO BlockManagerMasterActor: Trying to remove executor 1 from BlockManagerMaster.
14/10/02 12:24:21 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
14/10/02 12:24:21 INFO DAGScheduler: Executor lost: 2 (epoch 1)
14/10/02 12:24:21 INFO BlockManagerMasterActor: Trying to remove executor 2 from BlockManagerMaster.
14/10/02 12:24:21 INFO BlockManagerMaster: Removed 2 successfully in removeExecutor
14/10/02 12:24:38 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@clouderahost1.cluster.local:60965/user/Executor#560179254] with ID 4
14/10/02 12:24:38 INFO TaskSetManager: Starting task 0.0:0 as TID 4 on executor 4: clouderahost1.cluster.local (PROCESS_LOCAL)
14/10/02 12:24:38 INFO TaskSetManager: Serialized task 0.0:0 as 1711 bytes in 0 ms
14/10/02 12:24:38 INFO BlockManagerInfo: Registering block manager clouderahost1.cluster.local:47072 with 589.2 MB RAM
14/10/02 12:24:39 INFO TaskSetManager: Starting task 0.0:1 as TID 5 on executor 4: clouderahost1.cluster.local (PROCESS_LOCAL)
14/10/02 12:24:39 INFO TaskSetManager: Serialized task 0.0:1 as 1711 bytes in 0 ms
14/10/02 12:24:39 WARN TaskSetManager: Lost TID 4 (task 0.0:0)
14/10/02 12:24:39 WARN TaskSetManager: Loss was due to java.lang.NoClassDefFoundError
java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf
        [... identical stack trace to the one above ...]
14/10/02 12:24:39 INFO YarnClientSchedulerBackend: Executor 4 disconnected, so removing it
14/10/02 12:24:39 ERROR YarnClientClusterScheduler: Lost executor 4 on clouderahost1.cluster.local: remote Akka client disassociated
14/10/02 12:24:39 INFO TaskSetManager: Re-queueing tasks for 4 from TaskSet 0.0
14/10/02 12:24:39 WARN TaskSetManager: Lost TID 5 (task 0.0:1)
14/10/02 12:24:39 INFO DAGScheduler: Executor lost: 4 (epoch 2)
14/10/02 12:24:39 INFO BlockManagerMasterActor: Trying to remove executor 4 from BlockManagerMaster.
14/10/02 12:24:39 INFO BlockManagerMaster: Removed 4 successfully in removeExecutor
14/10/02 12:24:39 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@clouderamain.cluster.local:41804/user/Executor#-1065411697] with ID 3
14/10/02 12:24:39 INFO TaskSetManager: Starting task 0.0:1 as TID 6 on executor 3: clouderamain.cluster.local (PROCESS_LOCAL)
14/10/02 12:24:39 INFO TaskSetManager: Serialized task 0.0:1 as 1711 bytes in 0 ms
14/10/02 12:24:39 INFO BlockManagerInfo: Registering block manager clouderamain.cluster.local:41099 with 589.2 MB RAM
14/10/02 12:24:40 INFO TaskSetManager: Starting task 0.0:0 as TID 7 on executor 3: clouderamain.cluster.local (PROCESS_LOCAL)
14/10/02 12:24:40 INFO TaskSetManager: Serialized task 0.0:0 as 1711 bytes in 0 ms
14/10/02 12:24:40 WARN TaskSetManager: Lost TID 6 (task 0.0:1)
14/10/02 12:24:40 INFO TaskSetManager: Loss was due to java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf [duplicate 1]
14/10/02 12:24:40 ERROR TaskSetManager: Task 0.0:1 failed 4 times; aborting job
14/10/02 12:24:40 INFO TaskSetManager: Loss was due to java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf [duplicate 2]
14/10/02 12:24:40 INFO YarnClientClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/10/02 12:24:40 INFO DAGScheduler: Failed to run count at <console>:15
14/10/02 12:24:40 INFO YarnClientClusterScheduler: Cancelling stage 0
14/10/02 12:24:41 INFO YarnClientSchedulerBackend: Executor 3 disconnected, so removing it
14/10/02 12:24:41 ERROR YarnClientClusterScheduler: Lost executor 3 on clouderamain.cluster.local: remote Akka client disassociated
14/10/02 12:24:41 INFO DAGScheduler: Executor lost: 3 (epoch 3)
14/10/02 12:24:41 INFO BlockManagerMasterActor: Trying to remove executor 3 from BlockManagerMaster.
14/10/02 12:24:41 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:1 failed 4 times, most recent failure: Exception failure in TID 6 on host clouderamain.cluster.local: java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf
[... same stack trace as above ...]
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
scala> 14/10/02 12:24:55 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@clouderahost1.cluster.local:59671/user/Executor#-585651425] with ID 6
14/10/02 12:24:55 INFO BlockManagerInfo: Registering block manager clouderahost1.cluster.local:40622 with 589.2 MB RAM
14/10/02 12:24:58 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@clouderamain.cluster.local:44204/user/Executor#907271125] with ID 5
14/10/02 12:24:58 INFO BlockManagerInfo: Registering block manager clouderamain.cluster.local:37710 with 589.2 MB RAM

And here's the launch information from the container executor:

#!/bin/bash
export SPARK_YARN_MODE="true"
export SPARK_YARN_STAGING_DIR=".sparkStaging/application_1412014410679_0011/"
export SPARK_YARN_CACHE_FILES_VISIBILITIES="PRIVATE"
export JAVA_HOME="/usr/java/jdk1.7.0_55-cloudera"
export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=^M
"
export HADOOP_YARN_HOME="/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/hadoop-yarn"
export NM_HOST="clouderahost1.cluster.local"
export JVM_PID="$$"
export SPARK_USER="cloudera"
export SPARK_YARN_CACHE_FILES_TIME_STAMPS="1412266890783"
export PWD="/yarn/nm/usercache/cloudera/appcache/application_1412014410679_0011/container_1412014410679_0011_01_000011"
export NM_PORT="8041"
export LOGNAME="cloudera"
export MALLOC_ARENA_MAX="4"
export LOG_DIRS="/var/log/hadoop-yarn/container/application_1412014410679_0011/container_1412014410679_0011_01_000011"
export SPARK_YARN_CACHE_FILES_FILE_SIZES="93542713"
export NM_HTTP_PORT="8042"
export LOCAL_DIRS="/yarn/nm/usercache/cloudera/appcache/application_1412014410679_0011"
export SPARK_YARN_CACHE_FILES="hdfs://nameservice1/user/cloudera/.sparkStaging/application_1412014410679_0011/spark-assembly.jar#__spark__.jar"
export HADOOP_COMMON_HOME="/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/hadoop"
export HADOOP_TOKEN_FILE_LOCATION="/yarn/nm/usercache/cloudera/appcache/application_1412014410679_0011/container_1412014410679_0011_01_000011/container_tokens"
export CLASSPATH="$PWD/__spark__.jar:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:$PWD/__app__.jar:$PWD/:$PWD:$PWD/*"
export USER="cloudera"
export HADOOP_HDFS_HOME="/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/hadoop-hdfs"
export CONTAINER_ID="container_1412014410679_0011_01_000011"
export HOME="/home/"
export HADOOP_CONF_DIR="/var/run/cloudera-scm-agent/process/8994-yarn-NODEMANAGER"
ln -sf "/yarn/nm/usercache/cloudera/filecache/20/spark-assembly.jar" "__spark__.jar"
exec /bin/bash -c "$JAVA_HOME/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=log4j-spark-container.properties org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://spark@clouderamain.cluster.local:40136/user/CoarseGrainedScheduler 6 clouderahost1.cluster.local 1 1> /var/log/hadoop-yarn/container/application_1412014410679_0011/container_1412014410679_0011_01_000011/stdout 2> /var/log/hadoop-yarn/container/application_1412014410679_0011/container_1412014410679_0011_01_000011/stderr"

Any help would be appreciated,
Brian
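P.S. In case it helps with diagnosis: here's roughly how I've been checking whether the missing class is even present in the jars the container could see. A sketch only -- the parcel path matches our cluster above, but the jar globs are guesses; adjust for your layout.

# Look for org/apache/hadoop/mapred/JobConf.class in candidate jars.
CDH=/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12
for jar in "$CDH"/lib/hadoop/*.jar \
           "$CDH"/lib/hadoop/lib/*.jar \
           "$CDH"/lib/hadoop-mapreduce/*.jar \
           "$CDH"/lib/spark/assembly/lib/*.jar; do
  [ -e "$jar" ] || continue
  if unzip -l "$jar" 2>/dev/null | grep -q 'org/apache/hadoop/mapred/JobConf.class'; then
    echo "found in: $jar"
  fi
done
# One thing that stands out (may or may not matter): the container CLASSPATH
# above references $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*, but the
# launch script never exports HADOOP_MAPRED_HOME, so that entry expands
# against whatever (if anything) the NodeManager environment provides.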
Labels: Apache Spark