Created on 04-12-2017 08:23 AM - edited 09-16-2022 04:27 AM
Hi,
I recently deployed spark on YARN in CDH 5.10.0
When i launch my script, i get this kind of Warning :
17/04/12 16:22:25 WARN net.ScriptBasedMapping: Exception running /etc/spark/conf.cloudera.spark_on_yarn/yarn-conf/topology.py <IP_NodeManager> java.io.IOException: Cannot run program "/etc/spark/conf.cloudera.spark_on_yarn/yarn-conf/topology.py" (in directory "/yarn/nm/usercache/root/appcache/application_1492005737104_0001/container_1492005737104_0001_01_000001"): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) at org.apache.hadoop.util.Shell.runCommand(Shell.java:548) at org.apache.hadoop.util.Shell.run(Shell.java:504) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251) at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188) at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119) at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101) at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:81) at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:38) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$1.apply(TaskSchedulerImpl.scala:310) at org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$resourceOffers$1.apply(TaskSchedulerImpl.scala:299) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at org.apache.spark.scheduler.TaskSchedulerImpl.resourceOffers(TaskSchedulerImpl.scala:299) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.org$apache$spark$scheduler$cluster$CoarseGrainedSchedulerBackend$DriverEndpoint$$makeOffers(CoarseGrainedSchedulerBackend.scala:207) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$receive$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:126) at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116) at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204) at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100) at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:217) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:247) at java.lang.ProcessImpl.start(ProcessImpl.java:134) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ... 25 more
Did someone already have this issue please?
Thanks!
Francis
Created 04-27-2017 05:56 AM
MEA CULPA!!!
Not HDFS Gateway but Spark Gateway on node where i wanted to execute spark shell.
By adding this role instance, the node will receive the client configurations 🙂
Created 04-25-2017 04:35 AM
Help please...
Created 04-25-2017 08:02 AM
Solved!
I added a HDFS Gateway for Host that execute Spark-shell 🙂
Created 04-27-2017 05:56 AM
MEA CULPA!!!
Not HDFS Gateway but Spark Gateway on node where i wanted to execute spark shell.
By adding this role instance, the node will receive the client configurations 🙂
Created 04-27-2017 11:28 PM
This is a bug that fixed at 5.10.1
Created 05-02-2017 02:26 AM
Thanks Fawze!
Created 05-11-2017 12:48 PM
What is the bug? I don't see it here when searching under Spark.
Created 05-11-2017 01:20 PM