03-15-2015 02:16 AM
i am using spark streaming , event count example , flume as source of avro events , everything works fine when executing spark on local mode , but when i try to run the example on my cluster i got failed to bind error ,
Command line thats working " local mode "
spark-submit --class "WordCount" --master local[*] --jars /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/spark/lib/spark-streaming-flume_2.10-1.2.0.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/avro-ipc.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/flume-ng-sdk-1.5.0-cdh5.3.1.jar /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar node01 6789
Command line that's not working " cluster mode "
spark-submit --class "WordCount" --master spark://node01:7077 --jars /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/spark/lib/spark-streaming-flume_2.10-1.2.0.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/avro-ipc.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/flume-ng-sdk-1.5.0-cdh5.3.1.jar /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar node01 6789
ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - org.jboss.netty.channel.ChannelException: Failed to bind to: /192.168.168.94:6789 at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:106) at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:119) at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:74) at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:68) at org.apache.spark.streaming.flume.FlumeReceiver.initServer(FlumeInputDStream.scala:164) at org.apache.spark.streaming.flume.FlumeReceiver.onStart(FlumeInputDStream.scala:171) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:277) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:269) at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314) at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.net.BindException: Cannot assign requested address at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193) at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366) at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290) at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ... 3 more
03-15-2015 03:29 AM
As you can see, the problem is that the receiver can't bind to its assigned address. Is there any networking-related restriction in place that would prevent this? is this the port you intended?
03-15-2015 03:53 AM
i guess there's no network restrictions , all 3 nodes that i am using are configured to work together normally , but i read that when spark deploy the code on a worker it starts to listen to the port , while another worker is already using listening on this port so it cause the faliure , check this http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-FlumeInputDStream-in-spark-cluster-td...
03-15-2015 03:59 AM
but if there's a network problem , what might it be ? i have 3 nodes 1 master and 2 workers , i can submit any job using master and it appears working on the 2 workers ,, but the network word count isn't working on cluster mode , so lets assume its a network problem , what might it be ?
03-15-2015 08:43 AM
Although that thread sounds similar, I don't think it's the same thing. Failing to bind is not a failure to connect to a remote host. It means the local host didn't allow the process to listen on a port. The two most likely explanations are:
- an old process is still listening on that port, or at least, another still-running process is
- you appear to be binding to a non-routable address (192.168.x.x) This might be OK but worth double-checking
03-15-2015 09:37 AM
i fixed that problem by making the spark listen to node02 and flume send events to node02 , that fixed the problem acutally , thanks so much for your help
11-17-2015 01:38 AM
We actually have the same problem of filed to bind with the internal ip-port 10.1.0.11:50321.
The question is that sometimes connect Spark Streaming with the port and sometimes didn´t it and when the did not connect sometimes in 5 -7 min that Spark Streaming is trying connect.
Do you know what is the possible cause?
We install Spark 1.5.1 on clouderas 5.3.7 (with YARN).