Member since
02-09-2015
95
Posts
8
Kudos Received
9
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5497 | 08-23-2021 04:07 PM
 | 1472 | 06-30-2021 07:34 AM
 | 1769 | 06-30-2021 07:26 AM
 | 14132 | 05-17-2019 10:27 PM
 | 3108 | 04-08-2019 01:00 PM
03-15-2015
03:59 AM
But if it's a network problem, what might it be? I have 3 nodes: 1 master and 2 workers. I can submit any job through the master and it runs on the 2 workers, but the network word count doesn't work in cluster mode. So, assuming it's a network problem, what might it be?
03-15-2015
03:53 AM
I don't think there are any network restrictions; all 3 nodes I am using are configured to work together normally. But I read that when Spark deploys the code on a worker, the receiver starts listening on the port while another worker is already listening on that same port, which causes the failure. See this thread: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-FlumeInputDStream-in-spark-cluster-td1604.html
03-15-2015
02:16 AM
I am using Spark Streaming (the event count example) with Flume as the source of Avro events. Everything works fine when executing Spark in local mode, but when I try to run the example on my cluster I get a "failed to bind" error.

Command line that works (local mode):

```
spark-submit --class "WordCount" --master local[*] --jars /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/spark/lib/spark-streaming-flume_2.10-1.2.0.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/avro-ipc.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/flume-ng-sdk-1.5.0-cdh5.3.1.jar /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar node01 6789
```

Command line that does not work (cluster mode):

```
spark-submit --class "WordCount" --master spark://node01:7077 --jars /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/spark/lib/spark-streaming-flume_2.10-1.2.0.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/avro-ipc.jar,/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/flume-ng/lib/flume-ng-sdk-1.5.0-cdh5.3.1.jar /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar node01 6789
```

Error:

```
ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - org.jboss.netty.channel.ChannelException: Failed to bind to: /192.168.168.94:6789
	at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
	at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:106)
	at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:119)
	at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:74)
	at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:68)
	at org.apache.spark.streaming.flume.FlumeReceiver.initServer(FlumeInputDStream.scala:164)
	at org.apache.spark.streaming.flume.FlumeReceiver.onStart(FlumeInputDStream.scala:171)
	at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
	at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
	at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:277)
	at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$8.apply(ReceiverTracker.scala:269)
	at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
	at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
	at org.apache.spark.scheduler.Task.run(Task.scala:56)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.BindException: Cannot assign requested address
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:444)
	at sun.nio.ch.Net.bind(Net.java:436)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
	at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	... 3 more
```
Labels:
- Apache Flume
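One way to narrow down a "Cannot assign requested address" failure like the one above: the Flume receiver is started on whichever *worker* the receiver task lands on, so the host:port given to the job must be a local address on that worker. A minimal Python sketch (not part of the Spark job; the IP and port are taken from the error above) that can be run on each node to see which one can actually bind the address:

```python
import socket

def can_bind(host, port):
    """Return True if this machine can open a TCP listener on (host, port).

    Binding fails with "Cannot assign requested address" when `host` is
    not a local address of the machine running this check -- which is
    what happens when the Flume receiver is scheduled onto a worker
    that does not own the IP passed on the command line.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((host, port))
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Run this on each worker; only the node that owns 192.168.168.94
    # will print True.
    print(can_bind("192.168.168.94", 6789))
```

If only one node can bind the address, the receiver must be pinned to (or the stream created on) that node, or a different integration approach used.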
03-12-2015
09:27 AM
Thanks so much!!
03-12-2015
01:37 AM
I am working on a cluster of 1 master and 2 workers. When I open the Spark master UI it shows "Memory: 128.0 MB Total", with each worker at "64.0 MB (0.0 B Used)", while each worker actually has 4 GB of memory in total and 2.5 GB free. I want to increase the worker memory from 64 MB to 1 GB. I tried to find spark-defaults.conf in /etc/spark/conf.cloudera.spark, but it doesn't exist. I am running Cloudera Manager 5.0.2, and I guess there might be a setting for this under Cloudera Manager => Spark service, but I can't find it. Thanks in advance.
Labels:
- Apache Spark
- Cloudera Manager
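For reference, in a plain Spark standalone deployment the worker memory shown in the master UI is controlled by `SPARK_WORKER_MEMORY` in spark-env.sh. A sketch (under Cloudera Manager this would typically go into the Spark service's spark-env.sh safety-valve / advanced configuration snippet; the exact field name varies by CM version, so that part is an assumption):

```
# spark-env.sh -- restart the Spark workers after changing this
export SPARK_WORKER_MEMORY=1g   # total memory this worker offers to executors
```

The 64 MB shown in the UI is the worker's advertised capacity, not the machine's physical memory, which is why it can sit far below the 4 GB actually installed.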
03-11-2015
07:19 AM
That solved my problem; the version that ships with Cloudera isn't built with Hive support.
03-11-2015
06:52 AM
Yes, I had to rebuild Spark to be compatible with Hive: http://spark.apache.org/docs/1.2.0/building-spark.html (see the section "Building With Hive and JDBC Support").
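For anyone following along, the build command from that section of the Spark 1.2 docs looks like this (the Hadoop version shown is the docs' example value; match it to your actual CDH Hadoop version):

```
mvn -Pyarn -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package
```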
03-11-2015
03:19 AM
I tried it but it didn't work. I got this fixed by copying all the files in /var/lib/alternatives from a working VM to the VM that had the problem, then restarting the agent. Thanks so much!
03-11-2015
01:04 AM
I am having the same problem, but when I opened /var/lib/alternatives I found the hadoop file and most of the other files empty, with zero size!
03-10-2015
06:10 AM
Hey smark, I am having the same problem as Matthew, but his fix didn't work for me. When I tried `service cloudera-scm-server-db start`, the output was:

```
Creating DB navms for role NAVIGATORMETASERVER
waiting for server to start.... done
server started
psql: could not connect to server: Connection timed out
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 7432?
waiting for server to shut down..... done
server stopped
Unable to create database role navms, giving up
waiting for server to start..... done
server started
```

Its status:

```
pg_ctl: server is running (PID: 3808)
/usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data"
```

But when I start the server with `service cloudera-scm-server start`, it says running [OK], yet when I check the status a while later it is dead:

```
service cloudera-scm-server status
cloudera-scm-server dead but pid file exists
```

My log error is the same as the one above.
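The symptoms above (pg_ctl says the server is running, but psql times out on port 7432) suggest the embedded PostgreSQL process is up but not accepting TCP connections. A minimal Python sketch to check reachability of the port from the Cloudera Manager host, independent of psql (the host and port come from the output above):

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # 7432 is the port the embedded Cloudera Manager PostgreSQL listens on
    print("postgres reachable:", can_connect("localhost", 7432))
```

If this prints False while pg_ctl reports the server running, the usual suspects are the postgres `listen_addresses` / pg_hba.conf settings under /var/lib/cloudera-scm-server-db/data, or a firewall on the loopback port.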