
Connect HDP 2.4 Spark remotely failed

SOLVED

New Contributor

I have a Hortonworks Sandbox 2.4 with Spark 1.6 set up. I then created an IntelliJ Spark development environment on Windows using the HDP Spark JAR and Scala 2.10.5, so both the Spark and Scala versions match between my Windows and HDP environments, as indicated here. My IntelliJ dev environment works with local as master. Now I'm trying to connect to HDP from Windows using the code below:

import org.apache.spark.SparkConf

// Point the driver at the sandbox's standalone master
val sparkConf = new SparkConf()
  .setAppName("spark-word-count")
  .setMaster("spark://10.33.241.160:7077")

I get the error below and have no clue how to resolve it. Please help!

16/03/21 16:27:40 INFO SparkUI: Started SparkUI at http://10.33.240.126:4040
16/03/21 16:27:40 INFO AppClient$ClientEndpoint: Connecting to master spark://10.33.241.160:7077...
16/03/21 16:27:41 WARN AppClient$ClientEndpoint: Failed to connect to master 10.33.241.160:7077
java.io.IOException: Failed to connect to /10.33.241.160:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.ConnectException: Connection refused: no further information: /10.33.241.160:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
16/03/21 16:28:40 ERROR MapOutputTrackerMaster: Error communicating with MapOutputTracker
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1325)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
    at org.apache.spark.MapOutputTracker.askTracker(MapOutputTracker.scala:110)
    at org.apache.spark.MapOutputTracker.sendTracker(MapOutputTracker.scala:120)
    at org.apache.spark.MapOutputTrackerMaster.stop(MapOutputTracker.scala:462)
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:93)
    at org.apache.spark.SparkContext$$anonfun$stop$12.apply$mcV$sp(SparkContext.scala:1756)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1755)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.dead(SparkDeploySchedulerBackend.scala:127)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.markDead(AppClient.scala:264)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:134)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1163)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:129)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
ACCEPTED SOLUTION

Re: Connect HDP 2.4 Spark remotely failed

New Contributor

I finally figured it out myself. I needed to set up the Spark on my Hortonworks sandbox as the master server and have my IntelliJ dev environment connect to it as a client. Just run ./sbin/start-master.sh on the HDP sandbox, as described in this link.
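
For anyone who lands here later, a minimal end-to-end sketch of that setup, assuming the master started above is listening on the address from the original post (the RemoteWordCount object and the toy word count are illustrative, not from this thread):

import org.apache.spark.{SparkConf, SparkContext}

// Minimal driver that connects to the standalone master started with
// ./sbin/start-master.sh on the sandbox; substitute your own address.
object RemoteWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("spark-word-count")
      .setMaster("spark://10.33.241.160:7077")
    val sc = new SparkContext(conf)

    // Trivial job to confirm the cluster actually accepts work.
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.collect().foreach(println)

    sc.stop()
  }
}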

11 REPLIES

Re: Connect HDP 2.4 Spark remotely failed

@Theo Zhu

You most likely need to forward port 7077 on your virtual machine. Go to the port forwarding section of your VM's network settings to add the port.

Re: Connect HDP 2.4 Spark remotely failed

New Contributor

@azeltov

I was using VMware Player 6.0 with a bridged network. To set up port forwarding I switched to a NAT network and added the entry "7077 = 192.168.159.129:7077" to vmnetnat.conf, but that doesn't seem to work. How should I do the port forwarding?

Re: Connect HDP 2.4 Spark remotely failed

First check whether Spark is running in your HDP environment, then check whether you can reach the VM's port 7077 from your dev environment with "telnet ip 7077"; a firewall or network setting is probably the culprit.

Also, it's better to check the Spark UI, find the master URL shown there, and make sure you use exactly the same URL when connecting, i.e. spark://hostname:7077.
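
If telnet is not available on Windows, the same reachability check can be sketched in Scala (host and port are the ones from this thread; adjust to your sandbox):

import java.net.{InetSocketAddress, Socket}

// Equivalent of "telnet ip 7077": try to open a TCP connection to the
// master port with a 5 second timeout.
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("10.33.241.160", 7077), 5000)
  println("port 7077 is reachable")
} catch {
  case e: java.io.IOException => println("connection failed: " + e.getMessage)
} finally {
  socket.close()
}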

Re: Connect HDP 2.4 Spark remotely failed

New Contributor

@Jitendra Yadav

I cannot telnet myhostname 7077 using PuTTY, while ssh myhostname 22 works. It doesn't seem like a firewall issue, since I installed VMware Player on my own desktop. As for the Spark UI, I can reach the Spark History Server at myhostname:18080, but attempts to connect to myhostname:7077 always get refused. What should I do?

Re: Connect HDP 2.4 Spark remotely failed

@Theo Zhu what's wrong with my answer? :)

Re: Connect HDP 2.4 Spark remotely failed

New Contributor

I finally figured it out myself. I needed to set up the Spark on my Hortonworks sandbox as the master server and have my IntelliJ dev environment connect to it as a client. Just run ./sbin/start-master.sh on the HDP sandbox, as described in this link.

Re: Connect HDP 2.4 Spark remotely failed

With Spark on HDP you will have to submit Spark jobs on YARN. Check out http://spark.apache.org/docs/1.6.1/running-on-yarn.html. This might work: .setMaster("yarn-client"), provided the HADOOP_CONF_DIR and YARN_CONF_DIR environment variables point to /etc/hadoop/conf.
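
A minimal sketch of that suggestion against the Spark 1.6 API, assuming HADOOP_CONF_DIR and YARN_CONF_DIR point at a copy of the sandbox's /etc/hadoop/conf on the client machine:

import org.apache.spark.{SparkConf, SparkContext}

// Let YARN allocate the executors instead of a standalone master.
// The ResourceManager address is discovered from the Hadoop config
// directory referenced by HADOOP_CONF_DIR / YARN_CONF_DIR.
val conf = new SparkConf()
  .setAppName("spark-word-count")
  .setMaster("yarn-client")
val sc = new SparkContext(conf)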

Re: Connect HDP 2.4 Spark remotely failed

New Contributor

Why can't I just have my Spark connect to the master? Why do I have to go through YARN?

Re: Connect HDP 2.4 Spark remotely failed

If you want to use Spark on HDP you will have to go through YARN, because YARN acts as the resource manager that allocates resources (CPU and memory). When you install Spark on HDP, a Spark standalone master is not started automatically. When you submit a Spark job through YARN, YARN creates an Application Master, which acts as the Spark master.