PySpark connection to a remote server

New Contributor

I've installed a single-node cluster on an Amazon machine using Ambari. I'm trying to use Spark from another machine via PySpark.

This is my code:

from pyspark import SparkConf, SparkContext

# Point the application at the remote standalone master.
conf = SparkConf().setAppName('hello').setMaster('spark://MYIP:7077')
sc = SparkContext(conf=conf)

The problem is that I get a "connection refused" error when I run the program:

WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master "MYIP"

So, I tried this command to start the master:

./sbin/start-master.sh

And now I get this error:

17/07/27 12:07:15 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master XX.XXX.XXX.XX:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.StreamCorruptedException: invalid stream header: 01000C31

This is not a port problem, because port 7077 is open.

I can't find any answer to this problem on the forum. Do you have any ideas?
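
To confirm that port 7077 really is reachable from the client, a check like this succeeds (a minimal sketch using only Python's standard socket module; MYIP stands in for the master's address):

import socket

# TCP reachability check for the standalone master port.
# This only proves connectivity, not protocol compatibility.
with socket.create_connection(("MYIP", 7077), timeout=5):
    print("port 7077 is reachable")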

1 ACCEPTED SOLUTION

New Contributor

It turned out to be a version compatibility problem between the Spark deployed by Ambari and the Spark version I was importing with Python. The client and the master must run the same Spark version, and the "invalid stream header" exception is a typical symptom of that mismatch.
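
A quick way to compare the two versions (a sketch; it assumes a pip-installed PySpark on the client and shell access to the cluster node):

import pyspark

# Client-side version: the package Python actually imports.
print(pyspark.__version__)

# Compare against the cluster's version, e.g. on the Ambari-managed node:
#   spark-submit --version
# The versions should match (at least major.minor) for the client
# and the master to talk to each other.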

