PySpark Connection remote server

I've installed a single-node cluster on an Amazon machine using Ambari. I'm trying to use Spark from another machine via PySpark.

This is my code:

from pyspark import SparkConf, SparkContext
conf = SparkConf().setAppName('hello').setMaster('spark://MYIP:7077')
sc = SparkContext(conf=conf)

The problem is that I get a connection refused error when I run the program:

WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master "MYIP"

So I ran this command on the cluster to start the master: ./sbin/start-master.sh

Now I get this error instead:

17/07/27 12:07:15 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master XX.XXX.XXX.XX:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.StreamCorruptedException: invalid stream header: 01000C31

This is not a port problem, because port 7077 is open.

I couldn't find an answer to this problem on the forum. Do you have any idea?

Accepted Solutions

The problem was a version incompatibility between the Spark installed by Ambari on the cluster and the Spark version I imported with Python on the client machine.
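For anyone hitting the same "invalid stream header" error, here is a minimal sketch of how the two versions could be compared before connecting. The helper names (`major_minor`, `versions_compatible`, `local_pyspark_version`) are hypothetical, and treating a matching major.minor pair as "compatible" is only a heuristic I'm assuming here, not a documented Spark rule; an exact match between client and cluster is the safest choice. The cluster version has to be looked up separately, for example by running `spark-submit --version` on the node.

```python
def major_minor(version):
    """Return (major, minor) as ints from a version string such as '2.1.1'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def versions_compatible(client_version, cluster_version):
    """Heuristic (an assumption, not a Spark rule): matching major.minor."""
    return major_minor(client_version) == major_minor(cluster_version)


def local_pyspark_version():
    """Return the installed PySpark version, or None if PySpark is missing."""
    try:
        import pyspark
    except ImportError:
        return None
    return pyspark.__version__
```

On the client, `local_pyspark_version()` gives the version that `from pyspark import ...` actually loads; if it does not match what the cluster reports, installing the matching PySpark release on the client should be the first thing to try.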
