
Livy "Interpreter died" with PySpark


Hello,

On top of my Ubuntu-based Docker image (Python 3.6, Spark 2.2.2), I installed Livy following https://github.com/cloudera/livy, and I am trying to bring up a Spark server in local mode (to start with).

 

In the build process, I specified my Spark version as follows:

export SPARK_VERSION=2.2.2
mvn -DskipTests -Dspark.version=$SPARK_VERSION clean package

 

It works perfectly with the "Spark Example" in https://github.com/cloudera/livy.

However, the "PySpark Example" does not work and gives the message "Interpreter died" (please see below).

I tried different settings in the files under /usr/local/livy/conf without success.

Can anyone tell me what the cause might be, and how to debug it? (The Spark log says nothing...)
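One place to look besides the Spark log is the per-session log that Livy itself keeps, reachable over the same REST API (GET /sessions/{sessionId}/log). A sketch, assuming the server at http://localhost:8998 and session id 0 from the snippet below:

```python
import requests

def fetch_session_log(host, session_id, offset=0, size=100):
    """Fetch up to `size` log lines for a Livy session via GET /sessions/{id}/log."""
    url = '{}/sessions/{}/log'.format(host, session_id)
    r = requests.get(url, params={'from': offset, 'size': size})
    r.raise_for_status()
    return r.json().get('log', [])

# Requires a running Livy server:
# for line in fetch_session_log('http://localhost:8998', 0):
#     print(line)
```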

 

I'd very much appreciate your help.

Here is a snippet of the client:

 

import json, pprint, requests, textwrap
host = 'http://localhost:8998'
data = {'kind': 'pyspark'}
headers = {'Content-Type': 'application/json'}
r = requests.post(host + '/sessions', data=json.dumps(data), headers=headers)
r.json()
{'id': 0, 'appId': None, 'owner': None, 'proxyUser': None, 'state': 'starting', 'kind': 'pyspark', 'appInfo': {'driverLogUrl': None, 'sparkUiUrl': None}, 'log': []}

session_url = host + r.headers['location']
r2 = requests.get(session_url, headers=headers)
r2.json()
{'id': 0,
'appId': None,
'owner': None,
'proxyUser': None,
'state': 'idle',
'kind': 'pyspark',
'appInfo': {'driverLogUrl': None, 'sparkUiUrl': None},
'log': []}

statements_url = session_url + '/statements'

data = {
  'code': textwrap.dedent("""
    import random
    NUM_SAMPLES = 100000
    def sample(p):
      x, y = random.random(), random.random()
      return 1 if x*x + y*y < 1 else 0

    count = sc.parallelize(xrange(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
    print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
    """)
}
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
pprint.pprint(r.json())

{'id': 0, 'output': None, 'progress': 0.0, 'state': 'waiting'}

r = requests.get(statements_url, headers=headers)
pprint.pprint(r.json())

{'statements': [{'id': 0, 'output': {'ename': 'Error', 'evalue': 'Interpreter died:\n', 'execution_count': 0, 'status': 'error', 'traceback': []}, 'progress': 1.0, 'state': 'available'}], 'total_statements': 1}
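For what it's worth, the statement above is the Python 2 form from the Livy README: `xrange` was removed in Python 3, so it does not exist on a Python 3.6 interpreter. A Python 3 rendition of the same payload, as a sketch (`sc` is only defined inside the Livy PySpark session, not in this client):

```python
import json, textwrap

# Python 3 form of the pi-estimation statement: xrange -> range.
# `sc` is provided by the Livy PySpark session, not by this client.
code = textwrap.dedent("""
    import random
    NUM_SAMPLES = 100000
    def sample(p):
        x, y = random.random(), random.random()
        return 1 if x*x + y*y < 1 else 0

    count = sc.parallelize(range(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
    print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
    """)
payload = json.dumps({'code': code})
```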