Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Bug in spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar, I know the fix, but unable to apply.

Contributor

We are using Spark 1.3 on HDP 2.2.4, and I found a bug in the spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar that ships with Spark: the MLlib check for the NumPy version is incorrect, and MLlib throws an exception.

I know the fix; I have to change the following file inside the jar:

mllib/__init__.py

Below is the current code in the Python file mentioned above:

import numpy
if numpy.version.version < '1.4':
    raise Exception("MLlib requires NumPy 1.4+")

It can be fixed by changing it to:

import numpy
ver = [int(x) for x in numpy.version.version.split('.')[:2]]
if ver < [1, 4]:
    raise Exception("MLlib requires NumPy 1.4+")
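For context, the original check fails because it compares version strings lexicographically, so any NumPy 1.10+ release sorts "before" 1.4. A quick illustration of why the integer-list comparison is the right fix:

```python
# String comparison is lexicographic, so '1.10' sorts before '1.4':
print('1.10' < '1.4')   # True -> a modern NumPy wrongly fails the old check

# Comparing [major, minor] as integers behaves correctly:
ver = [int(x) for x in '1.10.4'.split('.')[:2]]
print(ver < [1, 4])     # False -> NumPy 1.10 correctly passes
```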

I have tried editing 'spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar' to correct the code: I unzipped the jar, fixed the code, and repacked it using zip.
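For reference, the unzip/edit/rezip round trip can also be done in one step with Python's standard zipfile module. Below is a minimal sketch of replacing a single archive member, demonstrated on a throwaway archive rather than the real assembly jar (the member path is illustrative):

```python
import os
import shutil
import tempfile
import zipfile

def replace_member(archive_path, member, new_bytes):
    """Rewrite a zip/jar archive, replacing the contents of one member."""
    tmp_fd, tmp_path = tempfile.mkstemp(suffix=".zip")
    os.close(tmp_fd)
    with zipfile.ZipFile(archive_path, "r") as src, \
         zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            # Copy every member through unchanged, except the one we patch.
            data = new_bytes if item.filename == member else src.read(item.filename)
            dst.writestr(item, data)
    shutil.move(tmp_path, archive_path)

# Demo on a throwaway archive (NOT the real assembly jar):
demo = os.path.join(tempfile.mkdtemp(), "demo.jar")
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("pyspark/mllib/__init__.py", "old code\n")

replace_member(demo, "pyspark/mllib/__init__.py", b"fixed code\n")

with zipfile.ZipFile(demo) as zf:
    print(zf.read("pyspark/mllib/__init__.py"))  # b'fixed code\n'
```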

But after deploying the patched jar, I get an EOF error:

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, xxxxxx.xxxx.uk.hxxx): org.apache.spark.SparkException: Error from python worker:
  /opt/anaconda/envs/sparkAnaconda/bin/python: No module named pyspark
PYTHONPATH was:
  /data/4/hadoop/yarn/local/usercache/xxxxxxxx/filecache/33/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
java.io.EOFException
  at java.io.DataInputStream.readInt(DataInputStream.java:392)
  at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
  at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
  at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
  at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:105)
  at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)

1 ACCEPTED SOLUTION

Master Mentor

You might have to distribute the new binary across the cluster. It looks like you're hitting SPARK-8032, which was committed after Spark 1.3.1; you should consider upgrading your cluster to the latest HDP. https://github.com/apache/spark/commit/22703dd79fecc844d68033358f3201fd8a8f95cb


2 REPLIES 2


Contributor

Thanks Artem, you are correct, but due to some constraints we cannot wait for an upgrade. I am unable to find a fix for this.