Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Can't run spark-submit to yarn cluster

New Member

I have a problem when trying to run spark-submit in yarn-cluster mode.

Below is my spark-submit command:

spark-submit --master yarn-cluster --name spark_ml user_recommendation.py

I get the following error:

java.io.FileNotFoundException: File does not exist: hdfs://name-node:8020/user/spark/.sparkStaging/application_1450092198211_0007/pyspark.zip

Is it a configuration issue?

Thanks,

Coktra

1 ACCEPTED SOLUTION

Master Mentor
9 REPLIES

Master Mentor

New Member

Hi Neeraj,

Thanks for the link. So there is no solution yet for this problem?

Master Mentor

@cokorda putra susila No. You can update that JIRA and vote for it.

New Member

@Neeraj Sabharwal Thank you, I will vote for the JIRA.

Master Mentor

@cokorda putra susila I guess we can close this question for now. You can do it by accepting the JIRA response if you like.

Master Mentor

@cokorda putra susila Can you accept the best answer to close this thread, or provide your own solution?

New Member

I am submitting jobs with Python code through spark-submit and they run fine in YARN cluster mode. I would like to understand the question a bit further. Is your cluster running Spark? Have you set up the YARN_CONF_DIR variable?
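
If it helps, here is a rough sketch of how the client-side configuration could be checked before submitting. Spark on YARN expects HADOOP_CONF_DIR or YARN_CONF_DIR to point at the directory holding the Hadoop client configuration files. The helper names `find_yarn_conf_dir` and `build_submit_command` are made up for illustration, and looking for `yarn-site.xml` is only a heuristic I am assuming, not something spark-submit itself does. A passing check does not rule out the pyspark.zip staging error from the original post.

```python
import os
from pathlib import Path


def find_yarn_conf_dir(env=None):
    """Return the first env-configured directory that contains yarn-site.xml, else None.

    Hypothetical helper: checking for yarn-site.xml is an assumed heuristic.
    """
    env = os.environ if env is None else env
    for name in ("YARN_CONF_DIR", "HADOOP_CONF_DIR"):
        value = env.get(name)
        if value and (Path(value) / "yarn-site.xml").is_file():
            return value
    return None


def build_submit_command(script, app_name):
    """Build the same spark-submit argument list as in the original question."""
    return ["spark-submit", "--master", "yarn-cluster", "--name", app_name, script]


if __name__ == "__main__":
    conf_dir = find_yarn_conf_dir()
    if conf_dir is None:
        print("Neither YARN_CONF_DIR nor HADOOP_CONF_DIR points to a directory with yarn-site.xml")
    else:
        print("Using client configuration from", conf_dir)
        print(" ".join(build_submit_command("user_recommendation.py", "spark_ml")))
```

Run it on the machine you submit from. If it reports that neither variable is set correctly, I would fix that first and then retry the spark-submit command.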

Super Collaborator

What are your versions of HDP and Spark?
