
hdfs:/user/spark/share/lib/spark-assembly.jar is missing


Rising Star

The Spark Jar Location (HDFS) (spark_jar_hdfs_path) parameter is set to /user/spark/share/lib/spark-assembly.jar

However, the HDFS file /user/spark/share/lib/spark-assembly.jar is NOT there!

The only HDFS folder/file for Spark that exists is /user/spark/applicationHistory

 

I ran 'Upload Spark Jar' from the Actions drop-down in CM, and CM reported success. However, when I check the Spark HDFS folders, the jar (spark-assembly.jar) is still not there!
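For readers who want to script the check described above, here is a minimal sketch. It assumes the `hdfs` command-line client is on the PATH of the machine where it runs, and it relies on `hdfs dfs -test -e`, which exits with status 0 when the path exists. The function names `build_exists_command` and `hdfs_path_exists` are hypothetical helpers, not part of CM or CDH.

```python
import subprocess

# Path taken from the question; adjust to whatever spark_jar_hdfs_path is set to.
SPARK_JAR_PATH = "/user/spark/share/lib/spark-assembly.jar"


def build_exists_command(path):
    """Return the argument list for `hdfs dfs -test -e <path>`."""
    return ["hdfs", "dfs", "-test", "-e", path]


def hdfs_path_exists(path, run=subprocess.run):
    """Return True if `hdfs dfs -test -e` exits with status 0 for the path.

    `run` is injectable (hypothetical convenience) so the logic can be
    exercised without a live cluster.
    """
    result = run(build_exists_command(path))
    return result.returncode == 0


# Example use on a cluster node (requires the hdfs client):
#   print(hdfs_path_exists(SPARK_JAR_PATH))
```

If this reports that the jar is absent even after the upload action, that matches the symptom in the question and is not necessarily a fault in itself.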

 

1 ACCEPTED SOLUTION

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Super Collaborator

In CM & CDH 5.4 you should unset it and let it use the one that is there on the nodes. Much faster.

 

Wilfred

8 REPLIES

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Master Collaborator

I don't think that is used anymore in recent CDH; this is not how the assembly is distributed. What problem are you having?

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Rising Star

Interesting...

 

Somehow, the Spark parameter spark_jar_hdfs_path is set to the HDFS value '/user/spark/share/lib/spark-assembly.jar', and CM complains about 'Failed parameter validation'!

Should I unset it??

 

 

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Master Collaborator

If it's set, it probably needs to be an hdfs: path, but I don't think this setting matters in recent CDH.

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Rising Star

Should I un-set it?

CM keeps complaining...

 

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Rising Star

Also, what should the spark user's HDFS folder structure look like?

So far I have only one HDFS folder:

/user/spark/applicationHistory

 

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Super Collaborator

In a recent version (CM/CDH 5.4, for example) the directory should look just like what you have now. We no longer push the assembly separately. By default, Spark uses the assembly installed on the nodes, which is faster than using the one from HDFS. The setting is still there so that custom assemblies can be used.

 

The setting should be entered without the hdfs: prefix; CM adds the prefix when it pushes the path out. Which version of CDH and CM are you using?
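As an illustration of entering the value without the prefix, the sketch below strips an `hdfs:` scheme from a path before it is typed into the setting. It uses only Python's standard `urllib.parse`; the helper name `strip_hdfs_scheme` is hypothetical, and whether a prefixed value is what triggers the validation failure in CM is an assumption based on this reply, not something verified here.

```python
from urllib.parse import urlparse


def strip_hdfs_scheme(value):
    """Return the bare path if `value` carries an hdfs: scheme, else `value` unchanged.

    Handles both the short form (hdfs:/user/...) and the form with a
    name service or host (hdfs://host:port/user/...).
    """
    parsed = urlparse(value)
    if parsed.scheme == "hdfs":
        return parsed.path
    return value


# Example with the path from this thread:
#   strip_hdfs_scheme("hdfs:/user/spark/share/lib/spark-assembly.jar")
```

On CM/CDH 5.4 the simpler route, per the accepted answer, is to leave the setting empty unless a custom assembly is needed.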

 

Wilfred

Re: hdfs:/user/spark/share/lib/spark-assembly.jar is missing

Rising Star

I have upgraded both CM and CDH to the 5.4 release.

 

