
Missing cloudbreak home directory after cluster creation


Hi,

I've set up a fresh cluster using the HDC console. When I follow the instructions:

> export SPARK_MAJOR_VERSION=2
> spark-shell --master yarn

[...]
AccessControlException: Permission denied: user=cloudbreak, access=WRITE, inode="/user/cloudbreak/.sparkStaging/application_1510057948417_0004":hdfs:hdfs:drwxr-xr-x

The same happens with pyspark.

It looks like the home directory is missing, but I'm unable to create one (I don't have access to the hdfs account).

Is something missing in the template, or in the steps I follow?

I can try workarounds such as pyspark --master yarn --conf spark.yarn.stagingDir=/tmp/, but I still end up with:

17/11/07 13:30:59 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.

Running the example:

spark-submit --conf spark.yarn.stagingDir=/tmp/  --class org.apache.spark.examples.SparkPi --master yarn --executor-memory 2G --num-executors 5 /usr/hdp/current/spark2-client/examples/jars/spark-examples_2.11-2.1.1.2.6.1.4-2.jar  100

failed with the same issue. On the RM UI I can find:

Application application_1510057948417_0022 failed 2 times due to AM Container for appattempt_1510057948417_0022_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://ip-172-30-12-239.example.com:8088/cluster/app/application_1510057948417_0022 Then click on links to logs of each attempt.
Diagnostics: Failing this attempt. Failing the application.

But there are no logs available for the attempt, and the yarn command doesn't provide the logs either: "Can not find the logs for the application: application_1510057948417_0022 with the appOwner: cloudbreak"

4 REPLIES


Master Mentor

@Radosław Stankiewicz

If Kerberos is not enabled in your cluster, then you can also make the "cloudbreak" user act as the "hdfs" user by setting "HADOOP_USER_NAME=hdfs" and running the following commands:

# export HADOOP_USER_NAME=hdfs
# hdfs dfs -chown -R cloudbreak:hdfs /user/cloudbreak
# hdfs dfs -chmod -R 777 /user/cloudbreak


Once the permissions on the directory are changed, you can open a new terminal and run your commands as the "cloudbreak" user. Please do not forget to unset HADOOP_USER_NAME afterwards.
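If the /user/cloudbreak directory does not exist at all yet (as the error in your question suggests), a minimal sketch of the full sequence would look like this; the extra -mkdir step is the only addition to the commands above:

# export HADOOP_USER_NAME=hdfs
# hdfs dfs -mkdir -p /user/cloudbreak
# hdfs dfs -chown -R cloudbreak:hdfs /user/cloudbreak
# hdfs dfs -chmod -R 777 /user/cloudbreak
# unset HADOOP_USER_NAME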


That's a nice trick, I will try that! I will also check user creation. The problematic part is: is it a feature or a bug that the home directory isn't set up after a fresh startup? I'm trying to automate cluster creation for ETL (cron based), and it may be difficult to explain why I need those three lines if this is the default cloudbreak user presented in every user guide 🙂
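For the automated setup I will probably wrap it in a small post-install script along these lines (a rough sketch on my side; running it as root on a master node after the cluster is up is an assumption, not something from the docs):

#!/bin/bash
# Create the HDFS home directory for the default cloudbreak user
# so that Spark staging files can be written.
sudo -u hdfs hdfs dfs -mkdir -p /user/cloudbreak
sudo -u hdfs hdfs dfs -chown cloudbreak:hdfs /user/cloudbreak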


@Jay Kumar SenSharma, manual directory creation did the trick, and all the Spark apps are working correctly now. I still think it's a bug, but the workaround is good enough for me.