
Missing cloudbreak home directory after cluster creation


Hi,

I've set up a fresh cluster using the HDC console. When I follow the instructions:

> export SPARK_MAJOR_VERSION=2
> spark-shell --master yarn

[...]
AccessControlException: Permission denied: user=cloudbreak, access=WRITE, inode="/user/cloudbreak/.sparkStaging/application_1510057948417_0004":hdfs:hdfs:drwxr-xr-x

The same happens with pyspark.

It looks like the home directory is missing, but I'm unable to create one (I don't have access to the hdfs account).

Is something missing in the template, or in the steps I follow?

I can try workarounds such as pyspark --master yarn --conf spark.yarn.stagingDir=/tmp/, but I still end up with:

17/11/07 13:30:59 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.

Running the example:

spark-submit --conf spark.yarn.stagingDir=/tmp/  --class org.apache.spark.examples.SparkPi --master yarn --executor-memory 2G --num-executors 5 /usr/hdp/current/spark2-client/examples/jars/spark-examples_2.11-2.1.1.2.6.1.4-2.jar  100

failed with the same issue. On the RM UI I can find:

Application application_1510057948417_0022 failed 2 times due to AM Container for appattempt_1510057948417_0022_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://ip-172-30-12-239.example.com:8088/cluster/app/application_1510057948417_0022 Then click on links to logs of each attempt.
Diagnostics: Failing this attempt. Failing the application.

But there are no logs available for the attempt, and the yarn command doesn't provide the logs either: "Can not find the logs for the application: application_1510057948417_0022 with the appOwner: cloudbreak"

4 REPLIES


Master Mentor

@Radosław Stankiewicz

If Kerberos is not enabled in your cluster, then you can also make the "cloudbreak" user act as the "hdfs" user by setting "HADOOP_USER_NAME=hdfs" and running the following commands:

# export HADOOP_USER_NAME=hdfs
# hdfs dfs -chown -R cloudbreak:hdfs /user/cloudbreak
# hdfs dfs -chmod -R 777 /user/cloudbreak


Once the permissions on the directory are changed, you can open a new terminal and run your commands as the "cloudbreak" user. Please do not forget to unset HADOOP_USER_NAME afterwards.
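If the /user/cloudbreak directory does not exist at all yet (as the error in your question suggests), a minimal sketch of the full sequence would look like this; the extra -mkdir step is the only addition to the commands above:

# export HADOOP_USER_NAME=hdfs
# hdfs dfs -mkdir -p /user/cloudbreak
# hdfs dfs -chown -R cloudbreak:hdfs /user/cloudbreak
# hdfs dfs -chmod -R 777 /user/cloudbreak
# unset HADOOP_USER_NAME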


That's a nice trick, I will try that! I will also check user creation. The problematic part is: is it a feature or a bug that the home directory isn't set up after a fresh startup? I'm trying to automate cluster creation for ETL (cron based), and it may be difficult to explain why I need those three lines if this is the default cloudbreak user presented in every user guide 🙂
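For the automated setup I will probably wrap it in a small post-install script along these lines (a rough sketch on my side; running it as root on a master node after the cluster is up is an assumption, not something from the docs):

#!/bin/bash
# Create the HDFS home directory for the default cloudbreak user
# so that Spark staging files can be written.
sudo -u hdfs hdfs dfs -mkdir -p /user/cloudbreak
sudo -u hdfs hdfs dfs -chown cloudbreak:hdfs /user/cloudbreak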


@Jay Kumar SenSharma, manual directory creation did the trick, and all the Spark apps are working correctly now. I still think it's a bug, but the workaround is good enough for me.