About liana_napalkova

liana_napalkova · ‎05-07-2018

I have just tested it. It worked fine! Thank you!

liana_napalkova · ‎05-07-2018

Yes, sure. Sorry, I was actually referring to "hdfs://eureambarimaster1.local.eurecat.org:8020/user/hdfs/test/df.parquet". Let me test it.

liana_napalkova · ‎05-07-2018

I think that this is the reason. If I log in as the HDFS user and run "hdfs dfs -chown -R centos /home/centos/test", it says that this directory does not exist. I created this directory as the HDFS user and then changed its ownership to centos. Should I write the parquet file to the full path?

df.coalesce(1).write.format("parquet").save("hdfs://eureambarimaster1.local.eurecat.org:8020/user/hdfs/test")
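To illustrate why the full path matters: a path without a scheme is not a local-disk path for Spark on YARN; Hadoop qualifies it against the cluster's default filesystem. Below is a minimal plain-Python sketch of that resolution. It is not the Hadoop API; the function name `qualify` is hypothetical, and the `DEFAULT_FS` value is an assumption taken from the URI quoted in this thread rather than from the actual `fs.defaultFS` setting.

```python
from urllib.parse import urlparse

# Assumed value of fs.defaultFS, inferred from the URI quoted in this thread.
DEFAULT_FS = "hdfs://eureambarimaster1.local.eurecat.org:8020"


def qualify(path, default_fs=DEFAULT_FS):
    """Fill in scheme and authority for an absolute, scheme-less path.

    Hypothetical helper that mimics, in simplified form, how a scheme-less
    path ends up on the default filesystem instead of the local disk.
    """
    if urlparse(path).scheme:
        # Already qualified, e.g. hdfs://... or file:///...
        return path
    if not path.startswith("/"):
        # Relative paths depend on the working directory; not modelled here.
        raise ValueError("only absolute paths are handled in this sketch")
    return default_fs.rstrip("/") + path


print(qualify("/home/centos/test/df.parquet"))
print(qualify("file:///home/centos/test/df.parquet"))
```

Under these assumptions, "/home/centos/test/df.parquet" refers to a directory on HDFS, not to "/home/centos/test" on the local machine, which would explain why a local chown has no effect on the error.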

liana_napalkova · ‎05-07-2018

Maybe the problem is that I run the Spark program in YARN cluster mode? That means the driver can run on any machine of the cluster. So perhaps I should run "chown -R centos:centos ..." on each machine, or use ".coalesce(1)"?

liana_napalkova · ‎05-07-2018

The output of "id":

uid=1000(centos) gid=1000(centos) groups=1000(centos),4(adm),10(wheel),190(systemd-journal)

I executed "chown -R centos:centos /home/centos/test" but still get the same error:

18/05/07 12:06:28 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=centos, access=WRITE, inode="/home/centos/test/df.parquet/_temporary/0":hdfs:hdfs:drwxr-xr-x

This is the output of "ls -la" executed in "/home/centos":

total 36236
drwx------.  4 centos centos  4096 May  7 12:34 .
drwxr-xr-x. 15 root   root    4096 Apr 16 18:41 ..
-rw-------.  1 centos centos 13781 May  7 11:26 .bash_history
-rw-r--r--.  1 centos centos    18 Mar  5  2015 .bash_logout
-rw-r--r--.  1 centos centos   193 Mar  5  2015 .bash_profile
-rw-r--r--.  1 centos centos   231 Mar  5  2015 .bashrc
-rw-rw-r--   1 centos centos    47 May  7 11:38 .scala_history
drwx------.  2 centos centos    46 May  2 07:57 .ssh
drwxrwxr-x   4 centos centos   144 May  7 11:42 test
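The error message above can be read mechanically: the attributes printed after the inode path are the owner (hdfs), the group (hdfs) and the mode (drwxr-xr-x). The following plain-Python sketch parses that line and applies the usual owner/group/other check. The helper names are hypothetical, and the group list is an assumption copied from the local "id" output; HDFS resolves groups on its own side, so the real membership may differ.

```python
import re

ERROR = ('Permission denied: user=centos, access=WRITE, '
         'inode="/home/centos/test/df.parquet/_temporary/0":hdfs:hdfs:drwxr-xr-x')

# Assumption: groups as reported by the local "id" command for user centos.
CENTOS_GROUPS = {"centos", "adm", "wheel", "systemd-journal"}


def parse_denial(message):
    """Extract user, access type, inode path, owner, group and mode."""
    match = re.search(
        r'user=(\w+), access=(\w+), inode="([^"]+)":(\w+):(\w+):([d-][rwx-]{9})',
        message,
    )
    if match is None:
        raise ValueError("unrecognised permission-denied message")
    keys = ("user", "access", "inode", "owner", "group", "mode")
    return dict(zip(keys, match.groups()))


def permission_class(user, user_groups, owner, group):
    """Decide which permission triple applies: owner, group or other."""
    if user == owner:
        return "owner"
    if group in user_groups:
        return "group"
    return "other"


def allows(mode, cls, access):
    """Check one access bit (READ, WRITE or EXECUTE) in a mode string."""
    offset = {"owner": 1, "group": 4, "other": 7}[cls]
    bit = {"READ": 0, "WRITE": 1, "EXECUTE": 2}[access]
    return mode[offset + bit] != "-"


info = parse_denial(ERROR)
cls = permission_class(info["user"], CENTOS_GROUPS, info["owner"], info["group"])
print(cls, allows(info["mode"], cls, info["access"]))
```

Under these assumptions, centos is neither the owner nor a member of the hdfs group, so the "other" triple (r-x) applies and it grants no write access. The local "ls -la" listing shows different ownership because it describes the local filesystem, not the inode named in the error.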

liana_napalkova · ‎05-07-2018

I want to save a DataFrame on disk:

df.write.format("parquet").save("/home/centos/test/df.parquet")

I get the following error, which says that the user "centos" does not have write permissions:

18/05/07 09:18:08 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=centos, access=WRITE, inode="/home/centos/test/df.parquet/_temporary/0":hdfs:hdfs:drwxr-xr-x

This is how I run the spark-submit command:

spark-submit --master yarn --deploy-mode cluster --driver-memory 6g --executor-cores 2 --num-executors 2 --executor-memory 4g --class org.test.MyProcessor mytest.jar

liana_napalkova · ‎05-07-2018

I re-submitted the Spark job and now it works fine. The problem was that I had submitted the Spark job before changing the permissions.

liana_napalkova · ‎05-07-2018

Yes, sure. Please see more screenshots from the RM UI attached. Thanks.

liana_napalkova · ‎05-07-2018

However, an application with this ID exists in the ResourceManager. Please see the attached screenshot.

liana_napalkova · ‎05-07-2018

Thank you. I did exactly what you suggested, but I still get the same error. The directory "app_logs/centos" has ownership: centos hdfs

18/05/07 06:36:36 INFO client.AHSProxy: Connecting to Application History server at eureambarislave1.local.eurecat.org/192.168.0.10:10200
File /app-logs/centos/logs-ifile/application_1525529485402_0020 does not exist.
File /app-logs/centos/logs/application_1525529485402_0020 does not exist.
Can not find any log file matching the pattern: [ALL] for the application: application_1525529485402_0020
Can not find the logs for the application: application_1525529485402_0020 with the appOwner: centos

Last Visited	‎09-05-2018 01:06 PM

Member Since	‎04-08-2018 02:59 PM
Posts	64
Kudos received	2

Cloudera Community

Re: How to properly execute spark-submit command w...

Re: ResourceManager is started with alerts and ip:...

Re: Cannot find a saved DataFrame on disk

Cannot find a saved DataFrame on disk

Re: File /app-logs/centos/logs-ifile/application_1...
