Created on 02-06-2016 05:03 AM - edited 09-16-2022 03:02 AM
Hi,
I'm trying Lab 4 - Spark Risk Factor Analysis. At almost the last step, I executed this command:
risk_factor_spark.write.orc("risk_factor_spark")
It throws the following error:
:32: error: value format is not a member of org.apache.spark.sql.DataFrame
Could you please help me solve this?
Thanks!
Created 02-06-2016 08:31 AM
If you are using Spark >= 1.4, try the following command:
risk_factor_spark.write.format("orc").save("risk_factor_spark")
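If the save succeeds, the ORC files land under the current HDFS user's home directory; a quick way to check from a terminal (assuming the hdfs client is on your PATH):
# list the written ORC part files; the relative path resolves under /user/<current-user>
hdfs dfs -ls risk_factor_spark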
Created 02-07-2016 06:01 PM
Thanks for your answer, Geoffrey.
I'm using HDP 2.3.2.
After running the command
risk_factor_spark.write.format("orc").save("risk_factor_spark")
I can see several messages, but one of them says the following:
INFO DefaultWriterContainer: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root/risk_factor_spark/_temporary/0":hdfs:hdfs:drwxr-xr-x
Reading through the comments, I saw something similar from Özgür Akdemirci and tried the answer from Peter Lasne:
there was no /user directory for my user in HDFS, so I also had to do this: "sudo -u admin hdfs dfs -mkdir /user/" and "sudo -u admin hdfs dfs -chown :hdfs /user/".
But it didn't work. Do I need to set more permissions?
Regards.
Created 02-07-2016 06:19 PM
hdfs dfs -chown -R root:hdfs /user/root
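Note: that chown has to run as the HDFS superuser, and /user/root has to exist first. A sketch, assuming the sandbox's hdfs service account:
# create root's HDFS home directory if missing, then hand ownership to root
sudo -u hdfs hdfs dfs -mkdir -p /user/root
sudo -u hdfs hdfs dfs -chown -R root:hdfs /user/root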
Created 02-07-2016 07:55 PM
Hi Neeraj,
When I try to assign the permissions, it throws an error saying that the /user/root folder doesn't exist. I'm a newbie with Hadoop and Linux commands, but if I understand correctly, this command assigns the permissions in HDFS, not in the Linux file system. Reviewing HDFS, the /user/root folder indeed doesn't exist; since I'm logged in as the admin user, only the /user/admin folder is there.
I ran the Spark commands again, but this time from the web interface http://localhost:4200 (previously I was using a console), and it worked. It seems that every time I use the console with ssh root@localhost, Spark assumes that the Hadoop user is root, not admin.
I'm wondering how I can run the spark-shell scripts as a user other than root.
Thanks anyway! This helped me figure out what the problem was.
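For reference, with simple (non-Kerberos) authentication Hadoop takes the HDFS user from the OS user, but it can be overridden through the HADOOP_USER_NAME environment variable before launching spark-shell; a minimal sketch:
# make Spark/HDFS see "admin" instead of root (simple authentication only)
export HADOOP_USER_NAME=admin
spark-shell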
Created 02-12-2016 05:39 PM
Log in as root on your server, then run:
# switch to the hdfs superuser
su - hdfs
# create root's home directory in HDFS if it doesn't exist
hdfs dfs -mkdir -p /user/root
# give root ownership of it
hdfs dfs -chown -R root:hdfs /user/root
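Afterwards you can confirm the ownership before re-running the write; a quick check:
# /user/root should now show owner root, group hdfs
hdfs dfs -ls /user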