Created 08-10-2018 12:07 PM
I've created a CSV file from my `user1` session
on Zeppelin with a DataFrame: dataFrame.write.format("com.databricks.spark.csv").save("/tmp/myFile.csv")
But when I list the /tmp directory, I see that the owner of the CSV file is not `user1` but the `Zeppelin user`. I've tried this:
`%sh hadoop fs -chmod 666 -R /tmp/webtrack/`
but it's not working.
How can I change the permissions for this file from a Zeppelin session?
Created 08-10-2018 12:18 PM
How about creating the file in the user's home directory instead of /tmp? You should have read/write access to your own directory...
Created 08-10-2018 12:30 PM
That won't solve the issue; most likely the Zeppelin user will not even have write access to the home directory of user1.
Created 08-10-2018 12:29 PM
What you describe is expected when using the spark/spark2 interpreter. If you would like impersonation in Zeppelin, you should use the livy/livy2 interpreter; files will then be saved as the user who is logged in to Zeppelin instead of the zeppelin user.
The %sh interpreter may be failing to change the file permissions because it is configured to impersonate your user. You can check this by running the `whoami` command within %sh and seeing which username it prints.
If you would still like to use the spark/spark2 interpreter, then you need to save the file to a location where access is granted to other users (perhaps via Ranger or HDFS POSIX permissions). This needs to be done by an administrator who has privileges to change folder/file permissions.
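A minimal sketch of that check, meant to be pasted into a `%sh` paragraph. It assumes the `hdfs` client is on the interpreter's PATH and that the owner is the third column of `hdfs dfs -ls` output; the helper name `owner_of_ls_line` is made up for illustration.

```shell
# Print the owner (third whitespace-separated field) of each
# "hdfs dfs -ls" entry read from stdin, skipping the "Found N items" header.
owner_of_ls_line() {
  awk '$1 != "Found" { print $3 }'
}

# Which OS user is this %sh paragraph actually running as?
echo "shell user: $(whoami)"

# Compare it with the HDFS owner of the output path (only if hdfs is installed).
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -ls -d /tmp/myFile.csv | owner_of_ls_line
fi
```

If the two names differ, the `chmod` is being attempted by a user who does not own the path, which would explain why it fails.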
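For the administrator step, something along these lines could work. It is only a sketch under assumptions: the directory `/data/zeppelin_out` is a hypothetical path, HDFS ACLs are assumed to be enabled on the cluster (`dfs.namenode.acls.enabled=true`), and the `run` and `grant_access` helpers are illustrative names. By default it only prints the commands; set `DRY_RUN=0` to execute them as an HDFS superuser.

```shell
# Hypothetical shared output directory; pick one that suits your cluster.
OUT_DIR="${OUT_DIR:-/data/zeppelin_out}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create a directory the zeppelin user can write to and that the
# given end user can read and modify through an HDFS ACL.
grant_access() {
  dir="$1"
  user="$2"
  run hdfs dfs -mkdir -p "$dir"
  # The spark interpreter writes as the zeppelin service user.
  run hdfs dfs -chown zeppelin:hdfs "$dir"
  # rwx for the named user on existing entries and, via the default ACL, on new ones.
  run hdfs dfs -setfacl -R -m "user:$user:rwx,default:user:$user:rwx" "$dir"
}

grant_access "$OUT_DIR" user1
```

An ACL is used here instead of `chmod 777` so that only the named user gains access; a Ranger policy would be the equivalent on clusters that manage HDFS authorization there.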
HTH
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
Created 08-10-2018 12:56 PM
I tried what you proposed and it was still not working, but I think that I should use Livy for future projects.
What I did was:
user$ su -
root# su - hdfs
hdfs$ hdfs dfs -chmod -R 777 /tmp/myFile.csv
hdfs$ hdfs dfs -ls /tmp/myFile.csv
drwxrwxrwx - zeppelin hdfs 0 2018-08-10 13:09 /tmp/myFile.csv
It's working now.
Created 08-13-2018 06:46 PM
@Yassine if you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer. Thanks!
Created 08-10-2018 12:38 PM
Did you execute the code above from within Zeppelin? Did you get an error message? Is the difference between /tmp/myFile.csv and /tmp/webtrack/ only in the post, and was your actual file /tmp/webtrack/myFile.csv? When you listed /tmp, did you list the /tmp directory in HDFS?