SPARK Application + HDFS + User Airflow is not the owner of inode=alapati
Created on 12-01-2019 11:45 PM - last edited on 12-02-2019 01:30 AM by VidyaSargur
We are running a Spark application on a Hadoop cluster (HDP version 2.6.5 from Hortonworks).
From the logs we can see the following Diagnostics:
User: airflow
Application Type: SPARK
User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied. user=airflow is not the owner of inode=alapati
It is not clear what we need to look for in `HDFS` in order to find why we get Permission denied.
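A quick way to start (assuming you already suspect a particular directory, the one the job writes to, is the culprit) is to check its owner, group and permissions directly; the path below is only a placeholder:
# show the directory entry itself rather than its contents
$ hdfs dfs -ls -d /user/airflow/some_dir
# or print just owner, group and name
$ hdfs dfs -stat "%u %g %n" /user/airflow/some_dir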
Created ‎12-02-2019 02:06 PM
You can change the ownership of the HDFS directory to airflow:hadoop. Do not run the -chown command on /; it should be run on a specific path, something like /users/airflow/xxx.
Please let me know.
Created ‎12-01-2019 11:56 PM
That's a classic permissions issue: "airflow" is trying to write to that directory but has no permissions, as it's owned by the alapati user (hence inode=alapati).
The easiest solution is to grant the permissions as the hdfs user:
$ hdfs dfs -chown airflow:${airflow_group}
Most components like Spark, Hive, and Sqoop need to access HDFS.
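For reference, the full form of the command also takes the target path (and -R to apply it recursively); the path below is only a placeholder for the directory named in the Diagnostics message:
# run as the hdfs superuser so the ownership change is permitted (placeholder path)
$ sudo -u hdfs hdfs dfs -chown -R airflow:${airflow_group} /user/airflow/some_dir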
Created ‎12-02-2019 12:10 AM
How do we find the airflow group?
Created ‎12-02-2019 12:17 AM
Dear Shelton
Do you mean:
grep airflow /etc/passwd
airflow:x:1016:1016::/home/airflow:/sbin/nologin
# id 1016
uid=1016(airflow) gid=1016(airflow) groups=1016(airflow),1005(hdfs)
So we need to perform:
$ hdfs dfs -chown airflow:1016
?
Created ‎12-02-2019 12:45 AM
The gid is just a numeric value indicating the id, but the valid groups for airflow are [airflow and hdfs].
$ hdfs dfs -chown airflow:hdfs
Should do the magic; please revert.
Cheers
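To double-check which groups the airflow user actually resolves to, something like this should work (note that the OS view and Hadoop's own group mapping can differ when LDAP/SSSD is in use):
# groups as the operating system sees them
$ id -nG airflow
# groups as Hadoop's group mapping resolves them
$ hdfs groups airflow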
Created on ‎12-02-2019 12:51 AM - edited ‎12-02-2019 01:01 AM
We got the following, because the path is missing:
$ hdfs dfs -chown airflow:hdfs
-chown: Not enough arguments: expected 2 but got 1
Usage: hadoop fs [generic options]
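(For reference, -chown expects the new owner/group plus at least one path argument, i.e.
$ hdfs dfs -chown [-R] OWNER[:GROUP] PATH...
which is why the command above failed with only one argument.)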
Created on ‎12-02-2019 12:57 AM - edited ‎12-02-2019 03:43 AM
OK, we need to find the paths to change.
Created ‎12-02-2019 01:42 AM
@mike_bronson7
That's correct, you need to give the path of the directory 🙂 i.e. usually in HDFS:
$ hdfs dfs -chown airflow:hdfs /path/in/hdfs/where/you/failed/to/write
As you didn't include the path, I assumed you'd add it when running the -chown command.
Created on ‎12-02-2019 02:09 AM - edited ‎12-02-2019 02:12 AM
So, when we run
hdfs dfs -ls -R / | grep "airflow " | awk '{print $1" "$2" "$3" "$4" "}'
we get:
drwxrwx--- - airflow hadoop
drwxrwx--- - airflow hadoop
drwxrwx--- - airflow hadoop
-rw-r----- 3 airflow hadoop
-rw-r----- 3 airflow hadoop
-rw-r----- 3 airflow hadoop
drwxrwx--- - airflow hadoop
-rw-r----- 3 airflow hadoop
-rw-r----- 3 airflow hadoop
-rw-r----- 3 airflow hadoop
...
Do you mean we should change every hadoop group to hdfs?
Or, in simple words, how do we know which HDFS path we need to change?
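One way to narrow it down (purely an illustration; /user/airflow is only a guess at where the job writes) is to list the entries that airflow does not own, i.e. the ones owned by alapati:
# print permissions, owner, group and path for entries owned by alapati (placeholder base path)
$ hdfs dfs -ls -R /user/airflow 2>/dev/null | awk '$3 == "alapati" {print $1, $3, $4, $NF}'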
Created ‎12-02-2019 02:25 AM
The hadoop group encapsulates all the users, including hdfs.
If you run
# cat /etc/group
you should see something like
hadoop:x:1007:yarn-ats,hive,storm,infra-solr,zookeeper,oozie,atlas,ams,ranger,tez,zeppelin,kms,accumulo,livy,druid,spark,ambari-qa,kafka,hdfs,sqoop,yarn,mapred,hbase,knox
So the -chown should only target the directory shown in the Diagnostics logs. NEVER run the -chown command on /, which is the root directory!!
Can you share your log, please?
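If it helps, the full exception (including the complete inode path) is usually in the YARN application log; something along these lines should pull it out (the application id below is just a placeholder):
$ yarn logs -applicationId application_1575250000000_0001 | grep -B1 -A3 AccessControlException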
