
SPARK Application + HDFS + User Airflow is not the owner of inode=alapati

We are running a Spark application on a Hadoop cluster (HDP version 2.6.5 from Hortonworks).

 

From the logs we can see the following diagnostics:

 

User: airflow
Application Type: SPARK
User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied. user=airflow is not the owner of inode=alapati


It is not clear what we need to look for in `HDFS` to find out why we get Permission denied.

Michael-Bronson
1 ACCEPTED SOLUTION

Mentor

@mike_bronson7 

 

You can change the ownership of the HDFS directory to airflow:hadoop. Did you run the -chown command on / ??? The path should be something like /users/airflow/xxx

Please let me know


11 REPLIES

Mentor

@mike_bronson7 

 

That's a classic permissions issue: "airflow" is trying to write to that directory but has no permission, because it is owned by the alapati user (inode=alapati).

The easiest solution is to grant the permissions as the hdfs user by changing the ownership:

$ hdfs dfs -chown airflow:{$airflow_group}

Most components, such as Spark, Hive, and Sqoop, need to access HDFS.
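
To make the diagnostics easier to act on, here is a minimal sketch that pulls the user and the inode name out of the exception message. The message text is copied from this thread; the helper name `parse_field` is only illustrative. The inode value appears to be the name of the file or directory that was rejected, so it is the name to look for in HDFS.

```shell
# Message text copied from the diagnostics in this thread.
msg='Permission denied. user=airflow is not the owner of inode=alapati'

# parse_field MESSAGE KEY: prints the value that follows "KEY=" up to the next space.
# (Hypothetical helper, not part of any Hadoop tooling.)
parse_field() {
    printf '%s\n' "$1" | sed -n "s/.*$2=\([^ ]*\).*/\1/p"
}

denied_user=$(parse_field "$msg" user)
inode_name=$(parse_field "$msg" inode)

echo "user=$denied_user inode=$inode_name"
```

With the message above this prints `user=airflow inode=alapati`.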

How do we find the airflow_group?

Michael-Bronson

Dear Shelton

 

Do you mean:

 

grep airflow /etc/passwd
airflow:x:1016:1016::/home/airflow:/sbin/nologin
# id 1016
uid=1016(airflow) gid=1016(airflow) groups=1016(airflow),1005(hdfs)

 

So we need to run:

 

$ hdfs dfs -chown airflow:1016

 

?

Michael-Bronson

Mentor

@mike_bronson7 

The gid is just a numeric value that identifies the group; the valid group names for airflow are airflow and hdfs.

$ hdfs dfs -chown airflow:hdfs

That should do the trick. Please report back.

Cheers
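
As a rough illustration of how group names map to a user, the sketch below reads lines in /etc/group format and prints every group that lists the user as a member. The sample lines are the ones posted in this thread; the function name `supplementary_groups` is made up for the example. It does not report the primary group, which comes from the gid in /etc/passwd. Note that HDFS resolves groups on the NameNode, so `hdfs groups airflow` is the more direct check of what HDFS itself sees.

```shell
# Lines in /etc/group format (name:password:gid:members), taken from this thread.
group_data='hadoop:x:1006:hive,livy,zookeeper,spark,ams,kafka,yarn,hcat,mapred
hdfs:x:1005:hdfs,hive,airflow
airflow:x:1016:'

# supplementary_groups USER: reads group-file lines on stdin and prints
# every group that lists USER in its member field, one per line.
# (Hypothetical helper; the primary group from /etc/passwd is not included.)
supplementary_groups() {
    awk -F: -v u="$1" '{
        n = split($4, members, ",")
        for (i = 1; i <= n; i++) if (members[i] == u) { print $1; break }
    }'
}

printf '%s\n' "$group_data" | supplementary_groups airflow
```

On a real host the same function can be fed with `supplementary_groups airflow < /etc/group`.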

 

We got the following error because the path argument is missing:

 

$ hdfs dfs -chown airflow:hdfs


-chown: Not enough arguments: expected 2 but got 1
Usage: hadoop fs [generic options]

Michael-Bronson

 

OK, we need to find the paths to change.

Michael-Bronson

Mentor

@mike_bronson7 
That's correct, you need to give the path of the directory 🙂 i.e. the path in HDFS:

 

$ hdfs dfs -chown airflow:hdfs   /path/in/hdfs/where/you/failed/to/write

 

As you didn't include the path, I assumed you would add it to the -chown command yourself.
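
Since -chown needs both an owner:group and a path, a small guard can catch a missing path or an accidental / before anything is changed. This is only a sketch: `build_chown` is a hypothetical helper that prints the command rather than running it, and the path is the same placeholder used above. Rejecting relative paths is a choice made for this example, not an HDFS requirement.

```shell
# build_chown OWNER GROUP PATH: prints the hdfs chown command instead of running it.
# Fails if the path is missing, is not absolute, or is the HDFS root directory.
# (Hypothetical helper for illustration only.)
build_chown() {
    if [ "$#" -ne 3 ] || [ -z "$3" ]; then
        echo "usage: build_chown OWNER GROUP PATH" >&2
        return 2
    fi
    case "$3" in
        /)  echo "refusing to chown the root directory" >&2; return 1 ;;
        /*) ;;  # absolute path other than /, accepted
        *)  echo "path must be absolute: $3" >&2; return 1 ;;
    esac
    echo "hdfs dfs -chown $1:$2 $3"
}

# Placeholder path from the reply above; replace it with the real directory.
build_chown airflow hdfs /path/in/hdfs/where/you/failed/to/write
```

Once the printed command looks right, it can be run as the hdfs user.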

 

So when we run:

 

hdfs dfs -ls -R / | grep "airflow " | awk '{print $1" "$2" "$3" "$4" "}' 

 

we get:


drwxrwx--- - airflow hadoop
drwxrwx--- - airflow hadoop
drwxrwx--- - airflow hadoop
-rw-r----- 3   airflow hadoop
-rw-r----- 3   airflow hadoop
-rw-r----- 3   airflow hadoop
drwxrwx--- - airflow hadoop
-rw-r----- 3   airflow hadoop
-rw-r----- 3   airflow hadoop
-rw-r----- 3   airflow hadoop

...

 

Do you mean we should change every hadoop group to hdfs?

 

Or, in simple words: how do we know which HDFS path we need to change?

Michael-Bronson

Mentor

@mike_bronson7 

 

The hadoop group encapsulates all the users including hdfs  

 

You can run

# cat /etc/group 

You should see something like

hadoop:x:1007:yarn-ats,hive,storm,infra-solr,zookeeper,oozie,atlas,ams,ranger,tez,zeppelin,kms,accumulo,livy,druid,spark,ambari-qa,kafka,hdfs,sqoop,yarn,mapred,hbase,knox

 

So the -chown should only target the directory named in the diagnostics log. NEVER run the -chown command on /, which is the root directory!
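
One way to find that directory is to filter a recursive listing by the name from the diagnostics instead of by owner. The sketch below assumes the usual eight-column `hdfs dfs -ls -R` output (permissions, replication, owner, group, size, date, time, path). The sample listing is invented purely for illustration, and `find_inode` is a hypothetical helper.

```shell
# Hypothetical `hdfs dfs -ls -R /` output; these paths and owners are invented for illustration.
sample_ls='drwxr-xr-x   - hdfs    hdfs            0 2019-01-01 10:00 /user
drwxr-xr-x   - alapati hdfs            0 2019-01-01 10:00 /user/alapati
-rw-r-----   3 airflow hadoop       1024 2019-01-01 10:00 /user/airflow/job.log'

# find_inode NAME: reads ls -R lines on stdin and prints "owner group path"
# for entries whose last path component equals NAME.
# Assumes paths contain no spaces (the path is read as field 8).
find_inode() {
    awk -v name="$1" '{
        n = split($8, parts, "/")
        if (parts[n] == name) print $3, $4, $8
    }'
}

printf '%s\n' "$sample_ls" | find_inode alapati
```

Against a real cluster the call would be `hdfs dfs -ls -R / | find_inode alapati`, and the paths it prints are the candidates for the -chown.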

 

Can you share your log, please?

Yes, we get the following:

 

cat /etc/group | grep -i hadoop
hadoop:x:1006:hive,livy,zookeeper,spark,ams,kafka,yarn,hcat,mapred

 

cat /etc/group | grep -i airflow
hdfs:x:1005:hdfs,hive,airflow
airflow:x:1016:

 

cat /etc/group | grep -i hdfs
hdfs:x:1005:hdfs,hive,airflow

 

Let me know if you need additional info.

Michael-Bronson

