
permission denied while using distcp


I'm using the Cloudera Quickstart VM 13.0 on my machine.

While trying to copy data within the cluster with DistCp, I got a permission denied error because the hdfs user owns the directories I was accessing.

 

However, DistCp cannot be run as the default hdfs user, because hdfs is a blacklisted user for MapReduce jobs; yet when Cloudera is installed, hdfs is the default superuser of the distributed file system.

 

I used ACLs to grant permissions on the particular directories and reran the same distcp command, but I still get permission denied.

Is there a better way to copy the data?
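For reference, this is roughly how the ACL grant can look; 'myuser' and '/solomon' are placeholder names, and ACLs must already be enabled on the cluster (dfs.namenode.acls.enabled=true):

```shell
# Grant read/write/execute on the directory tree to the user,
# plus a default ACL so files created later inherit the access.
sudo -u hdfs hdfs dfs -setfacl -R -m user:myuser:rwx /solomon
sudo -u hdfs hdfs dfs -setfacl -R -m default:user:myuser:rwx /solomon

# Confirm the ACL entries took effect.
sudo -u hdfs hdfs dfs -getfacl /solomon
```

Note that DistCp runs as a MapReduce job, so the submitting user also needs write access to its own staging directories, not just to the source and target paths.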

 

Thanks and regards

solomonchinni


Mentor
The following pattern is often seen when running DR-like HDFS DistCp
jobs on secure clusters:

1. Define an HDFS admin group in your user identity backend (let's call it
'hdfsadmin').
2. Add qualified (strictly administrative) users to the new
'hdfsadmin' group, and ensure every host in the cluster shows the new
group when running an 'id username' command.
3. On both clusters, set dfs.permissions.superusergroup (HDFS -
Configuration - "Superuser Group" field in CM) to "hdfsadmin", which
allows members of this group to act as the HDFS superuser (equivalent to the
'hdfs' user when it comes to filesystem access activities).
4. Run DistCp as any user who has been granted membership of the 'hdfsadmin'
group.
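The steps above can be sketched as follows, assuming groups are managed locally rather than in LDAP/AD, and using 'hdfsadmin' and the user 'alice' as example names:

```shell
# 1-2. Create the admin group and add an administrative user to it.
#      Repeat on every cluster host (or manage the group centrally).
sudo groupadd hdfsadmin
sudo usermod -aG hdfsadmin alice

# Verify the group resolves on each host; 'hdfsadmin' should appear.
id alice

# 3. After changing "Superuser Group" to hdfsadmin in CM on both
#    clusters, restart HDFS so the new setting takes effect.

# 4. Run DistCp as the superuser-group member (cluster addresses
#    below are placeholders).
sudo -u alice hadoop distcp hdfs://src-nn:8020/data hdfs://dst-nn:8020/data
```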


I followed your steps by adding a new user to the supergroup, and I also granted permissions using ACLs.

The user name is 'perl'.

 

I then ran a new job:

 

$ sudo -u perl hadoop distcp /solomon/data/data.txt /solomon

 

When I run this job, the application stays in pending status.
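An application stuck in the pending/ACCEPTED state usually means YARN cannot allocate a container for it, which is common on the single-node Quickstart VM when the NodeManager has too little memory configured. A few commands that help diagnose this (the application ID below is a placeholder):

```shell
# List pending and running applications to find the application ID.
yarn application -list -appStates ACCEPTED,RUNNING

# Show the status of one application
# (application_1234567890_0001 is a placeholder ID).
yarn application -status application_1234567890_0001

# Check how much memory and how many vcores each NodeManager reports.
yarn node -list -all
```

If the NodeManagers report less memory than a single map container requests, the job will wait forever; raising yarn.nodemanager.resource.memory-mb (or lowering the container sizes) in CM is the usual fix on a Quickstart VM.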