
Failed to read data from "/user/guest/Batting.csv"

Explorer

Running "pig 1.pig" from the tutorial http://hortonworks.com/hadoop-tutorial/faster-pig-... yields the error:

Input(s):
Failed to read data from "/user/guest/Batting.csv"

Output(s):

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG: null->null, null

1 ACCEPTED SOLUTION

@Matthew bird You need a home directory for the user in HDFS. Here is what is needed:

# Log in to the sandbox as root, then switch to the hdfs superuser
su - hdfs
hdfs dfs -mkdir /user/root
hdfs dfs -chown root:hadoop /user/root
hdfs dfs -chmod 755 /user/root

Try to run the Pig script after you've done the above steps.
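HDFS permissions follow the POSIX owner/group/other model, so the effect of the chmod 755 above can be illustrated on a local directory (a sketch only; it does not touch HDFS):

```shell
# Sketch: chmod 755 on a local directory behaves like the HDFS command above,
# since HDFS permissions use the same owner/group/other model.
demo_dir=$(mktemp -d)
chmod 755 "$demo_dir"
# 7 = rwx for the owner, 5 = r-x for the group, 5 = r-x for everyone else
mode=$(stat -c '%a' "$demo_dir" 2>/dev/null || stat -f '%Lp' "$demo_dir")
echo "mode: $mode"
rmdir "$demo_dir"
```

With 755, only the directory's owner (root, after the chown above) can create files in it; everyone else can list and traverse but not write.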


14 REPLIES

Master Collaborator

Did the file Batting.csv get copied over to HDFS? Can you run hadoop fs -ls /user/guest? It could be a missing-file issue or a permission issue.

Explorer

[root@sandbox lahman591-csv]# hadoop fs -ls /user/guest

Found 1 items

-rwxrwxrwx 3 root guest 6398886 2015-12-11 00:53 /user/guest/Batting.csv

Master Collaborator

Can you post the whole console output? The error here doesn't tell us much.

Explorer

@Matthew bird

It's a permission issue. The root user does not have permission to write to /user/root.

Try this:

hdfs dfs -chown -R root:hadoop /user/root

org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root/.staging":hdfs:hdfs:drwxr-xr-x
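To read that exception: the .staging inode is owned by hdfs:hdfs with mode drwxr-xr-x (755), so only the hdfs user holds the WRITE bit. A small sketch of how the mode string breaks down:

```shell
# Break the mode string from the exception into its three permission triplets.
mode="drwxr-xr-x"        # as reported for /user/root/.staging
owner=${mode:1:3}        # rwx -> the owner (hdfs) can write
group=${mode:4:3}        # r-x -> group members cannot write
other=${mode:7:3}        # r-x -> everyone else, including root here, cannot write
echo "owner=$owner group=$group other=$other"
```

Note that the local root user gets no special treatment from the NameNode: unless root is configured as the HDFS superuser, these bits are enforced against it like any other user, hence the AccessControlException.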

Explorer

/user/root doesn't exist so I tried the following:

su hdfs

hdfs dfs -chown -R root:sandbox /user/guest

exit

pig 1.pig

and I got the same error. The next thing I tried is:

hdfs dfs -chown -R root:hadoop /user/guest

but that still does not work

pastebin here: http://pastebin.com/8fe23qGL

I have also tried giving ownership of /user/guest to root:root, root:hadoop, and root:hdfs.

There are a few solutions:

1. The easy solution: grant the root user permission on the files. In this case the file itself has wide-open permissions, but because it sits under another user's home directory, root may not have access to the guest home directory. So check the permissions on /user/guest and adjust if needed.

2. Use the correct user for the job: I like to create a service ID for data processing rather than using local superusers (root) or HDFS superusers (hdfs). You can use users like guest or the built-in test user ambari-qa. The user is identified by their local OS identity, so you can switch to guest before running the process.
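As point 2 notes, with Hadoop's default "simple" authentication the HDFS user is taken from the local OS identity of the submitting process. A quick sketch:

```shell
# With Hadoop's default "simple" authentication, the identity a job acts as
# in HDFS is just the local OS user of the process that submits it.
current_user=$(id -un)
echo "Jobs submitted from this shell would act in HDFS as: $current_user"
# So switching identities first (e.g. su - guest) changes the HDFS user too.
```

This is why "su - guest" before running the Pig script would make the job write as guest, whose home directory already exists.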

Explorer

It looks to me like root:hadoop is already the owner:

[root@sandbox lahman591-csv]# hdfs dfs -ls /user/

Found 11 items

drwxrwx--- - ambari-qa hdfs 0 2015-10-27 12:39 /user/ambari-qa

drwxrwxrwx - root hadoop 0 2015-12-11 00:53 /user/guest

drwxr-xr-x - hcat hdfs 0 2015-10-27 12:43 /user/hcat

drwx------ - hdfs hdfs 0 2015-10-27 13:22 /user/hdfs

drwx------ - hive hdfs 0 2015-10-27 12:43 /user/hive

drwxrwxrwx - hue hdfs 0 2015-10-27 12:55 /user/hue

drwxrwxr-x - oozie hdfs 0 2015-10-27 12:44 /user/oozie

drwxr-xr-x - solr hdfs 0 2015-10-27 12:48 /user/solr

drwxrwxr-x - spark hdfs 0 2015-10-27 12:41 /user/spark

drwxr-xr-x - unit hdfs 0 2015-10-27 12:46 /user/unit

drwxr-xr-x - zeppelin zeppelin 0 2015-10-27 13:19 /user/zeppelin


Explorer

Definitely progress. Now it is looping (http://pastebin.com/bqgSDYdb). I am not sure whether this is still a permissions issue or not.

Master Collaborator

Check whether the services are up; it looks like your Job History Server may be down.

Explorer

I got it working with a fresh instance of HDP_2.3.2_virtualbox.

I set root:hadoop as owner of /user/root and did the same for /user/guest and that did the trick. Thanks everyone.

I see 'Connection Refused', which means either a service is down or you are connecting to the wrong port. As Deepesh said, it appears to be the former: the History Server is down.
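One way to confirm is to probe the Job History Server port directly (19888 is the default web UI port in HDP; adjust if your cluster differs). A sketch using bash's /dev/tcp:

```shell
# Probe the (default) Job History Server web port. "Connection refused" means
# nothing is listening there, i.e. the service is down or on another port.
port=19888
if (exec 3<>"/dev/tcp/localhost/$port") 2>/dev/null; then
  status="listening"
else
  status="connection refused"
fi
echo "port $port: $status"
```

If the probe fails, restart the MapReduce2 / Job History Server service from Ambari and retry the job.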

Explorer

I am still facing this issue. Matthew Bird, can you help me? How did you get a fresh instance?
