Failed to read data from "/user/guest/Batting.csv"

Explorer

Running "pig 1.pig" from the tutorial http://hortonworks.com/hadoop-tutorial/faster-pig-... yields the following error:

Input(s): Failed to read data from "/user/guest/Batting.csv"

Output(s):

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG: null->null, null

Accepted Solution

Re: Failed to read data from "/user/guest/Batting.csv"

@Matthew bird You need a home directory for the user in HDFS. Here is what is needed:

# Log in to the sandbox as root, then switch to the hdfs superuser
su - hdfs
hdfs dfs -mkdir /user/root
hdfs dfs -chown root:hadoop /user/root
hdfs dfs -chmod 755 /user/root

Try to run the pig script after you've done the above steps.
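
The steps above can also be wrapped in a small script. This is only a sketch under stated assumptions: it is meant to be run as the hdfs superuser (for example after `su - hdfs`), the `run` and `make_hdfs_home` helpers and the `DRY_RUN` switch are illustrative names that are not part of Hadoop, and `-mkdir -p` is used so the script does not fail if the directory already exists. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: create an HDFS home directory for a user (here: root).
# Assumption: executed as the hdfs superuser on the sandbox.
# DRY_RUN is an illustrative switch, not a Hadoop feature:
#   DRY_RUN=1 (default) prints the commands, DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_hdfs_home USER [GROUP]: create /user/USER, hand it to USER:GROUP,
# and set mode 755, mirroring the three commands in the reply above.
make_hdfs_home() {
    user="$1"
    group="${2:-hadoop}"
    home="/user/$user"
    run hdfs dfs -mkdir -p "$home"
    run hdfs dfs -chown "$user:$group" "$home"
    run hdfs dfs -chmod 755 "$home"
}

make_hdfs_home root hadoop
```

Review the printed commands first, then rerun with DRY_RUN=0 to apply them.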

14 REPLIES

Re: Failed to read data from "/user/guest/Batting.csv"

Master Collaborator

Did the file Batting.csv get copied over to HDFS? Can you run hadoop fs -ls /user/guest? It could be that the file does not exist, or it could be a permission issue.

Re: Failed to read data from "/user/guest/Batting.csv"

Explorer

[root@sandbox lahman591-csv]# hadoop fs -ls /user/guest

Found 1 items

-rwxrwxrwx 3 root guest 6398886 2015-12-11 00:53 /user/guest/Batting.csv

Re: Failed to read data from "/user/guest/Batting.csv"

Master Collaborator

Can you print the whole console output? The error here doesn't tell much.

Re: Failed to read data from "/user/guest/Batting.csv"

@Matthew bird

It is a permission issue. User root does not have permission to write to /user/root.

Try this:

hdfs dfs -chown -R root:hadoop /user/root

org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root/.staging":hdfs:hdfs:drwxr-xr-x
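
To read the exception above, it can help to split it into its fields. The snippet below is a minimal sketch that does this with sed; `parse_ace` is an illustrative helper name, and the pattern assumes the message has exactly the layout shown in this thread.

```shell
#!/bin/sh
# The message exactly as it appears in this thread.
msg='Permission denied: user=root, access=WRITE, inode="/user/root/.staging":hdfs:hdfs:drwxr-xr-x'

# parse_ace MESSAGE: print the requesting user, the requested access,
# the path, and the owner, group, and mode reported for the inode.
parse_ace() {
    printf '%s\n' "$1" | sed -n 's/.*user=\([^,]*\), access=\([^,]*\), inode="\([^"]*\)":\([^:]*\):\([^:]*\):\(.*\)$/user=\1 access=\2 path=\3 owner=\4 group=\5 mode=\6/p'
}

parse_ace "$msg"
```

The reported owner and group are hdfs:hdfs and the mode is drwxr-xr-x, so only the owner (hdfs) has the write bit. That is why a WRITE request from root is denied when the job tries to create its .staging directory.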

Re: Failed to read data from "/user/guest/Batting.csv"

Explorer

/user/root doesn't exist so I tried the following:

su hdfs

hdfs dfs -chown -R root:sandbox /user/guest

exit

pig 1.pig

and I got the same error. The next thing I tried is:

hdfs dfs -chown -R root:hadoop /user/guest

but that still does not work

pastebin here: http://pastebin.com/8fe23qGL

I have also tried giving ownership of /user/guest to root:root, root:hadoop, and root:hdfs.

Re: Failed to read data from "/user/guest/Batting.csv"

There are a few solutions:

1. The easy solution: grant the root user permission on the files. In this case the file itself has wide-open permissions, but because it sits under another user's home directory, the root user may not have access to the guest home directory. So check the permissions on /user/guest and adjust them if needed.

2. Use the correct user for the job. I like to create a service ID for data processing rather than use local superusers (such as root) or HDFS superusers (such as hdfs). You could use a user like guest or the built-in test user ambari-qa. The user is identified based on the local OS identity, so you can switch to the guest user before running the process.
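
As a sketch of the second option, the script can be launched under the guest OS account with su. The assumptions here are that a local guest account exists on the sandbox, that the caller is root, and that the Pig script sits at a path guest can read; /tmp/1.pig is a hypothetical location, and `run_as` and `DRY_RUN` are illustrative names. By default the snippet only prints the command.

```shell
#!/bin/sh
# DRY_RUN is an illustrative switch: 1 (default) prints, 0 executes.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical location of the tutorial script, readable by guest.
SCRIPT="/tmp/1.pig"

# run_as USER COMMAND...: run COMMAND in a login shell of USER via su.
run_as() {
    user="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "su - $user -c \"$*\""
    else
        su - "$user" -c "$*"
    fi
}

run_as guest pig "$SCRIPT"
```

Because `su -` starts a login shell in the target user's home directory, an absolute script path is safer than a relative one.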

Re: Failed to read data from "/user/guest/Batting.csv"

Explorer

It looks to me like root:hadoop is already the owner:

[root@sandbox lahman591-csv]# hdfs dfs -ls /user/

Found 11 items

drwxrwx--- - ambari-qa hdfs 0 2015-10-27 12:39 /user/ambari-qa

drwxrwxrwx - root hadoop 0 2015-12-11 00:53 /user/guest

drwxr-xr-x - hcat hdfs 0 2015-10-27 12:43 /user/hcat

drwx------ - hdfs hdfs 0 2015-10-27 13:22 /user/hdfs

drwx------ - hive hdfs 0 2015-10-27 12:43 /user/hive

drwxrwxrwx - hue hdfs 0 2015-10-27 12:55 /user/hue

drwxrwxr-x - oozie hdfs 0 2015-10-27 12:44 /user/oozie

drwxr-xr-x - solr hdfs 0 2015-10-27 12:48 /user/solr

drwxrwxr-x - spark hdfs 0 2015-10-27 12:41 /user/spark

drwxr-xr-x - unit hdfs 0 2015-10-27 12:46 /user/unit

drwxr-xr-x - zeppelin zeppelin 0 2015-10-27 13:19 /user/zeppelin
