Can't store in pig

Explorer

I am roughly following this tutorial. I am using an Ambari cluster, but I am running it locally.

https://www.tutorialspoint.com/apache_pig/apache_pig_storing_data.htm

The student_data.txt file is on the above page.

In pig -x mapreduce I run:

student = LOAD 'user/admin/Pig_Data/student_data.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
STORE student INTO '/Pig_Output/' USING PigStorage(',');

I get:

Input(s):
Failed to read data from "hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"

Output(s):
Failed to produce result in "file:///Pig_Output"

Why is the store command trying to read data from

"hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"

Where is this coming from? I put the file in HDFS with:

hdfs dfs -put student_data.txt 'user/admin/Pig_Data/student_data.txt'

I am assuming that the cluster is saving my files on the ambari-server node.

Is c7001.ambari.apache.org the ambari-server node/NameNode, on port 8020?

I am logged on to one of the Pig clients as vagrant, but where does the /user/vagrant come from?

1 ACCEPTED SOLUTION

Master Mentor

@John Cleveland

One thing I noticed in your command is the path to student_data.txt: it starts with "user/admin". Shouldn't it be "/user/admin"? The leading slash is missing.

LOAD 'user/admin/Pig_Data/student_data.txt'

Are you running the job as the logged-in user "vagrant"? In that case, because the path is given as the relative "user/admin......", Pig will look for the file under that user's home directory.
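
To illustrate the point above, here is a minimal sketch of how a path without a leading slash ends up under the user's home directory. The function resolve_hdfs_path is a hypothetical helper written for this post, not part of Hadoop or Pig, and it assumes the default HDFS home directory layout of /user/<username>. It only mimics the resolution rule and does not contact the cluster.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop command): mimics how a path is
# qualified when it does or does not start with a slash.
# Assumption: the user's HDFS home directory is /user/<username>.
resolve_hdfs_path() {
    path=$1
    user=$2
    case "$path" in
        # Absolute path: used exactly as written.
        /*) printf '%s\n' "$path" ;;
        # Relative path: prefixed with the user's home directory.
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;
    esac
}

# Path as written in the LOAD statement (no leading slash), run as vagrant.
resolve_hdfs_path 'user/admin/Pig_Data/student_data.txt' vagrant

# Same path with the leading slash added.
resolve_hdfs_path '/user/admin/Pig_Data/student_data.txt' vagrant
```

The first call prints /user/vagrant/user/admin/Pig_Data/student_data.txt, which matches the path in the "Failed to read data from" message. The second prints /user/admin/Pig_Data/student_data.txt unchanged. The same rule applies to a relative target passed to hdfs dfs -put.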


6 REPLIES 6

Expert Contributor

@John Cleveland

Can you execute these commands in grunt and let me know the results? Pig should load the file from the path given in your LOAD command.

Explorer

The above is from grunt, specifically pig -x mapreduce. The output is:

Input(s):
Failed to read data from "hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"

Output(s):
Failed to produce result in "file:///Pig_Output"

See my original post above.

I will post the entire output in a few minutes ... it is rather long.

Explorer
pig-output.txt

The above is the entire session.

Explorer

Thanks, I added the / and that got rid of the user/vagrant prefix, but there is still no MapReduce.

I am going to start a new topic, because this one is essentially answered, but my real underlying problem has not been resolved.

Master Mentor

@John Cleveland

Good to know that fixing the path resolved one issue. It would be great if you could mark this thread as answered by clicking the "Accept" link; this helps other community users find the correct answer quickly.