Support Questions

jcleve72 · ‎06-06-2017

I am roughly following this tutorial. I am using an Ambari cluster, running locally.

https://www.tutorialspoint.com/apache_pig/apache_pig_storing_data.htm

The student_data.txt file is on the above page.

In pig -x mapreduce I run:

student = LOAD 'user/admin/Pig_Data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );

STORE student INTO '/Pig_Output/' USING PigStorage (',');

I get:

Input(s): Failed to read data from "hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"

Output(s): Failed to produce result in "file:///Pig_Output"

Why is the store command trying to read data from

"hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"

Where is this coming from? I put the file in HDFS with:

hdfs dfs -put student_data.txt 'user/admin/Pig_Data/student_data.txt'

I am assuming that the cluster is saving my files on the ambari-server node.

c7001.ambari.apache.org is the ambari-server node/namenode, but port 8020?

I am logged in to one of the Pig clients as vagrant, but where does the /user/vagrant come from?

jsensharma · ‎06-07-2017

@John Cleveland

One thing I noticed in your command is the path to student_data.txt. It has "user/admin"; shouldn't it be "/user/admin"? (The leading slash is missing.)

LOAD 'user/admin/Pig_Data/student_data.txt'

Are you running the job as the logged-in user "vagrant"? In that case Pig will look for the file relative to that user's HDFS home directory (/user/vagrant), because the path is specified as the relative path "user/admin......".
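To illustrate why the error message shows /user/vagrant in front of the path, here is a minimal Python sketch of how HDFS qualifies a path that has no leading slash. It is only a simulation, not Hadoop's actual code: the function name resolve_hdfs_path is hypothetical, and it assumes the default home directory layout of /user/<username> and the namenode address taken from the error message.

```python
import posixpath

# Hypothetical helper: approximates how HDFS qualifies a path.
# Assumes the default home directory prefix "/user" and uses the
# namenode address that appears in the error message above.
def resolve_hdfs_path(path, user, namenode="hdfs://c7001.ambari.apache.org:8020"):
    if not path.startswith("/"):
        # A relative path is resolved against the user's home directory.
        path = posixpath.join("/user", user, path)
    return namenode + posixpath.normpath(path)

# Path as written in the LOAD statement (no leading slash).
relative = resolve_hdfs_path("user/admin/Pig_Data/student_data.txt", "vagrant")
# Path with the leading slash added.
absolute = resolve_hdfs_path("/user/admin/Pig_Data/student_data.txt", "vagrant")

print(relative)
print(absolute)
```

The first result matches the path reported under Input(s), while the second points at /user/admin/Pig_Data/student_data.txt directly. The same rule applies to a relative destination passed to hdfs dfs -put, so it is worth checking where the file actually landed.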

SatishS · ‎06-07-2017

@John Cleveland

Can you execute these commands in the Grunt shell and let me know the results? The LOAD command should load the file from the path it specifies.

jcleve72 · ‎06-07-2017

The above is from grunt. Specifically pig -x mapreduce. The output is:

Input(s): Failed to read data from "hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"

Output(s): Failed to produce result in "file:///Pig_Output"

See the posting above.

I will post the entire output in a few minutes ... it is rather long.

jcleve72 · ‎06-07-2017

pig-output.txt

The above is the entire session.

jcleve72 · ‎06-07-2017

Thanks, I added that / and it did get rid of the user/vagrant but still no mapreduce.

I am going to start a new topic, because this one is essentially answered, but my real underlying problem has not been resolved.

jsensharma · ‎06-07-2017

@John Cleveland

Good to know that fixing the path resolved one issue. It would be great if you could mark this thread as answered by clicking the "Accept" link. This helps other community users find the correct answers quickly.

Cloudera Community

Can't store in pig