Created 06-06-2017 10:39 PM
I am roughly following this tutorial. I am using an Ambari cluster, but I am running it locally.
https://www.tutorialspoint.com/apache_pig/apache_pig_storing_data.htm
The student_data.txt file is on the above page.
In pig -x mapreduce I run:
student = LOAD 'user/admin/Pig_Data/student_data.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
STORE student INTO '/Pig_Output/' USING PigStorage(',');
I get:
Input(s):
Failed to read data from "hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"
Output(s):
Failed to produce result in "file:///Pig_Output"
Why is the store command trying to read data from
"hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"
Where is this coming from??? I put the file in HDFS as
hdfs dfs -put student_data.txt 'user/admin/Pig_Data/student_data.txt'
I am assuming that the cluster is saving my files on the ambari-server node.
c7001.ambari.apache.org is the ambari-server node/namenode. Is 8020 its port?
I am logged in on one of the Pig clients as vagrant, but where does the /user/vagrant come from?
Created 06-07-2017 07:25 AM
One thing I noticed in your command is the path to student_data.txt: it has "user/admin". Shouldn't it be "/user/admin"? (The leading slash is missing.)
LOAD 'user/admin/Pig_Data/student_data.txt'
Are you running the job as the logged-in user "vagrant"? In that case Pig will look for the file relative to that user's HDFS home directory (/user/vagrant), because the path is given as the relative "user/admin/...".
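To make the point about the missing slash concrete, here is a minimal sketch of the resolution rule: a path without a leading slash is resolved against /user/<username>, while a path with a leading slash is used as-is. The function resolve_hdfs_path is a hypothetical helper written only for this illustration; it is not an HDFS or Pig command and does not contact the cluster.

```shell
#!/bin/sh
# Illustration only: mimics how a relative HDFS path is resolved.
# resolve_hdfs_path is a hypothetical helper, not part of HDFS or Pig.
resolve_hdfs_path() {
    user="$1"   # the user running the job, e.g. vagrant
    path="$2"   # the path as written in LOAD or hdfs dfs
    case "$path" in
        /*) printf '%s\n' "$path" ;;                    # absolute: used unchanged
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;   # relative: prefixed with the home directory
    esac
}

# Path as written in the question (no leading slash)
resolve_hdfs_path vagrant 'user/admin/Pig_Data/student_data.txt'
# prints /user/vagrant/user/admin/Pig_Data/student_data.txt

# Path with the leading slash added
resolve_hdfs_path vagrant '/user/admin/Pig_Data/student_data.txt'
# prints /user/admin/Pig_Data/student_data.txt
```

The first result matches the path in the error message. The same rule applies to the destination given to hdfs dfs -put, so it is worth checking where the file actually landed before changing the LOAD path.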
Created 06-07-2017 02:23 AM
Can you execute these commands in the Grunt shell and let me know the results? Pig should load the file from the path given in your LOAD command.
Created 06-07-2017 07:05 AM
The above is from the Grunt shell, specifically pig -x mapreduce. The output is:
Input(s):
Failed to read data from "hdfs://c7001.ambari.apache.org:8020/user/vagrant/user/admin/Pig_Data/student_data.txt"
Output(s):
Failed to produce result in "file:///Pig_Output"
see above posting
I will post the entire output in a few minutes ... it is rather long.
Created 06-07-2017 07:20 AM
The above is the entire session.
Created 06-07-2017 08:08 AM
Thanks, I added the leading / and that got rid of the /user/vagrant prefix, but there is still no MapReduce.
I am going to start a new topic, because this one is essentially answered, but my real underlying problem has not been resolved.
Created 06-07-2017 08:11 AM