Created 08-20-2016 07:24 PM
Hi All,
I'm doing a normal load into a relation and storing the result locally, i.e. on my Desktop, but running the command below throws ERROR 6000. Running the same thing in MapReduce mode stores the results properly to the HDFS path. Is it that we cannot store anything locally, and only read operations are possible in local mode?
STORE Relation_name INTO '/home/vaibhav/Desktop';
2016-08-21 00:52:44,693 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 6000: <line 9, column 0> Output Location Validation Failed for: '/home/vaibhav/Desktop More info to follow:1
2016-08-21 00:52:44,694 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2016-08-21 00:52:44,694 [main] ERROR org.apache.pig.tools.grunt.Grunt Org.apache.pig.impl.logicalLayer.FrontendException:
ERROR 1002: Unable to store alias CASE1
Created 08-22-2016 06:53 PM
Thank you so much for taking the time to solve the problem! Kudos.
The problem is solved. I did not have 777 permissions on the contents of the Desktop, so I followed these steps:
$ sudo chmod -R 777 /home/vaibhav/Desktop
Then I formatted the namenode (do this only if you can afford to lose what is in HDFS, since formatting wipes the namenode metadata), restarted the daemons, and followed the same steps to store the data. The directory was then created on the Desktop.
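A narrower check than a recursive chmod 777 can show which condition is actually blocking the STORE. Hadoop's file output format refuses to write into an output directory that already exists, and in local mode the parent directory also has to be writable by the user running Pig. The sketch below is only an illustration: check_store_target is a hypothetical helper name, not part of Pig or Hadoop, and it assumes a plain local path with no file:/// prefix.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a local-mode STORE target (not a Pig command).
# Returns 0 if the path looks usable, non-zero with a message on stderr otherwise.
check_store_target() {
    target=$1
    parent=$(dirname -- "$target")

    # The output path must not exist yet; STORE creates it.
    if [ -e "$target" ]; then
        echo "target already exists: $target" >&2
        return 1
    fi
    # The parent directory has to exist ...
    if [ ! -d "$parent" ]; then
        echo "parent directory missing: $parent" >&2
        return 2
    fi
    # ... and be writable and searchable by the current user.
    if [ ! -w "$parent" ] || [ ! -x "$parent" ]; then
        echo "no write access to: $parent" >&2
        return 3
    fi
    return 0
}

# Example (paths taken from this thread):
#   check_store_target /home/vaibhav/Desktop          # fails: Desktop already exists
#   check_store_target /home/vaibhav/Desktop/Output   # passes if Desktop is writable
```

If the first call fails and the second passes, pointing STORE at a new subdirectory may be enough, without changing permissions on everything under the Desktop.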
Created 08-20-2016 08:25 PM
Can you run ls on the output directory in the Grunt shell? I wonder whether Desktop with an upper-case D exists.
fs -ls
Created 08-20-2016 08:29 PM
Yes, the path is correct. I can see in the logs that it is not even able to read the records from the files when using the STORE operation in local mode.
Created 08-20-2016 11:34 PM
Please paste a sample dataset and your script, and I'll test.
Created 08-21-2016 06:46 PM
'media/vaibhav'
That does not look right
Created 08-22-2016 04:38 AM
I tried with my Desktop path as well, but it didn't work.
Created 08-21-2016 05:22 PM
I added the dataset as an attachment.
Please find below the script I used:
CASE1 = LOAD '/home/vaibhav/Desktop/mydata.csv' USING PigStorage(',') AS (cmte_id:chararray, cand_id:chararray);
STORE CASE1 INTO 'media/vaibhav' USING PigStorage(',');
Created 08-22-2016 01:25 PM
Hi @Vaibhav Kumar, could you please try the file:/// prefix in pig -x local mode?
STORE Relation_name INTO 'file:///home/vaibhav/Desktop/Output' USING PigStorage(',');
Created 08-22-2016 02:10 PM
Here's my result.
My home directory looks like this:
root@u1201:~# ls -ltra
total 8440
-rw-r--r-- 1 root root 140 Apr 19 2012 .profile
-rw-r--r-- 1 root root 3106 Apr 19 2012 .bashrc
-rw-r--r-- 1 root root 8491533 Nov 18 2015 apache-maven-3.3.9-bin.tar.gz
drwxr-xr-x 4 zookeeper users 4096 Dec 16 2015 hdp_manual_install_rpm_helper_files-2.3.4.0.3485
-rw-r--r-- 1 root root 85490 Dec 21 2015 hdp_manual_install_rpm_helper_files-2.3.4.0.3485.tar.gz
drwxr-xr-x 2 root root 4096 May 10 19:20 .oracle_jre_usage
-rw------- 1 root root 1024 May 10 19:20 .rnd
drw------- 2 root root 4096 May 10 19:35 .ssh
-rw------- 1 root root 1675 May 11 15:22 ec2-keypair
-rw-rw-r-- 1 vagrant vagrant 196 Aug 14 22:28 mydata.csv
-rw-r--r-- 1 root root 167 Aug 21 17:22 6836-mydata.tar.gz
drwxr-xr-x 26 root root 4096 Aug 22 12:59 ..
-rw------- 1 root root 4306 Aug 22 14:02 .bash_history
-rw-r--r-- 1 root root 1510 Aug 22 14:05 pig_1471874665243.log
-rw-r--r-- 1 root root 286 Aug 22 14:06 .pig_history
drwx------ 5 root root 4096 Aug 22 14:07 .
Enter Pig Tez local mode using:
pig -x tez_local
You will get the Grunt shell:
WARNING: Use "yarn jar" to launch YARN applications.
16/08/22 14:08:13 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/08/22 14:08:13 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
16/08/22 14:08:13 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL
16/08/22 14:08:13 INFO pig.ExecTypeProvider: Picked TEZ_LOCAL as the ExecType
2016-08-22 14:08:13,754 [main] INFO org.apache.pig.Main - Apache Pig version 0.15.0.2.4.2.0-258 (rexported) compiled Apr 25 2016, 06:41:45
2016-08-22 14:08:13,755 [main] INFO org.apache.pig.Main - Logging error messages to: /root/pig_1471874893753.log
2016-08-22 14:08:13,789 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2016-08-22 14:08:13,941 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2016-08-22 14:08:14,203 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-d9d3ca17-ae1d-42ec-b984-d38db00b1f0f
2016-08-22 14:08:14,737 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://u1202.ambari.apache.org:8188/ws/v1/timeline/
2016-08-22 14:08:15,456 [main] INFO org.apache.pig.backend.hadoop.ATSService - Created ATS Hook
Now start entering your script:
grunt> CASE1 = load 'mydata.csv' using PigStorage(',');
grunt> store CASE1 into 'outputdir' using PigStorage(',');
The result of that is:
Input(s):
Successfully read 10 records from: "file:///root/mydata.csv"
Output(s):
Successfully stored 10 records in: "file:///root/outputdir"
Now I can list the directory from the Grunt shell:
grunt> fs -ls
Found 15 items
-rw------- 1 root root 4306 2016-08-22 14:02 .bash_history
-rw-r--r-- 1 root root 3106 2012-04-19 09:15 .bashrc
drwxr-xr-x - root root 4096 2016-05-10 19:20 .oracle_jre_usage
-rw-r--r-- 1 root root 394 2016-08-22 14:10 .pig_history
-rw-r--r-- 1 root root 140 2012-04-19 09:15 .profile
-rw------- 1 root root 1024 2016-05-10 19:20 .rnd
drw------- - root root 4096 2016-05-10 19:35 .ssh
-rw-r--r-- 1 root root 167 2016-08-21 17:22 6836-mydata.tar.gz
-rw-r--r-- 1 root root 8491533 2015-11-18 07:42 apache-maven-3.3.9-bin.tar.gz
-rw------- 1 root root 1675 2016-05-11 15:22 ec2-keypair
drwxr-xr-x - zookeeper users 4096 2015-12-16 08:35 hdp_manual_install_rpm_helper_files-2.3.4.0.3485
-rw-r--r-- 1 root root 85490 2015-12-21 16:45 hdp_manual_install_rpm_helper_files-2.3.4.0.3485.tar.gz
-rw-rw-r-- 1 vagrant vagrant 196 2016-08-14 22:28 mydata.csv
drwxr-xr-x - root root 4096 2016-08-22 14:10 outputdir
-rw-r--r-- 1 root root 1510 2016-08-22 14:05 pig_1471874665243.log
Notice that outputdir and mydata.csv are both listed; this is inside the home directory. Let's run ls on outputdir:
grunt> fs -ls outputdir
Found 2 items
-rw-r--r-- 1 root root 0 2016-08-22 14:10 outputdir/_SUCCESS
-rw-r--r-- 1 root root 196 2016-08-22 14:10 outputdir/part-v000-o000-r-00000
Let's look inside the result file:
grunt> fs -cat outputdir/part-v000-o000-r-00000
cmte_id,cand_id
C00458844,P60006723
C00458844,P60006723
C00458846,P60006723
C00458847,P60006723
C00458848,P60006723
C00458846,P60006723
C00458846,P60006723
C00458856,P60006723
C00458852,P60006723
And for the sake of it, let's cat the input file:
grunt> fs -cat mydata.csv
cmte_id,cand_id
C00458844,P60006723
C00458844,P60006723
C00458846,P60006723
C00458847,P60006723
C00458848,P60006723
C00458846,P60006723
C00458846,P60006723
C00458856,P60006723
C00458852,P60006723
Created 08-22-2016 02:40 PM
@Artem Ervits, I guess it's a problem with relative vs. absolute paths. My observation is that in local mode, all STORE commands that target the present working directory work fine, but absolute paths require the file:/// prefix.
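One way to apply that observation consistently is to build the output URI outside the script and pass it in as a Pig parameter. The following is a minimal sketch under assumptions: to_file_uri is a hypothetical helper (not something Pig provides), the parameter name OUT and the file name script.pig are made up for illustration, and the helper does no percent-encoding, so it only suits paths without spaces or special characters.

```shell
#!/bin/sh
# Hypothetical helper: turn a local path into a file:// URI for use in Pig local mode.
to_file_uri() {
    case $1 in
        file://*) printf '%s\n' "$1" ;;                      # already a file URI, keep as is
        /*)       printf 'file://%s\n' "$1" ;;               # absolute path: add the prefix
        *)        printf 'file://%s/%s\n' "${PWD%/}" "$1" ;; # relative path: resolve against $PWD
    esac
}

# Example usage, assuming script.pig ends with:
#   STORE CASE1 INTO '$OUT' USING PigStorage(',');
#
#   pig -x local -param OUT="$(to_file_uri /home/vaibhav/Desktop/Output)" script.pig
```

Keeping the prefix handling in one place means the Pig script itself does not need to change between the relative and absolute cases.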