New Contributor
Posts: 9
Registered: ‎03-06-2017
Accepted Solution

Pig Load - ERROR 2118: Input path does not exist

I'm a newbie at Pig scripting and just walking through some examples (Cloudera on-demand training, to be specific). Anyway, I load a file into HDFS:

 

hdfs dfs -put $ADIR/data/ad_data1.txt /dualcore/

 

Check that the directory has proper permissions via hdfs dfs -ls /

I can see that /dualcore is chmod 777, and I also checked that /dualcore/ad_data1.txt is set properly in HDFS.

 

Now when I run the script with pig -x local first_etl.pig, I get the following:

 

ERROR: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/dualcore/ad_data1.txt

 

QUESTION: The file is at the root, /dualcore/ad_data1.txt. When I cat the file [hdfs dfs -cat /dualcore/ad_data1.txt] it displays the data. Do I need to specify something other than LOAD '/dualcore/ad_data1.txt'?

 

SCRIPT:

data = LOAD '/dualcore/ad_data1.txt' USING PigStorage(':') AS (keyword:chararray,
    campaign_id:chararray,
    date:chararray,
    time:chararray,
    display_site:chararray,
    was_clicked:int,
    cpc:int,
    country:chararray,
    placement:chararray);

reordered = FOREACH data GENERATE campaign_id,
    date,
    time,
    UPPER(TRIM(keyword)),
    display_site,
    placement,
    was_clicked,
    cpc;

STORE reordered INTO '/dualcore/ad_data1/';
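For what it's worth, an unqualified LOAD path like '/dualcore/ad_data1.txt' resolves against the default filesystem of whatever mode Pig runs in (file: in local mode, hdfs: in MapReduce mode). As a sketch, if you ever need to pin the location regardless of mode, you can fully qualify the URI; the 'namenode:8020' host/port below is a placeholder for your cluster's actual fs.defaultFS value:

```pig
-- Fully qualified URI forces HDFS resolution even in local mode.
-- 'namenode:8020' is a placeholder; substitute your cluster's fs.defaultFS.
data = LOAD 'hdfs://namenode:8020/dualcore/ad_data1.txt' USING PigStorage(':')
       AS (keyword:chararray, campaign_id:chararray, date:chararray,
           time:chararray, display_site:chararray, was_clicked:int,
           cpc:int, country:chararray, placement:chararray);
```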


Re: Pig Load - ERROR 2118: Input path does not exist

Argggg. OK, I need to find a wall and pound my head against it.

The issue was that I was running the script as pig -x local first_etl.pig, which runs it in local mode and therefore expects a local file, while what I want is to run it on the Hadoop cluster. Running it as pig first_etl.pig fires it off in MapReduce mode and finds the file in HDFS.
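The two invocations side by side (a sketch; first_etl.pig and the HDFS path are the ones from the thread above):

```shell
# Local mode: LOAD paths resolve against the local filesystem (file:/...),
# so /dualcore/ad_data1.txt would have to exist on local disk.
pig -x local first_etl.pig

# Default (MapReduce) mode: LOAD paths resolve against HDFS,
# so /dualcore/ad_data1.txt is found where 'hdfs dfs -put' placed it.
pig first_etl.pig
```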