
Pig Load - ERROR 2118: Input path does not exist

Explorer

I'm a newbie at Pig scripting and just working through some examples (Cloudera on-demand training, to be specific). Anyway, I load a file:

 

hdfs dfs -put $ADIR/data/ad_data1.txt /dualcore/

 

I checked that the directory has the proper permissions via hdfs dfs -ls /. I can see /dualcore is chmod 777, and /dualcore/ad_data1.txt is also set properly in HDFS.
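
For reference, here's roughly how I verified the file is actually in HDFS (same /dualcore path as the put above):

hdfs dfs -ls /dualcore
hdfs dfs -ls /dualcore/ad_data1.txt

Both show the file as expected.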

 

Now when I run the script with pig -x local first_etl.pig, I get the following:

 

ERROR: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/dualcore/ad_data1.txt

 

QUESTION: The file is at the root, /dualcore/ad_data1.txt. When I cat the file (hdfs dfs -cat /dualcore/ad_data1.txt), it displays the data. Do I need to specify something other than LOAD '/dualcore/ad_data1.txt'?

 

SCRIPT:

data = LOAD '/dualcore/ad_data1.txt' USING PigStorage(':')
       AS (keyword:chararray,
           campaign_id:chararray,
           date:chararray,
           time:chararray,
           display_site:chararray,
           was_clicked:int,
           cpc:int,
           country:chararray,
           placement:chararray);

reordered = FOREACH data GENERATE campaign_id,
                                  date,
                                  time,
                                  UPPER(TRIM(keyword)),
                                  display_site,
                                  placement,
                                  was_clicked,
                                  cpc;

STORE reordered INTO '/dualcore/ad_data1/';
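
One more thing I noticed: the error says file:/dualcore/ad_data1.txt, i.e. the path is being resolved against the local filesystem rather than HDFS. As a rough sketch (the namenode host and port below are just placeholders for whatever the cluster actually uses), the filesystem could presumably be made explicit with a fully qualified URI:

data = LOAD 'hdfs://namenode:8020/dualcore/ad_data1.txt' USING PigStorage(':');

Though I'm not sure whether that's the right fix here or if I'm missing something else.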

1 ACCEPTED SOLUTION

Explorer
Argggg. OK, I need to find a wall and pound my head against it.

The issue was that I was running first_etl.pig as pig -x local first_etl.pig, which runs it locally and expects a local file, while what I want is to run it on the Hadoop cluster. Running it as pig first_etl.pig fires it off on the cluster and finds the file.
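
In other words, a quick summary of the invocations (script and path names as above):

pig -x local first_etl.pig      # local mode: paths resolve against the local filesystem (file:/...)
pig first_etl.pig               # default (MapReduce) mode: paths resolve against HDFS
pig -x mapreduce first_etl.pig  # same as the previous line, with the mode spelled out

So in local mode, LOAD '/dualcore/ad_data1.txt' was being looked up as file:/dualcore/ad_data1.txt, which is why it couldn't be found.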
