How to resolve ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2118: Input path does not exist: hdfs://sandbox.hortonworks.com:8020/user/admin/timesheet.csv

New Contributor

@username

I'm working through the Pig tutorial. The following script fails with: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2118: Input path does not exist: hdfs://sandbox.hortonworks.com:8020/user/admin/timesheet.csv

drivers = LOAD 'drivers.csv' USING PigStorage(',');
raw_drivers = FILTER drivers BY $0>1;
drivers_details = FOREACH raw_drivers GENERATE $0 AS driverId, $1 AS name;
timesheet = LOAD '/user/admin/timesheet_info/timesheet.csv' USING PigStorage(',');
raw_timesheet = FILTER timesheet by $0>1;
timesheet_logged = FOREACH raw_timesheet GENERATE $0 AS driverId, $2 AS hours_logged, $3 AS miles_logged;
grp_logged = GROUP timesheet_logged by driverId;
sum_logged = FOREACH grp_logged GENERATE group as driverId, SUM(timesheet_logged.hours_logged) as sum_hourslogged, SUM(timesheet_logged.miles_logged) as sum_mileslogged;
join_sum_logged = JOIN sum_logged by driverId, drivers_details by driverId;
join_data = FOREACH join_sum_logged GENERATE $0 as driverId, $4 as name, $1 as hours_logged, $2 as miles_logged;
dump join_data;

I uploaded timesheet.csv and drivers.csv through the Files View.

I have tried: LOAD '/user/admin/timesheet_info/timesheet.csv' USING PigStorage(',');

LOAD '/user/admin/timesheet _info/timesheet.csv' USING PigStorage(',');

2 REPLIES

Super Mentor

@Karen Wisdom

The following error indicates that either the directory "/user/admin" has not been created or the file "/user/admin/timesheet.csv" does not exist. It is also worth checking whether the file has the proper permissions.

ERROR 2118: Input path does not exist: hdfs://sandbox.hortonworks.com:8020/user/admin/timesheet.csv

Can you please check the path first?

# hdfs dfs -ls /user/admin
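If it helps, here is a rough sketch of the same check done per file. It is only an illustration: resolve_hdfs_path is a made-up helper name, and it assumes the usual default where a relative path such as 'drivers.csv' is resolved against the HDFS home directory /user/<name>. It does not handle full hdfs:// URIs.

```shell
#!/bin/sh
# Sketch: show where a LOAD path ends up in HDFS, then check that it exists.
# Assumption: the HDFS home directory is /user/<name> (the usual default).

resolve_hdfs_path() {
    # $1 = HDFS user name, $2 = path as written in the LOAD statement
    case "$2" in
        /*) printf '%s\n' "$2" ;;                 # absolute path: kept as-is
        *)  printf '/user/%s/%s\n' "$1" "$2" ;;   # relative path: under the home directory
    esac
}

for p in 'drivers.csv' '/user/admin/timesheet_info/timesheet.csv'; do
    target=$(resolve_hdfs_path admin "$p")
    if command -v hdfs >/dev/null 2>&1; then
        # hdfs dfs -test -e exits with 0 only when the path exists
        if hdfs dfs -test -e "$target"; then
            echo "found:   $target"
        else
            echo "missing: $target"
        fi
    else
        echo "hdfs not on PATH, would check: $target"
    fi
done
```

Any path reported as missing is one that a LOAD statement would reject with ERROR 2118.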

New Contributor

I was unable to find the path. HDFS is in four containers. The fix was to upload the files into /user/admin. I had originally opened the Files View and uploaded the files there. On the Files View page there is a user folder, and inside the user folder there is an admin folder. Placing timesheet.csv and drivers.csv in the admin folder fixed the issue.
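For anyone who prefers the command line to the Files View, the same fix can probably be done with the hdfs client. This is only a sketch: upload_cmds is a hypothetical helper that prints the commands instead of running them. It assumes both CSV files sit in the current local directory, that the file names contain no spaces, and that the shell user is allowed to write to /user/admin.

```shell
#!/bin/sh
# Sketch: print the hdfs commands that place local files into an HDFS directory.
# Assumption: file names contain no spaces; the caller may write to the target.

upload_cmds() {
    # $1 = target HDFS directory, remaining arguments = local files
    dir="$1"
    shift
    echo "hdfs dfs -mkdir -p $dir"        # create the directory if it is missing
    for f in "$@"; do
        echo "hdfs dfs -put -f $f $dir/"  # -f overwrites an existing copy
    done
    echo "hdfs dfs -ls $dir"              # list the directory to confirm
}

upload_cmds /user/admin timesheet.csv drivers.csv
```

Review the printed commands first, then pipe the output to sh on a machine with the hdfs client to actually run them.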