Created 04-27-2017 12:21 AM
I'm trying the Pig tutorial. The following script fails with this error: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2118: Input path does not exist: hdfs://sandbox.hortonworks.com:8020/user/admin/timesheet.csv
drivers = LOAD 'drivers.csv' USING PigStorage(',');
raw_drivers = FILTER drivers BY $0>1;
drivers_details = FOREACH raw_drivers GENERATE $0 AS driverId, $1 AS name;
timesheet = LOAD '/user/admin/timesheet_info/timesheet.csv' USING PigStorage(',');
raw_timesheet = FILTER timesheet by $0>1;
timesheet_logged = FOREACH raw_timesheet GENERATE $0 AS driverId, $2 AS hours_logged, $3 AS miles_logged;
grp_logged = GROUP timesheet_logged by driverId;
sum_logged = FOREACH grp_logged GENERATE group as driverId, SUM(timesheet_logged.hours_logged) as sum_hourslogged, SUM(timesheet_logged.miles_logged) as sum_mileslogged;
join_sum_logged = JOIN sum_logged by driverId, drivers_details by driverId;
join_data = FOREACH join_sum_logged GENERATE $0 as driverId, $4 as name, $1 as hours_logged, $2 as miles_logged;
dump join_data;
I uploaded the files timesheet.csv and drivers.csv in Files View.
I have tried:
LOAD '/user/admin/timesheet_info/timesheet.csv' USING PigStorage(',');
LOAD '/user/admin/timesheet _info/timesheet.csv' USING PigStorage(',');
The following error indicates one of three things: the directory "/user/admin" has not been created, the file "/user/admin/timesheet.csv" does not exist, or the file does not have the proper permissions.
ERROR 2118: Input path does not exist: hdfs://sandbox.hortonworks.com:8020/user/admin/timesheet.csv
Can you please check the path first?
# hdfs dfs -ls /user/admin
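If you want to check every input the script loads in one go, here is a minimal sketch. It assumes the `hdfs` client is on the PATH of the sandbox shell. The function name `check_hdfs_inputs` is made up for illustration; the only real command it relies on is `hdfs dfs -test -e`, which exits with 0 when the path exists.

```shell
# Hypothetical helper (not part of Hadoop): report which HDFS paths exist.
check_hdfs_inputs() {
  local rc=0
  local path
  for path in "$@"; do
    # hdfs dfs -test -e exits with 0 if the path exists in HDFS.
    if hdfs dfs -test -e "$path"; then
      echo "found: $path"
    else
      echo "missing: $path"
      rc=1
    fi
  done
  # Non-zero return means at least one path was missing.
  return "$rc"
}
```

You could call it as `check_hdfs_inputs /user/admin/drivers.csv /user/admin/timesheet.csv`. Any path reported as missing would be a likely cause of ERROR 2118.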
I was unable to find the path. HDFS is in four containers. The fix was to upload the files into /user/admin. I had originally opened Files View and uploaded the files there. On the Files View page there is a user folder, and inside the user folder there is an admin folder. Placing timesheet.csv and drivers.csv in the admin folder fixed the issue.
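For anyone who prefers the command line to Files View, the same fix might look roughly like the sketch below. It assumes the two CSV files are in the current local directory and that the `hdfs` client is available. The function name `upload_to_hdfs` is hypothetical; `-mkdir -p`, `-put -f` and `-ls` are standard `hdfs dfs` options. As far as I know, Pig resolves a relative path such as 'drivers.csv' against the user's HDFS home directory, which would be /user/admin here and matches the path in the error message.

```shell
# Hypothetical helper (not part of Hadoop): copy local files into an HDFS directory.
upload_to_hdfs() {
  local dest="$1"
  shift
  # Create the target directory if it does not exist yet.
  hdfs dfs -mkdir -p "$dest" || return 1
  # -f overwrites files that are already present in the target directory.
  hdfs dfs -put -f "$@" "$dest" || return 1
  # List the directory so the upload can be confirmed by eye.
  hdfs dfs -ls "$dest"
}
```

A call such as `upload_to_hdfs /user/admin timesheet.csv drivers.csv` should leave both files where the relative LOAD paths expect them.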