Member since
04-27-2017
04-27-2017
07:43 PM
I was unable to find the path. HDFS is in four containers. The fix was to upload the files into /user/admin. I had originally opened Files View and uploaded the files there. On the Files View page there is a user folder, and inside the user folder is an admin folder. Placing timesheet.csv and drivers.csv in the admin folder fixed the issue.
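For anyone hitting the same error, here is a rough Python sketch of why the files had to sit in the admin folder. It assumes the usual HDFS convention that a relative path resolves against the running user's home directory, /user/<username>. The function name `resolve_hdfs_path` is made up for illustration and is not part of Pig or Hadoop. The NameNode address is the one from the error message.

```python
import posixpath

# Hypothetical helper: mimics how a LOAD path becomes a full HDFS URI.
# Assumption: the user's HDFS home directory is /user/<user>.
def resolve_hdfs_path(path, user, namenode="hdfs://sandbox.hortonworks.com:8020"):
    if not posixpath.isabs(path):
        # Relative paths are looked up under the user's home directory.
        path = posixpath.join("/user", user, path)
    return namenode + posixpath.normpath(path)

# A relative path lands in /user/admin when the script runs as admin.
print(resolve_hdfs_path("timesheet.csv", "admin"))
# An absolute path is used as written, so the subfolder must really exist.
print(resolve_hdfs_path("/user/admin/timesheet_info/timesheet.csv", "admin"))
```

Under that assumption, LOAD 'drivers.csv' only works if drivers.csv is in /user/admin, and the timesheet_info path only works if that folder was actually created and the file uploaded into it.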
04-27-2017
12:21 AM
@username I'm trying the Pig tutorial. The following script throws:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2118: Input path does not exist: hdfs://sandbox.hortonworks.com:8020/user/admin/timesheet.csv
drivers = LOAD 'drivers.csv' USING PigStorage(',');
raw_drivers = FILTER drivers BY $0>1;
drivers_details = FOREACH raw_drivers GENERATE $0 AS driverId, $1 AS name;
timesheet = LOAD '/user/admin/timesheet_info/timesheet.csv' USING PigStorage(',');
raw_timesheet = FILTER timesheet by $0>1;
timesheet_logged = FOREACH raw_timesheet GENERATE $0 AS driverId, $2 AS hours_logged, $3 AS miles_logged;
grp_logged = GROUP timesheet_logged by driverId;
sum_logged = FOREACH grp_logged GENERATE group as driverId,
SUM(timesheet_logged.hours_logged) as sum_hourslogged,
SUM(timesheet_logged.miles_logged) as sum_mileslogged;
join_sum_logged = JOIN sum_logged by driverId, drivers_details by driverId;
join_data = FOREACH join_sum_logged GENERATE $0 as driverId, $4 as name, $1 as hours_logged, $2 as miles_logged;
dump join_data;
I uploaded the files timesheet.csv and drivers.csv in Files View. I have tried:
LOAD '/user/admin/timesheet_info/timesheet.csv' USING PigStorage(',');
LOAD '/user/admin/timesheet _info/timesheet.csv' USING PigStorage(',');
Labels: Hortonworks Data Platform (HDP)