I'm working on configuring a Hadoop cluster with HDFS, MapReduce, Pig, and Hive.
HDFS and MapReduce are fine, but I'm having trouble setting up Pig and Hive to work with Tez.
By the way, I was able to get Hive to work by setting the execution engine to MapReduce, but I think it is too slow.
Let me focus on Pig as an example:
credit_operations = LOAD '/user/admin/creditcard.csv' USING PigStorage(',');
credit_group = GROUP credit_operations ALL;
amount_sum = FOREACH credit_group GENERATE SUM(credit_operations.$30) AS sum_credit_operations;
I wrote the snippet above to load a dataset and sum one of its columns. Nothing too fancy.
When I check YARN, I see the job there:
However, it is completely stuck, and when I look for it in the Tez View, no job shows up there.