Member since 01-21-2018
58 Posts
4 Kudos Received
3 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3302 | 09-23-2017 03:05 AM
 | 1566 | 08-31-2017 08:20 PM
 | 6193 | 05-15-2017 06:06 PM
08-14-2017 02:02 PM
Thanks so much @Dinesh Chitlangia. I finally set the output format like this:
GENERATE FLATTEN(group) AS (day, code_station),(int)total_dura as (total_dura:int),(float)avg_dura as (avg_dura:float),(int)qty_trips as (qty_trips:int);
Before storing the output in Hive, I created the table below:
CREATE TABLE july_analysis
(day int, code_station int, total_dura double, avg_dura float, qty_trips int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
My problem now is that when I try to store the data, I get an error message:
STORE july_result INTO 'poc.july_analysis' USING org.apache.hive.hcatalog.pig.HCatStorer ();
2017-08-14 09:56:55,712 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1002: Unable to store alias july_result
I saved the output as a file to confirm that everything was coming out right, and that worked. Also, when I opened my Pig console I typed pig -x tez -useHCatalog. Thanks for any information you can provide, I appreciate it. Andres U.
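One possible cause, not confirmed anywhere in this thread, is a type mismatch: the GENERATE above casts total_dura to int, while the Hive table declares that column as double. The sketch below casts each aggregate to the type of its target column before storing. It assumes july_gr and jul_cl_fl are defined as in the 08-10-2017 post and that start_station is already an int on the Hive side; the relation name july_typed is hypothetical.

```pig
-- Sketch only: assumes july_gr and jul_cl_fl exist as in the 08-10-2017 post.
-- Target Hive columns: day int, code_station int, total_dura double,
-- avg_dura float, qty_trips int.
july_typed = FOREACH july_gr GENERATE
    FLATTEN(group) AS (day, code_station),
    (double)SUM(jul_cl_fl.duration) AS total_dura,
    (float)AVG(jul_cl_fl.duration) AS avg_dura,
    (int)COUNT(jul_cl_fl) AS qty_trips;

-- Check that the Pig schema lines up with the Hive table before storing.
DESCRIBE july_typed;

STORE july_typed INTO 'poc.july_analysis'
    USING org.apache.hive.hcatalog.pig.HCatStorer();
```

If DESCRIBE shows names or types that differ from the table definition, that difference is the first thing worth fixing; the full stack trace in the Pig log file usually names the offending column.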
08-10-2017 11:10 PM
Here is my problem today. I have created a relation as the result of a couple of transformations, after reading the source relation from Hive. I want to store the final relation back in Hive after the analysis, but I can't. Let me show it in my code to make it clearer. The first step is where I LOAD from Hive and transform my result:
july = LOAD 'POC.july' USING org.apache.hive.hcatalog.pig.HCatLoader();
july_cl = FOREACH july GENERATE GetDay(ToDate(start_date)) as day:int,start_station,duration;
jul_cl_fl = FILTER july_cl BY day==31;
july_gr = GROUP jul_cl_fl BY (day,start_station);
july_result = FOREACH july_gr {
total_dura = SUM(jul_cl_fl.duration);
avg_dura = AVG(jul_cl_fl.duration);
qty_trips = COUNT(jul_cl_fl);
GENERATE FLATTEN(group),total_dura,avg_dura,qty_trips;
};
Now, when I try to store the relation july_result I can't, because the schema has changed and I suppose it is no longer compatible with Hive:
STORE july_result INTO 'poc.july_analysis' USING org.apache.hive.hcatalog.pig.HCatStorer ();
I have also tried to set an explicit schema for the final relation, but I haven't figured it out:
july_result = FOREACH july_gr {
total_dura = SUM(jul_cl_fl.duration);
avg_dura = AVG(jul_cl_fl.duration);
qty_trips = COUNT(jul_cl_fl);
GENERATE FLATTEN(group) as (day:int),total_dura as (total_dura:int),avg_dura as (avg_dura:int),qty_trips as (qty_trips:int);
};
PS: the table in Hive already existed beforehand!
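A minimal sketch of one way to name the flattened fields, assuming july_gr is grouped by (day, start_station) as in the script above. Because the group key has two fields, FLATTEN(group) produces two columns, so its AS clause needs two names rather than one. The alias code_station and the DESCRIBE call are illustrative additions, not part of the original script.

```pig
-- Sketch only: assumes july_gr and jul_cl_fl are defined as in the post above.
-- The group key is (day, start_station), so FLATTEN(group) needs two names.
july_result = FOREACH july_gr GENERATE
    FLATTEN(group) AS (day, code_station),
    SUM(jul_cl_fl.duration) AS total_dura,
    AVG(jul_cl_fl.duration) AS avg_dura,
    COUNT(jul_cl_fl) AS qty_trips;

-- Print the resulting schema to compare it with the Hive table definition.
DESCRIBE july_result;
```

DESCRIBE shows the field names and types Pig has inferred, which makes it easier to see whether the relation lines up with the target table.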
Labels: Apache Hive, Apache Pig
06-23-2017 07:18 PM
Thanks so much @Lester Martin, I appreciate your help. I replaced my statement with yours and it worked:
salaries_cl = FOREACH salaries_fl GENERATE (int)year as year:int,$1,$2,$3, (long)salary as salary:long;
It's weird that the other one didn't work, but thanks so much.
06-20-2017 01:07 PM
Hey everyone, my case today is a little weird to me because, as far as I can tell, I'm running the right scripts. I need to load a file with all fields as character data, then delete the header row and set the right schema on the relation. Here is my code:
Finally, I'm trying to compute a sum using this new schema, but according to the log Pig always returns an error when casting the columns to the new types (long for salary and int for year), as if Pig were unable to pick up the new schema. Error:
Could one of our gurus let me know the right way to do this transformation and run my scripts? Thanks so much.
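The original code and error output were not preserved in this post, so the following is only a sketch of the general approach described: load every column as chararray, filter out the header row, then cast. The file path, delimiter, field names, and the header text 'yearID' are all hypothetical.

```pig
-- Sketch only: the path, delimiter, field names, and header text are hypothetical.
salaries = LOAD 'salaries.csv' USING PigStorage(',')
    AS (year:chararray, team:chararray, league:chararray,
        player:chararray, salary:chararray);

-- Drop the header row by matching its first column against the header text.
salaries_fl = FILTER salaries BY year != 'yearID';

-- Cast the text columns to numeric types once the header is gone.
salaries_cl = FOREACH salaries_fl GENERATE
    (int)year AS year, team, league, player, (long)salary AS salary;

-- Sum the salaries per year using the typed relation.
salaries_gr = GROUP salaries_cl BY year;
salaries_sum = FOREACH salaries_gr GENERATE
    group AS year, SUM(salaries_cl.salary) AS total_salary;
```

Casting in the same FOREACH that names the fields keeps the declared schema and the actual values in agreement, so later operators such as SUM see a long rather than a chararray.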
Labels: Apache Pig
06-14-2017 01:55 PM
Did you create the user that you use in Hive, or did someone else create it for you?
06-14-2017 01:26 PM
Just to let you know my scenario: I'm playing with a single-node configuration in a virtual machine with Hortonworks services such as Hive, Pig, etc. A couple of questions from my side:
* Do you use a specific user, created to access Hive in your cluster through a view?
* Did you follow this tutorial? here
06-14-2017 01:15 PM
Have you already configured the ODBC connection parameters using the Hortonworks driver, and did it work? I'm reaching out because I'm trying to do that and I don't know how. Thanks.
06-13-2017 02:43 AM
Any update, buddy?
06-06-2017 08:00 PM
Got it, I'm starting to understand how grouping works in Pig. To be sure, I actually did:
october_gr_counting = FOREACH october_station_gr GENERATE group, COUNT(october);
Thanks so much, buddy.
06-05-2017 01:11 PM
Hi guys, I have been struggling with my Pig code and I haven't arrived at the desired result, which is why I'm reaching out to you. I have a file with some information, and my idea is to get a count by reference number. As an overview, this is what I have done:
The third step worked, but the problem is that it generates a huge tuple, because every tuple in the grouping bag also carries the reference number, so the output looks like:
Then I tried the fourth step, and although I got the list of counts, I lost the reference_number. I would like to get the same list but with the reference code shown just once. Thanks so much for your help, team. @Lester Martin
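The steps referred to above were not preserved, so this is only a sketch of the usual group-and-count pattern that matches the goal described. The file path, field names, and relation names are hypothetical.

```pig
-- Sketch only: path, field names, and relation names are hypothetical.
refs = LOAD 'references.csv' USING PigStorage(',')
    AS (reference_number:chararray, detail:chararray);

-- Each output tuple holds the key in 'group' plus a bag of matching tuples.
refs_gr = GROUP refs BY reference_number;

-- Emit the key once per group and count the tuples in its bag.
refs_count = FOREACH refs_gr GENERATE
    group AS reference_number, COUNT(refs) AS total;

DUMP refs_count;
```

Generating group instead of the bag's reference_number field is what keeps the key from repeating: the bag holds one copy per input row, while group holds it exactly once.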
Labels: Apache Hadoop, Apache Pig