About AndresUrrego

AndresUrrego · 08-14-2017

Thanks so much @Dinesh Chitlangia. I finally set the output format like this:

GENERATE FLATTEN(group) AS (day, code_station),(int)total_dura as (total_dura:int),(float)avg_dura as (avg_dura:float),(int)qty_trips as (qty_trips:int)

Before storing the output in Hive, I created the table below:

CREATE TABLE july_analysis (day int,code_station int, total_dura double,avg_dura float,qty_trips int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE;

My problem now is that when I try to store the data, I get back an error message:

STORE july_result INTO 'poc.july_analysis' USING org.apache.hive.hcatalog.pig.HCatStorer ();
2017-08-14 09:56:55,712 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1002: Unable to store alias july_result

I saved the output as a file to confirm that everything was coming out right, and that worked. Also, when I opened my Pig console I typed pig -x tez -useHCatalog.

Thanks for any info you can provide, I appreciate it.

Andres U.
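A minimal sketch of one thing that might be worth checking, assuming the failure comes from a type mismatch (the ERROR 1002 line alone does not confirm the cause): the casts in the post make total_dura an int, while the Hive table declares it as double. The version below casts each aggregate to the type declared in the CREATE TABLE statement. It reuses july_gr and jul_cl_fl from the original question and assumes Pig was started with -useHCatalog.

```pig
-- Sketch only: assumes july_gr and jul_cl_fl exist as in the original question,
-- and that start_station already holds a type compatible with a Hive int column.
july_result = FOREACH july_gr GENERATE
    FLATTEN(group)                  AS (day, code_station),
    (double)SUM(jul_cl_fl.duration) AS total_dura,  -- Hive column: double
    (float)AVG(jul_cl_fl.duration)  AS avg_dura,    -- Hive column: float
    (int)COUNT(jul_cl_fl)           AS qty_trips;   -- Hive column: int

-- Compare the printed schema with the Hive table definition before storing.
DESCRIBE july_result;

STORE july_result INTO 'poc.july_analysis'
    USING org.apache.hive.hcatalog.pig.HCatStorer();
```

If DESCRIBE shows a field name or type that differs from the Hive table, that difference is the first thing to line up before retrying the STORE.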

AndresUrrego · 08-10-2017

Here is my problem today. I have created a relation as the result of a couple of transformations, after reading the source relation from Hive. I want to store the final relation back in Hive after a couple of analyses, but I can't. Let me show it more clearly in my code. The first step is where I LOAD from Hive and transform my result:

july = LOAD 'POC.july' USING org.apache.hive.hcatalog.pig.HCatLoader ;
july_cl = FOREACH july GENERATE GetDay(ToDate(start_date)) as day:int,start_station,duration;
jul_cl_fl = FILTER july_cl BY day==31;
july_gr = GROUP jul_cl_fl BY (day,start_station);
july_result = FOREACH july_gr {
total_dura = SUM(jul_cl_fl.duration);
avg_dura = AVG(jul_cl_fl.duration);
qty_trips = COUNT(jul_cl_fl);
GENERATE FLATTEN(group),total_dura,avg_dura,qty_trips;
};

So now, when I try to store the relation july_result, I can't, because the schema has changed and I suppose it's no longer compatible with Hive:

STORE july_result INTO 'poc.july_analysis' USING org.apache.hive.hcatalog.pig.HCatStorer ();

Even though I have tried to set an explicit schema for the final relation, I haven't figured it out:

july_result = FOREACH july_gr {
total_dura = SUM(jul_cl_fl.duration);
avg_dura = AVG(jul_cl_fl.duration);
qty_trips = COUNT(jul_cl_fl);
GENERATE FLATTEN(group) as (day:int),total_dura as (total_dura:int),avg_dura as (avg_dura:int),qty_trips as (qty_trips:int);
};

PS: the table in Hive already exists!

AndresUrrego · 06-23-2017

Thanks so much @Lester Martin, I appreciate your help. I replaced my statement with yours and it worked:

salaries_cl = FOREACH salaries_fl GENERATE (int)year as year:int,$1,$2,$3, (long)salary as salary:long;

Weird that the other one didn't work, but thanks so much.
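For anyone following along, the whole flow can be sketched roughly as below. The file name, the delimiter, and every column name other than year and salary are hypothetical, since the original script is not shown here; the cast statement is the one quoted in the post.

```pig
-- Hypothetical input: file name, delimiter and the three middle columns are assumptions.
salaries = LOAD 'salaries.csv' USING PigStorage(',')
           AS (year:chararray, team:chararray, league:chararray,
               player:chararray, salary:chararray);

-- Drop the header row: its first field is a column name, not a number.
salaries_fl = FILTER salaries BY year matches '[0-9]+';

-- Cast statement from the post: typed year and salary, other columns passed through.
salaries_cl = FOREACH salaries_fl GENERATE (int)year AS year:int, $1, $2, $3,
                                           (long)salary AS salary:long;

-- With typed columns, SUM operates on long values instead of chararray.
salaries_gr  = GROUP salaries_cl BY year;
salary_total = FOREACH salaries_gr GENERATE group AS year,
                                            SUM(salaries_cl.salary) AS total_salary;

DUMP salary_total;
```

Loading everything as chararray first keeps the header row from breaking the load, and the explicit casts happen only after that row has been filtered out.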

AndresUrrego · 06-20-2017

Hey everyone, my case today is a little weird to me, because as far as I can tell I'm running the right scripts. I need to load a file with a char structure, but then I need to delete the headers and set the right structure for the relation. Here is my code:

Finally, I'm trying to perform a sum using this new structure, but according to the log Pig always returns an error casting the columns to the new format (long for salary and int for year), as if Pig weren't able to pick up the new structure. Error:

Could one of our gurus let me know the right way to get a clean transformation and run my scripts? Thanks so much.

AndresUrrego · 06-14-2017

Did you create the user that you use in Hive, or did someone else do it for you?

AndresUrrego · 06-14-2017

Just to let you know my scenario: I'm playing with a single-node configuration in a virtual machine with Hortonworks services such as Hive, Pig, etc. A couple of questions from my side:
* Do you use a specific user, created to get access to Hive in your cluster using a view?
* Did you follow this tutorial? here

AndresUrrego · 06-14-2017

Have you already configured the ODBC connection parameters using the Hortonworks driver, and did it work? I'm reaching out to you because I'm trying to do that and I don't know how. Thanks.

AndresUrrego · 06-13-2017

Any update, buddy?

AndresUrrego · 06-06-2017

Got it, I'm starting to understand how grouping works in Pig. To be sure, I did:

october_gr_counting = FOREACH october_station_gr GENERATE group , COUNT(october);

Thanks so much, buddy.
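A minimal sketch of why this works, assuming a hypothetical input file and field list (only the relation names and reference_number come from the thread): GROUP yields one tuple per key holding the key and a bag of all matching rows, so projecting just the key and COUNT of the bag gives one row per reference number.

```pig
-- Hypothetical input: the file name and the second field are assumptions for illustration.
october = LOAD 'october.csv' USING PigStorage(',')
          AS (reference_number:chararray, duration:int);

-- GROUP produces one tuple per key: (group, {bag of all matching october tuples}).
october_station_gr = GROUP october BY reference_number;

-- Projecting only the key and COUNT(bag) yields one row per key,
-- instead of repeating the key inside every tuple of the bag.
october_gr_counting = FOREACH october_station_gr
                      GENERATE group AS reference_number, COUNT(october) AS qty;

DUMP october_gr_counting;
```

The bag is named after the grouped relation (october here), which is why COUNT takes october rather than october_station_gr.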

AndresUrrego · 06-05-2017

Hi guys, I have been struggling with my Pig code and haven't arrived at the desired result, which is why I'm reaching out to you. I have a file with some information, and my idea is to get a count by reference number. As an overview, I have done:

So the third step worked, but the problem is that it generates a huge tuple, because the reference number is included in every tuple of the grouping bag that contains the number, so the output is like:

Then I tried the fourth step, but although I got the list of counts, I lost the reference_number. I would like to get the same list, but with the reference code appearing just once. Thanks so much for your help, team. @Lester Martin

Member Since	01-21-2018 06:37 PM
Last Visited	01-21-2018 10:09 PM
Posts	58
Kudos received	4

Cloudera Community

Re: Load several files into HIVE table

Re: Read flume twitter files with HIVE

Re: Import Sqoop as textfile

Re: Pig - Store a complex relation schema in a hiv...

Pig - Store a complex relation schema in a hive ta...

Re: PIG tranform relations format after first load

PIG tranform relations format after first load

Re: Error while writing data to Hadoop via SSIS OD...

Re: Error while writing data to Hadoop via SSIS OD...

Re: Error while writing data to Hadoop via SSIS OD...

Re: Error while writing data to Hadoop via SSIS OD...

Re: How to get the desired grouping result pig

How to get the desired grouping result pig