About prodgers125

prodgers125 · ‎11-30-2016

Hi all, is it possible to create a workflow in Oozie that automatically executes some Hive, Pig and Spark scripts in order to automate my analytics process? Many thanks!

prodgers125 · ‎09-29-2016

Hi experts, how can I overwrite an existing file with a new one (data update)? Imagine that I have this: result.map(pair => pair.swap).sortByKey(true).saveAsTextFile("FILE/results") And imagine that I want to do this: test.map(pair => pair.swap).sortByKey(false).saveAsTextFile("FILE/results") How can I overwrite the output of result with the output of test in the same directory?

prodgers125 · ‎09-17-2016

gkeys, many thanks! This was a fantastic answer and covered all of my doubts! 😄 😄

prodgers125 · ‎09-17-2016

What is the biggest advantage of using Hadoop instead of SQL Server or ODI when we aren't in a Big Data scenario? Many thanks!

prodgers125 · ‎09-04-2016

Hi experts, I have this statement in Apache Pig: ... Count = FOREACH data GENERATE SUM(Field); ... How can I write an IF statement like this: IF(SUM(Field) > 10) Store into X; ELSE STORE into Y; Is it possible to do this? Many thanks!

prodgers125 · ‎08-08-2016

Hi mqureshi, many thanks for your help 🙂 I will look for good articles/tutorials that show me how to use complex types in Hive. Thanks!

prodgers125 · ‎08-08-2016

Hi, I have four tables in .csv. All of them can be connected through a fact table (which is in .csv too). I want to do some data cleansing on these files and then put them into one big table in Hive. But in Apache Pig, should I create a script for each table individually, or is it better to join them in Pig and then apply some data cleansing to the joined table? Thanks!

prodgers125 · ‎08-04-2016

I was missing some Jar files 🙂

prodgers125 · ‎08-03-2016

If I use Python inside a file.py in my HDFS I can run Python UDFs, but with Java I'm getting an error... I think I'm not getting all the files.

prodgers125 · ‎08-03-2016

Perfect Lester 🙂 It's exactly what I need!!! 🙂 Many thanks!!!

Member Since	‎04-27-2016 01:54 AM
Last Visited	‎07-13-2016 11:56 AM
Posts	60
Kudos received	20

Cloudera Community

Oozie - Scheduling - Pig and Hive Scripts and Spar...

Apache SPARK - Overwrite data file

Re: Hadoop versus (SQL Server or ODI)

Hadoop versus (SQL Server or ODI)

Apache PIG - If Statement based on a count value

Re: Data Modeling in Big Data - Star schema into H...

Apache PIG - Script per Table to data cleansing

Re: Big Data Analytics - Approach for Data Quality...

Re: Big Data Analytics - Approach for Data Quality...

Re: Creating an iterative loop using Apache PIG