Member since: 01-21-2018
Posts: 58
Kudos Received: 4
Solutions: 3
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3442 | 09-23-2017 03:05 AM |
 | 1682 | 08-31-2017 08:20 PM |
 | 6402 | 05-15-2017 06:06 PM |
08-13-2020
12:42 AM
While starting the Hortonworks Sandbox it gets stuck on "extracting and loading the hortonworks sandbox..." and after some time it shows a critical error message, or sometimes it says "your system has run into an error, we'll restart it".
08-22-2019
10:43 AM
Did you find a solution to this?
05-11-2018
04:01 PM
Hello everyone, I have a situation and I would like to count on the community's advice and perspective. I'm working with PySpark 2.0 and Python 3.6 in an AWS environment with Glue. I need to pull historical information for many years and then apply a join across a bunch of previous queries. So I decided to create a DF for every query, so that I could easily iterate over the years and months I want to go back and create the DFs on the fly. The problem comes up when I need to join the DFs created in the loop: because I reuse the same DF name within the loop, and when I try to build a DF name dynamically it is read as a string rather than an actual DF, I cannot join them later. So far my code looks like:

    # AWS Glue job context assumed
    from awsglue.transforms import ApplyMapping
    from awsglue.dynamicframe import DynamicFrame

    query_text = 'SELECT * FROM TABLE WHERE MONTH = {}'
    months = [1, 2]
    frame_list = []
    for item in months:
        df = 'cohort_2013_{}'.format(item)
        query = query_text.format(item)
        frame_list.append(df)  # I intend to keep the DF names in a list to recall them later
        df = spark.sql(query)
        df = DynamicFrame.fromDF(df, glueContext, "df")
        applyformat = ApplyMapping.apply(frame = df, mappings =
            [("field1", "string", "field1", "string"),
             ("field2", "string", "field2", "string")],
            transformation_ctx = "applyformat")

    for df in frame_list:
        pass  # here I need to create a join query across all the created DFs

Please, if someone knows how I could achieve this requirement, let me know your ideas. Thanks so much.
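A minimal sketch of one way around this, keeping the DataFrame objects themselves (rather than their names) in a dictionary and joining them afterwards; the join key user_id and the inner join are assumptions to adapt to the real schema:

    # Sketch only: assumes an existing SparkSession `spark`, the same source
    # table as above, and a hypothetical join key column `user_id`.
    from functools import reduce

    query_text = 'SELECT * FROM TABLE WHERE MONTH = {}'
    months = [1, 2]

    frames = {}
    for item in months:
        name = 'cohort_2013_{}'.format(item)
        frames[name] = spark.sql(query_text.format(item))  # store the DataFrame object itself

    # Join all collected DataFrames pairwise on the assumed key.
    joined = reduce(
        lambda left, right: left.join(right, on='user_id', how='inner'),
        frames.values())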
Labels:
- Apache Spark
02-25-2018
09:14 PM
Sorry, sometimes I don't read things completely and an issue comes up 😞 It works seamlessly!
01-12-2018
06:53 PM
@Andres Urrego Regarding the VM failing, is it the services shutting down on their own and not staying up? One common cause of this is not enough memory: to reduce resource usage, try turning off all services and starting only HDFS, ZooKeeper, YARN and Spark. Also make sure that you give your VM at least 8GB of RAM (https://hortonworks.com/tutorial/sandbox-deployment-and-install-guide shows how). As far as documentation for Spark2/HDFS goes, here is a good Spark2 starter tutorial followed by a Spark2/HDFS project walkthrough:
https://hortonworks.com/tutorial/hands-on-tour-of-apache-spark-in-5-minutes/#option-2-download-and-setup-hortonworks-data-platform-hdp-sandbox
https://hortonworks.com/tutorial/sentiment-analysis-with-apache-spark/
09-23-2017
03:05 AM
Hi guys, I'm so so.... Well, I just remembered that you can create an external table over the folder where all the files with the same structure are located. That way I can load all the records in one shot.

    CREATE EXTERNAL TABLE bixi_his
    (
      STATIONS ARRAY<STRUCT<id:INT,s:STRING,n:STRING,st:STRING,b:STRING,su:STRING,m:STRING,lu:STRING,lc:STRING,bk:STRING,bl:STRING,la:FLOAT,lo:FLOAT,da:INT,dx:INT,ba:INT,bx:INT>>,
      SCHEMESUSPENDED STRING,
      TIMELOAD BIGINT
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    LOCATION '/user/ingenieroandresangel/datasets/bixi2017/';

Thanks
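A small sketch of querying this table and flattening the nested STATIONS array, assuming a Hive-enabled SparkSession named spark that can see bixi_his:

    # Sketch: flatten the STATIONS array of structs defined in the DDL above.
    flat = spark.sql("""
        SELECT s.id, s.la, s.lo, timeload
        FROM bixi_his
        LATERAL VIEW explode(stations) t AS s
    """)
    flat.show(5)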
08-31-2017
08:20 PM
Hi guys, I want to post the solution. Finally, I added the options below to my Flume configuration file:

    TwitterAgent.sources.Twitter.maxBatchSize = 50000
    TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000

Thanks
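For context, a sketch of where these two options sit in a Flume agent file for the Apache Twitter source; the agent, source and channel names and the placeholder credentials are assumptions, and the sink configuration is omitted:

    # flume.conf sketch (placeholders, not the original file)
    TwitterAgent.sources = Twitter
    TwitterAgent.channels = MemChannel
    TwitterAgent.channels.MemChannel.type = memory

    TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
    TwitterAgent.sources.Twitter.channels = MemChannel
    TwitterAgent.sources.Twitter.consumerKey = <consumer-key>
    TwitterAgent.sources.Twitter.consumerSecret = <consumer-secret>
    TwitterAgent.sources.Twitter.accessToken = <access-token>
    TwitterAgent.sources.Twitter.accessTokenSecret = <access-token-secret>
    TwitterAgent.sources.Twitter.maxBatchSize = 50000
    TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000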
08-28-2017
07:49 PM
Thank you @Nandish B Naidu..!! The solution worked.
08-15-2017
10:54 PM
1 Kudo
@Andres Urrego, What you are looking for (UPSERTs) isn't available in Sqoop import. There are several approaches to actually updating data in Hive. One of them is described here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_data-access/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html Other approaches include side loading and merging as post-Sqoop or scheduled jobs/processes. You can also look at Hive ACID transactions, or at the Hive-HBase integration package. Choosing the right approach is not trivial and depends on: initial volume, incremental volumes, frequency of incremental jobs, probability of updates, ability to identify uniqueness of records, acceptable latency, etc.
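A rough PySpark sketch of the "side load and merge" idea mentioned above; the table names base_table and incr_table, the key column id and the last_modified timestamp column are all assumptions to adapt to the real data:

    # Sketch: reconcile a Sqoop-loaded increment with the base table by keeping
    # the newest row per key. Assumes an existing SparkSession `spark` and
    # matching schemas.
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    base = spark.table('base_table')
    incr = spark.table('incr_table')

    w = Window.partitionBy('id').orderBy(F.col('last_modified').desc())

    merged = (base.union(incr)                  # columns must line up in the same order
                  .withColumn('rn', F.row_number().over(w))
                  .filter(F.col('rn') == 1)
                  .drop('rn'))

    # Write the reconciled result to a staging table that can replace the base.
    merged.write.mode('overwrite').saveAsTable('base_table_merged')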
08-16-2017
06:48 PM
You are so amazing, I really appreciate each of your comments and the time you have put in. Thanks so much. Just to let you know, buddy, the part that I forgot to tell you is that before going to Pig I load the file information into a Hive table within the DB POC. That is why I used:

    july = LOAD 'POC.july' USING org.apache.hive.hcatalog.pig.HCatLoader();

So the data coming from Hive already has a schema and the relation in Pig will match it. The problem is that even after setting a schema for the output I'm not able to store the outcome in a Hive table 😞 . So to reproduce my real scenario you should:

1. Load the CSV file in HDFS without headers (I delete them beforehand to avoid filters). Run:

    tail -n +2 OD_XXX.csv >> july.csv

2. Create the table and load the file in Hive:

    create table july (
      start_date string,
      start_station int,
      end_date string,
      end_station int,
      duration int,
      member_s int)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;

    LOAD DATA INPATH '/user/andresangel/datasets/july.CSV'
    OVERWRITE INTO TABLE july;

3. Follow my script posted above to the end to try to store the final outcome in a Hive table 🙂 Thanks buddy @Dinesh Chitlangia