About aervits

teixeira_javier · ‎02-02-2017

Yes, all clients are installed on the host.

joby_johny · ‎02-09-2017

Thanks, Artem. I downloaded the VMware image, uploaded it to S3, created the AMI, and then spun up the cluster from the AMI.

pmohan · ‎01-30-2017

Thanks, this helps.

kuldeepsinghvit · ‎03-06-2017

I had a similar requirement to submit a job to a particular queue. The only place that needs to change is the Oozie workflow settings: add a Hadoop property there named "MAPRED.JOB.QUEUENAME" with the value "YOUR_QUEUE_NAME". This worked for me; my Oozie workflow is now submitted to the specified queue. Cheers.
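For anyone looking for where that property goes, here is a rough sketch of a single action in workflow.xml. It assumes the property meant above is Hadoop's `mapred.job.queue.name` (the MRv2/YARN equivalent is `mapreduce.job.queuename`). The action name `mr-step`, the `end`/`fail` transitions, and the `${jobTracker}`/`${nameNode}` parameters are placeholders, not taken from the original workflow.

```xml
<!-- Sketch only: one action inside workflow.xml; names and transitions are placeholders -->
<action name="mr-step">
    <map-reduce>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <!-- Assumed property name; on MRv2/YARN use mapreduce.job.queuename -->
                <name>mapred.job.queue.name</name>
                <value>YOUR_QUEUE_NAME</value>
            </property>
        </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

The same `<property>` element can usually be placed in the `<configuration>` block of other action types as well, so the queue can be set per action if needed.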

aervits · ‎01-17-2017

1. Only once.
2. Use a decision property: https://www.infoq.com/articles/oozieexample/

aervits · ‎02-01-2017

@Vaibhav Kumar The recommendations from my colleagues are valid: you have strings in the header row of your CSV documents. You can certainly filter by some known entity, but there is a more advanced version of the CSV Pig loader called CSVExcelStorage. It is part of the Piggybank library that comes bundled with HDP, hence the register command. You can pass different control parameters to it. The Mortar blog is an excellent source of information on working with Pig: http://help.mortardata.com/technologies/pig/csv
grunt> register /usr/hdp/current/pig-client/piggybank.jar;
grunt> a = load 'BJsales.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'NO_MULTILINE', 'NOCHANGE', 'SKIP_INPUT_HEADER') as (Num:Int,time:int,BJsales:float);
grunt> describe a;
a: {Num: int,time: int,BJsales: float}
grunt> b = limit a 5;
grunt> dump b;
Output:
(1,1,200.1)
(2,2,199.5)
(3,3,199.4)
(4,4,198.9)
(5,5,199.0)
Notice that I am not filtering any relation; I am telling the loader to skip the header outright. It saves a few keystrokes and doesn't waste any cycles processing anything extra.

aervits · ‎12-21-2016

Thank you Sergey, I wish I could accept both answers. Voting up!

gparmar14 · ‎12-11-2017

What should the URL/command be when we need to access Hadoop jobs for a specified time period?

dmitrynheng · ‎12-17-2016

Yes, I did. Both were empty at the time the problem occurred.

sriramhadoop27 · ‎06-08-2018

@Qi Wang Could you please mention the version of Atlas in which the fix was provided? I am using Atlas 0.8.0 and HDP 2.6.4, and I am facing the same issue. Could you please help?

Last Visited	‎08-15-2019 06:35 AM

Member Since	‎10-01-2015 11:46 AM
Posts	3,933
Kudos received	1074

Cloudera Community

Re: Where can I get latest resource_management.c...

Re: How to Kerberize Flume?

Re: Load Hive Table form Pig Output File.

Re: HDP 2.6 Cluster Issues with Hive Metastore

Re: which HDP release will storm 1.1.0 be packaged...

Re: Ranger 'test connection' for Atlas fails

Re: Options for using HDP2.5 services on AWS

Re: Is there a rule based optimizer in Hive ?

Re: Specify different queue when using Oozie web s...

Re: oozie workflow in Production

Re: Pig Error : ERROR org.apache.pig.tools.grunt.G...

Re: HBase jars with conflicting versions?

Re: Ambari Views REST API Overview

Re: Can not modify Hive view in Ambari

Re: Atals Taxonomy not showing catalog created.