About aervits

nsabharwal · ‎02-29-2016

@nejm hadj Adding more information based on your comments: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.flume.ExecuteFlumeSink/additionalDetails.html

You should stick with NiFi and use its built-in processors to ingest the data from the various social media sources. Please do read the docs. In NiFi, the contents of a FlowFile are accessed via a stream, but in Flume they are stored in a byte array. This means the full content will be loaded into memory when a FlowFile is processed by the ExecuteFlumeSink processor. When setting NiFi's heap size, you should consider the typical size of the FlowFiles you'll process and the batch size, if any, that your sink is configured with.
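To make the heap sizing point concrete, here is a rough back-of-the-envelope sketch. It assumes the content held in memory at once is about one FlowFile's size times the sink batch size; the `concurrent_tasks` multiplier and the function name are my own assumptions for illustration, not something taken from the NiFi docs.

```python
def estimate_sink_memory_bytes(flowfile_size_bytes, batch_size=1, concurrent_tasks=1):
    """Rough estimate of FlowFile content held in memory at once.

    Assumption (not from the NiFi docs): every FlowFile in a batch is fully
    loaded as a byte array, and each concurrent task holds its own batch.
    """
    if flowfile_size_bytes < 0 or batch_size < 1 or concurrent_tasks < 1:
        raise ValueError("sizes must be non-negative and counts at least 1")
    return flowfile_size_bytes * batch_size * concurrent_tasks


if __name__ == "__main__":
    # Example: 10 MiB FlowFiles, sink batch size of 100, 2 concurrent tasks.
    needed = estimate_sink_memory_bytes(10 * 1024 * 1024, batch_size=100, concurrent_tasks=2)
    print(f"{needed / 1024 ** 2:.0f} MiB of content may be in memory at once")
```

Treat the result as a lower bound for the extra heap to allow on top of what NiFi already needs, not as an exact figure.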

Eukrev · ‎02-28-2016

Thank you!!! I am happy to join this network for the level of support being provided. It keeps me motivated!!!

davide1 · ‎07-11-2017

Hello @Michael Dennis "MD" Uanang, can you confirm that this worked on a 3-node ZooKeeper install? I need to move my ZooKeeper cluster from the original 3 hosts to 3 other hosts. Will it work if I repeat the same procedure 3 times?

nsabharwal · ‎02-26-2016

@ccasano I don't see any issues with using Isilon to store the workflow repositories. Isilon is a scalable storage solution and, based on my experience, Isilon can be a good solution based on

bleonhardi · ‎03-20-2016

Ah cool didn't see that!

aervits · ‎07-26-2016

On Sandbox 2.5, DataFu is indeed 1.3. I validated the function, albeit with a different dataset:

DEFINE HCatLoader org.apache.hive.hcatalog.pig.HCatLoader();
DEFINE SampleByKey datafu.pig.sampling.SampleByKey('0.2');
ROWS = load 'sample_08' using HCatLoader();
SAMPLE_BY_total_emp = filter ROWS by SampleByKey(total_emp);
STORE SAMPLE_BY_total_emp into 'sample_total_emp';

[guest@sandbox ~]$ hdfs dfs -cat sample_total_emp/part-v000-o000-r-00000 | head -n 5
11-3011 Administrative services managers 246930 79500
11-9121 Natural sciences managers 43060 123140
13-1032 Insurance appraisers, auto damage 11280 53980
13-1051 Cost estimators 218400 60320
13-1072 Compensation, benefits, and job analysis specialists 116250 57060
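For anyone wondering what sampling by key means in practice, here is a minimal Python sketch of the general idea: hash the key to a number in [0, 1) and keep the row when that number falls below the sampling probability, so rows sharing a key are kept or dropped together. This illustrates the technique only and is not DataFu's actual implementation; `keep_key`, `sample_by_key`, and the `salt` parameter are hypothetical names.

```python
import hashlib


def keep_key(key, probability, salt=""):
    """Deterministically decide whether a key is in the sample."""
    digest = hashlib.md5((salt + str(key)).encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2 ** 64
    return fraction < probability


def sample_by_key(rows, key_index, probability, salt=""):
    """Keep every row whose key hashes below the sampling probability."""
    return [row for row in rows if keep_key(row[key_index], probability, salt)]


if __name__ == "__main__":
    rows = [("a", 1), ("a", 2), ("b", 3), ("c", 4)]
    # Rows with the same key are always kept or dropped together.
    print(sample_by_key(rows, key_index=0, probability=0.2))
```

Because the decision depends only on the key and the salt, re-running the sample gives the same rows.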

ahadjidj · ‎03-03-2016

@Smart Solutions When you add Spark through Ambari, you will be asked to choose where to deploy the master service (Spark History Server), then where to deploy the client services. Finally, you will be asked to set several properties.

KuldeepK · ‎12-02-2016

@Saurabh I have resolved this kind of error for multiple customers by following the steps below.

Command 1:
hadoop fs -put /usr/hdp/current/atlas-server/hook/hive/* hdfs://<NN>/user/oozie/share/lib/lib_<Timestamp>/hive/

Command 2 (run this on the Oozie server as the 'oozie' user):
oozie admin -oozie http://<oozie-server>:11000/oozie -sharelibupdate

Then re-run your Oozie workflow. It should succeed without any issue. Hope this helps!

Note: the step to update the Oozie sharelib is missing from the Stack Overflow answer.

aervits · ‎02-24-2016

sure thing

sunile_manjee · ‎07-20-2016

@Artem Ervits @Mehrdad Niasari I believe we can close this question. I have opened a new one on the default namespace here.

Last Visited	‎08-15-2019 06:35 AM

Member Since	‎10-01-2015 11:46 AM
Posts	3,933
Kudos received	1074

Cloudera Community

Re: Where can I get latest resource_management.c...

Re: How to Kerberize Flume?

Re: Load Hive Table form Pig Output File.

Re: HDP 2.6 Cluster Issues with Hive Metastore

Re: which HDP release will storm 1.1.0 be packaged...

Re: i'am trying to develop my first project with h...

Re: MapReduce: 0 records written from Reducer

Re: What are the steps for moving a zookeeper serv...

Re: Nifi with Isilon?

Re: Is CombineHiveInputFormat deprecated by OrcInp...

Re: Data Munging with Hadoop DataFu Sampling examp...

Re: What happens inside the Spark Component Adding...

Re: hive.exec.post.hooks Class not found:org.apach...

Re: when start sqoop, error: [sqoop@ip-172-31-31-7...

Re: Where do I control hbase namespace from ranger...