About mburgess

TimothySpann · ‎03-16-2018

Now use a record processor.

saikrishna_tara · ‎10-27-2016

@mclark @Matt Burgess So in any case, does the file need to be read into memory before it splits, either by lines or by bytes? I was hoping it would start the next processor's work once it received the first split. In my case it waited 8 minutes until it had split the 10 GB file into 1200+ splits. If my files are about 100 GB each (I have 18 such files), I am scared to run the whole flow for all files. Would I have to run each file one by one?

vitthal_jadhavb · ‎12-14-2018

I am getting the same error in HDP 3.

mburgess · ‎10-04-2016

Great answer! Just to add a caveat, if you are using HDF 2.0 and HDP 2.5, please see the following: https://community.hortonworks.com/questions/59681/puthivestreaming-nifi-processor-various-errors.html

smayani · ‎10-04-2016

@Matt Burgess Two things resolved the issue:
1. Start with the "jdbc:hive2" prefix: jdbc:hive2://host.name.net:10000/;principal=hive/_HOST@EXAMPLE.COM
2. Add the following property to the hive-site.xml that is passed under the HiveConnectionPool "Hive Configuration Resources" property:
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
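For illustration, the two fixes in this post can be sketched in Python. This is only a sketch and not NiFi code: the helper names `build_hive_url` and `kerberos_property_xml` are hypothetical, and the host, port, and principal are the placeholder values from the post.

```python
import xml.etree.ElementTree as ET


def build_hive_url(host: str, port: int, principal: str) -> str:
    """Build a connection URL that starts with the jdbc:hive2 prefix."""
    return f"jdbc:hive2://{host}:{port}/;principal={principal}"


def kerberos_property_xml() -> str:
    """Render the hadoop.security.authentication property for hive-site.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hadoop.security.authentication"
    ET.SubElement(prop, "value").text = "kerberos"
    return ET.tostring(prop, encoding="unicode")


# Placeholder values taken from the post above.
url = build_hive_url("host.name.net", 10000, "hive/_HOST@EXAMPLE.COM")
snippet = kerberos_property_xml()
print(url)
print(snippet)
```

The generated property element would still need to be placed inside the `<configuration>` root of the hive-site.xml file that the connection pool points to.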

stamilibrahim · ‎08-17-2017

How do you avoid using Zookeeper for service discovery? Please give me step-by-step instructions if possible. I have tried all the recommended methods online with no luck. Thank you.

mburgess · ‎09-09-2016

The PutElasticsearch processor uses the Transport API for Elasticsearch, not the HTTP API. This means your port should be 9300, not 9200. The "Identifier Attribute" property is the name of a flow file attribute that contains a unique identifier for the document. If you don't have an identifier you want to use, you can put "uuid"; this will use the flow file's UUID as the identifier for the Elasticsearch document. If you do have an identifier for the document, put its value (using UpdateAttribute, EvaluateJsonPath, etc.) into a flow file attribute, and put that attribute's name in the "Identifier Attribute" property. Note that you don't use Expression Language here, so if your attribute's name is "doc_id", you put "doc_id" in the Identifier Attribute property, not "${doc_id}".
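The name-versus-expression distinction can be modeled with a small Python sketch. This is not the processor's implementation: `resolve_document_id` and the attribute values are hypothetical, and raising `KeyError` for a missing attribute is just a stand-in for whatever the processor does in that case.

```python
import uuid


def resolve_document_id(attributes: dict, identifier_attribute: str) -> str:
    """Look up the document ID by attribute *name*; no expression is evaluated."""
    if identifier_attribute not in attributes:
        raise KeyError(f"no flow file attribute named {identifier_attribute!r}")
    return attributes[identifier_attribute]


# Hypothetical flow file attributes: a "uuid" plus a custom "doc_id".
flow_file = {"uuid": str(uuid.uuid4()), "doc_id": "order-1001"}

by_custom_id = resolve_document_id(flow_file, "doc_id")  # value of doc_id
by_uuid = resolve_document_id(flow_file, "uuid")         # the flow file's UUID

# Passing "${doc_id}" is treated as a literal name, which no attribute has.
try:
    resolve_document_id(flow_file, "${doc_id}")
    expression_was_accepted = True
except KeyError:
    expression_was_accepted = False
```

In this model the property value is only ever used as a dictionary key, which is why "doc_id" works and "${doc_id}" does not.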

joshua_adeleke · ‎12-22-2016

@David Kjerrumgaard I found out the issue was with my CSV file. It had to do with the column names I was ingesting from the CSV file. Also, setting the Obtain Generated Keys property to false worked in my case.

saikrishna_tara · ‎10-19-2016

Hi @Matt Burgess, any idea what I am doing wrong above? I am getting a 403 Forbidden error. Regards, Sai

SQLShaw · ‎09-09-2016

+ @jfrazee @Matt Burgess

Last Visited	‎11-07-2024 11:28 PM

Member Since	‎11-16-2015 02:21 PM
Posts	892
Kudos received	642

Cloudera Community

Re: Nifi Building error when creating a brand new ...

Re: Tuning PutHive3Streaming NiFi processor

Re: NiFi ExecuteScript - Able to add attributes to...

Re: NiFi - JOLT assign value to attribute from Jso...

Re: NiFi - ExecuteScript for getting max value of ...

Re: Converting a Large JSON File into CSV

Re: What is a good approach for Spilitting 100GB f...

Re: PutHiveStreaming Nifi processor; various error...

Re: how to use puthivestreaming

Re: NiFi Cannot create JDBC driver of class 'org.a...

Re: SelectHiveQL Fails on java.lang.NullPointerExc...

Re: How to insert data in elastic search using Apa...

Re: nifi -Ingesting a file from SFTP and insert i...

Re: How to add request headers and request body to...

Re: Can we flatten complex JSON file using NIFI.?