Member since: 09-29-2015
871 Posts
723 Kudos Received
255 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4272 | 12-03-2018 02:26 PM |
| | 3220 | 10-16-2018 01:37 PM |
| | 4334 | 10-03-2018 06:34 PM |
| | 3188 | 09-05-2018 07:44 PM |
| | 2436 | 09-05-2018 07:31 PM |
05-10-2017
12:42 PM
@Raj B
Yes, that is a workaround when using versions before 1.1.x. Just keep in mind that if you ever blow away the work directory, you have to make that change again.
05-09-2017
09:20 PM
1 Kudo
The error "Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory" refers to a Phoenix class. When you install Phoenix it sets this class name in hbase-site.xml, so when NiFi creates the HBase client, the HBase client reads hbase-site.xml and tries to instantiate a class that doesn't exist on its classpath. In Apache NiFi 1.1.x there is a property on the HBase client service for specifying the location of the phoenix-client.jar. Setting that should solve this.
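For reference, a Phoenix-enabled hbase-site.xml typically carries an entry along these lines (the property name shown is the one Phoenix commonly sets; treat it as illustrative and check your own hbase-site.xml):

```xml
<!-- Typically added by a Phoenix install; the plain HBase client in NiFi cannot load this class -->
<property>
  <name>hbase.rpc.controllerfactory.class</name>
  <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
</property>
```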
05-09-2017
09:16 PM
1 Kudo
First, it's important to understand how consuming from Kafka works in NiFi. The ConsumeKafka section of this post was my attempt to explain it: http://bryanbende.com/development/2016/09/15/apache-nifi-and-apache-kafka The Kafka client automatically assigns partitions to consumers, and 1 partition can only be consumed by 1 consumer. The # of consumers = the # of NiFi nodes x the # of concurrent tasks on ConsumeKafka. So if you have a single NiFi node with 1 concurrent task, then the consumer group has 1 consumer.
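As a worked example of that formula (the numbers here are made up):

```
3 NiFi nodes x 2 concurrent tasks on ConsumeKafka = 6 consumers in the group
Topic with 6 partitions -> each consumer is assigned exactly 1 partition
Topic with 4 partitions -> 2 of the 6 consumers sit idle with no partition assigned
```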
The error means that a consumer pulled some data and went to commit offsets to Kafka, but Kafka hadn't heard from the consumer in longer than the configured session timeout, so Kafka is saying you can't do that because the partition was already reassigned to another consumer, or is possibly unassigned if there are no consumers alive. The most common case of this happening in NiFi was when back-pressure occurred between ConsumeKafka and the next processor... back-pressure causes the preceding processor to not execute until the back-pressure threshold is reduced. So if ConsumeKafka isn't allowed to run for a couple of minutes, and then the back-pressure clears and it runs again, you get this error. This was addressed in a ticket for the upcoming Apache NiFi 1.2.0 release: https://issues.apache.org/jira/browse/NIFI-3189
05-04-2017
04:44 PM
Well, the options are: 1) ExtractText with regex 2) ExtractText to get the whole content into an attribute, then use EL 3) A Groovy/Jython/etc. script that ExecuteScript can call to parse your data 4) A custom Java processor that knows how to parse your data format
05-04-2017
04:36 PM
1 Kudo
You can only perform EL on flow file attributes, not on the content of a flow file. So you would first have to use ExtractText to extract the whole content of the flow file into an attribute (assuming that example message is the content of a flow file). This should only be done if you are certain the messages are a reasonable size and can fit in memory. Once you have a flow file attribute you can apply any of the EL functions described here: https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
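As a rough sketch (the attribute name `message` and the EL function applied afterwards are made up for illustration and depend on your data), you can add a dynamic property to ExtractText that captures the entire content, then reference the resulting attribute with EL downstream:

```
ExtractText dynamic property:
  message = (?s)(^.*$)

UpdateAttribute (or any EL-capable property) afterwards:
  device = ${message:substringBefore(',')}
```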
05-04-2017
04:30 PM
1 Kudo
There is an expression language guide here: https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html For your example you should be able to create an UpdateAttribute processor and add a new property like: myfilename = ${filename:substringAfterLast('/')}
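To illustrate with a made-up value:

```
filename   = /data/input/2017/report.csv
myfilename = report.csv   (result of ${filename:substringAfterLast('/')})
```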
05-04-2017
03:25 PM
2 Kudos
Have you tried installing the hadoop client on the same machine as NiFi and then sending a similar file to HDFS from the command line? In most cases the issue here is the network or disk.
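For example (the paths here are placeholders), running something like the following from the NiFi node exercises roughly the same network and disk path that a write from NiFi would:

```
hdfs dfs -mkdir -p /tmp/nifi-write-test
hdfs dfs -put /path/to/similar-sized-file.dat /tmp/nifi-write-test/
```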
05-01-2017
01:03 PM
1 Kudo
The documentation of the user-defined properties for QueryDatabaseTable says:

Initial Max Value (supports Attribute Expression Language): Specifies an initial max value for max value columns. Properties should be added in the format `initial.maxvalue.{max_value_column}`.

So these are meant to be used to specify the initial value of a max value column. To add attributes to the flow files for downstream processing you can use an UpdateAttribute processor right after QueryDatabaseTable.
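As a concrete example (the column name and starting value are hypothetical): if your max value column is `id` and you want QueryDatabaseTable to start from rows with id greater than 1000, you would add a dynamic property like:

```
initial.maxvalue.id = 1000
```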
04-27-2017
09:13 PM
1 Kudo
MergeContent in Defragment mode would work if you were only merging back one level of splitting, so if you have SplitAvro -> ConvertAvroToJson -> ExecuteScript -> MergeContent I think it would work. Once you have multiple SplitAvro processors, each one writes over the fragment attributes (listed below) with the most recent split information, so MergeContent can only merge back together the most recent splits. What are you doing in your ExecuteScript processor to find the problematic record? Just curious to understand that part to see if there is a different way to do the same thing.
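For reference, the fragment attributes in question, which MergeContent's Defragment mode keys on and which a second level of splitting overwrites, are typically:

```
fragment.identifier        - id shared by all splits of the same parent flow file
fragment.index             - position of this split within the parent
fragment.count             - total number of splits produced from the parent
segment.original.filename  - filename of the parent flow file
```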
04-27-2017
09:04 PM
2 Kudos
@Gu Gur Since your processor has INPUT_FORBIDDEN, you can never have flow files coming into your processor, so you should remove the beginning of onTrigger where you have:

```java
FlowFile flowFile = aSession.get();
if (flowFile == null) {
    return;
}
```

That will always return and exit your onTrigger because there are no flow files. Instead you will want to create a new flow file like: FlowFile flowFile = session.create();
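A minimal sketch of what onTrigger could look like inside a source-style processor class (REL_SUCCESS and the content written are assumptions for illustration, not part of the original question):

```java
@Override
public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
    // No session.get() here: a processor marked INPUT_FORBIDDEN never has incoming flow files.
    FlowFile flowFile = session.create();

    // Write whatever content this processor produces into the new flow file.
    flowFile = session.write(flowFile, out -> out.write("example content".getBytes(StandardCharsets.UTF_8)));

    // Route the new flow file to a relationship defined on the processor (hypothetical REL_SUCCESS here).
    session.transfer(flowFile, REL_SUCCESS);
}
```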