Member since: 09-29-2015
Posts: 871
Kudos Received: 723
Solutions: 255
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4288 | 12-03-2018 02:26 PM |
| | 3238 | 10-16-2018 01:37 PM |
| | 4337 | 10-03-2018 06:34 PM |
| | 3199 | 09-05-2018 07:44 PM |
| | 2443 | 09-05-2018 07:31 PM |
08-09-2017 03:55 PM
Currently you have to shut down all of the nodes, upgrade them, and restart. There may be future improvements to allow a rolling upgrade (node-by-node), but right now you can't have live nodes running mixed versions, meaning node1 is upgraded and comes up on 1.3 while the rest of the cluster is still running 1.0. From an upgrade perspective, you are mostly replacing the lib directory with the lib directory from the latest release, comparing config files to see what has changed, and reading the release notes, but how you do this is really up to you. There is an article here with some tips: https://cwiki.apache.org/confluence/display/NIFI/Upgrading+NiFi Any data in your flow will still be there when you restart.
08-09-2017 01:50 PM
1 Kudo
This type of streaming join is not really what Apache NiFi was made to do... You can do "look up" joins in NiFi, where a processor has some kind of reference data set loaded in memory, and when a flow file comes in, some value from the flow file is used to look up an entry in the reference data set, which is then added back into the data in the flow file; basically enrichment. There are some recent components added to help with this (LookupService and LookupRecord). True streaming joins would be better handled in a stream processing platform like Storm.
08-08-2017 07:47 PM
You could also use MergeContent before PutHDFS so that you can merge together all of the small files based on size or time before writing to HDFS.
08-08-2017 03:45 PM
1 Kudo
It looks like an unexpected exception is happening, and since the processor isn't catching it, the framework rolls back the session, which puts the flow file being processed back into the original queue it was taken from. Obviously this isn't ideal for this case, and it would be preferable to improve the processor to handle this and route to failure. In the meantime, you can stop the convert processor, right-click on the queue, and perform a listing; from the listing you should be able to download the contents of the flow files if you want to inspect them or perform analysis. You can also right-click and clear the queue to get the problem flow files out of there. If you are able to upgrade to NiFi 1.3.0, you may have better results with the ConvertRecord processor, which is not based on the Kite library.
08-07-2017 03:09 PM
1 Kudo
There is a property called "Offset Reset" that you can set to "earliest". This only takes effect the first time the consumer group consumes from the topic, so if you have already run the processor, you will also need to change the group id.
08-07-2017 01:43 PM
Is there an error shown on the Remote Process Group? If so, can you provide that error and possibly the stack trace that goes with it from nifi-app.log?
07-31-2017 08:41 PM
2 Kudos
I think there should be a property called "Normalize Table/Column Names". It defaults to false, but if set to true, I believe it will change the / in the table name to something that Avro allows.
07-31-2017 05:54 PM
That looks close... where you have "name" : "children", it seems like that should be "name" : "fields", since "fields" is the array element in your JSON. Your example JSON also has top-level fields for "type" and "name", so I would think you want to declare those in the schema as well, unless you just want to skip over them.
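As a rough sketch (the record names and field types below are assumptions, since they weren't shown in the thread), the corrected schema might look something like:

```json
{
  "type": "record",
  "name": "wrapper",
  "fields": [
    { "name": "type", "type": "string" },
    { "name": "name", "type": "string" },
    { "name": "fields", "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "account",
          "fields": [
            { "name": "businessentityid", "type": "int" },
            { "name": "accountnumber", "type": "string" },
            { "name": "name", "type": "string" },
            { "name": "creditrating", "type": "int" }
          ]
        }
      }
    }
  ]
}
```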
07-31-2017 04:04 PM
The incoming JSON you showed does not match your schema... Your schema above says there is a single record with the fields "businessentityid", "accountnumber", "name", and "creditrating". Your incoming JSON instead has a field named "fields" containing an array of JSON documents, where each document has the four fields. If you want to make a schema that matches that JSON, your schema needs to declare a field called "fields" of type array, where each entry of the array is a record with the four fields.
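In other words, the incoming JSON is shaped roughly like this (the values below are made up for illustration):

```json
{
  "type": "struct",
  "name": "accounts",
  "fields": [
    { "businessentityid": 1, "accountnumber": "AN-0001", "name": "Account One", "creditrating": 1 },
    { "businessentityid": 2, "accountnumber": "AN-0002", "name": "Account Two", "creditrating": 2 }
  ]
}
```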
07-11-2017 08:10 PM
2 Kudos
In SimpleKeyValueLookupService you can add as many user-defined properties as you want. Let's say, for example, you added:

field1 = value1
field2 = value2
field3 = value3

Then you would use the LookupRecord processor and add a user-defined property like:

key = field1

The LookupRecord processor is going to pass a Map with "key" -> "field1" to the SimpleKeyValueLookupService. The service is going to take the value of "key", which is "field1", and then look up the value of that, which is "value1". When "value1" is returned to the processor, it will be placed in each record in the location specified by "Result Record Path". So if your schema had a field called "lookupValue" and you set "Result Record Path" to "/lookupValue", then after passing through this processor, every record would have the field "lookupValue" set to "value1".
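As a concrete sketch of that behavior (the "id" field and record contents here are hypothetical), records entering the processor like:

```json
[
  { "id": 1, "lookupValue": null },
  { "id": 2, "lookupValue": null }
]
```

would come out the other side with the looked-up value populated in every record:

```json
[
  { "id": 1, "lookupValue": "value1" },
  { "id": 2, "lookupValue": "value1" }
]
```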