About andrewg

andrewg · ‎12-10-2016

A new FlowFile will be created, but both FlowFiles will point to the same immutable piece of data in the Content Repository. The HashContent step in your example will have replaced the content, but the result is a new FlowFile pointing to a new piece of data in the Content Repository. The other branch of the flow is not affected in any way by this content change. Read more here, for example: https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#pass-by-reference
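To make the pass-by-reference idea concrete, here is a toy Python model of it. This is only an illustrative sketch, not NiFi code: the class and function names (`ContentRepository`, `FlowFile`, `clone`, `replace_with_hash`) are made up for the example, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
from dataclasses import dataclass
from itertools import count


class ContentRepository:
    """Toy append-only store: a claim, once written, is never modified."""

    def __init__(self):
        self._claims = {}
        self._ids = count(1)

    def write(self, data: bytes) -> int:
        # Every write produces a brand-new claim; existing claims are untouched.
        claim_id = next(self._ids)
        self._claims[claim_id] = bytes(data)
        return claim_id

    def read(self, claim_id: int) -> bytes:
        return self._claims[claim_id]


@dataclass(frozen=True)
class FlowFile:
    """Hypothetical FlowFile record: it holds a pointer to content, not the bytes."""
    name: str
    claim_id: int


def clone(ff: FlowFile, name: str) -> FlowFile:
    # A second branch gets a new FlowFile that points at the SAME claim.
    return FlowFile(name=name, claim_id=ff.claim_id)


def replace_with_hash(repo: ContentRepository, ff: FlowFile) -> FlowFile:
    # Replacing content writes a new claim and returns a new FlowFile record.
    digest = hashlib.sha256(repo.read(ff.claim_id)).hexdigest().encode()
    return FlowFile(name=ff.name, claim_id=repo.write(digest))


repo = ContentRepository()
original = FlowFile(name="original", claim_id=repo.write(b"hello"))
branch_a = clone(original, "branch-a")
branch_b = clone(original, "branch-b")

hashed = replace_with_hash(repo, branch_a)

# Both clones share one claim; only the hashed FlowFile points at new data.
print(branch_a.claim_id == branch_b.claim_id)
print(hashed.claim_id != branch_b.claim_id)
print(repo.read(branch_b.claim_id))
```

In this model, cloning is cheap because no bytes are copied, and a content change on one branch cannot be observed from the other because the original claim is never rewritten.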

andrewg · ‎12-09-2016

Have you tried adding this 'set' statement as the first line of the query, terminated by a semicolon and a newline? Next, try quoting the column name, too.

andrewg · ‎12-08-2016

It doesn't affect scheduling; it's a normal (failure) path for your data. Auto-terminating a relationship means that a FlowFile routed there is dropped and ends its life in the flow (though it still remains in the provenance and content repositories for some time, for history).

andrewg · ‎12-08-2016

Site-to-site is much more versatile than just multi-DC communication (though that is a great use case for this NiFi feature). S2S can link multiple clusters (or standalone instances), can even connect a cluster to itself (for data re-distribution), and is also used for MiNiFi-to-NiFi communication. It's also bi-directional, meaning it can be push or pull in either direction. At the end of the day, this means you will be able to communicate, e.g., over a corporate HTTP proxy regardless of the inbound/outbound firewall rules; there's enough flexibility in S2S to accommodate these scenarios. Read more here: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site

andrewg · ‎12-08-2016

The recommended setup for production is to use Kafka: NiFi publishes to Kafka, and Spark Streaming consumes from the topic (or the reverse). The Spark Receiver in NiFi works, but wasn't tested at production scale.

andrewg · ‎12-08-2016

Avijeet, take a look at NiFi's deeper architectural documents. I recommend https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html

andrewg · ‎12-08-2016

This was a community project of mine. It's not part of NiFi. However, it did validate a few very important use cases and allowed me to collect real-world usage patterns. There's a longer-term effort in flight around SDLC which involves big pieces. E.g., take a look at https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows and https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#custom_properties

andrewg · ‎12-06-2016

Hi, nifi-api-deploy doesn't support the NiFi 1.x API.

andrewg · ‎12-05-2016

Ali, take a look at HDF (http://hortonworks.com/products/data-center/hdf/). Now that NiFi 1.1.0 is out, the updated HDF version that includes it is around the corner. By the way, NiFi 1.x changed the clustering model; there's no more NCM. More on the architecture here: https://nifi.apache.org/docs/nifi-docs/html/overview.html#nifi-architecture

andrewg · ‎12-01-2016

OK, it looks like an environmental issue with random entropy collection. There are several ways to solve it; take your pick based on prod/non-prod requirements. There was a previous discussion with some suggestions here: https://community.hortonworks.com/questions/58436/hdf-20-handing-on-restart.html
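Before picking a fix, it can help to confirm the diagnosis. Below is a small diagnostic sketch in Python, assuming a Linux host where the kernel exposes its entropy estimate at `/proc/sys/kernel/random/entropy_avail`. The helper name `read_entropy_avail` is made up for this example, and no particular threshold is implied; note that recent kernels may report a constant value here, in which case the number tells you little.

```python
from pathlib import Path
from typing import Optional

# Linux-specific location of the kernel's entropy estimate (assumed to exist).
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path: str = ENTROPY_PATH) -> Optional[int]:
    """Return the kernel's entropy estimate, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux host), unreadable, or not a plain integer.
        return None


if __name__ == "__main__":
    value = read_entropy_avail()
    if value is None:
        print("entropy estimate not available on this host")
    else:
        print(f"entropy_avail = {value}")
```

Running it a few times while the service is hanging on startup shows whether the estimate stays low, which would be consistent with the entropy explanation above.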

Last Visited	‎11-29-2021 04:12 PM

Member Since	‎07-30-2019 11:14 AM
Posts	333
Kudos received	330

Cloudera Community

Re: getfile : nifi does not have sufficient permi...

Re: Back pressure settings not Honored when a Funn...

Re: Urgent need for ListSFTP & FetchSFTP working e...

Re: Raise alert from NiFi if file not available fr...

Re: NiFi: PutHiveQL reflect UDF not working

Re: NiFi: Pass by Reference vs Copy on Write

Re: NiFi: SelectHiveQL processor

Re: Autoterminate relationship and scheduling

Re: nifi site-to-site

Re: nifi - spark streaming

Re: nifi clusters

Re: nifi rest-api projects

Re: nifi rest-api projects

Re: Nifi Clustering installation via Ambari

Re: NiFi doesnt start on red-hat 6.8