Member since: 01-09-2017
Posts: 33
Kudos Received: 0
Solutions: 3
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1226 | 03-22-2019 01:45 PM |
 | 308 | 08-22-2018 03:00 PM |
 | 355 | 03-12-2018 04:45 PM |
03-22-2019
01:45 PM
If you need atomic sequencing and still want to use a parallel system, you are going to have to push that sequencing off onto a system capable of atomic sequencing. Probably the easiest way is to write a stored procedure with a transaction in an RDBMS and call it with an ExecuteSQL in your flow. Don't use a cache; as Matt Clarke says, caches are not designed for transactional, atomic work. Only use a cache for actual caching (you can get the value, but it's expensive, so it's cheaper to store it for a bit in the cache).
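To make that concrete, here is a rough sketch of the kind of atomic read-and-increment the stored procedure (or single statement) would wrap. The psycopg2 driver, the DSN, and the flow_sequence table are placeholders for illustration; in the flow itself the statement would be issued from ExecuteSQL.

```python
# Rough sketch (not the NiFi flow itself) of the atomic increment the
# RDBMS performs. psycopg2, the DSN, and the flow_sequence table are
# placeholder assumptions.
import psycopg2

conn = psycopg2.connect("dbname=flows user=nifi")  # placeholder DSN


def next_sequence_value():
    # One transaction: psycopg2 commits on clean exit, rolls back on error.
    with conn:
        with conn.cursor() as cur:
            # A single atomic statement: increment and return the new value.
            cur.execute(
                "UPDATE flow_sequence SET last_value = last_value + 1 "
                "RETURNING last_value"
            )
            return cur.fetchone()[0]


print(next_sequence_value())
```

The point is that the read-and-increment happens in one statement inside one transaction, so the database, not NiFi, guarantees the ordering.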
01-31-2019
07:31 PM
Hey Matt, yeah, that's what I said 🙂 Adam, I think your approach is wrong. If you are trying to get one flowfile to appear on each node, just have each node get the flowfile. If you want to send flowfiles between nodes, use S2S or rebalancing. You are reinventing the wheel here.
01-31-2019
01:44 AM
One thing I tell people is to always put a ControlRate in front of PutEmail. If you don't, you will eventually send yourself 100,000 emails in one second and get your NiFi boxes blacklisted from your SMTP server 🙂. Actually, the best solution in this case might be to feed PutEmail with a MonitorActivity. This is a tough problem, and I am skeptical that there is a practically generalizable solution. Right now I feel that monitoring and alerting need to be flow-specific, but I am interested to see what others are doing.
01-30-2019
09:46 PM
I never got a good answer to this, but after talking to other folks it seems like one solution is to have parallel dev, test, and prod clusters for each upgrade, which is not very fun, especially with frequent HDF/NiFi releases. Another solution would be to have a prod-QA environment and test there first, then upgrade prod, test, and dev in that order; that way you are never pushing newer NiFi flows to an older NiFi version.
01-30-2019
09:28 PM
You'd have to look at the source, but it's been my experience that ${hostname()} gives you the same value as if you ran the Unix command 'hostname'. Whether NiFi is listening there just depends on how you have configured other things. I imagine that the node name you get back from the API is the nifi.web.http[s].host NiFi property. But it sounds like you are trying to do something that would be better served by just running a completely independent flow on each node.
01-22-2019
07:09 PM
@Matt Clarke laid it all out correctly, but I would add that usually the external system you are connecting to (it sounds like you are talking about Kafka) has its own authentication and authorization concepts, so that is where the permissioning happens.
01-03-2019
03:54 PM
We have multiple NiFi users and separate dev, test, and prod NiFi instances. Our users develop flows in dev and move them to test and prod via a central flow registry instance. If we upgrade the lower environments before the higher ones, a user can develop flows that will not instantiate on the higher environment because they were developed with a newer version of NiFi. But we want to upgrade the lower environments first so issues can surface in dev and test before we upgrade prod. My questions are: What are the actual between-version flow compatibility goals of NiFi? Are we guaranteed backward compatibility but not forward? What is a good strategy for handling NiFi upgrades in a traditional dev -> test -> prod configuration, while maintaining availability and enabling a mostly painless SDLC?
11-26-2018
05:29 PM
No! If you are trying to escape input to generate a SQL query, you should never roll your own sanitization. Unless you fully trust the input, THIS IS VULNERABLE TO SQL INJECTION! You should be using the '?' parameter substitution in your PutSQL processor.
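To illustrate the difference, here is a minimal sketch in plain Python DB-API terms; sqlite3 and the users table are just stand-ins for whatever database you are actually hitting. PutSQL's '?' placeholders work the same way, bound from the sql.args.N.type / sql.args.N.value flowfile attributes.

```python
# Minimal sketch of why '?' binding matters; sqlite3 and the users table
# are stand-ins for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

user_input = "Robert'); DROP TABLE users;--"

# DANGEROUS: concatenation lets the input rewrite the statement itself.
# query = "INSERT INTO users (name) VALUES ('" + user_input + "')"

# Safe: the driver binds the value, so it can never become SQL syntax.
conn.execute("INSERT INTO users (name) VALUES (?)", (user_input,))
conn.commit()

print(conn.execute("SELECT name FROM users").fetchall())
```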
10-23-2018
07:46 PM
I am going to try the "Component ID to Exclude" property of SiteToSiteProvenanceReportingTask and just list the UUID of every processor in the part of my flow that writes the events to HDFS. It will just be a hassle to update all the UUIDs if I end up with a big flow.
10-23-2018
06:00 PM
We have a need to archive provenance data indefinitely; a simple file-based archive in HDFS would meet our needs. From reading questions on this site, I know we can use the NiFi site-to-site reporting task to send provenance events as flowfiles via NiFi site-to-site. The obvious but probably wrong solution would be to point the reporting task at its own NiFi cluster, catch the flowfiles, and do a MergeContent -> PutHDFS. This is probably wrong because that flow itself would generate more provenance events... which would generate more provenance events... forever. I'd really like to avoid the administrative burden of running another NiFi instance, even a MiNiFi instance. Has anyone come up with a good solution for archiving provenance without using another NiFi cluster?
10-04-2018
02:52 PM
Maybe I should make a new question for this, but in my organization I have a lot of people asking me if they can start/stop their processors via the REST API. Of course they can, but I view that as an antipattern. Once you have things modifying your flow out of band, your NiFi flow is no longer the only place where logic is stored. Also, mass start/stop of a process group suddenly becomes dangerous, since some processors were perhaps intended to stay stopped. On a tiny single-purpose NiFi instance this is probably not a problem, but in a large multi-tenant environment it can be. You can get run-once or ad hoc functionality by signalling your flow to start with an in-band semaphore file or message or something like that. You can get scheduling through processor settings. You can get appropriate throttling of flows during downstream failures by sizing your queues appropriately. In my mind there is no reason to use out-of-band REST API start/stop.
08-22-2018
03:00 PM
I finally found the policy by looking at https://github.com/apache/nifi/pull/2703/files. It is /provenance-data/<component-type>/<component-UUID>.
08-22-2018
02:37 PM
The HDF 3.2 release notes mention that provenance and data access policies have been separated in NiFi 1.7. The release notes do not mention what resource identifiers should be entered in Ranger to give users access to provenance, and the NiFi release notes are similarly unhelpful. My Ranger NiFi resource identifiers for data look like /data/process-groups/<uuid>, but /provenance/process-groups/<uuid> appears not to work. I would like to give people access to view any provenance events associated with components underneath a certain process group. What Ranger NiFi resource identifier should I use?
07-12-2018
07:50 PM
@Matt Clarke Thanks for that info!
07-11-2018
06:20 PM
@Matt Clarke If I have disconnected a node (say, to drain it for maintenance), I don't want my upstream sources to keep posting data to it. S2S ports already have this behavior and close on disconnect. If a node is disconnected, I also don't want web UI user sessions to hit that node. At best the users will be very confused, and at worst they will make changes to the flow on that node that will cause problems when I reconnect it.
07-11-2018
04:17 PM
You might want a load balancer if: you use any of the ListenTCP/ListenHTTP/etc. processors; you want to spread your user interface activity across nodes (you'll need to pin sessions to nodes for this); or you don't want to hardcode all your node hostnames into RPGs (though there are still issues with this). A big problem with configuring your load balancer is that when a node is disconnected it continues to listen on the NiFi web UI / REST API port. You will have to write some external healthcheck that authenticates to NiFi and gets the actual node status.
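As a rough sketch of such a healthcheck: the hostnames, credentials, and CA bundle path below are placeholders, and it assumes username/password token login (POST /access/token) plus the GET /controller/cluster endpoint.

```python
# Rough sketch of an external healthcheck the load balancer could call.
# Hostnames, credentials, and the CA bundle path are placeholders.
import sys

import requests

NIFI = "https://nifi-node-1.example.com:9443"
NODE_ADDRESS = "nifi-node-1.example.com"
CA_BUNDLE = "/etc/pki/tls/certs/ca-bundle.crt"

# Authenticate and grab a bearer token.
token = requests.post(
    f"{NIFI}/nifi-api/access/token",
    data={"username": "healthcheck", "password": "secret"},
    verify=CA_BUNDLE,
).text

# Ask the cluster for the status of every node.
cluster = requests.get(
    f"{NIFI}/nifi-api/controller/cluster",
    headers={"Authorization": f"Bearer {token}"},
    verify=CA_BUNDLE,
).json()

# Exit non-zero unless this node reports CONNECTED, so the load balancer
# pulls it out of rotation even though the port is still listening.
node = next(n for n in cluster["cluster"]["nodes"] if n["address"] == NODE_ADDRESS)
sys.exit(0 if node["status"] == "CONNECTED" else 1)
```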
07-09-2018
01:53 PM
The solution we use is not perfect. Every project (tenant group) gets an input port that admins create on the root flow, which we route into their PG. To prevent one tenant from accidentally writing onto someone else's input port, we recommend they add a secret value attribute to their outgoing flowfiles and check for it via RouteOnAttribute upon receiving flowfiles in their PG.
06-21-2018
09:29 PM
I am having a hard time understanding how to build a flow for a simple use case: doing an incremental fetch from a REST API. I just need to store the last index I have retrieved and use that index as the starting index for my next fetch. I can think of a few ways of doing this, but all of them seem to have problems. Here are my ideas and thoughts; how are other people tackling this use case?

1. UpdateAttribute stored state. Use UpdateAttribute with stored state to store my index. Problem: UpdateAttribute only stores state locally, so if my node goes down or there is a primary node switch, that's a problem.

2. Store state in flowfiles. Loop the output of my GetHTTP through some UpdateAttribute stages that do something like ${new_beginning}=${last_end}. Problem: the node with my state-storing flowfiles could go down, and there is no way to "drain" a node if it's got these long-lived flowfiles.

3. Store state in an external RDBMS. Just store my index in some external database (roughly as in the sketch below). Problem: none really, just an extra burden and dependency.

4. Distributed cache controllers. Talk to a controller service that stores state. Problem: from researching, I find that DistributedMapCache isn't actually cluster-wide; do I understand that correctly?

Am I missing anything? How are others solving this use case, which I imagine is very common?
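For reference, here is the rough shape of option 3; the table, column names, DSN, and use of psycopg2 are all made up for illustration, with the equivalent statements issued from the flow (e.g., via ExecuteSQL).

```python
# Rough sketch of option 3 (state in an external RDBMS); table, columns,
# DSN, and psycopg2 are placeholder assumptions.
import psycopg2

conn = psycopg2.connect("dbname=flows user=nifi")  # placeholder DSN


def get_last_index(feed):
    # Read the stored index for this feed, defaulting to 0 on first run.
    with conn, conn.cursor() as cur:
        cur.execute("SELECT last_index FROM fetch_state WHERE feed = %s", (feed,))
        row = cur.fetchone()
        return row[0] if row else 0


def save_last_index(feed, index):
    # Upsert the new index inside a transaction.
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO fetch_state (feed, last_index) VALUES (%s, %s) "
            "ON CONFLICT (feed) DO UPDATE SET last_index = EXCLUDED.last_index",
            (feed, index),
        )


start = get_last_index("my_rest_feed")
# ... fetch from the REST API starting at `start`, then record where we stopped:
save_last_index("my_rest_feed", start + 100)
```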
05-29-2018
08:13 PM
The background: I have multi-tenant clusters. My organization has delivered me a combined truststore and keystore JKS file. I will have users who need to hit various external-to-NiFi services using SSL/TLS. It is tempting to create an SSLContext controller service at the NiFi root and let all my users use this service when they need SSL. One problem I see with this approach is that if I let all my users use the host's certs (keystore and truststore), they could just use a GetHTTP processor and talk to the NiFi REST API with full privileges (or at least whatever privileges the node has). So to prevent this, I figure I should get my root CA certs into a separate truststore that is not password protected, and only use that. The question: Should I create an SSLContext service at the root flow with only a truststore and ask all my users to use the same controller service? Or should I just tell my users the path to the truststore and have them create controllers within their process groups as needed? What are the pros and cons of each approach?
05-09-2018
05:31 PM
thanks @Matt Clarke, what would we do without you!?
05-09-2018
04:59 PM
Thanks @Matt Clarke, but if processor state alone cannot be used to handle primary node changes, how do processors like GenerateTableFetch work without a DistributedMapCache service? Both ListSFTP and GenerateTableFetch mention in their docs that they store cluster-scoped state, but only ListSFTP can also make use of a cache service. What am I missing here?
05-09-2018
04:36 PM
I just noticed that ListSFTP can use a distributed cache controller. This is confusing to me because I thought we were supposed to run ListSFTP only on the primary node and rebalance filenames via an S2S RPG. In addition to the distributed cache, it also seems to store state, which is confusing: if we use a distributed cache controller, why would ListSFTP need to store state? What is the current best practice for resilient, parallelized SFTP? If I use a distributed cache, does that mean I can just schedule my ListSFTP to run on all nodes? Can someone help me understand what is going on here? Thanks!
03-12-2018
04:45 PM
@MattClarke answered this question in https://community.hortonworks.com/questions/176292/how-to-configure-managed-ranger-authorizer-for-nif.html
02-20-2018
03:45 PM
HDF 3.1 includes NiFi 1.5, and the release notes mention that external LDAP groups can now be used in NiFi security policies in Ranger. It seems that we need to use org.apache.nifi.authorization.ManagedRangerAuthorizer in authorizers.xml, but I cannot find any documentation on this. Has anyone successfully used LDAP groups for Ranger NiFi policies? And is there any documentation? Thanks. PS: I see @Yolanda M. Davis in the NiFi git history for this feature; perhaps she can help?
01-29-2018
03:48 PM
I ended up just using two ReplaceText processors to escape special chars with \, then replace 0x01 with commas. I'd still like to know if there is a way to enter hex characters though.
01-24-2018
05:29 PM
I have CSV files that are delimited with ASCII 0x01 bytes. I want to use a CSVReader controller to read these records and convert them to other formats. My problem is that I don't know how to enter a 0x01 byte in the CSVReader "Value Separator" property. Is there a standard way to do hex escapes in NiFi properties? This property does not support the NiFi expression language. Thanks!
11-27-2017
04:27 PM
I will accept the answer because it seems this might be an issue on my side. Thank you. I will open up a ticket with Hortonworks support if further troubleshooting is needed after I take a look at the logs. Thanks again.
11-20-2017
06:30 PM
@Ashutosh Mestry Thanks Ashutosh. When I run the query "hive_table where db.name like '*_final'" I get an error in the web UI: Gremlin script execution failed: L:{def r=(([]) as Set);def f1={GremlinPipeline x->x.as('a0').out('__hive_table.db').as('__res') [0..<25].select(['a0', '__res']).fill(r)};f1(g.V().has('__typeName','hive_table'));f1(g.V().has('__superTypeNames','hive_table'));r._().as('__tmp').transform({((Row)it).getColumn('a0')}).as('a0').back('__tmp').transform({((Row)it).getColumn('__res')}).as('__res').filter({it.'Asset.name'.matches('.*_final')}).back('a0') [0..<25].toList()} We are running Atlas 0.8.0.2; perhaps 'like' clauses are unsupported in our version? I can use an equals sign and successfully retrieve tables in a certain database. Do you know of a way to get the same information with a basic query?
11-17-2017
04:19 PM
I wish to programmatically query Atlas for a list of Hive tables that are in certain Hive databases. I only want to see Hive tables that are in databases containing a certain string in their name. In the hive_table Atlas type, the db property is a reference to an entity of type hive_db, so I cannot use a simple where clause. For example, pretend I have many Hive databases, some ending with '_temp' and some with '_final'. Each database may have several tables. I want to generate a list of all Hive tables in databases that end with '_final'. I would also like to exclude Hive tables that have been deleted. I have been experimenting with the /api/atlas/discovery/search/dsl REST endpoint, but I have had no success. There is documentation for the DSL at http://atlas.apache.org/Search.html, but it is very esoteric and I cannot figure out how to use it. Does anyone have examples of returning lists of entities in Atlas based on properties of referred-to entities? Is there a more user-friendly or complete source of documentation for the Atlas query DSL? Also note that I do not wish to query the Hive metastore directly; I wish to use Atlas. Thank you for any help!
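For reference, here is roughly the shape of the call I have been attempting; the host, credentials, and the DSL string itself are placeholders, and I am not sure the query is even valid.

```python
# Rough shape of the Atlas DSL call being attempted; host, credentials,
# and the DSL string are placeholders and may well be wrong.
import requests

ATLAS = "http://atlas.example.com:21000"
AUTH = ("admin", "admin")  # placeholder credentials

# Attempted DSL: hive_table entities whose database name ends in '_final'.
dsl = "hive_table where db.name like '*_final'"

resp = requests.get(
    f"{ATLAS}/api/atlas/discovery/search/dsl",
    params={"query": dsl},
    auth=AUTH,
)
resp.raise_for_status()

for entity in resp.json().get("results", []):
    print(entity)
```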
09-26-2017
03:20 PM
I also am unable to browse hortonworks.jira.com and would like to be able to.