Member since: 04-11-2016
471 Posts · 325 Kudos Received · 118 Solutions
My Accepted Solutions
| Views | Posted |
|---|---|
| 2677 | 03-09-2018 05:31 PM |
| 3540 | 03-07-2018 09:45 AM |
| 3239 | 03-07-2018 09:31 AM |
| 5441 | 03-03-2018 01:37 PM |
| 2950 | 10-17-2017 02:15 PM |
01-23-2017
08:31 PM
If you don't configure ListenTCP to run on the primary node only, it will run on every node. You can of course give the IP address of one of your cluster's nodes to the client opening TCP connections, but the load is not balanced, and if this node dies the client can no longer open connections. However, if you set up a load balancer in front of your cluster, you will have a VIP (virtual IP) to give to your clients, and connections will be opened in a round-robin manner to every node of the cluster (so you get both HA and load balancing). It is generally up to the load balancer to check that every node of the cluster is alive, with something like a heartbeat. As a general comment, each time you need a ListenX processor, you will likely be in a situation where you need a load balancer and a virtual IP. Let me know if you have other questions.
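As a rough illustration of the load-balancer-plus-VIP setup described above, here is a minimal HAProxy sketch. HAProxy is just one possible load balancer (the original answer does not name one), and the hostnames and port 9999 are made up for the example:

```
# Hypothetical HAProxy sketch: round-robin TCP load balancing in front of
# a 3-node NiFi cluster whose ListenTCP processors listen on port 9999.
# Hostnames and the port are illustrative, not from the original post.
frontend nifi_listen_tcp
    bind *:9999            # the VIP/port handed to clients
    mode tcp
    default_backend nifi_nodes

backend nifi_nodes
    mode tcp
    balance roundrobin
    # 'check' enables TCP health checks so a dead node is taken out of rotation
    server nifi1 nifi-node1.example.com:9999 check
    server nifi2 nifi-node2.example.com:9999 check
    server nifi3 nifi-node3.example.com:9999 check
```

Clients then connect only to the VIP, and the health checks play the "heartbeat" role mentioned above.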
01-23-2017
04:47 PM
Do your servers have access to the Internet to reach the repository? Are you using a local repository? Before performing the upgrade, the new version should be installed on all hosts, and the installation must have failed on one host, hence the error. You can try to reinstall the version on all hosts (basically it will install the stack in /usr/hdp).
01-23-2017
10:59 AM
2 Kudos
Hi @Shashi Vish, The following should work: $.feedJobExecutionContexts.['trigger_category.test_shashi'] Keep in mind to set 'Return Type' to 'json' in your EvaluateJsonPath processor. Hope this helps.
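To see why bracket notation is needed here, note that the key contains a dot, so plain dot notation would split it incorrectly. A small Python sketch with an invented sample document (field values are illustrative, not from the original thread) mirrors what the JsonPath expression selects:

```python
import json

# Hypothetical document shaped like the structure referenced in the answer.
doc = json.loads("""
{
  "feedJobExecutionContexts": {
    "trigger_category.test_shashi": {"status": "SUCCESS"}
  }
}
""")

# Dot notation alone cannot address the key "trigger_category.test_shashi"
# because the key itself contains a dot; selecting it literally, as bracket
# notation does in JsonPath, is the equivalent of this dict lookup:
value = doc["feedJobExecutionContexts"]["trigger_category.test_shashi"]
print(json.dumps(value))
```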
01-23-2017
09:49 AM
2 Kudos
Hi @Raj B, Regarding the processors you mention:

- GetSFTP: if you want to load balance access to the SFTP server, I'd recommend running ListSFTP on the primary node and FetchSFTP on all the nodes. This way the actual download of files is load balanced across your cluster without concurrent access to the files.
- ListenTCP: it depends whether you want all the nodes of your cluster listening on a given port for TCP connections (with a load balancer in front of NiFi, for example) or only one node listening on that port. Keep in mind, however, that there is no way to ensure that a given node will be the primary node, and having a client connect to a specific node is not recommended from an HA point of view.
- PutHDFS is fine as long as two nodes are not writing to the same path in HDFS.

Finally, there is one important point to remember about load balancing data after a processor configured to run on the primary node. If you do nothing special, all the flow files will remain on the primary node from beginning to end. If you want to load balance the data, you need to connect your input processors to a Remote Process Group that points to the cluster itself. This way the data will actually be load balanced across the cluster. You will find an example at the end of this post: https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup Hope this helps.
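A rough sketch of the self-pointing Remote Process Group pattern described above (processor names are from this thread; the input port name is illustrative):

```
ListSFTP (primary node only)
    --> Remote Process Group (pointing back to this same cluster)
        --> Input Port "from-list"   # hypothetical port name
            --> FetchSFTP (all nodes)
                --> PutHDFS
```

The hop through the Remote Process Group is what redistributes the flow files off the primary node before the fetch happens.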
01-20-2017
02:58 PM
1 Kudo
Hi @Michal R, What you are looking for can be achieved with an UpdateAttribute processor: you just need to update the 'filename' attribute, which is the attribute used by the PutFile processor. Hope this helps.
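As a hypothetical example (the renaming scheme is made up, not from the question): in UpdateAttribute, add a property named `filename` whose value uses NiFi Expression Language, such as:

```
${filename:substringBeforeLast('.')}_renamed.${filename:substringAfterLast('.')}
```

This would turn `report.csv` into `report_renamed.csv` before PutFile writes it out; any Expression Language function can be substituted to match your actual naming rule.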
01-20-2017
02:44 PM
I agree with @stevel's comment. I'd just add that encrypting the data at rest without Kerberos is only useful in case disks are stolen. But if this is what you are trying to achieve, it might be easier to rely on native OS/disk solutions.
01-19-2017
07:36 PM
1 Kudo
Hi @Andy Liang, NiFi is designed to be multi-tenant and offers the possibility to run multiple flows serving different projects/teams on the same instance (standalone or cluster). In the UI you can use process groups to define "boxes", and inside the process groups you can define flows. Unless you draw relationships between the process groups, the workflows will run independently. Once your NiFi instance is secured and user access is configured, you can define access policies on the components to ensure that a given workflow can only be accessed/modified by the appropriate team/project. Hope this helps.
01-19-2017
03:01 PM
1 Kudo
Hi @Raj B, There is not really a notion of "completed" in NiFi. It follows a streaming approach, with no a priori knowledge of a start and an end. However, you can find workarounds based on your use case: for example, you could add a MonitorActivity processor to notify you if no events have been processed for X minutes. There is no general approach, though; it is really tied to your use case. Hope this helps.
01-19-2017
01:48 PM
1 Kudo
Hi @Yahya Najjar, Hortonworks provides training around HDF and NiFi; I'd recommend getting in touch with our teams. http://hortonworks.com/info/hortonworks-professional-services Hope this helps.
01-19-2017
11:05 AM
2 Kudos
Hi @Raj B, At the moment, it is not possible to achieve what you are looking for. There is a feature proposal for that here: https://cwiki.apache.org/confluence/display/NIFI/Reference-able+Process+Groups You may also find the following of interest: http://apache-nifi-users-list.2361937.n4.nabble.com/Process-Group-singleton-td535.html Hope this helps.