Member since: 01-11-2016
Posts: 355
Kudos Received: 228
Solutions: 74
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4537 | 06-19-2018 08:52 AM
 | 1490 | 06-13-2018 07:54 AM
 | 1646 | 06-02-2018 06:27 PM
 | 1465 | 05-01-2018 12:28 PM
 | 2571 | 04-24-2018 11:38 AM
05-07-2016
05:04 PM
1 Kudo
In the documentation page for "Configure Hive and HiveServer2 for Tez" there are two properties that look similar to me:
- tez.queue.name: property to specify which queue will be used for Hive-on-Tez jobs.
- hive.server2.tez.default.queues: a list of comma-separated values corresponding to YARN queues of the same name. When HiveServer2 is launched in Tez mode, this configuration needs to be set for multiple Tez sessions to run in parallel on the cluster.
The only difference I see is that with "hive.server2.tez.default.queues" we can specify several queues, so I guess jobs will be distributed over these queues. Hence, if we need all Hive jobs to run in one queue, we should use "tez.queue.name". Am I missing something here?
... View more
Labels:
- Apache Hive
- Apache YARN
05-07-2016
04:56 PM
1 Kudo
Hi @Veera B. Budhi, Job-by-job approach: one solution to your problem is to specify the queue when you submit your Spark job or when you connect to Hive. When submitting your Spark job, you can specify the queue with --queue, like in this example:
$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 --queue SparkQueue lib/spark-examples*.jar 10
To specify the queue at connection time to HS2:
beeline -u "jdbc:hive2://sandbox.hortonworks.com:10000/default?tez.queue.name=HiveQueue" -n it1 -p it1 -d org.apache.hive.jdbc.HiveDriver
Or you can set the queue after you are connected using set tez.queue.name=HiveQueue;
beeline -u "jdbc:hive2://sandbox.hortonworks.com:10000/default" -n it1 -p it1 -d org.apache.hive.jdbc.HiveDriver
>set tez.queue.name=HiveQueue;
Change the default queue: the second approach is to specify a default queue for Hive or Spark to use. To do it for Spark, set spark.yarn.queue to SparkQueue instead of default in Ambari. To do it for Hive, add tez.queue.name to the custom hiveserver2-site configuration in Ambari. Hope this helps
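As a quick sanity check after submitting, you can confirm from the command line which queue a job actually landed in (a sketch, not specific to your setup):
# list running YARN applications; the output includes the queue each one was submitted to
yarn application -list -appStates RUNNING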
... View more
05-07-2016
02:26 PM
@Premasish Dan What lab exercise are you doing? Is it a tutorial?
... View more
05-07-2016
01:02 AM
4 Kudos
Hi @Sunile Manjee There's no Zeppelin interpreter for Solr; the list of available interpreters is here. You can expose Solr as a Spark RDD and hence access Solr data with the Spark interpreter in Zeppelin. Another approach (that I didn't test) is to use a Solr JDBC connection with the Zeppelin JDBC interpreter. A Jira ticket makes me think that some problems may be encountered.
... View more
05-07-2016
12:47 AM
1 Kudo
Hi @Subhasis Roy Tuples are used to represent complex data types. Tuples are written between parentheses, like in this example:
cat data
(3,8,9) (4,5,6)
(1,4,7) (3,7,5)
(2,5,8) (9,5,8)
A = LOAD 'data' AS (t1:tuple(t1a:int, t1b:int,t1c:int),t2:tuple(t2a:int,t2b:int,t2c:int));
X = FOREACH A GENERATE t1.t1a,t2.$0;
DUMP X;
(3,4)
(1,3)
(2,9)
In your case, your data is simple and not between parentheses, so you don't need to use a tuple in your schema. Just run this:
A = LOAD '/tmp/test.csv' USING PigStorage(',') AS (a:chararray, b:chararray, c:chararray, d:chararray, e:chararray);
DUMP A;
(1201,gopal, manager, 50000, TP)
(1202,manisha, proof reader, 50000, TP)
If you want to access only some fields of your data, use this (here I show only the first 4 fields):
X = FOREACH A GENERATE $0, $1, $2, $3;
DUMP X;
(1201,gopal, manager, 50000)
(1202,manisha, proof reader, 50000)
Does this answer your question?
... View more
05-06-2016
11:56 PM
2 Kudos
@jbarnett
I say not Flume 🙂 Have you tried NiFi? You can have several processors for your app and configure each one of them with a few clicks in the GUI. If you want to re-configure a particular processor, no problem: stop it, right-click, configure it and run it again. If you really want to use Flume, I recommend using a config file per agent as stated in the doc: "Hortonworks recommends that administrators use a separate configuration file for each Flume agent. [...] While it is possible to use one large configuration file that specifies all the Flume components needed by all the agents, this is not typical of most production deployments." Since you have several agents on the same host, Ambari is not an option (a sketch of running each agent with its own config file follows below). Use NiFi !!
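If you do stay on Flume, here's a rough sketch of running several agents on the same host, each with its own config file (agent names and paths are hypothetical):
# one configuration file per agent, one flume-ng process per agent
flume-ng agent --name agent1 --conf /etc/flume/conf --conf-file /etc/flume/conf/agent1.conf &
flume-ng agent --name agent2 --conf /etc/flume/conf --conf-file /etc/flume/conf/agent2.conf &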
... View more
05-06-2016
05:14 PM
4 Kudos
Hi @Indrajit swain, You are hitting the Elasticsearch instance that Atlas runs in the background for its operations. This is why you get an older version of ES when you curl port 9200. To check it, stop your own ES instance and see if something is still listening on port 9200:
netstat -npl | grep 9200
You should still see something listening even when your ES is down. You can see the configuration of the embedded ES in the Atlas configuration in Ambari. When your ES starts and finds its port (9200) already in use, it picks the next available one, so your ES instance will be running on port 9201. You can see it in the startup logs (like in my example):
[2016-05-06 17:09:41,452][INFO ][http ] [Speedball] publish_address {127.0.0.1:9201}, bound_addresses {127.0.0.1:9201}
You can curl the two ports to compare:
[root@sandbox ~]# curl localhost:9200
{
"status" : 200,
"name" : "Gravity",
"version" : {
"number" : "1.2.1",
"build_hash" : "6c95b759f9e7ef0f8e17f77d850da43ce8a4b364",
"build_timestamp" : "2014-06-03T15:02:52Z",
"build_snapshot" : false,
"lucene_version" : "4.8"
},
"tagline" : "You Know, for Search"
}
[root@sandbox ~]# curl localhost:9201
{
"name" : "Speedball",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "2.3.2",
"build_hash" : "b9e4a6acad4008027e4038f6abed7f7dba346f94",
"build_timestamp" : "2016-04-21T16:03:47Z",
"build_snapshot" : false,
"lucene_version" : "5.5.0"
},
"tagline" : "You Know, for Search"
}
You can also change the port of ES to something you want in the yaml file. Hope this helps
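If you prefer to pin your own ES instance to a fixed port instead, here's a quick check of the current setting (a sketch; the path to elasticsearch.yml depends on your install):
# show the current http.port setting of the standalone Elasticsearch (path is an assumption)
grep -n 'http.port' /etc/elasticsearch/elasticsearch.yml
# then set e.g. "http.port: 9210" in that file and restart the instance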
... View more
05-05-2016
05:46 PM
Hi @Revathy Mourouguessane, have you tried this solution ?
... View more
04-30-2016
03:10 PM
4 Kudos
Hi @Rendiyono Wahyu Saputro, What you are trying to build is what we call the Connected Data Platform at Hortonworks. You need to understand that you have two types of workloads/requirements, and you need to use HDF and HDP jointly. ML model construction: the first step towards your goal is to build your machine learning model. This requires processing a lot of historical data (data at rest) to detect patterns related to what you are trying to predict. This phase is called the "training phase". The best tool to do this is HDP, and more specifically Spark. Applying the ML model: once step 1 is completed, you will have a model that you can apply to new data to predict something. In my understanding you want to apply this to real-time data coming from Twitter (data in motion). To get the data in real time and transform it into what the ML model needs, you can use NiFi. Next, NiFi sends the data to Storm or Spark Streaming, which applies the model and gets the prediction. So you will use HDP to construct the model, HDF to get and transform the data, and finally a combination of HDF/HDP to apply the model and make the prediction. To build a web service with NiFi you need to use several processors: one to listen for incoming requests, one or several processors to implement your logic (transformation, extraction, etc.), and one to publish the result. You can check this page that contains several data flow examples. The "Hello_NiFi_Web_Service.xml" template gives an example of how to do it. https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
... View more
04-29-2016
03:35 PM
Hi, Unfortunately I can't do a WebEx with you. Describe your problem here and the community and I will be happy to help you. Also, call support if you have a subscription. Thanks
... View more
04-29-2016
08:16 AM
Hello, If by pseudo mode you mean having a cluster on one machine, then you have 3 options:
- Sandbox: it's the easiest way, since you download and run a VM that contains all the components already installed and configured (here).
- Use Ambari and Vagrant to create several VMs and install a cluster. Guide here.
- If you want to install HDP directly on the machine without virtualization, then you need to follow this installation guide. This will help you install Ambari and the Ambari agent on your machine and then install all other components on the same machine.
... View more
04-29-2016
07:42 AM
1 Kudo
You can also use HDFS snapshots to protect data from user errors: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html
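For example, a minimal snapshot workflow could look like this (the directory, snapshot and file names are hypothetical):
# enable snapshots on the directory you want to protect (run as the HDFS admin user)
hdfs dfsadmin -allowSnapshot /data/important
# take a named snapshot before a risky operation
hdfs dfs -createSnapshot /data/important before-cleanup
# list existing snapshots and restore an accidentally deleted file from one of them
hdfs dfs -ls /data/important/.snapshot
hdfs dfs -cp /data/important/.snapshot/before-cleanup/report.csv /data/important/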
... View more
04-28-2016
05:46 AM
1 Kudo
Hi @Andrew Sears, Here are the Hive operations that are captured in Atlas 0.6: create database, create table, create view, CTAS, load, import, export, query, alter table rename and alter view rename (source here). The operations that are not supported in v0.6 are not supported in v0.5 either. A new version (v0.7) is available today; it supports more operations in the Hive bridge and introduces new bridges (Sqoop, Falcon and Storm).
... View more
04-27-2016
09:23 PM
@bigdata.neophyte I would not recommend having both physical and virtual nodes in the same cluster. I think it's best to identify your KPIs and choose the best solution. This being said, having VM clusters and physical clusters at the same time can be a good choice for implementing several environments (dev, testing, prod, etc.)
... View more
04-27-2016
09:09 PM
3 Kudos
Hi @bigdata.neophyte, Hadoop has been designed to run on commodity hardware. There are important concepts such as data locality and horizontal scalability that make a physical cluster the first choice for Hadoop clusters today. The pro of this choice is performance; the con is the cost of installing and managing the cluster. Virtual machines are also used for Hadoop today. VMs with central storage (SAN) are not the best choice for performance, since you lose data locality and you have many concurrent jobs/tasks accessing the same storage. Some solutions today support dedicating hard disks to VMs; this way you can have a good hybrid approach. The pro of VMs is flexibility. VM Hadoop clusters are usually used for development environments because they provide flexibility: it's easy to create and kill clusters. Physical clusters are usually recommended for production, where the application has strong SLAs. The final choice depends on your use cases, your existing infrastructure, and the resources available to manage your cluster. I hope this helps.
... View more
04-27-2016
09:01 PM
Hi @Pedro Alves You can also use Spark for data cleansing and transformation. The pro is to use the same tool for data preparation, discovery and analysis/ML.
... View more
04-27-2016
08:49 PM
1 Kudo
Hi @Roberto Sancho, You can use Hive or Pig for doing ETL. In HDP, Hive and Pig run on Tez and not on MapReduce, which gives you much better performance. You can use Spark too, as you stated.
... View more
04-27-2016
08:16 PM
2 Kudos
Hi @Kirk Haslbeck,
I want to add some information to Paul's excellent answer.
First, tuning ML parameters is one of the hardest tasks of a data scientist, and it's an active research area. In your specific case (LinearRegressionWithSGD), the stepSize is one of the hardest parameters to tune, as stated in the MLlib optimization page here: "Step-size. The parameter γ is the step-size, which in the default implementation is chosen decreasing with the square root of the iteration counter, i.e. γ := s/√t in the t-th iteration, with the input parameter s = stepSize. Note that selecting the best step-size for SGD methods can often be delicate in practice and is a topic of active research." In a general ML problem, you want to build a data pipeline where you combine several data transformations to clean data and build features, as well as several algorithms, to achieve the best performance. This is an iterative task where you try several options for each step. You would also like to test several parameters and choose the best ones. For each of your pipelines, you need to evaluate the combination of algorithms/parameters that you have chosen; for the evaluation you can use things like cross-validation. Testing these combinations manually can be hard and time consuming. Spark.ml is a package that can help make this process fluent. Spark.ml uses concepts such as transformers, estimators and params. The params help you automatically test several values for a parameter and choose the value that gives you the best model. This works by providing a ParamGridBuilder with the different values that you want to consider for each param in your pipeline. An example in your case could be:
val lr = new LinearRegressionWithSGD()
.setNumIterations(30)
val paramGrid = new ParamGridBuilder()
.addGrid(lr.stepSize, Array(0.1, 0.01))
.build()
Even if your ML problem is simple, I highly recommend looking into the Spark.ml library. This can reduce your dev time considerably. I hope this helps.
... View more
04-27-2016
03:19 PM
1 Kudo
Hi @JAYA PARASU, My pleasure. This won't work for cases where the directory permissions are different from drwx. To keep only the directories with your approach, you need to grep the lines starting with 'd'. You can do it like this:
hadoop fs -ls /tmp | sed '1d;s/  */ /g' | grep '^d' | cut -d\ -f8
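If instead you only want plain files, the same pipeline with the grep inverted should do it (a sketch along the same lines):
# drop the header, squeeze repeated spaces, exclude directory lines (those starting with 'd'), keep the name field
hadoop fs -ls /tmp | sed '1d;s/  */ /g' | grep -v '^d' | cut -d\ -f8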
... View more
04-27-2016
05:33 AM
3 Kudos
Hi @JAYA PARASU As you can see in the ls documentation page, the command returns this information for a file:
permissions number_of_replicas userid groupid filesize modification_date modification_time filename
and this information for a directory:
permissions userid groupid modification_date modification_time dirname
There's no option to limit the output to only file or directory names directly in HDFS. However, you can use sed and cut to manipulate the output and get only the file names (example taken from here):
hadoop fs -ls /tmp | sed '1d;s/  */ /g' | cut -d\ -f8
... View more
04-26-2016
03:59 PM
3 Kudos
Hi @David Lays You have mainly two high-level approaches for data replication:
- Replication in Y (teeing): in this scenario you do the replication at ingestion time. Each new piece of data is stored in both the primary and the DR cluster; NiFi is great for this double ingestion. The pro of this method is that you have the data immediately in both clusters. The con is that you only have the raw data and not the processing results: if you want the same results on the DR cluster, you need to run the same processing there.
- Replication in L (copying): in this scenario you ingest data into the primary cluster and later copy it to the DR cluster. Tools like DistCp or Falcon can be used to implement this (see the DistCp sketch below). The pro is that you can replicate raw data and processing results in the same process. The con is that the DR cluster lags behind in terms of data: the replication is usually scheduled, and if your cluster goes down in between, you will lose the data generated (ingested or computed) since the last replication.
I hope this helps
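A minimal DistCp sketch for the copying approach (the NameNode hostnames and paths are hypothetical, adjust them to your clusters):
# incrementally mirror a directory from the primary cluster to the DR cluster
# -update copies only new/changed files, -delete removes files that no longer exist on the source
hadoop distcp -update -delete hdfs://nn-primary:8020/data/warehouse hdfs://nn-dr:8020/data/warehouse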
... View more
04-25-2016
11:43 PM
1 Kudo
Hi @Revathy Mourouguessane, You can use IsEmpty to check if A1 is empty or not. Try something like this:
grouped = COGROUP ..... ;
filtered = FILTER grouped BY not IsEmpty($2);
DUMP filtered;
Here's an example that shows how this works for something similar:
cat > owners.csv
adam,cat
adam,dog
alex,fish
david,horse
alice,cat
steve,dog
cat > pets.csv
nemo,fish
fido,dog
rex,dog
paws,cat
wiskers,cat
owners = LOAD 'owners.csv' USING PigStorage(',') AS (owner:chararray,animal:chararray);
pets = LOAD 'pets.csv' USING PigStorage(',') AS (name:chararray,animal:chararray);
grouped = COGROUP owners BY animal, pets by animal;
filtered = FILTER grouped BY not IsEmpty($2);
DUMP grouped;
(cat,{(alice,cat),(adam,cat)},{(wiskers,cat),(paws,cat)})
(dog,{(steve,dog),(adam,dog)},{(rex,dog),(fido,dog)})
(horse,{(david,horse)},{})
(fish,{(alex,fish)},{(nemo,fish)})
DUMP filtered;
(cat,{(alice,cat),(adam,cat)},{(wiskers,cat),(paws,cat)})
(dog,{(steve,dog),(adam,dog)},{(rex,dog),(fido,dog)})
(fish,{(alex,fish)},{(nemo,fish)})
... View more
04-22-2016
05:05 PM
Hi @AKILA VEL, Please check this tutorial on how to do a word count with Spark on HDP 2.3: http://fr.hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/ Section 1 shows how to upgrade Spark to version 1.6; you can ignore it and go directly to section 2. I hope this helps.
... View more
04-21-2016
12:38 PM
Can you please delete this question since it's a duplicate? Thanks
... View more
04-21-2016
12:36 PM
Hi @Klaus Lucas, The VM has Ambari installed and configured, so you should get the Ambari UI on port 8080. Can you check your VM settings (port redirection, network, etc.) and see if you can access Ambari?
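A quick way to check from inside the VM whether the Ambari server itself is up (a sketch):
# check the Ambari server process, then see if anything answers on port 8080 locally
ambari-server status
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080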
... View more
03-29-2016
07:47 PM
4 Kudos
Hi @Vadim, OpenCV is famous for image processing in general. It has several tools for image and face recognition; here is an example of how to do face recognition with OpenCV: tutorial. In terms of integration with Hadoop, there's a framework called HIPI, developed by the University of Virginia, for leveraging HDFS and MapReduce for large-scale image processing. This framework supports OpenCV too. Finally, for image processing on data in motion, you can use HDF with an OpenCV processor like the one published here
... View more
03-16-2016
05:12 PM
Hi @Lubin Lemarchand Try to change the parameter through Ambari: go to HDFS -> Configs and search for dfs.permissions.superusergroup. Ambari stores the configuration in a database, which is the source of truth for the configuration. If you directly modify configuration files that are managed by Ambari, Ambari will overwrite the file and delete your modification at service restart. See this doc
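After the restart, a quick way to confirm the effective value (a sketch):
# print the value the HDFS client actually resolves for this property
hdfs getconf -confKey dfs.permissions.superusergroup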
... View more
03-06-2016
10:21 PM
5 Kudos
@Abha R Panchal What user are you currently logged in as? The user dev_maria doesn't have admin access, so you will not have the Add Service button. To add services, you have to log in as admin. The admin user has been deactivated in the HDP 2.4 sandbox. To activate it, use the following command:
ambari-admin-password-reset
... View more
03-05-2016
03:26 PM
2 Kudos
@Kyle Prins The sandbox gives you an easy way to have a working Hadoop installation in a VM. If you need a multi-node cluster, my advice is to install an HDP cluster yourself. This way, you will understand what has been installed and how it was configured. Use Ambari for the installation, it's straightforward and quick: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.0/bk_Installing_HDP_AMB/content/index.html If you want to have all nodes as VMs on your local machine, you can use Vagrant too. Look at these links to get an idea of how to do it: http://uprush.github.io/hdp/2014/12/29/hdp-cluster-on-your-laptop/ and https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide
... View more