About ahadjidj

ahadjidj · ‎06-19-2018

Yes sorry I submitted before finishing my answer

ahadjidj · ‎06-19-2018

Hi @Vivek Singh This has been answered by @Matt Burgess recently : https://community.hortonworks.com/questions/193888/nifi-is-it-possible-to-access-processor-group-vari.html This is mainly on how to access. To update the variable, you should use the API. I don't think there's a direct way from the script. Maybe take the output value as a flow file after the ExecuteScript and use another processor to call the API and update the value

ahadjidj · ‎06-18-2018

Hi @rajat puchnanda You can select your process group, click on save as template in the left menu. After that, go to Hamburger menu, template and save. This will download an XML file that describe the process group and you can import it in another NiFi.

ahadjidj · ‎06-16-2018

Hi @Abhinav Yepuri There are several ways to automate this. One of these is using NiFi CLI available from NiFi 1.6 : https://github.com/apache/nifi/tree/master/nifi-toolkit/nifi-toolkit-cli You have nifi pg-get-vars and nifi pg-set-var that you can use to get variables from dev, replace values with a dictionary and set in prod.

ahadjidj · ‎06-13-2018

Hi @John T When you use GetSFTP in a cluster you are duplicating your data. Each node will ingest the same data. You need to use List/Fetch pattern. A great description of this feature is available here : https://pierrevillard.com/2017/02/23/listfetch-pattern-and-remote-process-group-in-apache-nifi/ Now if you used the List/Fetch pattern correctly and don't have even data distribution, you need to understand that Site-to-Site protocol does batching to have better network performance. This means that if you have 3 flow files of few KB or MB to send, NiFi decides to send them to one node rather than using 3 connection. The decision is take based on data size, number of flow files and transmission duration. Because of this, you don't get data distributed when you are doing tests. Usually you test with few small files. The batching threshold is by default but you can change it for each input port. Go to RPG, Input ports then click on the edit pen for your input port and you get this settings I hope this helps understand the behavior. Thanks

ahadjidj · ‎06-07-2018

@Bhushan Kandalkar Here a step by step doc : https://community.hortonworks.com/articles/886/securing-nifi-step-by-step.html And this the official doc : https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.1/bk_security/content/enabling-ssl-without-ca.html

ahadjidj · ‎06-07-2018

What about proxy ? as you can see in the provided link To allow users to view the NiFi UI, create the following policies for each host: /flow – read /proxy – read/write

ahadjidj · ‎06-07-2018

Hi @Bhushan Kandalkar Have you added Ranger policies to let users see the UI : https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.2/bk_security/content/policies-to-view-nifi.html ? Thanks

ahadjidj · ‎06-04-2018

Hi @tthomas You can use EvaluateJsonPath to extract a JSON field and add it as a flow file attribute : https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html If your JSON is the following and you want to add a flow file attribute called timestamp { “created_at” : “Thu Sep 28 08:08:09 CEST 2017”, “id_store” : 4, “event_type” : “store capacity”, “id_transaction” : “1009331737896598289”, “id_product” : 889, “value_product” : 45 } you can add an EvaluateJsonPath and add an attribute timestamp with the value $.created_at

ahadjidj · ‎06-02-2018

Hi @Rahul Kumar Beyond the fact that they are both called "pub sub brokers", Kafka and MQTT has different design goal. Without going deep into details, it's better to see MQTT as a communication protocol between several applications. It was designed to be extremely low light to fit into IoT and resource-constrained environment. For this, the objective is to distribute messages between different system and not to store large volume of data for long time. At the other hand, Kafka is broker that can store large volume of data and for long time (or for ever). It was designed to be scalable and provide the best performances. Hence, a Kafka cluster usually use beefy machines. It's well suited for Big Data application and has integration with the big data ecosystem (Spark, Storm, Flink, NiFi, etc). Depending on your application requirements the choice is usually easy to make. In lot of scenarios it's Kafka and MQTT. For IoT for instance, it's not rare to see MQTT at local level (gateway for example) for sensors/actuators communications, and Kafka at regional/center level for data ingestion, processing and storage. Technically, there are lot of difference too in termes of quality of service, streaming semantics, internal architecture, etc I hope this helps clarifies your mind.

Online	Offline
Last Visited	‎08-19-2019 05:07 AM

Member Since	‎01-11-2016 06:11 PM
Last Visited	‎08-19-2019 05:07 AM
Posts	355
Kudos received	232

Cloudera Community

Re: How to access NIFI Process Group variable in E...

Re: GETSFTP with NiFi cluster

Re: how is Kafka different from Mosquitto(MQTT) ?

Re: Whitelisting using LookupAttribute

Re: Is there any ways if we can schedule or trigge...

Re: How to access NIFI Process Group variable in E...

Re: How to access NIFI Process Group variable in E...

Re: How to export processor group from nifi to loc...

Re: How to do NiFi Git Integration.?

Re: GETSFTP with NiFi cluster

Re: Nifi Integration with Ranger Not Working

Re: Nifi Integration with Ranger Not Working

Re: Nifi Integration with Ranger Not Working

Re: How to load attributes from json

Re: how is Kafka different from Mosquitto(MQTT) ?