About DennisJaheruddi

Tokolosk · ‎08-03-2021

Hi @DennisJaheruddi, I managed to speak to some devs on Slack and we found a related issue, posted below: https://issues.apache.org/jira/projects/NIFI/issues/NIFI-8749 Hopefully it gets fixed soon so that we can upgrade to 1.14.0+.

DennisJaheruddi · ‎08-03-2021

The general successor to Flume is NiFi, but if your usage is simple enough, Kafka Connect may also suffice. Cloudera of course supports both.

DennisJaheruddi · ‎08-03-2021

I would always recommend using the Cloudera distribution, as people like me are not able to troubleshoot the upstream distributions, and we do note that people commonly run into trouble when using upstream versions. I am not sure about the exact time, but if you are interested in NiFi on K8s, then rather than trying to solve all the challenges yourself, you may also want to look into how the Cloudera Data Platform tackles this challenge for everyone.

MattWho · ‎07-28-2021

@jg6 There is no direct relationship between the DistributedMapCacheServer and the DistributedMapCacheClientService, meaning that the client is simply configured with a hostname and a port. This hostname and port could belong to a DistributedMapCacheServer running on an entirely different NiFi cluster somewhere. Additionally, no component registers a dependency on the DistributedMapCacheServer controller service; components only depend on the DistributedMapCacheClientService. So when constructing a template, only the interconnected and dependent pieces are included. That being said, the DistributedMapCache is not the cache I would recommend using anyway. It offers no High Availability (HA). While a DistributedMapCacheServer is started on every node in a NiFi cluster, the servers do not talk to one another, and the DistributedMapCacheClientService can only be configured to point at one of them. So if you lose the NiFi node where your clients point, you lose all your cache. There are better options for external cache services that do offer HA. Hope this is helpful, Matt
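The failure mode described above can be sketched with a toy model. This is plain Python, not NiFi code: the class names, the dictionary standing in for the network, the node names, and the port value are all assumptions made for illustration. It only shows why independent per-node caches plus a client bound to a single host and port give no HA.

```python
class CacheServer:
    """Toy stand-in for one node's cache server: a private in-memory map."""

    def __init__(self):
        self.store = {}
        self.alive = True


class CacheClient:
    """Toy stand-in for a client service: it knows one host and port only."""

    def __init__(self, servers, host, port):
        self.servers = servers          # hypothetical "network": (host, port) -> server
        self.address = (host, port)

    def _server(self):
        server = self.servers.get(self.address)
        if server is None or not server.alive:
            raise ConnectionError(f"no cache server at {self.address}")
        return server

    def put(self, key, value):
        self._server().store[key] = value

    def get(self, key):
        return self._server().store.get(key)


# One server per node; the servers never replicate to each other.
servers = {("node1", 4557): CacheServer(), ("node2", 4557): CacheServer()}

client = CacheClient(servers, "node1", 4557)
client.put("order-42", "seen")

# The entry lives on node1 only; node2's map stays empty.
node2_entries = dict(servers[("node2", 4557)].store)

# Simulate losing node1, then re-pointing a client at node2.
servers[("node1", 4557)].alive = False
fallback = CacheClient(servers, "node2", 4557)
lost_value = fallback.get("order-42")   # the cached entry is gone
```

In this model, re-pointing the client at the surviving node restores connectivity but not the data, which is the situation an external cache with replication is meant to avoid.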

midee · ‎07-23-2021

@DennisJaheruddi, thanks for your reply. I was able to resolve this issue with your suggestion. Thanks!

Alex_ru · ‎07-22-2021

Thank you, in the end I did it using a Groovy script. But I still don't understand why, when I make a regular request in the InvokeHTTP processor, I see a lot of duplicate files. We had to use a processor to remove the duplicates. So I don't quite understand what NiFi is for.

DennisJaheruddi · ‎07-22-2021

I am not entirely sure what you are trying to achieve. If the set of data lives somewhere in a file, you should be able to read it (e.g. with ListFile and FetchFile). If the data is generated by a script, is that just for testing? You could look at ExecuteScript or ExecuteGroovyScript, or perhaps just generate fake data directly with GenerateFlowFile (though the exact variety that you mention would be tough with the latter).

DennisJaheruddi · ‎07-22-2021

It seems that you were able to split the problem into two key steps:
1. Define the attribute
2. Route on the attribute
First question: Is defining the attribute successful? Please check your message in the queue directly after the EvaluateJsonPath step. Look carefully (I am not sure whether it is case sensitive or what it does with quotes).
Second question: Are you able to make any attribute-based routing work? Perhaps just try some things until anything works and see what the difference is with your flow. The only things I can think of are trivial points such as case sensitivity (maybe), or extra quotes/spaces. For your first test, you may try something even simpler, like routing if the attribute contains the letter c.
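The two checks above can be mimicked outside NiFi to see how case and stray whitespace break an exact match while a simple "contains" test still passes. This is a plain Python sketch, not NiFi code: the function names, the `category` field, and the sample value are hypothetical, and NiFi's actual handling of quotes and case is not modeled here.

```python
import json


def evaluate_json_path(content, key):
    """Mimic step 1: copy one top-level JSON field into an attribute map."""
    record = json.loads(content)
    return {key: str(record[key])} if key in record else {}


def route_on_attribute(attributes, key, needle):
    """Mimic step 2: route to 'matched' if the attribute contains needle."""
    value = attributes.get(key)
    if value is not None and needle in value:
        return "matched"
    return "unmatched"


# Hypothetical message with a capital letter and a trailing space in the value.
content = '{"category": "Trucks "}'

# Step 1: define the attribute, then inspect it exactly as stored.
attrs = evaluate_json_path(content, "category")
print(repr(attrs.get("category")))  # repr() makes stray spaces visible

# Step 2a: an exact comparison fails because of case and the trailing space.
exact_match = attrs.get("category") == "trucks"

# Step 2b: the same comparison succeeds after trimming and lowercasing.
normalized_match = attrs.get("category", "").strip().lower() == "trucks"

# Step 2c: the simplest possible first test, "contains the letter c".
simple_route = route_on_attribute(attrs, "category", "c")
```

Starting from the loosest test (2c) and tightening it step by step narrows down whether the attribute is missing altogether or merely differs in case, quotes, or spacing.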

DennisJaheruddi · ‎07-22-2021

There is not much information here; could you perhaps share a minimal flow that does not work? Also, what points towards a problem in scheduling (as opposed to, e.g., a wrong URL or a firewall issue)?

DennisJaheruddi · ‎02-24-2021

The subquestions can be found here; please note that these may or may not have been answered yet.
Subquestions:
Streaming data from Kafka to HDFS with NiFi
Streaming data from Kafka to HDFS with Flink Jar
Streaming data from Kafka to HDFS with Flink SQL
Streaming data from Kafka to HDFS with Spark Interactive
Streaming data from Kafka to HDFS with a Spark Jar
Streaming data from Kafka to HDFS with Kafka Connect
Also note that the questions ask for an example, though there may be multiple language choices and other decisions to be made.

Last Visited	‎12-15-2021 03:18 AM

Member Since	‎01-07-2019 03:54 AM
Posts	220
Kudos received	31

Cloudera Community

Re: How can a Flink program on a Kerberos-enabled cluster connect to a Kafka outside the cluster that has no authentication enabled

Re: Attribute validation against MSSQL database

Re: Put array with Dates on nifi flowfile

Re: NiFi templates don't include all controller se...

Re: Concatenations of Multiple Attributes in Nifi

Re: PutDatabaseRecord or JsonTreeReader changing t...

Re: Apache Flume in 2021

Re: Apache Nifi 1.12.1 in Kubernetes with existing...

Re: NI-Fi concatenate file line by first file fie...

Re: async call using InvokeHTTP

Re: scheduling is not working in HandleHttpRequest...

Re: Streaming data from Kafka to HDFS: All relevan...