Member since 01-07-2019
220 Posts
23 Kudos Received
30 Solutions
My Accepted Solutions
Views | Posted |
---|---|
5044 | 08-19-2021 05:45 AM |
1811 | 08-04-2021 05:59 AM |
879 | 07-22-2021 08:09 AM |
3691 | 07-22-2021 08:01 AM |
3429 | 07-22-2021 07:32 AM |
08-03-2021
11:12 PM
Hi @DennisJaheruddi, I managed to speak to some devs on Slack and we found a related issue, posted below. https://issues.apache.org/jira/projects/NIFI/issues/NIFI-8749 Hopefully it gets fixed soon so that we can upgrade to 1.14.0+.
08-03-2021
02:05 PM
The general successor to Flume is NiFi. But if your usage is simple enough, then Kafka Connect may also suffice. Cloudera of course supports both.
08-03-2021
02:04 PM
I would always recommend using the Cloudera distribution, as people like me are not able to troubleshoot the upstream distributions, and we do note that it is common for people to run into trouble when using upstream versions. I am not sure about the exact time, but if you are interested in NiFi on K8s, then rather than trying to solve all the challenges yourself you may also want to look into how the Cloudera Data Platform tackles this challenge for everyone.
07-28-2021
11:09 AM
1 Kudo
@jg6 There is no direct relationship between the DistributedMapCacheServer and the DistributedMapCacheClientService: the client is simply configured with a hostname and a port. This hostname and port could belong to a DistributedMapCacheServer running on an entirely different NiFi cluster somewhere. Additionally, there is no component that registers a dependency on the DistributedMapCacheServer controller service. Components only have a dependency on the DistributedMapCacheClientService. So when constructing a template, only the interconnected and dependent pieces are included. That being said, the DistributedMapCache is not the cache I would recommend using anyway. It offers no High Availability (HA). While a DistributedMapCacheServer is started on every node in a NiFi cluster, the servers do not talk to one another and the DistributedMapCacheClientService can only be configured to point at one of them. So if you lose the NiFi node where your clients point, you lose all your cache. There are better options for external cache services that do offer HA. Hope this is helpful, Matt
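To make the HA point concrete, here is a toy Python simulation. It is not NiFi code: `CacheServer`, `CacheClient`, the hostnames and the port number are all made up for illustration. It only models the two properties described above, namely that each node's server keeps its own private map and that the client knows a single hostname and port.

```python
class CacheServer:
    """Toy stand-in for one node's cache server: a private in-memory map."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.entries = {}   # not shared with any other server
        self.alive = True


class CacheClient:
    """Toy stand-in for the client service: it knows exactly one host and port."""

    def __init__(self, servers, host, port):
        self._server = servers[(host, port)]

    def _check(self):
        if not self._server.alive:
            raise ConnectionError("cache server unreachable")

    def put(self, key, value):
        self._check()
        self._server.entries[key] = value

    def get(self, key):
        self._check()
        return self._server.entries.get(key)


# One server per node (hypothetical hostnames, illustrative port).
servers = {(h, 4557): CacheServer(h, 4557) for h in ("node1", "node2", "node3")}

client = CacheClient(servers, "node1", 4557)
client.put("last-seen-id", "42")

# The other servers never see the entry: there is no replication.
replicated = [s.entries for key, s in servers.items() if key[0] != "node1"]

# Losing the one node the client points at loses the whole cache.
servers[("node1", 4557)].alive = False
try:
    client.get("last-seen-id")
    cache_lost = False
except ConnectionError:
    cache_lost = True

print(replicated, cache_lost)
```

An external cache that replicates entries between its own nodes avoids this single point of failure, which is the reason for the recommendation above.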
07-23-2021
12:38 AM
1 Kudo
@DennisJaheruddi, thanks for your reply. I was able to resolve this issue with your suggestion. Thanks!
07-22-2021
08:39 AM
Thank you, in the end I did it using a Groovy script. But I still don't understand why, when I make a regular request with the InvokeHTTP processor, I see a lot of duplicate files. We had to use the processor to remove duplicates. I don't quite understand, then, what NiFi is for?
07-22-2021
08:09 AM
I am not entirely sure what you are trying to achieve. If the set of data lives somewhere in a file, you should be able to read it (e.g. with ListFile and FetchFile). If the data is generated by a script, is that just for testing? You could look at ExecuteScript or ExecuteGroovyScript, or perhaps just generate fake data directly with GenerateFlowFile (though the exact variety that you mention would be tough with the latter).
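If the data is only needed for testing, a small script can produce it. Below is a minimal Python sketch under stated assumptions: the field names (`id`, `sensor`, `value`) and the `fake_records` helper are invented for illustration, and the script is standalone, so it does not use the NiFi scripting API. How it is wired into a flow is left open.

```python
import json
import random


def fake_records(count, seed=0):
    """Return `count` made-up records; a fixed seed keeps test runs repeatable."""
    rng = random.Random(seed)
    sensors = ["alpha", "beta", "gamma"]  # hypothetical category values
    return [
        {
            "id": i,
            "sensor": rng.choice(sensors),
            "value": round(rng.uniform(0, 100), 2),
        }
        for i in range(count)
    ]


records = fake_records(5)

# One JSON document per line, which is easy to split into individual messages.
for record in records:
    print(json.dumps(record))
```

Changing the seed gives a different but still reproducible data set, which helps when comparing two runs of the same flow.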
07-22-2021
07:43 AM
It seems that you were able to split the problem into the two key steps: 1. Define the attribute 2. Route on the attribute. First question: Is defining the attribute successful? Please check your message in the queue directly after the EvaluateJsonPath step. Look carefully (I am not sure whether it is case sensitive or what it does with quotes). Second question: Are you able to make any attribute-based routing work? Perhaps just try some things until anything works and see what the difference is with your flow. The only things I can think of are trivial points such as case sensitivity (maybe), or extra quotes/spaces. For your first test, you may try something even simpler, like routing if the value contains the letter c.
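The same two checks can be reproduced outside NiFi. The following is a minimal Python sketch, not NiFi code: `extract_attribute` and `route` are hypothetical helpers standing in for the extraction step and the routing rule, and the sample message is made up. It shows how a trailing space or a difference in case makes an exact comparison fail while a looser "contains" test still matches.

```python
import json


def extract_attribute(message, key):
    """Hypothetical stand-in for the extraction step: one top-level field as a string."""
    return str(json.loads(message)[key])


def route(value, expected, normalize=False):
    """Hypothetical stand-in for an attribute-based routing rule."""
    if normalize:
        value, expected = value.strip().lower(), expected.strip().lower()
    return "matched" if value == expected else "unmatched"


# Made-up message whose field has a trailing space and different casing.
message = '{"category": "Cars "}'
category = extract_attribute(message, "category")

# First question: inspect the raw value, including invisible characters.
print(repr(category))

# Second question: compare strict and relaxed routing rules.
exact = route(category, "cars")
relaxed = route(category, "cars", normalize=True)
contains_c = "c" in category.lower()
print(exact, relaxed, contains_c)
```

Printing the value with `repr` is the equivalent of inspecting the attribute in the queue: it makes extra quotes and spaces visible before any routing rule is blamed.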
07-22-2021
07:38 AM
1 Kudo
There is not much information here. Could you perhaps share a minimal flow that does not work? Also, what points towards a problem in scheduling (as opposed to, e.g., a wrong URL or a firewall issue)?
02-24-2021
07:19 AM
The subquestions can be found here, please note that these may or may not have been answered yet:
Subquestions:
- Streaming data from Kafka to HDFS with NiFi
- Streaming data from Kafka to HDFS with Flink Jar
- Streaming data from Kafka to HDFS with Flink SQL
- Streaming data from Kafka to HDFS with Spark Interactive
- Streaming data from Kafka to HDFS with a Spark Jar
- Streaming data from Kafka to HDFS with Kafka Connect
Also note that the questions ask for an example, though there may be multiple language choices and other decisions to be made.