Member since
01-07-2019
220
Posts
23
Kudos Received
30
Solutions
07-22-2021
08:02 AM
I believe what you are looking for is called a 'streaming join'. I won't say it is impossible, but this is not something NiFi is made for, and there is no good way to do it. Perhaps look into other solutions available in the Cloudera offering, such as Flink or Spark Streaming.
07-22-2021
08:01 AM
1 Kudo
I just did a small test, and it indeed seems that only referenced controller services are copied when you copy a process group. I have not found any discussion on the topic, but since it seems to happen for all controller services (I just tried csvrecordreader), I suspect it is a feature. I hope this does not seem too unreasonable.
07-22-2021
07:47 AM
I saw a similar issue a while ago, but that one got fixed. The best way to reach the experts and check whether there is a known fix is the Cloudera Support Portal: once you log in, you should be able to log a support ticket.
07-22-2021
07:43 AM
It seems that you were able to split the problem into its two key steps:
1. Define the attribute
2. Route on the attribute

First question: is defining the attribute successful? Please check your message in the queue directly after the EvaluateJsonPath step. Look carefully (I am not sure whether it is case sensitive, or what it does with quotes).

Second question: are you able to make any attribute-based routing work? Perhaps just try some things until anything works, and then see how that differs from your flow. The only things I can think of are trivial points such as case sensitivity (maybe) or extra quotes/spaces. For your first test you may try something even simpler, such as routing if the attribute contains the letter c.
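The kind of check suggested above can be sketched outside NiFi. This is only an illustration in Python, not NiFi code: the field name `category`, the sample message, and the helper functions are all hypothetical, and the exact, case-sensitive comparison is an assumption to be verified against your own flow.

```python
import json

def extract_attribute(content, field):
    """Rough stand-in for pulling one top-level JSON field into a string attribute."""
    value = json.loads(content)[field]
    return value if isinstance(value, str) else json.dumps(value)

def route(attribute, expected):
    """Exact, case-sensitive comparison (an assumption, not confirmed NiFi behaviour)."""
    return "matched" if attribute == expected else "unmatched"

def diagnose(attribute, expected):
    """List trivial differences that would make an exact match fail."""
    hints = []
    stripped = attribute.strip()
    if stripped != attribute:
        hints.append("leading/trailing whitespace")
    cleaned = stripped.strip("\"'")
    if cleaned != stripped:
        hints.append("extra quotes")
    if cleaned != expected and cleaned.lower() == expected.lower():
        hints.append("case differs")
    return hints

# Hypothetical message: note the leading space and the capital C.
content = '{"category": " Car"}'
attr = extract_attribute(content, "category")
print(repr(attr), route(attr, "car"), diagnose(attr, "car"))
```

Printing the attribute with `repr` makes stray spaces and quotes visible, which is the same idea as inspecting the flowfile attributes in the queue.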
07-22-2021
07:38 AM
1 Kudo
There is not much information here. Could you perhaps share a minimal flow that does not work? Also, what points towards a problem in scheduling (as opposed to, for example, a wrong URL or a firewall issue)?
07-22-2021
07:32 AM
Your expression looks fine. I just tested as follows:
1. GenerateFlowFile: add property Location1 with value loc1, and property Location2 with value loc2.
2. UpdateAttribute: add property Total with value ${path: append("folder1/"):append(${Location1}):append('/'):append(${Location2})}

The output I get is ./folder1/loc1/loc2, which is exactly what I expected. If you think there may be a problem with parsing the JSON in the first place, please inspect your data in the queue directly after parsing, but the concatenation seems to work. (Unless you expected a different outcome, in which case please be very clear about what you expect and what you see.)
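For readers who want to reason about the result without a running flow, the chained append calls can be mimicked in plain Python. This is a sketch only, not NiFi Expression Language: `append_chain` is a hypothetical helper, and the value `./` for `path` is an assumption inferred from the output reported above.

```python
def append_chain(subject, *parts):
    """Mimic chained append calls: each part is concatenated in order."""
    result = subject
    for part in parts:
        result += part
    return result

# Assumed attribute values; "./" for path is inferred from the reported output.
attributes = {"path": "./", "Location1": "loc1", "Location2": "loc2"}

total = append_chain(
    attributes["path"],
    "folder1/",
    attributes["Location1"],
    "/",
    attributes["Location2"],
)
print(total)  # ./folder1/loc1/loc2
```

If the value you see differs from this, the difference most likely comes from the attribute values themselves rather than from the concatenation.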
07-22-2021
07:22 AM
1 Kudo
There may be some corner cases where you would want to use something else, but fortunately the general answer to your question is very straightforward: anything you were considering Flume for, you now want to use NiFi for instead. Flume has been deprecated, so I would not recommend spending time and energy on developing custom content for it. Rather, see if NiFi solves your problem out of the box (or, if needed, perhaps contribute a processor to NiFi).
07-22-2021
07:19 AM
Are you using the version published by Cloudera? Please confirm exactly which platform version and whether this is the on premise variant or in public cloud.
02-24-2021
07:19 AM
The subquestions can be found here; please note that these may or may not have been answered yet.

Subquestions:
- Streaming data from Kafka to HDFS with NiFi
- Streaming data from Kafka to HDFS with Flink Jar
- Streaming data from Kafka to HDFS with Flink SQL
- Streaming data from Kafka to HDFS with Spark Interactive
- Streaming data from Kafka to HDFS with a Spark Jar
- Streaming data from Kafka to HDFS with Kafka Connect

Also note that the questions ask for an example, though there may be multiple language choices and other decisions to be made.
02-24-2021
07:04 AM
To understand what it would take to work with various streaming tools, I have defined this question as an umbrella for an overview of ways to stream data. For consistency I picked a simple reference use case: messages arrive from Kafka and need to be put on HDFS.

- Source topic name: input
- Output folder name on HDFS: output

The core use case is picking up a bit of data from Kafka and putting it on HDFS. The bonus use case is ensuring that a new field C is defined by dividing fields A and B, which both occur in the data; ideally the schema would be used for this.

Subquestions:
- Streaming data from Kafka to HDFS with NiFi
- Streaming data from Kafka to HDFS with Flink
- Streaming data from Kafka to HDFS with Flink SQL
- Streaming data from Kafka to HDFS with Spark Interactive
- Streaming data from Kafka to HDFS with a Spark Jar
- Streaming data from Kafka to HDFS with Kafka Connect

If a substep is well documented, do not hesitate to refer to it, but please ensure the end-to-end process is documented, including building and deployment. If you notice that this question is not specified well, or if something is blocking one of the subquestions from being answered, please post a comment.
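To make the bonus use case concrete, here is a minimal sketch of the per-message transformation on its own, independent of any streaming tool. It assumes JSON messages with numeric fields A and B, which the question does not specify; the function name `add_field_c` and the choice to set C to null when B is zero are also assumptions for illustration.

```python
import json

def add_field_c(message):
    """Parse one JSON message and add C = A / B.

    Assumes the message is a JSON object with numeric fields A and B.
    Setting C to None (JSON null) when B is zero is an illustrative choice.
    """
    record = json.loads(message)
    a, b = record["A"], record["B"]
    record["C"] = a / b if b != 0 else None
    return json.dumps(record)

print(add_field_c('{"A": 6, "B": 3}'))  # {"A": 6, "B": 3, "C": 2.0}
```

Each subquestion would wrap this same logic in the tool's own way of reading from the input topic and writing to the output folder.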
Labels:
- Apache Kafka