Member since 02-01-2022
285 Posts
103 Kudos Received
60 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1181 | 05-15-2025 05:45 AM |
| | 5119 | 06-12-2024 06:43 AM |
| | 8115 | 04-12-2024 06:05 AM |
| | 5995 | 12-07-2023 04:50 AM |
| | 3298 | 12-05-2023 06:22 AM |
11-08-2023
07:33 AM
@Rohit1997jio I do not think this is possible. You would need a method outside of the consume/produce pair that handles the logic for which consume topic maps to which produce topic. Then you could use a dynamic topic name in the producer. However, you would still be limited by the fact that ConsumeKafka doesn't take upstream connections. In the example above, if customerTopicX is attribute based, you can use the same attribute logic in the Topic Name of a single PublishKafka instead of the three seen above. That would at least clean up your flow.
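To make the mapping idea concrete, here is a minimal Python sketch of the "which consume topic maps to which produce topic" logic. It is only an illustration, not NiFi configuration: the topic names and the `produce_topic_for` helper are hypothetical, and it assumes the consumed topic name is available as a flow file attribute (called `kafka.topic` here).

```python
# Hypothetical mapping from consumed topic to produce topic.
# All topic names are made up for illustration.
TOPIC_MAP = {
    "customerTopic1": "processedTopic1",
    "customerTopic2": "processedTopic2",
    "customerTopic3": "processedTopic3",
}


def produce_topic_for(attributes):
    """Return the produce topic for a flow-file-style attribute dict.

    Assumes the consumed topic name is stored under "kafka.topic".
    Raises KeyError for a topic that has no mapping, so unmapped
    topics fail loudly instead of being published somewhere wrong.
    """
    source_topic = attributes["kafka.topic"]
    return TOPIC_MAP[source_topic]


if __name__ == "__main__":
    print(produce_topic_for({"kafka.topic": "customerTopic2"}))
```

With a lookup like this resolved into an attribute, a single publisher can read the destination from that attribute rather than needing one publisher per topic.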
10-23-2023
11:44 AM
@MWM @cotopaul If you configure the record reader/writer with the schema(s) you want, you do not have to do any magic to convert values; it should just work. Only use InferSchema long enough to get the structure when you have none. Then copy/paste that schema and use it, as @cotopaul has described, in place of InferSchema. You can also use a Schema Registry. Make the edits you need to satisfy the reader (upstream) and the writer (downstream), as they sometimes need minor adjustments, like in this case.
09-06-2023
08:00 AM
@manishg You should only copy the individual NARs that you know you need, not ALL 1.10 NARs. Your error suggests a conflict with one or more of them. That being said, be super careful with the expectation that things from 1.10 will work in 1.22. Each of them would need to be tested individually for compatibility in 1.22.x.
08-28-2023
05:29 AM
@JohnnyRocks Using ReplaceText more than once is something you want to avoid entirely. You need to look at how to solve the schema concerns within the record-based processors. It should be possible to avoid ReplaceText altogether. If your upstream data is that different (3 different formats) within the same pipeline, consider how to address that upstream or in separate NiFi flows. Alternatively, multiple pipelines can be built, each with a separate top branch that pipes into the same record-based processor. This would be something like 3 single routes, each through one ReplaceText, all going to ConvertRecord. However, I would still try to optimize without ReplaceText in the manner described here.
08-23-2023
05:55 AM
@tqiu No, there is no more HDP. That product is end of life. Please check out how to upgrade from HDP to CDP: https://docs.cloudera.com/upgrade-companion/cdp_upgrade.html
08-21-2023
08:37 AM
Let's take this in a different direction... Open up a code box in your reply, choose Preformatted, and insert lines 0 - 11 there. Remove anything sensitive, of course.
08-21-2023
07:16 AM
@bhadraka It does not appear that this can be achieved with our CSVReader. See this JIRA requesting a feature to set the number of rows to skip: https://issues.apache.org/jira/browse/NIFI-8932 If the "10 rows" in your data is always a known, fixed count, you could use a ReplaceText to keep only rows 11 and beyond. You would match the leading text and, rather than replacing it with anything, just take the rest. If the first 10 rows are simply to be ignored and have the same structure as all other rows (i.e. they are not prepended lines with headers, etc.), you could also use a CSVReader and program your flow to ignore the first 10 flow files.
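As a rough illustration of the "drop the first 10 lines, keep the rest" idea, here is a small Python sketch. It is not NiFi configuration; the function names and the sample data are made up, and the regular expression is just one way to match a fixed number of leading lines.

```python
import csv
import io
import re
from itertools import islice


def strip_leading_lines(text, skip=10):
    """Remove the first `skip` lines with a regex and return the rest."""
    # \A anchors at the start; (?:.*\n){N} matches exactly N whole lines.
    pattern = r"\A(?:.*\n){%d}" % skip
    return re.sub(pattern, "", text, count=1)


def read_csv_skipping(text, skip=10):
    """Parse CSV after discarding the first `skip` physical lines.

    The first remaining line is treated as the CSV header.
    """
    lines = io.StringIO(text)
    for _ in islice(lines, skip):
        pass  # consume and discard the leading lines
    return list(csv.DictReader(lines))


if __name__ == "__main__":
    # Hypothetical input: 10 junk lines, then a header and two rows.
    sample = "".join(f"junk {i}\n" for i in range(10)) + "id,name\n1,a\n2,b\n"
    print(strip_leading_lines(sample))
    print(read_csv_skipping(sample))
```

Both functions assume the number of leading lines is fixed and known, which is the same precondition as the ReplaceText approach.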
08-21-2023
07:00 AM
@kothari Here is a very good match, with several comments covering things you should check: https://community.cloudera.com/t5/Support-Questions/Ranger-group-policy-not-being-applied-to-the-users-with-in/td-p/95972 To summarize: Make sure the user is in the AD group. Make sure users and groups are synced. Check case sensitivity. Confirm the Ranger policies are correct.
08-18-2023
05:35 AM
@sahil0915 It is not clear what you are asking for here. If you use NiFi to do this replication, you would be well aware of any records that fail, as that is the nature of how NiFi works. NiFi data flows capture failures, so you could easily be aware of any records that did not make it from dc1 to dc2 and dc3. Additionally, NiFi handles retries, so a replication flow should be resilient to failures and notify you at that time, versus having to fully audit it after replication. If you are using a database technology that replicates across regions, or some other replication method, and intend to use NiFi to check whether the replication is complete or accurate, you are going to need to make a NiFi flow that pulls all 3 data sets and compares them. At 100 million records this could be a pretty heavy process, with 3 copies of all the data coming into NiFi. It would make more sense to me to let NiFi handle the replication as described above and take advantage of the inherent fault tolerance.
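To show what "pull all 3 data sets and compare" amounts to, here is a minimal Python sketch. It is an assumption-laden illustration, not a NiFi flow: it assumes each data set fits in memory as a dict keyed by a record id, and the `fingerprint` and `compare` helpers are hypothetical names. At 100 million records you would need to stream or chunk the data instead.

```python
import hashlib


def fingerprint(records):
    """Map each record key to a SHA-256 hash of its field values."""
    return {
        key: hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()
        for key, row in records.items()
    }


def compare(source, replica):
    """Return (missing_keys, mismatched_keys) for replica versus source."""
    src, dst = fingerprint(source), fingerprint(replica)
    missing = sorted(src.keys() - dst.keys())
    mismatched = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical sample data standing in for dc1, dc2 and dc3.
    dc1 = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}
    dc2 = {1: {"name": "a"}, 2: {"name": "x"}}
    dc3 = dict(dc1)
    print(compare(dc1, dc2))
    print(compare(dc1, dc3))
```

Even in this toy form, every record from every copy has to be read and hashed, which is why the audit-after-the-fact approach gets heavy at scale.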
08-18-2023
05:27 AM
A solution and a Jira, excellent work gents!