08-22-2023
05:24 AM
@sahil0915 What you are proposing would require you to ingest all ~100 million records from DC2 into NiFi, hash each record, and write all ~100 million hashes to a map cache such as Redis or HBase (which you would also need to install somewhere) using the PutDistributedMapCache processor. You would then ingest all ~100 million records from DC1, hash those records, and compare each hash against the hashes you added to the distributed map cache using DetectDuplicate. Any records routed to non-duplicate would represent what is not in DC2. Then you would have to flush your distributed map cache and repeat the process, this time writing the hashes from DC3 to the cache.

I suspect this is going to perform poorly. You would have NiFi ingesting ~300 million records just to create hashes for a one-time comparison.

If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you,
Matt
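To make the comparison logic above concrete, here is a minimal plain-Python sketch of the same idea, outside NiFi. The in-memory set stands in for the distributed map cache, and the membership check stands in for DetectDuplicate. The record strings and the function names `record_hash` and `missing_from` are hypothetical, chosen only for illustration; at ~100 million records the set would likely not fit comfortably in memory, which is part of why the approach is expensive.

```python
import hashlib


def record_hash(record: str) -> str:
    """Return the SHA-256 hex digest of one record (assumed to be text)."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def missing_from(reference_records, candidate_records):
    """Return the candidate records whose hash is absent from the reference set."""
    # Stand-in for the map cache: store every reference hash first.
    seen = {record_hash(r) for r in reference_records}
    # Stand-in for DetectDuplicate: an unseen hash means "non-duplicate".
    return [r for r in candidate_records if record_hash(r) not in seen]


# Hypothetical sample records; real records would come from each data center.
dc1 = ["id=1,name=a", "id=2,name=b", "id=3,name=c"]
dc2 = ["id=1,name=a", "id=3,name=c"]

missing = missing_from(dc2, dc1)
print(missing)
```

Every record from both sides still has to be read and hashed once per comparison, which is the cost described above.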
09-19-2022
06:58 AM
Hi, Is your input in JSON format? If it is JSON, do you need to completely replace it with the output you specified? If that is the case, then you don't need a Jolt transformation. Instead, you can do the following:
1- Place the Gallery_Ids attribute as the flowfile content. You can do that using the ReplaceText processor, where the Replacement Value is the attribute ${GALLERY_IDS}
2- Use SplitJson to split the array into separate flowfiles. This should give 7 flowfiles with the values 1, 2, 3, 4, 5, 6 and 7
3- Add another ReplaceText processor that captures each split from above and replaces the flowfile content with the following template in the Replacement Value: {"GALLERY_ID":"$1","PERSON_ID":"test$1"} Leave the Search Value as (?s)(^.*$)
This should again give you 7 flowfiles with the expected output format.
Hope that helps. If it does, please accept the solution. Thanks
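For reference, the result of the three steps can be sketched in plain Python. This only mimics the outcome and is not how NiFi runs it. The attribute value `[1,2,3,4,5,6,7]` is an assumption based on the seven values mentioned above, and `build_records` is a hypothetical helper name.

```python
import json

# Assumed value of the GALLERY_IDS attribute (step 1 puts it in the content).
gallery_ids = "[1,2,3,4,5,6,7]"


def build_records(ids_json: str) -> list:
    """Split a JSON array and render one JSON string per element."""
    ids = json.loads(ids_json)  # step 2: one element per future flowfile
    records = []
    for value in ids:
        # Step 3: substitute the element into the template, like $1 does.
        record = {"GALLERY_ID": str(value), "PERSON_ID": f"test{value}"}
        records.append(json.dumps(record, separators=(",", ":")))
    return records


records = build_records(gallery_ids)
for line in records:
    print(line)
```

Each printed line corresponds to the content of one of the resulting flowfiles.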
06-30-2022
03:14 PM
Hi, A similar question was asked before. Can you please check this post: https://community.cloudera.com/t5/Support-Questions/How-to-load-json-record-to-postgres-as-json-datatype/m-p/345779#M234644
04-21-2022
12:41 AM
@sahil0915, please try the JoltTransformJSON processor with the following chain specification:
[
  {
    "operation": "default",
    "spec": {
      "json_data": {
        "items[]": {
          "*": {
            "quality": "good"
          }
        }
      }
    }
  }
]
Cheers, André
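As a rough illustration of what the spec above is meant to do, the following Python sketch adds "quality": "good" to every object in json_data.items that does not already have a quality key. It is a simplified emulation, not Jolt itself: it assumes json_data and items already exist in the input, it only handles absent keys, and the sample document and the name `apply_quality_default` are hypothetical.

```python
import copy


def apply_quality_default(doc: dict) -> dict:
    """Return a copy of doc where each item has a 'quality' key."""
    result = copy.deepcopy(doc)  # leave the caller's input untouched
    for item in result["json_data"]["items"]:
        if isinstance(item, dict):
            # Only fills in the value when the key is absent.
            item.setdefault("quality", "good")
    return result


# Hypothetical input: one item without 'quality', one that already has it.
sample = {"json_data": {"items": [{"id": 1}, {"id": 2, "quality": "poor"}]}}

output = apply_quality_default(sample)
print(output)
```

An existing quality value is kept as-is, which matches the non-destructive intent of a default operation.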