Created 01-06-2019 12:25 PM
Hi Guys,
I need advice on a use case. I have data being streamed from NiFi into three MongoDB collections, say A, B, and C. I need to perform a lookup on a particular field, say emp ID, which is present in A. If the emp ID in a document from B or C matches one in A, I need to attach that document to the matching document in A (I create a new key/value pair in A, the key being a random UID and the value being the payload from B or C, and then perform an update). If the value does not exist in A, I need to insert the document from B or C into A as it is. Currently I am using Spark to do this, but I believe it is overkill for this job. Are there any other big data technologies you would suggest?
P.S. I need to perform the update in real time.
Thanks in advance
Created 01-06-2019 03:04 PM
You can use NiFi for this use case with the LookupRecord processor and the MongoDBLookup service.
Once you have the empID from A, perform a series of lookups to check whether the same empID exists in the B and C collections, and define your Record Writer controller service with an Avro schema that matches the result record so that the new key/value pair can be created.
The Routing Strategy property lets you tell whether the empID is present in the B and C collections. Use the matched/unmatched connections to make the decision, i.e. either create a new key/value record and then insert it into the A collection, or insert the document into the A collection.
Refer to this link for more details on the LookupRecord processor with the MongoDBLookup service.
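To make the matched/unmatched decision concrete, here is a rough sketch of the same logic in plain Python, outside NiFi. It is only an illustration under assumptions: collection A is stood in for by an in-memory dict keyed by empID, and the function name merge_into_a and the sample fields ("name", "dept") are hypothetical, not part of NiFi or MongoDB.

```python
import uuid


def merge_into_a(collection_a, incoming_doc, key="empID"):
    """Attach incoming_doc to the matching document in A, or insert it as-is.

    collection_a is a plain dict keyed by employee ID, standing in for
    MongoDB collection A (an assumption made for illustration only).
    Returns "matched" or "unmatched", mirroring the two routes.
    """
    emp_id = incoming_doc[key]
    target = collection_a.get(emp_id)
    if target is not None:
        # Matched: store the payload from B or C under a new random UID key.
        target[str(uuid.uuid4())] = incoming_doc
        return "matched"
    # Unmatched: insert a copy of the document unchanged.
    collection_a[emp_id] = dict(incoming_doc)
    return "unmatched"


# Hypothetical sample data: one existing employee in A.
a = {101: {"empID": 101, "name": "Asha"}}
print(merge_into_a(a, {"empID": 101, "dept": "ops"}))  # matched
print(merge_into_a(a, {"empID": 202, "dept": "hr"}))   # unmatched
```

In a real flow the dict lookup would be the lookup against MongoDB, and the two return values correspond to the matched and unmatched connections.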
Created 01-06-2019 03:21 PM
Thank you for your reply. But I don't see the LookupRecord processor in NiFi under the Add Processor menu. The NiFi version I am using is 1.2.0.3.0.1.1-5. Is there anything I am missing?
Created 01-06-2019 04:30 PM
Created 01-07-2019 11:34 AM
Thank you very much. I'll check with Hortonworks on how to get these processors into the NiFi version that I am using. Meanwhile, is there any other tool I can use besides Spark or NiFi?
Thanks in advance
Created 01-07-2019 01:20 PM
Refer to this link, which describes integration from MongoDB to Hadoop.
I think Spark would be the way to go, as we can use the Spark MongoDB connector to get the data and perform the lookups.
Created 01-07-2019 02:04 PM
Okay. Thanks a lot @Shu, much appreciated.