Master Guru

I am running a free MongoDB instance (DBaaS) at mlab.com.

From NiFi, I can read from and write to MongoDB very easily.

NiFi is a great way to pull data out of a large collection of MongoDB databases. In some startups and enterprises, a lot of little MEAN apps have been written, each with a small silo of data locked in MongoDB. These silos can be streamed into a data lake with NiFi. Once the data is stored in HDFS, it can be accessed via SparkSQL, Hive, Zeppelin and other tools.

Here is an example of a tweet stored as a JSON document in a MongoDB collection, in a database hosted in an online NoSQL store.

6996-mongo1.png

The NiFi flow for storing to MongoDB is trivial.

6997-mongo2.png

A simple flow to read MongoDB JSON records and land them as JSON in HDFS.

6999-mongo4.png

Here is an example of another source whose data can be stored to HDFS or MongoDB. We use a GetHTTP processor to access an SSL-protected resource.

7001-mongo5.png

7002-sslcontext.png

There are a few options for storing data to MongoDB. First, format the MongoDB URI correctly: it needs the credentials in the form username:password@yoururl. Then set your database and collection names. Insert mode is the most common, but you can also do an upsert. For Write Concern, choose ACKNOWLEDGED if you want MongoDB to confirm that the write was applied. Note that this is confirmation from the primary (or a standalone server), not from every node in a MongoDB cluster; the write concern reference describes the stricter options. https://docs.mongodb.com/v3.0/reference/write-concern/

7004-putmongo.png
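To make the URI format described above concrete, here is a minimal Python sketch that assembles a connection string of the form username:password@yoururl. The helper name `build_mongo_uri`, the host, port, credentials and database name are all hypothetical values made up for illustration; substitute the ones from your own MongoDB account. It percent-encodes the credentials, which matters if your password contains characters such as `@` or `:`.

```python
from urllib.parse import quote, urlencode


def build_mongo_uri(user, password, host, port, database, options=None):
    """Assemble a standard mongodb:// connection URI (illustrative helper)."""
    # Percent-encode the credentials so characters such as '@', ':' or '/'
    # cannot be mistaken for URI delimiters.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    uri = f"mongodb://{credentials}@{host}:{port}/{database}"
    if options:
        # Optional settings, e.g. the write concern, go in the query string.
        uri += "?" + urlencode(options)
    return uri


# Hypothetical values, not a real account.
uri = build_mongo_uri(
    "nifi_user", "p@ss:word", "ds012345.mlab.com", 12345, "tweets", {"w": 1}
)
print(uri)
```

The resulting string is what you would paste into the Mongo URI property of the processor. Keep in mind that the password is still stored in plain text in the URI.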


Comments
New Contributor

Great article.

I'm trying to read data from MongoDB incrementally. There is a field called CreationDate in the database that holds the timestamp at which each record was written.

I want to read the data created in the previous minute. When I tried the following queries in the GetMongo "query" property, I saw the warning icon on the processor, indicating that the query is invalid.

{'creationDate': { $gte: new Date(ISODate().getTime()) - 1 * 60 * 1000}}

OR

{'creationDate': { $gte: ISODate().getTime() - 1 * 60 * 1000}}

When I leave the query property blank, NiFi reads the entire table every time it runs.

Your response will be very much appreciated.

Explorer

Thanks for this writeup. I was having a heck of a time authenticating, because I was expecting the authentication keys to be separate fields in the processor configuration. I didn't realize you could put all of that right in the connection string, although it leaves your password in plain text. Seems like an opportunity for improvement.