
If I approached you on the street and challenged you to name a state capitol building from a flashcard image, how many of the 50 do you think you could recognize? That would be pretty difficult, right? We humans tend to have a narrow scope of knowledge focused on our day-to-day lives. I would have no problem identifying Georgia’s capitol building in Atlanta, for example, because I live in Atlanta and see it almost daily. If presented with an image of Kansas's capitol in Topeka, however, I would be stumped, since I have never been there or even seen a picture of that building. The point is that experience drives our ability to recognize images.

Most millennials also tend to share these “experiences” on social media, blogs, text messages, etc. Google is dominant at crunching these digital “experiences” from its users and artfully marrying those impressions against validated datasets. Google Vision is a REST API hosted and managed by Google that lets users upload arbitrary images and run services such as landmark detection, label annotation, OCR, image properties, explicit content detection, face detection with sentiment, and corporate logo detection, all with impressive accuracy. Obviously this opens up a wide range of next-generation platform possibilities, but how do we use it? Google Vision can be accessed through a myriad of language SDKs, but this article focuses on Google Vision’s integration with Apache NiFi.
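To make the REST API a little more concrete, here is a minimal sketch of the JSON body that a landmark detection request to the Vision v1 `images:annotate` endpoint expects. The helper name `build_annotate_request` and its defaults are my own illustration, not part of any SDK or of the NiFi processor discussed in this article, and the sketch only builds the payload; it does not send it or handle authentication.

```python
import base64

# Public v1 endpoint of the Vision REST API. A real call also needs an API key
# or OAuth token, which this sketch deliberately leaves out.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types=("LANDMARK_DETECTION",), max_results=5):
    """Build the JSON body for a single-image images:annotate call.

    `build_annotate_request` is a hypothetical helper name used for illustration.
    """
    return {
        "requests": [
            {
                # The API accepts the raw image as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # One entry per requested detection type.
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }
```

Posting this body to `VISION_ENDPOINT` with valid credentials should return one response object per request entry. Other feature types, such as `LABEL_DETECTION` or `LOGO_DETECTION`, can be passed in `feature_types` in the same way.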

I quizzed myself and, shamefully, was only able to recognize 4 of the 50 state capitols. But how did Google and Apache NiFi do? Using Apache NiFi and the Google Vision API, I was able to successfully detect 35 of the 50 state capitols from those same images! Don’t believe me? Let’s take a look at how I did it. First up was the Google Vision API integration with Apache NiFi. Apache NiFi already has a robust set of tools for invoking REST APIs and handling JSON data. However, I prefer my workflows to remain clean and concise. Although the discrete components are there, I opted to create a custom GoogleVisionProcessor to condense those messy workflows into a single processor. The source code and instructions for using this processor can be found at . I also plan to contribute it to Apache in the coming weeks, after I iron out some more advanced features.

Let’s take a look at the NiFi workflow and the results from the experiment.


As you can see, the GoogleVisionProcessor properly detected 35 out of 50 state capitols! The processor takes the JSON detection definition returned by Google Vision and stores it in a handy FlowFile attribute, so other Apache NiFi processors can access that information and use it however we like.
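For readers who want to work with that JSON outside of NiFi, the following sketch shows one way to pull landmark names, scores, and coordinates out of a Vision response and tally how many images had a detection. It assumes the attribute holds the standard `images:annotate` response shape (`responses[].landmarkAnnotations[]`); the exact content stored by the GoogleVisionProcessor may differ. The sample payload is made up purely to show the structure and is not output from the experiment above, and both function names are hypothetical.

```python
import json

# Made-up response used only to illustrate the shape; NOT output from the experiment.
SAMPLE_RESPONSE = """
{
  "responses": [
    {
      "landmarkAnnotations": [
        {
          "description": "Example Capitol",
          "score": 0.87,
          "locations": [{"latLng": {"latitude": 10.0, "longitude": -20.0}}]
        }
      ]
    },
    {}
  ]
}
"""


def extract_landmarks(response_json):
    """Return, per image, a list of (description, score, latitude, longitude) tuples."""
    results = []
    for response in json.loads(response_json).get("responses", []):
        found = []
        # An image with no recognized landmark simply lacks "landmarkAnnotations".
        for annotation in response.get("landmarkAnnotations", []):
            locations = annotation.get("locations", [])
            lat_lng = locations[0].get("latLng", {}) if locations else {}
            found.append((
                annotation.get("description"),
                annotation.get("score"),
                lat_lng.get("latitude"),
                lat_lng.get("longitude"),
            ))
        results.append(found)
    return results


def detection_rate(results):
    """Return (images with at least one landmark, total images)."""
    detected = sum(1 for found in results if found)
    return detected, len(results)
```

Running `detection_rate(extract_landmarks(SAMPLE_RESPONSE))` on the sample gives one detection out of two images. The same tally over 50 real responses is how a figure like 35 of 50 could be computed.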


And just for reference, here is a more graphical representation of the same image's landmark detection from Google.


I’m really excited about the new opportunities that combining Google’s advanced analytics with Apache NiFi’s agility will bring to end users.

Much more information about Google Vision can be found at and


Thanks, this is very useful.

How would one go about getting the application name? Is it the app name or the app ID or something else? Thanks!

Last update: 08-17-2019 10:07 AM