Using Deployed Models as a Function as a Service

Using Cloudera Data Science Workbench (CDSW) with Apache NiFi, we can easily call functions within our deployed models as part of NiFi flows. I am working against CDSW on HDP, but this will work for any CDSW regardless of install type.

In my simple example, I built a Python model that uses TextBlob to run sentiment analysis against a passed-in sentence. It returns the sentiment polarity and subjectivity, which we can immediately act upon in our flow.

CDSW is extremely easy to work with, and I was up and running in a few minutes. For my model, I created a Python 3 script and a shell script with the install details. Both of these artifacts are available here:

My Apache NiFi 1.8 flow is here (I use no custom processors): cdsw-twitter-sentiment.xml


Deploying a Machine Learning Model as a REST Service


Once you log in to CDSW, create a project or choose an existing one. From your project, open the workbench, where you can install some libraries and test some Python. I am using a Python 3 session to download the TextBlob/NLTK corpora for NLP.


Let's Pip Install some libraries for testing


Let's Create a new Model


You choose your file (for mine, see GitHub). The function name is actually sentiment. Notice the typo: I had to rebuild and redeploy to fix it. You set up an example input (sentence is the input parameter name) and an example output. Input and output will be JSON since this is a REST API.
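As a rough illustration of what such a model file might contain, here is a minimal sketch. It is not the script from the author's repository. It assumes the deployed function receives the JSON input as a dict and returns a JSON-serializable dict. The output key names `polarity` and `subjectivity` and the optional `analyzer` parameter are assumptions made for this sketch; `analyzer` exists only so the function can be exercised without TextBlob installed.

```python
def sentiment(args, analyzer=None):
    """Score the sentence in args["sentence"].

    args     -- dict built from the JSON request, e.g. {"sentence": "..."}
    analyzer -- hypothetical hook (not part of the original model): a callable
                that takes text and returns an object with .polarity and
                .subjectivity. Defaults to TextBlob.
    """
    sentence = args["sentence"]
    if analyzer is None:
        # Imported lazily so this module still loads where TextBlob is absent.
        from textblob import TextBlob

        def analyzer(text):
            return TextBlob(text).sentiment

    scores = analyzer(sentence)
    # Assumed output shape: two floats under plain key names.
    return {"polarity": scores.polarity, "subjectivity": scores.subjectivity}
```

With TextBlob installed, calling `sentiment({"sentence": "some text"})` in a workbench session is a quick way to check the function before deploying it.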

Let's Deploy It (Python 3)


The deploy will build it for deployment.


We can see standard output, standard error, status, the number of REST calls received, and how many succeeded.

Once a Model is Deployed We Can Control It


We can stop it, rebuild it, or replace the files if need be. I had to update things a few times. The amount of resources used for hosting the model's REST endpoint is your choice from a drop-down. Since I am doing something small, I picked the smallest option, with only 1 virtual CPU and 2 GB of RAM. All of this is running in Docker on Kubernetes!

Once Deployed, It's Ready To Test and Use From Apache NiFi


Just click Test to see the JSON results. We can now call the model from an Apache NiFi flow.

Once Deployed We Can Monitor The Model


Let's Run the Test


See the status and response!

Apache NiFi Example Flow



Step 1: Call Twitter


Step 2: Extract Social Attributes of Interest


Step 3: Build our web call with our access key and function parameter


Step 4: Extract our string as a flow file to send to the HTTP Post



Step 5: Call Our Cloudera Data Science Workbench REST API (see tester).


Step 6: Extract the two result values.
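The extraction in this step can be sketched in Python for readers who want to test the REST API outside NiFi. The response shape used here, a top-level `success` flag with the model's output under `response`, is an assumption for illustration, as are the `polarity` and `subjectivity` key names. Check the JSON shown by the model's Test button for the actual shape.

```python
import json


def extract_scores(response_text):
    """Pull the two sentiment values out of a model response body.

    Assumed (not confirmed by the article) response shape:
    {"success": true, "response": {"polarity": 0.5, "subjectivity": 0.6}}
    """
    payload = json.loads(response_text)
    if not payload.get("success", False):
        raise ValueError("model call did not report success")
    result = payload["response"]
    return float(result["polarity"]), float(result["subjectivity"])
```

In the NiFi flow the same two values end up as flow file attributes, which is what the later routing and Slack steps read.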


Step 7: Let's route on the sentiment


The polarity of the sentiment can be negative (<0), neutral (0), positive (>0), or very positive (1). See the TextBlob documentation for more information on how this works.
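The routing rule above can be written out as a small Python function. This is only a sketch of the thresholds listed in the text, not the NiFi routing configuration itself. The function name and the label strings are made up for illustration. TextBlob polarity ranges from -1.0 to 1.0, so 1 is the maximum.

```python
def classify_polarity(polarity):
    """Map a polarity score to one of the four routes described above.

    Accepts a float or a numeric string, since NiFi attributes are strings.
    """
    value = float(polarity)
    if value < 0:
        return "negative"
    if value == 0:
        return "neutral"
    if value >= 1:
        return "very positive"
    return "positive"
```

For example, a tweet scored at -0.3 would take the negative route, which in this flow is the one sent to Slack.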

Step 8: Send bad sentiment to a Slack channel for human analysis.


We send all the related information to a Slack channel, including the message.

Example Message Sent to Slack


Step 9: Store all (or some) of the results in Phoenix/HBase, Hive LLAP, Impala, Kudu, or HDFS.

Results as Attributes


Slack Message Call
${msg:append(" User:"):append(${user_name}):append(${handle}):append(" Geo:"):append(${coordinates}):append(${geo}):append(${location}):append(${place}):append(" Hashtags:"):append(${hashtags}):append(" Polarity:"):append(${polarity}):append(" Subjectivity:"):append(${subjectivity}):append(" Friends Count:"):append(${friends_count}):append(" Followers Count:"):append(${followers_count}):append(" Retweet Count:"):append(${retweet_count}):append(" Source:"):append(${source}):append(" Time:"):append(${time}):append(" Tweet ID:"):append(${tweet_id})}
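To make the resulting string easier to picture, here is a Python sketch that concatenates the same attributes in the same order as the expression above. It is an illustration only. The function name is hypothetical, and treating a missing attribute as an empty string is an assumption of this sketch rather than a statement about NiFi Expression Language behavior.

```python
# Labels and attribute names in the order the expression appends them.
FIELDS = [
    (" User:", ["user_name", "handle"]),
    (" Geo:", ["coordinates", "geo", "location", "place"]),
    (" Hashtags:", ["hashtags"]),
    (" Polarity:", ["polarity"]),
    (" Subjectivity:", ["subjectivity"]),
    (" Friends Count:", ["friends_count"]),
    (" Followers Count:", ["followers_count"]),
    (" Retweet Count:", ["retweet_count"]),
    (" Source:", ["source"]),
    (" Time:", ["time"]),
    (" Tweet ID:", ["tweet_id"]),
]


def build_slack_message(attrs):
    """Concatenate the tweet text and its attributes into one line."""
    parts = [str(attrs.get("msg", ""))]
    for label, names in FIELDS:
        parts.append(label)
        # Assumption: a missing attribute contributes nothing.
        parts.extend(str(attrs.get(name, "")) for name in names)
    return "".join(parts)
```

Note that, as in the expression, adjacent values such as the user name and handle are joined with no separator between them.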

REST CALL to Model
{"accessKey":"from your workbench","request":{"sentence":"${msg:replaceAll('\"', ''):replaceAll('\n','')}"}}
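For testing the model outside NiFi, the same request body can be built in Python. This is a sketch under assumptions: `MODEL_URL` is a placeholder hostname, not the author's endpoint, and the helper names are hypothetical. The access key comes from your own workbench, as noted above. The request object is only constructed here; sending it is left to the reader.

```python
import json
import urllib.request

# Placeholder endpoint (assumption); substitute your own model service URL.
MODEL_URL = "https://modelservice.cdsw.example.com/model"


def build_body(access_key, msg):
    """Build the JSON body, stripping double quotes and newlines like the flow."""
    cleaned = msg.replace('"', "").replace("\n", "")
    return json.dumps({"accessKey": access_key, "request": {"sentence": cleaned}})


def build_request(access_key, msg, url=MODEL_URL):
    """Prepare (but do not send) a JSON POST to the model endpoint."""
    return urllib.request.Request(
        url,
        data=build_body(access_key, msg).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_request(...)` to `urllib.request.urlopen` would send the call. Because `json.dumps` escapes quotes and newlines on its own, the stripping is kept here only to mirror what the NiFi expression does.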