Created on 01-30-2020 04:56 AM - edited 09-16-2022 01:45 AM
Image classification is a powerful tool that can be used to gather insight that is not contained in textual information like (titles, comments, tags, etc...). The objects in the image posted along with a tweet for example may have nothing to do with what was written in the post. In this exercise we will look at how a real time stream of tweets on any topic can be broken down and analyzed in order to classify the main object/subject of a photo by leveraging a deep learning model that has been deployed as a Rest Endpoint via CDSW.
To get started we need to get a model deployed in CDSW that will serve as our endpoint for predictions of our incoming images.
In CDSW Create a project and then add the python code we will use to score any image URL:
import json
import csv
import keras
#import pandas as pandas
from keras.applications import ResNet50
from keras.preprocessing.image import img_to_array
from keras.applications import imagenet_utils
from keras.preprocessing.image import load_img
from PIL import Image
import numpy as np
import io
from io import BytesIO
import os
#import magic
import requests
#!pip install keras tensorflow Pillow requests numpy
def load_model():
# load the pre-trained Keras model (here we are using a model
# pre-trained on ImageNet and provided by Keras, but you can
# substitute in your own networks just as easily)
global model
model = ResNet50(weights="imagenet")
#model2 = VGG
#model3
def prepare_image(image, target):
# if the image mode is not RGB, convert it
if image.mode != "RGB":
image = image.convert("RGB")
# resize the input image and preprocess it
image = image.resize(target)
image = img_to_array(image)
image = np.expand_dims(image, axis=0)
image = imagenet_utils.preprocess_input(image)
# return the processed image
return image
def predict(im):
# initialize the data dictionary that will be returned from the
# view
# read the image in PIL format
data = {}
filename='temp.jpeg'
response = requests.get(im["url"])
with open(filename, 'wb') as f:
f.write(response.content)
image = load_img(filename, target_size=(224, 224))
# preprocess the image and prepare it for classification
image = prepare_image(image, target=(224, 224))
# classify the input image and then initialize the list
# of predictions to return to the client
preds = model.predict(image)
results = imagenet_utils.decode_predictions(preds)
data["predictions"] = []
# loop over the results and add them to the list of
# returned predictions
for (imagenetID, label, prob) in results[0]:
r = {"label": label, "probability": float(prob)}
data["predictions"].append(r)
# return the data dictionary as a JSON response
jdata = json.dumps(data)
return jdata
load_model()
Now, CDSW will need to have some "pip installs" if you are using a clean environment without any dependencies installed.
In the cdsw_build.sh script,
pip2 install sklearn
pip2 install keras
pip2 install tensorflow
pip2 install Pillow
pip2 install requests
pip2 install numpy
You can run this in your workspace if you want to test code interactively or make adjustments to the model.You need a sample request like the following:
{
"url": "<a href="<a href="<a href="https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J" target="_blank">https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a>" target="_blank"><a href="https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a" target="_blank">https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a</a>>" target="_blank"><a href="<a href="https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a" target="_blank">https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a</a>" target="_blank"><a href="https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a</a" target="_blank">https://miro.medium.com/max/3027/0*kp8rJzqHjagMj22J</a</a</a>>>"
}
And a sample response, like the following:
{
"predictions": [
{
"probability": 0.7351706624031067,
"label": "airliner"
},
{
"probability": 0.19801650941371918,
"label": "space_shuttle"
},
{
"probability": 0.05893068388104439,
"label": "wing"
},
{
"probability": 0.006579721812158823,
"label": "warplane"
},
{
"probability": 0.0006061011808924377,
"label": "airship"
}
],
"success": true
}
Then, you can deploy. After a couple of minutes, the status will show deployed. You can click test in the UI and get back a prediction.
Now, we have an endpoint capable of predicting any image given the images url.
Next step is to setup NiFi.
The following is a flow which can read from twitter and then score against your endpoint:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<template encoding-version="1.3">
<description>predict the images posted along with tweets about any topic you specify in the filters of the input processor</description>
<groupId>f0f4f2f6-016f-1000-ffff-ffffd1df01cd</groupId>
<name>twitter_image_prediction</name>
<snippet>
<connections>
<id>219fa01a-56a5-38aa-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>1b0c31a8-ec30-3832-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>Response</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>7f0d4823-8b99-305b-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>5021a311-2a61-30eb-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>55ad95fb-e275-3c64-0000-000000000000</id>
<type>FUNNEL</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>Response</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>7d54a566-45aa-3c17-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>5c06faac-4180-323a-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>c82b45d8-617b-396c-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>isImage</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>f29be6bf-ddbf-3b6d-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>6b6627d4-649b-3070-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>55ad95fb-e275-3c64-0000-000000000000</id>
<type>FUNNEL</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>Original</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>c82b45d8-617b-396c-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>8dcd8171-6516-36b1-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>912a453f-e17e-35a5-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>matched</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>d93241be-c09a-3a4c-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>98eb5d14-1885-3d33-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>55ad95fb-e275-3c64-0000-000000000000</id>
<type>FUNNEL</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>probHigh</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>912a453f-e17e-35a5-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>b36415ac-5cc5-39bb-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>f5dac0c6-3680-3d94-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>success</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>e5fa80e4-9690-38a3-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>babe17fd-025b-3b3f-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>63d78c45-e918-371a-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>success</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>cf25469b-28b9-3f90-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>cd56c28b-94c9-31e9-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>7f0d4823-8b99-305b-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>success</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>63d78c45-e918-371a-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>ce19ac81-e486-30bb-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>f29be6bf-ddbf-3b6d-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>matched</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>f5dac0c6-3680-3d94-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>cfbbc63d-28a4-3888-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>cf25469b-28b9-3f90-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>isImage</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>f29be6bf-ddbf-3b6d-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>dbdf2bf4-a2a3-30cc-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>7d54a566-45aa-3c17-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>isImage</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>f29be6bf-ddbf-3b6d-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>dc2c9d0e-5315-39ae-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>9e48d253-c1a1-3df8-0000-000000000000</id>
<type>FUNNEL</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>unmatched</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>912a453f-e17e-35a5-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<connections>
<id>f2d088e5-6458-3ca3-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>d93241be-c09a-3a4c-0000-000000000000</id>
<type>PROCESSOR</type>
</destination>
<flowFileExpiration>0 sec</flowFileExpiration>
<labelIndex>1</labelIndex>
<loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
<loadBalancePartitionAttribute></loadBalancePartitionAttribute>
<loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
<loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
<name></name>
<selectedRelationships>matched</selectedRelationships>
<source>
<groupId>74a3848d-1f7a-3866-0000-000000000000</groupId>
<id>1b0c31a8-ec30-3832-0000-000000000000</id>
<type>PROCESSOR</type>
</source>
<zIndex>0</zIndex>
</connections>
<funnels>
<id>55ad95fb-e275-3c64-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>2090.0000059810604</x>
<y>712.0</y>
</position>
</funnels>
<funnels>
<id>9e48d253-c1a1-3df8-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>2602.0000059810604</x>
<y>776.0</y>
</position>
</funnels>
<processors>
<id>1b0c31a8-ec30-3832-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>1770.0000059810604</x>
<y>0.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Destination</key>
<value>
<name>Destination</name>
</value>
</entry>
<entry>
<key>Return Type</key>
<value>
<name>Return Type</name>
</value>
</entry>
<entry>
<key>Path Not Found Behavior</key>
<value>
<name>Path Not Found Behavior</name>
</value>
</entry>
<entry>
<key>Null Value Representation</key>
<value>
<name>Null Value Representation</name>
</value>
</entry>
<entry>
<key>json_prediction</key>
<value>
<name>json_prediction</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Destination</key>
<value>flowfile-content</value>
</entry>
<entry>
<key>Return Type</key>
<value>auto-detect</value>
</entry>
<entry>
<key>Path Not Found Behavior</key>
<value>ignore</value>
</entry>
<entry>
<key>Null Value Representation</key>
<value>empty string</value>
</entry>
<entry>
<key>json_prediction</key>
<value>$.response</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>EvaluateJsonPath</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>failure</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>matched</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>unmatched</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.EvaluateJsonPath</type>
</processors>
<processors>
<id>63d78c45-e918-371a-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>1282.0000059810604</x>
<y>0.0</y>
</position>
<bundle>
<artifact>nifi-update-attribute-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Delete Attributes Expression</key>
<value>
<name>Delete Attributes Expression</name>
</value>
</entry>
<entry>
<key>Store State</key>
<value>
<name>Store State</name>
</value>
</entry>
<entry>
<key>Stateful Variables Initial Value</key>
<value>
<name>Stateful Variables Initial Value</name>
</value>
</entry>
<entry>
<key>canonical-value-lookup-cache-size</key>
<value>
<name>canonical-value-lookup-cache-size</name>
</value>
</entry>
<entry>
<key>accessKey</key>
<value>
<name>accessKey</name>
</value>
</entry>
<entry>
<key>Authorization</key>
<value>
<name>Authorization</name>
</value>
</entry>
<entry>
<key>request</key>
<value>
<name>request</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Delete Attributes Expression</key>
</entry>
<entry>
<key>Store State</key>
<value>Do not store state</value>
</entry>
<entry>
<key>Stateful Variables Initial Value</key>
</entry>
<entry>
<key>canonical-value-lookup-cache-size</key>
<value>100</value>
</entry>
<entry>
<key>accessKey</key>
<value>mkvz7qe59mw0jtsybqtg2ooybc5w3v1p</value>
</entry>
<entry>
<key>Authorization</key>
<value>Bearer somevalue</value>
</entry>
<entry>
<key>request</key>
<value>{"accessKey":"your_access_key","request":{
"url": "${media}"
}}</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>UpdateAttribute</name>
<relationships>
<autoTerminate>false</autoTerminate>
<name>success</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.attributes.UpdateAttribute</type>
</processors>
<processors>
<id>7d54a566-45aa-3c17-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>1210.0000059810604</x>
<y>648.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>HTTP Method</key>
<value>
<name>HTTP Method</name>
</value>
</entry>
<entry>
<key>Remote URL</key>
<value>
<name>Remote URL</name>
</value>
</entry>
<entry>
<key>SSL Context Service</key>
<value>
<identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService>
<name>SSL Context Service</name>
</value>
</entry>
<entry>
<key>Connection Timeout</key>
<value>
<name>Connection Timeout</name>
</value>
</entry>
<entry>
<key>Read Timeout</key>
<value>
<name>Read Timeout</name>
</value>
</entry>
<entry>
<key>Include Date Header</key>
<value>
<name>Include Date Header</name>
</value>
</entry>
<entry>
<key>Follow Redirects</key>
<value>
<name>Follow Redirects</name>
</value>
</entry>
<entry>
<key>Attributes to Send</key>
<value>
<name>Attributes to Send</name>
</value>
</entry>
<entry>
<key>Basic Authentication Username</key>
<value>
<name>Basic Authentication Username</name>
</value>
</entry>
<entry>
<key>Basic Authentication Password</key>
<value>
<name>Basic Authentication Password</name>
</value>
</entry>
<entry>
<key>proxy-configuration-service</key>
<value>
<identifiesControllerService>org.apache.nifi.proxy.ProxyConfigurationService</identifiesControllerService>
<name>proxy-configuration-service</name>
</value>
</entry>
<entry>
<key>Proxy Host</key>
<value>
<name>Proxy Host</name>
</value>
</entry>
<entry>
<key>Proxy Port</key>
<value>
<name>Proxy Port</name>
</value>
</entry>
<entry>
<key>Proxy Type</key>
<value>
<name>Proxy Type</name>
</value>
</entry>
<entry>
<key>invokehttp-proxy-user</key>
<value>
<name>invokehttp-proxy-user</name>
</value>
</entry>
<entry>
<key>invokehttp-proxy-password</key>
<value>
<name>invokehttp-proxy-password</name>
</value>
</entry>
<entry>
<key>Put Response Body In Attribute</key>
<value>
<name>Put Response Body In Attribute</name>
</value>
</entry>
<entry>
<key>Max Length To Put In Attribute</key>
<value>
<name>Max Length To Put In Attribute</name>
</value>
</entry>
<entry>
<key>Digest Authentication</key>
<value>
<name>Digest Authentication</name>
</value>
</entry>
<entry>
<key>Always Output Response</key>
<value>
<name>Always Output Response</name>
</value>
</entry>
<entry>
<key>Add Response Headers to Request</key>
<value>
<name>Add Response Headers to Request</name>
</value>
</entry>
<entry>
<key>Content-Type</key>
<value>
<name>Content-Type</name>
</value>
</entry>
<entry>
<key>send-message-body</key>
<value>
<name>send-message-body</name>
</value>
</entry>
<entry>
<key>Use Chunked Encoding</key>
<value>
<name>Use Chunked Encoding</name>
</value>
</entry>
<entry>
<key>Penalize on "No Retry"</key>
<value>
<name>Penalize on "No Retry"</name>
</value>
</entry>
<entry>
<key>use-etag</key>
<value>
<name>use-etag</name>
</value>
</entry>
<entry>
<key>etag-max-cache-size</key>
<value>
<name>etag-max-cache-size</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>HTTP Method</key>
<value>GET</value>
</entry>
<entry>
<key>Remote URL</key>
<value>${media}</value>
</entry>
<entry>
<key>SSL Context Service</key>
</entry>
<entry>
<key>Connection Timeout</key>
<value>5 secs</value>
</entry>
<entry>
<key>Read Timeout</key>
<value>15 secs</value>
</entry>
<entry>
<key>Include Date Header</key>
<value>True</value>
</entry>
<entry>
<key>Follow Redirects</key>
<value>True</value>
</entry>
<entry>
<key>Attributes to Send</key>
</entry>
<entry>
<key>Basic Authentication Username</key>
</entry>
<entry>
<key>Basic Authentication Password</key>
</entry>
<entry>
<key>proxy-configuration-service</key>
</entry>
<entry>
<key>Proxy Host</key>
</entry>
<entry>
<key>Proxy Port</key>
</entry>
<entry>
<key>Proxy Type</key>
<value>http</value>
</entry>
<entry>
<key>invokehttp-proxy-user</key>
</entry>
<entry>
<key>invokehttp-proxy-password</key>
</entry>
<entry>
<key>Put Response Body In Attribute</key>
</entry>
<entry>
<key>Max Length To Put In Attribute</key>
<value>256</value>
</entry>
<entry>
<key>Digest Authentication</key>
<value>false</value>
</entry>
<entry>
<key>Always Output Response</key>
<value>false</value>
</entry>
<entry>
<key>Add Response Headers to Request</key>
<value>false</value>
</entry>
<entry>
<key>Content-Type</key>
<value>${mime.type}</value>
</entry>
<entry>
<key>send-message-body</key>
<value>true</value>
</entry>
<entry>
<key>Use Chunked Encoding</key>
<value>false</value>
</entry>
<entry>
<key>Penalize on "No Retry"</key>
<value>false</value>
</entry>
<entry>
<key>use-etag</key>
<value>false</value>
</entry>
<entry>
<key>etag-max-cache-size</key>
<value>10MB</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>InvokeHTTP</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Failure</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>No Retry</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Original</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>Response</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Retry</name>
</relationships>
<state>STOPPED</state>
<style/>
<type>org.apache.nifi.processors.standard.InvokeHTTP</type>
</processors>
<processors>
<id>7f0d4823-8b99-305b-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>1522.0000059810604</x>
<y>360.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>HTTP Method</key>
<value>
<name>HTTP Method</name>
</value>
</entry>
<entry>
<key>Remote URL</key>
<value>
<name>Remote URL</name>
</value>
</entry>
<entry>
<key>SSL Context Service</key>
<value>
<identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService>
<name>SSL Context Service</name>
</value>
</entry>
<entry>
<key>Connection Timeout</key>
<value>
<name>Connection Timeout</name>
</value>
</entry>
<entry>
<key>Read Timeout</key>
<value>
<name>Read Timeout</name>
</value>
</entry>
<entry>
<key>Include Date Header</key>
<value>
<name>Include Date Header</name>
</value>
</entry>
<entry>
<key>Follow Redirects</key>
<value>
<name>Follow Redirects</name>
</value>
</entry>
<entry>
<key>Attributes to Send</key>
<value>
<name>Attributes to Send</name>
</value>
</entry>
<entry>
<key>Basic Authentication Username</key>
<value>
<name>Basic Authentication Username</name>
</value>
</entry>
<entry>
<key>Basic Authentication Password</key>
<value>
<name>Basic Authentication Password</name>
</value>
</entry>
<entry>
<key>proxy-configuration-service</key>
<value>
<identifiesControllerService>org.apache.nifi.proxy.ProxyConfigurationService</identifiesControllerService>
<name>proxy-configuration-service</name>
</value>
</entry>
<entry>
<key>Proxy Host</key>
<value>
<name>Proxy Host</name>
</value>
</entry>
<entry>
<key>Proxy Port</key>
<value>
<name>Proxy Port</name>
</value>
</entry>
<entry>
<key>Proxy Type</key>
<value>
<name>Proxy Type</name>
</value>
</entry>
<entry>
<key>invokehttp-proxy-user</key>
<value>
<name>invokehttp-proxy-user</name>
</value>
</entry>
<entry>
<key>invokehttp-proxy-password</key>
<value>
<name>invokehttp-proxy-password</name>
</value>
</entry>
<entry>
<key>Put Response Body In Attribute</key>
<value>
<name>Put Response Body In Attribute</name>
</value>
</entry>
<entry>
<key>Max Length To Put In Attribute</key>
<value>
<name>Max Length To Put In Attribute</name>
</value>
</entry>
<entry>
<key>Digest Authentication</key>
<value>
<name>Digest Authentication</name>
</value>
</entry>
<entry>
<key>Always Output Response</key>
<value>
<name>Always Output Response</name>
</value>
</entry>
<entry>
<key>Add Response Headers to Request</key>
<value>
<name>Add Response Headers to Request</name>
</value>
</entry>
<entry>
<key>Content-Type</key>
<value>
<name>Content-Type</name>
</value>
</entry>
<entry>
<key>send-message-body</key>
<value>
<name>send-message-body</name>
</value>
</entry>
<entry>
<key>Use Chunked Encoding</key>
<value>
<name>Use Chunked Encoding</name>
</value>
</entry>
<entry>
<key>Penalize on "No Retry"</key>
<value>
<name>Penalize on "No Retry"</name>
</value>
</entry>
<entry>
<key>use-etag</key>
<value>
<name>use-etag</name>
</value>
</entry>
<entry>
<key>etag-max-cache-size</key>
<value>
<name>etag-max-cache-size</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>HTTP Method</key>
<value>POST</value>
</entry>
<entry>
<key>Remote URL</key>
<value>cdsw_endpoint_url</value>
</entry>
<entry>
<key>SSL Context Service</key>
</entry>
<entry>
<key>Connection Timeout</key>
<value>5 secs</value>
</entry>
<entry>
<key>Read Timeout</key>
<value>15 secs</value>
</entry>
<entry>
<key>Include Date Header</key>
<value>True</value>
</entry>
<entry>
<key>Follow Redirects</key>
<value>True</value>
</entry>
<entry>
<key>Attributes to Send</key>
<value>accessKey</value>
</entry>
<entry>
<key>Basic Authentication Username</key>
</entry>
<entry>
<key>Basic Authentication Password</key>
</entry>
<entry>
<key>proxy-configuration-service</key>
</entry>
<entry>
<key>Proxy Host</key>
</entry>
<entry>
<key>Proxy Port</key>
</entry>
<entry>
<key>Proxy Type</key>
<value>http</value>
</entry>
<entry>
<key>invokehttp-proxy-user</key>
</entry>
<entry>
<key>invokehttp-proxy-password</key>
</entry>
<entry>
<key>Put Response Body In Attribute</key>
</entry>
<entry>
<key>Max Length To Put In Attribute</key>
<value>256</value>
</entry>
<entry>
<key>Digest Authentication</key>
<value>false</value>
</entry>
<entry>
<key>Always Output Response</key>
<value>false</value>
</entry>
<entry>
<key>Add Response Headers to Request</key>
<value>false</value>
</entry>
<entry>
<key>Content-Type</key>
<value>${mime.type}</value>
</entry>
<entry>
<key>send-message-body</key>
<value>true</value>
</entry>
<entry>
<key>Use Chunked Encoding</key>
<value>false</value>
</entry>
<entry>
<key>Penalize on "No Retry"</key>
<value>false</value>
</entry>
<entry>
<key>use-etag</key>
<value>false</value>
</entry>
<entry>
<key>etag-max-cache-size</key>
<value>10MB</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>InvokeHTTP</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Failure</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>No Retry</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Original</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>Response</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Retry</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.InvokeHTTP</type>
</processors>
<processors>
<id>912a453f-e17e-35a5-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>2442.0000059810604</x>
<y>272.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Routing Strategy</key>
<value>
<name>Routing Strategy</name>
</value>
</entry>
<entry>
<key>probHigh</key>
<value>
<name>probHigh</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Routing Strategy</key>
<value>Route to Property name</value>
</entry>
<entry>
<key>probHigh</key>
<value>${prob1:gt(0.75)}</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>RouteOnAttribute</name>
<relationships>
<autoTerminate>false</autoTerminate>
<name>probHigh</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>unmatched</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.RouteOnAttribute</type>
</processors>
<processors>
<id>c82b45d8-617b-396c-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>1034.0000059810604</x>
<y>944.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>HTTP Method</key>
<value>
<name>HTTP Method</name>
</value>
</entry>
<entry>
<key>Remote URL</key>
<value>
<name>Remote URL</name>
</value>
</entry>
<entry>
<key>SSL Context Service</key>
<value>
<identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService>
<name>SSL Context Service</name>
</value>
</entry>
<entry>
<key>Connection Timeout</key>
<value>
<name>Connection Timeout</name>
</value>
</entry>
<entry>
<key>Read Timeout</key>
<value>
<name>Read Timeout</name>
</value>
</entry>
<entry>
<key>Include Date Header</key>
<value>
<name>Include Date Header</name>
</value>
</entry>
<entry>
<key>Follow Redirects</key>
<value>
<name>Follow Redirects</name>
</value>
</entry>
<entry>
<key>Attributes to Send</key>
<value>
<name>Attributes to Send</name>
</value>
</entry>
<entry>
<key>Basic Authentication Username</key>
<value>
<name>Basic Authentication Username</name>
</value>
</entry>
<entry>
<key>Basic Authentication Password</key>
<value>
<name>Basic Authentication Password</name>
</value>
</entry>
<entry>
<key>proxy-configuration-service</key>
<value>
<identifiesControllerService>org.apache.nifi.proxy.ProxyConfigurationService</identifiesControllerService>
<name>proxy-configuration-service</name>
</value>
</entry>
<entry>
<key>Proxy Host</key>
<value>
<name>Proxy Host</name>
</value>
</entry>
<entry>
<key>Proxy Port</key>
<value>
<name>Proxy Port</name>
</value>
</entry>
<entry>
<key>Proxy Type</key>
<value>
<name>Proxy Type</name>
</value>
</entry>
<entry>
<key>invokehttp-proxy-user</key>
<value>
<name>invokehttp-proxy-user</name>
</value>
</entry>
<entry>
<key>invokehttp-proxy-password</key>
<value>
<name>invokehttp-proxy-password</name>
</value>
</entry>
<entry>
<key>Put Response Body In Attribute</key>
<value>
<name>Put Response Body In Attribute</name>
</value>
</entry>
<entry>
<key>Max Length To Put In Attribute</key>
<value>
<name>Max Length To Put In Attribute</name>
</value>
</entry>
<entry>
<key>Digest Authentication</key>
<value>
<name>Digest Authentication</name>
</value>
</entry>
<entry>
<key>Always Output Response</key>
<value>
<name>Always Output Response</name>
</value>
</entry>
<entry>
<key>Add Response Headers to Request</key>
<value>
<name>Add Response Headers to Request</name>
</value>
</entry>
<entry>
<key>Content-Type</key>
<value>
<name>Content-Type</name>
</value>
</entry>
<entry>
<key>send-message-body</key>
<value>
<name>send-message-body</name>
</value>
</entry>
<entry>
<key>Use Chunked Encoding</key>
<value>
<name>Use Chunked Encoding</name>
</value>
</entry>
<entry>
<key>Penalize on "No Retry"</key>
<value>
<name>Penalize on "No Retry"</name>
</value>
</entry>
<entry>
<key>use-etag</key>
<value>
<name>use-etag</name>
</value>
</entry>
<entry>
<key>etag-max-cache-size</key>
<value>
<name>etag-max-cache-size</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>HTTP Method</key>
<value>GET</value>
</entry>
<entry>
<key>Remote URL</key>
<value>${media}</value>
</entry>
<entry>
<key>SSL Context Service</key>
</entry>
<entry>
<key>Connection Timeout</key>
<value>5 secs</value>
</entry>
<entry>
<key>Read Timeout</key>
<value>15 secs</value>
</entry>
<entry>
<key>Include Date Header</key>
<value>True</value>
</entry>
<entry>
<key>Follow Redirects</key>
<value>True</value>
</entry>
<entry>
<key>Attributes to Send</key>
</entry>
<entry>
<key>Basic Authentication Username</key>
</entry>
<entry>
<key>Basic Authentication Password</key>
</entry>
<entry>
<key>proxy-configuration-service</key>
</entry>
<entry>
<key>Proxy Host</key>
</entry>
<entry>
<key>Proxy Port</key>
</entry>
<entry>
<key>Proxy Type</key>
<value>http</value>
</entry>
<entry>
<key>invokehttp-proxy-user</key>
</entry>
<entry>
<key>invokehttp-proxy-password</key>
</entry>
<entry>
<key>Put Response Body In Attribute</key>
<value>pic</value>
</entry>
<entry>
<key>Max Length To Put In Attribute</key>
<value>256</value>
</entry>
<entry>
<key>Digest Authentication</key>
<value>false</value>
</entry>
<entry>
<key>Always Output Response</key>
<value>false</value>
</entry>
<entry>
<key>Add Response Headers to Request</key>
<value>false</value>
</entry>
<entry>
<key>Content-Type</key>
<value>${mime.type}</value>
</entry>
<entry>
<key>send-message-body</key>
<value>true</value>
</entry>
<entry>
<key>Use Chunked Encoding</key>
<value>false</value>
</entry>
<entry>
<key>Penalize on "No Retry"</key>
<value>false</value>
</entry>
<entry>
<key>use-etag</key>
<value>false</value>
</entry>
<entry>
<key>etag-max-cache-size</key>
<value>10MB</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>InvokeHTTP</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Failure</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>No Retry</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>Original</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Response</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>Retry</name>
</relationships>
<state>STOPPED</state>
<style/>
<type>org.apache.nifi.processors.standard.InvokeHTTP</type>
</processors>
<processors>
<id>cf25469b-28b9-3f90-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>1050.0000059810604</x>
<y>200.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Regular Expression</key>
<value>
<name>Regular Expression</name>
</value>
</entry>
<entry>
<key>Replacement Value</key>
<value>
<name>Replacement Value</name>
</value>
</entry>
<entry>
<key>Character Set</key>
<value>
<name>Character Set</name>
</value>
</entry>
<entry>
<key>Maximum Buffer Size</key>
<value>
<name>Maximum Buffer Size</name>
</value>
</entry>
<entry>
<key>Replacement Strategy</key>
<value>
<name>Replacement Strategy</name>
</value>
</entry>
<entry>
<key>Evaluation Mode</key>
<value>
<name>Evaluation Mode</name>
</value>
</entry>
<entry>
<key>Line-by-Line Evaluation Mode</key>
<value>
<name>Line-by-Line Evaluation Mode</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Regular Expression</key>
<value>(?s)(^.*$)</value>
</entry>
<entry>
<key>Replacement Value</key>
<value>{"accessKey":"mkvz7qe59mw0jtsybqtg2ooybc5w3v1p","request":{
"url": "${media}"
}}</value>
</entry>
<entry>
<key>Character Set</key>
<value>UTF-8</value>
</entry>
<entry>
<key>Maximum Buffer Size</key>
<value>1 MB</value>
</entry>
<entry>
<key>Replacement Strategy</key>
<value>Regex Replace</value>
</entry>
<entry>
<key>Evaluation Mode</key>
<value>Entire text</value>
</entry>
<entry>
<key>Line-by-Line Evaluation Mode</key>
<value>All</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>ReplaceText</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>failure</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>success</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.ReplaceText</type>
</processors>
<processors>
<id>d93241be-c09a-3a4c-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>2434.0000059810604</x>
<y>0.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Destination</key>
<value>
<name>Destination</name>
</value>
</entry>
<entry>
<key>Return Type</key>
<value>
<name>Return Type</name>
</value>
</entry>
<entry>
<key>Path Not Found Behavior</key>
<value>
<name>Path Not Found Behavior</name>
</value>
</entry>
<entry>
<key>Null Value Representation</key>
<value>
<name>Null Value Representation</name>
</value>
</entry>
<entry>
<key>prob1</key>
<value>
<name>prob1</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Destination</key>
<value>flowfile-attribute</value>
</entry>
<entry>
<key>Return Type</key>
<value>auto-detect</value>
</entry>
<entry>
<key>Path Not Found Behavior</key>
<value>ignore</value>
</entry>
<entry>
<key>Null Value Representation</key>
<value>empty string</value>
</entry>
<entry>
<key>prob1</key>
<value>$.predictions[0].probability</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>EvaluateJsonPath</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>failure</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>matched</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>unmatched</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.EvaluateJsonPath</type>
</processors>
<processors>
<id>e5fa80e4-9690-38a3-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>0.0</x>
<y>33.000009169540704</y>
</position>
<bundle>
<artifact>nifi-social-media-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Twitter Endpoint</key>
<value>
<name>Twitter Endpoint</name>
</value>
</entry>
<entry>
<key>max-client-error-retries</key>
<value>
<name>max-client-error-retries</name>
</value>
</entry>
<entry>
<key>Consumer Key</key>
<value>
<name>Consumer Key</name>
</value>
</entry>
<entry>
<key>Consumer Secret</key>
<value>
<name>Consumer Secret</name>
</value>
</entry>
<entry>
<key>Access Token</key>
<value>
<name>Access Token</name>
</value>
</entry>
<entry>
<key>Access Token Secret</key>
<value>
<name>Access Token Secret</name>
</value>
</entry>
<entry>
<key>Languages</key>
<value>
<name>Languages</name>
</value>
</entry>
<entry>
<key>Terms to Filter On</key>
<value>
<name>Terms to Filter On</name>
</value>
</entry>
<entry>
<key>IDs to Follow</key>
<value>
<name>IDs to Follow</name>
</value>
</entry>
<entry>
<key>Locations to Filter On</key>
<value>
<name>Locations to Filter On</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Twitter Endpoint</key>
<value>Filter Endpoint</value>
</entry>
<entry>
<key>max-client-error-retries</key>
<value>5</value>
</entry>
<entry>
<key>Consumer Key</key>
</entry>
<entry>
<key>Consumer Secret</key>
</entry>
<entry>
<key>Access Token</key>
</entry>
<entry>
<key>Access Token Secret</key>
</entry>
<entry>
<key>Languages</key>
<value>en</value>
</entry>
<entry>
<key>Terms to Filter On</key>
<value>dog</value>
</entry>
<entry>
<key>IDs to Follow</key>
</entry>
<entry>
<key>Locations to Filter On</key>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>GetTwitter</name>
<relationships>
<autoTerminate>false</autoTerminate>
<name>success</name>
</relationships>
<state>STOPPED</state>
<style/>
<type>org.apache.nifi.processors.twitter.GetTwitter</type>
</processors>
<processors>
<id>f29be6bf-ddbf-3b6d-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>738.0000059810604</x>
<y>416.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Routing Strategy</key>
<value>
<name>Routing Strategy</name>
</value>
</entry>
<entry>
<key>isImage</key>
<value>
<name>isImage</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Routing Strategy</key>
<value>Route to Property name</value>
</entry>
<entry>
<key>isImage</key>
<value>${type:equals("photo")}</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>RouteOnAttribute</name>
<relationships>
<autoTerminate>false</autoTerminate>
<name>isImage</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>unmatched</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.RouteOnAttribute</type>
</processors>
<processors>
<id>f5dac0c6-3680-3d94-0000-000000000000</id>
<parentGroupId>74a3848d-1f7a-3866-0000-000000000000</parentGroupId>
<position>
<x>738.0000059810604</x>
<y>0.0</y>
</position>
<bundle>
<artifact>nifi-standard-nar</artifact>
<group>org.apache.nifi</group>
<version>1.10.0.2.0.0.0-106</version>
</bundle>
<config>
<bulletinLevel>WARN</bulletinLevel>
<comments></comments>
<concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
<descriptors>
<entry>
<key>Destination</key>
<value>
<name>Destination</name>
</value>
</entry>
<entry>
<key>Return Type</key>
<value>
<name>Return Type</name>
</value>
</entry>
<entry>
<key>Path Not Found Behavior</key>
<value>
<name>Path Not Found Behavior</name>
</value>
</entry>
<entry>
<key>Null Value Representation</key>
<value>
<name>Null Value Representation</name>
</value>
</entry>
<entry>
<key>media</key>
<value>
<name>media</name>
</value>
</entry>
<entry>
<key>type</key>
<value>
<name>type</name>
</value>
</entry>
</descriptors>
<executionNode>ALL</executionNode>
<lossTolerant>false</lossTolerant>
<penaltyDuration>30 sec</penaltyDuration>
<properties>
<entry>
<key>Destination</key>
<value>flowfile-attribute</value>
</entry>
<entry>
<key>Return Type</key>
<value>auto-detect</value>
</entry>
<entry>
<key>Path Not Found Behavior</key>
<value>ignore</value>
</entry>
<entry>
<key>Null Value Representation</key>
<value>empty string</value>
</entry>
<entry>
<key>media</key>
<value>$.entities.media[0].media_url</value>
</entry>
<entry>
<key>type</key>
<value>$.entities.media[0].type</value>
</entry>
</properties>
<runDurationMillis>0</runDurationMillis>
<schedulingPeriod>0 sec</schedulingPeriod>
<schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
<yieldDuration>1 sec</yieldDuration>
</config>
<executionNodeRestricted>false</executionNodeRestricted>
<name>EvaluateJsonPath</name>
<relationships>
<autoTerminate>true</autoTerminate>
<name>failure</name>
</relationships>
<relationships>
<autoTerminate>false</autoTerminate>
<name>matched</name>
</relationships>
<relationships>
<autoTerminate>true</autoTerminate>
<name>unmatched</name>
</relationships>
<state>RUNNING</state>
<style/>
<type>org.apache.nifi.processors.standard.EvaluateJsonPath</type>
</processors>
</snippet>
<timestamp>01/30/2020 12:03:02 UTC</timestamp>
</template>
You need to provide your Twitter credentials, the CDSW token and URL in the processors requiring additional configuration. Then you can start the flow.
Looking at the queue at the very end of the above flow, there is a split where I am routing images with greater than 75% confidence away from the rest of the predictions.
In this example I am just looking at dogs, so let's see what the results look like...
For this tweet, we get the following prediction:
{"predictions": [{"probability": 0.9260697364807129, "label": "Boston_bull"}, {"probability": 0.06926842778921127, "label": "French_bulldog"}, {"probability": 0.0018951991805806756, "label": "toy_terrier"}, {"probability": 0.0010464458027854562, "label": "boxer"}, {"probability": 0.0003390783385839313, "label": "Staffordshire_bullterrier"}]}
Following is the image:
Not too shabby for a phase one prediction pipeline!
That is all for this walkthrough. Check back for the next article which will build on this to incorporate Flink for streaming analysis of our results as well as the text from the tweet to gain more insights.
Hope you enjoyed this article and see you again soon.