07-11-2018
02:24 PM
2 Kudos
Capture Images from PicSum.com Free Images
Process All the Images via TensorFlow Processor, SSD Predict via MMS and SqueezeNet v1.1 via MMS
Apache Zeppelin SQL Against tblsqueeze11
Example Output from Squeeze v1.1
Storing Generic Data in HDFS via Schema
Example SSD Data JSON
High Level Flow From Server
Apache NiFi Server Flows to Store
Convert to Apache ORC
Extract Attributes
Convert JSON Arrays to Other
Example Data Derived From TensorFlow Processor
Schemas in Schema Registry
Create Table in Zeppelin
Query Table in Zeppelin
Python Libraries
git clone https://github.com/awslabs/mxnet-model-server.git
pip install opencv-python -U
pip install scikit-learn -U
pip install easydict -U
pip install scikit-image -U
pip install numpy -U
pip install mxnet -U
pip3.6 install opencv-python -U
pip3.6 install scikit-learn -U
pip3.6 install easydict -U
pip3.6 install scikit-image -U
pip3.6 install numpy -U
pip3.6 install mxnet -U
Example Runs - Squeeze v1.1
mxnet-model-server --models squeezenet=squeezenet_v1.1.model --service mms/model_service/mxnet_vision_service.py --port 9999
[INFO 2018-07-10 16:50:26,840 PID:7730 /usr/local/lib/python3.6/site-packages/mms/request_handler/flask_handler.py:jsonify:159] Jsonifying the response: {'prediction': [[{'probability': 0.3365139067173004, 'class': 'n03710193 mailbox, letter box'}, {'probability': 0.1522996574640274, 'class': 'n03764736 milk can'}, {'probability': 0.08760709315538406, 'class': 'n03000134 chainlink fence'}, {'probability': 0.08103135228157043, 'class': 'n02747177 ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin'}, {'probability': 0.04956872761249542, 'class': 'n02795169 barrel, cask'}]]}
[INFO 2018-07-10 16:50:26,842 PID:7730 /usr/local/lib/python3.6/site-packages/werkzeug/_internal.py:_log:88] 127.0.0.1 - - [10/Jul/2018 16:50:26] "POST /squeezenet/predict HTTP/1.1" 200 -
[INFO 2018-07-10 16:50:46,904 PID:7730 /usr/local/lib/python3.6/site-packages/mms/serving_frontend.py:predict_callback:467] Request input: data should be image with jpeg format.
[INFO 2018-07-10 16:50:46,960 PID:7730 /usr/local/lib/python3.6/site-packages/mms/request_handler/flask_handler.py:get_file_data:137] Getting file data from request.
[INFO 2018-07-10 16:50:47,020 PID:7730 /usr/local/lib/python3.6/site-packages/mms/serving_frontend.py:predict_callback:510] Response is text.
[INFO 2018-07-10 16:50:47,020 PID:7730 /usr/local/lib/python3.6/site-packages/mms/request_handler/flask_handler.py:jsonify:159] Jsonifying the response: {'prediction': [[{'probability': 0.1060439869761467, 'class': 'n02536864 coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch'}, {'probability': 0.06582894921302795, 'class': 'n01930112 nematode, nematode worm, roundworm'}, {'probability': 0.05008145794272423, 'class': 'n01751748 sea snake'}, {'probability': 0.03847070038318634, 'class': 'n01737021 water snake'}, {'probability': 0.03614763543009758, 'class': 'n09229709 bubble'}]]}
[INFO 2018-07-10 16:50:47,021 PID:7730 /usr/local/lib/python3.6/site-packages/werkzeug/_internal.py:_log:88] 127.0.0.1 - - [10/Jul/2018 16:50:47] "POST /squeezenet/predict HTTP/1.1" 200 -
mxnet-model-server --models SSD=resnet50_ssd_model.model --service ssd_service.py --port 9998
Apache MXNet Model Server Model Zoo
https://github.com/awslabs/mxnet-model-server/blob/master/docs/model_zoo.md
Connect to MMS
/opt/demo/curl.sh
curl -X POST http://127.0.0.1:9998/SSD/predict -F "data=@$1" 2>/dev/null
/opt/demo/curl2.sh
curl -X POST http://127.0.0.1:9999/squeezenet/predict -F "data=@$1" 2>/dev/null
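The two curl scripts above can also be expressed in Python when the call needs to live inside a larger script. This is a minimal sketch using only the standard library, not code from the original flow: the helper names `build_multipart`, `predict` and `top_class` are hypothetical, the default URL is the SqueezeNet endpoint from curl2.sh, and the reply shape is assumed to match the `{'prediction': [[...]]}` structure shown in the server log.

```python
import json
import uuid
import urllib.request


def build_multipart(field, filename, payload):
    """Encode one file field as multipart/form-data, like curl -F "data=@file"."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, url="http://127.0.0.1:9999/squeezenet/predict"):
    """POST an image to the model server and return the decoded JSON reply."""
    with open(image_path, "rb") as handle:
        body, content_type = build_multipart("data", image_path, handle.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def top_class(reply):
    """Return (label, probability) for the most likely class in a reply."""
    best = max(reply["prediction"][0], key=lambda item: item["probability"])
    return best["class"], best["probability"]
```

With the server from the example run listening on port 9999, `top_class(predict("some.jpg"))` would return the highest-probability label for that image; pointing `url` at `http://127.0.0.1:9998/SSD/predict` targets the SSD model instead, although its reply layout is not shown here and may differ.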
Flows
mxnetserverlocal.xml
mxnetmodelserver.xml
Reference
https://community.hortonworks.com/articles/155435/using-the-new-mxnet-model-server.html
https://community.hortonworks.com/articles/177232/apache-deep-learning-101-processing-apache-mxnet-m.html
https://mxnet.incubator.apache.org/model_zoo/
https://medium.com/apache-mxnet/mxnet-1-2-adds-built-in-support-for-onnx-e2c7450ffc28
https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html
https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data
https://github.com/onnx/models
https://github.com/awslabs/mxnet-model-server/blob/master/docs/model_zoo.md#lstm-ptb
https://github.com/awslabs/mxnet-model-server/blob/master/docs/model_zoo.md#arcface-resnet100_onnx
07-19-2018
11:12 AM
Great post. Another option is to use Google's public blockchain dataset with NiFi: http://datamater.io/2018/07/19/fetching-bitcoin-transactions-with-apache-nifi/
06-29-2018
09:20 PM
1 Kudo
Working with Infura.io
Calling REST APIs is very easy with Apache NiFi, so let's use it to ingest a lot of data about Ethereum blockchains and transactions. We can also ingest and examine network status data. Infura provides secure, reliable, scalable access to Ethereum and IPFS. Check out the current status: https://infura.io/status
Apache NiFi Flows to Read From the Ethereum Blockchain via Infura REST APIs
An Example REST Call
We rename the file to make it unique and to note that it comes from the ethbtcfull API call.
Example Data (JSON)
{"base": "ETH", "quote": "BTC", "tickers": [{"bid": 0.0750827, "ask": 0.07530999, "volume": 5159.53235584, "timestamp": 1528924408, "exchange": "bitstamp"}, {"bid": 0.07519, "ask": 0.0752, "volume": 1938.0570499, "timestamp": 1528924408, "exchange": "gemini"}, {"bid": 0.07515, "ask": 0.07516, "volume": 10310.62442822, "timestamp": 1528924408, "exchange": "gdax"}, {"bid": 0.075208, "ask": 0.075226, "volume": 45253.638, "timestamp": 1528924409, "exchange": "hitbtc"}, {"bid": 0.075156, "ask": 0.07517, "volume": 25932.60119467, "timestamp": 1528924409, "exchange": "bitfinex"}, {"bid": 0.07503287, "ask": 0.0751326, "volume": 6713.29011055, "timestamp": 1528924409, "exchange": "exmo"}, {"bid": 0.075141, "ask": 0.075191, "volume": 136176.55, "timestamp": 1528924409, "exchange": "binance"}, {"bid": 0.073687, "ask": 0.07566, "volume": 99.58262408, "timestamp": 1528924409, "exchange": "quoine"}, {"bid": 0.074975, "ask": 0.075251, "volume": 698.857662, "timestamp": 1528924409, "exchange": "cex"}, {"bid": 0.07494, "ask": 0.07517503, "volume": 12079.03438486, "timestamp": 1528924409, "exchange": "livecoin"}, {"bid": 0.07421999, "ask": 0.07574378, "volume": 106.69757, "timestamp": 1528924410, "exchange": "btc_markets"}]}
Calling INFURA REST APIs
Note: For most use cases you do not need an API key. Make sure you stay under their limits and follow all of their terms of service.
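Once a ticker document like the example data above has been ingested, a small script can summarize it. The sketch below is only an illustration: the function name `best_bid_ask` is hypothetical, and the input is assumed to have the `tickers` array with `bid`, `ask` and `exchange` keys seen in the sample. The `sample` string is a three-exchange excerpt of that sample.

```python
import json


def best_bid_ask(payload):
    """Return ((exchange, highest bid), (exchange, lowest ask)) for a ticker document."""
    tickers = json.loads(payload)["tickers"]
    top_bid = max(tickers, key=lambda t: t["bid"])
    low_ask = min(tickers, key=lambda t: t["ask"])
    return (top_bid["exchange"], top_bid["bid"]), (low_ask["exchange"], low_ask["ask"])


# Excerpt of the example data: three of the exchanges from the ETH/BTC ticker call.
sample = json.dumps({
    "base": "ETH",
    "quote": "BTC",
    "tickers": [
        {"bid": 0.0750827, "ask": 0.07530999, "exchange": "bitstamp"},
        {"bid": 0.07519, "ask": 0.0752, "exchange": "gemini"},
        {"bid": 0.07515, "ask": 0.07516, "exchange": "gdax"},
    ],
})

bid, ask = best_bid_ask(sample)
```

The same logic could run inside NiFi through a scripting processor, or afterwards against the stored JSON files.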
API Calls
https://api.infura.io/v2/blacklist
https://api.infura.io/v1/ticker/ethbtc/full
https://api.infura.io/v1/ticker/ethbtc
https://api.infura.io/v1/ticker/symbols
Format The Files
Example Expression Language
${filename:append('infurasymbols.'):append(${now():format('yyyyMMddHHmmss'):append(${md5}):append('.json')})}
References:
https://blog.infura.io/getting-started-with-infura-28e41844cc89
https://infura.io/
https://infura.io/docs
https://infura.io/status
https://infura.docs.apiary.io/
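The file-renaming Expression Language in this post appends a fixed tag, a timestamp, a hash and an extension to the incoming filename. For checking the resulting names outside NiFi, the same naming scheme can be sketched in Python. The function name `unique_name` and its arguments are hypothetical, and `digest` stands in for whatever value the `md5` attribute holds in the flow.

```python
from datetime import datetime


def unique_name(original, digest, tag="infurasymbols.", now=None):
    """Build original + tag + yyyyMMddHHmmss timestamp + digest + '.json'."""
    # %Y%m%d%H%M%S is year, month, day, hour (24h), minute, second.
    stamp = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    return f"{original}{tag}{stamp}{digest}.json"


example = unique_name("symbols", "d41d8cd9", now=datetime(2018, 6, 29, 21, 20, 5))
```

Passing a fixed `now` makes the result reproducible, which helps when comparing against files already written by the flow.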
06-16-2018
02:38 PM
2 Kudos
Using Apache MXNet GluonCV with Apache NiFi for Deep Learning Computer Vision
Source: https://github.com/tspannhw/OpenSourceComputerVision/
Gluon and Apache MXNet have been great for deep learning, especially for newbies like me. It got even better! They added a deep learning toolkit, GluonCV, that is easy to use and ships a number of pre-trained models for common computer vision use cases. So I took a simple, well-documented example and tweaked it to save the final image and send some JSON details via MQTT to Apache NiFi. This may sound familiar: https://community.hortonworks.com/articles/198912/ingesting-apache-mxnet-gluon-deep-learning-results.html
GluonCV makes this even easier! Let's check it out. Again, let's take a simple Python example, tweak it, run it via a shell script, and send the results over MQTT.
See: https://gluon-cv.mxnet.io/build/examples_detection/demo_ssd.html#sphx-glr-build-examples-detection-demo-ssd-py
Python Code: https://github.com/tspannhw/UsingGluonCV/tree/master
This is the Saved Annotated Figure
Simple Apache NiFi Flow to Ingest MQTT Data from the GluonCV Example Python and Store to Hive, Parquet and HBase
A simple flow:
ConsumeMQTT
InferAvroSchema
RouteOnContent
MergeRecord (convert batches of JSON to a single Avro file)
ConvertAvroToORC
PutHDFS
PutParquet
PutHBaseRecord
Again, Apache NiFi generates a schema for us by examining the data. There's a really cool project coming out of New Jersey that does advanced schema generation by looking at tables; I'll report on that later. We take the generated schema, save it to Schema Registry, and are ready to merge records. One thing you may want to do is make the regular types nullable, changing "type": "string" to "type": ["string","null"].
Schema
{
"type": "record",
"name": "gluoncv",
"fields": [
{
"name": "imgname",
"type": "string",
"doc": "Type inferred from '\"images/gluoncv_image_20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512.jpg\"'"
},
{
"name": "host",
"type": "string",
"doc": "Type inferred from '\"HW13125.local\"'"
},
{
"name": "shape",
"type": "string",
"doc": "Type inferred from '\"(1, 3, 512, 910)\"'"
},
{
"name": "end",
"type": "string",
"doc": "Type inferred from '\"1529094800.88097\"'"
},
{
"name": "te",
"type": "string",
"doc": "Type inferred from '\"2.4256367683410645\"'"
},
{
"name": "battery",
"type": "int",
"doc": "Type inferred from '100'"
},
{
"name": "systemtime",
"type": "string",
"doc": "Type inferred from '\"06/15/2018 16:33:20\"'"
},
{
"name": "cpu",
"type": "double",
"doc": "Type inferred from '23.2'"
},
{
"name": "diskusage",
"type": "string",
"doc": "Type inferred from '\"112000.8 MB\"'"
},
{
"name": "memory",
"type": "double",
"doc": "Type inferred from '65.8'"
},
{
"name": "id",
"type": "string",
"doc": "Type inferred from '\"20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512\"'"
}
]
}
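The nullable-type tweak suggested just before the schema can be scripted rather than edited by hand. This is a minimal sketch, assuming the schema is a flat Avro record like the one above whose field types are plain strings; the function name `make_nullable` is hypothetical.

```python
import json


def make_nullable(schema):
    """Return a copy of an Avro record schema whose plain field types also allow null."""
    fields = []
    for field in schema["fields"]:
        field = dict(field)  # copy so the input schema is left untouched
        if isinstance(field["type"], str) and field["type"] != "null":
            field["type"] = [field["type"], "null"]
        fields.append(field)
    return {**schema, "fields": fields}


schema = {
    "type": "record",
    "name": "gluoncv",
    "fields": [
        {"name": "host", "type": "string"},
        {"name": "battery", "type": "int"},
    ],
}

nullable = make_nullable(schema)
nullable_json = json.dumps(nullable, indent=1)
```

The union order follows the article, `["string","null"]`. If a field should also default to null, Avro requires `"null"` to be the first type in the union, so the order would need to be swapped in that case.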
Example JSON
{"imgname": "images/gluoncv_image_20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad.jpg", "host": "HW13125.local", "shape": "(1, 3, 512, 910)", "end": "1529094976.237143", "te": "1.8907802104949951", "battery": 100, "systemtime": "06/15/2018 16:36:16", "cpu": 29.3, "diskusage": "112008.6 MB", "memory": 66.5, "id": "20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad"}
Table Generated
CREATE EXTERNAL TABLE IF NOT EXISTS gluoncv (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING) STORED AS ORC LOCATION '/gluoncv'
Parquet Table
create external table gluoncv_parquet (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING) STORED AS PARQUET LOCATION '/gluoncvpar'
Reference:
https://gluon-cv.mxnet.io/
https://gluon-cv.mxnet.io/build/examples_detection/index.html
https://medium.com/apache-mxnet/gluoncv-deep-learning-toolkit-for-computer-vision-9218a907e8da
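The two table definitions in this post follow directly from the Avro schema, so the mapping can be sketched in a few lines of Python. This is an illustration, not the code NiFi uses to generate its DDL: the `AVRO_TO_HIVE` table covers only primitive types, the function name `hive_ddl` is hypothetical, and every column name is backtick-quoted so that a reserved word such as `end` is safe in Hive.

```python
AVRO_TO_HIVE = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def hive_ddl(schema, location, storage="ORC"):
    """Build a CREATE EXTERNAL TABLE statement from a flat Avro record schema."""
    columns = []
    for field in schema["fields"]:
        avro_type = field["type"]
        if isinstance(avro_type, list):
            # Nullable union such as ["string", "null"]: use the non-null member.
            avro_type = next(t for t in avro_type if t != "null")
        columns.append(f"`{field['name']}` {AVRO_TO_HIVE[avro_type]}")
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {schema['name']} "
        f"({', '.join(columns)}) STORED AS {storage} LOCATION '{location}'"
    )


schema = {
    "type": "record",
    "name": "gluoncv",
    "fields": [
        {"name": "end", "type": "string"},
        {"name": "battery", "type": "int"},
        {"name": "cpu", "type": ["double", "null"]},
    ],
}

ddl = hive_ddl(schema, "/gluoncv")
```

Calling `hive_ddl(schema, "/gluoncvpar", storage="PARQUET")` gives the Parquet variant of the same table.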
06-15-2018
04:11 PM
Adding Parquet Output
https://cwiki.apache.org/confluence/display/Hive/Parquet
create external table gluon2_parquet (top1pct STRING, top2pct STRING, top3pct STRING, top4pct STRING, top5pct STRING, top1 STRING, top2 STRING, top3 STRING, top4 STRING, top5 STRING,
imgname STRING, host STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS PARQUET
LOCATION '/gluon2par'
select * from gluon2_parquet
Add the PutParquet Processor
06-14-2018
09:15 PM
We are analyzing this unsplash picture https://raw.githubusercontent.com/tspannhw/DWS-DeepLearning-CrashCourse/master/photo1.jpg
05-26-2018
10:55 PM
3 Kudos
Integrating Keras (TensorFlow) YOLOv3 Into Apache NiFi Workflows
For this article I wanted to try the new YOLOv3 that's running in Keras. It works out of the box with video streaming, which is pretty cool:
git clone https://github.com/qqwweee/keras-yolo3
wget https://pjreddie.com/media/files/yolov3.weights
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
python yolo.py
See: https://github.com/qqwweee/keras-yolo3
My article on the original Darknet YOLO v3 is here: https://community.hortonworks.com/articles/191259/integrating-darknet-yolov3-into-apache-nifi-workfl.html
Example JSON Output
{
"boxes" : "Found 8 boxes for img",
"class7" : "diningtable",
"score7" : "0.7484486",
"left7" : "636",
"right7" : "1096",
"top7" : "210",
"bottom7" : "693",
"class6" : "sofa",
"score6" : "0.31372178",
"left6" : "1114",
"right6" : "1276",
"top6" : "172",
"bottom6" : "381",
"class5" : "chair",
"score5" : "0.3455438",
"left5" : "990",
"right5" : "1048",
"top5" : "183",
"bottom5" : "246",
"class4" : "chair",
"score4" : "0.34554565",
"left4" : "858",
"right4" : "1000",
"top4" : "186",
"bottom4" : "244",
"class3" : "chair",
"score3" : "0.87056005",
"left3" : "1114",
"right3" : "1276",
"top3" : "172",
"bottom3" : "381",
"class2" : "chair",
"score2" : "0.9683409",
"left2" : "958",
"right2" : "1151",
"top2" : "203",
"bottom2" : "482",
"class1" : "cup",
"score1" : "0.49115792",
"left1" : "691",
"right1" : "770",
"top1" : "84",
"bottom1" : "229",
"class0" : "person",
"score0" : "0.9980049",
"left0" : "187",
"right0" : "709",
"top0" : "2",
"bottom0" : "720",
"host" : "HW13125.local.fios-router.home",
"end" : "1527443620.644303",
"te" : "4.380625247955322",
"battery" : 100,
"systemtime" : "05/27/2018 13:53:40",
"cpu" : 19.7,
"diskusage" : "140607.7 MB",
"memory" : 65.8,
"yoloid" : "20180527175341_284c6051-a45d-4757-977a-fe1dd76f295b"
}
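The flat record above, with one numbered set of keys per detected box and a shared ID, can be assembled along these lines. This is a sketch under stated assumptions, not the forked YOLO code itself: the function names and the input list of detection dictionaries are hypothetical, and the ID format (a timestamp, an underscore, then a UUID) is inferred from the sample `yoloid`. UTC is assumed for the timestamp part.

```python
import uuid
from datetime import datetime, timezone


def make_yolo_id(now=None):
    """Build an ID such as 20180527175341_<uuid> to share between JSON and image."""
    now = now or datetime.now(timezone.utc)  # UTC is an assumption
    return f"{now.strftime('%Y%m%d%H%M%S')}_{uuid.uuid4()}"


def flatten(detections, yolo_id):
    """Turn a list of detections into one flat row with class0, score0, ... keys."""
    row = {"boxes": f"Found {len(detections)} boxes for img"}
    for index, detection in enumerate(detections):
        for key in ("class", "score", "left", "right", "top", "bottom"):
            # Values are stored as strings, matching the example output.
            row[f"{key}{index}"] = str(detection[key])
    row["yoloid"] = yolo_id
    return row


detections = [
    {"class": "person", "score": 0.9980049, "left": 187, "right": 709, "top": 2, "bottom": 720},
    {"class": "cup", "score": 0.49115792, "left": 691, "right": 770, "top": 84, "bottom": 229},
]
row = flatten(detections, make_yolo_id())
```

Using the same ID in the image filename is what lets the image and its JSON be matched up again after they travel through separate flows.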
I created a YOLO ID so that the JSON and the image share the same ID, which keeps them in sync as we send them across various distributed networks into a cluster for storage. I also capture some other metadata I thought would be helpful, such as date and time, run time, battery available, disk usage, CPU and memory. Those are quick and easy to grab with Python and could be useful.
This is running on a Mac laptop. It could be ported to the NVIDIA Jetson TX1. It may work on the RPI3 with Movidius, but I think it may be a touch slow. Even on a Mac with no GPU and some other stuff running, I am getting an image produced every 2-3 seconds. It's very choppy; I would like to try this on an Ubuntu workstation with a few high-end NVIDIA GPUs and TensorFlow compiled for GPU.
NiFi Flow for Ingest of JSON and Images
NiFi Server to process and store data in Hive and HBase
Read JSON from the Logs Directory Written by Python 3
Ingest Images from Images Directory Written by YOLO v3
TensorFlow Analysis of Captured Image
You can watch the data arrive
To Write Data to HBase is Easy (Just create an HBase table with a column family)
Apache NiFi generates my Hive table
Once we have a generated table, it is populated with our data and we can query it in Apache Zeppelin (or any JDBC/ODBC tool)
Using InferAvroSchema we had a schema created; we store it in Hortonworks Schema Registry for use.
Sample of Data Stored in HBase
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:boxes, timestamp=1527456044351, value=Found 9 boxes for img
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class0, timestamp=1527456044351, value=person
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class1, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class2, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class3, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class4, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class5, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class6, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class7, timestamp=1527456044351, value=sofa
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class8, timestamp=1527456044351, value=diningtable
Flow (Local, Server) and Zeppelin Notebook:
yolo-keras-json-server-save.xml
keras-tensorflow-yolo-v3-osx.xml
yolo-copy.json
Forked Python Code For Saving JSON and Images
https://github.com/tspannhw/yolo3-keras-tensorflow
Coming Soon: a live recording of the YOLO v3 with Keras/TensorFlow capture stream.
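To spot-check rows like the HBase sample shown in this post, the shell-style lines can be folded back into one dictionary per row key. This is a small sketch with a hypothetical `parse_scan` helper; the regular expression assumes the `rowkey column=family:qualifier, timestamp=..., value=...` layout seen in the sample.

```python
import re
from collections import defaultdict

# Row key, then column=family:qualifier, timestamp=<digits>, value=<rest of line>
ROW_PATTERN = re.compile(
    r"^(\S+)\s+column=([^:]+):([^,]+), timestamp=(\d+), value=(.*)$"
)


def parse_scan(lines):
    """Group scan output lines into {row_key: {qualifier: value}}."""
    rows = defaultdict(dict)
    for line in lines:
        match = ROW_PATTERN.match(line.strip())
        if match:
            row_key, _family, qualifier, _timestamp, value = match.groups()
            rows[row_key][qualifier] = value
    return dict(rows)


lines = [
    "20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:boxes, timestamp=1527456044351, value=Found 9 boxes for img",
    "20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class0, timestamp=1527456044351, value=person",
]
rows = parse_scan(lines)
```

Lines that do not match the pattern are skipped, so shell headers or summary lines mixed into the input do no harm.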
05-25-2018
07:48 PM
One thing we are missing is language detection; maybe we could use Apache Tika or Apache OpenNLP for that. We should also probably add attributes that let you specify exactly which models to use for Organization, Location, Name and Dates.
05-21-2018
08:13 PM
The build output is in /nifi/nifi-toolkit/nifi-toolkit-assembly/target. Copy nifi-toolkit-1.7.0-SNAPSHOT-bin.zip from there to somewhere convenient and unzip it to run.