About TimothySpann

TimothySpann · ‎06-29-2018

Working with Infura.io

Calling REST APIs is very easy with Apache NiFi, so let's use it to ingest a lot of data about Ethereum blockchains and transactions. We can also ingest and examine network status data. Infura provides secure, reliable, scalable access to Ethereum and IPFS. Check out the current status: https://infura.io/status

Apache NiFi Flows to Read From the Ethereum Blockchain via Infura REST APIs

An Example REST Call

We rename the file to make it unique and to note that it is from the ethbtcfull API call.

Example Data (JSON)

{"base": "ETH", "quote": "BTC", "tickers": [{"bid": 0.0750827, "ask": 0.07530999, "volume": 5159.53235584, "timestamp": 1528924408, "exchange": "bitstamp"}, {"bid": 0.07519, "ask": 0.0752, "volume": 1938.0570499, "timestamp": 1528924408, "exchange": "gemini"}, {"bid": 0.07515, "ask": 0.07516, "volume": 10310.62442822, "timestamp": 1528924408, "exchange": "gdax"}, {"bid": 0.075208, "ask": 0.075226, "volume": 45253.638, "timestamp": 1528924409, "exchange": "hitbtc"}, {"bid": 0.075156, "ask": 0.07517, "volume": 25932.60119467, "timestamp": 1528924409, "exchange": "bitfinex"}, {"bid": 0.07503287, "ask": 0.0751326, "volume": 6713.29011055, "timestamp": 1528924409, "exchange": "exmo"}, {"bid": 0.075141, "ask": 0.075191, "volume": 136176.55, "timestamp": 1528924409, "exchange": "binance"}, {"bid": 0.073687, "ask": 0.07566, "volume": 99.58262408, "timestamp": 1528924409, "exchange": "quoine"}, {"bid": 0.074975, "ask": 0.075251, "volume": 698.857662, "timestamp": 1528924409, "exchange": "cex"}, {"bid": 0.07494, "ask": 0.07517503, "volume": 12079.03438486, "timestamp": 1528924409, "exchange": "livecoin"}, {"bid": 0.07421999, "ask": 0.07574378, "volume": 106.69757, "timestamp": 1528924410, "exchange": "btc_markets"}]}

Calling INFURA REST APIs

Note: For most use cases you do not need an API key. Make sure you stay under their limits and follow all of their terms of service.
API Calls

https://api.infura.io/v2/blacklist
https://api.infura.io/v1/ticker/ethbtc/full
https://api.infura.io/v1/ticker/ethbtc
https://api.infura.io/v1/ticker/symbols

Format The Files

Example Expression Language

${filename:append('infurasymbols.'):append(${now():format('yyyyMMddHHmmss'):append(${md5}):append('.json')})}

References:

https://blog.infura.io/getting-started-with-infura-28e41844cc89
https://infura.io/
https://infura.io/docs
https://infura.io/status
https://infura.docs.apiary.io/
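The renaming step above can be mirrored outside NiFi. The following is only a sketch of the same idea in Python, assuming the intent is original name, a label, a timestamp, an MD5 digest of the content, and a .json suffix. The function name unique_filename and its parameters are hypothetical, and the digest is computed here rather than read from an md5 attribute.

```python
import hashlib
from datetime import datetime


def unique_filename(original: str, content: bytes, when: datetime) -> str:
    """Append a label, a timestamp, and an MD5 digest to the original name.

    Hypothetical helper: it imitates the renaming idea, not NiFi itself.
    """
    # Year, month, day, hour, minute, second, with no separators.
    stamp = when.strftime("%Y%m%d%H%M%S")
    # Stands in for the md5 attribute that the flow is assumed to provide.
    digest = hashlib.md5(content).hexdigest()
    return f"{original}infurasymbols.{stamp}{digest}.json"


if __name__ == "__main__":
    body = b'{"base": "ETH", "quote": "BTC"}'
    print(unique_filename("ticker", body, datetime(2018, 6, 13, 21, 13, 28)))
```

Because the digest depends on the content, two responses fetched in the same second still get different names unless their bodies are identical.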

TimothySpann · ‎06-16-2018

Using Apache MXNet GluonCV with Apache NiFi for Deep Learning Computer Vision

Source: https://github.com/tspannhw/OpenSourceComputerVision/

Gluon and Apache MXNet have been great for deep learning, especially for newbies like me. It got even better! They added a Deep Learning Toolkit that is easy to use and has a number of great pre-trained models that you can easily apply to general computer vision use cases. So I took a simple, well-documented example and tweaked it to save the final image and send some JSON details via MQTT to Apache NiFi.

This may sound familiar: https://community.hortonworks.com/articles/198912/ingesting-apache-mxnet-gluon-deep-learning-results.html

GluonCV makes this even easier! Let's check it out. Again, let's take a simple Python example, tweak it, run it via a shell script, and send the results over MQTT.

See: https://gluon-cv.mxnet.io/build/examples_detection/demo_ssd.html#sphx-glr-build-examples-detection-demo-ssd-py

Python Code: https://github.com/tspannhw/UsingGluonCV/tree/master

This is the Saved Annotated Figure

Simple Apache NiFi Flow to Ingest MQTT Data from the GluonCV example Python and Store to Hive, Parquet and HBase.

A simple flow:

ConsumeMQTT
InferAvroSchema
RouteOnContent
MergeRecord (convert batches of JSON to a single Avro file)
ConvertAvroToORC
PutHDFS
PutParquet
PutHBaseRecord

Again, Apache NiFi generates a schema for us by examining the data. There's a really cool project coming out of New Jersey that does advanced schema generation by looking at tables; I'll report on that later. We take the schema, save it to Schema Registry, and are ready to merge records. One thing you may want to do is change regular types from "type": "string" to "type": ["string","null"].
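The nullable-type tweak mentioned above can be applied by hand in the Schema Registry editor, but it is also easy to script. This is a minimal sketch, assuming the inferred schema is a flat Avro record whose field types are plain strings. The helper name make_nullable is hypothetical.

```python
import json


def make_nullable(schema: dict) -> dict:
    """Return a copy of an Avro record schema whose primitive fields accept null.

    Only fields whose type is a plain string (for example "string" or "int")
    are rewritten; fields that are already unions or nested types are kept.
    """
    fields = []
    for field in schema["fields"]:
        field = dict(field)  # shallow copy so the input schema is not modified
        if isinstance(field["type"], str):
            field["type"] = [field["type"], "null"]
        fields.append(field)
    return {**schema, "fields": fields}


if __name__ == "__main__":
    inferred = {
        "type": "record",
        "name": "gluoncv",
        "fields": [
            {"name": "host", "type": "string"},
            {"name": "battery", "type": "int"},
            {"name": "cpu", "type": "double"},
        ],
    }
    print(json.dumps(make_nullable(inferred), indent=2))
```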
Schema

{ "type": "record", "name": "gluoncv", "fields": [
{ "name": "imgname", "type": "string", "doc": "Type inferred from '\"images/gluoncv_image_20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512.jpg\"'" },
{ "name": "host", "type": "string", "doc": "Type inferred from '\"HW13125.local\"'" },
{ "name": "shape", "type": "string", "doc": "Type inferred from '\"(1, 3, 512, 910)\"'" },
{ "name": "end", "type": "string", "doc": "Type inferred from '\"1529094800.88097\"'" },
{ "name": "te", "type": "string", "doc": "Type inferred from '\"2.4256367683410645\"'" },
{ "name": "battery", "type": "int", "doc": "Type inferred from '100'" },
{ "name": "systemtime", "type": "string", "doc": "Type inferred from '\"06/15/2018 16:33:20\"'" },
{ "name": "cpu", "type": "double", "doc": "Type inferred from '23.2'" },
{ "name": "diskusage", "type": "string", "doc": "Type inferred from '\"112000.8 MB\"'" },
{ "name": "memory", "type": "double", "doc": "Type inferred from '65.8'" },
{ "name": "id", "type": "string", "doc": "Type inferred from '\"20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512\"'" }
] }

Example JSON

{"imgname": "images/gluoncv_image_20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad.jpg", "host": "HW13125.local", "shape": "(1, 3, 512, 910)", "end": "1529094976.237143", "te": "1.8907802104949951", "battery": 100, "systemtime": "06/15/2018 16:36:16", "cpu": 29.3, "diskusage": "112008.6 MB", "memory": 66.5, "id": "20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad"}

Table Generated

CREATE EXTERNAL TABLE IF NOT EXISTS gluoncv (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING) STORED AS ORC LOCATION '/gluoncv'

Parquet Table

create external table gluoncv_parquet (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING) STORED AS PARQUET LOCATION '/gluoncvpar'

Reference:
https://gluon-cv.mxnet.io/
https://gluon-cv.mxnet.io/build/examples_detection/index.html
https://medium.com/apache-mxnet/gluoncv-deep-learning-toolkit-for-computer-vision-9218a907e8da
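The table definitions in this post line up field by field with the inferred Avro schema. To make that mapping concrete, here is a small sketch that derives a similar statement from a schema dictionary. It is not how NiFi produces its DDL; the helper name hive_ddl and the type table are assumptions for illustration, and every column name is wrapped in backticks so that names such as end do not clash with Hive keywords.

```python
# Assumed mapping from Avro primitive types to Hive column types.
AVRO_TO_HIVE = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def hive_ddl(schema: dict, location: str, stored_as: str = "ORC") -> str:
    """Build a CREATE EXTERNAL TABLE statement from a flat Avro record schema."""
    columns = []
    for field in schema["fields"]:
        avro_type = field["type"]
        if isinstance(avro_type, list):
            # For a union such as ["string", "null"], use the non-null member.
            avro_type = next(t for t in avro_type if t != "null")
        columns.append(f"`{field['name']}` {AVRO_TO_HIVE[avro_type]}")
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {schema['name']} "
        f"({', '.join(columns)}) STORED AS {stored_as} LOCATION '{location}'"
    )


if __name__ == "__main__":
    schema = {
        "type": "record",
        "name": "gluoncv",
        "fields": [
            {"name": "host", "type": "string"},
            {"name": "end", "type": "string"},
            {"name": "battery", "type": ["int", "null"]},
            {"name": "cpu", "type": "double"},
        ],
    }
    print(hive_ddl(schema, "/gluoncv"))
    print(hive_ddl(schema, "/gluoncvpar", stored_as="PARQUET"))
```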

TimothySpann · ‎06-15-2018

Adding Parquet Output

https://cwiki.apache.org/confluence/display/Hive/Parquet

create external table gluon2_parquet (top1pct STRING, top2pct STRING, top3pct STRING, top4pct STRING, top5pct STRING, top1 STRING, top2 STRING, top3 STRING, top4 STRING, top5 STRING, imgname STRING, host STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING) STORED AS PARQUET LOCATION '/gluon2par'

select * from gluon2_parquet

Add the PutParquet Processor

TimothySpann · ‎06-14-2018

We are analyzing this Unsplash picture: https://raw.githubusercontent.com/tspannhw/DWS-DeepLearning-CrashCourse/master/photo1.jpg

TimothySpann · ‎05-26-2018

Integrating Keras (TensorFlow) YOLOv3 Into Apache NiFi Workflows

For this article I wanted to try the new YOLOv3 that's running in Keras. It works out of the box with video streaming, which is pretty cool:

git clone https://github.com/qqwweee/keras-yolo3
wget https://pjreddie.com/media/files/yolov3.weights
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
python yolo.py

See: https://github.com/qqwweee/keras-yolo3

My article on the original Darknet YOLO v3 is here: https://community.hortonworks.com/articles/191259/integrating-darknet-yolov3-into-apache-nifi-workfl.html

Example JSON Output

{ "boxes" : "Found 8 boxes for img", "class7" : "diningtable", "score7" : "0.7484486", "left7" : "636", "right7" : "1096", "top7" : "210", "bottom7" : "693", "class6" : "sofa", "score6" : "0.31372178", "left6" : "1114", "right6" : "1276", "top6" : "172", "bottom6" : "381", "class5" : "chair", "score5" : "0.3455438", "left5" : "990", "right5" : "1048", "top5" : "183", "bottom5" : "246", "class4" : "chair", "score4" : "0.34554565", "left4" : "858", "right4" : "1000", "top4" : "186", "bottom4" : "244", "class3" : "chair", "score3" : "0.87056005", "left3" : "1114", "right3" : "1276", "top3" : "172", "bottom3" : "381", "class2" : "chair", "score2" : "0.9683409", "left2" : "958", "right2" : "1151", "top2" : "203", "bottom2" : "482", "class1" : "cup", "score1" : "0.49115792", "left1" : "691", "right1" : "770", "top1" : "84", "bottom1" : "229", "class0" : "person", "score0" : "0.9980049", "left0" : "187", "right0" : "709", "top0" : "2", "bottom0" : "720", "host" : "HW13125.local.fios-router.home", "end" : "1527443620.644303", "te" : "4.380625247955322", "battery" : 100, "systemtime" : "05/27/2018 13:53:40", "cpu" : 19.7, "diskusage" : "140607.7 MB", "memory" : 65.8, "yoloid" : "20180527175341_284c6051-a45d-4757-977a-fe1dd76f295b" }

I created a YOLO ID so that the JSON and the image would carry the same ID, keeping them in sync as we send them across various distributed networks into a cluster
for storage. I also include some other metadata I thought would be helpful, such as date and time, run time, battery available, disk usage, CPU and memory. Those are quick and easy to grab with Python and could be useful.

This is running on a Mac laptop. It could be ported to the NVIDIA Jetson TX1. It may work on the RPI3 with Movidius, but I think it may be a touch slow. Even on a Mac with no GPU and some other stuff running, I am getting an image produced every 2-3 seconds. It's very choppy; I would like to try this on an Ubuntu workstation with a few high-end NVIDIA GPUs and TensorFlow compiled for GPU.

NiFi Flow for Ingest of JSON and Images

NiFi Server to process and store data in Hive and HBase

Read JSON from the Logs Directory Written by Python 3

Ingest Images from Images Directory Written by YOLO v3

TensorFlow Analysis of Captured Image

You can watch the data arrive

Writing Data to HBase is Easy (Just create an HBase table with a column family)

Apache NiFi generates my Hive table

Once we have a generated table, it is populated with our data and we can query it in Apache Zeppelin (or any JDBC/ODBC tool).

Using InferAvroSchema we had a schema created; we store it in Hortonworks Schema Registry for use.
Sample of Data Stored in HBase

20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:boxes, timestamp=1527456044351, value=Found 9 boxes for img
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class0, timestamp=1527456044351, value=person
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class1, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class2, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class3, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class4, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class5, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class6, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class7, timestamp=1527456044351, value=sofa
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class8, timestamp=1527456044351, value=diningtable

Flow (Local, Server) and Zeppelin Notebook:

yolo-keras-json-server-save.xml
keras-tensorflow-yolo-v3-osx.xml
yolo-copy.json

Forked Python Code For Saving JSON and Images

https://github.com/tspannhw/yolo3-keras-tensorflow

Coming Soon: a live recording of the YOLO v3 with Keras/TensorFlow capture stream.
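The JSON in this post stores each detection as numbered keys (class0, score0, left0 and so on), which suits flat Hive and HBase columns but is awkward to loop over. As an illustration only, the sketch below regroups such a record into one entry per detection. The function name group_detections and the output layout are assumptions, not part of the forked code.

```python
import re

# Matches keys such as "class0", "score12", "left3"; other keys are ignored.
_KEY = re.compile(r"^(class|score|left|right|top|bottom)(\d+)$")


def group_detections(record: dict) -> list:
    """Regroup numbered keys from a flat YOLO record into a list of detections."""
    grouped = {}
    for key, value in record.items():
        match = _KEY.match(key)
        if match is None:
            continue  # metadata such as host, cpu, yoloid
        name, index = match.group(1), int(match.group(2))
        grouped.setdefault(index, {})[name] = value

    detections = []
    for index in sorted(grouped):
        box = grouped[index]
        detections.append(
            {
                "label": box.get("class"),
                "score": float(box["score"]) if "score" in box else None,
                # Box order chosen for this sketch: left, top, right, bottom.
                "box": tuple(
                    int(box[side])
                    for side in ("left", "top", "right", "bottom")
                    if side in box
                ),
            }
        )
    return detections


if __name__ == "__main__":
    sample = {
        "boxes": "Found 2 boxes for img",
        "class1": "cup", "score1": "0.49115792",
        "left1": "691", "right1": "770", "top1": "84", "bottom1": "229",
        "class0": "person", "score0": "0.9980049",
        "left0": "187", "right0": "709", "top0": "2", "bottom0": "720",
        "host": "HW13125.local.fios-router.home",
    }
    for detection in group_detections(sample):
        print(detection)
```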

TimothySpann · ‎05-30-2018

Thanks for the information. I'll try fastText.

TimothySpann · ‎05-25-2018

One thing we are missing is language detection; maybe we can use Apache Tika or Apache OpenNLP to try that. Also, we should probably add attributes to let you specify exactly which models to use for Organization, Location, Name, and Dates.

TimothySpann · ‎05-21-2018

The build output is in /nifi/nifi-toolkit/nifi-toolkit-assembly/target. Copy nifi-toolkit-1.7.0-SNAPSHOT-bin.zip from there to somewhere convenient and unzip it to run.

TimothySpann · ‎05-15-2018

Integrating Darknet YOLOv3 Into Apache NiFi Workflows

Darknet has released a new version of YOLO, version 3. This one is faster and perhaps more accurate. It's new and shiny and I had to try it. I am liking the results.

Flow to Execute Script

We call the shell script, then I route out the empty results. I use SplitText to split the output into individual lines. I use ExtractText with ([^:]+):(.*) to split each line into a name and value pair.

We also want to process the images produced by YOLOv3. We grab the newest ones from the output directory. I also add metadata extraction and TensorFlow analysis. This data is stored in attributes and can be saved independently by using AttributesToJSON to build a new flow file that we save off separately, probably converting it into Apache ORC and storing it in HDFS for Apache Hive querying. The image file can be stored in the cloud or another file system, sent to a front end, or saved in HDFS. Or even emailed to someone.

The parsed YOLOv3 results in Apache NiFi Attributes.

As you can see, we would grab labelvalue.1 and labelvalue.2 to do our processing. We may want to send this to JMS or MQTT or Apache Kafka for further display in an application or dashboard.

This is an example of the result of our ExtractText.

This is the output that we parse with Apache NiFi.

YOLOv3 also generates an image with rectangles and labels.

YOLOv3 does some great classification on multiple items in a picture.

I use Python to capture an image from my webcam via OpenCV2. I wrap my call in a shell script that captures the image, sends it to Darknet's build of YOLOv3, and sends errors to /dev/null. If you have a good GPU, you can compile with CUDA and OPENCV to do real-time detection off a webcam.

Example Output:

/Volumes/seagate/models/darknet-master/images/yolo_image_img_20180514183707.jpg: Predicted in 26.351510 seconds.
cell phone: 72%
chair: 78%
chair: 72%
chair: 59%
person: 100%
chair: 83%

Example Run:

./darknet detect cfg/yolov3.cfg cfg/yolov3.weights /Volumes/seagate/StrataNYC2018/kafka.jpg

Source: https://github.com/tspannhw/nifi-yolo3/tree/master

Reference:

See: https://github.com/pjreddie/darknet
See: https://pjreddie.com/darknet/yolo/
Download the training weights and data (https://pjreddie.com/media/files/yolov3.weights)
See: https://pjreddie.com/media/files/papers/YOLOv3.pd

@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}
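The line splitting and the ([^:]+):(.*) pattern described in this post can be tried outside NiFi before building the flow. The sketch below applies the same regular expression to Darknet-style output in Python. The function name parse_darknet_output is hypothetical, and whitespace trimming is an extra step added here rather than something the NiFi processors are known to do.

```python
import re

# Same pattern as in the post: everything before the first colon is the name,
# everything after it is the value.
PAIR = re.compile(r"([^:]+):(.*)")


def parse_darknet_output(text: str) -> list:
    """Split Darknet console output into (name, value) pairs, one per line."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # comparable to routing out empty results
        match = PAIR.fullmatch(line)
        if match is None:
            continue  # line has no colon, so it is not a name/value pair
        pairs.append((match.group(1).strip(), match.group(2).strip()))
    return pairs


if __name__ == "__main__":
    output = """images/yolo_image_img_20180514183707.jpg: Predicted in 26.351510 seconds.
cell phone: 72%
chair: 78%
person: 100%
"""
    for name, value in parse_darknet_output(output):
        print(name, "->", value)
```

Note that the first line of the output also matches the pattern, so the image path and the prediction time come through as a pair alongside the labels.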

TimothySpann · ‎05-16-2018

To Build YourApiKeyToken

1. Create an account at Etherscan.io and confirm it.
2. Log in.
3. Go to My Account.
4. Click Developers.
5. Click Create Api Key.
6. Add an appname.
7. Copy the API key to the end of the URL in NiFi.

Last Visited	‎05-20-2024 05:42 PM

Member Since	‎01-07-2019 11:58 AM
Posts	1,973
Kudos received	1122

Cloudera Community

Re: Has anyone tried NiFi consuming (JMSConsume) f...

Re: NiFi Crash after runing chain of lookups

Re: Recommend approach for listening to RSS Feed i...

Re: NiFi ListenFTP Processor Default Data Port

Re: Nifi: Kafka Producer with Avro format in both ...

Ingesting Infura REST APIs to Access the Ethereum ...

Using Apache MXNet GluonCV with Apache NiFi for De...

Re: Ingesting Apache MXNet Gluon Deep Learning Res...

Re: Executing TensorFlow Classifications from Apac...

Integrating Keras (TensorFlow) YOLOv3 Into Apache ...

Re: Detecting Language with Apache NiFi

Re: Updating The Apache OpenNLP Community Apache N...

Re: DevOps Tips: Using the Apache NiFi Toolkit wi...

Integrating Darknet YOLOv3 Into Apache NiFi Workfl...

Re: Ethereum: Accessing Feeds from Etherscan on ...