11-05-2018
03:23 PM
3 Kudos
In preparation for my talk at the Philadelphia Open Source Conference (https://phillyopensource.splashthat.com/), Apache Deep Learning 201, I wanted some good images for running various Apache MXNet GluonCV deep learning algorithms for computer vision. See: https://gluon-cv.mxnet.io/

Using Apache open source tools - Apache NiFi 1.8 and Apache MXNet 1.3 with GluonCV - I can easily ingest live traffic camera images and run Object Detection, Semantic Segmentation and Instance Segmentation.

Code: https://github.com/tspannhw/ApacheDeepLearning201

It's so easy that I am running multiple models on the data to see which gives me the results I like. YOLO is one of my old favorites.
yolonifitraffic.py
demo_mask_rcnn_nyc.py
demo_deeplab_nyc.py

So we can find the cars and people in these webcams. Use cases include traffic optimization, public safety and advertisement optimization. Due to licensing, I thought it better not to show traffic camera data here.

To industrialize and scale out this process from a single data scientist to a national ingestion system, we use the power of Apache NiFi to ingest, process and control flows. I am using the latest Apache NiFi 1.8.

Apache NiFi Flow to Ingest and Process Traffic Camera Data

First we have a list of URLs that I want to process; this list can be sourced and stored anywhere. For ease of use with a static set I am using GenerateFlowFile. I have a JSON file of URLs that I split and parse to call various computer vision Python scripts (DeepLab v3, Mask RCNN, YOLO and others). YOLO seems to be the most useful so far. I grab the results, some system metrics, metadata and the deep learning analytics generated by Apache MXNet. I split the flow into two portions: one builds GluonCV result data from YOLO, and the other creates a file from TensorFlow results computed on the fly.

Here is a list of my webcam URLs; there are millions of them out there. If your data is tabular, then you need a schema for fast record processing.

An Example Dataset Returned from GluonCV - YOLO Python 3.6 Script

I turn the JSON data into HDFS-writable Avro data and can run live SQL on it.

Object Detection: GluonCV YOLO v3 and Apache NiFi

The input can be OpenCV, a static photo or a URL.

Object Detection: Faster RCNN with GluonCV

Faster RCNN model trained on the Pascal VOC dataset with a ResNet-50 backbone:

net = gcv.model_zoo.get_model('faster_rcnn_resnet50_v1b_voc', pretrained=True)

https://gluon-cv.mxnet.io/api/model_zoo.html

Instance Segmentation: Mask RCNN with GluonCV

Mask RCNN model trained on the COCO dataset with a ResNet-50 backbone:

net = model_zoo.get_model('mask_rcnn_resnet50_v1b_coco', pretrained=True)

https://gluon-cv.mxnet.io/build/examples_instance/demo_mask_rcnn.html
https://github.com/matterport/Mask_RCNN
https://arxiv.org/abs/1703.06870

Photo by Ryoji Iwata on Unsplash. There are a lot of people crossing the street!

Semantic Segmentation: DeepLabV3 with GluonCV

GluonCV DeepLabV3 model trained on the ADE20K dataset:

model = gluoncv.model_zoo.get_model('deeplab_resnet101_ade', pretrained=True)

run1.sh demo_deeplab_webcam.py

This runs pretty slowly on a machine with no GPU.

https://www.cityscapes-dataset.com/
http://groups.csail.mit.edu/vision/datasets/ADE20K/
https://arxiv.org/abs/1706.05587
https://gluon-cv.mxnet.io/build/examples_segmentation/demo_deeplab.html

That is the best picture of me ever!

Semantic Segmentation: Fully Convolutional Networks

GluonCV FCN model trained on the PASCAL VOC dataset:

model = gluoncv.model_zoo.get_model('fcn_resnet101_voc', pretrained=True)

run1.sh demo_fcn_webcam.py

https://gluon-cv.mxnet.io/build/examples_segmentation/demo_fcn.html
https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

It found me. For NYC DOT and PennDOT camera usage, you have to sign a developer agreement for a feed! See: http://www.nyc.gov/html/dot/html/home/home.shtml and https://www.penndot.gov/Pages/default.aspx
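All of these detection demos share the same post-processing step: keep only the classes of interest (here, cars and people) above a confidence threshold. A minimal sketch in plain Python; the detection tuples below are hypothetical stand-ins for the class/score/box arrays a GluonCV model returns:

```python
# Filter raw detections down to the classes we care about (cars, people).
# Each detection is a (class_name, confidence, bounding_box) tuple; in the
# real scripts these values come from running net(x) on a GluonCV model.

def filter_detections(detections, wanted_classes, min_confidence=0.5):
    """Keep detections of wanted classes above the confidence threshold."""
    return [d for d in detections
            if d[0] in wanted_classes and d[1] >= min_confidence]

detections = [
    ("car",    0.91, (10, 20, 110, 80)),
    ("person", 0.47, (200, 40, 230, 120)),   # below threshold, dropped
    ("dog",    0.88, (50, 60, 90, 100)),     # not a wanted class, dropped
    ("person", 0.76, (300, 30, 330, 115)),
]

kept = filter_detections(detections, {"car", "person"})
print([d[0] for d in kept])  # ['car', 'person']
```

In the webcam flow, the kept detections are what get serialized to JSON and shipped to Apache NiFi.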
10-18-2018
09:27 PM
2 Kudos
Simple Apache NiFi Operations Dashboard - Part 2

Part 1: https://community.cloudera.com/t5/Community-Articles/Building-a-Custom-Apache-NiFi-Operations-Dashboard-Part-1/ta-p/249060

To access data to display in our dashboard we will use some Spring Boot 2.0.6 Java 8 microservices to call Apache Hive 3.1.0 tables in HDP 3.0 on Hadoop 3.1. We will have our web site hosted and make REST calls to Apache NiFi, our microservices, YARN and other APIs.

As you can see, we can easily incorporate data from HDP 3 - Apache Hive 3.1.0 - in Spring Boot Java applications without much trouble. You can see the Maven build script (all code is in GitHub). Our motivation is to put all this data somewhere and show it in a dashboard that can use REST APIs for data access and updates. We may choose to use Apache NiFi for all REST APIs, or do some in Apache NiFi; we are still exploring. We may also decide to change the backend to HBase 2.0, Phoenix, Druid or a combination. We will see.

Spring Boot 2.0.6 Loading JSON Output

Spring Boot Microservices and UI

https://github.com/tspannhw/operations-dashboard

To start, I have a simple web page that calls one of the REST APIs. The microservice can run on YARN 3.1, Kubernetes, Cloud Foundry, OpenShift or any machine that can run a simple Java 8 jar. We can have this HTML as part of a larger dashboard or hosted anywhere.

For Parsing the Monitoring Data

We have schemas for Metrics, Status and Bulletins.
Now that the monitoring data is in Apache Hive, I can query it with ease in Apache Zeppelin (or any JDBC/ODBC tool).

Apache Zeppelin Screens

We have a lot of reporting tasks for monitoring NiFi. We read from NiFi and send to NiFi; it would be nice to have a dedicated reporting cluster.

Just Show Me Bulletins for MonitorMemory (you can see that in Reporting Tasks)

NiFi Query to Limit Which Bulletins We Are Storing in Hive (for now, just grab errors)

Spring Boot Code for REST APIs

Metrics REST API Results

Bulletin REST API Results

Metrics Home Page

Run the Microservice

java -Xms512m -Xmx2048m -Dhdp.version=3.0.0 -Djava.net.preferIPv4Stack=true -jar target/operations-0.0.1-SNAPSHOT.jar

Maven POM

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.dataflowdeveloper</groupId>
<artifactId>operations</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>operations</name>
<description>Apache Hive Operations Spring Boot</description>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.0.5.RELEASE</version>
<relativePath/>
</parent>
<properties>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-logging</artifactId>
</exclusion>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-tomcat</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-log4j2</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>3.1.0</version>
<exclusions>
<exclusion>
<groupId>org.eclipse.jetty.aggregate</groupId>
<artifactId>*</artifactId>
</exclusion>
<exclusion>
<artifactId>servlet-api</artifactId>
<groupId>javax.servlet</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.restdocs</groupId>
<artifactId>spring-restdocs-mockmvc</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>spring-releases</id>
<url>https://repo.spring.io/libs-release</url>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>spring-releases</id>
<url>https://repo.spring.io/libs-release</url>
</pluginRepository>
</pluginRepositories>
</project>

With some help from the Internet, we have a simple JavaScript that reads the Spring Boot /metrics REST API and fills in some values.

HTML and JavaScript (see src/main/resources/static/index.html)

<h1>Metrics</h1>
<div id="output" name="output" style="align: center; overflow:auto; height:400px; width:800px" class="white-frame">
<ul id="metrics"></ul>
</div>
<script language="javascript">
var myList = document.querySelector('ul');
var myRequest = new Request('./metrics/');
fetch(myRequest).then(function(response) {
  return response.json();
}).then(function(data) {
  for (var i = 0; i < data.length; i++) {
    var listItem = document.createElement('li');
    listItem.innerHTML = '<strong>Timestamp: ' + data[i].timestamp + '</strong>' +
      ' Flow Files Received: ' + data[i].flowfilesreceivedlast5minutes +
      ' JVM Heap Usage: ' + data[i].jvmheap_usage +
      ' Threads Waiting: ' + data[i].jvmthread_statestimed_waiting +
      ' Thread Count: ' + data[i].jvmthread_count +
      ' Total Task Duration: ' + data[i].totaltaskdurationnanoseconds +
      ' Bytes Read Last 5 min: ' + data[i].bytesreadlast5minutes +
      ' Flow Files Queued: ' + data[i].flowfilesqueued +
      ' Bytes Queued: ' + data[i].bytesqueued;
    myList.appendChild(listItem);
  }
});
</script>

Resources

https://github.com/tspannhw/operations-dashboard
https://community.hortonworks.com/articles/177256/spring-boot-20-on-acid-integrating-rest-microservi.html
https://community.hortonworks.com/articles/207858/more-devops-for-hdf-apache-nifi-and-friends.html
https://pierrevillard.com/2017/05/16/monitoring-nifi-ambari-grafana/

Example API Calls to Spring Boot

http://localhost:8090/status/Update
http://localhost:8090/bulletin/error
http://localhost:8090/metrics/

TODO: We will add more calls directly to the REST APIs of Apache NiFi clusters for display in our dashboard.

NiFi REST APIs of Interest

/nifi-api/flow/process-groups/root/status
/nifi-api/resources
/flow/cluster/summary
/nifi-api/flow/process-groups/root
/nifi-api/site-to-site
/nifi-api/flow/bulletin-board
/flow/history?offset=1&count=100
/nifi-api/flow/search-results?q=NiFi+Operations
/nifi-api/flow/status
/flow/process-groups/root/controller-services
/nifi-api/flow/process-groups/root/status
/nifi-api/system-diagnostics
10-18-2018
08:38 PM
4 Kudos
Simple Apache NiFi Operations Dashboard
This is an evolving work in progress; please get involved, everything is open source. @milind pandit and I are working on a project to build something useful for teams to analyze their flows, see current cluster state, start and stop flows, and have a rich at-a-glance dashboard.
There's a lot of data provided by Apache NiFi and related tools to aggregate, sort, categorize, search and eventually do machine learning analytics on.
There are a lot of tools that come out of the box that solve parts of these problems. Ambari Metrics, Grafana and Log Search provide a ton of data and analysis abilities. You can find all your errors easily in Log Search and see nice graphs of what is going on in Ambari Metrics and Grafana.
What is cool with Apache NiFi is that it has Site-to-Site reporting tasks for sending all the provenance, analytics, metrics and operational data you need wherever you want it - including to another Apache NiFi! This is Monitoring Driven Development (MDD).
Monitoring Driven Development (MDD)
MDD - https://pierrevillard.com/2018/08/29/monitoring-driven-development-with-nifi-1-7/
In this little proof of concept, we grab some of these flows, process them in Apache NiFi and then store them in Apache Hive 3 tables for analytics. We should probably push the data to HBase for aggregates and Druid for time series. We will see as this expands.
There are also other data access options including the NiFi REST API and the NiFi Python APIs.
Bootstrap Notifier
Send a notification when NiFi starts, stops or dies unexpectedly
Two OOTB notifications
Email notification service
HTTP notification service
It’s easy to write a custom notification service
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#notification_services
Reporting Tasks
AmbariReportingTask (global, per process group)
MonitorDiskUsage(Flowfile, content, provenance repositories)
MonitorMemory
MonitorActivity
See:
https://nipyapi.readthedocs.io/en/latest/readme.html
https://community.hortonworks.com/articles/177301/big-data-devops-apache-nifi-flow-versioning-and-au.html
These are especially useful for doing things like purging connections.
Purge it!
nipyapi.canvas.purge_connection(con_id)
nipyapi.canvas.purge_process_group(process_group, stop=False)
nipyapi.canvas.delete_process_group(process_group, force=True, refresh=True)
Use Cases
Example Metrics Data
[ {
"appid" : "nifi",
"instanceid" : "7c84501d-d10c-407c-b9f3-1d80e38fe36a",
"hostname" : "#.#.hortonworks.com",
"timestamp" : 1539411679652,
"loadAverage1min" : 0.93,
"availableCores" : 16,
"FlowFilesReceivedLast5Minutes" : 14,
"BytesReceivedLast5Minutes" : 343779,
"FlowFilesSentLast5Minutes" : 0,
"BytesSentLast5Minutes" : 0,
"FlowFilesQueued" : 59952,
"BytesQueued" : 294693938,
"BytesReadLast5Minutes" : 241681,
"BytesWrittenLast5Minutes" : 398753,
"ActiveThreads" : 2,
"TotalTaskDurationSeconds" : 273,
"TotalTaskDurationNanoSeconds" : 273242860763,
"jvmuptime" : 224997,
"jvmheap_used" : 5.15272616E8,
"jvmheap_usage" : 0.9597700387239456,
"jvmnon_heap_usage" : -5.1572632E8,
"jvmthread_statesrunnable" : 11,
"jvmthread_statesblocked" : 2,
"jvmthread_statestimed_waiting" : 26,
"jvmthread_statesterminated" : 0,
"jvmthread_count" : 242,
"jvmdaemon_thread_count" : 125,
"jvmfile_descriptor_usage" : 0.0709,
"jvmgcruns" : null,
"jvmgctime" : null
} ]
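The metrics records above are plain JSON, so pulling out the fields the dashboard cares about is straightforward. A minimal sketch (field names taken from the sample record; the record itself is abbreviated here, and the 90% heap threshold is just an illustrative choice):

```python
import json

# Abbreviated version of the NiFi metrics record shown above.
raw = '''[ {
  "appid" : "nifi",
  "timestamp" : 1539411679652,
  "FlowFilesQueued" : 59952,
  "BytesQueued" : 294693938,
  "jvmheap_usage" : 0.9597700387239456,
  "jvmthread_count" : 242
} ]'''

records = json.loads(raw)
for rec in records:
    # Flag nodes whose JVM heap usage is above 90% - likely candidates
    # for a MonitorMemory bulletin.
    heap_pct = rec["jvmheap_usage"] * 100
    if heap_pct > 90:
        print(f'{rec["appid"]}: heap at {heap_pct:.1f}%, '
              f'{rec["FlowFilesQueued"]} flowfiles queued')
```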
Example Status Data
{
"statusId" : "a63818fe-dbd2-44b8-af53-eaa27fd9ef05",
"timestampMillis" : "2018-10-18T20:54:38.218Z",
"timestamp" : "2018-10-18T20:54:38.218Z",
"actorHostname" : "#.#.hortonworks.com",
"componentType" : "RootProcessGroup",
"componentName" : "NiFi Flow",
"parentId" : null,
"platform" : "nifi",
"application" : "NiFi Flow",
"componentId" : "7c84501d-d10c-407c-b9f3-1d80e38fe36a",
"activeThreadCount" : 1,
"flowFilesReceived" : 1,
"flowFilesSent" : 0,
"bytesReceived" : 1661,
"bytesSent" : 0,
"queuedCount" : 18,
"bytesRead" : 0,
"bytesWritten" : 1661,
"bytesTransferred" : 16610,
"flowFilesTransferred" : 10,
"inputContentSize" : 0,
"outputContentSize" : 0,
"queuedContentSize" : 623564,
"activeRemotePortCount" : null,
"inactiveRemotePortCount" : null,
"receivedContentSize" : null,
"receivedCount" : null,
"sentContentSize" : null,
"sentCount" : null,
"averageLineageDuration" : null,
"inputBytes" : null,
"inputCount" : 0,
"outputBytes" : null,
"outputCount" : 0,
"sourceId" : null,
"sourceName" : null,
"destinationId" : null,
"destinationName" : null,
"maxQueuedBytes" : null,
"maxQueuedCount" : null,
"queuedBytes" : null,
"backPressureBytesThreshold" : null,
"backPressureObjectThreshold" : null,
"isBackPressureEnabled" : null,
"processorType" : null,
"averageLineageDurationMS" : null,
"flowFilesRemoved" : null,
"invocations" : null,
"processingNanos" : null
}
Example Failure Data
[ {
"objectId" : "34c3249c-4a42-41ce-b94e-3563409ad55b",
"platform" : "nifi",
"project" : null,
"bulletinId" : 28321,
"bulletinCategory" : "Log Message",
"bulletinGroupId" : "0b69ea51-7afb-32dd-a7f4-d82b936b37f9",
"bulletinGroupName" : "Monitoring",
"bulletinLevel" : "ERROR",
"bulletinMessage" : "QueryRecord[id=d0258284-69ae-34f6-97df-fa5c82402ef3] Unable to query StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to Failed to read next record in stream for StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to -40: org.apache.nifi.processor.exception.ProcessException: Failed to read next record in stream for StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to -40",
"bulletinNodeAddress" : null,
"bulletinNodeId" : "91ab706b-5d92-454e-bc7a-6911d155fdca",
"bulletinSourceId" : "d0258284-69ae-34f6-97df-fa5c82402ef3",
"bulletinSourceName" : "QueryRecord",
"bulletinSourceType" : "PROCESSOR",
"bulletinTimestamp" : "2018-10-18T20:54:39.179Z"
} ]
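Since for now we only store error bulletins in Hive, the same filter is easy to express in Python when testing a flow. A sketch with two hypothetical, heavily abbreviated bulletins:

```python
import json

# Two hypothetical bulletins; in the flow this JSON arrives from the
# SiteToSite bulletin reporting task.
raw = '''[
  {"bulletinId": 28321, "bulletinLevel": "ERROR",
   "bulletinSourceName": "QueryRecord"},
  {"bulletinId": 28322, "bulletinLevel": "INFO",
   "bulletinSourceName": "LogAttribute"}
]'''

bulletins = json.loads(raw)
# Keep only ERROR-level bulletins, mirroring the NiFi query that limits
# which bulletins we store in Hive.
errors = [b for b in bulletins if b["bulletinLevel"] == "ERROR"]
print(errors)
```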
Apache Hive 3 Tables
CREATE EXTERNAL TABLE IF NOT EXISTS failure (statusId STRING, timestampMillis BIGINT, `timestamp` STRING, actorHostname STRING, componentType STRING, componentName STRING, parentId STRING, platform STRING, `application` STRING, componentId STRING, activeThreadCount BIGINT, flowFilesReceived BIGINT, flowFilesSent BIGINT, bytesReceived BIGINT, bytesSent BIGINT, queuedCount BIGINT, bytesRead BIGINT, bytesWritten BIGINT, bytesTransferred BIGINT, flowFilesTransferred BIGINT, inputContentSize BIGINT, outputContentSize BIGINT, queuedContentSize BIGINT, activeRemotePortCount BIGINT, inactiveRemotePortCount BIGINT, receivedContentSize BIGINT, receivedCount BIGINT, sentContentSize BIGINT, sentCount BIGINT, averageLineageDuration BIGINT, inputBytes BIGINT, inputCount BIGINT, outputBytes BIGINT, outputCount BIGINT, sourceId STRING, sourceName STRING, destinationId STRING, destinationName STRING, maxQueuedBytes BIGINT, maxQueuedCount BIGINT, queuedBytes BIGINT, backPressureBytesThreshold BIGINT, backPressureObjectThreshold BIGINT, isBackPressureEnabled STRING, processorType STRING, averageLineageDurationMS BIGINT, flowFilesRemoved BIGINT, invocations BIGINT, processingNanos BIGINT) STORED AS ORC
LOCATION '/failure';
CREATE EXTERNAL TABLE IF NOT EXISTS bulletin (objectId STRING, platform STRING, project STRING, bulletinId BIGINT, bulletinCategory STRING, bulletinGroupId STRING, bulletinGroupName STRING, bulletinLevel STRING, bulletinMessage STRING, bulletinNodeAddress STRING, bulletinNodeId STRING, bulletinSourceId STRING, bulletinSourceName STRING, bulletinSourceType STRING, bulletinTimestamp STRING) STORED AS ORC
LOCATION '/error';
CREATE EXTERNAL TABLE IF NOT EXISTS memory (objectId STRING, platform STRING, project STRING, bulletinId BIGINT, bulletinCategory STRING, bulletinGroupId STRING, bulletinGroupName STRING, bulletinLevel STRING, bulletinMessage STRING, bulletinNodeAddress STRING, bulletinNodeId STRING, bulletinSourceId STRING, bulletinSourceName STRING, bulletinSourceType STRING, bulletinTimestamp STRING) STORED AS ORC
LOCATION '/memory';

-- backpressure
CREATE EXTERNAL TABLE IF NOT EXISTS status (statusId STRING, timestampMillis BIGINT, `timestamp` STRING, actorHostname STRING, componentType STRING, componentName STRING, parentId STRING, platform STRING, `application` STRING, componentId STRING, activeThreadCount BIGINT, flowFilesReceived BIGINT, flowFilesSent BIGINT, bytesReceived BIGINT, bytesSent BIGINT, queuedCount BIGINT, bytesRead BIGINT, bytesWritten BIGINT, bytesTransferred BIGINT, flowFilesTransferred BIGINT, inputContentSize BIGINT, outputContentSize BIGINT, queuedContentSize BIGINT, activeRemotePortCount BIGINT, inactiveRemotePortCount BIGINT, receivedContentSize BIGINT, receivedCount BIGINT, sentContentSize BIGINT, sentCount BIGINT, averageLineageDuration BIGINT, inputBytes BIGINT, inputCount BIGINT, outputBytes BIGINT, outputCount BIGINT, sourceId STRING, sourceName STRING, destinationId STRING, destinationName STRING, maxQueuedBytes BIGINT, maxQueuedCount BIGINT, queuedBytes BIGINT, backPressureBytesThreshold BIGINT, backPressureObjectThreshold BIGINT, isBackPressureEnabled STRING, processorType STRING, averageLineageDurationMS BIGINT, flowFilesRemoved BIGINT, invocations BIGINT, processingNanos BIGINT) STORED AS ORC
LOCATION '/status';
10-18-2018
08:28 PM
2 Kudos
As of today, there is only support for JDK 8 and JDK 9: https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.7.1. JDK 11 support is coming, but not today; it will not work.
10-12-2018
06:11 PM
8 Kudos
Running TensorFlow on YARN 3.1 with or without GPU
You have the option to run with or without Docker containers. If you are not using Docker containers you will need CUDA, TensorFlow and all your Data Science libraries.
See: https://community.hortonworks.com/articles/222242/running-apache-mxnet-deep-learning-on-yarn-31-hdp.html
Tips from Wangda
Basically, GPU on YARN gives you isolation at the GPU device level. Say a node has 4 GPUs. The first task arrives and asks for 1 GPU (yarn.io/gpu=1), and the YARN NodeManager gives that task GPU0. Then the second task arrives and asks for 2 GPUs, and the NodeManager gives it GPU1/GPU2. From the TensorFlow perspective you don't need to specify which GPUs to use; TF will automatically detect and consume whatever is available to the job. In this case, task 2 cannot see any GPUs other than GPU1/GPU2.
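Wangda's description boils down to a small allocator: the NodeManager hands out whole GPUs, and each container only ever sees the devices assigned to it. A toy simulation of that behavior (not YARN code, purely an illustration):

```python
class ToyGpuScheduler:
    """Toy model of YARN NodeManager GPU isolation: whole GPUs are
    assigned per container and never shared between containers."""

    def __init__(self, num_gpus):
        self.free = list(range(num_gpus))  # e.g. [0, 1, 2, 3]

    def allocate(self, num_requested):
        """Assign num_requested free GPUs; the container sees only these."""
        if num_requested > len(self.free):
            raise RuntimeError("not enough free GPUs on this node")
        assigned = self.free[:num_requested]
        self.free = self.free[num_requested:]
        return assigned

node = ToyGpuScheduler(num_gpus=4)
task1 = node.allocate(1)  # yarn.io/gpu=1 -> gets [0]
task2 = node.allocate(2)  # yarn.io/gpu=2 -> gets [1, 2]
print(task1, task2)       # [0] [1, 2]
```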
If you wish to run Apache MXNet deep learning programs, see this article: https://community.hortonworks.com/articles/222242/running-apache-mxnet-deep-learning-on-yarn-31-hdp.html
Installation
Install CUDA and Nvidia libraries if you have NVidia cards.
Install Python 3.x
Install Docker
Install PIP
sudo yum groupinstall 'Development Tools' -y
sudo yum install cmake git pkgconfig -y
sudo yum install libpng-devel libjpeg-turbo-devel jasper-devel openexr-devel libtiff-devel libwebp-devel -y
sudo yum install libdc1394-devel libv4l-devel gstreamer-plugins-base-devel -y
sudo yum install gtk2-devel -y
sudo yum install tbb-devel eigen3-devel -y
pip3.6 install --upgrade pip
pip3.6 install tensorflow
pip3.6 install numpy -U
pip3.6 install scikit-learn -U
pip3.6 install opencv-python -U
pip3.6 install keras
pip3.6 install hdfs
git clone https://github.com/tensorflow/models/
You can see a docker example: https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/Dockerfile.md
https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/TensorflowOnYarnTutorial.md
Run Command for an Example Classification
yarn jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command python3.6 -shell_args "/opt/demo/DWS-DeepLearning-CrashCourse/tf.py /opt/demo/images/photo1.jpg" -container_resources memory-mb=512,vcores=1
Without Docker
-container_resources memory-mb=3072,vcores=1,yarn.io/gpu=2
With Docker (Enable it first: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/dosg_enable_gpu_for_docker_ambari_cluster.html)
-shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
-shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=<docker-image-name> \
Running a More Complex Training Job
https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/RunTensorflowJobUsingNativeServiceSpec.md
This is the main example: https://github.com/tensorflow/models/tree/master/tutorials/image/cifar10_estimator
yarn jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command python3.6 -shell_args "/opt/demo/models/tutorials/image/cifar10_estimator/cifar10_main.py --data-dir=hdfs://default/tmp/cifar-10-data --job-dir=hdfs://default/tmp/cifar-10-jobdir --train-steps=10000 --eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=0" -container_resources memory-mb=512,vcores=1
Example Output
[hdfs@princeton0 DWS-DeepLearning-CrashCourse]$ python3.6 tf.py
2018-10-15 02:37:23.892791: W tensorflow/core/framework/op_def_util.cc:355] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
2018-10-15 02:37:24.181707: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
273 racer, race car, racing car 37.46013343334198%
274 sports car, sport car 25.35209059715271%
267 cab, hack, taxi, taxicab 11.118262261152267%
268 convertible 9.854312241077423%
271 minivan 3.2295159995555878%
Output Written to HDFS
hdfs dfs -ls /tfyarn
Found 1 items
-rw-r--r-- 3 root hdfs 457 2018-10-15 02:35 /tfyarn/tf_uuid_img_20181015023542.json
hdfs dfs -cat /tfyarn/tf_uuid_img_20181015023542.json
{"node_id273": "273", "humanstr273": "racer, race car, racing car", "score273": "37.46013343334198", "node_id274": "274", "humanstr274": "sports car, sport car", "score274": "25.35209059715271", "node_id267": "267", "humanstr267": "cab, hack, taxi, taxicab", "score267": "11.118262261152267", "node_id268": "268", "humanstr268": "convertible", "score268": "9.854312241077423", "node_id271": "271", "humanstr271": "minivan", "score271": "3.2295159995555878"}
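The JSON written to HDFS above is flat, with each prediction split across node_id/humanstr/score keys. A short sketch that reassembles it into ranked (label, score) pairs, using a truncated copy of the record shown:

```python
import json

# Truncated copy of the classification record written to /tfyarn above.
raw = '''{"node_id273": "273", "humanstr273": "racer, race car, racing car",
"score273": "37.46013343334198", "node_id274": "274",
"humanstr274": "sports car, sport car", "score274": "25.35209059715271"}'''

record = json.loads(raw)
# Collect the node ids, then pair each humanstr label with its score.
ids = sorted(k[len("node_id"):] for k in record if k.startswith("node_id"))
predictions = sorted(
    ((record["humanstr" + i], float(record["score" + i])) for i in ids),
    key=lambda p: p[1], reverse=True)
for label, score in predictions:
    print(f"{label}: {score:.2f}%")
```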
Full Source Code
https://github.com/tspannhw/TensorflowOnYARN
Resources
https://www.tensorflow.org/
https://github.com/tspannhw/ApacheDeepLearning101/blob/master/yarn.sh
https://github.com/hortonworks/hdp-assemblies/
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/configuring_gpu_scheduling_and_isolation.html
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/dosg_enable_gpu_for_docker_ambari_cluster.html
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/data-operating-system/content/dosg_recommendations_for_running_docker_containers_on_yarn.html
https://feathercast.apache.org/2018/10/02/deep-learning-on-yarn-running-distributed-tensorflow-mxnet-caffe-xgboost-on-hadoop-clusters-wangda-tan/
https://github.com/deep-diver/CIFAR10-img-classification-tensorflow
https://aajisaka.github.io/hadoop-document/hadoop-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/RunningDistributedCifar10TFJobs.html
https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289
https://github.com/tspannhw/ApacheDeepLearning101/blob/master/analyzehdfs.py
https://github.com/open-source-for-science/TensorFlow-Course
https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/RunTensorflowJobUsingHelperScript.md
Documentation
https://hadoop.apache.org/docs/r3.1.0/
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/options_distributed_shell_gpu.html
Coming Soon
https://github.com/leftnoteasy/hadoop-1/tree/submarine/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine
https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/68289
https://github.com/leftnoteasy/hadoop-1/blob/submarine/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/site/QuickStart.md
10-05-2018
05:51 PM
2 Kudos
Posting Images with Apache NiFi 1.7 and a Custom Processor

I have been using a shell script for this since Apache NiFi did not have a good way to natively post an image to HTTP servers such as the model server for Apache MXNet. So I wrote a quick and dirty processor that posts an image there, gathers the headers, result body, status text and status code, and returns them to you as attributes. In this example I am downloading images from the picsum.photos free photo service.

To use this new processor, download it to your lib directory and restart Apache NiFi; then you can add the PostImageProcessor.

Eclipse for Building My Processor

Configure the Post Image Processor with your URL, field name, image name and image type.

MXNet Model Server Results

The Attribute Results from the Data

Example Results

post.header
{Server=[Werkzeug/0.14.1 Python/3.6.6], Access-Control-Allow-Origin=[*], Content-Length=[396], Date=[Fri, 05 Oct 2018 17:47:22 GMT], Content-Type=[application/json]}
post.results
{"prediction":[[{"probability":0.24173378944396973,"class":"n02281406 sulphur butterfly, sulfur butterfly"},{"probability":0.19173663854599,"class":"n02190166 fly"},{"probability":0.052654966711997986,"class":"n02280649 cabbage butterfly"},{"probability":0.05147545784711838,"class":"n03485794 handkerchief, hankie, hanky, hankey"},{"probability":0.048753462731838226,"class":"n02834397 bib"}]]}
post.status
OK
post.statuscode
200

Results from HTTP Posting an Image to MXNet Model Server

[INFO 2018-10-05 13:47:22,217 PID:88561 /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/mms/serving_frontend.py:predict_callback:467] Request input: data should be image with jpeg format.
[INFO 2018-10-05 13:47:22,218 PID:88561 /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/mms/request_handler/flask_handler.py:get_file_data:137] Getting file data from request.
[INFO 2018-10-05 13:47:22,262 PID:88561 /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/mms/serving_frontend.py:predict_callback:510] Response is text.
[INFO 2018-10-05 13:47:22,262 PID:88561 /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/mms/request_handler/flask_handler.py:jsonify:159] Jsonifying the response: {'prediction': [[{'probability': 0.24173378944396973, 'class': 'n02281406 sulphur butterfly, sulfur butterfly'}, {'probability': 0.19173663854599, 'class': 'n02190166 fly'}, {'probability': 0.052654966711997986, 'class': 'n02280649 cabbage butterfly'}, {'probability': 0.05147545784711838, 'class': 'n03485794 handkerchief, hankie, hanky, hankey'}, {'probability': 0.048753462731838226, 'class': 'n02834397 bib'}]]}
[INFO 2018-10-05 13:47:22,263 PID:88561 /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/werkzeug/_internal.py:_log:88] 127.0.0.1 - - [05/Oct/2018 13:47:22] "POST /squeezenet/predict HTTP/1.1" 200 -

Example HTTP Server
https://github.com/awslabs/mxnet-model-server

Source Code for the Processor
https://github.com/tspannhw/nifi-postimage-processor

Pre-Built NAR to Install
https://github.com/tspannhw/nifi-postimage-processor/releases/tag/1.0
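On the NiFi side, the post.results attribute is just the model server's JSON body, so a downstream script can pull out the top prediction directly. A sketch using a truncated copy of the squeezenet response shown above:

```python
import json

# Truncated copy of the post.results attribute returned by the custom
# PostImageProcessor, i.e. the MXNet Model Server response body.
post_results = '''{"prediction":[[
  {"probability":0.24173378944396973,
   "class":"n02281406 sulphur butterfly, sulfur butterfly"},
  {"probability":0.19173663854599,"class":"n02190166 fly"}]]}'''

body = json.loads(post_results)
# prediction is a list of result lists (one per posted image);
# take the highest-probability class from the first image.
top = max(body["prediction"][0], key=lambda p: p["probability"])
print(top["class"])  # n02281406 sulphur butterfly, sulfur butterfly
```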
10-04-2018
01:33 PM
2 Kudos
Properties File Lookup Augmentation of Data Flow in Apache NiFi 1.7.x

A really cool technologist contacted me on LinkedIn and asked an interesting question: "Tim, how do I read values from a properties file and use them in my flow? I want to update/inject an attribute with this value."

If you don't want to use the Variable Registry but want to inject a value from a properties file, how do you do it? You could run some REST server and read it, or resort to some file-reading hack. But we have a great service to do this very easily.

In my UpdateAttribute (or in your regular attributes already), I have an attribute named keytofind. This contains a lookup key, such as an integer or a string key. We will find that value in the properties file and give it to you in an attribute of your choosing. We have a Controller Service to handle this for you; it reads from your specified properties file. Make sure Apache NiFi has permission to that path and can read the file.

PropertiesFileLookupService

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-lookup-services-nar/1.7.1/org.apache.nifi.lookup.PropertiesFileLookupService/index.html

We look up the key specified in "keytofind". It returns a value that you specify as an extra attribute; mine is "updatedvalue".

This is my properties file:

-rwxrwxrwx 1 tspann staff 67 Oct 4 09:15 lookup.properties
stuff1=value1
stuff2=value2
stuff3=value other
tim=spann
nifi=cool
In this example, we are using the LookupAttribute processor. You can also use the LookupRecord processor, depending on your needs. Resources:
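Under the hood the lookup is just key/value resolution against the properties file. A minimal Python sketch of what happens with the lookup.properties above (the parser is a simplified stand-in for NiFi's actual implementation, shown only to make the keytofind/updatedvalue behavior concrete):

```python
def parse_properties(text: str) -> dict:
    """Tiny .properties parser: one key=value per line, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Contents of lookup.properties from the article above.
lookup_properties = """\
stuff1=value1
stuff2=value2
stuff3=value other
tim=spann
nifi=cool
"""

props = parse_properties(lookup_properties)
keytofind = "nifi"                   # the attribute holding the lookup key
updatedvalue = props.get(keytofind)  # the attribute LookupAttribute writes back
print(updatedvalue)
```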
http://discover.attunity.com/apache-nifi-for-dummies-en-report-go-c-lp8558.html
https://community.hortonworks.com/articles/140231/data-flow-enrichment-with-nifi-part-2-lookupattrib.html
https://community.hortonworks.com/articles/189213/etl-with-lookups-with-apache-hbase-and-apache-nifi.html
The Flow: lookup-from-properties-values.xml
Labels:
09-28-2018
09:30 PM
This is an extension of this article: https://community.hortonworks.com/articles/163776/parsing-any-document-with-apache-nifi-15-with-apac.html
09-24-2018
06:55 PM
2 Kudos
Using GluonCV 0.3 with Apache MXNet 1.3
source code: https://github.com/tspannhw/nifi-gluoncv-yolo3
*Captured and Processed Image Available for Viewing in Stream in Apache NiFi 1.7.x
use case: I need to easily monitor the contents of my security vault. It holds a fixed number of known things. What we need in the real world is one or more good cameras (maybe four to eight, depending on the angles of the room), a device like an NVIDIA Jetson TX2, the MiniFi 0.5 Java Agent, JDK 8, Apache MXNet, GluonCV, lots of Python libraries, a network connection and a simple workflow. Outside of my vault, I will need a server or cluster to do the more advanced processing, though I could run it all on the local box. If the number of items, or certain items I am watching, are no longer in the frame, then we should send an immediate alert. That could be via SMS, email, Slack, an alerting system or other means. We have most of that implemented below. If anyone wants to build out the complete use case, I can assist.
demo implementation: I wanted to use the new YOLOv3 model, which is part of the new 0.3 stream, so I installed a 0.3 pre-release build. This may be final by the time you read this. You can try a regular pip3.6 install -U gluoncv and see what you get. pip3.6 install -U gluoncv==0.3.0b20180924
YOLOv3 is a great pretrained model to use for object detection. See: https://gluon-cv.mxnet.io/build/examples_detection/demo_yolo.html The GluonCV Model Zoo is very rich and incredibly easy to use. We just grab the model "yolo3_darknet53_voc" with an automatic one-time download and we are ready to go. They provide easy-to-customize code to start with. I write my processed image and JSON results out for ingest by Apache NiFi. You will notice this is similar to what we did for the Open Computer Vision talks: https://community.hortonworks.com/articles/198939/using-apache-mxnet-gluoncv-with-apache-nifi-for-de.html This is updated and even easier. I dropped MQTT and just output image files and some JSON to read.
GluonCV makes working with Computer Vision extremely clean and easy.
why Apache NiFi For Deep Learning Workflows
Let me count the top five ways:
#1 Provenance - lets me see everything, everywhere, all the time, with the data and the metadata.
#2 Configurable Queues - queues are everywhere and they are extremely configurable on size and priority. There's always backpressure and safety between every step. Sinks, sources and steps can be offline, as things happen in the real-world internet. Offline, online, wherever, I can recover and have full visibility into my flows as they spread between devices, servers, networks, clouds and nation-states.
#3 Security - secure at every level, with SSL and data encryption, plus integration with leading-edge tools including Apache Knox, Apache Ranger and Apache Atlas. See: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.1/bk_security/content/ch_enabling-knox-for-nifi.html
#4 UI - a simple UI to develop, monitor and manage incredibly complex flows, including IoT, Deep Learning, logs and every data source you can throw at it.
#5 Agents - MiniFi gives me two different agents for my devices or systems to stream data headless.
running the gluoncv yolo3 model
I wrap my Python script in a shell script to throw away warnings and junk: cd /Volumes/TSPANN/2018/talks/ApacheDeepLearning101/nifi-gluoncv-yolo3
python3.6 -W ignore /Volumes/TSPANN/2018/talks/ApacheDeepLearning101/nifi-gluoncv-yolo3/yolonifi.py 2>/dev/null
List of Possible Objects We Can Detect
["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow",
"diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
"tvmonitor"]
I am going to train this with my own data for the upcoming INTERNET OF BEER; for the vault use case, we would need pictures of your vault contents. See: https://gluon-cv.mxnet.io/build/examples_datasets/detection_custom.html#sphx-glr-build-examples-datasets-detection-custom-py
Example Output in JSON
{"imgname": "images/gluoncv_image_20180924190411_b90c6ba4-bbc7-4bbf-9f8f-ee5a6a859602.jpg", "imgnamep": "images/gluoncv_image_p_20180924190411_b90c6ba4-bbc7-4bbf-9f8f-ee5a6a859602.jpg", "class1": "tvmonitor", "pct1": "49.070724999999996", "host": "HW13125.local", "shape": "(1, 3, 512, 896)", "end": "1537815855.105193", "te": "4.199203014373779", "battery": 100, "systemtime": "09/24/2018 15:04:15", "cpu": 33.7, "diskusage": "49939.2 MB", "memory": 60.1, "id": "20180924190411_b90c6ba4-bbc7-4bbf-9f8f-ee5a6a859602"}
Example Processed Image Output
It found one generic person; we could train against a known set of humans that are allowed in an area or are known users.
nifi flows
Gateway Server (we could skip this, but aggregating multiple camera agents is useful)
Send the Flow to the Cloud
Cloud Server Site-to-Site
After we infer the schema of the data once, we don't need it again. We could derive the schema manually or from another tool, but this is easy. Once you are done, you can delete the InferAvroSchema processor from your flow. I left mine in, in case you wish to start from the flow that is attached at the end of this article.
flow steps
Route when there is no error to MergeRecord, then convert those aggregated Apache Avro records into one Apache ORC file, then store it in an HDFS directory. Once complete, there will be a DDL added to the metadata that you can send to a PutHiveQL processor, or you can manually create the table in Beeline, Zeppelin or Hortonworks Data Analytics Studio (https://hortonworks.com/products/dataplane/data-analytics-studio/).
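The JSON record in the example output above can be assembled with nothing but the Python standard library. A minimal sketch of that record-building step (field names taken from the example output; class1 and pct1 are stand-in values for the GluonCV detection results, and system metrics such as cpu, memory and diskusage are omitted for brevity):

```python
import json
import socket
import time
import uuid
from datetime import datetime

def build_record(class1: str, pct1: float, imgdir: str = "images") -> dict:
    """Assemble one result record in the shape shown in the example JSON above.
    class1/pct1 would come from the GluonCV YOLOv3 detection; here they are
    passed in as stand-ins."""
    start = time.time()
    unique_id = "{}_{}".format(datetime.now().strftime("%Y%m%d%H%M%S"), uuid.uuid4())
    return {
        "imgname": "{}/gluoncv_image_{}.jpg".format(imgdir, unique_id),
        "imgnamep": "{}/gluoncv_image_p_{}.jpg".format(imgdir, unique_id),
        "class1": class1,
        "pct1": str(pct1),
        "host": socket.gethostname(),
        "shape": "(1, 3, 512, 896)",
        "end": str(time.time()),
        "te": str(time.time() - start),
        "systemtime": datetime.now().strftime("%m/%d/%Y %H:%M:%S"),
        "id": unique_id,
    }

record = build_record("tvmonitor", 49.07)
print(json.dumps(record))
```

Apache NiFi then picks the JSON file up, infers the schema once, and the rest of the flow is record processing.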
schema: gluoncvyolo
{ "type" : "record", "name" : "gluoncvyolo", "fields" : [ { "name" : "imgname", "type" : "string", "doc" : "Type inferred from '\"images/gluoncv_image_20180924211055_8f3b9dac-5645-49aa-94e7-ee5176c3f55c.jpg\"'" }, { "name" : "imgnamep", "type" : "string", "doc" : "Type inferred from '\"images/gluoncv_image_p_20180924211055_8f3b9dac-5645-49aa-94e7-ee5176c3f55c.jpg\"'" }, { "name" : "class1", "type" : "string", "doc" : "Type inferred from '\"tvmonitor\"'" }, { "name" : "pct1", "type" : "string", "doc" : "Type inferred from '\"95.71207000000001\"'" }, { "name" : "host", "type" : "string", "doc" : "Type inferred from '\"HW13125.local\"'" }, { "name" : "shape", "type" : "string", "doc" : "Type inferred from '\"(1, 3, 512, 896)\"'" }, { "name" : "end", "type" : "string", "doc" : "Type inferred from '\"1537823458.559896\"'" }, { "name" : "te", "type" : "string", "doc" : "Type inferred from '\"3.580893039703369\"'" }, { "name" : "battery", "type" : "int", "doc" : "Type inferred from '100'" }, { "name" : "systemtime", "type" : "string", "doc" : "Type inferred from '\"09/24/2018 17:10:58\"'" }, { "name" : "cpu", "type" : "double", "doc" : "Type inferred from '12.0'" }, { "name" : "diskusage", "type" : "string", "doc" : "Type inferred from '\"48082.7 MB\"'" }, { "name" : "memory", "type" : "double", "doc" : "Type inferred from '70.6'" }, { "name" : "id", "type" : "string", "doc" : "Type inferred from '\"20180924211055_8f3b9dac-5645-49aa-94e7-ee5176c3f55c\"'" } ] }
Tabular data has fields with types and properties. Let's specify those for automated analysis, conversion and live streaming SQL.
hive table schema: gluoncvyolo
CREATE EXTERNAL TABLE IF NOT EXISTS gluoncvyolo (imgname STRING, imgnamep STRING, class1 STRING, pct1 STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING) STORED AS ORC;
(Note that end is a reserved word in Hive, so it is backtick-quoted.)
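The jump from the inferred Avro schema to the Hive DDL is mechanical. A toy Python sketch of that mapping (simplified to the primitive types used in this schema; the backtick-quoting of reserved words such as end is my own addition, matching what the ACID DDL below does):

```python
# Map Avro primitive types to Hive column types (only the ones this schema uses).
HIVE_TYPES = {"string": "STRING", "int": "INT", "double": "DOUBLE",
              "long": "BIGINT", "float": "FLOAT", "boolean": "BOOLEAN"}

# Partial list of Hive reserved words that must be backtick-quoted as column names.
RESERVED = {"end", "date", "timestamp", "user"}

def avro_to_hive_ddl(schema: dict, stored_as: str = "ORC") -> str:
    """Build a CREATE EXTERNAL TABLE statement from a simple Avro record schema."""
    cols = []
    for field in schema["fields"]:
        name = field["name"]
        if name in RESERVED:
            name = "`{}`".format(name)
        cols.append("{} {}".format(name, HIVE_TYPES[field["type"]]))
    return "CREATE EXTERNAL TABLE IF NOT EXISTS {} ({}) STORED AS {};".format(
        schema["name"], ", ".join(cols), stored_as)

# A few fields from the gluoncvyolo schema above.
schema = {"name": "gluoncvyolo", "fields": [
    {"name": "imgname", "type": "string"},
    {"name": "end", "type": "string"},
    {"name": "battery", "type": "int"},
    {"name": "cpu", "type": "double"}]}
print(avro_to_hive_ddl(schema))
```

In the real flow, NiFi's ConvertAvroToORC writes this DDL into the hive.ddl attribute for you.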
Apache NiFi generates tables for me in Apache Hive 3.x as Apache ORC files for fast performance. hive acid table schema: gluoncvyoloacid CREATE TABLE gluoncvyoloacid
(imgname STRING, imgnamep STRING, class1 STRING, pct1 STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS ORC TBLPROPERTIES ('transactional'='true');
I can just as easily insert or update data into Hive 3.x ACID 2 tables. We have data; now query it, with easy, no-install analytics: tables, Leaflet.js, AngularJS, graphs, maps and charts.
nifi flow registry
To manage version control I am using the NiFi Registry, which is great. In the newest version, 0.2, there is the ability to back it up with GitHub! It's easy. Everything you need to know is in the docs and Bryan Bende's excellent post on the subject.
Use your own new GitHub project with permissions and then clone it locally: git clone https://github.com/tspannhw/nifi-registry-github.git
Make sure the GitHub directory has the right permissions and is empty (no README or junk).
Make sure you put in the full directory path.
Update your config like below:
<flowPersistenceProvider>
<class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
<property name="Flow Storage Directory">/Users/tspann/Documents/nifi-registry-0.2.0/conf/nifi-registry-github</property>
<property name="Remote To Push">origin</property>
<property name="Remote Access User">tspannhw</property>
<property name="Remote Access Password">generatethis</property>
</flowPersistenceProvider>
This is my GitHub directory to hold versions: https://github.com/tspannhw/nifi-registry-github
resources:
https://github.com/tspannhw/UsingGluonCV
https://gluon.mxnet.io/chapter01_crashcourse/ndarray.html
https://gluon-cv.mxnet.io/build/examples_detection/demo_yolo.html#sphx-glr-build-examples-detection-demo-yolo-py
https://gluon-cv.mxnet.io/model_zoo/index.html#object-detection
https://community.hortonworks.com/articles/215271/iot-edge-processing-with-deep-learning-on-hdf-32-a-2.html
https://community.hortonworks.com/articles/198912/ingesting-apache-mxnet-gluon-deep-learning-results.html
zeppelin notebook: apache-mxnet-gluoncv-yolov3-copy.json
nifi flow: gluoncv-server.xml
Labels:
09-21-2018
03:37 PM
3 Kudos
Running Apache MXNet Deep Learning on YARN 3.1 - HDP 3.0
With Hadoop 3.1 / HDP 3.0, we can easily run distributed classification, training and other deep learning jobs. I am using Apache MXNet with Python. You can also use TensorFlow or PyTorch.
If you need GPU resources, you can specify them as such:
yarn.io/gpu=2
My cluster does not have an NVidia GPU unfortunately.
See:
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/data-operating-system/content/dosg_recommendations_for_running_docker_containers_on_yarn.html
Running App on YARN
[root@princeton0 ApacheDeepLearning101]# ./yarn.sh
18/09/21 15:31:22 INFO distributedshell.Client: Initializing Client
18/09/21 15:31:22 INFO distributedshell.Client: Running Client
18/09/21 15:31:22 INFO client.RMProxy: Connecting to ResourceManager at princeton0.field.hortonworks.com/172.26.208.140:8050
18/09/21 15:31:23 INFO client.AHSProxy: Connecting to Application History server at princeton0.field.hortonworks.com/172.26.208.140:10200
18/09/21 15:31:23 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=1
18/09/21 15:31:23 INFO distributedshell.Client: Got Cluster node info from ASM
18/09/21 15:31:23 INFO distributedshell.Client: Got node report from ASM for, nodeId=princeton0.field.hortonworks.com:45454, nodeAddress=princeton0.field.hortonworks.com:8042, nodeRackName=/default-rack, nodeNumContainers=4
18/09/21 15:31:23 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.4, queueMaxCapacity=1.0, queueApplicationCount=8, queueChildQueueCount=0
18/09/21 15:31:23 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
18/09/21 15:31:23 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE
18/09/21 15:31:23 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
18/09/21 15:31:23 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
18/09/21 15:31:23 INFO distributedshell.Client: Max mem capability of resources in this cluster 15360
18/09/21 15:31:23 INFO distributedshell.Client: Max virtual cores capability of resources in this cluster 12
18/09/21 15:31:23 WARN distributedshell.Client: AM Memory not specified, use 100 mb as AM memory
18/09/21 15:31:23 WARN distributedshell.Client: AM vcore not specified, use 1 mb as AM vcores
18/09/21 15:31:23 WARN distributedshell.Client: AM Resource capability=<memory:100, vCores:1>
18/09/21 15:31:23 INFO distributedshell.Client: Copy App Master jar from local filesystem and add to local environment
18/09/21 15:31:24 INFO distributedshell.Client: Set the environment for the application master
18/09/21 15:31:24 INFO distributedshell.Client: Setting up app master command
18/09/21 15:31:24 INFO distributedshell.Client: Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx100m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_type GUARANTEED --container_memory 512 --container_vcores 1 --num_containers 1 --priority 0 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr
18/09/21 15:31:24 INFO distributedshell.Client: Submitting application to ASM
18/09/21 15:31:24 INFO impl.YarnClientImpl: Submitted application application_1536697796040_0022
18/09/21 15:31:25 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:26 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:27 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:28 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=, appMasterHost=princeton0/172.26.208.140, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:29 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=, appMasterHost=princeton0/172.26.208.140, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:30 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=, appMasterHost=princeton0/172.26.208.140, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:31 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=, appMasterHost=princeton0/172.26.208.140, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:32 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=, appMasterHost=princeton0/172.26.208.140, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:33 INFO distributedshell.Client: Got application report from ASM for, appId=22, clientToAMToken=null, appDiagnostics=, appMasterHost=princeton0/172.26.208.140, appQueue=default, appMasterRpcPort=-1, appStartTime=1537543884622, yarnAppState=FINISHED, distributedFinalState=SUCCEEDED, appTrackingUrl=http://princeton0.field.hortonworks.com:8088/proxy/application_1536697796040_0022/, appUser=root
18/09/21 15:31:33 INFO distributedshell.Client: Application has completed successfully. Breaking monitoring loop
18/09/21 15:31:33 INFO distributedshell.Client: Application completed successfully
Results:
https://github.com/tspannhw/ApacheDeepLearning101/blob/master/run.log
Script:
https://github.com/tspannhw/ApacheDeepLearning101/blob/master/yarn.sh
yarn jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -shell_command python3.6 -shell_args "/opt/demo/ApacheDeepLearning101/analyzex.py /opt/demo/images/201813161108103.jpg" -container_resources memory-mb=512,vcores=1
For pre-HDP 3.0, see my older script using the DMLC YARN runner. We don't need that anymore, and no Spark either.
https://github.com/tspannhw/nifi-mxnet-yarn
Python MXNet Script:
https://github.com/tspannhw/ApacheDeepLearning101/blob/master/analyzehdfs.py
Since we are distributed, let's write the results to HDFS. We can install and use the Python hdfs library, which works on Python 2.7 and 3.x. So let's pip install it.
pip install hdfs
In our code:
from json import dumps
from hdfs import InsecureClient

# Connect to the NameNode's WebHDFS endpoint as the 'root' user.
client = InsecureClient('http://princeton0.field.hortonworks.com:50070', user='root')

# uniqueid and row are produced earlier by the MXNet inference script;
# serialize the row to JSON and write it out to HDFS.
client.write('/mxnetyarn/' + uniqueid + '.json', dumps(row))
We write our row as JSON to HDFS. When the job completes in YARN, we get a new JSON file written to HDFS.
hdfs dfs -ls /mxnetyarn
Found 2 items
-rw-r--r-- 3 root hdfs 424 2018-09-21 17:50 /mxnetyarn/mxnet_uuid_img_20180921175007.json
-rw-r--r-- 3 root hdfs 424 2018-09-21 17:55 /mxnetyarn/mxnet_uuid_img_20180921175552.json
hdfs dfs -cat /mxnetyarn/mxnet_uuid_img_20180921175552.json
{"uuid": "mxnet_uuid_img_20180921175552", "top1pct": "49.799999594688416", "top1": "n03063599 coffee mug", "top2pct": "21.50000035762787", "top2": "n07930864 cup", "top3pct": "12.399999797344208", "top3": "n07920052 espresso", "top4pct": "7.500000298023224", "top4": "n07584110 consomme", "top5pct": "5.200000107288361", "top5": "n04263257 soup bowl", "imagefilename": "/opt/demo/images/201813161108103.jpg", "runtime": "0"}
HDP Assemblies
https://github.com/hortonworks/hdp-assemblies/
https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/Dockerfile.md
https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/TensorflowOnYarnTutorial.md
https://github.com/hortonworks/hdp-assemblies/blob/master/tensorflow/markdown/RunTensorflowJobUsingHelperScript.md
*** SUBMARINE ***
Coming soon: Submarine is a really cool new way to run deep learning jobs on YARN.
https://github.com/leftnoteasy/hadoop-1/tree/submarine/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine
See this awesome presentation from Strata NYC 2018 by Wangda Tan (Hortonworks): https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/68289
See the quick start for setting Docker and GPU options:
https://github.com/leftnoteasy/hadoop-1/blob/submarine/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/site/QuickStart.md
Resources:
https://community.hortonworks.com/articles/60480/using-images-stored-in-hdfs-for-web-pages.html