
Overview

The Partner Demo Kit is built and maintained by the Hortonworks Partner Solutions team. Its purpose is to enable partners to:

  • Quickly bring up an HDP environment with pre-built demos
  • Leverage the available demos to understand the capabilities of the platform
  • Use the demos as part of business conversations to demonstrate the art of the possible

The remainder of this article provides a short description of the 3 demos packaged in the demo kit and step-by-step instructions on:

  • How to launch the demo kit on AWS or on private cloud
  • How to execute the demos provided with the demo kit

Other Versions

  • The Security/Governance Demo kit for HDP 2.6 can be found here
  • The previous version of demo kit (for HDP 2.5) can be found here

Pre-requisites

  • When using AWS, you must already have an Amazon Web Services account. Sample steps for creating one can be found here. If you have an AWS promo code, you can apply it to your account using the steps here.
  • For running the sentiment demo, you must have created a Twitter application using your Twitter account and generated consumer keys/secrets. If you do not have these, you can generate a new set using your Twitter account by following this section of the Hortonworks tutorial.

Notes

  • Note that the partner demo kit is not a formally supported offering.
  • If you have questions, see the ‘Questions?’ section at the end of this article.

Slides

  • Slides for demo kit are available here

Packaged Demos

The demo kit comes with 3 demos:

  • 1. IOT demo
    • Purpose: The IOT demo showcases how a logistics company uses the Hortonworks Connected Data Platform to monitor its fleet in real time and mitigate driving infractions
    • Use case setup:
      • Sensor devices on the trucks capture truck events and driver actions.
      • Some of these driver events are dangerous, such as lane departure, unsafe following distance, and unsafe tail distance.
      • The business requirement is to stream these events in, filter on violations, and raise real-time alerts when “lots” of erratic behavior is detected for a given driver over a short period of time.
      • Over time, users would like to run advanced analytics on the full archive of historical events generated by the trucks to:
        • Determine which factors have an impact on driving violations (e.g. weather, driver fatigue)
        • Build an AI model to predict when violations will occur
    • Technologies used: Apache NiFi, Kafka, Storm, Streaming Analytics Manager, Schema Registry, HBase, Spark, Zeppelin
    • More details available here and here
  • 2. Sentiment demo
    • Purpose: The sentiment demo showcases how a retail company can use the Hortonworks Connected Data Platform to visualize and analyze social media data related to its products
    • Use case setup:
      • The business requirement is to capture, process, and analyze the flow of tweets to understand social sentiment about the company's products
    • Technologies used: Apache NiFi, Solr, HDFS
    • More details available here and here
  • 3. Advanced analytics demo
    • Purpose: The advanced analytics demo showcases how an insurance company can use the Hortonworks Connected Data Platform to visualize earthquake data and make predictions on it using Apache Spark’s machine learning libraries
    • Use case setup:
      • The business requirement is to perform advanced analytics on worldwide earthquake data to predict where large earthquakes will happen, so that the business can adjust insurance premiums accordingly
    • Technologies used: Apache Spark, Zeppelin
    • More details here

    Option #1: Installing the Demo Kit on your own setup

    You can install the Demo Kit on other public or private clouds using the provided automated script. With this option, you launch a suitably sized CentOS/RHEL 7 VM on any cloud of your choice (as long as it has access to the public internet) and use the provided script to install single-node HDP and the demo. For more details, see the README here. Setup takes about 1 hour.

    Option #2: Launching the Demo Kit AMI on AWS

    You can use this option to launch a prebuilt image of single-node HDP (including the demo) on the AWS cloud. Setup takes about 15 minutes.

    Steps to launch the AMI

    • 1. Open the AWS console page in your browser by clicking here and sign in with your credentials. Once signed in, you can close this browser tab.
    • 2. Select the AMI from the ‘N. California’ region by clicking here. Now choose the instance type: select ‘m4.2xlarge’ and click ‘Next’
      • Note: if you choose a smaller instance type than the one recommended above, some services may not come up

    • 3. Configure Instance Details: leave the defaults and click ‘Next’
    • 4. Add storage: keep the default of 500 GB and click ‘Next’
    • 5. Optionally, add a name or any other tags you like. Then click ‘Next’
    • 6. Configure security group: create a new security group and select ‘All traffic’ to open all ports. For long running instances (i.e. anything beyond an hour), a more restrictive security group policy is strongly encouraged (for example: only allow traffic from your company’s IP range). Then click ‘Review and Launch’
    • 7. Review your settings and click Launch
    • 8. Create and download a new key pair (or choose an existing one). Then click ‘Launch instances’
    • 9. Click the link shown under ‘Your instances are now launching’
    • 10. This opens the EC2 dashboard that shows the details of your launched instance
    • 11. Make a note of your instance’s ‘Public IP’ (which will be used to access your cluster). If it is blank, wait 1-2 minutes for it to be populated
    • 12. After 5-10 minutes, open the URL below in your browser to access Ambari’s console: http://<PUBLIC IP>:8080. Log in as user admin with the password StrongPassword
    • 13. At this point, Ambari may still be in the process of starting all the services. You can tell by the presence of the blue ‘op’ notification near the top left of the page. If so, just wait until it is done.


    • (Optional) You can also monitor the startup using the log, as follows:
      • Open an SSH session into the VM using your key and the public IP, e.g. from OSX:

        ssh -i ~/.ssh/mykey.pem centos@<publicIP>

      • Tail the startup log:

        tail -f /var/log/hdp_startup.log

      • Once you see “cluster is ready!” you can proceed
    • 14. Once the blue ‘op’ notification disappears and all the services show a green check mark, the cluster is fully up.
      • If any services fail to start, use ‘Actions’ > ‘Start All’ in Ambari to start them

    • 15. At this point you can follow the demo instructions.
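
The optional log check in the steps above can also be scripted. What follows is a minimal sketch, assuming you run it on the VM over SSH and that your user can read the log (you may need sudo). The function name wait_for_cluster_ready, the 30-minute timeout, and the 10-second poll interval are illustrative choices and not part of the demo kit; only the log path and the “cluster is ready!” message come from this article.

```shell
# Sketch: block until the startup log contains the ready marker, or time out.
# Arguments (all optional; the timeout and interval defaults are assumptions):
#   $1 log file path    (default: /var/log/hdp_startup.log, as named in this article)
#   $2 timeout seconds  (default: 1800)
#   $3 poll interval    (default: 10, must be greater than 0)
wait_for_cluster_ready() {
    log_file="${1:-/var/log/hdp_startup.log}"
    timeout="${2:-1800}"
    interval="${3:-10}"
    waited=0
    while [ "$waited" -le "$timeout" ]; do
        # -q: quiet, -s: no error if the file is missing, -F: match a fixed string
        if grep -qsF "cluster is ready!" "$log_file"; then
            echo "cluster is ready"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "timed out after ${timeout}s waiting for $log_file" >&2
    return 1
}

# Usage on the VM:
#   wait_for_cluster_ready && echo "open Ambari at http://<PUBLIC IP>:8080"
```

Polling the log this way is only a convenience; the Ambari UI remains the place to confirm that every service shows a green check mark.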

    Troubleshooting

    • If any service does not come up for some reason, you can use Ambari to retry by clicking: ‘Actions’ > ‘Start all’.
    • In case of multiple failures when starting services, use the EC2 dashboard to double check that the correct instance type was used. Insufficient resources can cause services to not start up successfully
    • You do not need to connect to your instance via SSH, but you can do so using the key pair you created or selected earlier by following the standard instructions on the AWS website. Make sure you log in as the user centos
    • A log file of the automated startup of HDP services is available under: /var/log/hdp_startup.log
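
When several services fail, it can help to pull only the suspicious lines out of the startup log. The following is a rough sketch, assuming the log marks problems with words such as error, fail, or exception. This article does not document the wording used in the log, so the keyword list and the function name startup_log_problems are guesses to adapt.

```shell
# Sketch: print lines of the startup log that look like problems, with line numbers.
# The keyword list is an assumption; adjust it to what the log actually contains.
#   $1 log file path (default: /var/log/hdp_startup.log, as named in this article)
startup_log_problems() {
    log_file="${1:-/var/log/hdp_startup.log}"
    # -i: ignore case, -n: prefix each match with its line number, -E: extended regex
    grep -inE 'error|fail|exception' "$log_file"
}

# Usage on the VM:
#   startup_log_problems | tail -n 20
```

The function returns a non-zero status when nothing matches, which makes it easy to use in a condition.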

    Stopping/Terminating the demo kit

    • Once you are done with the demo kit, we recommend bringing it down to avoid incurring unnecessary charges:
    • First, stop the cluster services using Ambari by clicking: ‘Actions’ > ‘Stop all’.
    • Then pick one of the two options:
      • a) Terminate the instance: If you do not want to incur any further charges from AWS, terminate the VM instance from the same ‘EC2 dashboard’ that displayed the instance details. Note that this destroys the VM, so the next time you wish to use the demo kit, you will need to repeat the steps in the section ‘Launching the Demo Kit AMI on AWS’ above
      • b) Stop the instance: If you want to bring down your VM instance but keep it so you can start it again later, stop the VM instance from the EC2 dashboard. Note that this option preserves any customizations you made to the VM, but you will continue to incur some AWS charges (for example, for the attached storage).
    • More details on stop vs terminate operations can be found on AWS website here and here
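
The same stop or terminate operation can be run from the command line. This is a sketch, assuming the AWS CLI is installed and configured with your credentials and with the region where the instance was launched. The function name demo_kit_shutdown, the DRY_RUN switch, and the instance ID in the usage lines are placeholders introduced for illustration; `aws ec2 stop-instances` and `aws ec2 terminate-instances` are the CLI counterparts of the two dashboard options. Stop the cluster services in Ambari first, as described above.

```shell
# Sketch: stop or terminate the demo kit instance with the AWS CLI.
#   $1 action: stop | terminate
#   $2 EC2 instance ID, copied from the EC2 dashboard
# Set DRY_RUN=1 to print the command instead of running it.
demo_kit_shutdown() {
    action="$1"
    instance_id="$2"
    case "$action" in
        stop)      subcommand="stop-instances" ;;       # keeps the VM and its disk
        terminate) subcommand="terminate-instances" ;;  # destroys the VM
        *)
            echo "usage: demo_kit_shutdown stop|terminate <instance-id>" >&2
            return 2
            ;;
    esac
    if [ -z "$instance_id" ]; then
        echo "missing instance ID" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "aws ec2 $subcommand --instance-ids $instance_id"
    else
        aws ec2 "$subcommand" --instance-ids "$instance_id"
    fi
}

# Usage (the instance ID is a placeholder):
#   DRY_RUN=1; demo_kit_shutdown stop i-0123456789abcdef0
#   DRY_RUN=0; demo_kit_shutdown terminate i-0123456789abcdef0
```

Running with DRY_RUN=1 first is a cheap way to confirm the instance ID before terminating, since termination cannot be undone.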

    Demo Execution Steps

    IOT Demo

    Video recording of the IOT demo

    • Recording of demo provided here (high level) and here (deeper level)
    • PPT and PDF versions of the slides also available

    IOT Demo setup instructions

    Sequence to walk through the IOT trucking demo:

    1. Events simulator
    2. Schema Registry UI
    3. NiFi flow
    4. SAM Application view
    5. Storm Monitoring view
    6. Superset Dashboard
    7. Superset Slice creation
    8. Zeppelin notebook

    Detailed steps for the IOT trucking demo walk through

    • (Optional) Check that events are being simulated. This step is optional because we will also check this from the NiFi UI
      • Open an SSH session into the VM using your key and the public IP, e.g. from OSX:

        ssh -i ~/.ssh/mykey.pem centos@<publicIP>
        sudo su -

      • To check that events are being simulated, either verify that the simulator process is running or monitor the simulator log:

        ps -ef | grep stream-simulator
        tail -f /tmp/whoville/data_simulator/simulator.log

      • If the simulator is not running, start it by running the commands below from the SSH session:

        cd /tmp/whoville/data_simulator/
        sudo ./runDataLoader.sh

      • If you need to kill the simulator, use the ps command above to find the process id and then kill it
    • Next, we will use the Ambari Quicklinks to open the web UIs of several components that are part of the demo. In Ambari, select the service (for example, Schema Registry) and use its Quick Links menu.
    • Open Schema Registry using the Quicklink in Ambari and check that the 4 demo schemas are listed
    • Open NiFi using the Quicklink in Ambari and check that the “IOT trucking demo” process group is started
    • Double click the “IOT trucking demo” box to see the details of the flow. The counters should show that simulated events are flowing through the NiFi flow. You can refresh the UI to see this.
    • Open the Storm Monitoring view (under Ambari views) and check that the topology is live
    • Open SAM using the Quicklink in Ambari and check that the application is deployed
    • Double click the application to see more details. The Emitted and Transferred fields should be non-zero (assuming the simulator has been running for a few minutes)
    • Open the Druid Console using the Quicklink in Ambari and check that the two datasets are present
    • Open Druid Superset using the Quicklink in Ambari and log in using admin/StrongPassword
    • There should be one entry under Dashboards. Click it to open the prebuilt dashboard.
    • You can periodically click the refresh button to see new data arriving. It can take 2-6 minutes for new events to appear in Druid.
      • The first few slices (i.e. graphs) provide monitoring information (e.g. How many violations? Who are the violators?). The last 3 slices provide information about the predictions made by the model (e.g. Which routes are predicted to have the most violations? Which drivers are predicted to have violations?)
      • You can also create other slices and add them to the dashboard using the steps here

    • Optionally, you can also demonstrate how a data scientist would use archived truck events to build a model that predicts violations. Note: to limit the resources needed to run the AMI, Spark/Hive has not been installed, so you will not be able to actually run the notebook. The previous version of the demo kit or the HDP sandbox has these set up and can be used if you want to execute the steps in the notebook.
    • To walk through the trucking events analysis notebook, first open the Zeppelin UI using the Quicklink from Ambari
    • Log in as admin/admin
    • Under the Notebook section, use the search field to find the “Trucking data analysis” notebook
    • Click Save on the interpreter binding
    • Walk through the notebook to show how a data scientist can use SparkSQL to visualize the data and understand which features should be included in the model
    • Finally, you can show that once the important features are known, a model can be built to predict violations (in this case, using logistic regression)

    Stopping/Starting the simulator

    • To stop the simulator, use the command below to find its process id, then kill it:

        ps -ef | grep stream-simulator
        kill <process_id>

    • To start it back up, run:

        cd /tmp/whoville/data_simulator/
        sudo ./runDataLoader.sh
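
Finding the process id by eye can be skipped with a small filter over the `ps -ef` output. This is a sketch, assuming the simulator's command line contains the string stream-simulator (the same string the grep above looks for); the function name simulator_pids is made up for illustration.

```shell
# Sketch: read `ps -ef` output on stdin and print the PIDs of simulator processes.
# Assumes the simulator command line contains "stream-simulator".
simulator_pids() {
    # Column 2 of `ps -ef` is the PID. Writing the pattern as [s]tream-simulator
    # keeps this awk process from matching its own command line.
    awk '/[s]tream-simulator/ { print $2 }'
}

# Usage on the VM:
#   ps -ef | simulator_pids             # list the simulator PIDs
#   kill $(ps -ef | simulator_pids)     # stop the simulator (errors if none is running)
```

The filter reads from stdin rather than calling ps itself, so it can be checked against saved `ps -ef` output before being used with kill.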

    Sentiment Demo

    Video recording of the Sentiment demo

    • Recording of setup instructions for demo provided here

    Sentiment Demo setup instructions

    • Open the NiFi UI using Quicklinks in Ambari
    • Double click "Twitter Dashboard" to open this process group
    • Right click "Grab Garden Hose" > Properties and enter your Twitter Consumer key/secret and Access token/secret. If you do not have these, you can generate a new set using your Twitter account by following this section of the Hortonworks tutorial. Optionally change the 'Terms to filter on' as desired. Once complete, start the flow.
    • Use the Banana UI Quicklink in Ambari to open the Twitter dashboard
    • An empty dashboard will appear initially. After a minute, you should start seeing charts appear


    Advanced Analytics Demo

    Video recording of Advanced Analytics demo

    • Video recording provided here

    Advanced Analytics Demo setup instructions

    • Open the Zeppelin UI via the Quicklink in Ambari
    • Log in as admin. The password is the same as the Ambari password
    • A directory structure containing a number of demo notebooks will appear.
    • Find the earthquake demo notebook by filtering for ‘earthquake’
    • On first launch of a notebook, you will see that the "Interpreter Binding" settings will be displayed. You will need to click "Save" under the interpreter order to accept the defaults.
    • Now you can walk through the notebook and show the visualizations and the process of building the model. Note: to limit the resources needed to run the AMI, Spark/Hive has not been installed, so you will not be able to actually run the notebook. The previous version of the demo kit or the HDP sandbox has the notebook set up and can be used if you want to execute the steps in the notebook.

    This concludes this article on how to launch the demo kit and access the provided demonstrations.

    Questions?

    In case of questions or issues:

    • 1. Search on our Hortonworks Community Connection forum. For example, to find all Demo Kit related posts, access this url
    • 2. If you were not able to find the solution, please post a new question using the tag “partner-demo-kit” here. Please be as descriptive as possible when asking questions by providing:
      • Detailed description of the problem
      • Steps to reproduce the problem
      • Environment details, e.g.
        • Instance type used was m4.2xlarge
        • Storage used was 500 GB
        • Etc
      • Relevant log file snippets

Version history
Revision #: 1 of 1
Last update: 11-17-2017 04:10 AM