Use OpenTSDB Ambari service to store/visualize stock data on HDP sandbox

Goal:

OpenTSDB (Scalable Time Series DB) allows you to store and serve massive amounts of time series data without losing granularity (more details here). In this tutorial we will install it on HBase on the HDP sandbox using the Ambari service and use it to import and visualize stock data.

Steps:

Setup VM and install Ambari service

  • Download the latest HDP sandbox VM image (.ova file) from the Hortonworks website
  • Import the .ova file into VMware and ensure the VM memory size is set to at least 8GB
  • Now start the VM
  • After it boots up, find the IP address of the VM and add an entry to your machine's hosts file, e.g.
192.168.191.241 sandbox.hortonworks.com sandbox    
  • Connect to the VM via SSH (password hadoop)
ssh root@sandbox.hortonworks.com
  • Start the HBase service from Ambari and ensure HBase is up and root has authority to create tables. You can check this by trying to create a test table:
hbase shell

create 't1', 'f1', 'f2', 'f3'
  • If this fails with the error below, you will need to provide appropriate access via Ranger (http://sandbox.hortonworks.com:6080)
ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user 'root (auth:SIMPLE)' (global, action=CREATE)
  • To deploy the OpenTSDB service, run below
VERSION=`hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'`
sudo git clone https://github.com/hortonworks-gallery/ambari-opentsdb-service.git /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/OPENTSDB
  • Restart Ambari
#on sandbox
sudo service ambari restart

#on non-sandbox clusters  
sudo service ambari-server restart
sudo service ambari-agent restart
  • Then click 'Add Service' in the 'Actions' dropdown menu at the bottom left of the Ambari dashboard:

On bottom left -> Actions -> Add service -> check OpenTSDB server -> Next -> Next -> Customize as needed -> Next -> Deploy

You can customize the port, ZK quorum and ZK dir used in the start command. Note that HBase must be running if the option to automatically create the OpenTSDB schema is selected.
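The VERSION one-liner in the deploy step above decides which stack directory the service definition is cloned into, so it is worth checking on its own if OpenTSDB does not show up in the 'Add Service' list. The sketch below wraps the same sed expression in a function (the name hdp_stack_version is only illustrative) and runs it on a sample line. The sample mimics the output of hdp-select status hadoop-client; its build number is made up.

```shell
# Same sed expression as the deploy step, wrapped so it can be tried on sample input.
# Reads `hdp-select status hadoop-client` style output on stdin, prints MAJOR.MINOR.
hdp_stack_version() {
  sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'
}

# Illustrative input only; the build number below is made up.
echo 'hadoop-client - 2.3.0.0-2557' | hdp_stack_version
```

Note that the pattern only matches single-digit major and minor versions. If it does not match, sed prints the input line unchanged and the clone lands in a directory Ambari will not scan.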

  • On successful deployment you will see the OpenTSDB service as part of the Ambari stack and will be able to start/stop the service from there
  • You can see the parameters you configured under the 'Configs' tab
  • One benefit of wrapping the component in an Ambari service is that you can now automate its deployment via Ambari blueprints, or monitor/manage the service remotely via the REST API
export SERVICE=OPENTSDB
export PASSWORD=admin
export AMBARI_HOST=sandbox.hortonworks.com
export CLUSTER=Sandbox

#get service status
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#start service ('"$SERVICE"' steps out of the single-quoted JSON so the variable is expanded)
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#stop service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
  • To remove the OpenTSDB service:
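The removal commands are not shown here. As a minimal sketch, assuming the standard Ambari REST API (a DELETE on the service resource) and the SERVICE, PASSWORD, AMBARI_HOST and CLUSTER variables exported earlier, removal could look like the following. The function names are illustrative and not part of the OpenTSDB service itself.

```shell
# Build the Ambari REST URL for a service: $1 = Ambari host, $2 = cluster, $3 = service.
ambari_service_url() {
  printf 'http://%s:8080/api/v1/clusters/%s/services/%s\n' "$1" "$2" "$3"
}

# Delete the service from the cluster.
# Assumption: the service has already been stopped (state INSTALLED) with the
# stop call shown earlier; Ambari rejects the delete while it is still running.
delete_ambari_service() {
  curl -u admin:"$PASSWORD" -i -H 'X-Requested-By: ambari' -X DELETE \
    "$(ambari_service_url "$AMBARI_HOST" "$CLUSTER" "$SERVICE")"
}

# Usage, once the stop request has completed:
#   delete_ambari_service
```

This only unregisters the service from Ambari. The service definition cloned under /var/lib/ambari-server/resources/stacks and any OpenTSDB files on disk are left in place and would need to be cleaned up separately.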

Import stock data

  • Use the sample code below (taken from here) to pull 30 days of intraday stock prices for a few securities in both OpenTSDB and csv formats
cd
/bin/rm -f prices.csv
/bin/rm -f opentsd.input
wget https://raw.githubusercontent.com/abajwa-hw/opentsdb-service/master/scripts/google_intraday.py
python google_intraday.py AAPL > prices.csv
python google_intraday.py GOOG >> prices.csv
python google_intraday.py HDP >> prices.csv
python google_intraday.py ORCL >> prices.csv
python google_intraday.py MSFT >> prices.csv
  • Review opentsd.input, which contains the stock prices in OpenTSDB-compatible format
tail opentsd.input
  • Import data from opentsd.input into OpenTSDB
/root/opentsdb/build/tsdb import opentsd.input --zkbasedir=/hbase-unsecure --zkquorum=localhost:2181 --auto-metric
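Before running the import it can help to sanity-check the input file. The OpenTSDB import format is one data point per line: metric name, epoch timestamp, value, then one or more tag=value pairs. The sketch below counts lines that do not fit that shape. The function name count_bad_tsdb_lines and the sample lines are illustrative, and the numeric check is a simple heuristic rather than OpenTSDB's own parser.

```shell
# Count lines that do not look like:
#   <metric> <epoch-timestamp> <numeric-value> <tagk=tagv> [<tagk=tagv> ...]
# Reads the files given as arguments, or stdin when there are none.
count_bad_tsdb_lines() {
  awk '
    {
      ok = (NF >= 4) && ($2 ~ /^[0-9]+$/) && ($3 ~ /^-?[0-9]+(\.[0-9]+)?$/)
      # Every field from the 4th onward must be a single key=value pair.
      for (i = 4; ok && i <= NF; i++) if ($i !~ /^[^=]+=[^=]+$/) ok = 0
      if (!ok) bad++
    }
    END { print bad + 0 }
  ' "$@"
}

# Made-up sample: one well-formed data point and one malformed line.
printf 'volume 1430000000 1200 symbol=GOOG\nbroken line\n' | count_bad_tsdb_lines
```

On the sandbox, running count_bad_tsdb_lines opentsd.input should print 0 if every line follows the format.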

Open WebUI and query stock data

  • The OpenTSDB webUI should be available at the link below (or on whichever port you configured):
http://sandbox.hortonworks.com:9999
  • Query the data in OpenTSDB webUI by entering values for:
    • From: pick a date from 3 weeks ago
    • To: pick today's date
    • Check Autoreload
    • Metric: (e.g. volume)
    • Tags: (e.g. symbol GOOG)
    • You can similarly create multiple tabs
      • Tags: symbol ORCL
      • Tags: symbol AAPL
  • To make the charts smoother:
    • Under Style tab, check the 'Smooth' checkbox
    • Under Axes tab, check the 'Log scale' checkbox
  • You can also open it from within Ambari via an iFrame view
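The same query can also be issued from the command line. The sketch below assumes the installed build is OpenTSDB 2.x, which exposes an HTTP /api/query endpoint. It builds a URL for the last three weeks of a metric filtered by one tag, using the example metric and tag from the steps above. The function name tsdb_query_url is illustrative.

```shell
# Build an OpenTSDB 2.x query URL.
# $1 = host, $2 = port, $3 = metric, $4 = tag key, $5 = tag value
tsdb_query_url() {
  printf 'http://%s:%s/api/query?start=3w-ago&m=sum:%s{%s=%s}\n' "$1" "$2" "$3" "$4" "$5"
}

tsdb_query_url sandbox.hortonworks.com 9999 volume symbol GOOG

# To fetch the data, disable curl's URL globbing so the braces are sent as-is:
#   curl -g "$(tsdb_query_url sandbox.hortonworks.com 9999 volume symbol GOOG)"
```

The response is JSON with one entry per matching time series, which makes it easy to feed into other tools when the built-in charts are not enough.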