New Contributor
Posts: 2
Registered: ‎08-14-2018

CM api timeseries module not returning metrics queried

Hi Guys,

 

I am using the cm_api "timeseries" module to query for HDFS data between 7/05 and 7/28, but the data returned only covers 7/19 to 7/28. When I go to the CDH UI, I can see HDFS data as far back as 7/15. Is there a reason CM would be withholding data in this fashion, and is there some sort of maximum threshold for querying?
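
To make the gap concrete, the requested window can be compared against the timestamps that actually came back. The following is only a minimal sketch: `coverage_gap` is a hypothetical helper (not part of cm_api), the year 2018 is assumed, and the sample timestamps are made up to mirror the dates described above rather than taken from real CM output.

```python
from datetime import datetime, timedelta


def coverage_gap(requested_start, requested_end, timestamps):
    """Return (leading_gap, trailing_gap) as timedeltas.

    leading_gap is how much of the requested window is missing before the
    first returned point; trailing_gap is what is missing after the last one.
    Returns None when no points were returned at all.
    """
    if not timestamps:
        return None
    first, last = min(timestamps), max(timestamps)
    zero = timedelta(0)
    return (max(first - requested_start, zero),
            max(requested_end - last, zero))


# Illustrative values only (assumed year, one made-up point per day).
requested_start = datetime(2018, 7, 5)
requested_end = datetime(2018, 7, 28)
returned = [datetime(2018, 7, 19) + timedelta(days=d) for d in range(10)]

leading, trailing = coverage_gap(requested_start, requested_end, returned)
print("missing at start: %d days" % leading.days)
print("missing at end: %d days" % trailing.days)
```

Running the same check on the `time` column of each returned DataFrame would show whether every series is cut off at the same point or only some of them.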

 

Any help on this matter would be much appreciated!

 

Thanks,

 

- Ryan

New Contributor
Posts: 2
Registered: ‎08-14-2018

Re: CM api timeseries module not returning metrics queried

Btw, here is the code I am using to grab the metrics:

 

import re
import subprocess
import json
import requests
import numpy as np
import pandas as pd
import time
import sys

from datetime import datetime
from report_tools import to_epoch, easy_time
from cm_api.api_client import ApiResource
from cm_api.endpoints import timeseries

def process_CDH_result(result):
    # result is what query_timeseries returned; only the first response is used
    print(result)
    ts_list = result[0]
    node_list = []
    name_strings = ['HDFS', 'hdfs']
    for ts in ts_list.timeSeries:
        nodename = ts.metadata.entityName
        # Keep only the series whose entity name mentions HDFS
        if any(x in nodename for x in name_strings):
            timestamps, values = [], []
            for point in ts.data:
                timestamps.append(point.timestamp)
                values.append(point.value)
            df = pd.DataFrame({'time': timestamps, 'value': values})[['time', 'value']]
            # Take the metric name from this series rather than from timeSeries[0]
            node_list.append({'Node_name': nodename, 'Metric_name': ts.metadata.metricName, 'Data': df})
    return node_list

def get_CDH_metrics(hostname, creds, time_range):
    user, pw = creds
    start, end = time_range  # epoch seconds
    api = ApiResource(hostname, '7180', user, pw, version=16)
    metrics = ["dfs_capacity_used", "dfs_capacity"]
    metric_dict = {}
    for metric in metrics:
        result = timeseries.query_timeseries(
            api,
            query="select " + metric,
            from_time=datetime.fromtimestamp(start),
            to_time=datetime.fromtimestamp(end),
            desired_rollup='HOURLY',
            must_use_desired_rollup=True)

        # Only the first matching HDFS series is kept for each metric
        node = process_CDH_result(result)[0]
        new_df = node['Data'].set_index('time').rename(index=str, columns={'value': node['Node_name']})
        print(new_df)
        # metric_name_maps is not defined in this snippet; it is assumed to be defined elsewhere in the script
        metric_dict[metric_name_maps['CDH'][metric]] = new_df
    return metric_dict

metric_dict = get_CDH_metrics('localhost',('admin','admin'),(1525244400,1534461757))
print(metric_dict)
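
As a sanity check on the window being requested, the two epoch values passed in the call above can be converted to dates. This is a minimal sketch using only the standard library; `epoch_to_utc` is a hypothetical helper name. It converts to UTC, whereas `datetime.fromtimestamp` in the script interprets the same values in the machine's local time zone, so the local dates may differ by a few hours.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def epoch_to_utc(seconds):
    """Convert epoch seconds to a naive datetime expressed in UTC."""
    return EPOCH + timedelta(seconds=seconds)


# The (start, end) pair passed to get_CDH_metrics in the script.
start, end = 1525244400, 1534461757

start_utc = epoch_to_utc(start)
end_utc = epoch_to_utc(end)
window_days = (end_utc - start_utc).days

print(start_utc.isoformat())  # 2018-05-02T07:00:00
print(end_utc.isoformat())    # 2018-08-16T23:22:37
print(window_days)            # 106
```

With these values the script asks for roughly 106 days of hourly data starting in early May, which is a wider window than the 7/05 to 7/28 range described in the question, so it may be worth confirming which range was actually in effect when the truncated result came back.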

 

 
