Explorer
Posts: 19
Registered: ‎11-15-2016
Accepted Solution

Need additional documentation for rest API - replication status

I'm writing a small script to monitor the status of BDR jobs through the REST API.

I'm having an issue with an endpoint that takes a long time to respond (from my limited testing, the response time scales linearly with the number of jobs and the depth of the history for each job):

https://cloudera.github.io/cm_api/apidocs/v17/path__clusters_-clusterName-_services_-serviceName-_re...

 

In the linked documentation it appears that the API accepts a limits parameter, but it's not very well documented: what arguments does it accept? Maybe something to limit the history size?

Cloudera Employee
Posts: 206
Registered: ‎07-08-2013

Re: Need additional documentation for rest API - replication status

The link you provided will list all your replication schedules and their job result history. 

If you know the replication schedule id (e.g. id=5 in the example below), the replications/{id}/history endpoint [0] may help: it accepts limit and offset parameters, so you can cap the amount of history returned.

http://cm-host.cloudera.com:7180/api/v17/clusters/Cluster%201/services/HDFS-1/replications/5/history?limit=1&offset=0

 

[0] https://cloudera.github.io/cm_api/apidocs/v17/path__clusters_-clusterName-_services_-serviceName-_re...
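For a monitoring script, the example URL above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_history_url and its defaults are hypothetical, and only the path layout and the limit/offset query parameters are taken from the example URL in this reply.

```python
from urllib.parse import quote, urlencode


def build_history_url(base_url, cluster, service, schedule_id,
                      limit=1, offset=0, api_version="v17"):
    """Build the history URL for one replication schedule.

    Hypothetical helper: the path layout mirrors the example URL in the
    thread; nothing here contacts Cloudera Manager.
    """
    # Percent-encode the path segments ("Cluster 1" becomes "Cluster%201").
    path = "/api/{}/clusters/{}/services/{}/replications/{}/history".format(
        api_version,
        quote(cluster, safe=""),
        quote(service, safe=""),
        schedule_id,
    )
    # limit caps the number of history entries, offset skips entries.
    query = urlencode({"limit": limit, "offset": offset})
    return "{}{}?{}".format(base_url.rstrip("/"), path, query)


url = build_history_url("http://cm-host.cloudera.com:7180",
                        "Cluster 1", "HDFS-1", 5)
print(url)
```

The resulting URL can then be fetched with any HTTP client using your CM credentials; with limit=1 only a single history entry per schedule is requested.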

 

 

Explorer
Posts: 19
Registered: ‎11-15-2016

Re: Need additional documentation for rest API - replication status

Thank you Michalis

And what if I don't know the ids of the jobs in advance? Is there any way to limit the response from the main URI /api/vXX/clusters/{cluster_name}/services/{service_name}/replications?
All I'm trying to do is get the list of all defined jobs and the state of the last execution of each (failed/succeeded).

Cloudera Employee
Posts: 206
Registered: ‎07-08-2013

Re: Need additional documentation for rest API - replication status

If your objective is to "get the state of the last execution (failed/succeeded)" then, if I remember correctly, each replication job generates an AUDIT event [0], so a workaround would be to filter the Events [1].

In CM, go to Diagnostics > Events and filter by:

Category: AUDIT_EVENT

Event Code: EV_HDFS_DISTCP

Parsing COMMAND_ARGS gives you the scheduleId.

You can then group the results by COMMAND_ID to get the execution flow.

COMMAND_STATUS records each state the command went through: STARTED, FAILED, SUCCEEDED, or ABORTED.

 

[0] https://cloudera.github.io/cm_api/apidocs/v17/path__events.html

[1] http://cm.cloudera.com:7180/api/v12/events?query=category==AUDIT_EVENT;attributes.eventcode==EV_HDFS...
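The grouping step described above could look roughly like this in a script. This is a sketch under assumptions: it takes events that have already been fetched and decoded from JSON, it assumes each event carries a timeOccurred timestamp in a uniform ISO 8601 format and an attributes list of {"name": ..., "values": [...]} entries, and the helper names are hypothetical. Only the attribute names COMMAND_ID and COMMAND_STATUS come from this reply.

```python
from collections import defaultdict


def attr(event, name):
    """Return the first value of the named event attribute, or None."""
    for a in event.get("attributes", []):
        if a.get("name") == name:
            values = a.get("values") or []
            return values[0] if values else None
    return None


def status_flow_by_command(events):
    """Group events by COMMAND_ID and list their statuses in time order.

    Assumption: timeOccurred strings share one ISO 8601 format, so they
    sort chronologically as plain strings.
    """
    grouped = defaultdict(list)
    for event in events:
        command_id = attr(event, "COMMAND_ID")
        if command_id is not None:
            grouped[command_id].append(event)

    flows = {}
    for command_id, group in grouped.items():
        group.sort(key=lambda e: e.get("timeOccurred", ""))
        flows[command_id] = [attr(e, "COMMAND_STATUS") for e in group]
    return flows


# Made-up sample data, for illustration only.
sample = [
    {"timeOccurred": "2017-01-01T10:05:00.000Z",
     "attributes": [{"name": "COMMAND_ID", "values": ["42"]},
                    {"name": "COMMAND_STATUS", "values": ["SUCCEEDED"]}]},
    {"timeOccurred": "2017-01-01T10:00:00.000Z",
     "attributes": [{"name": "COMMAND_ID", "values": ["42"]},
                    {"name": "COMMAND_STATUS", "values": ["STARTED"]}]},
]
flows = status_flow_by_command(sample)
print(flows["42"][-1])  # last recorded status for command 42
```

The last element of each list is the most recent status for that command; mapping a command back to its schedule would still require parsing COMMAND_ARGS, whose format is not shown in this thread.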

 

Explorer
Posts: 19
Registered: ‎11-15-2016

Re: Need additional documentation for rest API - replication status

Michalis, thanks for the nice workaround!
