Reply
Highlighted
New Contributor
Posts: 1
Registered: ‎06-08-2018

Cloudera Manager API - BDR Replication issues

Having similar issues as has been noted here:  https://community.cloudera.com/t5/Cloudera-Manager-Installation/Need-additional-documentation-for-re...

 

Whilst above link gives a good steer to the events API, we have a performance concern which increases linearly with history over time.

 

There are issues with the query facilities on the event API.  This works: http://<SERVER>/api/v17/events?query=attributes.command==HiveReplicationCommand

 

But, this does not work: http://<SERVER>/api/v17/events?query=attributes.command_id==59974

(is it due to the command_id being numeric?)

 

Intended process:

  • POST BDR job execution using scheduleId
  • Grab COMMAND_ID from returned object
  • Query events API using parameter attributes.command_id==<COMMAND_ID>
  • ….

Instead, we must query on “attributes.command==HiveReplicationCommand” and parse the entire return to isolate the value of interest.  This is a concern due to an ever increasing payload on the returned object.

 

Can the above query issue be investigated?

 

Another observation is that the BDR replication API does not support invocation / query by scheduleName. 

 

The endpoint at http://<SERVER>/api/v17/clusters/cluster/services/hive/replication where the scheduleName and scheduleID association can be made is not usable when here is a large amount of BDR history.   It attempts to pull summary detail for all historical runs in the return and is unnecessarily expensive.

 

We find this is an issue for route-to-live, as for example it is not possible to define the scheduleName in metadata in advance for onward deployment.  We must create the jobs first and then capture the ID value in metadata.

 

 

Announcements