(You can check the port in Ambari under the property dfs.datanode.http.address.)
In the JSON response, look for the value of the "BytesWritten" field under the bean named "Hadoop:service=DataNode,name=DataNodeActivity-<your datanode hostname>-50010". There should be only one such entry per DataNode.
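A minimal sketch of extracting that field is shown below. The JSON shape is an assumption based on the standard Hadoop `/jmx` servlet output (a top-level `"beans"` array); the hostname `dn1.example.com` and the sample value are hypothetical. In practice you would fetch the document from `http://<datanode-host>:<http-port>/jmx` rather than a hard-coded string.

```python
import json

# Hypothetical, trimmed sample of a DataNode /jmx response.
SAMPLE = json.dumps({
    "beans": [
        {
            "name": "Hadoop:service=DataNode,name=DataNodeActivity-dn1.example.com-50010",
            "BytesWritten": 10485760,
        }
    ]
})

def bytes_written(jmx_json: str) -> int:
    """Return BytesWritten from the DataNodeActivity bean."""
    for bean in json.loads(jmx_json).get("beans", []):
        if bean.get("name", "").startswith(
            "Hadoop:service=DataNode,name=DataNodeActivity"
        ):
            return bean["BytesWritten"]
    raise ValueError("DataNodeActivity bean not found")

print(bytes_written(SAMPLE))  # 10485760
```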
The above value reflects the actual data change (additions, deletions, and modifications). For example, if you add and then delete an HDFS file, the counter does not decrease; it keeps growing and therefore reflects the true rate of data change.
Summing these values across all DataNodes gives the overall data change for the entire cluster.
Finally, to get the actual data size that Data Lifecycle Manager (DLM) will transfer, divide this sum by the replication factor of the source cluster (the cluster from which replication is initiated).
For example, if the "BytesWritten" value for each DataNode is 10 MB and there are 10 DataNodes, then, assuming a replication factor of 3 at the source, DLM will transfer 100 MB / 3 ≈ 33.3 MB.
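The arithmetic above can be sketched as a small helper. The function name is hypothetical; the logic simply follows the text: the raw sum overcounts the logical data by the replication factor, since each block is written that many times.

```python
def dlm_transfer_bytes(per_node_bytes_written, replication_factor):
    """Estimate the logical data DLM will transfer.

    Each HDFS block is physically written replication_factor times,
    so the summed BytesWritten overcounts by that factor.
    """
    return sum(per_node_bytes_written) / replication_factor

# The worked example from the text: 10 DataNodes at 10 MB each, replication 3.
MB = 1024 * 1024
estimate = dlm_transfer_bytes([10 * MB] * 10, 3)
print(round(estimate / MB, 1))  # 33.3
```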
Note - If you plan to run data replication jobs in Data Lifecycle Manager on a daily basis, you can capture these JMX metrics each day and calculate the difference between consecutive days to get the single-day data change. A simple script in your preferred language can accomplish this. Also keep in mind that if a DataNode is restarted, its "BytesWritten" JMX value resets to 0.
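The daily-difference logic, including a guard for the counter reset mentioned above, might look like this. The reset handling is an assumption: when the counter goes backwards we treat today's reading as a lower bound for the day's writes, since anything written before the restart is lost from the metric.

```python
def daily_delta(today: int, yesterday: int) -> int:
    """Difference between two consecutive daily BytesWritten samples.

    If the counter went backwards, the DataNode was restarted and the
    metric reset to 0; today's reading is then the best available
    (lower-bound) estimate for that day's writes.
    """
    if today < yesterday:
        return today
    return today - yesterday

print(daily_delta(500, 300))  # 200
print(daily_delta(50, 300))   # 50 (restart detected)
```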