Slow distcp job termination

I'm copying several directories, each containing several files, with distcp.

The time the YARN application takes to terminate fully seems to depend on the number of objects transferred.

In this case there are about 9 minutes between 100% map completion and job termination:

26/06/05 10:47:20 INFO mapreduce.Job: map 100% reduce 0%
26/06/05 10:56:25 INFO mapreduce.Job: Job job_1780210407885_1954 completed successfully
with
Files Copied=68744
DIR_COPY=23058

while here it takes only a few seconds:

26/06/05 10:43:11 INFO mapreduce.Job:  map 100% reduce 0%
26/06/05 10:43:32 INFO mapreduce.Job: Job job_1780210407885_1958 completed successfully
with
Files Copied=12300
DIR_COPY=191
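
To put numbers on the two cases, here is a minimal Python sketch that computes the delay between the "map 100%" line and the "completed successfully" line for each job and divides it by the number of copied objects. The helper name `gap_seconds` and the choice to count Files Copied plus DIR_COPY as "objects" are assumptions made for illustration only; the timestamps and counters are taken from the log excerpts above.

```python
from datetime import datetime

# Timestamp layout of the client-side log lines above (yy/MM/dd HH:mm:ss).
FMT = "%y/%m/%d %H:%M:%S"


def gap_seconds(start: str, end: str) -> float:
    """Seconds elapsed between two client log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# (map 100% time, completion time, Files Copied, DIR_COPY) from the two runs.
jobs = {
    "job_1780210407885_1954": ("26/06/05 10:47:20", "26/06/05 10:56:25", 68744, 23058),
    "job_1780210407885_1958": ("26/06/05 10:43:11", "26/06/05 10:43:32", 12300, 191),
}

for job_id, (start, end, files, dirs) in jobs.items():
    gap = gap_seconds(start, end)
    objects = files + dirs  # assumption: files and directories both count
    print(f"{job_id}: {gap:.0f} s after map 100%, "
          f"{1000 * gap / objects:.2f} ms per object")
```

The gaps come out at 545 seconds and 21 seconds, so the slow job waits roughly 26 times longer while having only about 7 times as many objects. The delay therefore does not look strictly proportional to the object count.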

Is there any explanation for this behaviour?

3 REPLIES

Checking the NodeManager logs, the time is spent between the removal of the completed containers and the point where the container is declared succeeded:

2026-06-05 11:08:15,452 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_e22_1780210407885_1976_01_000013]
2026-06-05 11:17:55,672 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container container_e22_1780210407885_1976_01_000001 succeeded

And checking the ResourceManager log, there is a huge number of calls to the AM, one every 10 seconds:

2026-06-05 11:09:39,050 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=arcgis   OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1780210407885_1976    CONTAINERID=container_e22_1780210407885_1976_01_000014  RESOURCE=<memory:2048, vCores:1>        QUEUENAME=arcgis
2026-06-05 11:09:40,102 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: null is accessing unchecked http://almapwrk15.data.com:34620/ws/v1/mapreduce/jobs/job_1780210407885_1976 which is the app master GUI of application_1780210407885_1976 owned by arcgis
2026-06-05 11:09:50,116 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: null is accessing unchecked http://almapwrk15.data.com:34620/ws/v1/mapreduce/jobs/job_1780210407885_1976 which is the app master GUI of application_1780210407885_1976 owned by arcgis
.....
2026-06-05 11:17:41,013 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: null is accessing unchecked http://almapwrk15.data.com:34620/ws/v1/mapreduce/jobs/job_1780210407885_1976 which is the app master GUI of application_1780210407885_1976 owned by arcgis
2026-06-05 11:17:49,422 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1780210407885_1976_000001 with final state: FINISHING, and exit status: -1000
2026-06-05 11:17:49,422 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1780210407885_1976_000001 State change from RUNNING to FINAL_SAVING on event = UNREGISTERED
2026-06-05 11:17:49,422 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1780210407885_1976 with final state: FINISHING
2026-06-05 11:17:49,422 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1780210407885_1976 State change from RUNNING to FINAL_SAVING on event = ATTEMPT_UNREGISTERED
2026-06-05 11:17:49,433 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1780210407885_1976
2026-06-05 11:17:49,433 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1780210407885_1976_000001 State change from FINAL_SAVING to FINISHING on event = ATTEMPT_UPDATE_SAVED
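
As a rough cross-check of the "one every 10 seconds" observation, the following Python sketch extracts the `WebAppProxyServlet` timestamps from ResourceManager log lines and estimates how many such calls happened. The regular expression, the helper name `proxy_access_times`, and the abbreviated sample messages are assumptions for illustration. The estimate also assumes the polling interval stayed constant during the part of the log that was elided above.

```python
import re
from datetime import datetime

# Matches "<timestamp> INFO <logger class>: " at the start of an RM log line.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO (\S+): ")
TS_FMT = "%Y-%m-%d %H:%M:%S,%f"


def proxy_access_times(lines):
    """Return the timestamps of WebAppProxyServlet entries in RM log lines."""
    times = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2).endswith("WebAppProxyServlet"):
            times.append(datetime.strptime(match.group(1), TS_FMT))
    return times


# Sample lines from the excerpt above, with the message text abbreviated.
sample = [
    "2026-06-05 11:09:39,050 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: (abbreviated)",
    "2026-06-05 11:09:40,102 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: (abbreviated)",
    "2026-06-05 11:09:50,116 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: (abbreviated)",
    "2026-06-05 11:17:41,013 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: (abbreviated)",
]

times = proxy_access_times(sample)
interval = (times[1] - times[0]).total_seconds()  # spacing of the first two calls
span = (times[-1] - times[0]).total_seconds()     # first to last visible call
estimated_calls = round(span / interval) + 1      # assumes a constant interval

print(f"interval: {interval:.3f} s, span: {span:.3f} s, "
      f"estimated calls: {estimated_calls}")
```

With the three visible timestamps this gives an interval of about 10 seconds over a span of about 8 minutes, which would correspond to roughly 49 status calls before the application attempt unregisters.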

Anyway, everything seems to be related to log collection for the JobHistory server.