Created 02-03-2017 10:25 PM
What does this error mean, and how do I fix it?
Created 02-05-2017 05:57 PM
Can you provide the container logs?
Created 02-06-2017 06:24 AM
Hi Leo, it seems that this error can happen because of miscommunication between the RM, an app's AM, and one or more NMs. One scenario is described in YARN-3535, in particular in this post. How to fix it depends on what you are doing. If it happens after an HDP upgrade, you can try clearing the affected NM recovery directories and restarting YARN. If it happens with a particular app, then something might be wrong with the app, such as missing configs.
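To find which directory to clear, it can help to look up the recovery settings in the NodeManager's yarn-site.xml. Below is a minimal sketch, assuming the standard Hadoop `<configuration>/<property>` XML layout and the `yarn.nodemanager.recovery.dir` property. The sample XML and the path inside it are made up for illustration; on a real node you would read the actual yarn-site.xml from wherever your distribution keeps it.

```python
import xml.etree.ElementTree as ET


def find_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical sample; the recovery path below is an illustrative value only.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.recovery.dir</name>
    <value>/var/log/hadoop-yarn/nodemanager/recovery-state</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(find_property(SAMPLE, "yarn.nodemanager.recovery.enabled"))
    print(find_property(SAMPLE, "yarn.nodemanager.recovery.dir"))
```

If the property is not set explicitly, the function returns None and the NodeManager falls back to its built-in default, so check the defaults for your version before deleting anything.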
Created 08-16-2018 08:48 PM
I have deployed a cluster with 3 data nodes. The YARN/MapReduce2/HDFS version is 2.7.3 on HDP.
While running Teragen and Gobblin, the following YARN errors are reported in the logs. The errors are reported only when the number of map tasks defined for the job is greater than the number of data nodes in the cluster.
For Teragen: -Dmapreduce.job.maps=4
For Gobblin: mr.job.max.mappers=4
There are no errors if the number of map tasks (splits) is <= the number of data nodes.
2018-08-16 06:54:05,681 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1534394833079_0012_01_000006
2018-08-16 05:00:50,138 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1534394833079_0001_01_000055
2018-08-16 05:00:50,138 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1534394833079_0001_01_000054
2018-08-16 05:00:50,138 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1534394833079_0001_01_000054
2018-08-16 05:00:50,138 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1534394833079_0001_01_000053
2018-08-16 05:00:50,138 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1534394833079_0001_01_000053
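To see how many containers are affected per application, the AM log can be summarized with a short script. This is only a sketch, assuming the message text and the `container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>` ID layout seen in the lines above. Container IDs that carry an extra epoch field would need a different pattern, and the function name is my own.

```python
import re
from collections import defaultdict

# Matches the error message and captures the container ID that follows it.
UNKNOWN = re.compile(
    r"Container complete event for unknown container id "
    r"(container_\d+_\d+_\d+_\d+)"
)


def unknown_containers_by_app(log_text):
    """Group container IDs from 'unknown container id' errors by application ID."""
    by_app = defaultdict(list)
    for match in UNKNOWN.finditer(log_text):
        container_id = match.group(1)
        # container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>
        parts = container_id.split("_")
        app_id = "application_{}_{}".format(parts[1], parts[2])
        by_app[app_id].append(container_id)
    return dict(by_app)


PREFIX = ("[RMCommunicator Allocator] "
          "org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ")
LOG = "\n".join([
    "2018-08-16 06:54:05,681 ERROR " + PREFIX
    + "Container complete event for unknown container id "
      "container_1534394833079_0012_01_000006",
    "2018-08-16 05:00:50,138 INFO " + PREFIX
    + "Received completed container container_1534394833079_0001_01_000054",
    "2018-08-16 05:00:50,138 ERROR " + PREFIX
    + "Container complete event for unknown container id "
      "container_1534394833079_0001_01_000054",
])

if __name__ == "__main__":
    for app, containers in unknown_containers_by_app(LOG).items():
        print(app, len(containers), containers)
```

The pattern is searched across the whole text rather than line by line, so it still works when several log entries end up on one line.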
Please let me know how to avoid these errors in the YARN logs.