Created on 12-31-2015 09:46 AM - edited 09-16-2022 02:55 AM
Hello,
In my cluster (4 nodes) I have some blocks that do not meet the replication factor, and I cannot find any issue.
About versions:
Cluster installed with Ambari
HDP 2.3.2.0.2950
HDFS 2.7.1.2.3
Replication factor: 3
Here is the NameNode log:
2015-12-31 10:09:33,004 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(407)) - chooseUnderReplicatedBlocks selected 8 blocks at priority level 2; Total=8
2015-12-31 10:09:33,004 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1522)) - BLOCK* neededReplications = 6848, pendingReplications = 0.
2015-12-31 10:09:33,004 INFO blockmanagement.BlockManager (BlockManager.java:computeReplicationWorkForBlocks(1529)) - Blocks chosen but could not be replicated = 8; of which 8 have no target, 0 have no source, 0 are UC, 0 are abandoned, 0 already have enough replicas.
2015-12-31 10:09:36,005 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(407)) - chooseUnderReplicatedBlocks selected 8 blocks at priority level 2; Total=8
2015-12-31 10:09:36,005 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1522)) - BLOCK* neededReplications = 6848, pendingReplications = 0.
2015-12-31 10:09:36,005 INFO blockmanagement.BlockManager (BlockManager.java:computeReplicationWorkForBlocks(1529)) - Blocks chosen but could not be replicated = 8; of which 8 have no target, 0 have no source, 0 are UC, 0 are abandoned, 0 already have enough replicas.
2015-12-31 10:09:38,635 INFO hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3574)) - BLOCK* allocate blk_1073964742_223940{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-0c616a45-8b95-4c41-b592-0821b07a8803:NORMAL:10.1.1.82:50010|RBW], ReplicaUC[[DISK]DS-0a078e05-8825-4ea4-952d-5dc617317989:NORMAL:10.1.1.28:50010|RBW], ReplicaUC[[DISK]DS-a81872bf-681a-41d9-b5bd-a4241536e2ed:NORMAL:10.1.1.26:50010|RBW], ReplicaUC[[DISK]DS-be0ad8a7-2413-493d-9706-b7bf46dad668:NORMAL:10.1.1.24:50010|RBW]]} for /apps/accumulo/data/tables/!0/table_info/F0001jq5.rf_tmp
2015-12-31 10:09:38,677 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3494)) - DIR* completeFile: /apps/accumulo/data/tables/!0/table_info/F0001jq5.rf_tmp is closed by DFSClient_NONMAPREDUCE_-1241177763_29
2015-12-31 10:09:38,872 INFO hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3574)) - BLOCK* allocate blk_1073964743_223941{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-e21e239f-1ad9-4aff-8a8c-419671f8f87a:NORMAL:10.1.1.82:50010|RBW], ReplicaUC[[DISK]DS-a47ab1c7-9f5a-48fb-8613-1557c412c20a:NORMAL:10.1.1.28:50010|RBW], ReplicaUC[[DISK]DS-be0ad8a7-2413-493d-9706-b7bf46dad668:NORMAL:10.1.1.24:50010|RBW], ReplicaUC[[DISK]DS-165a6e3c-9ded-496e-8c21-40f32f482e3e:NORMAL:10.1.1.26:50010|RBW]]} for /apps/accumulo/data/tables/+r/root_tablet/F0001jq6.rf_tmp
2015-12-31 10:09:38,916 INFO hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3574)) - BLOCK* allocate blk_1073964744_223942{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-e21e239f-1ad9-4aff-8a8c-419671f8f87a:NORMAL:10.1.1.82:50010|RBW], ReplicaUC[[DISK]DS-165a6e3c-9ded-496e-8c21-40f32f482e3e:NORMAL:10.1.1.26:50010|RBW], ReplicaUC[[DISK]DS-9d1e974d-56e1-4f5f-9b2c-9437041cf148:NORMAL:10.1.1.24:50010|RBW], ReplicaUC[[DISK]DS-0a078e05-8825-4ea4-952d-5dc617317989:NORMAL:10.1.1.28:50010|RBW]]} for /apps/accumulo/data/tables/!0/table_info/A0001jq7.rf_tmp
2015-12-31 10:09:38,919 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3494)) - DIR* completeFile: /apps/accumulo/data/tables/+r/root_tablet/F0001jq6.rf_tmp is closed by DFSClient_NONMAPREDUCE_-1241177763_29
2015-12-31 10:09:38,969 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3494)) - DIR* completeFile: /apps/accumulo/data/tables/!0/table_info/A0001jq7.rf_tmp is closed by DFSClient_NONMAPREDUCE_-1241177763_29
2015-12-31 10:09:39,006 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(407)) - chooseUnderReplicatedBlocks selected 8 blocks at priority level 2; Total=8
2015-12-31 10:09:39,006 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1522)) - BLOCK* neededReplications = 6851, pendingReplications = 0.
2015-12-31 10:09:39,006 INFO blockmanagement.BlockManager (BlockManager.java:computeReplicationWorkForBlocks(1529)) - Blocks chosen but could not be replicated = 8; of which 8 have no target, 0 have no source, 0 are UC, 0 are abandoned, 0 already have enough replicas.
2015-12-31 10:09:39,051 INFO hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3574)) - BLOCK* allocate blk_1073964745_223943{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-e21e239f-1ad9-4aff-8a8c-419671f8f87a:NORMAL:10.1.1.82:50010|RBW], ReplicaUC[[DISK]DS-165a6e3c-9ded-496e-8c21-40f32f482e3e:NORMAL:10.1.1.26:50010|RBW], ReplicaUC[[DISK]DS-0a078e05-8825-4ea4-952d-5dc617317989:NORMAL:10.1.1.28:50010|RBW], ReplicaUC[[DISK]DS-9d1e974d-56e1-4f5f-9b2c-9437041cf148:NORMAL:10.1.1.24:50010|RBW]]} for /apps/accumulo/data/tables/+r/root_tablet/A0001jq8.rf_tmp
2015-12-31 10:09:39,102 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3494)) - DIR* completeFile: /apps/accumulo/data/tables/+r/root_tablet/A0001jq8.rf_tmp is closed by DFSClient_NONMAPREDUCE_-1241177763_29
2015-12-31 10:09:42,007 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(407)) - chooseUnderReplicatedBlocks selected 8 blocks at priority level 2; Total=8
2015-12-31 10:09:42,007 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1522)) - BLOCK* neededReplications = 6852, pendingReplications = 0.
2015-12-31 10:09:42,007 INFO blockmanagement.BlockManager (BlockManager.java:computeReplicationWorkForBlocks(1529)) - Blocks chosen but could not be replicated = 8; of which 8 have no target, 0 have no source, 0 are UC, 0 are abandoned, 0 already have enough replicas.
Any help would be appreciated.
Thanks
Stephane.
Created 01-04-2016 11:40 AM
If you have 4 nodes, HDFS will not be able to replicate 8 copies. It looks like some tmp files from Accumulo. While I do not know Accumulo too well, some small files (like JARs) are normally written with a high replication factor so that they are locally available on most nodes.
You can check the filesystem with:
hadoop fsck / -files -blocks -locations
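For example, to narrow the check to the Accumulo directory that shows up in your log and list only the problem files, something like this (a quick sketch; the path is taken from your log above, adjust as needed):

# check only the Accumulo data directory instead of the whole filesystem
hadoop fsck /apps/accumulo/data -files -blocks -locations

# show only the files whose blocks are flagged; fsck reports the target
# replication each file was written with versus the replicas actually found
hadoop fsck /apps/accumulo/data -files -blocks | grep -i "under replicated"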
Normally programs honor a parameter called max replication (dfs.replication.max in hdfs-site.xml), which in your case would be your 4 datanodes, but it seems like Accumulo doesn't always do that:
https://issues.apache.org/jira/browse/ACCUMULO-683
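To verify what the cluster allows versus what the files ask for, a quick sketch (dfs.replication.max is the standard HDFS property, default 512; the path is taken from your log above):

# what the NameNode allows as the maximum replication factor
hdfs getconf -confKey dfs.replication.max

# the second column of the listing is the replication factor each file
# was actually written with
hdfs dfs -ls /apps/accumulo/data/tables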
Is this causing any problems, or are you just worried about the errors in the logs?
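Either way, if you want the neededReplications counter to drain, one possible cleanup, as a sketch only (the accumulo.metadata table name and the table.file.replication property are assumptions based on the JIRA above; verify both against your Accumulo version before running anything):

# lower the requested replication of files already written with a target
# the 4-node cluster cannot satisfy (-w waits until replication settles)
hdfs dfs -setrep -w 3 /apps/accumulo/data/tables

# and, in the Accumulo shell, cap the metadata table's file replication so
# new files are written with an achievable factor (assumed syntax)
config -t accumulo.metadata -s table.file.replication=3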