NotEnoughReplicasException when writing into a partitioned hive table.

Hey guys, I have a 1.5 GB dataset that I am trying to write into an external partitioned Hive table. The write fails with: "could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation."

Here are the exceptions from the NameNode's log.

java.io.IOException: File /user/maria_dev/data2/.hive-staging_hive_2019-02-08_11-05-21_487_8982509445885981287-1/_task_tmp.-ext-10000/cityid=219/_tmp.000013_0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)                                                                                              
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)                                                                                                        
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)                                                                                                        
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)                                                                                                         
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)                                                        
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)                                                     
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)                                                                                                    
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)                                                                                                                                                   
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)                                                                                                                                          
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)                                                                                                                                          
        at java.security.AccessController.doPrivileged(Native Method)                                                                                                                                            
        at javax.security.auth.Subject.doAs(Subject.java:422)                                                                                                                                                    
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)                                                                                                                  
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)                          

and

2019-02-08 11:05:45,906 DEBUG net.NetworkTopology (NetworkTopology.java:chooseRandom(796)) - chooseRandom returning null                                                                                         
2019-02-08 11:05:45,906 DEBUG blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseLocalRack(547)) - Failed to choose from local rack (location = /default-rack); the second replica is not found, retry choosing ramdomly
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:                                                                                                                   
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:701)                                                                          
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:622)                                                                          
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:529)                                                                       
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:489)                                                                    
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:341)                                                                          
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:216)                                                                          
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:113)                                                                          
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:128)                                                                          
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1710)                                                                                              
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)                                                                                                        
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)                                                                                                        
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)                                                                                                         
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)                                                        
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)                                                     
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)                                                                                                    
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)                                                                                                                                                   
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)                                                                                                                                          
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)                                                                                                                                          
        at java.security.AccessController.doPrivileged(Native Method)                                                                                                                                            
        at javax.security.auth.Subject.doAs(Subject.java:422)                                                                                                                                                    
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)                                                                                                                  
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)                

I don't understand what's happening: I have 70 GB of free space on HDFS, and the dataset is pretty small. What could be going wrong?

Inserting into a non-partitioned table works, but inserting into a partitioned table doesn't.

Here's my code for inserting:

insert into table partbrowserdata partition(cityid) 
select /*column names omitted*/
from browserdata; 

I tried this on all Hive execution engines, mr and tez, and also on Spark on my Cloudera cluster; it fails on all of them.

By the way, the insert creates about 300 partitions; maybe that's too many? I tried changing settings like hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode, but it did not help.
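
For reference, here is a minimal sketch of how the dynamic-partition settings mentioned above are usually set in a Hive session before running the insert. The values are illustrative assumptions, not the ones from the failing runs; they are only chosen to sit above the roughly 300 partitions the insert creates.

```sql
-- Illustrative values only (assumptions); the post does not state the values actually used.
SET hive.exec.dynamic.partition=true;
-- nonstrict mode allows every partition column (here: cityid) to be dynamic
SET hive.exec.dynamic.partition.mode=nonstrict;
-- upper bound on dynamic partitions created by the whole statement
SET hive.exec.max.dynamic.partitions=1000;
-- upper bound on dynamic partitions created by a single mapper or reducer
SET hive.exec.max.dynamic.partitions.pernode=500;
```

These limits only control how many partitions Hive is allowed to create. If one of them were exceeded, Hive would reject the statement with its own error, so they are probably not what triggers the HDFS replication failure shown above.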

I should add that I tried this on newly installed Hortonworks sandboxes (with 20 GB of RAM and 6 processor cores), versions 2.6 and 3.0, as well as on a Cloudera cluster, and it failed on all of them. It did work on MapR, though, probably because it uses a different file system?

Please help.

Re: NotEnoughReplicasException when writing into a partitioned hive table.
