Created 08-21-2017 07:39 AM
We recently upgraded Ambari from 2.2 to 2.5 and HDP from 2.4 to 2.6. We have the HBase bucket cache enabled, and since the upgrade the HBase Master fails continuously on startup with the exception shown below. Once we disable the bucket cache, the HBase Master process comes up. How can we fix this?
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2756)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:235)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2770)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:658)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
    at org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)
    at org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)
    at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:311)
    at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:221)
    at org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:614)
    at org.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:553)
    at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:637)
    at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:231)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:576)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:425)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
Created 08-21-2017 08:25 AM
It seems you are hitting this bug: https://issues.apache.org/jira/browse/AMBARI-13325
In the HBase config, make sure the following parameters have no values.
- hbase.bucketcache.size
- hbase.bucketcache.ioengine
- hbase.bucketcache.percentage.in.combinedcache
You can try removing these three configurations using /var/lib/ambari-server/resources/scripts/configs.sh
For example:
/var/lib/ambari-server/resources/scripts/configs.sh delete localhost falconK hbase-site hbase.bucketcache.ioengine
where localhost is my Ambari server hostname and falconK is my cluster name.
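To cover all three properties listed above, the same delete call can be repeated in a loop. The following is only a sketch: the host name and cluster name are the ones from the example command and must be replaced with your own, and the function name `bucketcache_delete_cmds` is made up for illustration. It prints the commands as a dry run rather than executing them.

```shell
#!/bin/sh
# Sketch: build one configs.sh delete call per bucket cache property.
# AMBARI_HOST and CLUSTER are placeholders taken from the example above.
CONFIGS_SH=/var/lib/ambari-server/resources/scripts/configs.sh
AMBARI_HOST=localhost
CLUSTER=falconK

# Hypothetical helper: prints the three delete commands, one per line.
bucketcache_delete_cmds() {
    for key in hbase.bucketcache.size \
               hbase.bucketcache.ioengine \
               hbase.bucketcache.percentage.in.combinedcache; do
        echo "$CONFIGS_SH delete $AMBARI_HOST $CLUSTER hbase-site $key"
    done
}

# Dry run: this only prints the commands so they can be reviewed first.
bucketcache_delete_cmds
```

Once the printed commands look right, they can be run one by one on the Ambari server host, followed by a restart of HBase so the change takes effect.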
Hope this helps @ARUN
Created 08-21-2017 09:18 AM
Hi @Arun,
The bucket cache is off-heap memory, allocated directly from main memory (RAM). Can you please check whether there are any memory management errors in /var/log/messages (or syslog)? They may indicate that the host is running short of memory.
It is possible that other components (services) on the server increased their memory usage and the host hit the 100% mark (for example, when swappiness is set to 0).
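A quick way to run that check is to grep the system log for out-of-memory messages. This is a minimal sketch: the function name `scan_oom_log` is hypothetical, and the search patterns are an assumption about what typical Linux OOM killer messages look like, so adjust them to what your kernel actually logs.

```shell
#!/bin/sh
# Sketch: look for kernel out-of-memory messages in a system log file.
# The patterns below are assumptions about typical OOM killer output.
scan_oom_log() {
    log_file=$1
    grep -iE 'out of memory|oom-killer|killed process' "$log_file"
}

# Usage (the log path depends on the distribution):
#   scan_oom_log /var/log/messages
#   cat /proc/sys/vm/swappiness   # current swappiness setting
#   free -m                       # memory usage in MB
```

If the function prints nothing, the log contains no lines matching those patterns, which does not by itself rule out memory pressure.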
The following link has a table showing how much bucket cache should be allocated on each RegionServer.
Please note that the computation assumes the node is a dedicated worker node. If other components run on the same host, you must subtract their JVM memory (the -Xmx size of each component residing on the host).
Created 08-21-2017 09:20 AM
@Akhil S Naik, thanks for your answer, but it won't help in our case. We already had the bucket cache enabled and we rely on it, so removing the bucket cache is not a good option for us; we need to retain it.
These were our values
hbase.bucketcache.size = 92160 MB
MaxDirectMemorySize=94208 MB
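As a sanity check on these two numbers, the difference between them can be computed in the shell. This is only an illustrative sketch using the values from this post; the helper name `headroom_mb` is made up.

```shell
#!/bin/sh
# Values from the post above, in MB.
BUCKET_CACHE_MB=92160
MAX_DIRECT_MB=94208

# Hypothetical helper: direct memory left over after the bucket cache (MB).
headroom_mb() {
    echo $(( $2 - $1 ))
}

# Integer arithmetic: 1 GB = 1024 MB.
echo "bucket cache:      $((BUCKET_CACHE_MB / 1024)) GB"
echo "max direct memory: $((MAX_DIRECT_MB / 1024)) GB"
echo "headroom:          $(headroom_mb "$BUCKET_CACHE_MB" "$MAX_DIRECT_MB") MB"
```

So the direct memory limit is larger than the bucket cache by 2048 MB on any JVM where that limit is actually applied.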
Created 08-21-2017 10:17 AM
@arun, you need to set -XX:MaxDirectMemorySize=<more_than_bucket_cache_size> in HBASE_MASTER_OPTS as well, because the Master internally starts a RegionServer for some tasks.
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:MaxDirectMemorySize=<more_than_bucket_cache_size>"
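With the sizes mentioned earlier in this thread filled in, an hbase-env fragment could look like the sketch below. The 2048 MB headroom is an assumption (it is simply the difference between the two values posted above), not a recommended figure, and the variable names `BUCKET_CACHE_MB`, `HEADROOM_MB` and `MAX_DIRECT_MB` are introduced here only for illustration.

```shell
#!/bin/sh
# Sketch for hbase-env: derive one direct memory limit and apply it to both
# the Master and the RegionServer JVM options.
# HEADROOM_MB is an assumption, not a tuned or recommended value.
BUCKET_CACHE_MB=92160
HEADROOM_MB=2048
MAX_DIRECT_MB=$((BUCKET_CACHE_MB + HEADROOM_MB))

export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:MaxDirectMemorySize=${MAX_DIRECT_MB}m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=${MAX_DIRECT_MB}m"
```

Deriving both options from one variable keeps the Master and RegionServer limits from drifting apart when the bucket cache size changes.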
Created 08-21-2017 10:57 AM
We have set this as shown below:
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=94208m "
and our bucket cache size is 92160 MB.
This was working perfectly before the upgrade:
92 GB (max direct memory) > 90 GB (bucket cache)
We will try setting it for the Master as well and get back.
Created 08-21-2017 10:59 AM
You need to set the same for the Master opts as well, as described in my last comment.