regionserver going down with below logs

New Contributor

Hi All,

I have two datanodes with RF = 2 and two region servers. Yesterday both of my region servers suddenly went down. The logs are below:

==============================================================

2016-10-05 01:59:08,210 INFO [regionserver/datanode1.qastaging.test.com/10.168.110.104:16020-shortCompactions-1475657912326] regionserver.HStore: Completed compaction of 3 (all) file(s) in 0 of testnamespace.testtable,\x12\x00\x00\x00\x00\x00\x00\x00\x00,1457709158600.eaf10e5a7b26a23ad22319816f55bca7. into 3df5ed9f6ab34ce69fffc051133e3a01(size=41.1 M), total size for store is 41.1 M. This selection was in queue for 0sec, and took 4sec to execute.
2016-10-05 01:59:08,210 INFO [regionserver/datanode1.qastaging.test.com/10.168.110.104:16020-shortCompactions-1475657912326] regionserver.CompactSplitThread: Completed compaction: Request = regionName=testnamespace.testtable,\x12\x00\x00\x00\x00\x00\x00\x00\x00,1457709158600.eaf10e5a7b26a23ad22319816f55bca7., storeName=0, fileCount=3, fileSize=41.1 M, priority=7, time=73158687399839; duration=4sec
2016-10-05 01:59:23,001 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl: Atomically moving datanode1.qastaging.test.com,16020,1475654176157's wals to my queue
2016-10-05 02:00:20,973 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl: Atomically moving datanode1.qastaging.test.com,16020,1475651860000's wals to my queue
2016-10-05 02:01:07,452 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl: Atomically moving datanode2.qastaging.test.com,16020,1475654172159's wals to my queue
2016-10-05 02:03:23,459 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=492.76 MB, freeSize=285.49 MB, max=778.25 MB, blockCount=6076, accesses=47444, hits=40025, hitRatio=84.36%, , cachingAccesses=44249, cachingHits=37348, cachingHitsRatio=84.40%, evictions=29, evicted=0, evictedPerRun=0.0
2016-10-05 02:08:23,459 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=492.76 MB, freeSize=285.49 MB, max=778.25 MB, blockCount=6076, accesses=47444, hits=40025, hitRatio=84.36%, , cachingAccesses=44249, cachingHits=37348, cachingHitsRatio=84.40%, evictions=59, evicted=0, evictedPerRun=0.0
2016-10-05 02:10:44,431 INFO [B.defaultRpcServer.handler=25,queue=1,port=16020] compress.CodecPool: Got brand-new decompressor [.gz]
2016-10-05 02:10:58,093 INFO [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2850ms
GC pool 'ParNew' had collection(s): count=1 time=1074ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=1934ms
2016-10-05 02:11:01,005 INFO [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2169ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=2475ms
2016-10-05 02:11:03,596 INFO [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2090ms

===============================================================================
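The JvmPauseMonitor entries in the log above can be tallied with a small shell helper, which may make it easier to see how much time the process spent paused. This is only a sketch: the function name `summarize_pauses` is made up for illustration, and it assumes a `grep` that supports `-o` (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper: count the JvmPauseMonitor pauses in a regionserver
# log file and add up their reported durations in milliseconds.
summarize_pauses() {
    # grep -o prints every match on its own line, so this also works when
    # several log entries have been pasted onto a single line.
    grep -o 'pause of approximately [0-9]*ms' "$1" |
        awk '{ sub(/ms$/, "", $4); total += $4; n++ }
             END { printf "pauses=%d total_ms=%d\n", n, total }'
}
```

It would be called as `summarize_pauses /path/to/regionserver.log`, where the path is a placeholder for the actual log file.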

I have already tried the GC settings suggested on the HW community, but they did not help.

My server has 8 GB of RAM, and the Datanode is running with a 1 GB heap. Let me know if more information is needed.


Re: regionserver going down with below logs

Super Collaborator

Can you tell us which GC settings you use?

For future log postings, consider using pastebin, which preserves line breaks.

How much heap do you give your region server?

Re: regionserver going down with below logs

New Contributor

Hi @Ted Yu,

Below are my RegionServer settings:

export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmn512m -XX:CMSInitiatingOccupancyFraction=70 -Xms2048m -Xmx2048m $JDK_DEPENDED_OPTS"

Thanks

Re: regionserver going down with below logs

Remember that Datanode is a service in HDFS. RegionServer is a service for HBase. Regions are hosted by RegionServers, not Datanodes.

Re: regionserver going down with below logs

Please attach a GC log file and the complete regionserver logs.
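If GC logging is not already enabled, a GC log could be produced by adding the standard HotSpot logging flags to the RegionServer options in `conf/hbase-env.sh` and restarting the RegionServer. The lines below are a sketch under two assumptions: the JVM is Java 7 or 8 (later JVMs use `-Xlog:gc*` in place of these flags), and the log path, which is a placeholder, is writable by the HBase user.

```shell
# Sketch for conf/hbase-env.sh: enable GC logging for the RegionServer JVM.
# The flags below are Java 7/8 HotSpot options.
# GC_LOG is a hypothetical location; point it at a real, writable directory.
GC_LOG=/var/log/hbase/gc-regionserver.log
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$GC_LOG"
```

The existing heap and CMS options stay as they are; these flags only add logging.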
