Member since: 08-01-2014
Posts: 16
Kudos Received: 0
Solutions: 1

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3194 | 02-15-2017 11:57 AM |
05-15-2019 07:17 AM
Before the "failed open" message, this block appears (only the first line of the Java stack trace is included):
2019-05-14 15:55:53,356 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://athos/hbase/data/default/deveng_v500/ab693aebe203bc8781f1a9f1c0a1d045/recovered.edits/0000000000094270192
2019-05-14 15:55:53,383 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://athos/hbase/data/default/deveng_v500/ab693aebe203bc8781f1a9f1c0a1d045/recovered.edits/0000000000094270299
2019-05-14 15:55:53,722 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://athos/hbase/data/default/deveng_v500/ab693aebe203bc8781f1a9f1c0a1d045/recovered.edits/0000000000094270330
2019-05-14 15:55:53,903 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for tomcat (auth:SIMPLE)
2019-05-14 15:55:53,904 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 10.190.158.151 port: 60648 with unknown version info
2019-05-14 15:55:54,614 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=deveng_v500,\x00\x00\x1C\xAB\x92\xBC\xD8\x02,1544486155414.ab693aebe203bc8781f1a9f1c0a1d045., starting to roll back the global memstore size.
java.lang.IllegalArgumentException: offset (8) + length (2) exceed the capacity of the array: 0
at org.apache.hadoop.hbase.util.Bytes.explainWrongLengthOrOffset(Bytes.java:631)
.............
03-13-2017 07:27 AM
There seem to be issues with the update-alternatives command. These are often caused by a broken alternatives link under /etc/alternatives/ or by a bad (zero-length, see [0]) alternatives configuration file under /var/lib/alternatives; based on your description, it appears to be the former.
The root cause is that the Cloudera Manager Agent relies on the OS-provided update-alternatives binary, but that binary does not report bad entries or other problems, so we have to resort to manually rectifying issues like these. We have an internal improvement JIRA, OPSAPS-39415, to explore options for making alternatives updates during upgrades more resilient.
To recover from the issue, you would need to remove the CDH-related entries from the alternatives configuration files.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1016725
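Before running the cleanup, it can help to confirm which of the two failure modes you have. The sketch below demonstrates both symptoms in a throwaway scratch directory; the directory, link target, and file names are stand-ins, and on a real node you would point the `find` commands at /etc/alternatives and /var/lib/alternatives instead:

```shell
# Demonstrate the two failure modes in a throwaway directory (an assumption;
# adapt the paths to /etc/alternatives and /var/lib/alternatives on a real node).
tmp=$(mktemp -d)
mkdir "$tmp/etc-alternatives" "$tmp/var-lib-alternatives"
# Symptom 1: a dangling alternatives symlink, e.g. pointing at a removed parcel
ln -s /opt/cloudera/parcels/GONE/bin/hadoop "$tmp/etc-alternatives/hadoop"
# Symptom 2: a zero-length alternatives admin file (the Red Hat bug in [0])
: > "$tmp/var-lib-alternatives/hadoop"
# -xtype l matches symlinks whose target no longer exists
find "$tmp/etc-alternatives" -xtype l
# -size 0 matches the empty admin files
find "$tmp/var-lib-alternatives" -maxdepth 1 -type f -size 0
rm -rf "$tmp"
```

If either `find` prints anything when run against the real directories, the cleanup below applies.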
= = = = = = =
# Stop CM agent service on node
service cloudera-scm-agent stop
# Delete hadoop entries from /etc/alternatives - the loop below displays the rm commands you'll need to issue.
ls -l /etc/alternatives/ | grep "/opt/cloudera" | awk '{print $9}' | while read m; do if [[ -e /var/lib/alternatives/${m} ]]; then echo "rm -fv /var/lib/alternatives/${m}"; fi; echo "rm -fv /etc/alternatives/${m}"; done
# Remove 0-byte files from /var/lib/alternatives
cd /var/lib/alternatives
# List the 0-byte files first so you can review what will be removed:
find . -maxdepth 1 -type f -size 0
# Then remove them:
find . -maxdepth 1 -type f -size 0 -delete
# Start CM agent
service cloudera-scm-agent start
= = = = = = =
02-15-2017 11:57 AM
It turned out that the nodes were in the excludes files, just not in the host.exclude file we use in CDH 5, so it was missed.
01-23-2017 02:47 PM
Moving from VMs to bare metal, as well as increasing heaps due to the high number of nodes, seemed to resolve the problem.
07-26-2016 09:07 AM
2 Kudos
Hello,

In short, you can stop Reports Manager and then safely remove any fsimage.tmp file. Reports Manager downloads the fsimage to a temp file, indexes it, then renames fsimage.tmp to fsimage. If you have fsimage files lying around, or directories named after an HDFS service that no longer exists, they can be removed. Leftover fsimage.tmp files indicate that indexing did not complete for those files. If an HDFS service is removed, Reports Manager does not clean up the previous files automatically.

By default, the fsimage is downloaded from your NameNode and indexed every hour. If indexing takes longer than an hour, you can increase the interval with the "Reports Manager Update Frequency" setting in the Cloudera Management Service's Reports Manager configuration.

If you have specific questions about what to delete, let us know. In general, you should have one fsimage (or fsimage.tmp while indexing is in progress) per HDFS service that Cloudera Manager manages. If you have two clusters managed by Cloudera Manager, you will have two fsimage files.
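The lifecycle above (download to fsimage.tmp, index, rename to fsimage) means any lingering fsimage.tmp is safe to delete once Reports Manager is stopped. Here is a minimal sketch of that cleanup, demonstrated in a scratch directory standing in for the Reports Manager working directory (the real path depends on your installation, so all names below are stand-ins):

```shell
# Scratch directory standing in for the Reports Manager working directory
# (an assumption; locate the real directory from your Reports Manager config).
work=$(mktemp -d)
mkdir "$work/hdfs-1" "$work/hdfs-removed"
touch "$work/hdfs-1/fsimage"            # healthy: indexing completed
touch "$work/hdfs-removed/fsimage.tmp"  # leftover: indexing never finished
# With Reports Manager stopped, fsimage.tmp files can be removed safely:
find "$work" -name 'fsimage.tmp' -type f -delete
find "$work" -name 'fsimage*'           # only the completed fsimage remains
rm -rf "$work"
```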
03-03-2016 12:52 PM
Yep, that did it. Didn't realize the IDs were event IDs and not host IDs. Used the attributes.HOST_IDS attribute and was able to pull back the information for the host. With this output, I can sort and build alerting off it. Thank you.
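For anyone following along, a query like the one described can be composed against the Cloudera Manager events API. Everything below except the attributes.HOST_IDS filter syntax is a placeholder: the host, port, API version, host ID, and credentials are all assumptions to adapt to your deployment.

```shell
cm_host='cm-host.example.com:7180'   # assumption: your CM host and port
api_version='v13'                    # assumption: check GET /api/version
host_id='example-host-id'            # placeholder: a real host identifier
url="http://${cm_host}/api/${api_version}/events?query=attributes.HOST_IDS==${host_id}"
echo "$url"
# then fetch it, e.g.: curl -s -u admin:admin "$url"
```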
08-05-2014 08:34 PM
You are correct, though, that this does not exist as a current feature. Please consider filing an upstream JIRA in the HBASE project requesting it (implementation patches are welcome too!) at https://issues.apache.org/jira/browse/HBASE.