Member since: 04-22-2016
Posts: 67
Kudos Received: 6
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 5357 | 11-14-2017 11:43 AM |
|  | 1913 | 10-21-2016 05:14 AM |
11-14-2017
04:37 PM
Session expiration is often hard to track down. It can be caused by JVM pauses (due to garbage collection) on either the client (HBase Master) or the server (ZooKeeper server), or it can be the result of a znode with an inordinately large number of children. The brute-force approach would be to disable your replication process, (potentially) drop the root znode, re-enable replication, and then sync up the tables with ExportSnapshot or CopyTable. That would rule out the data in ZooKeeper as the problem. The other course of action would be to look more closely at the Master log and the ZooKeeper server log to understand why the ZK session is expiring (see https://zookeeper.apache.org/doc/trunk/images/state_dia.jpg for more details on the session lifecycle). A good first step would be checking the number of znodes under /hbase-unsecure/replication.
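A quick way to check that is with the ZooKeeper CLI; a minimal sketch, assuming an HDP-style install with a quorum member reachable on port 2181 (the zkCli.sh path and hostname below are placeholders):

```
# Open a session against one of your ZooKeeper quorum members
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server zk-host.example.com:2181

# Inside the CLI: 'stat' reports numChildren for the replication znode,
# and 'ls' shows the children themselves (peer queues, region server entries)
stat /hbase-unsecure/replication
ls /hbase-unsecure/replication
ls /hbase-unsecure/replication/rs
```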
07-07-2017
01:16 PM
@Mark Heydenrych You may be able to use the ReplaceText processor to remove those blank lines from your input FlowFile's content before the SplitText processor. I did a little test that worked for me using the following configuration: it evaluates your FlowFile line by line and replaces the line return (\n) with nothing on any line that starts with a line return, which effectively removes the blank line. After that, my SplitText reported the correct fragment.count when I split the file. Thanks, Matt
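For reference, a ReplaceText configuration along these lines matches that description; the exact property values below are an assumption, not the original screenshot:

```
ReplaceText processor (sketch):
  Replacement Strategy : Regex Replace
  Evaluation Mode      : Line-by-Line
  Search Value         : ^\n
  Replacement Value    : (empty string)
```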
05-12-2017
10:28 PM
You can do this: manually kill the server (server05) by issuing kill -9. This will cause the master to recognize that the server is dead and re-assign the regions that were hosted there. You can also safely restart the master in a production environment. Nothing in the HBase client depends on the master being available in the regular read/write paths (only DDL statements do). The master is pretty lightweight and will come up quickly, so you can restart masters safely.
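A minimal sketch of those two steps, assuming a typical HDP layout (the process name, paths, and placeholder PID will differ in your environment):

```
# On server05: find the region server process and kill it hard;
# the master will notice the expired session and re-assign its regions
su - hbase -c "jps | grep HRegionServer"
kill -9 <regionserver-pid>

# Restarting the HBase Master is safe for regular read/write clients.
# In an Ambari-managed cluster, restart it from the Ambari UI, or:
/usr/hdp/current/hbase-master/bin/hbase-daemon.sh restart master
```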
03-22-2017
04:10 PM
4 Kudos
"I was under the impression that HBase snapshots stored only metadata without replicating any data." Your impression is incorrect. Snapshot creation is a quick operation because it does not require copying any data: a snapshot is just a reference to a list of files in HDFS that HBase is using. As you continue to write more data into HBase, compactions occur which read old files and create new files. Normally, these old files are deleted. However, your snapshot is still referring to them. You can't get backups at no cost; you eventually have to own the cost of storing that data. Please make sure to (re)read the section on snapshots in the HBase book (https://hbase.apache.org/book.html#ops.snapshots) -- it is very thorough and covers this topic in much more detail than I have. Long-term, you can consider the ongoing incremental backup-and-restore work as an alternative to snapshots: https://hortonworks.com/blog/coming-hdp-2-5-incremental-backup-restore-apache-hbase-apache-phoenix/
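If you want to see what a snapshot actually references, a rough sketch using the standard snapshot tooling (the table and snapshot names are placeholders, and the exact flags vary a bit between versions):

```
# In the HBase shell: taking a snapshot only records file references; no data is copied
hbase shell
snapshot 'my_table', 'my_table_snapshot'

# From the OS shell: list the HFiles the snapshot references and see how much of that
# data is shared with the live table versus retained in the archive directory
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot my_table_snapshot -files -stats
```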
03-07-2017
05:37 AM
@Matt Clarke Thanks for this info. We do currently have a failure count loop as you suggested, which will eventually dump files in an error bucket for reprocessing later. I was just hoping to be able to identify duplicates directly from the attributes themselves. I think I will open a Jira for this.
11-15-2016
10:38 AM
Thanks for the info. I've spoken to my manager; we're going to upgrade to 2.4.3.
10-21-2016
03:31 PM
Hrm. Curious. Maybe the HBase client API is doing multiple retries before giving up? You can try reducing "Maximum Client Retries" on the Ambari configuration page for HBase from 35 to 5 or 10 to make the client give up and tell you why it was failing/retrying. Just a guess given what you've been able to provide so far.
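The property behind that Ambari setting is hbase.client.retries.number (35 is the usual default). A hedged sketch of what the change looks like when it lands in hbase-site.xml, which Ambari manages for you:

```
<!-- hbase-site.xml (change it through Ambari's HBase config page rather than by hand);
     a lower retry count makes client failures surface quickly instead of retrying for ages -->
<property>
  <name>hbase.client.retries.number</name>
  <value>5</value>
</property>
```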
07-18-2016
04:37 AM
I had already tried hbase hbck -repair as well as -repairHoles prior to posting the question, with no success. We had some problems with HDFS preceding this issue: HDFS reported itself as healthy, but it had previously been corrupt, and I believe that was the underlying cause. We now have HBase stable again. I added a comment to the accepted answer explaining how I solved the issue on my side. Thanks for the help.
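For anyone hitting the same thing, the checks involved look roughly like this (the /apps/hbase path is the HDP default and an assumption here):

```
# Report HBase consistency without changing anything, then run the targeted repairs
hbase hbck
hbase hbck -repairHoles
hbase hbck -repair

# Check the underlying HDFS blocks too, since HDFS corruption was the real root cause here
hdfs fsck /apps/hbase -files -blocks
```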
07-14-2016
11:22 AM
You inadvertently solved my problem. I had not seen that the HBase Master tells you which server it is trying to load the regions onto. I pulled up the region server logs and found the following line: org.apache.hadoop.security.AccessControlException: Permission denied. We had mistakenly changed the owner of /apps/hbase to hdfs, meaning that the hbase user could not write. We ran hdfs dfs -chown -R hbase /apps/hbase and the regions are now being correctly assigned. Really appreciate your help. For what it's worth, we're running HDP 2.4.
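For completeness, you can confirm the ownership before and after the change (paths are the HDP defaults; hbase.rootdir is typically /apps/hbase/data on HDP):

```
# Show the owner of the HBase root directory itself; after the chown it should read 'hbase'
hdfs dfs -ls -d /apps/hbase
# Spot-check ownership of the data underneath as well
hdfs dfs -ls /apps/hbase/data
```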