We are experiencing intermittent issues when connecting to our HBase cluster via Knox / REST.
We are having trouble understanding what the Knox audit log is telling us.
The Knox audit log entries for one sample request are shown below.
17/08/28 12:04:19 ||29cb79b2-383a-49de-b1aa-92465ae4735f|audit|WEBHBASE||||access|uri|<REDACTED>|unavailable|
17/08/28 12:04:19 ||29cb79b2-383a-49de-b1aa-92465ae4735f|audit|WEBHBASE|<USER>|||authentication|uri|<REDACTED>|success|
17/08/28 12:04:19 ||29cb79b2-383a-49de-b1aa-92465ae4735f|audit|WEBHBASE|<USER>|||authentication|uri|<REDACTED>|success|Groups: 
17/08/28 12:04:19 ||29cb79b2-383a-49de-b1aa-92465ae4735f|audit|WEBHBASE|<USER>|||authorization|uri|<REDACTED>|success|
17/08/28 12:04:24 ||29cb79b2-383a-49de-b1aa-92465ae4735f|audit|WEBHBASE|<USER>|||dispatch|uri|<REDACTED>|success|Response status: 200
17/08/28 12:04:24 ||29cb79b2-383a-49de-b1aa-92465ae4735f|audit|WEBHBASE|<USER>|||access|uri|<REDACTED>|success|Response status: 200
I have redacted the username and the REST endpoint/query from the log.
The questions I have are as follows:
1. There is a 5-second delay between the "authorization" log entry and the "dispatch" log entry.
I'm not sure what this means, because I don't know what Knox is doing under the hood during that interval.
It could be:
Does anyone know which of the 2 options above are valid?
Also, are there any ways to fix this delay?
2. The first log entry has an outcome of "unavailable".
Is this of any concern? All requests, including non-delayed ones, have this outcome on their first entry, so I have assumed it is not serious.
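To find out how often the delay in question 1 occurs and between which steps, the audit log can be scanned for gaps within each request. The following is a minimal sketch, not a Knox tool. It assumes the pipe-delimited layout seen in the sample above (timestamp in field 0, request ID in field 2, action in field 8, outcome in field 11) and a yy/MM/dd timestamp. The names `parse_audit_line` and `slow_steps` are made up for illustration.

```python
from datetime import datetime


def parse_audit_line(line):
    """Parse one audit line; return None if it does not match the assumed layout."""
    fields = line.rstrip("\n").split("|")
    if len(fields) < 12:
        return None
    try:
        # Assumption: the timestamp is yy/MM/dd HH:mm:ss, as in the sample.
        timestamp = datetime.strptime(fields[0].strip(), "%y/%m/%d %H:%M:%S")
    except ValueError:
        return None
    return {
        "time": timestamp,
        "request_id": fields[2],
        "action": fields[8],
        "outcome": fields[11],
    }


def slow_steps(lines, threshold_seconds=1.0):
    """Return (request_id, from_action, to_action, gap_seconds) for every pair of
    consecutive entries of the same request that are at least threshold_seconds apart."""
    last_seen = {}  # request_id -> most recent entry for that request
    gaps = []
    for line in lines:
        entry = parse_audit_line(line)
        if entry is None:
            continue
        previous = last_seen.get(entry["request_id"])
        if previous is not None:
            gap = (entry["time"] - previous["time"]).total_seconds()
            if gap >= threshold_seconds:
                gaps.append(
                    (entry["request_id"], previous["action"], entry["action"], gap)
                )
        last_seen[entry["request_id"]] = entry
    return gaps
```

Passing an open audit log file to `slow_steps` lists each slow transition. If the gap always falls between "authorization" and "dispatch", the time is being spent while Knox forwards the request and waits for the backend response. The timestamps only have one-second resolution, so small gaps are not measurable this way.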
Some metrics of our cluster:
Based on the above, we don't expect a hardware issue or an issue with HBase itself (although anything is possible). We suspect the issue lies with Knox or the HBase REST API.
I think spawning the HBase connection is the most likely cause of the delay. With Ranger, authorization is enforced at the edges: in your case at the Knox nodes (can the user access the HBase topology?) and at the Region Server (can the client access the HBase objects?).
I would think that every call to the Knox HBase topology / HBase REST means a new connection to HBase, but I am not sure.
You could try to eliminate causes by testing whether direct calls to HBase REST, bypassing Knox, show the same latency.
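One way to run that comparison is to time the same request several times through Knox and several times directly against the HBase REST server. The following is a rough sketch using only the Python standard library. `fetch`, `time_calls` and `summarize` are hypothetical helper names, and the plain `urlopen` call sends no credentials, so on a secured cluster you would need to pass your own `fetch_fn` that handles authentication (for example basic auth for Knox, SPNEGO for HBase REST).

```python
import time
import urllib.request


def fetch(url, timeout=30):
    """Issue one GET and read the whole body. Assumption: no authentication is needed."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def time_calls(url, attempts=5, fetch_fn=fetch, clock=time.perf_counter):
    """Call fetch_fn(url) `attempts` times and return each call's duration in seconds."""
    durations = []
    for _ in range(attempts):
        start = clock()
        fetch_fn(url)
        durations.append(clock() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min, max and mean."""
    return {
        "min": min(durations),
        "max": max(durations),
        "mean": sum(durations) / len(durations),
    }
```

Run `summarize(time_calls(url))` once with the Knox URL of a request and once with the equivalent direct HBase REST URL. If only the Knox path shows occasional multi-second calls, the delay is introduced at or before the gateway. If both do, the HBase REST server or HBase itself is the more likely place to look. Since the problem is intermittent, a larger `attempts` value spread over time is probably needed to catch it.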