I have removed the Username and REST endpoint / query from the log.
The questions I have are as follows:
1. There is a 5s delay evident between the "authorization" log entry and the "dispatch" log entry.
I'm not sure what this means exactly because I'm not sure what Knox is doing under the hood at this time.
It could be:
A delay in the Authorization step (Knox is waiting for auth). If the log entry is generated at the start of the authorization step, a slow authorization would show up as a gap in the audit log before the dispatch entry is written. I would assume that authorization is quick, but I am not sure.
A delay in the Dispatch step (Knox is waiting on the HBase REST API to return). This applies if the dispatch entry is generated after the dispatch step has finished. This seems the more likely candidate, as it involves setting up a connection to HBase REST, submitting the query, and waiting for a response.
Does anyone know which of the two options above is correct?
Also, are there any ways to fix this delay?
2. The first log entry has a response of "unavailable".
Is this of any concern? All log entries, including non-delayed ones, have this status so I have assumed it is not serious.
Some metrics of our cluster:
The delays do not seem to be related to query load / concurrency.
We do around 30 queries per second, which HBase should be able to handle. I am not sure whether the REST API or the Knox gateway is rated for this degree of concurrency.
We have plenty of hardware - more than 10 worker nodes. The HBase REST and Knox services are both running on the same master node.
The queries are small and return maybe 10 rows and 30 columns at most from HBase.
Based on the above, we don't expect a hardware issue or an issue with HBase itself (although anything is possible). We suspect the issue is with Knox or the HBase REST API.
I think spawning the HBase connection is the most likely cause of the delay. With Ranger, authorization is enforced at the edges: in your case at the Knox nodes (can the user access the HBase topology?) and at the Region Server (can the client access the HBase objects?).
I would think that every call through the Knox HBase topology to HBase REST means a new connection to HBase, but I am not sure.
You could try to eliminate causes by testing whether direct calls to HBase REST have the same latency.
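A rough way to run that comparison is to time the same small query many times against HBase REST directly and again through Knox, then compare the outliers. The sketch below is only an illustration using the Python standard library. The hosts, ports, topology name, table and row key in the comments are placeholders (8080 and 8443 are the usual defaults, but yours may differ), and any authentication headers or TLS settings your cluster needs have to be added.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_calls(fetch: Callable[[], object], n: int) -> List[float]:
    """Call fetch() n times; return the wall-clock latency of each call in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        fetch()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float], slow_threshold: float = 1.0) -> Dict[str, float]:
    """Return the median, the maximum, and how many calls exceeded slow_threshold seconds."""
    return {
        "median": statistics.median(latencies),
        "max": max(latencies),
        "slow": sum(1 for t in latencies if t > slow_threshold),
    }


def make_fetch(url: str, headers: Dict[str, str]) -> Callable[[], bytes]:
    """Build a zero-argument function that GETs url and reads the full response body."""
    def fetch() -> bytes:
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read()
    return fetch


# Example usage (placeholder URLs, not taken from the original post):
#
#   headers = {"Accept": "application/json"}  # add an Authorization header if required
#   direct = make_fetch("http://hbase-rest-host:8080/mytable/myrow", headers)
#   via_knox = make_fetch("https://knox-host:8443/gateway/default/hbase/mytable/myrow", headers)
#   print("direct:", summarize(time_calls(direct, 200)))
#   print("knox:  ", summarize(time_calls(via_knox, 200)))
```

If the occasional 5s outliers also show up on the direct calls, the delay is most likely in HBase REST or behind it. If they only show up through Knox, the gateway is the place to look.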