I was going through the HDP documentation, which talks about enabling caching for Knox LDAP authentication: https://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.1.0/bk_dataflow-security/content/ldap_authentic...
What is the default cache expiry time? And how can I reduce or increase it?
You can find details here:
There is a default ehcache.xml file included with Knox which sets the TTL for entities to 120 seconds. This is included in the JAR file /usr/hdp/current/knox-server/dep/shiro-ehcache-1.2.3.jar. You can override this by extracting the file ('unzip /usr/hdp/current/knox-server/dep/shiro-ehcache-1.2.3.jar org/apache/shiro/cache/ehcache/ehcache.xml') and copying the extracted file to /etc/knox/conf, then editing it.
See the documentation here for details: http://knox.apache.org/books/knox-0-9-1/user-guide.html#Advanced+Caching+Config
You'll then need to add the following to your Knox topology, alongside the other cache settings from the link you already have:
<param>
    <name>main.cacheManager.cacheManagerConfigFile</name>
    <value>classpath:ehcache.xml</value>
</param>
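For context, here is a rough sketch of how that parameter might sit next to the other caching parameters in the ShiroProvider section of the topology. The parameter names are as I recall them from the Knox 0.9.x user guide, and the realm class name is an assumption that depends on your Knox version, so please check them against the guide linked above before using them.

```xml
<!-- Sketch only: caching-related params inside the ShiroProvider <provider> block. -->
<!-- Assumption: realm class as documented for Knox 0.9.x; newer releases use a different package. -->
<param>
    <name>main.ldapRealm</name>
    <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
</param>
<!-- Use Shiro's EhCache-backed cache manager -->
<param>
    <name>main.cacheManager</name>
    <value>org.apache.shiro.cache.ehcache.EhCacheManager</value>
</param>
<param>
    <name>main.securityManager.cacheManager</name>
    <value>$cacheManager</value>
</param>
<!-- Turn on caching of authentication results for the LDAP realm -->
<param>
    <name>main.ldapRealm.authenticationCachingEnabled</name>
    <value>true</value>
</param>
<!-- Point the cache manager at the overridden ehcache.xml on the classpath -->
<param>
    <name>main.cacheManager.cacheManagerConfigFile</name>
    <value>classpath:ehcache.xml</value>
</param>
```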
Thanks Carl. @cdraper.
I will try this. An additional question:
Suppose an end user issues a connect command from a JDBC client like beeline.
For example:
2) Enter the !connect string with Knox:port
3) Enter AD UserName/Password
4) AD authenticates OK
5) user submits queries.
6) more queries.
In the default case, without EHCache enabled, does further AD authentication happen for steps 5, 6 and onwards? Or, since it's all part of the same session, is there no need to re-authenticate?
I am wondering what percentage of AD round trips can be avoided on a busy production cluster with a cache timeout of 2 minutes.
A session is different from the cache, as far as I understand (which is somewhat limited!):
A session is the period during which you are logged in to, say, Hive. During that session it won't try to re-authenticate you.
If you quit the Hive session and then upload a file via Knox WebHDFS before the cache times out, you will still need to submit your username and password to authenticate the request, but Knox will use the cached credentials instead of making an AD call, thus limiting AD transactions.
I'm pretty sure this would need some testing to confirm. If I find a moment I will try it myself; this is an interesting topic.
Thanks Carl @cdraper . I enabled EhCache and enabled logging for EHCache.
In our use case, we use Knox only for Hive. As we discussed, Hive queries do not go through re-authentication during the session, so I do not think we will get any benefit from enabling the cache. What are your views on this?
AFAIK, Knox uses EHCache, which can be configured further by placing an ehcache.xml file in an appropriate location on the classpath and referencing it in the topology like this:
<param>
    <name>main.cacheManager.cacheManagerConfigFile</name>
    <value>classpath:ehcache.xml</value>
</param>
You should check "timeToIdleSeconds" (defaults to 120 seconds).
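As a rough illustration, the relevant part of an overridden ehcache.xml might look something like the sketch below. The attribute names are standard Ehcache 2.x ones, but the values of 300 seconds are just an example I picked, and the other attribute values are assumptions, so compare them with the file you extracted from the Shiro JAR rather than copying this as-is.

```xml
<!-- Sketch only: override of the default cache expiry in /etc/knox/conf/ehcache.xml -->
<ehcache>
    <defaultCache
            maxElementsInMemory="10000"
            eternal="false"
            timeToIdleSeconds="300"
            timeToLiveSeconds="300"
            overflowToDisk="false"/>
    <!-- timeToIdleSeconds: entry expires after this long without being accessed -->
    <!-- timeToLiveSeconds: entry expires this long after creation, regardless of access -->
    <!-- Both are 300 here purely as an example; the shipped default is 120 -->
</ehcache>
```

Keep any other cache definitions from the extracted file in place and only change the expiry attributes you need.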
Hope this helps!