
Question:

I installed an HDP cluster using Ambari, with the Ranger service installed and working properly. I then enabled the Ranger plugin for Kafka and noticed something annoying.

If Ranger Admin is down, Kafka takes a long time to start, because it tries to connect to Ranger Admin to get the repository. The log shows errors like the following:

### 
Will retry 74 time(s), caught exception: Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.. Sleeping for 8 sec(s) 
### 

Is there a way to decrease the number of retries or the sleep duration between retries? Ranger Admin being down should not have any impact on the components for which the plugins are enabled, right?

Findings:

Ambari uses the following scripts to return the Ranger Admin login check response:

/usr/lib/ambari-agent/lib/resource_management/libraries/functions/ranger_functions_v2.py (and ranger_functions.py)
/usr/lib/ambari-server/lib/resource_management/libraries/functions/ranger_functions_v2.py (and ranger_functions.py)

These scripts have hardcoded values for the number of retry attempts and the sleep interval, as in the following:

{code}
@safe_retry(times=75, sleep_time=8, backoff_factor=1, err_class=Fail, return_on_fail=None)
def check_ranger_login_urllib2(self, url):
    """
    :param url: ranger admin host url
    :param usernamepassword: user credentials using which repository needs to be searched.
    :return: Returns login check response
    """
    .
    .
    .
{code}

So by default, Ambari attempts the Ranger Admin login check up to 75 times, sleeping 8 seconds between attempts. If Ranger Admin is down, or does not come up within that many attempts, the check gives up and the failure is reported.
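To make the retry behaviour concrete, here is a minimal sketch of a retry decorator that takes the same `times` and `sleep_time` values as the `@safe_retry` line above. It is not Ambari's implementation: `simple_retry`, the local `Fail` class, and `check_login` are hypothetical stand-ins, and the sleep function is injectable so the example finishes instantly instead of actually waiting.

```python
import time
from functools import wraps


class Fail(Exception):
    """Hypothetical stand-in for the error class used by the Ambari scripts."""


def simple_retry(times, sleep_time, err_class=Exception,
                 return_on_fail=None, sleep=time.sleep):
    """Call the wrapped function up to `times` times, sleeping between failures."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except err_class as err:
                    remaining = times - attempt
                    if remaining == 0:
                        # All attempts are used up: give up quietly.
                        return return_on_fail
                    print("Will retry %d time(s), caught exception: %s. "
                          "Sleeping for %d sec(s)" % (remaining, err, sleep_time))
                    sleep(sleep_time)
        return wrapper
    return decorator


slept = []  # records each requested sleep instead of really sleeping


@simple_retry(times=75, sleep_time=8, err_class=Fail, sleep=slept.append)
def check_login():
    # Simulates Ranger Admin being unreachable on every attempt.
    raise Fail("Connection failed to Ranger Admin")


result = check_login()
total_wait = sum(slept)
print("result=%r, sleeps=%d, total wait=%d s" % (result, len(slept), total_wait))
```

Under these assumptions, a Ranger Admin that never comes up costs 74 sleeps of 8 seconds each, or 592 seconds, which is just under 10 minutes of waiting before the check gives up.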

REASON FOR ABOVE HARD CODED VALUES:

1. Blueprint-based deployment, to ensure the order in which the services start.
2. Ranger Admin startup time can vary from environment to environment, so the number of retries was kept high to be safe.

HOW TO:

Q. I would like to reduce these hardcoded values so that the total wait is about one minute instead of 10 minutes, that is, 6 retries with 10 seconds of sleep between retries.

A. Ambari uses the following scripts to return the Ranger Admin login check response:

#/usr/lib/ambari-agent/lib/resource_management/libraries/functions/ranger_functions_v2.py and (ranger_functions.py) 

#/usr/lib/ambari-server/lib/resource_management/libraries/functions/ranger_functions_v2.py and (ranger_functions.py)

The script 'ranger_functions_v2.py' (or ranger_functions.py) controls the retry count and the sleep time. Editing these values in the scripts can serve as a temporary workaround. However, altering the Ambari-provided scripts is not recommended without consulting Hortonworks.
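Before changing any values, it can help to estimate the worst-case delay they imply. The sketch below reads the numeric arguments from a `@safe_retry(...)` line and estimates the total sleep time. It assumes one sleep between consecutive attempts and a backoff factor of 1. `retry_params` and `worst_case_wait` are hypothetical helpers, not part of Ambari, and `proposed_line` is only an illustration of what an edited line might look like.

```python
import re


def retry_params(decorator_line):
    """Extract integer keyword arguments (such as times, sleep_time) from a decorator line."""
    return {key: int(value)
            for key, value in re.findall(r"(\w+)\s*=\s*(\d+)", decorator_line)}


def worst_case_wait(times, sleep_time):
    """Seconds spent sleeping if every attempt fails (one sleep between attempts)."""
    return max(times - 1, 0) * sleep_time


# Line as quoted from the Ambari script.
default_line = ("@safe_retry(times=75, sleep_time=8, backoff_factor=1, "
                "err_class=Fail, return_on_fail=None)")
# Hypothetical edited line with the values requested in the question.
proposed_line = ("@safe_retry(times=6, sleep_time=10, backoff_factor=1, "
                 "err_class=Fail, return_on_fail=None)")

default = retry_params(default_line)
proposed = retry_params(proposed_line)

default_wait = worst_case_wait(default["times"], default["sleep_time"])
proposed_wait = worst_case_wait(proposed["times"], proposed["sleep_time"])

print("default:  %d s" % default_wait)
print("proposed: %d s" % proposed_wait)
```

With these assumptions, the default values give 592 seconds of sleeping and the proposed values give 50 seconds. If `times` counts attempts rather than retries, six sleeps of 10 seconds would need `times=7`. Either way, any such edit remains a temporary workaround, and an Ambari upgrade may overwrite it.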