Created 08-16-2018 07:30 AM
I want to know whether Hive syncs the Ranger policies and authorizes requests from that local cache, or whether it calls Ranger for every request that needs to be authorized.
Created 08-16-2018 09:26 AM
All plugins that use Ranger for authorization cache the policies locally and use that cache to authorize requests. Below is an excerpt from the Apache Ranger overview:
| Plugins are lightweight Java programs which embed within processes of each cluster component. For example, the Apache Ranger plugin for Apache Hive is embedded within Hiveserver2. These plugins pull in policies from a central server and store them locally in a file. When a user request comes through the component, these plugins intercept the request and evaluate it against the security policy. Plugins also collect data from the user request and follow a separate thread to send this data back to the audit server. | 
Reference: https://hortonworks.com/apache/ranger/#section_2
PS: Please mark the answer if you find it correct 🙂
Created 08-16-2018 09:58 AM
@Chiran Ravani Thank you so much. Do you know how the policies are pulled from Ranger? Is it Thrift communication or something else?
Created 08-16-2018 10:06 AM
It is a REST call to Ranger Admin. The property ranger.plugin.<plugin_name>.policy.rest.url is used to communicate with Ranger Admin.
e.g. ranger.plugin.hive.policy.rest.url for Hive. By default the plugin polls Ranger Admin every 30 seconds to check whether the currently cached policy has changed; if it has, the plugin downloads the new policy and caches it.
The default policy cache location is /etc/ranger/<CLUSTER_NAME>_<PLUGIN_COMPONENT_NAME>/policycache on the host where the service is running.
e.g. /etc/ranger/hdptest_hive/policycache on HiveServer2 for my cluster.
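To make the poll-and-cache cycle described above a bit more concrete, here is a minimal Python sketch. It is not Ranger's actual code (the plugin itself is Java). The function names, the injected `fetch` callable, and the `policyVersion` field used to detect changes are assumptions for illustration only; the cache directory layout and the 30-second interval come from the explanation above.

```python
import json
import tempfile
from pathlib import Path, PurePosixPath
from typing import Callable, Optional

# Default poll interval mentioned above (30 seconds).
POLL_INTERVAL_SECONDS = 30


def policy_cache_dir(cluster_name: str, component: str) -> PurePosixPath:
    """Build /etc/ranger/<CLUSTER_NAME>_<PLUGIN_COMPONENT_NAME>/policycache."""
    return PurePosixPath("/etc/ranger") / f"{cluster_name}_{component}" / "policycache"


def refresh_once(
    fetch: Callable[[int], Optional[dict]],
    cache_file: Path,
    last_version: int,
) -> int:
    """Run one poll cycle and return the policy version now held in the cache.

    `fetch` is a hypothetical stand-in for the REST call to Ranger Admin.
    It receives the last known version and returns None when nothing has
    changed, or the new policy document otherwise.
    """
    policies = fetch(last_version)
    if policies is None:
        # Nothing changed: keep authorizing from the existing local cache.
        return last_version
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(policies))
    # Assumption: the policy document carries a "policyVersion" field.
    return policies["policyVersion"]


if __name__ == "__main__":
    print(policy_cache_dir("hdptest", "hive"))

    def fake_admin(last_known: int) -> Optional[dict]:
        # Pretend Ranger Admin currently holds version 2 of the policies.
        if last_known >= 2:
            return None
        return {"policyVersion": 2, "policies": []}

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "policycache" / "hive_policies.json"
        version = refresh_once(fake_admin, cache, last_version=1)
        print("cached version:", version)
        # A real plugin would repeat this every POLL_INTERVAL_SECONDS.
        version = refresh_once(fake_admin, cache, last_version=version)
        print("after second poll:", version)
```

The point of the sketch is that authorization requests never trigger the fetch; only the periodic poll does, and it rewrites the local file only when the version has moved.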
Created 08-16-2018 10:18 AM
Thanks Chiran, one last thing: do you know which .java file in Hive makes this REST call? That would help me a lot. I'm trying to understand the flow from the initial REST call through the build-up of the cache to the subsequent cache hits, so it would help to know the class where the REST call happens. I checked PrivilegeSynchronizer.java, but I'm not sure whether that's the right place.
Created 08-16-2018 11:23 AM
You're welcome. I would start with RangerAdminRESTClient.java.
Created 08-07-2019 02:36 AM
Hi Chiran,
I understood the details above. However, I have one question: which user makes the call to ranger.plugin.hive.policy.rest.url, both the first time and on subsequent attempts?
In my case Ranger is installed on a separate server from the cluster. What authentication and which user does the REST client use when the two talk to each other?
In the Ranger audits I can see curl entries for the "admin" user, but no further attempts to authenticate.