I am pretty new to Atlas and am trying to understand its features.
Can someone help me by answering the questions below?
HDP - 2.5.3 is used.
Which policy in Ranger is given higher priority: a resource-based policy or a tag-based policy?
I created one resource-based policy in Ranger in which all columns of a table are accessible to users x and y.
I created another, tag-based policy in which one column of the same table is given a tag, and the policy allows that tagged column to be accessed by user y only.
What I found is both x and y are able to see all the columns and tag based policy does not seem to work.
Can someone provide more clarity on this? Sorry if my question is not clear enough.
What is the use of AD integration in Atlas? How are AD users used in Atlas?
What is the Hive hook? Can someone provide more information on it?
How do I create geo-based and time-based policies using Atlas?
Atlas is where data stewards can define tags. Ranger is where security admins can set up authorization policies for resources and tags. I suggest you go through the below webinar and tutorials to understand this better.
1. Usually the Hive policies work as a whitelist (allow conditions), i.e. access is denied by default unless at least one policy allows it. In newer versions of Ranger, you can also define a blacklist (i.e. deny conditions), which may be what you are looking for. See: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/ranger_tag_based_policy...
2. AD integration lets users log in to the Atlas web UI using their AD credentials.
3. The Hive hook captures lineage information into Atlas (e.g. when a user runs a CTAS operation). More details here: http://atlas.incubator.apache.org/Bridge-Hive.html
4. Policy creation happens in Ranger, not Atlas. Check the Ranger docs.
Answering your first question. What I found is both x and y are able to see all the columns and tag based policy does not seem to work. --> This happens because access is granted as soon as any one Ranger policy grants permission to the user. Your resource-based policy gives both x and y access to all columns, so both users can see every column, including the tagged one. Try removing the resource-based policy; then only user y will be able to access that column.
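To make the reasoning about allow and deny conditions in this thread concrete, here is a minimal toy model in Python. It is not Ranger's actual policy engine, and the `Policy` class, the `is_allowed` function, and the column names (`id`, `name`, `ssn`) are all made up for illustration. It assumes the simplified rule that a matching deny wins, and otherwise any matching allow grants access.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Toy policy: the columns it covers and the users it allows or denies."""
    name: str
    columns: set
    allow_users: set = field(default_factory=set)
    deny_users: set = field(default_factory=set)


def is_allowed(user, column, policies):
    """Simplified evaluation: a matching deny wins, else any matching allow grants."""
    matching = [p for p in policies if column in p.columns]
    if any(user in p.deny_users for p in matching):
        return False
    # Whitelist behaviour: no matching allow means access is denied by default.
    return any(user in p.allow_users for p in matching)


ALL_COLUMNS = {"id", "name", "ssn"}  # hypothetical table columns

# Resource-based policy from the question: x and y may read every column.
resource_policy = Policy("resource: all columns", ALL_COLUMNS, allow_users={"x", "y"})
# Tag-based policy from the question: only y is allowed on the tagged column.
tag_policy = Policy("tag: sensitive", {"ssn"}, allow_users={"y"})
# Same tag policy, but with an explicit deny condition for x.
tag_policy_with_deny = Policy("tag: sensitive", {"ssn"},
                              allow_users={"y"}, deny_users={"x"})

scenarios = {
    "both policies, allow only": [resource_policy, tag_policy],
    "tag policy alone": [tag_policy],
    "both policies, deny for x": [resource_policy, tag_policy_with_deny],
}
for label, policies in scenarios.items():
    print(label, "-> x can read ssn:", is_allowed("x", "ssn", policies))
```

In this model, user x keeps access to the tagged column as long as the broad allow policy exists, and loses it either when that policy is removed or when a deny condition for x is added.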
What is the use of AD integration in Atlas? How are AD users used in Atlas? --> You can sync your AD users so that they can access the Atlas UI directly and track data governance.
What is the Hive hook? Can someone provide more information on it? --> Hive supports listeners on command execution through Hive hooks, and the Atlas Hive hook uses this mechanism. It adds, updates, or removes entities in Atlas using the model defined in org.apache.atlas.hive.model.HiveDataModelGenerator. The hook submits each request to a thread pool executor to avoid blocking command execution. The worker thread sends the entities as messages to the notification server, and the Atlas server reads these messages and registers the entities. Follow these instructions in your Hive setup to add the Hive hook for Atlas:
How do I create geo-based and time-based policies using Atlas? --> As far as I know, currently you can only integrate tag sync policies with Atlas.
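The Hive hook flow described earlier in this answer (the hook hands work to a thread pool, a worker posts messages, and the server reads them and registers entities) can be sketched as a toy model in Python. This is only an illustration of the pattern, not Atlas code: the queue stands in for the notification server, the dictionary stands in for the Atlas entity store, and the function names, operation names, and table names are all assumptions.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

notifications = queue.Queue()  # stand-in for the notification server
registry = {}                  # stand-in for the Atlas entity store


def build_and_send(operation, table):
    """Runs on a worker thread: post an entity message for a finished command."""
    notifications.put({"operation": operation, "entity": table})


def hook(executor, operation, table):
    """Called after a command runs; submits the work and returns without waiting."""
    return executor.submit(build_and_send, operation, table)


def server_drain():
    """Server side: read the queued messages and add or remove entities."""
    while not notifications.empty():
        message = notifications.get()
        if message["operation"] == "DROPTABLE":
            registry.pop(message["entity"], None)
        else:
            registry[message["entity"]] = message["operation"]


# A single worker keeps messages in submission order in this sketch.
with ThreadPoolExecutor(max_workers=1) as executor:
    hook(executor, "CREATETABLE_AS_SELECT", "default.sales_copy")
    hook(executor, "CREATETABLE", "default.tmp")
    hook(executor, "DROPTABLE", "default.tmp")
# Leaving the with-block waits for all submitted work to finish.

server_drain()
print(registry)
```

The point of the design is that `hook` only enqueues work, so the command that triggered it is never held up by the metadata update.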