Ranger User Permissions Column Level

Expert Contributor

Hello,

I'm trying to use Ranger to enable column-level permissions for users.

I am able to apply table-level permissions by changing the HDFS policies.

However, when I set column-level permissions in the Hive policies and then query through the Hive CLI, those permissions are not enforced.

Please let me know what I am doing wrong and what I should be doing.

Thanks,

Marcy

1 ACCEPTED SOLUTION

Guru

@Marcy When you use the Hive CLI, the connection goes directly to the Hive Metastore and relies on storage-based authorization. To take advantage of Ranger-based central security, Hortonworks recommends using Beeline instead of the Hive CLI, since Beeline goes through HiveServer2 and the Ranger-based policies will apply. In fact, in production environments it is often suggested that administrators disable the Hive CLI and require users to run their command-line interactions through Beeline. Here are some relevant links that you may find useful. As always, if you find this post useful, don't forget to upvote and/or accept the answer.

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_data-access/content/beeline-vs-hive-cli.h...

https://community.hortonworks.com/articles/10367/apache-ranger-and-hive-column-level-security.html

https://community.hortonworks.com/questions/10760/how-to-disable-hive-shell-for-all-users.html
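
For illustration, here is a rough Python sketch of a client that, like Beeline, connects through HiveServer2 so the Ranger policies are evaluated. It assumes the PyHive package and placeholder host, user, table, and column names, none of which come from this thread, so adjust them for your cluster.

# Sketch only: assumes the PyHive client (pip install "pyhive[hive]") and
# placeholder connection details -- adjust host, port, and auth for your cluster.
from pyhive import hive

# Connect to HiveServer2, the same endpoint Beeline uses, so the Ranger
# column-level policies are applied to this user's queries.
conn = hive.Connection(host='hiveserver2.example.com', port=10000,
                       username='marcy', database='default')
cursor = conn.cursor()

# Only the columns this user is allowed by the Ranger policy will be readable;
# selecting a restricted column is rejected by HiveServer2.
cursor.execute('SELECT allowed_col FROM example_table LIMIT 10')
for row in cursor.fetchall():
    print(row)
conn.close()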


5 REPLIES

Expert Contributor

@Sonu Sahi

Ok...

If I would like users to use HiveQL, what are my options if I disable Hive CLI?

What are the differences between the Hive CLI and Beeline?

Can I connect via Spark? RStudio? Python?

Thanks,

Marcia

Guru

@Marcy If you disable the Hive CLI, your best and recommended option is to have users use Beeline for HiveQL. It is supported by Hortonworks and is the most popular client. Additionally, you may wish to explore a GUI-based tool included in Ambari called the Ambari Hive View (which gets even better in the upcoming HDP 2.6 release).

The first link I included outlines the major differences between the Hive CLI and Beeline, but in a nutshell: Beeline goes through HiveServer2, which means it respects Ranger-based authorization, whereas the Hive CLI connects directly to the metastore and bypasses many of the security features.

All of the options you listed are possible. When looking at different methods of accessing data in Hive, what you want to ensure is that they go through HiveServer2 so that the Ranger-based security is respected; this is normally a Hadoop administrator's primary concern. Here is an additional link that goes over the various Hive clients:

https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

In my experience, Beeline and the Ambari Hive View are where most Hadoopers start their journey, and they stay there until a use case comes along that requires additional technologies like Spark, R, or Python.
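
To make that difference concrete, here is a rough sketch of what enforcement looks like when a query goes through HiveServer2. The PyHive client and the table and column names are placeholders, and the exact error text depends on your Ranger policy and HDP version.

# Sketch only: placeholder names; the same kind of HiveServer2 connection
# as in the sketch above.
from pyhive import hive

conn = hive.Connection(host='hiveserver2.example.com', port=10000, username='marcy')
cursor = conn.cursor()

# A column covered by the user's Ranger policy succeeds.
cursor.execute('SELECT allowed_col FROM example_table LIMIT 5')
print(cursor.fetchall())

# A column outside the policy is rejected by HiveServer2 before the query runs;
# the Hive CLI would never perform this check.
try:
    cursor.execute('SELECT restricted_col FROM example_table LIMIT 5')
except Exception as err:  # the exact exception class depends on the client
    print('Denied by the Ranger policy:', err)
conn.close()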

Expert Contributor

@Sonu Sahi

Please let me know what technologies are available for connecting from Spark, R, and Python.

Thanks,

Marcia

Guru

@Marcy All of them can work. Their access to Hive is commonly done through a notebook tool called Apache Zeppelin (included in the Hortonworks Data Platform). Hortonworks has many tutorials that show you step by step how to connect these:

https://hortonworks.com/hadoop-tutorial/using-hive-with-orc-from-apache-spark/

https://hortonworks.com/hadoop-tutorial/getting-started-apache-zeppelin/
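
As a starting point for the Spark route, here is a short PySpark sketch along the lines of the first tutorial. The table and column names are placeholders, and whether the Ranger policies apply on this path depends on how Spark access to Hive is configured in your cluster.

# Sketch only: PySpark with Hive support enabled; names are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName('hive-from-spark')
         .enableHiveSupport()   # read table definitions from the Hive metastore
         .getOrCreate())

# Query a Hive table with Spark SQL; the result comes back as a DataFrame.
df = spark.sql('SELECT allowed_col FROM example_table LIMIT 10')
df.show()

spark.stop()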