Support Questions

Find answers, ask questions, and share your expertise

How can I secure my Hive access on Azure Environment


We are looking for reassurance that our data sitting in Hive right now is protected from the outside world. We set up SQL Workbench on our local PCs and connect to it as shown below. However, no matter what we put in for Username & Password, it still lets us connect to our Hive data, which concerns us for security reasons.

How can I be assured that if some hacker out in the world came across our external IP (xx.xx.xx.xxxx), they wouldn't be able to access our data in Hive?


I cannot set up AD auth from Azure since I cannot access my corporate AD, so LDAP authentication is not possible, and I do not want to set up an MIT KDC.

Should I:

  1. Leave Hive Authentication set to None but apply SQL Standard authorization (see https://community.hortonworks.com/questions/22086/can-we-enable-hive-acls-without-ranger.html and https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization)
  2. Or should I set up Ranger instead of SQL Standard authorization?

With either of the above, would this ensure that if someone logs in as hive/hive, the tables are still secured with the appropriate authorizations?
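For reference, option 1 comes down to hive-site.xml properties along these lines; this is only a sketch based on the Hive wiki page linked above, so check it against your HDP version:

```xml
<!-- hive-site.xml: enable SQL Standard based authorization (sketch) -->
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
  <!-- run queries as the HiveServer2 user so authorization checks apply -->
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
```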

1 ACCEPTED SOLUTION


Hi @Ancil McBarnett,

One option that I often use is to disable access to all ports from the outside world except 22 for SSH. Then set up a secure tunnel via SSH (requiring authentication at this stage) that forwards to port 10000 on the Hive server. For example:

ssh -L 10000:HS2_server_address:10000 user@an_azure_node

Then you can point SQL workbench or any other tool to localhost:10000 and get forwarded across the tunnel and to port 10000 on the HS2 instance. I can provide more detail if you need.

Note that this really just puts a brick wall around the cluster and requires authentication to get through it. If you also want authorization, then we'd need to address that another way.
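For example, with the tunnel open you can point any JDBC client at localhost. A quick check with beeline might look like this (assuming beeline is installed locally; the username is a placeholder):

```shell
# With the SSH tunnel from above running, local port 10000 reaches HS2,
# so a standard HiveServer2 JDBC URL against localhost works:
beeline -u "jdbc:hive2://localhost:10000/default" -n your_user
```

SQL Workbench uses the same JDBC URL, just entered in its connection profile instead of on the command line.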


5 REPLIES



@Brandon Wilson Thank you, this helps. Yes, please provide more detail if you can.


@Ancil McBarnett, I'll try to paraphrase, but these guys do a better job explaining it than I ever will. Essentially, since you do not have direct access to port 10000 on the HS2 machine, you need to tunnel to a machine that does have access (i.e., any of the machines in the cluster) and then have that machine push your request to port 10000 on the HS2 machine. So, the command I wrote above creates a tunnel between my machine and "an_azure_node". Then, any connections to my localhost port 10000 will go across this tunnel and, on the other side, will be forwarded to port 10000 on the HS2 node. I hope that helps and clears things up.
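Putting that together, a slightly fuller version of the command I tend to run looks like this (the hostnames are placeholders, and the extra flags are just conveniences I usually add):

```shell
# -N: do not run a remote command, only forward the port
# -f: drop to the background once authentication succeeds
# local 10000 -> (SSH tunnel via an_azure_node) -> HS2_server_address:10000
ssh -N -f -L 10000:HS2_server_address:10000 user@an_azure_node
```

Once that returns, anything you send to localhost:10000 rides the authenticated tunnel to the HS2 node.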


I'm not sure whether this question is limited to what can be done from an HDP ecosystem point of view. As another approach, is there a way to protect this at layer 3/4? Specifically, allow only a specific host/IP or subnet, and only specific ports, to access your Hive database. This can be done via firewall rules.
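On Azure, that kind of layer 3/4 restriction can be expressed as a Network Security Group rule. A sketch using the Azure CLI, where the resource group, NSG name, and allowed source range are all hypothetical placeholders:

```shell
# Allow only a known office IP range to reach HiveServer2 on port 10000.
# Resource names and the CIDR below are illustrative placeholders.
az network nsg rule create \
  --resource-group my-rg \
  --nsg-name my-cluster-nsg \
  --name allow-office-to-hs2 \
  --priority 100 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes 203.0.113.0/24 \
  --destination-port-ranges 10000
```

With no other inbound rule opening port 10000, traffic from any other source IP is denied at the network layer before it ever reaches HS2.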


I assume that Hive with PAM authentication would also be a viable option on Azure.

https://community.hortonworks.com/articles/591/using-hive-with-pam-authentication.html
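Per that article, enabling PAM authentication comes down to hive-site.xml settings along these lines (the PAM service names here are examples, not recommendations):

```xml
<!-- hive-site.xml: PAM authentication for HiveServer2 (sketch) -->
<property>
  <name>hive.server2.authentication</name>
  <value>PAM</value>
</property>
<property>
  <!-- comma-separated list of PAM services to authenticate against -->
  <name>hive.server2.authentication.pam.services</name>
  <value>login,sshd</value>
</property>
```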