Created on 05-23-2016 07:30 PM - edited 08-18-2019 04:31 AM
We are looking for reassurance that the data sitting in Hive right now is protected from the outside world. We set up SQL Workbench on our local PCs and connect as shown below. However, no matter what we enter for the username and password, it still lets us connect to our Hive data, which concerns us for security reasons.
How can I be assured that if some hacker out in the world came across our external IP (xx.xx.xx.xxxx), they wouldn't be able to access our data in Hive?
I cannot set up AD authentication from Azure since I cannot access my corporate AD, so LDAP authentication is not possible, and I do not want to set up an MIT KDC.
Should I:
With either of the above, would this ensure that if someone logs in as hive/hive, the tables are still secured with the appropriate authorizations?
Created 05-23-2016 07:32 PM
Hi @Ancil McBarnett,
One option that I often use is to disable access to all ports from the outside world except 22 for SSH. Then set up a secure tunnel via SSH (requiring authentication at this stage) that forwards to port 10000 on the HiveServer2 machine. For example:
ssh -L 10000:HS2_server_address:10000 user@an_azure_node
Then you can point SQL workbench or any other tool to localhost:10000 and get forwarded across the tunnel and to port 10000 on the HS2 instance. I can provide more detail if you need.
Note that this is really just putting a brick wall around the cluster requiring authentication. If you also want authorization then we'd need to address it another way.
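To make the client side concrete, here is a sketch of connecting through that tunnel once it is open. `HS2_server_address`, `an_azure_node`, and `your_user` are the same placeholders as in the command above; the JDBC URL format is standard for HiveServer2 clients.

```shell
# Step 1: open the tunnel (same command as above); leave this running.
ssh -L 10000:HS2_server_address:10000 user@an_azure_node

# Step 2 (in another terminal): point any HiveServer2 client at the
# local end of the tunnel, e.g. Beeline:
beeline -u "jdbc:hive2://localhost:10000" -n your_user

# In SQL Workbench, use the same JDBC URL: jdbc:hive2://localhost:10000
```

The client thinks it is talking to localhost, but every packet is carried over the authenticated SSH session to the cluster.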
Created 05-23-2016 07:55 PM
@Brandon Wilson Thank you this helps. Yep please provide more detail if you can.
Created 05-23-2016 09:43 PM
@Ancil McBarnett, I'll try to paraphrase but these guys do a better job explaining it than I ever will. Essentially, since you do not have direct access to port 10000 on the HS2 machine, you need to tunnel to a machine that does have access (i.e., any of the machines in the cluster) and then have that machine push your request to port 10000 on the HS2 machine. So, the command I wrote above creates a tunnel between my machine and "an_azure_node". Then, any connections to my localhost port 10000 will go across this tunnel and, on the other side, will be forwarded to port 10000 on the hs2 node. I hope that helps and clears things up.
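As a convenience, the tunnel from the earlier post can be run in the background so it does not tie up a terminal; a sketch, using the same placeholder hostnames:

```shell
# -N: do not run a remote command, just forward ports.
# -f: drop to the background after authenticating.
# ServerAliveInterval keeps the idle tunnel from timing out.
ssh -N -f \
    -o ServerAliveInterval=60 \
    -L 10000:HS2_server_address:10000 \
    user@an_azure_node
```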
Created 05-24-2016 06:25 PM
I'm not sure if this question is limited to what can be done from an HDP ecosystem point of view. As another approach, is there a way to protect this using layer 3/4? Specifically, allow only a specific host/IP or subnet and ports to access your hive database. This can be done via firewall rules.
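A sketch of that layer 3/4 approach using iptables on the HiveServer2 host (203.0.113.0/24 is just an example standing in for your trusted office subnet):

```shell
# Allow HiveServer2 traffic (port 10000) only from a trusted subnet...
iptables -A INPUT -p tcp --dport 10000 -s 203.0.113.0/24 -j ACCEPT
# ...and drop connections to that port from everywhere else.
iptables -A INPUT -p tcp --dport 10000 -j DROP
```

Cloud providers usually offer the same control at the network edge (e.g. Azure Network Security Groups), which avoids managing rules on each host.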
Created 05-26-2016 03:40 PM
I assume that Hive with PAM authentication would also be a viable option on Azure.
https://community.hortonworks.com/articles/591/using-hive-with-pam-authentication.html
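For reference, enabling PAM authentication on HiveServer2 comes down to two hive-site.xml properties; the service list below is only an example and must name PAM services that actually exist under /etc/pam.d on the HS2 host:

```xml
<property>
  <name>hive.server2.authentication</name>
  <value>PAM</value>
</property>
<property>
  <name>hive.server2.authentication.pam.services</name>
  <!-- Comma-separated PAM service names, i.e. files under /etc/pam.d -->
  <value>sshd,sudo</value>
</property>
```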