Member since 10-19-2015 | 279 Posts | 340 Kudos Received | 25 Solutions
06-28-2018
07:08 AM
This article explains the additional steps required to configure wire encryption for distcp to work across wire-encrypted clusters: the certificates must be exported and imported across both clusters.

Problem: in a wire-encrypted multi-cluster environment, distcp fails if the steps given in this article are not performed; you may see an SSL error like the following:
javax.net.ssl.SSLHandshakeException: DestHost:destPort <KMS_HOST>:9393 , LocalHost:localPort null:0. Failed on local exception: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: https://<KMS_HOST>:9393/kms/v1/?op=GETDELEGATIONTOKEN&rene.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

Prerequisites:
1) Two clusters should be set up with Ranger.
2) Wire encryption should already be enabled on both clusters:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/enabling-ssl-for-components.html
3) If Ranger KMS is installed, wire encryption should be enabled for Ranger KMS on both clusters as well.

Steps to configure SSL for distcp to work across clusters:
1) Export the certificate from the Hadoop server keystore on every host of cluster1 and cluster2:

cd <server_hadoop_key_location>; keytool -export -alias hadoop_cert_<host_name> -keystore <keystore_file_path> -rfc -file hadoop_cert_<host_name> -storepass <keystore_password>

Note: if you don't know the location of the keystore, look up the property "ssl.server.keystore.location" in the HDFS configuration.
2) Copy all the certificates generated for cluster1 in the previous step from the cluster1 hosts to the client key location on every host of cluster2, and similarly copy all the certificates generated for cluster2 from the cluster2 hosts to the client key location on every host of cluster1.
3) Import all the cluster1 certificates into the Hadoop client truststore on every host of cluster2, and vice versa:

cd <client_hadoop_key_location>; keytool -import -noprompt -alias hadoop_cert_<host_name> -file hadoop_cert_<host_name> -keystore <truststore_file_path> -storepass <truststore_password>

Note: if you don't know the location of the truststore, look up the property "ssl.client.truststore.location" in the HDFS configuration.

Additional steps if Ranger KMS is installed:
If Ranger KMS is installed, the Ranger KMS certificates from the KMS hosts of cluster1 must also be imported into the Hadoop client truststore of cluster2, and vice versa.
1) Export the certificate from the Ranger KMS server keystore on the KMS hosts of cluster1 and cluster2:

cd <kms_key_store_location>; keytool -export -alias kms_cert_<host_name> -keystore <kms_keystore_file_path> -rfc -file kms_cert_<host_name> -storepass <kms_keystore_password>

Note: if you don't know the location of the KMS keystore, look up the property "ranger.https.attrib.keystore.file" in the KMS configuration.
2) Copy all the KMS certificates generated for cluster1 in the previous step from the cluster1 KMS hosts to the client key location on every host of cluster2, and similarly copy all the KMS certificates generated for cluster2 from the cluster2 KMS hosts to the client key location on every host of cluster1.
3) Import all the cluster1 KMS certificates into the Hadoop client truststore on every host of cluster2, and vice versa:

cd <client_hadoop_key_location>; keytool -import -noprompt -alias kms_cert_<host_name> -file kms_cert_<host_name> -keystore <truststore_file_path> -storepass <truststore_password>

Now restart HDFS, YARN, MapReduce and Ranger KMS on both clusters. Once all the services have started successfully, try distcp; it should work fine:

hadoop distcp -Dmapreduce.job.hdfs-servers.token-renewal.exclude=cluster1 -skipcrccheck -update /distcp_cluster1 hdfs://cluster2/distcp_cluster2/
03-13-2018
01:32 PM
1 Kudo
Prerequisites:
1) You must have a MySQL Server database instance running to be used by Ranger.
2) Execute the following command on the Ambari Server host, replacing database-type with mysql|oracle|postgres|mssql|sqlanywhere and /jdbc/driver/path with the location of the corresponding JDBC driver:

ambari-server setup --jdbc-db={database-type} --jdbc-driver={/jdbc/driver/path}

3) Make sure the root user has access to the DB from the Ranger host and the Ranger KMS host. E.g. if your Ranger host is ranger.host.com and your Ranger KMS host is ranger.kms.host.com,
then you should run the following commands on the MySQL database:

GRANT ALL ON *.* TO 'root'@'ranger.host.com' IDENTIFIED BY '<root_password>' WITH GRANT OPTION;
GRANT ALL ON *.* TO 'root'@'ranger.kms.host.com' IDENTIFIED BY '<root_password>' WITH GRANT OPTION;
flush privileges;

Steps:
1) Go to Ambari and open the Add Service wizard.
2) Select the Ranger and Ranger KMS hosts.
3) Make sure you fill in the following properties carefully, in both Ranger and Ranger KMS:

DB_FLAVOR = mysql
Ranger DB host = <db_host> eg: test.mysql.com
Setup Database and Database User = yes
Database Administrator (DBA) username = root
Database Administrator (DBA) password = <root_password>

4) Any name can be configured for the following DB and user properties; they will be created fresh by the root user, since "Setup Database and Database User" is set to Yes.

Ranger DB name = ranger
Ranger DB username = rangeradmin
Ranger DB password = rangeradmin

5) Ranger KMS has an additional property, "KMS master key password" = <kms_password>; this is also a newly configured password of your choice.
6) Auditing can be configured as you choose, if you want audits of service operations.

Note: this article does not cover all the properties in detail; these are the important ones where people make mistakes.
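Before launching the wizard, it is worth verifying from the Ranger and Ranger KMS hosts that root can actually reach the database. A quick connectivity check might look like the following; the hostname and password are the placeholders from the example above:

```shell
# Run once on ranger.host.com and once on ranger.kms.host.com.
mysql -h test.mysql.com -u root -p'<root_password>' -e "SELECT CURRENT_USER();"
```

If this fails, fix the GRANT statements from the prerequisites before proceeding; the Ambari wizard will hit the same error otherwise.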
06-27-2017
10:11 AM
2 Kudos
Background: when you rely completely on Ranger, and you want to configure authorization for a resource per end user, you have to create one policy for each resource. There should be a way to configure a single policy that grants access to specific resources based on the user who is making the call.

Solution: {USER} support

{USER} support solves this problem. It allows us to create a policy where the resource contains {USER} (e.g. /user/{USER}) and the user is also set to {USER}. This means every user gets access to their corresponding home directory. E.g.:
HDFS:
resource: /user/{USER}
user1 will have access to /user/user1
user2 will have access to /user/user2

Hive:
resource: database: database_{USER}
user1 will have access to database database_user1
user2 will have access to database database_user2

The resource may contain {USER} partially or fully, and the delimiter can also be customised.

Steps to configure {USER}:
1) Go to Ranger Admin, open the create-policy page, and enter {USER} in the resource field.
2) In the user field, type {USER}; when {USER} is populated, select it and add the policy.

More details can be found at https://cwiki.apache.org/confluence/display/RANGER/Support+for+%24username+variable.
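The same policy can also be created over Ranger Admin's public REST API instead of the UI. This is only a sketch: it assumes the public v2 API, a Ranger host placeholder, and a hypothetical HDFS service name cl1_hadoop; the exact JSON fields can vary between Ranger versions, so check your version's API docs before relying on it:

```shell
# cl1_hadoop, the host, and the credentials are placeholders.
curl -u admin:admin -H 'Content-Type: application/json' \
  -X POST 'http://<ranger_host>:6080/service/public/v2/api/policy' \
  -d '{
    "service": "cl1_hadoop",
    "name": "user-home-dirs",
    "resources": { "path": { "values": ["/user/{USER}"], "isRecursive": true } },
    "policyItems": [ {
      "users": ["{USER}"],
      "accesses": [ { "type": "read",    "isAllowed": true },
                    { "type": "write",   "isAllowed": true },
                    { "type": "execute", "isAllowed": true } ]
    } ]
  }'
```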
09-26-2016
08:50 AM
4 Kudos
Background: prior to the HDP 2.5 release, people looked for a way to sync users from multiple organisational units (OUs), but this was not possible. HDP 2.5 adds a feature to support syncing users from multiple OUs.

How to configure multiple OUs: an OU is configured the same way as before, but if you want to configure multiple OUs, they should be semicolon (;) separated, as follows:

ou=Executives,DC=abc,DC=com;ou=Engineering,DC=abc,DC=com

Sample user setting:
Sample group setting:
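One way to sanity-check that each OU actually resolves before wiring them into usersync is to query every search base with ldapsearch. The host, bind DN and password below are placeholders:

```shell
# Run once per OU listed in the semicolon-separated setting above.
for base in 'ou=Executives,DC=abc,DC=com' 'ou=Engineering,DC=abc,DC=com'; do
  ldapsearch -x -H 'ldap://<ad_host>:389' \
    -D '<bind_dn>' -w '<bind_password>' \
    -b "$base" '(objectClass=user)' sAMAccountName | head -20
done
```

If either base returns no entries, fix the DN before configuring usersync, since a typo in one OU silently drops all of its users.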
09-25-2016
06:13 PM
13 Kudos
Release - HDP 2.5

Background: before this feature was released, Ranger allowed users to create only access policies for Hive, which provide security down to the column level. If you are interested in enabling cell-level security, this is the solution to opt for.

Introduction: this newly introduced feature allows a new type of Ranger Hive policy that helps administrators restrict users from accessing specific rows in a table, based on a filter condition in the policy, or mask sensitive data, either fully or partially. Let's go into detail on both policy types.
Row filter policy: a row filter policy lets Ranger attach a filter expression to a Hive policy, so that a user sees only the specific rows in the table that belong to them. E.g. if a user belongs to the US and queries the employee table, and we want to restrict them to seeing only the employees who belong to the US, the filter expression will be location = 'US'. From HDP 2.5 onwards there is a new tab for row filters on the Hive policy page: provide the database and table name in the resource, and in the condition enter the row filter expression to be applied to the results of Hive queries run by users covered by the policy. The filter expression must be a valid WHERE clause; even inner conditions are accepted.

Note:
1) There is no column option in this policy type, because it would not make sense here.

Column masking policy: a column masking policy lets Ranger attach a masking condition to a Hive policy to mask sensitive data for specific users. E.g. in a bank, the account number and CVV are sensitive customer data; in Ranger you can create a masking policy to mask a column partially or fully for specific users or groups. From HDP 2.5 onwards there is a new tab for column masking on the Hive policy page: provide the database, table and column name in the resource, and in the condition select the masking type. The following masking types are currently supported:

1) Redact
2) Partial mask: show last 4
3) Partial mask: show first 4
4) Hash
5) Nullify
6) Unmasked (retain original value)
7) Date: show only year
8) Custom
Note:
1) Wildcards are not allowed in either policy type.
2) These policies can be created on both tables and views.
3) Masks and filters are evaluated during query execution, in the order they are listed in the policy.

Example: now let's take an example and try out a row filter and each column masking technique.
Let's say we have a table called "customer" in the "Bank" database:

+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| customer.id | customer.name | customer.account | customer.cvv | customer.dob | customer.location |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| 432 | Amit | 898981931313131 | 432 | 1975-04-01 | Delhi |
| 493 | John | 79898193128931 | 234 | 1985-09-11 | Bangalore |
| 683 | nisar | 69598193128931 | 121 | 1965-09-11 | Bangalore |
| 532 | rohan | 198981931313131 | 402 | 1995-04-01 | Delhi |
| 400 | Rahul | 69898193128931 | 159 | 1985-09-10 | Bangalore |
| 809 | nisar | 59598193128931 | 096 | 1979-09-11 | Bangalore |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+

Let's create a row filter policy and a column masking policy, run "select * from customer;", and see the difference in the results.

1) Row filter policy example: create a row filter policy with the filter expression
location = 'Bangalore' for user1
a) Result if user1 executes "select * from customer;":

+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| customer.id | customer.name | customer.account | customer.cvv | customer.dob | customer.location |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| 493 | John | 79898193128931 | 234 | 1985-09-11 | Bangalore |
| 683 | nisar | 69598193128931 | 121 | 1965-09-11 | Bangalore |
| 400 | Rahul | 69898193128931 | 159 | 1985-09-10 | Bangalore |
| 809 | nisar | 59598193128931 | 096 | 1979-09-11 | Bangalore |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
4 rows selected (0.864 seconds)

b) Result if user2 executes "select * from customer;"; since user2 is not covered by the policy, all rows are returned:

+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| customer.id | customer.name | customer.account | customer.cvv | customer.dob | customer.location |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| 432 | Amit | 898981931313131 | 432 | 1975-04-01 | Delhi |
| 493 | John | 79898193128931 | 234 | 1985-09-11 | Bangalore |
| 683 | nisar | 69598193128931 | 121 | 1965-09-11 | Bangalore |
| 532 | rohan | 198981931313131 | 402 | 1995-04-01 | Delhi |
| 400 | Rahul | 69898193128931 | 159 | 1985-09-10 | Bangalore |
| 809 | nisar | 59598193128931 | 096 | 1979-09-11 | Bangalore |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
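Conceptually, the row filter behaves as if Ranger appended WHERE location = 'Bangalore' to every query user1 runs. A stand-alone illustration of that predicate over the sample rows (plain shell and awk, nothing Ranger-specific):

```shell
# Sample rows as id,name,location; awk keeps only the Bangalore rows,
# mirroring what user1 sees under the row filter policy.
printf '%s\n' \
  '432,Amit,Delhi' \
  '493,John,Bangalore' \
  '683,nisar,Bangalore' \
  '532,rohan,Delhi' \
  '400,Rahul,Bangalore' \
  '809,nisar,Bangalore' \
| awk -F, '$3 == "Bangalore"'
```

Four of the six rows survive the predicate, matching the "4 rows selected" result above.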
2) Column masking policy example: create a column masking policy on table customer, column account, with the masking type "Partial mask: show last 4" for user1.

a) Result if user1 executes "select * from customer;"; since user1 is covered by the policy, the account number is masked and only its last 4 digits are shown:

+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| customer.id | customer.name | customer.account | customer.cvv | customer.dob | customer.location |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| 432 | Amit | xxxxxxxxxxx3131 | 432 | 1975-04-01 | Delhi |
| 493 | John | xxxxxxxxxx8931 | 234 | 1985-09-11 | Bangalore |
| 683 | nisar | xxxxxxxxxx8931 | 121 | 1965-09-11 | Bangalore |
| 532 | rohan | xxxxxxxxxxx3131 | 402 | 1995-04-01 | Delhi |
| 400 | Rahul | xxxxxxxxxx8931 | 159 | 1985-09-10 | Bangalore |
| 809 | nisar | xxxxxxxxxx8931 | 096 | 1979-09-11 | Bangalore |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
6 rows selected (0.841 seconds)

b) Result if user2 executes "select * from customer;"; since user2 is not covered by the policy, the results are unmasked:

+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| customer.id | customer.name | customer.account | customer.cvv | customer.dob | customer.location |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
| 432 | Amit | 898981931313131 | 432 | 1975-04-01 | Delhi |
| 493 | John | 79898193128931 | 234 | 1985-09-11 | Bangalore |
| 683 | nisar | 69598193128931 | 121 | 1965-09-11 | Bangalore |
| 532 | rohan | 198981931313131 | 402 | 1995-04-01 | Delhi |
| 400 | Rahul | 69898193128931 | 159 | 1985-09-10 | Bangalore |
| 809 | nisar | 59598193128931 | 096 | 1979-09-11 | Bangalore |
+--------------+----------------+-------------------+---------------+---------------+--------------------+--+
6 rows selected (0.649 seconds)

The other masking types can be tried out in the same way. There are some good use cases listed on the following wiki page; please refer to it as well:
https://cwiki.apache.org/confluence/display/RANGER/Row-level+filtering+and+column-masking+using+Apache+Ranger+policies+in+Apache+Hive

Please comment with any questions.
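The "show last 4" transform itself is easy to reason about. A tiny stand-alone sketch of the same transformation in plain shell (an illustration only, not Ranger's implementation):

```shell
# Mask all but the last 4 characters of a value, as in the table above.
account='79898193128931'
last4=$(printf '%s' "$account" | tail -c 4)
masklen=$(( ${#account} - 4 ))
mask=$(printf 'x%.0s' $(seq 1 "$masklen"))
echo "${mask}${last4}"
```

Applied to John's account number this yields xxxxxxxxxx8931, matching the masked table shown for user1.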
07-09-2016
09:29 AM
2 Kudos
Problem: with most REST calls you might not face this issue, but in some cases a service-to-service call is made. For example, HBase calls Ranger Admin to download policies using the following REST call:

https://localhost:6182/service/plugins/policies/download/

In this case the Ranger Admin truststore must hold the certificate of the client that is trying to download the policy. So if HBase tries to download the policy, the call works fine, because we had already set up SSL for Ranger and the plugins, and Ranger Admin holds the certificate for the HBase plugin in its truststore. But if you try to make this call with curl from your own REST client, it fails:

curl -k -u admin:admin 'https://localhost:6182/service/plugins/policies/download/'

It throws the following kind of error:

ERROR org.apache.ranger.common.ServiceUtil (ServiceUtil.java:1376) - Unauthorized access. Unable to get client certificate. serviceName=cl1_hadoop
2016-07-06 05:51:46,264 [http-bio-6182-exec-26] INFO org.apache.ranger.common.RESTErrorUtil (RESTErrorUtil.java:65) - Request failed. SessionId=null, loginId=hdfs, logMessage=Unauthorized access - unable to get client certificate
javax.ws.rs.WebApplicationException
at org.apache.ranger.common.RESTErrorUtil.createRESTException(RESTErrorUtil.java:56)
at org.apache.ranger.common.RESTErrorUtil.createRESTException(RESTErrorUtil.java:335)
at org.apache.ranger.common.ServiceUtil.isValidateHttpsAuthentication(ServiceUtil.java:1377)
at org.apache.ranger.rest.ServiceREST.getSecureServicePoliciesIfUpdated(ServiceREST.java:1847)
at org.apache.ranger.rest.ServiceREST$FastClassByCGLIB$92dab672.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:191)
...

Resolution: to resolve this problem we need to pass a client key while making the curl call, so first generate the key using the following steps:

1. Using the keystore of the Ranger Hadoop plugin, generate a PKCS12-type store as follows:

keytool -importkeystore -srckeystore <source keystore path> -destkeystore <PKCS12 store path> -srcstorepass <source store password> -srcstoretype jks -deststoretype PKCS12 -srcalias <source keystore alias> -deststorepass <pkcs12 store password> -destkeypass <key password>

2. Now that the PKCS12-type store has been generated, use it to generate the key:

openssl pkcs12 -in <PKCS12 store path> -out <pem key file path> -nodes -passin pass:<key password>

Now you can use this key to make the curl call as follows:

curl -k -u admin:admin --cert <pem key file path>:<key password> 'https://localhost:6182/service/plugins/policies/download/'

Note: the same steps can be followed if you face such an exception while making a curl call to any other service where the server needs a client certificate to allow the call.
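The PKCS12-to-PEM conversion in step 2 can be rehearsed end to end with a throwaway key; everything below is self-contained and uses only illustrative names and passwords:

```shell
set -e
dir="${TMPDIR:-/tmp}/ranger-curl-cert-demo"
rm -rf "$dir" && mkdir -p "$dir"

# Throwaway key + self-signed cert standing in for the plugin keystore entry.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj '/CN=demo-plugin' \
  -keyout "$dir/key.pem" -out "$dir/cert.pem"

# Bundle into a PKCS12 store (step 1 produces this from the JKS via keytool).
openssl pkcs12 -export -in "$dir/cert.pem" -inkey "$dir/key.pem" \
  -out "$dir/store.p12" -passout pass:secret

# Step 2: extract a PEM file holding both the cert and the unencrypted key;
# this is the file handed to curl --cert.
openssl pkcs12 -in "$dir/store.p12" -out "$dir/client.pem" -nodes \
  -passin pass:secret
```

The resulting client.pem plays the role of <pem key file path> in the curl call above.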
06-14-2016
10:47 AM
3 Kudos
Follow these steps for remote debugging of Ranger:

1. Clone the Apache Ranger code to your local machine and import the project into Eclipse: http://ranger.apache.org/quick_start_guide.html

2. Say you want to debug the Ranger Admin process. Go to the machine where Ranger Admin is running, edit /usr/bin/ranger-admin and add the following entry:

JAVA_OPTS="$JAVA_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=9000"

Keep suspend=n if nothing needs to be debugged in the startup flow. Keep suspend=y if the Ranger startup flow needs to be debugged; in that case the process will wait for Eclipse to connect to the debug port before it starts.

3. Check whether debug port 9000 is open for the Ranger Admin process:

netstat -tanlp | grep 16776
tcp 0 0 0.0.0.0:9000 0.0.0.0:* LISTEN 16776/java

4. In Eclipse, go to Debug As -> Debug Configurations -> Remote Java Application and provide the host and port: the host is the node IP/hostname where Ranger Admin is running, and the port is the debug port configured in JAVA_OPTS (9000 in our case). Click Debug, and the Eclipse client will connect to the running process.

5. Now say we want to debug the user-search flow in Ranger. Put a breakpoint at the respective place and perform the user search operation in the Ranger UI; the flow will stop at the breakpoint (see the screenshots: Ranger user search operation, and the breakpoint). From there we can step over/into and debug the flow.

Note: other Ranger processes such as usersync can be debugged the same way.

That is it for remote debugging a Ranger process; please comment with any questions.
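If Eclipse is not at hand, the same JDWP port can be exercised from any machine with a JDK using jdb; the host placeholder and port mirror the example configuration above:

```shell
# Attaches to the JDWP agent opened by the JAVA_OPTS entry above;
# a successful attach confirms the debug port is reachable.
jdb -attach <ranger_admin_host>:9000
```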