Member since: 06-20-2016
Posts: 251
Kudos Received: 196
Solutions: 36
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 9634 | 11-08-2017 02:53 PM |
| | 2048 | 08-24-2017 03:09 PM |
| | 7793 | 05-11-2017 02:55 PM |
| | 6387 | 05-08-2017 04:16 PM |
| | 1930 | 04-27-2017 08:05 PM |
01-31-2017
11:19 PM
2 Kudos
In Zeppelin LDAP Authentication with OpenLDAP and How to Set Up OpenLDAP we showed how to use LDAP authentication with Zeppelin. In this article, we'll harden that configuration by ensuring that Zeppelin and OpenLDAP communicate over LDAPS. LDAPS is a secure protocol that uses TLS to assure the authenticity, confidentiality, and integrity of communications. This prevents man-in-the-middle attacks that sniff traffic to discover LDAP credentials communicated in plaintext, which could compromise the security of the cluster.

The first step is to modify the configuration of the OpenLDAP server, as root, to expose LDAPS connectivity. We'll need to modify /etc/openldap/ldap.conf. Recall that we created /etc/openldap/certs/myldap.field.hortonworks.com.cert in the How to Set Up OpenLDAP article:

#TLS_CACERTDIR /etc/openldap/certs
TLS_CACERT /etc/openldap/certs/myldap.field.hortonworks.com.cert
URI ldaps://myldap.field.hortonworks.com ldap://myldap.field.hortonworks.com
BASE dc=field,dc=hortonworks,dc=com

We also need to modify /etc/sysconfig/slapd:

SLAPD_URLS="ldapi:/// ldap:/// ldaps:///"

Then restart slapd:

systemctl restart slapd

You can confirm that slapd is listening on port 636:

netstat -anp | grep 636

Finally, confirm TLS connectivity and a secure ldapsearch (with the appropriate bind user and password from the previous articles):

# should succeed
openssl s_client -connect myldap.field.hortonworks.com:636 </dev/null

# should succeed
ldapsearch -H ldaps://myldap.field.hortonworks.com:636 -D cn=ldapadm,dc=field,dc=hortonworks,dc=com -w $password -b "ou=People,dc=field,dc=hortonworks,dc=com"

The next step is the client-side configuration. Since we are using a self-signed certificate for the OpenLDAP server, we need to import it into the Java truststore, called cacerts, which is in /etc/pki/ca-trust/extracted/java on my CentOS 7 system. Copy the myldap.field.hortonworks.com.cert file from the OpenLDAP server to the Zeppelin server (this file does not contain sensitive key material, only public keys), and run the following, answering yes when asked whether to trust the certificate:

keytool -import -alias myldap -file /etc/security/certificates/myldap.field.hortonworks.com.cert -keystore cacerts

Otherwise, you will see errors like:

Root exception is javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed

Lastly, in Ambari, we just need to make one small change to the shiro.ini configuration in Zeppelin > Config > Advanced zeppelin-env > shiro_ini_content:

ldapRealm.contextFactory.url = ldaps://myldap.field.hortonworks.com:636

Note the protocol change to LDAPS and the port number change to 636. To test, restart the Zeppelin service and confirm that users can still log in to the Zeppelin UI.
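To double-check that the import succeeded, you can list the new alias from the truststore. This is a quick sanity check, assuming the default cacerts store password of changeit and that you run it from the directory containing cacerts:

cd /etc/pki/ca-trust/extracted/java
keytool -list -alias myldap -keystore cacerts -storepass changeit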
01-25-2017
06:50 PM
4 Kudos
Apache Ranger uses an embedded Tomcat server to provide the web UI for Ranger administration. A previous HCC article provided details on maintaining the log files that are managed by the log4j configuration, including xa_portal.log, ranger_admin_perf.log, and xa_portal_sql.log.
We're going to focus on maintenance of the access_log* logs that get automatically generated by Tomcat, but which are not managed by this log4j configuration. With embedded Tomcat, the configuration is contained within the code for the AccessLogValve (as you can see, it uses an hourly rotation pattern unless overridden by ranger.accesslog.dateformat).
We'll use the logrotate application in CentOS/RHEL to manage these access_log* logs, since the number of files can grow large without rotation and removal in place. You can check how many of these files you have on your Ranger Admin node by running the following (there will be one access_log* file per hour for each day during which the service has run continuously):
ls /var/log/ranger/admin | cut -d '.' -f 1 | uniq -c
Within /etc/logrotate.d, we'll create a configuration specific to these Ranger logs; the main logrotate configuration, /etc/logrotate.conf by default, includes these application-specific configurations as well.
Create a new file (as root) named ranger_access in /etc/logrotate.d in your favorite editor and insert:
/var/log/ranger/admin/access_log* {
daily
copytruncate
compress
dateext
rotate 5
maxage 7
olddir /var/log/ranger/admin/old
missingok
}
This is just an example logrotate configuration. I'll note a couple of items; please see the man page for details on each of these options and for additional examples.
The copytruncate option ensures that Tomcat can keep writing to the same file handle (as opposed to writing to a newly-created file which requires recycling Tomcat)
The compress option will use gzip by default
The maxage option limits how old rotated files can be before they are removed.
The olddir option specifies the directory into which rotated logs are moved; that directory must already exist (see the sketch below).
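A minimal sketch for creating the olddir target up front. The ranger:ranger ownership is an assumption about how Ranger Admin is installed; adjust or omit it to match your environment:

# create the directory that rotated access_log* files will be moved into
sudo mkdir -p /var/log/ranger/admin/old
# assumption: Ranger Admin runs as the "ranger" user; adjust to your environment
sudo chown ranger:ranger /var/log/ranger/admin/old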
Logrotate will be invoked daily as a cronjob by default, due to the existence of the logrotate file in /etc/cron.daily. You can run logrotate manually by specifying the configuration:
sudo /usr/sbin/logrotate /etc/logrotate.conf
Note that logrotate keeps the state of files in /var/lib/logrotate.status and uses the date of last execution captured there as the reference for what to do with a logfile. You can also run logrotate with the -d flag to test your configuration (this won't actually rotate anything; it just produces output describing what would happen).
sudo /usr/sbin/logrotate -d /etc/logrotate.conf 2> /tmp/logrotate.debug
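You can also point logrotate's debug mode directly at the new Ranger file to exercise just that configuration; note that the global defaults from /etc/logrotate.conf are not read in this case:

sudo /usr/sbin/logrotate -d /etc/logrotate.d/ranger_access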
As a result of this configuration, only 5 days' worth of logs are kept, they're stored in the ./old directory, and they're compressed. This ensures that the Ranger Admin access_log* data does not grow unmanageably large.
12-20-2016
05:34 PM
@jzhang Good call. I changed to yarn-cluster mode for the Livy interpreter and was not able to reproduce the error in HDP 2.5.
07-31-2019
03:23 AM
@slachterman I just followed the instructions provided in the article to move generated flow files to ADLS using the PutHDFS processor, but I am getting the errors below. Please help. I have specified all of the configurations for PutHDFS as per the article. It keeps saying "unable to find valid certification path to requested target", but I am not sure where I need to upload the ADLS credential certificate in the PutHDFS processor.

2019-07-30 17:24:20,657 ERROR [Timer-Driven Process Thread-9] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=44e51785-016c-1000-901a-6aa4d9167c2c] Failed to access HDFS due to com.microsoft.azure.datalake.store.ADLException: Last encountered exception thrown after 5 tries. [javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException] [ServerRequestId:null]: com.microsoft.azure.datalake.store.ADLException: Error getting info for file /
Operation GETFILESTATUS failed with exception javax.net.ssl.SSLHandshakeException : sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Last encountered exception thrown after 5 tries. [javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException] [ServerRequestId:null]
Last encountered exception thrown after 5 tries. [javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException,javax.net.ssl.SSLHandshakeException] [ServerRequestId:null]
    at com.microsoft.azure.datalake.store.ADLStoreClient.getExceptionFromResponse(ADLStoreClient.java:1194)
    at com.microsoft.azure.datalake.store.ADLStoreClient.getDirectoryEntry(ADLStoreClient.java:741)
    at org.apache.hadoop.fs.adl.AdlFileSystem.getFileStatus(AdlFileSystem.java:487)
    at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1942)
    at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:236)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1162)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:209)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:316)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:965)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1564)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
    at com.microsoft.azure.datalake.store.HttpTransport.makeSingleCall(HttpTransport.java:307)
    at com.microsoft.azure.datalake.store.HttpTransport.makeCall(HttpTransport.java:90)
    at com.microsoft.azure.datalake.store.Core.getFileStatus(Core.java:691)
    at com.microsoft.azure.datalake.store.ADLStoreClient.getDirectoryEntry(ADLStoreClient.java:739)
    ... 18 common frames omitted
11-18-2016
03:00 AM
Very nice article. If you have a step-by-step procedure with prerequisites, could you please forward it to me (muthukumar.siva@gmail.com)? I would like to implement this in my environment. Thank you in advance.
11-15-2016
03:56 PM
@Sunile Manjee The answer is yes. In HDP 2.5, Spark column security is available with LLAP and Ranger integration. You get fine-grained, column-level access control for SparkSQL with fully dynamic per-user policies, and it doesn't require views. You use standard Ranger policies and tools to control access and masking policies.

The flow:
1. SparkSQL gets data locations, known as “splits”, from HiveServer and plans the query.
2. HiveServer2 authorizes access using Ranger; per-user policies like row filtering are applied.
3. Spark gets a modified query plan based on the dynamic security policy.
4. Spark reads data from LLAP. Filtering/masking is guaranteed by the LLAP server.
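As a hypothetical end-to-end check (the host, port, user, and table names below are placeholders, and this assumes the Spark Thrift Server is configured for LLAP per the HDP 2.5 documentation), you could connect with beeline as a restricted user and query a column covered by a Ranger masking policy; the values should come back masked rather than raw:

# placeholders: sts-host, port 10016, restricted_user, and employees are illustrative only
beeline -u "jdbc:hive2://sts-host:10016/default" -n restricted_user
# then, inside beeline:
#   SELECT ssn, name FROM employees LIMIT 5;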
01-25-2018
11:14 PM
Were you able to complete the rest of the steps?
12-07-2016
05:58 PM
We don't use Ranger (yet), but HiveServer2 with impersonation, so I must specify the Hive principal in the connection string. This works in beeline and with JDBC clients, but I don't know where to specify principal=hive/hivehost@MYREALM in Tableau. Do you have any idea, by chance? Thanks!