Member since: 12-13-2017
11 Posts
0 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 7626 | 03-28-2018 11:22 AM |
03-18-2020
12:12 PM
I'm attempting to set up two-way SSL between Ranger Admin and the Ranger HDFS plugin.
Ranger Admin itself works without issue, but the HDFS plugin is unable to communicate with Ranger Admin over SSL, and Ranger HDFS policies do not get applied. The main error message from /var/log/ranger/admin/xa_portal.log:
[http-xxx-xxxx-exec-1] INFO org.apache.ranger.common.RESTErrorUtil (RESTErrorUtil.java:345) - Request failed. loginId=null, logMessage=VXResponse={org.apache.ranger.view.VXResponse@7bed9b22statusCode={1} msgDesc={Unauthorized access - unable to get client certificate} messageList={[VXMessage={org.apache.ranger.view.VXMessage@72ecb1bfname={OPER_NOT_ALLOWED_FOR_ENTITY} rbKey={xa.error.oper_not_allowed_for_state} message={Operation not allowed for entity} objectId={null} fieldName={null} }]} }javax.ws.rs.WebApplicationException
Is there something in particular I am missing?
References for steps followed:
1) https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/configuring-wire-encryption/content/self_signed_configuring_the_ranger_hdfs_plugin_for_ssl.html
2) https://www.youtube.com/watch?v=g6m-LII4zjE&feature=emb_title
3) https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.admin.doc/doc/admin_ranger_ssl_selfsigned_plugins.html
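For reference, the plugin-side SSL properties these guides walk through (in ranger-hdfs-policymgr-ssl.xml, or the equivalent Ambari section) look roughly like the following - the paths and credential file below are placeholders, not our actual values:
<property>
  <name>xasecure.policymgr.clientssl.keystore</name>
  <value>/etc/hadoop/conf/ranger-plugin-keystore.jks</value>
</property>
<property>
  <name>xasecure.policymgr.clientssl.keystore.credential.file</name>
  <value>jceks://file/etc/hadoop/conf/ranger-plugin-ssl.jceks</value>
</property>
<property>
  <name>xasecure.policymgr.clientssl.truststore</name>
  <value>/etc/hadoop/conf/ranger-plugin-truststore.jks</value>
</property>
<property>
  <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
  <value>jceks://file/etc/hadoop/conf/ranger-plugin-ssl.jceks</value>
</property>
If I understand the linked docs correctly, the plugin certificate also has to be imported into Ranger Admin's truststore and its common name registered in the HDFS repository config ("Common Name for Certificate") before Admin will accept the client certificate.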
Thanks
Braz
Labels:
- Apache Ranger
06-25-2018
09:58 AM
Is there a Kudu command for obtaining table size information? If not, how does Cloudera Manager obtain it? We would like to replicate this behavior so that we can configure e-mail alerts to be sent whenever a table reaches a particular size. Thanks, Braz
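P.S. For example, is scraping the tablet servers' metrics endpoints the intended way to get this - i.e., is that where Cloudera Manager reads it from? Something along these lines, where the hostname is a placeholder and 8050 is the default tablet server web UI port:
curl -s 'http://tserver01.example.com:8050/metrics?metrics=on_disk_size' > tserver_sizes.json
and then summing the per-tablet on_disk_size values for the tablets belonging to a table?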
04-24-2018
12:36 PM
Okay, this makes sense I think, but what I meant is that users who access through Impala will be subject to Impala's more fine-grained permissions from Sentry. On the other hand, users who access by another method, such as the Kudu CLI, will not be, since the CLI accesses Kudu directly, and Kudu is not integrated with Sentry, so it does not pick up the Hive or HDFS ACLs applied there. Thanks, Braz
04-24-2018
11:26 AM
So to limit access exclusively to Impala, we would add the impala user to the Kudu User Access Control List without adding the individual IDs who would interact with Kudu via Impala? This would prevent those same user IDs from being able to delete or view data from the Kudu CLI, since only 'impala' is allowed access? Would this also prevent users from accessing Kudu through its APIs, including through Spark? Thanks, Braz
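P.S. If I'm reading the security docs correctly, the ACL in question is the --user_acl gflag on the Kudu masters and tablet servers, so the change would look something like the following (set via the Kudu gflagfile safety valve in Cloudera Manager; kuduadmin is a placeholder for whatever admin account we keep for CLI maintenance):
# allow only the impala service user plus a designated admin to perform operations
--user_acl=impala,kuduadmin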
04-23-2018
03:02 PM
Kudu 1.5.0 has been installed on our cluster, which is currently running CDH 5.13.1. We've read that Kudu authorization is coarse-grained, giving any user with access to Kudu full access to the data:
https://www.cloudera.com/documentation/enterprise/5-13-x/topics/kudu_security.html#concept_bbg_4jr_kz
After testing, we've found that the Kudu command-line interface is where these coarse-grained ACLs get applied. We observed that a non-admin user was able to drop tables in a database this account had not been granted access to, via the following command:
kudu table delete <master_address> <table_name>
This is a security risk and a concern for us when implementing Kudu in live environments where data security is critical. Is there a way to limit access to the Kudu command-line interface to admin groups, and is this a suggested method? Thank you, Braz
Labels:
- Apache Kudu
04-05-2018
10:06 AM
Okay, and I hope I'm not asking too much in this one forum post, but since it's related: what is the recommended number of Tablet Server data directories? Could we assign one directory to each JBOD disk used by the DataNode - without, of course, using a subdirectory of the DataNode's own directories?
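In other words, on a worker with three JBOD mounts shared with the DataNode, would a flag layout like this (using our existing paths) be reasonable?
--fs_wal_dir=/data1/kudu/tablet_wal
--fs_data_dirs=/data1/kudu/tablet_data,/data2/kudu/tablet_data,/data3/kudu/tablet_data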
04-04-2018
02:02 PM
Thanks for these answers. I'm having issues with replying, but regarding your last answer concerning the WAL directory and the metadata: would you recommend having a separate directory for the Tablet Server WAL? Thanks
04-03-2018
02:50 PM
Hello, We have just installed Kudu in our test environment, currently running CDH 5.13.1. Because this is a small POC environment, we have only 2 tablet servers and a single master, making it usable only for functional testing.

There were 4 required configuration properties upon installation of Kudu with CDH 5.13, for which the following were configured:

Kudu Master WAL Directory: /data/kudu/master_wal
Kudu Master Data Directories: /data/kudu/master_wal
Kudu Tablet Server WAL Directory: /data1/kudu/tablet_wal
Kudu Tablet Server Data Directories: /data1/kudu/tablet_data, /data2/kudu/tablet_data, /data3/kudu/tablet_data

My question concerns the Master data directories configuration property: should multiple directories be used for storing the Kudu master data? It appears this is expected, since the property name is plural and Cloudera Manager sets it up to be configured similarly to the tablet server data directories. But if the Kudu Master resides on one of the master/utility nodes, then there are not multiple JBOD mount points as there are on a worker node. Are there significant benefits to having multiple Kudu master data directories, or inherent risks with just a single master data directory? If we configured an additional master data directory on the OS disk (such as under /var or /opt), would this be a concern?

I've read that SSDs are recommended for the WAL directories. Is there a major performance impact if the WAL directory is on the same mount point as one of the data directories?

Thank you, Braz
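P.S. For concreteness, the single-data-directory master layout we're weighing would look like this as gflags (master_data here is a hypothetical path - as noted above, our current setting reuses the WAL path):
--fs_wal_dir=/data/kudu/master_wal
--fs_data_dirs=/data/kudu/master_data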
Labels:
- Apache Kudu
03-28-2018
11:22 AM
All I'm seeing are messages like the following when restarting haproxy:
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy static started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impala started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy app started.
And sometimes other messages like the following appear in the haproxy log - but they don't come in immediately after running an Impala query over JDBC, such as from Hue or beeline:
2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: 172.28.6.234:37768 [27/Mar/2018:15:51:47.621] impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0
2018-03-27T18:08:08-04:00 172.28.2.234 haproxy[4286]: 172.28.2.20:39978 [27/Mar/2018:18:08:08.888] impala impala/impalad1 0/0/+0 +0 -- 3/1/1/1/0 0/0
However, no logs come in when accessing from a BI tool such as Tableau, using either its native Cloudera Impala connector or the Cloudera Impala ODBC driver. Is there a way to increase the logging for haproxy so that we can know which Impala Daemon a query is executing on, for the purpose of debugging potential issues when someone accesses from a BI application? I already have it set to debug under the listen section for impalajdbc:
listen impalajdbc :21051
mode tcp
option tcplog
balance roundrobin
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050
Thanks, Braz
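P.S. One thing I'm wondering about, assuming I'm reading the haproxy docs correctly: tcplog lines are normally emitted only when a TCP session ends, so a long-lived ODBC session from a BI tool might simply not have logged yet. Would adding option logasap, which logs as soon as the connection is established instead of waiting for it to close, surface the chosen backend immediately? Roughly:
listen impalajdbc :21051
mode tcp
option tcplog
option logasap
balance roundrobin
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050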
03-21-2018
09:28 AM
Below are the contents of the haproxy.cfg file:
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 172.28.xx.xx local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
# stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
log global
option tcplog
option dontlognull
# option http-server-close
# option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
#Values increased for fixing timeout issue with impala queries
#timeout client 1m
#timeout server 1m
timeout client 120m
timeout server 120m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxies to the backends
#---------------------------------------------------------------------
frontend main *:5000
acl url_static path_beg -i /static /images /javascript /stylesheets
acl url_static path_end -i .jpg .gif .png .css .js
use_backend static if url_static
default_backend app
option tcplog
log 172.28.xx.xx local2 debug
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
balance roundrobin
#server static 127.0.0.1:4331 check
#This sets up the admin page for HA proxy at port 25002
#listen stats :25002
# listen stats
# bind hdp004v.cmssvc.local:20020 ssl crt /data/security/hdp004v/hdp004v.crt.pem
# balance
# mode http
# stats enable
# stats auth username:password
# stats uri /haproxy?stats
#This is the setup for Impala. Impala clients connect to load_balancer_host:25003
#HAProxy will balance connections among the list of servers listed below.
# The impalad daemons are listening at port 21000 for beeswax (impala-shell) or the original ODBC driver.
# For JDBC or ODBC version 2.x driver, use port 21050 instead of 21000
listen impala :25003
mode tcp
option tcplog
balance leastconn
server impalad1 hdp104v.cmssvc.local:21000
server impalad2 hdp105v.cmssvc.local:21000
# Setup for Hue or other JDBC-enabled applications. Hue requires sticky sessions.
# The application connects to load_balancer_host:21051, and HAProxy balances
# connections to the associated hosts, where Impala listens for JDBC
# requests on port 21050
listen impalajdbc :21051
mode tcp
option tcplog
balance source
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
balance roundrobin
#server app1 127.0.0.1:5001 check
#server app2 127.0.0.1:5002 check
#server app3 127.0.0.1:5003 check
#server app4 127.0.0.1:5004 check
Thanks, Braz