Member since: 12-13-2017
11 Posts
0 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 7626 | 03-28-2018 11:22 AM |
03-18-2020
12:12 PM
I'm attempting to set up two-way SSL between Ranger Admin and the Ranger HDFS plugin.
Ranger Admin itself works without issue, but the HDFS plugin is unable to communicate with Ranger Admin over SSL, and Ranger HDFS policies do not get applied. The main error message from /var/log/ranger/admin/xa_portal.log:
[http-xxx-xxxx-exec-1] INFO org.apache.ranger.common.RESTErrorUtil (RESTErrorUtil.java:345) - Request failed. loginId=null, logMessage=VXResponse={org.apache.ranger.view.VXResponse@7bed9b22statusCode={1} msgDesc={Unauthorized access - unable to get client certificate} messageList={[VXMessage={org.apache.ranger.view.VXMessage@72ecb1bfname={OPER_NOT_ALLOWED_FOR_ENTITY} rbKey={xa.error.oper_not_allowed_for_state} message={Operation not allowed for entity} objectId={null} fieldName={null} }]} }javax.ws.rs.WebApplicationException
Is there something in particular I am missing?
References for steps followed:
1) https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/configuring-wire-encryption/content/self_signed_configuring_the_ranger_hdfs_plugin_for_ssl.html
2) https://www.youtube.com/watch?v=g6m-LII4zjE&feature=emb_title
3) https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.admin.doc/doc/admin_ranger_ssl_selfsigned_plugins.html
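For reference, the plugin-side SSL properties these guides walk through (in ranger-hdfs-policymgr-ssl.xml, or the equivalent Ambari section) look roughly like the following - the paths and credential file below are placeholders, not our actual values:
<property>
  <name>xasecure.policymgr.clientssl.keystore</name>
  <value>/etc/hadoop/conf/ranger-plugin-keystore.jks</value>
</property>
<property>
  <name>xasecure.policymgr.clientssl.keystore.credential.file</name>
  <value>jceks://file/etc/hadoop/conf/ranger-plugin-ssl.jceks</value>
</property>
<property>
  <name>xasecure.policymgr.clientssl.truststore</name>
  <value>/etc/hadoop/conf/ranger-plugin-truststore.jks</value>
</property>
<property>
  <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
  <value>jceks://file/etc/hadoop/conf/ranger-plugin-ssl.jceks</value>
</property>
If I understand the linked docs correctly, the plugin certificate also has to be imported into Ranger Admin's truststore and its common name registered in the HDFS repository config ("Common Name for Certificate") before Admin will accept the client certificate.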
Thanks
Braz
Labels:
- Apache Ranger
06-25-2018
09:58 AM
Is there a Kudu command for obtaining table size information? If not, how does Cloudera Manager obtain it? We would like to replicate this behavior so that we can configure e-mail alerts to be sent whenever a table reaches a particular size. Thanks, Braz
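P.S. For example, is scraping the tablet servers' metrics endpoints the intended way to get this - i.e., is that where Cloudera Manager reads it from? Something along these lines, where the hostname is a placeholder and 8050 is the default tablet server web UI port:
curl -s 'http://tserver01.example.com:8050/metrics?metrics=on_disk_size' > tserver_sizes.json
and then summing the per-tablet on_disk_size values for the tablets belonging to a table?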
04-24-2018
12:36 PM
Okay, this makes sense I think, but what I meant is that users who access through Impala will be subject to Impala's more fine-grained permissions from Sentry. On the other hand, users who access by another method, such as the Kudu CLI, will not be, since the CLI accesses Kudu directly, and Kudu is not integrated with Sentry, so it does not pick up the Hive or HDFS ACLs applied there. Thanks, Braz
04-24-2018
11:26 AM
So to limit access exclusively to Impala, we would add the impala user to the Kudu User Access Control List without adding the individual IDs who would interact with Kudu via Impala? This would prevent those same user IDs from being able to delete or view data from the Kudu CLI, since only 'impala' is allowed access? Would this also prevent users from accessing Kudu through its APIs, including through Spark? Thanks, Braz
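P.S. If I'm reading the security docs correctly, the ACL in question is the --user_acl gflag on the Kudu masters and tablet servers, so the change would look something like the following (set via the Kudu gflagfile safety valve in Cloudera Manager; kuduadmin is a placeholder for whatever admin account we keep for CLI maintenance):
# allow only the impala service user plus a designated admin to perform operations
--user_acl=impala,kuduadmin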
04-23-2018
03:02 PM
Kudu 1.5.0 has been installed on our cluster, which is currently running CDH 5.13.1. We've read that Kudu authorization is coarse-grained, giving any user with access to Kudu full access to the data:
https://www.cloudera.com/documentation/enterprise/5-13-x/topics/kudu_security.html#concept_bbg_4jr_kz
After testing, we've found that the Kudu command-line interface is where these coarse-grained ACLs get applied. We observed that a non-admin user was able to drop tables in a database this account had not been granted access to, via the following command:
kudu table delete <master_address> <table_name>
This is a security risk and a concern for us when implementing Kudu in live environments where data security is critical. Is there a way to limit access to the Kudu command-line interface to admin groups, and is this a suggested method? Thank you, Braz
Labels:
- Apache Kudu
04-05-2018
10:06 AM
Okay, and I hope I'm not asking too much in this one forum post, but since it's related: what is the recommended number of Tablet Server data directories? Could we assign one directory to each JBOD disk used by the DataNode - without, of course, using a subdirectory of the DataNode's own directories?
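In other words, on a worker with three JBOD mounts shared with the DataNode, would a flag layout like this (using our existing paths) be reasonable?
--fs_wal_dir=/data1/kudu/tablet_wal
--fs_data_dirs=/data1/kudu/tablet_data,/data2/kudu/tablet_data,/data3/kudu/tablet_data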
04-04-2018
02:02 PM
Thanks for these answers. I'm having issues with replying, but regarding your last answer concerning the WAL directory and the metadata: would you recommend having a separate directory for the Tablet Server WAL? Thanks
04-03-2018
02:50 PM
Hello, We have just installed Kudu in our test environment, currently running CDH 5.13.1. Because this is a small POC environment, we have only 2 tablet servers and a single master, making it usable only for functional testing.

There were 4 required configuration properties upon installation of Kudu with CDH 5.13, for which the following were configured:

Kudu Master WAL Directory: /data/kudu/master_wal
Kudu Master Data Directories: /data/kudu/master_wal
Kudu Tablet Server WAL Directory: /data1/kudu/tablet_wal
Kudu Tablet Server Data Directories: /data1/kudu/tablet_data, /data2/kudu/tablet_data, /data3/kudu/tablet_data

My question concerns the Master data directories configuration property: should multiple directories be used for storing the Kudu master data? It appears this is expected, since the property name is plural and Cloudera Manager sets it up to be configured similarly to the tablet server data directories. But if the Kudu Master resides on one of the master/utility nodes, then there are not multiple JBOD mount points as there are on a worker node. Are there significant benefits to having multiple Kudu master data directories, or inherent risks with just a single master data directory? If we configured an additional master data directory on the OS disk (such as under /var or /opt), would this be a concern?

I've read that SSDs are recommended for the WAL directories. Is there a major performance impact if the WAL directory is on the same mount point as one of the data directories?

Thank you, Braz
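P.S. For concreteness, the single-data-directory master layout we're weighing would look like this as gflags (master_data here is a hypothetical path - as noted above, our current setting reuses the WAL path):
--fs_wal_dir=/data/kudu/master_wal
--fs_data_dirs=/data/kudu/master_data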
Labels:
- Apache Kudu
03-28-2018
11:22 AM
All I'm seeing are messages like the following when restarting haproxy:
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy static started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impala started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy app started.
And sometimes other messages like the following appear in the haproxy log - but they don't come in immediately after running an Impala query over JDBC, such as from Hue or beeline:
2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: 172.28.6.234:37768 [27/Mar/2018:15:51:47.621] impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0
2018-03-27T18:08:08-04:00 172.28.2.234 haproxy[4286]: 172.28.2.20:39978 [27/Mar/2018:18:08:08.888] impala impala/impalad1 0/0/+0 +0 -- 3/1/1/1/0 0/0
However, no logs come in when accessing from a BI tool such as Tableau, using either its native Cloudera Impala connector or the Cloudera Impala ODBC driver. Is there a way to increase the logging for haproxy so that we can know which Impala Daemon a query is executing on, for the purpose of debugging potential issues when someone accesses from a BI application? I already have it set to debug under the listen section for impalajdbc:
listen impalajdbc :21051
mode tcp
option tcplog
balance roundrobin
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050
Thanks, Braz
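P.S. One thing I'm wondering about, assuming I'm reading the haproxy docs correctly: tcplog lines are normally emitted only when a TCP session ends, so a long-lived ODBC session from a BI tool might simply not have logged yet. Would adding option logasap, which logs as soon as the connection is established instead of waiting for it to close, surface the chosen backend immediately? Roughly:
listen impalajdbc :21051
mode tcp
option tcplog
option logasap
balance roundrobin
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050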
03-21-2018
09:28 AM
Below are the contents of the haproxy.cfg file:
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 172.28.xx.xx local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
# stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
log global
option tcplog
option dontlognull
# option http-server-close
# option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
#Values increased for fixing timeout issue with impala queries
#timeout client 1m
#timeout server 1m
timeout client 120m
timeout server 120m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxies to the backends
#---------------------------------------------------------------------
frontend main *:5000
acl url_static path_beg -i /static /images /javascript /stylesheets
acl url_static path_end -i .jpg .gif .png .css .js
use_backend static if url_static
default_backend app
option tcplog
log 172.28.xx.xx local2 debug
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
balance roundrobin
#server static 127.0.0.1:4331 check
#This sets up the admin page for HA proxy at port 25002
#listen stats :25002
# listen stats
# bind hdp004v.cmssvc.local:20020 ssl crt /data/security/hdp004v/hdp004v.crt.pem
# balance
# mode http
# stats enable
# stats auth username:password
# stats uri /haproxy?stats
#This is the setup for Impala. Impala clients connect to load_balancer_host:25003
#HAProxy will balance connections among the list of servers listed below.
# The impalad daemons are listening at port 21000 for beeswax (impala-shell) or the original ODBC driver.
# For JDBC or ODBC version 2.x driver, use port 21050 instead of 21000
listen impala :25003
mode tcp
option tcplog
balance leastconn
server impalad1 hdp104v.cmssvc.local:21000
server impalad2 hdp105v.cmssvc.local:21000
# Setup for Hue or other JDBC-enabled applications. Hue requires sticky sessions.
# The application connects to load_balancer_host:21051, and HAProxy balances
# connections to the associated hosts, where Impala listens for JDBC
# requests on port 21050
listen impalajdbc :21051
mode tcp
option tcplog
balance source
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
balance roundrobin
#server app1 127.0.0.1:5001 check
#server app2 127.0.0.1:5002 check
#server app3 127.0.0.1:5003 check
#server app4 127.0.0.1:5004 check
Thanks, Braz