Member since: 06-05-2017
Posts: 18
Kudos Received: 7
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
| 2113 | 09-29-2017 08:34 AM
05-13-2020
12:22 AM
@getschwifty Revised the article to reflect the best practices. Try it out and see if that helps you. Thanks for your valuable feedback.
05-12-2020
08:17 AM
@getschwifty Please refer to the latest documentation on setting up an HBase client account. Use the client account principal and keytab files from the Java application. You will also have to adjust the HBase native ACLs or Ranger policies to allow the users/tenants to access tenant-specific HBase resources.
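For example, with HBase native ACLs a tenant user can be restricted to a tenant-specific namespace from the HBase shell. A minimal sketch; 'tenant1' and '@tenant1_ns' are placeholder names, not from the original post:

# Run from the hbase shell as an HBase admin; grants read/write/execute
# on the tenant's own namespace only ('@' denotes a namespace)
grant 'tenant1', 'RWX', '@tenant1_ns'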
03-19-2018
02:09 PM
3 Kudos
This article provides an overview of monitoring key Hive LLAP metrics: Hive LLAP configurations, YARN queue setup, YARN containers, LLAP cache hit ratio, executors, IO elevator metrics, and JVM heap and non-heap usage.

Execution Engine

LLAP is not an execution engine (like MapReduce or Tez). Overall execution is scheduled and monitored by an existing Hive execution engine (such as Tez), transparently over both LLAP nodes and regular containers. The level of LLAP support depends on each individual execution engine (starting with Tez). MapReduce support is not planned, but other engines may be added later. Other frameworks like Pig and Spark also have the choice of using LLAP daemons.

Enabling LLAP, the memory per daemon, the in-memory cache per daemon, the number of nodes running the Hive LLAP daemon (num_llap_nodes_for_llap_daemons), and the number of executors per LLAP daemon are all set in Advanced hive-interactive-site.

Cache Basics

The daemon caches metadata for input files as well as the data itself. Metadata and index information can be cached even for data that is not currently cached. Metadata is stored in-process in Java objects; cached data is stored and kept off-heap. The eviction policy is tuned for analytical workloads with frequent (partial) table scans. Initially, a simple policy like LRFU is used; the policy is pluggable.

Caching granularity: column chunks are the unit of data in the cache. This achieves a compromise between low-overhead processing and storage efficiency. The granularity of the chunks depends on the particular file format and execution engine (vectorized row batch size, ORC stripe, etc.). A Bloom filter is automatically created to provide dynamic runtime filtering.

Resource Management

YARN remains responsible for the management and allocation of resources. The YARN container delegation model is used to allow the transfer of allocated resources to LLAP. To avoid the limitations of JVM memory settings, cached data is kept off-heap, as are large buffers for processing (e.g., group by, joins). This way the daemon can use a small amount of JVM memory, and additional resources (i.e., CPU and memory) are assigned based on workload.

LLAP YARN Queue

It is important to know how the different parameters in the YARN queue configuration affect LLAP performance:

yarn.scheduler.capacity.root.llap.capacity=60
yarn.scheduler.capacity.root.llap.maximum-capacity=60
yarn.scheduler.capacity.root.llap.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.llap.ordering-policy=fifo
yarn.scheduler.capacity.root.llap.priority=1
yarn.scheduler.capacity.root.llap.state=RUNNING
yarn.scheduler.capacity.root.llap.user-limit-factor=1

Resource Manager UI

Please refer to the original article for the different Grafana dashboards: http://www.kartikramalingam.com/hive-llap/
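Most of the metrics above (cache hit ratio, IO elevator, executors, JVM heap) can also be spot-checked directly against each LLAP daemon without Grafana. A quick sketch, assuming the default hive.llap.daemon.web.port of 15002 and a placeholder host name:

# Dump all LLAP daemon metrics (cache, IO elevator, executors, JVM) as JSON
curl -s http://llap-node:15002/jmx
# Narrow down to the cache metrics (exact metric names can vary by version)
curl -s http://llap-node:15002/jmx | grep -i cache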
11-27-2017
05:57 PM
1 Kudo
Just look at Kafka as a distributed log file: even if you insert the same data again and again, it just appends the data to a distributed log file.
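You can see the append-only behavior with the console tools. A quick sketch using the HDP broker paths; the broker host and the 'test' topic are placeholders:

# Produce the exact same record twice
echo "same-record" | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list broker1:6667 --topic test
echo "same-record" | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list broker1:6667 --topic test
# Reading from the beginning returns both copies - nothing was overwritten
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --bootstrap-server broker1:6667 --topic test --from-beginning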
09-29-2017
08:34 AM
1 Kudo
Ambari can create the user home directory when you create a new user in Ambari - https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-administration/content/create_user_home_directory.html
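If you would rather create it by hand, the equivalent HDFS commands are (a sketch; 'newuser' is a placeholder):

# Run as the hdfs superuser: create the home directory and hand it over
sudo -u hdfs hdfs dfs -mkdir /user/newuser
sudo -u hdfs hdfs dfs -chown newuser:hdfs /user/newuser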
09-02-2017
04:03 PM
@Pravin Bhagade Thanks. Updated the document.
08-04-2017
09:08 PM
Short Description: This sample code helps to connect to a Kerberos-enabled HBase cluster from a Java program.

Code Walkthrough:

Create an HBaseConfiguration and pass the HBase cluster parameters:

Configuration hbaseConfig = HBaseConfiguration.create();
hbaseConfig.addResource("/path_to_hbase_conf/hbase-site.xml");
hbaseConfig.addResource("/path_to_hbase_conf/core-site.xml");
hbaseConfig.set("hadoop.security.authentication", "Kerberos");

Set the user principal and keytab file names. Please make sure the keytab files are in the respective folder:

String principal = System.getProperty("kerberosPrincipal", "hbaseuser@EXAMPLE.COM");
String keytab = System.getProperty("kerberosKeytab", "/path_to_keytab/hbase-client.keytab");

The essential Kerberos configuration information is the default realm and the default KDC. As with most Kerberos installations, a Kerberos configuration file, krb5.conf, is consulted to determine these; the default location is /etc/krb5.conf (Linux). If the krb5.conf file is in a different location, or you want to pass a custom krb5.conf:

System.setProperty("java.security.krb5.conf", "src/krb5.conf");

Log in the user from the keytab file:

UserGroupInformation.setConfiguration(hbaseConfig);
UserGroupInformation.loginUserFromKeytab(principal, keytab);

Check the connection:

HBaseAdmin.checkHBaseAvailable(hbaseConfig);

Options to enable debug logging:

System.setProperty("sun.security.jgss.debug", "true");
System.setProperty("sun.security.krb5.debug", "true");
System.setProperty("java.security.debug", "logincontext,policy,scl,gssloginconfig");

Well, you are good to go now.
08-01-2017
08:16 AM
Is it a secured HDP cluster, and what versions of Kafka and Scala are you running? If your Kafka is not Kerberized, let's try commenting out the Kerberos-related settings and give it a try. To do that, comment out the Kerberos-related commands on all brokers:

sed -i '/^export KAFKA_CLIENT_KERBEROS_PARAMS/s/^/# /' /usr/hdp/current/kafka-broker/bin/*.sh
grep "export KAFKA_CLIENT_KERBEROS" /usr/hdp/current/kafka-broker/bin/*.sh # to confirm
08-01-2017
07:43 AM
1 Kudo
Just to make sure, could you please check whether the Kafka broker is listening on the 172.16.3.196 machine and that the port number is 6667.
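A quick way to verify from the shell:

# On the broker host: confirm a process is listening on 6667
netstat -tlnp | grep 6667
# From the client machine: confirm the port is reachable
nc -vz 172.16.3.196 6667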
07-28-2017
08:10 AM
1 Kudo
1. Now let's configure Knox to use our AD for authentication. Replace the content below in Ambari > Knox > Config > Advanced topology:

<topology>
<gateway>
<provider>
<role>authentication</role>
<name>ShiroProvider</name>
<enabled>true</enabled>
<param>
<name>sessionTimeout</name>
<value>30</value>
</param>
<param>
<name>main.ldapRealm</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
</param>
<!-- changes for AD/user sync -->
<param>
<name>main.ldapContextFactory</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
</param>
<!-- main.ldapRealm.contextFactory needs to be placed before other main.ldapRealm.contextFactory* entries -->
<param>
<name>main.ldapRealm.contextFactory</name>
<value>$ldapContextFactory</value>
</param>
<!-- AD url -->
<param>
<name>main.ldapRealm.contextFactory.url</name>
<value>ldap://ad01.lab.hortonworks.net:389</value>
</param>
<!-- system user -->
<param>
<name>main.ldapRealm.contextFactory.systemUsername</name>
<value>cn=ldap-reader,ou=ServiceUsers,dc=lab,dc=hortonworks,dc=net</value>
</param>
<!-- pass in the password using the alias created earlier -->
<param>
<name>main.ldapRealm.contextFactory.systemPassword</name>
<value>${ALIAS=knoxLdapSystemPassword}</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
<value>simple</value>
</param>
<param>
<name>urls./**</name>
<value>authcBasic</value>
</param>
<!-- AD groups of users to allow -->
<param>
<name>main.ldapRealm.searchBase</name>
<value>ou=CorpUsers,dc=lab,dc=hortonworks,dc=net</value>
</param>
<param>
<name>main.ldapRealm.userObjectClass</name>
<value>person</value>
</param>
<param>
<name>main.ldapRealm.userSearchAttributeName</name>
<value>sAMAccountName</value>
</param>
<!-- changes needed for group sync-->
<param>
<name>main.ldapRealm.authorizationEnabled</name>
<value>true</value>
</param>
<param>
<name>main.ldapRealm.groupSearchBase</name>
<value>ou=CorpUsers,dc=lab,dc=hortonworks,dc=net</value>
</param>
<param>
<name>main.ldapRealm.groupObjectClass</name>
<value>group</value>
</param>
<param>
<name>main.ldapRealm.groupIdAttribute</name>
<value>cn</value>
</param>
</provider>
<provider>
<role>identity-assertion</role>
<name>Default</name>
<enabled>true</enabled>
</provider>
<provider>
<role>authorization</role>
<name>XASecurePDPKnox</name>
<enabled>true</enabled>
</provider>
<!-- Knox HaProvider for Hadoop services -->
<provider>
<role>ha</role>
<name>HaProvider</name>
<enabled>true</enabled>
<param>
<name>OOZIE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
</param>
<param>
<name>HBASE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
</param>
<param>
<name>WEBHCAT</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
</param>
<param>
<name>WEBHDFS</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
</param>
<param>
<name>HIVE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=machine1:2181,machine2:2181,machine3:2181;zookeeperNamespace=hiveserver2</value>
</param>
</provider>
<!-- END Knox HaProvider for Hadoop services -->
</gateway>
<service>
<role>NAMENODE</role>
<url>hdfs://{{namenode_host}}:{{namenode_rpc_port}}</url>
</service>
<service>
<role>JOBTRACKER</role>
<url>rpc://{{rm_host}}:{{jt_rpc_port}}</url>
</service>
<service>
<role>WEBHDFS</role>
<url>http://{{namenode_host}}:{{namenode_http_port}}/webhdfs</url>
</service>
<service>
<role>WEBHCAT</role>
<url>http://{{webhcat_server_host}}:{{templeton_port}}/templeton</url>
</service>
<service>
<role>OOZIE</role>
<url>http://{{oozie_server_host}}:{{oozie_server_port}}/oozie</url>
</service>
<service>
<role>WEBHBASE</role>
<url>http://{{hbase_master_host}}:{{hbase_master_port}}</url>
</service>
<service>
<role>HIVE</role>
<url>http://{{hive_server_host}}:{{hive_http_port}}/{{hive_http_path}}</url>
</service>
<service>
<role>RESOURCEMANAGER</role>
<url>http://{{rm_host}}:{{rm_port}}/ws</url>
</service>
</topology>
Restart Knox gateway services using Ambari.

2. Make sure Knox is configured to use CA certificates:

openssl s_client -showcerts -connect knoxhostname:8443

3. Validate the topology definition:

/usr/hdp/current/knox-server/bin/knoxcli.sh validate-topology --cluster default

4. Test LDAP authentication and authorization:

/usr/hdp/current/knox-server/bin/knoxcli.sh user-auth-test [--cluster c] [--u username] [--p password] [--g] [--d]

This command tests a topology's ability to connect, authenticate, and authorize a user with an LDAP server. The only required argument is the --cluster argument, which specifies the name of the topology you wish to use. Refer to http://knox.apache.org/books/knox-0-12-0/user-guide.html#LDAP+Authentication+and+Authorization for more options.

5. Test the ability to connect, bind, and authenticate with the LDAP server:

/usr/hdp/current/knox-server/bin/knoxcli.sh system-user-auth-test [--cluster c] [--d]

This command tests a given topology's ability to connect, bind, and authenticate with the LDAP server using the settings specified in the topology file. The bind currently only works with Shiro as the authentication provider.

6. Test the Knox connection string to WebHDFS:

curl -vik -u admin:admin-password 'https://<hostname>:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS'

7. To make Knox use Ranger authorization, edit the Advanced topology section and change the authorization provider from AclsAuthz to XASecurePDPKnox. For example, change

<provider>
<role>authorization</role>
<name>AclsAuthz</name>
<enabled>true</enabled>
</provider>
to,
<provider>
<role>authorization</role>
<name>XASecurePDPKnox</name>
<enabled>true</enabled>
</provider>
8. Configure Ranger Knox plugin debug logging. This log setting should show you what is getting passed from the Knox plugin to Ranger. Modify gateway-log4j.properties as below, restart Knox, and review the Ranger Knox plugin log in the file ranger.knoxagent.log:

#Ranger Knox Plugin debug
ranger.knoxagent.logger=DEBUG,console,KNOXAGENT
ranger.knoxagent.log.file=ranger.knoxagent.log
log4j.logger.org.apache.ranger=${ranger.knoxagent.logger}
log4j.additivity.org.apache.ranger=false
log4j.appender.KNOXAGENT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.KNOXAGENT.File=${app.log.dir}/${ranger.knoxagent.log.file}
log4j.appender.KNOXAGENT.layout=org.apache.log4j.PatternLayout
log4j.appender.KNOXAGENT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n %L
log4j.appender.KNOXAGENT.DatePattern=.yyyy-MM-dd
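With the appender above in place, the plugin's calls to Ranger can be followed live. A sketch; the path assumes app.log.dir resolves to the usual Knox log directory:

# Follow the Ranger Knox plugin debug output while testing requests
tail -f /var/log/knox/ranger.knoxagent.log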