Member since: 06-05-2017
Posts: 18
Kudos Received: 7
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2064 | 09-29-2017 08:34 AM
05-13-2020
12:22 AM
@getschwifty Revised the article to reflect best practices. Try it out and see if that helps you. Thanks for your valuable feedback.
05-12-2020
08:17 AM
@getschwifty Please refer to the latest documentation on setting up an HBase client account. Use the client account principal and keytab files from the Java application. You will also have to adjust the HBase native ACLs or Ranger policies to allow the users/tenants to access tenant-specific HBase resources.
03-19-2018
02:09 PM
3 Kudos
This article provides an overview of monitoring key Hive LLAP metrics: Hive LLAP configuration, YARN queue setup, YARN containers, LLAP cache hit ratio, executors, IO elevator metrics, JVM heap and non-heap usage, and so on.

Execution Engine

LLAP is not an execution engine (like MapReduce or Tez). Overall execution is scheduled and monitored by an existing Hive execution engine (such as Tez) transparently over both LLAP nodes and regular containers. The level of LLAP support depends on each individual execution engine (starting with Tez). MapReduce support is not planned, but other engines may be added later. Other frameworks like Pig and Spark also have the option of using LLAP daemons.

Enable LLAP and set the memory per daemon, the in-memory cache per daemon, the number of nodes running the Hive LLAP daemon (num_llap_nodes_for_llap_daemons), and the number of executors per LLAP daemon in Advanced hive-interactive-site.

Cache Basics

The daemon caches metadata for input files as well as the data itself. The metadata and index information can be cached even for data that is not currently cached. Metadata is stored in-process in Java objects; cached data is stored and kept off-heap. The eviction policy is tuned for analytical workloads with frequent (partial) table scans; initially a simple policy like LRFU is used, and the policy is pluggable.

Caching granularity: column chunks are the unit of data in the cache. This strikes a compromise between low-overhead processing and storage efficiency. The granularity of the chunks depends on the particular file format and execution engine (vectorized row batch size, ORC stripe, etc.). A Bloom filter is automatically created to provide dynamic runtime filtering.

Resource Management

YARN remains responsible for the management and allocation of resources. The YARN container delegation model is used to allow the transfer of allocated resources to LLAP. To avoid the limitations of JVM memory settings, cached data is kept off-heap, as are the large buffers used for processing (e.g., group by, joins). This way the daemon itself can run with a small heap, and additional resources (i.e., CPU and memory) are assigned based on workload.

LLAP YARN Queue

It is important to know how the different parameters in the YARN queue configuration affect LLAP performance:

yarn.scheduler.capacity.root.llap.capacity=60
yarn.scheduler.capacity.root.llap.maximum-capacity=60
yarn.scheduler.capacity.root.llap.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.llap.ordering-policy=fifo
yarn.scheduler.capacity.root.llap.priority=1
yarn.scheduler.capacity.root.llap.state=RUNNING
yarn.scheduler.capacity.root.llap.user-limit-factor=1

Resource Manager UI

Please refer to the original article for the different Grafana dashboards: http://www.kartikramalingam.com/hive-llap/
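To verify that queries are actually being served through the LLAP-enabled HiveServer2 Interactive endpoint, a small JDBC smoke test can help. The sketch below is illustrative rather than part of the original article: the hostname, the port 10500 (the usual HDP default for HiveServer2 Interactive), and the table sample_table are assumptions, and the hive-jdbc driver must be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LlapSmokeTest {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // HiveServer2 Interactive (LLAP) endpoint; host and port are assumptions.
        String url = "jdbc:hive2://hs2-interactive-host:10500/default";
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             // Re-running the same scan should drive up the LLAP cache hit ratio in Grafana.
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM sample_table")) {
            while (rs.next()) {
                System.out.println("row count: " + rs.getLong(1));
            }
        }
    }
}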
11-27-2017
05:57 PM
1 Kudo
Just look at Kafka as a distributed log file: even if you insert the same data again and again, it just appends the data to the distributed log file.
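If it helps to see this behavior, here is a minimal producer sketch; the broker address broker1:6667, the topic test-topic, and the kafka-clients dependency are assumptions. Sending the identical record twice yields two distinct, increasing offsets, because the log only ever appends.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DuplicateAppendDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:6667"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record = new ProducerRecord<>("test-topic", "k", "same-value");
            // The same data is appended twice; each append gets its own offset.
            System.out.println("offset 1: " + producer.send(record).get().offset());
            System.out.println("offset 2: " + producer.send(record).get().offset());
        }
    }
}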
09-29-2017
08:34 AM
1 Kudo
Ambari can create the user's home directory when you create a new user in Ambari: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-administration/content/create_user_home_directory.html
09-02-2017
04:03 PM
@Pravin Bhagade Thanks. Updated the document.
08-04-2017
09:08 PM
Short Description: This sample code helps you connect to a Kerberos-enabled HBase cluster from a Java program.

Code Walkthrough:

Create an HBaseConfiguration and pass the HBase cluster parameters:

Configuration hbaseConfig = HBaseConfiguration.create();
hbaseConfig.addResource("/path_to_hbase_conf/hbase-site.xml");
hbaseConfig.addResource("/path_to_hbase_conf/core-site.xml");
hbaseConfig.set("hadoop.security.authentication", "Kerberos");

Set the user principal and keytab file names. Please make sure the keytab files are in the respective folder:

String principal = System.getProperty("kerberosPrincipal", "hbaseuser@EXAMPLE.COM");
String keytab = System.getProperty("kerberosKeytab", "/path_to_keytab/hbase-client.keytab");

The essential Kerberos configuration information is the default realm and the default KDC. As with most Kerberos installations, a Kerberos configuration file, krb5.conf, is consulted to determine such things as the default realm and KDC. The default location is /etc/krb5.conf (Linux). If the krb5.conf file is in a different location, or you want to pass a custom krb5.conf:

System.setProperty("java.security.krb5.conf", "src/krb5.conf");

Log in the user from the keytab file:

UserGroupInformation.setConfiguration(hbaseConfig);
UserGroupInformation.loginUserFromKeytab(principal, keytab);

Check the connection:

HBaseAdmin.checkHBaseAvailable(hbaseConfig);

Options to enable debug logs:

System.setProperty("sun.security.jgss.debug", "true");
System.setProperty("sun.security.krb5.debug", "true");
System.setProperty("java.security.debug", "logincontext,policy,scl,gssloginconfig");

Well, you are good to go now.
08-01-2017
08:16 AM
Is it a secured HDP cluster, and which versions of Kafka and Scala are you running? If your Kafka is not Kerberized, let's comment out the Kerberos-related settings and give it a try. To do that, comment out the Kerberos-related lines in the scripts on all brokers:

sed -i '/^export KAFKA_CLIENT_KERBEROS_PARAMS/s/^/# /' /usr/hdp/current/kafka-broker/bin/*.sh
grep "export KAFKA_CLIENT_KERBEROS" /usr/hdp/current/kafka-broker/bin/*.sh # to confirm
08-01-2017
07:43 AM
1 Kudo
Just to make sure, could you please check whether the Kafka broker is listening on the 172.16.3.196 machine on port 6667?
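If you want to check this from the application side, here is a minimal connectivity sketch; the IP and port are taken from your post, not verified:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BrokerPortCheck {
    public static void main(String[] args) {
        try (Socket socket = new Socket()) {
            // Try to open a TCP connection to the broker with a 5-second timeout.
            socket.connect(new InetSocketAddress("172.16.3.196", 6667), 5000);
            System.out.println("Broker port is reachable");
        } catch (IOException e) {
            System.out.println("Cannot reach broker: " + e.getMessage());
        }
    }
}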
07-28-2017
08:10 AM
1 Kudo
1. Now let's configure Knox to use our AD for authentication. Replace the content below in Ambari > Knox > Config > Advanced topology:
<topology>
<gateway>
<provider>
<role>authentication</role>
<name>ShiroProvider</name>
<enabled>true</enabled>
<param>
<name>sessionTimeout</name>
<value>30</value>
</param>
<param>
<name>main.ldapRealm</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
</param>
<!-- changes for AD/user sync -->
<param>
<name>main.ldapContextFactory</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
</param>
<!-- main.ldapRealm.contextFactory needs to be placed before other main.ldapRealm.contextFactory* entries -->
<param>
<name>main.ldapRealm.contextFactory</name>
<value>$ldapContextFactory</value>
</param>
<!-- AD url -->
<param>
<name>main.ldapRealm.contextFactory.url</name>
<value>ldap://ad01.lab.hortonworks.net:389</value>
</param>
<!-- system user -->
<param>
<name>main.ldapRealm.contextFactory.systemUsername</name>
<value>cn=ldap-reader,ou=ServiceUsers,dc=lab,dc=hortonworks,dc=net</value>
</param>
<!-- pass in the password using the alias created earlier -->
<param>
<name>main.ldapRealm.contextFactory.systemPassword</name>
<value>${ALIAS=knoxLdapSystemPassword}</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
<value>simple</value>
</param>
<param>
<name>urls./**</name>
<value>authcBasic</value>
</param>
<!-- AD groups of users to allow -->
<param>
<name>main.ldapRealm.searchBase</name>
<value>ou=CorpUsers,dc=lab,dc=hortonworks,dc=net</value>
</param>
<param>
<name>main.ldapRealm.userObjectClass</name>
<value>person</value>
</param>
<param>
<name>main.ldapRealm.userSearchAttributeName</name>
<value>sAMAccountName</value>
</param>
<!-- changes needed for group sync-->
<param>
<name>main.ldapRealm.authorizationEnabled</name>
<value>true</value>
</param>
<param>
<name>main.ldapRealm.groupSearchBase</name>
<value>ou=CorpUsers,dc=lab,dc=hortonworks,dc=net</value>
</param>
<param>
<name>main.ldapRealm.groupObjectClass</name>
<value>group</value>
</param>
<param>
<name>main.ldapRealm.groupIdAttribute</name>
<value>cn</value>
</param>
</provider>
<provider>
<role>identity-assertion</role>
<name>Default</name>
<enabled>true</enabled>
</provider>
<provider>
<role>authorization</role>
<name>XASecurePDPKnox</name>
<enabled>true</enabled>
</provider>
<!-- Knox HaProvider for Hadoop services -->
<provider>
<role>ha</role>
<name>HaProvider</name>
<enabled>true</enabled>
<param>
<name>OOZIE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
</param>
<param>
<name>HBASE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
</param>
<param>
<name>WEBHCAT</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
</param>
<param>
<name>WEBHDFS</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
</param>
<param>
<name>HIVE</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=machine1:2181,machine2:2181,machine3:2181;
zookeeperNamespace=hiveserver2</value>
</param>
</provider>
<!-- END Knox HaProvider for Hadoop services -->
</gateway>
<service>
<role>NAMENODE</role>
<url>hdfs://{{namenode_host}}:{{namenode_rpc_port}}</url>
</service>
<service>
<role>JOBTRACKER</role>
<url>rpc://{{rm_host}}:{{jt_rpc_port}}</url>
</service>
<service>
<role>WEBHDFS</role>
<url>http://{{namenode_host}}:{{namenode_http_port}}/webhdfs</url>
</service>
<service>
<role>WEBHCAT</role>
<url>http://{{webhcat_server_host}}:{{templeton_port}}/templeton</url>
</service>
<service>
<role>OOZIE</role>
<url>http://{{oozie_server_host}}:{{oozie_server_port}}/oozie</url>
</service>
<service>
<role>WEBHBASE</role>
<url>http://{{hbase_master_host}}:{{hbase_master_port}}</url>
</service>
<service>
<role>HIVE</role>
<url>http://{{hive_server_host}}:{{hive_http_port}}/{{hive_http_path}}</url>
</service>
<service>
<role>RESOURCEMANAGER</role>
<url>http://{{rm_host}}:{{rm_port}}/ws</url>
</service>
</topology>
Restart the Knox gateway service using Ambari.

2. Make sure Knox is configured to use CA certificates

openssl s_client -showcerts -connect knoxhostname:8443

3. Validate the topology definition

/usr/hdp/current/knox-server/bin/knoxcli.sh validate-topology --cluster default

4. Test LDAP authentication and authorization

/usr/hdp/current/knox-server/bin/knoxcli.sh user-auth-test [--cluster c] [--u username] [--p password] [--g] [--d]

This command tests a topology's ability to connect, authenticate, and authorize a user with an LDAP server. The only required argument is --cluster, which specifies the name of the topology you wish to use. Refer to http://knox.apache.org/books/knox-0-12-0/user-guide.html#LDAP+Authentication+and+Authorization for more options.

5. Test the ability to connect, bind, and authenticate with the LDAP server

/usr/hdp/current/knox-server/bin/knoxcli.sh system-user-auth-test [--cluster c] [--d]

This command tests a given topology's ability to connect, bind, and authenticate with the LDAP server using the settings specified in the topology file. The bind currently works only with Shiro as the authentication provider.

6. Test the Knox connection string to WebHDFS

curl -vik -u admin:admin-password 'https://<hostname>:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS'

7. Make Knox use Ranger authorization

Edit the Advanced topology section and change the authorization provider from AclsAuthz to XASecurePDPKnox. For example, change:

<provider>
<role>authorization</role>
<name>AclsAuthz</name>
<enabled>true</enabled>
</provider>

to:

<provider>
<role>authorization</role>
<name>XASecurePDPKnox</name>
<enabled>true</enabled>
</provider>
8. Configure Ranger Knox plugin debug logging

This log setting shows you what is being passed to Ranger from the Knox plugin. Modify gateway-log4j.properties as shown below, restart Knox, and review the Ranger Knox plugin log in the file ranger.knoxagent.log:

#Ranger Knox Plugin debug
ranger.knoxagent.logger=DEBUG,console,KNOXAGENT
ranger.knoxagent.log.file=ranger.knoxagent.log
log4j.logger.org.apache.ranger=${ranger.knoxagent.logger}
log4j.additivity.org.apache.ranger=false
log4j.appender.KNOXAGENT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.KNOXAGENT.File=${app.log.dir}/${ranger.knoxagent.log.file}
log4j.appender.KNOXAGENT.layout=org.apache.log4j.PatternLayout
log4j.appender.KNOXAGENT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %L %m%n
log4j.appender.KNOXAGENT.DatePattern=.yyyy-MM-dd
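If you prefer to run the step 6 check from Java instead of curl, here is a minimal sketch of the same WebHDFS call through the gateway; the hostname, topology name (default), and admin credentials mirror the examples above, and the JVM truststore must trust the gateway certificate (the curl example skips verification with -k).

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class KnoxWebHdfsTest {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://knoxhostname:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Same credentials as the curl example, sent as HTTP Basic auth.
        String auth = Base64.getEncoder().encodeToString("admin:admin-password".getBytes("UTF-8"));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}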