Member since 11-09-2016 · 68 Posts · 16 Kudos Received · 5 Solutions
07-24-2017
11:28 AM
In large clusters, restarting the NameNode or Standby NameNode sometimes fails, and Ambari will keep retrying multiple times before giving up. A quick fix is to increase Ambari's retry timeouts from 5s to 25s (or up to 50s) in /var/lib/ambari-server/resources/common-services/HDFS/XXX-VERSION-XXX/package/scripts/hdfs_namenode.py
From this:
@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)
To this:
@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
If it still fails, you can try:
@retry(times=50, sleep_time=50, backoff_factor=2, err_class=Fail)
One of the root causes of this may be the Solr audit logs (from Ambari Infra) creating huge logs that need to be written to HDFS. Restart the Ambari server after the change. You can clear the NameNode and Standby NameNode logs here: /var/log/hadoop/hdfs/audit/solr/spool
Be careful to delete only on the Standby NameNode, then do a failover to delete from the other server. Do not delete logs while the NameNode is active.
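For context, Ambari's @retry decorator re-runs the wrapped check with a growing sleep between attempts. This is a minimal sketch of those semantics (not Ambari's actual implementation), showing why raising times and sleep_time gives a slow NameNode far more total time to come up:

```python
import time

class Fail(Exception):
    """Stand-in for Ambari's Fail exception class."""

def retry(times, sleep_time, backoff_factor, err_class):
    """Retry the wrapped function up to `times` attempts.

    Sleeps `sleep_time` seconds after the first failure, multiplying
    the delay by `backoff_factor` after each subsequent failure.
    Re-raises the last `err_class` error once attempts are exhausted.
    """
    def decorator(fn):
        def wrapper(*args, **kwargs):
            delay = sleep_time
            for attempt in range(times):
                try:
                    return fn(*args, **kwargs)
                except err_class:
                    if attempt == times - 1:
                        raise  # out of attempts
                    time.sleep(delay)
                    delay *= backoff_factor
        return wrapper
    return decorator
```

Under these semantics, the defaults (times=5, sleep_time=5, backoff_factor=2) allow roughly 5+10+20+40 = 75s of total waiting; times=25 with sleep_time=25 gives a vastly larger budget, which is what a large cluster's NameNode startup needs.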
07-24-2017
11:26 AM
In large clusters, restarting the NameNode or Standby NameNode sometimes fails, and Ambari will keep retrying multiple times before giving up. A quick fix is to increase Ambari's retry timeouts from 5s to 25s in /var/lib/ambari-server/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
From this:
@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)
To this:
@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
If it still fails, you can try:
@retry(times=50, sleep_time=50, backoff_factor=2, err_class=Fail)
One of the root causes of this may be the Solr audit logs (from Ambari Infra) creating huge logs that need to be written to HDFS. You can clear the NameNode and Standby NameNode logs here: /var/log/hadoop/hdfs/audit/solr/spool
Be careful to delete only on the Standby NameNode, then do a failover to delete from the other server. Do not delete logs while the NameNode is active.
07-14-2017
04:57 PM
By default, the file container-executor.cfg under /etc/hadoop/conf/ is overwritten on every NodeManager by /var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2. When you use the LinuxContainerExecutor, YARN executes jobs as the end user; in this case it is not recommended to change banned.users and allowed.system.users. Why should you ban super users from running YARN jobs? Because Hadoop trusts the user you claim to be when submitting jobs: once past the Kerberos wall with the keytab (which can easily be found and used in the job), anyone in the hadoop group could run a job as a "super-user" and effectively gain full superuser permissions on job submission.
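For reference, the generated file typically contains entries like the following; the values shown here are illustrative defaults, not taken from this post:

```
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
allowed.system.users=
min.user.id=1000
```

Since /etc/hadoop/conf/container-executor.cfg is regenerated from the .j2 template above, any change you do decide to make should go into the template, not the generated file.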
12-01-2017
03:20 AM
I am facing a similar issue where the Standby NN is not starting. In the hdfs .out file we are getting java.lang.OutOfMemoryError: Requested array size exceeds VM limit. Can we uncheck "Audit to SOLR" under Advanced ranger-audit and then start the Standby NN? Will there be any impact on the cluster if we uncheck "Audit to SOLR"?
05-07-2017
03:17 PM
HORTONWORKS: SCRIPT TO DISABLE THP ON RED HAT 7
#!/bin/bash
### BEGIN INIT INFO
# Provides: disable-transparent-hugepages
# Required-Start: $local_fs
# Required-Stop:
# Author: Amine Hallam
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Disable Linux transparent huge pages
# Description: Disable Linux transparent huge pages, to improve
# database performance.
### END INIT INFO
case $1 in
start)
if [ -d /sys/kernel/mm/transparent_hugepage ]; then
thp_path=/sys/kernel/mm/transparent_hugepage
elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
thp_path=/sys/kernel/mm/redhat_transparent_hugepage
else
exit 0
fi
echo 'never' > ${thp_path}/enabled
echo 'never' > ${thp_path}/defrag
re='^[0-1]+$'
if [[ $(cat ${thp_path}/khugepaged/defrag) =~ $re ]]
then
# RHEL 7
echo 0 > ${thp_path}/khugepaged/defrag
else
# RHEL 6
echo 'no' > ${thp_path}/khugepaged/defrag
fi
unset re
unset thp_path
;;
esac

Install the init script:
sudo chmod 755 /etc/init.d/disable-transparent-hugepages
sudo chkconfig --add disable-transparent-hugepages

Copied from here.
05-04-2017
11:58 PM
1 Kudo
Please consider the following for this install:
- IBM Power servers are on CentOS 7
- The install is performed using a non-root user
- There is no access to the internet or a proxy to remote repos; we installed a local repo

------------------------------------------------------------ Prerequisites ------------------------------------------------------------
# Check the maximum open file descriptors.
# The recommended maximum number of open file descriptors is 10000 or more.
# To check the current value, execute the following on each host:
ulimit -Sn
ulimit -Hn
# If the output is not greater than 10000, set a suitable default:
ulimit -n 10000

# SELinux
sudo setenforce 0
sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
sudo sed -i 's/SELINUX=permissive/SELINUX=disabled/g' /etc/selinux/config

# umask
umask 0022
sudo sh -c 'echo "umask 0022" >> /etc/profile'

----------------------------- JDK - OpenJDK only (Oracle JDK not supported) -----------------------------
sudo yum install java-1.8.0-openjdk
sudo yum install java-1.8.0-openjdk-devel
sudo yum install java-1.8.0-openjdk-headless
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
# Persist JAVA_HOME in your profile:
vi ~/.bash_profile

---------------------------------- Installation of MySQL / MariaDB on the server ----------------------------------
sudo yum update
sudo yum install mysql-connector-java
sudo yum install mysql
sudo yum install mariadb-server
sudo systemctl enable mariadb
sudo systemctl start mariadb
# How to connect (no password by default):
mysql -u root -p

---------------------------------- Setting up a local repository for HDP on the server - no internet access ----------------------------------
sudo yum install yum-utils createrepo
sudo mkdir -p /var/www/html/

-------------------------------- Prepare the httpd service on the server --------------------------------
sudo yum install httpd
sudo service httpd start
sudo systemctl enable httpd

---------------------------- Prepare the repos ----------------------------
# HDP
# Download from http://public-repo-1.hortonworks.com/HDP/centos7-ppc/2.x/updates/2.6.0.3/HDP-2.6.0.3-centos7-ppc-rpm.tar.gz
tar -xvf HDP-2.6.0.3-centos7-ppc-rpm.tar.gz
sudo mv HDP /var/www/html/
cd /var/www/html/HDP/centos7

# HDP-UTILS
# Download from http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/ppc64le/HDP-UTILS-1.1.0.21-centos7.tar.gz

# Ambari
# Download from http://public-repo-1.hortonworks.com/ambari/centos7-ppc/2.x/updates/2.5.0.3/ambari-2.5.0.3-centos7-ppc.tar.gz
tar -xvf ambari-2.5.0.3-centos7-ppc.tar.gz
sudo mv ambari /var/www/html/
cd /var/www/html/ambari/centos7

----------------------------------------------- HDP.repo example -----------------------------------------------
#VERSION_NUMBER=2.6.0.3-8
[HDP-2.6.0.3]
name=HDP Version - HDP-2.6.0.3
#baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3
baseurl=http://XXXXXX/HDP/centos7-ppc/
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://XXXXXX/HDP/centos7-ppc/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.21]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.21
baseurl=http://XXXXXX/HDP-UTILS-1.1.0.21/repos/ppc64le
gpgcheck=0
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=0
priority=1

---------------------------------------------------- Prepare the repo files on the server XXXX ----------------------------------------------------
sudo mv ambari.repo /etc/yum.repos.d/
sudo mv hdp.repo /etc/yum.repos.d/

# In ambari.repo, modify the following:
[ambari-2.5.0.3]
name=ambari Version - ambari-2.5.0.3
baseurl=http://XXXXXX/ambari/centos7-ppc/
gpgcheck=1
gpgkey=http://XXXX/ambari/centos7-ppc/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

# Confirm that the repository is configured by checking the repo list:
yum repolist

sudo yum install ambari-server
sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
ambari-server setup -j $JAVA_HOME

... Check more here
02-24-2017
01:55 PM
3 Kudos
Authentication
1- Kerberos: Kerberos is mandatory for prod environments; you can either use your AD's embedded Kerberos or install a new dedicated KDC. Kerberos must be in HA.
Risk of not doing the above: user impersonation of the service accounts (jobs can be exported to run with superuser permissions).
2- Use a firewall to block all inbound traffic to the cluster (all sources / all ports) except from the edge node (gateway).
Risk of not doing the above: passwords in the wrong hands will systematically give access to the cluster.
3- Check the permissions of the keytabs, detailed in this article: Script to fix permissions and ownership of hadoop keytabs.
Risk of not doing the above: use of the keytabs by other cluster users.
4- Use Knox for all API calls to the cluster. Benefits: inbound traffic only from "trusted" known machines, with authentication against an existing LDAP.

Network
1- The cluster must be in an isolated subnet, with no interference from other networks, for both security and throughput.
Risk of not doing the above: data interception by/from other machines in the data center.
2- Cluster machines can be linked internally in "non-routed" mode, with host resolution configured via /etc/hosts on all machines.
3- A flat network is not recommended.
Risk of not doing the above: file-inclusion attacks from other machines in the data center.
4- Having two DNS resolutions (internal and external) is acceptable if the DNS server is HA. You can also combine /etc/hosts with DNS config.
5- IPtables must be disabled within the cluster. This is a prerequisite for the installation.
6- /etc/hosts must be configured with the FQDN. The Ambari server needs the resolution of all nodes in the cluster in its /etc/hosts. This is a prerequisite for the installation.

Authorizations
1- Systematically give 000 permissions to the HDFS files and folders of the data lake (/data); only Ranger should control access, via policies.
Risk of not doing the above: users can access data through POSIX permissions/ACLs and bypass Ranger policies.
2- Use the umask: fs.permissions.umask-mode = 0022.
Risk of not doing the above: wrong permissions may lead to Ranger policies being bypassed.

Other best practices:
- Do not share the passwords of superusers (hdfs, hive, spark, etc.) with all teams; only root should own them.
- You can disable SSH login for some service users (knox, spark, etc.).

Please feel free to comment for enhancements.
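The umask value in Authorizations point 2 can be sanity-checked with simple bit arithmetic. This small Python sketch (not part of the original post) shows what fs.permissions.umask-mode = 0022 yields for newly created HDFS files and directories:

```python
def apply_umask(base_mode, umask):
    # A new object gets the base mode with the umask bits cleared.
    return base_mode & ~umask

UMASK = 0o022  # fs.permissions.umask-mode = 0022

# HDFS uses 666 as the base mode for files and 777 for directories.
file_mode = apply_umask(0o666, UMASK)
dir_mode = apply_umask(0o777, UMASK)

print(oct(file_mode))  # files -> rw-r--r--
print(oct(dir_mode))   # dirs  -> rwxr-xr-x
```

So with 0022, group and others lose write access on everything created, which is the sane default this post recommends.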
01-25-2017
10:24 AM
1 Kudo
The bug (HIVE-15355) is being worked on by the Hive engineering team at Hortonworks. You can use the following workarounds:
1- Add "SORT BY 0" at the end of the query, which will force a single reducer; please use this only if you have a small query.
2- Try set hive.mv.files.thread=0; before running the query.
If you have any question regarding the above, please let me know.
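As a hypothetical illustration of workaround 1 (the table and column names below are made up, not from this post):

```sql
-- Force a single reducer as a workaround for HIVE-15355
SET hive.mv.files.thread=0;
INSERT OVERWRITE TABLE sales_summary
SELECT region, SUM(amount)
FROM sales
GROUP BY region
SORT BY 0;
```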
02-16-2017
07:39 PM
Does setting hive.mv.files.thread=0 reduce the performance of the insert query? Can you explain what this configuration has to do with the HDP 2.5 upgrade?
01-20-2017
05:44 PM
1 Kudo
During the upgrade:
If you are using a Postgres DB, you will face a script failure on the following statement:
alter table users add constraint "uniq_user_0"
Solution:
psql -U ambari_user -d NN_HA_ambari
alter table ambari.users drop constraint "uniq_user_0";
After the Upgrade :
Spark
1- You have to recompile and rebuild your jars using the new dependencies of Spark 1.6.2. Your pom.xml needs to reference the new versions (the new jars of HDP 2.5.3).
2- Update any custom jars and check all versions used in your pom.xml, e.g.:
<spark.version>1.6.2</spark.version>
<hbase.version>1.2.1</hbase.version>
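In pom.xml these entries sit in the <properties> block; a hypothetical fragment (the hadoop.version line is an assumption of mine, not stated in the post):

```xml
<properties>
  <spark.version>1.6.2</spark.version>
  <hbase.version>1.2.1</hbase.version>
  <hadoop.version>2.7.3</hadoop.version>
</properties>
```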
3- Remove the argument spark.yarn.jar if you are using it when submitting the job:
--conf spark.yarn.jar=hdfs://NN_HA/user/oozie/share/lib/lib_20160515013357/spark/spark-assembly-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar
Errors noticed when leaving the argument above:
java.lang.IllegalArgumentException: Invalid ContainerId: container_e97_1484824056011_0038_02_000001
or
java.lang.IllegalArgumentException
Spark - Oozie
1- If you are launching Spark via Oozie (or via Hue), check that you have all the libs here with the right permissions:
-rwxr-xr-x 3 oozie hdfs 339666 2017-01-19 10:22 /user/oozie/share/lib/lib_20170119102144/spark/hbase-site.xml
-rwxr-xr-x 3 oozie hdfs 339666 2017-01-19 10:22 /user/oozie/share/lib/lib_20170119102144/spark/datanucleus-api-jdo-3.2.6.jar
-rwxr-xr-x 3 oozie hdfs 1890075 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/spark/datanucleus-core-3.2.10.jar
-rwxr-xr-x 3 oozie hdfs 1809447 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/spark/datanucleus-rdbms-3.2.9.jar
-rwxr-xr-x 3 oozie hdfs 22440 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/spark/oozie-sharelib-spark-4.2.0.2.5.3.0-37.jar
-rwxr-xr-x 3 oozie hdfs 44846 2017-01-19 10:22 /user/oozie/share/lib/lib_20170119102144/spark/py4j-0.9-src.zip
-rwxr-xr-x 3 oozie hdfs 357563 2017-01-19 10:22 /user/oozie/share/lib/lib_20170119102144/spark/pyspark.zip
-rwxr-xr-x 3 oozie hdfs 188897932 2017-01-19 10:22 /user/oozie/share/lib/lib_20170119102144/spark/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar
-rwxr-xr-x 3 oozie hdfs 110488188 2017-01-19 17:43 /user/oozie/share/lib/lib_20170119102144/spark/spark-examples-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar
-rwxr-xr-x 3 oozie hdfs 188897932 2017-01-19 17:43 /user/oozie/share/lib/lib_20170119102144/spark/spark-hdp-assembly.jar
-rwxr-xr-x 3 oozie hdfs 516062 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/aws-java-sdk-core-1.10.6.jar
-rwxr-xr-x 3 oozie hdfs 258578 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/aws-java-sdk-kms-1.10.6.jar
-rwxr-xr-x 3 oozie hdfs 570101 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/aws-java-sdk-s3-1.10.6.jar
-rwxr-xr-x 3 oozie hdfs 10092 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/azure-keyvault-core-0.8.0.jar
-rwxr-xr-x 3 oozie hdfs 745325 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/azure-storage-4.2.0.jar
-rwxr-xr-x 3 oozie hdfs 434678 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/commons-lang3-3.4.jar
-rwxr-xr-x 3 oozie hdfs 1648200 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/guava-11.0.2.jar
-rwxr-xr-x 3 oozie hdfs 153855 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/hadoop-aws-2.7.3.2.5.3.0-37.jar
-rwxr-xr-x 3 oozie hdfs 163348 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/hadoop-azure-2.7.3.2.5.3.0-37.jar
-rwxr-xr-x 3 oozie hdfs 38605 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/jackson-annotations-2.4.0.jar
-rwxr-xr-x 3 oozie hdfs 225302 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/jackson-core-2.4.4.jar
-rwxr-xr-x 3 oozie hdfs 1076926 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/jackson-databind-2.4.4.jar
-rwxr-xr-x 3 oozie hdfs 570478 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/joda-time-2.1.jar
-rwxr-xr-x 3 oozie hdfs 16046 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/json-simple-1.1.jar
-rwxr-xr-x 3 oozie hdfs 12543 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/oozie-hadoop-utils-hadoop-2-4.2.0.2.5.3.0-37.jar
-rwxr-xr-x 3 oozie hdfs 51854 2017-01-19 10:21 /user/oozie/share/lib/lib_20170119102144/oozie/oozie-sharelib-oozie-4.2.0.2.5.3.0-37.jar
PS: the permissions here are 755; they can be restricted further.
hdfs dfs -chmod -R 755 /user/oozie/share/lib/
2- Once you have all the libs, refresh Oozie to load the lib list:
su oozie
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
oozie admin -oozie http://localhost:11000/oozie -shareliblist spark*
Hive
1- If you don't install Atlas or you are not using it, you can remove the hooks.
Go to Hive -> Advanced -> General and replace these with a space:
hive.exec.failure.hooks
hive.exec.post.hooks
hive.exec.pre.hooks
2- If you are using INSERT OVERWRITE queries and you are getting java.util.ConcurrentModificationException, I have found a workaround:
add "SORT BY 0" to your query.
PS: this is a temporary solution to keep your prod jobs running.
3- Allow impersonation for hcat:
hadoop.proxyuser.hcat.hosts = *
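In core-site.xml this corresponds to a property entry like the following; the hadoop.proxyuser.hcat.groups entry is an assumption often configured alongside it, not stated in the post:

```xml
<property>
  <name>hadoop.proxyuser.hcat.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hcat.groups</name>
  <value>*</value>
</property>
```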
If you have any questions regarding the above, please let me know.
More to follow...