Member since: 11-09-2016
Posts: 68
Kudos Received: 16
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2517 | 12-07-2017 06:32 PM
 | 937 | 12-07-2017 06:29 PM
 | 1551 | 12-01-2017 11:56 AM
 | 9511 | 02-10-2017 08:55 AM
 | 3008 | 01-23-2017 09:44 PM
07-24-2017
11:28 AM
In large clusters, restarting the NameNode or Secondary NameNode sometimes fails, and Ambari keeps retrying several times before giving up. One quick thing you can do is increase Ambari's retry count and sleep time from 5 to 25 (or up to 50) in /var/lib/ambari-server/resources/common-services/HDFS/XXX-VERSION-XXX/package/scripts/hdfs_namenode.py
From this:
@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)
To this:
@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
If it still fails, you can try:
@retry(times=50, sleep_time=50, backoff_factor=2, err_class=Fail)
Then restart the Ambari server so the change is picked up.
One of the root causes of this may be the Solr audit logs (from Ambari Infra), which can produce huge spool files that need to be written to HDFS. You can clear those logs on the NameNode and Secondary NameNode here: /var/log/hadoop/hdfs/audit/solr/spool
Be careful to delete only on the standby NameNode, then fail over and delete from the other server. Do not delete the logs while the NameNode is active.
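If you prefer to script the edit, here is a minimal sketch; it assumes the stock decorator values shown above, and XXX-VERSION-XXX is left as a placeholder for your HDFS service version directory:
# Back up the script, bump the retry count and sleep time from 5 to 25, then restart Ambari.
FILE=/var/lib/ambari-server/resources/common-services/HDFS/XXX-VERSION-XXX/package/scripts/hdfs_namenode.py
sudo cp "$FILE" "$FILE.bak"
sudo sed -i 's/@retry(times=5, sleep_time=5,/@retry(times=25, sleep_time=25,/' "$FILE"
sudo ambari-server restart   # the edited script is only re-read after a restart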
07-24-2017
11:26 AM
In large clusters, restarting the NameNode or Secondary NameNode sometimes fails, and Ambari keeps retrying multiple times before giving up. One quick thing you can do is increase Ambari's retry count and sleep time from 5 to 25 in /var/lib/ambari-server/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
From this:
@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)
To this:
@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
If it still fails, you can try:
@retry(times=50, sleep_time=50, backoff_factor=2, err_class=Fail)
One of the root causes of this may be the Solr audit logs (from Ambari Infra), which can produce huge spool files that need to be written to HDFS. You can clear those logs on the NameNode and Secondary NameNode here: /var/log/hadoop/hdfs/audit/solr/spool
Be careful to delete only on the standby NameNode, then fail over and delete from the other server. Do not delete the logs while the NameNode is active.
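Here is a rough sketch of that cleanup; nn1 and nn2 are assumed NameNode service IDs (take yours from dfs.ha.namenodes.<nameservice> in hdfs-site.xml), and the spool path is the default one above:
# Confirm which NameNode is the standby before touching anything.
sudo -u hdfs hdfs haadmin -getServiceState nn1
sudo -u hdfs hdfs haadmin -getServiceState nn2
# On the standby host only, clear the Solr audit spool.
sudo rm -f /var/log/hadoop/hdfs/audit/solr/spool/*
# Fail over (here: from nn1 to nn2), then repeat the cleanup on the other host.
sudo -u hdfs hdfs haadmin -failover nn1 nn2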
07-14-2017
04:57 PM
By default, the file container-executor.cfg under /etc/hadoop/conf/ is overwritten on every NodeManager from /var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2. When you use the LinuxContainerExecutor, YARN runs jobs as the end user; in this case it is not recommended to change banned.users or allowed.system.users. Why should you ban superusers from running YARN jobs? Because anyone in the hadoop group can submit a job claiming to be the superuser: Hadoop trusts whatever user you say you are at job submission once you get past the Kerberos wall with the keytab (which can easily be found and used inside the job), so any user could effectively gain full superuser permissions on job submission.
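For reference, the rendered file on a NodeManager typically looks like the snippet below; the exact values come from your cluster's template, so treat this as an illustrative default rather than something to copy:
cat /etc/hadoop/conf/container-executor.cfg
# typical output (values depend on the container-executor.cfg.j2 template):
# yarn.nodemanager.linux-container-executor.group=hadoop
# banned.users=hdfs,yarn,mapred,bin
# min.user.id=1000
# allowed.system.users=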
07-13-2017
01:42 PM
Ambari kills the NameNode process when the JVM hits an OutOfMemoryError, because the hadoop-env template wires -XX:OnOutOfMemoryError to a kill script. Under HDFS -> hadoop-env template,
replace
-XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-secondarynamenode/bin/kill-secondary-name-node\"
with
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=\"/tmp/heap\"
This will only let you start the NameNode and investigate what is causing the issue. It is a temporary workaround: you still need to analyze the heap dump and see what is wrong with it. Solr could be one of the causes, when it creates huge audit logs that need to be written to HDFS. You can clear the logs of the NameNode and Secondary NameNode here: /var/log/hadoop/hdfs/audit/solr/spool
Be careful to delete only on the standby NameNode, then fail over to delete from the other server. Do not delete logs while the NameNode is active.
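Once a dump lands under /tmp/heap, a quick first look at what is filling the heap could be something like this (a sketch assuming JDK 8 tools on the PATH and /tmp/heap being a directory, so dump files are named java_pid<pid>.hprof):
# Live class histogram of the running NameNode (its JVM command line carries -Dproc_namenode).
sudo -u hdfs jmap -histo:live $(pgrep -f proc_namenode) | head -n 30
# Or browse an OutOfMemoryError dump offline with jhat, then open http://<host>:7000
jhat -port 7000 /tmp/heap/java_pid*.hprof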
05-07-2017
03:17 PM
HORTONWORKS: SCRIPT TO DISABLE THP ON RED HAT 7
#!/bin/bash
### BEGIN INIT INFO
# Provides: disable-transparent-hugepages
# Required-Start: $local_fs
# Required-Stop:
# Author: Amine Hallam
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Disable Linux transparent huge pages
# Description: Disable Linux transparent huge pages, to improve
# database performance.
### END INIT INFO
case $1 in
start)
if [ -d /sys/kernel/mm/transparent_hugepage ]; then
thp_path=/sys/kernel/mm/transparent_hugepage
elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
thp_path=/sys/kernel/mm/redhat_transparent_hugepage
else
exit 0
fi
echo 'never' > ${thp_path}/enabled
echo 'never' > ${thp_path}/defrag
re='^[0-1]+$'
if [[ $(cat ${thp_path}/khugepaged/defrag) =~ $re ]]
then
# RHEL 7
echo 0 > ${thp_path}/khugepaged/defrag
else
# RHEL 6
echo 'no' > ${thp_path}/khugepaged/defrag
fi
unset re
unset thp_path
;;
esac
Save the script as /etc/init.d/disable-transparent-hugepages, then:
sudo chmod 755 /etc/init.d/disable-transparent-hugepages
sudo chkconfig --add disable-transparent-hugepages
Copied from here.
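After a reboot (or after running the service once with "start"), you can check that THP is actually off; the value shown in brackets should be "never":
sudo service disable-transparent-hugepages start
cat /sys/kernel/mm/transparent_hugepage/enabled   # e.g. always madvise [never]
cat /sys/kernel/mm/transparent_hugepage/defrag    # e.g. always madvise [never]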
05-05-2017
12:06 PM
1 Kudo
It's supported starting with HDP 2.6; check here for more info.
05-05-2017
12:05 PM
HDP now supports IBM Power servers; check the draft manual here for the install steps and the repos to download: https://community.hortonworks.com/articles/101185/installation-of-hdp-26-on-ibm-power-systems.html
05-04-2017
11:58 PM
1 Kudo
Please consider the following for this install:
- The IBM Power servers run CentOS 7.
- The install is performed by a non-root user.
- There is no internet access or proxy to the remote repos, so we set up local repos.

------------------------------ Prerequisites ------------------------------
#Check the maximum open file descriptors.
#The recommended maximum number of open file descriptors is 10000 or more. To check the current value, execute the following shell commands on each host:
ulimit -Sn
ulimit -Hn
#If the output is not greater than 10000, run the following command to set it to a suitable default:
ulimit -n 10000
#SELinux
sudo setenforce 0
sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
sudo sed -i 's/SELINUX=permissive/SELINUX=disabled/g' /etc/selinux/config
#umask
umask 0022
echo "umask 0022" | sudo tee -a /etc/profile

------------------------------ JDK: OpenJDK only (Oracle JDK is not supported) ------------------------------
sudo yum install java-1.8.0-openjdk
sudo yum install java-1.8.0-openjdk-devel
sudo yum install java-1.8.0-openjdk-headless
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
#Add the JAVA_HOME export to ~/.bash_profile as well:
vi ~/.bash_profile

------------------------------ Installation of MySQL / MariaDB on the server ------------------------------
sudo yum update
sudo yum install mysql-connector-java
sudo yum install mysql
sudo yum update
sudo yum install mariadb-server
sudo systemctl enable mariadb
sudo systemctl start mariadb
#How to connect (no password by default):
mysql -u root -p

------------------------------ Setting up a local repository for HDP on the server (no internet access) ------------------------------
sudo yum install yum-utils createrepo
sudo mkdir -p /var/www/html/

------------------------------ Prepare the httpd service on the server ------------------------------
sudo yum install httpd
sudo service httpd start
sudo systemctl enable httpd

------------------------------ Prepare the repos ------------------------------
#HDP
#Download from http://public-repo-1.hortonworks.com/HDP/centos7-ppc/2.x/updates/2.6.0.3/HDP-2.6.0.3-centos7-ppc-rpm.tar.gz
tar -xvf HDP-2.6.0.3-centos7-ppc-rpm.tar.gz
sudo mv HDP /var/www/html/
cd /var/www/html/HDP/centos7
#HDP-UTILS
#Download from http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/ppc64le/HDP-UTILS-1.1.0.21-centos7.tar.gz
#Ambari
#Download from http://public-repo-1.hortonworks.com/ambari/centos7-ppc/2.x/updates/2.5.0.3/ambari-2.5.0.3-centos7-ppc.tar.gz
tar -xvf ambari-2.5.0.3-centos7-ppc.tar.gz
sudo mv ambari /var/www/html/
cd /var/www/html/ambari/centos7

------------------------------ HDP.repo example ------------------------------
#VERSION_NUMBER=2.6.0.3-8
[HDP-2.6.0.3]
name=HDP Version - HDP-2.6.0.3
#baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3
baseurl=http://XXXXXX/HDP/centos7-ppc/
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://XXXXXX/HDP/centos7-ppc/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.21]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.21
baseurl=http://XXXXXX/HDP-UTILS-1.1.0.21/repos/ppc64le
gpgcheck=0
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=0
priority=1

------------------------------ Prepare the repo files on the server XXXX ------------------------------
sudo mv ambari.repo /etc/yum.repos.d/
sudo mv hdp.repo /etc/yum.repos.d/
#In ambari.repo, modify the following:
[ambari-2.5.0.3]
name=ambari Version - ambari-2.5.0.3
baseurl=http://XXXXXX/ambari/centos7-ppc/
gpgcheck=1
gpgkey=http://XXXX/ambari/centos7-ppc/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
#Confirm that the repository is configured by checking the repo list:
yum repolist

------------------------------ Install and set up Ambari ------------------------------
sudo yum install ambari-server
sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
sudo ambari-server setup -j $JAVA_HOME
...
Check more here
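One step the walkthrough leaves implicit is the yum metadata for the local repos. The Hortonworks tarballs usually ship a repodata/ directory already, but if any of the extracted trees is missing it, a sketch of regenerating it (paths assumed to match the mv commands above) would be:
# Only needed where repodata/ is absent after extraction.
sudo createrepo /var/www/html/HDP/centos7
sudo createrepo /var/www/html/ambari/centos7
sudo createrepo /var/www/html/HDP-UTILS-1.1.0.21/repos/ppc64le
sudo yum clean all
yum repolist   # the local HDP, HDP-UTILS and ambari repos should now list package counts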
04-03-2017
02:57 PM
Thanks @David Streever, very helpful post. For SPNEGO, I have added:
ipa-getkeytab -s IPA_SERVER -p HTTP/NODE1@REALM -k /etc/security/keytabs/spnego.service.keytab
ipa-getkeytab -s IPA_SERVER -p HTTP/NODE2@REALM -k /etc/security/keytabs/spnego.service.keytab
ipa-getkeytab -s IPA_SERVER -p HTTP/MASTER1@REALM -k /etc/security/keytabs/spnego.service.keytab ...etc
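To double-check that all the HTTP principals landed in the merged keytab (a quick sanity check, assuming the standard MIT Kerberos client tools and the usual HDP keytab ownership):
klist -kt /etc/security/keytabs/spnego.service.keytab
# every HTTP/<host>@REALM principal fetched above should be listed with its current KVNO
sudo chown root:hadoop /etc/security/keytabs/spnego.service.keytab
sudo chmod 440 /etc/security/keytabs/spnego.service.keytab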
03-06-2017
12:31 PM
@Ashok Kumar BM OK, 2400 billion rows is effectively a lot, and there is no need to unpivot the table. Have you considered Phoenix? I would suggest using the JDBC connector through Phoenix and creating an index on every column in the WHERE condition (3 indexes in your case) and giving it a try. The only inconvenience is that Phoenix will generate more data, since every index takes extra storage, and if your data changes a lot the indexes need more processing to be maintained. Let me know your thoughts.
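A minimal sketch of what that could look like through Phoenix's psql.py on an HDP node; the table and column names are made up for illustration, and the ZooKeeper quorum zk1,zk2,zk3:2181 is a placeholder for yours:
# Hypothetical table EVENTS with three columns used in the WHERE clause.
cat > create_indexes.sql <<'EOF'
CREATE INDEX IF NOT EXISTS IDX_EVENTS_TIME   ON EVENTS (EVENT_TIME);
CREATE INDEX IF NOT EXISTS IDX_EVENTS_DEVICE ON EVENTS (DEVICE_ID);
CREATE INDEX IF NOT EXISTS IDX_EVENTS_STATUS ON EVENTS (STATUS);
EOF
/usr/hdp/current/phoenix-client/bin/psql.py zk1,zk2,zk3:2181 create_indexes.sql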