Community Articles

Find and share helpful community-sourced technical articles.
avatar

HDB 2.1.1

Reference: http://hdb.docs.pivotal.io/211

http://hdb.docs.pivotal.io/211/hdb/releasenotes/HAWQ211ReleaseNotes.html

http://hdb.docs.pivotal.io/211/hdb/install/install-ambari.html

Download HDB from Hortonworks at http://hortonworks.com/downloads/ or directly from Pivotal at https://network.pivotal.io/products/pivotal-hdb (You need to create a pivotal account)

What to look out for

  1. If you use only only 1 Master Node, you cannot have a Hawq Master and Standby
  2. If I install Hawq Master on Same node with Ambari need to change PostGres Port from 5432 on Install

Prep

  1. Ensure that httpd is installed
yum install httpd

sudo service httpd status

sudo service httpd start
  • Get and Install repo
    1. Log onto Pivotal and download hdb-2.1.1.0-7.tar
  • /* On Ambari Node */
    
    1. mkdir /staging
    2. chmod a+rx /staging
    3. scp -i <<your key>>  -o 'StrictHostKeyChecking=no' hdb-2.1.1.0-7.tar root@<<ambarinode>>:~/staging
    4. tar -zxvf hdb-2.1.1.0-7.tarcd /staging/hdb-2.1.1.0./setup_repo.sh
    
    /* You should see the message “hdb-2.1.1.0 Repo file successfully created at /etc/yum.repos.d/hdb-2.1.1.0.repo. */
    
    5. yum install -y hawq-ambari-plugin
    
    6. cd /var/lib/hawq
    
    7. ./add-hawq.py --user admin --password admin --stack HDP-2.5 
    /* if the repo is in the same node as Ambari else pint to where the repo lives*/
    ./add-hawq.py --user <admin-username> --password <admin-password> --stack HDP-2.5 --hawqrepo <hdb-2.1.x-url> --addonsrepo <hdb-add-ons-2.1.x-url>
    
    8. ambari-server restart
  • Configurations during Install with Ambari
    1. Set VM overcommit to 0 if you plan to use Hive and/or LLAP also on the same cluster; Don’t follow Pivotal docs to set this to 2 ele your Hive processes will have memory issues.
    2. Advanced hdfs-site
  • PropertySetting
    dfs.allow.truncatetrue
    dfs.block.access.token.enablefalse for an unsecured HDFS cluster, or true for a secure cluster
    dfs.block.local-path-access.usergpadmin
    dfs.client.read.shortcircuittrue
    dfs.client.socket-timeout***300000000
    dfs.client.use.legacy.blockreader.localfalse
    dfs.datanode.handler.count60
    dfs.datanode.socket.write.timeout***7200000
    dfs.namenode.handler.count600
    dfs.support.appendtrue
    1. Advanced core-site
    PropertySetting
    ipc.client.connection.maxidletime**3600000
    ipc.client.connect.timeout**300000
    ipc.server.listen.queue.size3300

    Some HAWQ Resources

    1. Date Type Formating Functions: https://www.postgresql.org/docs/8.2/static/functions-formatting.html
    2. Date Time Functions: https://www.postgresql.org/docs/8.2/static/functions-datetime.html
    3. Hawq Date Functions: http://tapoueh.org/blog/2013/08/20-Window-Functions HAWQ is better with dates; can automatically handle ’08/01/2016’ and ’01-Aug-2016’
    4. PostGreSQL Cheat Sheet Commands: http://www.postgresonline.com/downloads/special_feature/postgresql83_psql_cheatsheet.pdf
    5. System Catalog Tables: http://hdb.docs.pivotal.io/131/docs-hawq-shared/ref_guide/system_catalogs/catalog_ref-tables.html

    HAWQ Toolkit

    1. Make sure and make use of the Hawq Toolkit: http://hdb.docs.pivotal.io/211/hawq/reference/toolkit/hawq_toolkit.html
    2. How to find the data files for specific tables: https://discuss.pivotal.io/hc/en-us/articles/204072646-Pivotal-HAWQ-find-data-files-for-specific-tab...
    3. Size of table on Disk:
    select * from hawq_toolkit.hawq_size_of_table_disk;
    1. How to find the Size of Database:
    select sodddatname, sodddatsize/(1024*1024) from hawq_toolkit.hawq_size_of_database;
    1. How to find the Size of Partitioned Tables:
    select * hawq_toolkit.hawq_size_of_partition_and_indexes_disk
    1. Tip to find how many segments for a Hawq Table
    2. SELECT gp_segment_id, COUNT(*)
      FROM <<table>>
      GROUP BY gp_segment_id
      ORDER BY 1;

    Creating Tables

    <<TBD>

    1. Make SURE AFTER YOU CREATE THE TABLE ANALYZE: As an Example:
    1. vacuum analyze device.priority_counter_hist_rand;

    Loading Data to Tables

    <<TBD>

    Potential HAWQ Errors

    Too many open files in system

    1. To fix this check the value for fs.file-max in /etc/sysctl.conf. If configured a value that is lower than the total # of open files for the entire system at a given point (lsof | wc -l) then we would have increase this. To increase this value follow the below steps
      1. Open Files: lsof | wc -l
      2. ulimit -a | grep open
      3. Edit the following line in the /etc/sysctl.conf file:
      4. fs.file-max = value #value is the new file descriptor limit that you want to set.
      5. Apply the change by running the following command:# /sbin/sysctl -p
      6. We can disable over-commit temporarily: echo 0 > /proc/sys/vm/overcommit_memory
    2. For permanent solution:
      1. Add vm.overcommit_memory = 0 in /etc/sysctl.conf
      2. #fs.file-max=65536
      3. fs.file-max=2900000
      4. #Added for Hortonworks HDB
      5. kernel.threads-max=798720
      6. vm.overcommit_memory=0
    3,165 Views