Member since: 09-29-2015
Posts: 286
Kudos Received: 601
Solutions: 60
04-13-2018
03:08 PM
@Dominika Bialek Looks like the CloudBreak 2.5 docs were removed.
02-10-2017
05:21 AM
This was awesome, Tim.
01-19-2017
04:10 PM
3 Kudos
HDB 2.1.1 References:
http://hdb.docs.pivotal.io/211
http://hdb.docs.pivotal.io/211/hdb/releasenotes/HAWQ211ReleaseNotes.html
http://hdb.docs.pivotal.io/211/hdb/install/install-ambari.html
Download HDB from Hortonworks at http://hortonworks.com/downloads/ or directly from Pivotal at https://network.pivotal.io/products/pivotal-hdb (you need to create a Pivotal account).

What to look out for:
If you use only 1 master node, you cannot have both a HAWQ Master and a Standby Master.
If you install the HAWQ Master on the same node as Ambari, you need to change the Postgres port from 5432 during install prep.
Ensure that httpd is installed and running:
yum install httpd
sudo service httpd status
sudo service httpd start

Get and Install the Repo
Log onto Pivotal and download hdb-2.1.1.0-7.tar.

/* On the Ambari node */
1. mkdir /staging
2. chmod a+rx /staging
3. scp -i <<your key>> -o 'StrictHostKeyChecking=no' hdb-2.1.1.0-7.tar root@<<ambarinode>>:/staging
4. cd /staging
   tar -xvf hdb-2.1.1.0-7.tar
   cd /staging/hdb-2.1.1.0
   ./setup_repo.sh
/* You should see the message "hdb-2.1.1.0 Repo file successfully created at /etc/yum.repos.d/hdb-2.1.1.0.repo." */
5. yum install -y hawq-ambari-plugin
6. cd /var/lib/hawq
7. ./add-hawq.py --user admin --password admin --stack HDP-2.5
/* Use the form above if the repo is on the same node as Ambari; otherwise point to where the repo lives: */
   ./add-hawq.py --user <admin-username> --password <admin-password> --stack HDP-2.5 --hawqrepo <hdb-2.1.x-url> --addonsrepo <hdb-add-ons-2.1.x-url>
8. ambari-server restart

Configurations During Install with Ambari
Set vm.overcommit_memory to 0 if you plan to run Hive and/or LLAP on the same cluster; don't follow the Pivotal docs and set this to 2, else your Hive processes will have memory issues.

Advanced hdfs-site:
Property                                  Setting
dfs.allow.truncate                        true
dfs.block.access.token.enable             false for an unsecured HDFS cluster, or true for a secure cluster
dfs.block.local-path-access.user          gpadmin
dfs.client.read.shortcircuit              true
dfs.client.socket-timeout                 300000000
dfs.client.use.legacy.blockreader.local   false
dfs.datanode.handler.count                60
dfs.datanode.socket.write.timeout         7200000
dfs.namenode.handler.count                600
dfs.support.append                        true

Advanced core-site:
Property                            Setting
ipc.client.connection.maxidletime   3600000
ipc.client.connect.timeout          300000
ipc.server.listen.queue.size        3300

Some HAWQ Resources
Date type formatting functions: https://www.postgresql.org/docs/8.2/static/functions-formatting.html
Date/time functions: https://www.postgresql.org/docs/8.2/static/functions-datetime.html
HAWQ date functions: http://tapoueh.org/blog/2013/08/20-Window-Functions
HAWQ is better with dates; it can automatically handle '08/01/2016' and '01-Aug-2016'.
PostgreSQL cheat sheet commands: http://www.postgresonline.com/downloads/special_feature/postgresql83_psql_cheatsheet.pdf
System catalog tables: http://hdb.docs.pivotal.io/131/docs-hawq-shared/ref_guide/system_catalogs/catalog_ref-tables.html

HAWQ Toolkit
Make sure to make use of the HAWQ Toolkit: http://hdb.docs.pivotal.io/211/hawq/reference/toolkit/hawq_toolkit.html
How to find the data files for specific tables: https://discuss.pivotal.io/hc/en-us/articles/204072646-Pivotal-HAWQ-find-data-files-for-specific-tables
Size of a table on disk:
select * from hawq_toolkit.hawq_size_of_table_disk;
Size of a database:
select sodddatname, sodddatsize/(1024*1024) from hawq_toolkit.hawq_size_of_database;
Size of partitioned tables:
select * from hawq_toolkit.hawq_size_of_partition_and_indexes_disk;
Tip to find how many segments a HAWQ table is spread across:
SELECT gp_segment_id, COUNT(*)
FROM <<table>>
GROUP BY gp_segment_id
ORDER BY 1;

Creating Tables <<TBD>
Make SURE you ANALYZE AFTER you create the table. As an example:
vacuum analyze device.priority_counter_hist_rand;

Loading Data to Tables <<TBD>

Potential HAWQ Errors

"Too many open files in system"
To fix this, check the value of fs.file-max in /etc/sysctl.conf. If it is configured lower than the total number of open files for the entire system at a given point (lsof | wc -l), you need to increase it. To do so, follow these steps:
1. Check the open file counts:
   lsof | wc -l
   ulimit -a | grep open
2. Edit the following line in the /etc/sysctl.conf file:
   fs.file-max = value   # value is the new file descriptor limit that you want to set
3. Apply the change by running:
   /sbin/sysctl -p

We can disable memory over-commit temporarily:
echo 0 > /proc/sys/vm/overcommit_memory
For a permanent solution, add vm.overcommit_memory = 0 in /etc/sysctl.conf:
#fs.file-max=65536
fs.file-max=2900000
#Added for Hortonworks HDB
kernel.threads-max=798720
vm.overcommit_memory=0
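As a sanity check before editing /etc/sysctl.conf, the sizing decision can be sketched in shell. The suggest_file_max helper and its 4x headroom factor are my own illustration, not from any official guide; the 2900000 floor is simply the value this post settles on.

```shell
#!/bin/sh
# suggest_file_max <current_open_files>
# Suggest an fs.file-max with ~4x headroom over the current open-file count,
# never below the 2900000 used above. (Hypothetical helper for illustration.)
suggest_file_max() {
  open=$1
  floor=2900000
  suggested=$(( open * 4 ))
  if [ "$suggested" -lt "$floor" ]; then
    suggested=$floor
  fi
  echo "$suggested"
}

# Feed it the live count; compare against the current kernel limit.
current_open=$(lsof 2>/dev/null | wc -l)
echo "open files now:  $current_open"
echo "current limit:   $(cat /proc/sys/fs/file-max 2>/dev/null)"
echo "suggested limit: $(suggest_file_max "$current_open")"
```

You would then put the suggested value into /etc/sysctl.conf and run /sbin/sysctl -p as shown above.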
11-11-2016
06:18 PM
5 Kudos
Here are the requirements:
Total data size - uncompressed: 13.5 TB; compressed: 2 TB
A large virtual fact table: a view containing a UNION ALL of 3 large tables, 11 billion records in total
Another view taking the large virtual fact table, with consecutive LEFT OUTER JOINs on 8 dimension tables, so that no matter what, 11 billion records is always the result
There is timestamp data that you can use to filter rows by

Suppose you were given the above. How would you begin configuring Hortonworks for Hive? Would you focus on storage? How can we configure for compute?

Let's assume:
Platform: AWS
Data node instance: r3.4xlarge
Cores: 16
RAM: 122 GB
EBS storage: 2 x 1 TB disks

So where do we begin? First, some quick calculations:
Memory per core: 122 GB / 16 = 7.625; approximately 8 GB per CPU core
This means our largest container size per core on a node is 8 GB.

However, we should not reserve all 16 cores for Hadoop; some cores are needed for the OS and other processes. Let's assume 14 cores are reserved for YARN.

Memory allocated for all YARN containers on a node = number of virtual cores x memory per core
114688 MB = 14 * 8192 MB (8 * 1024)

Note also that at 8 GB we can run 14 tasks (mappers or reducers) in parallel, one per CPU, without wasting RAM. We can certainly run container sizes smaller than 8 GB if we wish. Since our optimal container size per node is 8 GB, our YARN minimum container size must be a factor of 8 GB to prevent wastage of memory, that is: 1, 2, 4 or 8 GB. The Tez container size for Hive, in turn, is a multiple of the YARN minimum container size.
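The sizing arithmetic above can be sketched as a quick shell calculation. The figures are this example's r3.4xlarge numbers (16 cores, 122 GB), not defaults from any tool:

```shell
#!/bin/sh
# YARN memory sizing for the example node: 16 cores, 122 GB RAM.
cores_total=16
cores_for_yarn=14       # leave ~2 cores for the OS and other processes
mem_per_core_mb=8192    # 122 GB / 16 cores = 7.625 GB, rounded to 8 GB

# Total memory YARN may allocate to containers on one node
yarn_node_mem_mb=$(( cores_for_yarn * mem_per_core_mb ))
echo "yarn.nodemanager.resource.memory-mb = $yarn_node_mem_mb"   # 114688

# With 8 GB containers we can run one task per reserved core in parallel
echo "parallel 8 GB tasks per node = $(( yarn_node_mem_mb / mem_per_core_mb ))"
```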
08-12-2016
01:11 AM
Here is also a good article:
https://community.hortonworks.com/articles/22756/quickly-enable-ssl-encryption-for-hadoop-component.html
05-20-2016
05:23 PM
See also https://github.com/steveloughran/kerberos_and_hadoop/blob/master/sections/errors.md
03-11-2016
12:43 AM
40 Kudos
How does Tez determine the number of reducers? How can I control this for performance? In this article I will attempt to answer this while executing and tuning an actual query to illustrate the concepts. Then I will provide a summary with a full explanation. If you wish, you can skip ahead to the summary.
-------------
0. Prep Work and Checklist

We followed the Tez memory tuning steps outlined in https://community.hortonworks.com/content/kbentry/14309/demystify-tez-tuning-step-by-step.html and set up our environment, turning CBO and vectorization on:
set hive.support.sql11.reserved.keywords=false;
set hive.execution.engine=tez;
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled = true;
set hive.vectorized.execution.reduce.groupby.enabled = true;
set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=16;
We created ORC tables and did an INSERT OVERWRITE into a table with partitions:

set hive.exec.dynamic.partition.mode=nonstrict;
# With many partition columns there is a danger of generating many broken files in ORC. To prevent that:
set hive.optimize.sort.dynamic.partition=true;
# If Hive jobs previously ran much faster than in the current release, look into setting this property to false instead:
# set hive.optimize.sort.dynamic.partition=false;
insert overwrite table benchmark_rawlogs_orc partition (p_silo,p_day,p_clienthash)
select * FROM <original table>;
We generated the statistics we needed for the query execution:

-- generate statistics for the ORC table
set hive.stats.autogather=true;
-- to generate statistics for the entire table and all columns for all days (longer):
ANALYZE TABLE rawlogs.benchmark_rawlogs_orc partition (p_silo, p_day, p_clienthash) COMPUTE STATISTICS;
ANALYZE TABLE rawlogs.benchmark_rawlogs_orc partition (p_silo, p_day, p_clienthash) COMPUTE STATISTICS for columns;
--------------------------------
1. First Execution of the Query

Here we can see that 61 mappers were created. This is determined by the grouped splits and, if not grouped, most likely corresponds to the number of files or split sizes in the ORC table. For a discussion of how Tez determines the number of mappers, see "How are Mappers Determined For a Query" and "How initial task parallelism works".

The mappers complete quickly, but the execution is stuck at 89% for a long time. We observe that there are three vertices in this run: one mapper stage and two reducer stages. The first reducer stage has ONLY two reducers, and they have been running forever? Hmmmm...

The query finally completed in 60 secs. What gives? Why only 2 reducers? Let's look at the explain plan.

-------------------------------------------------------
2. The LONGGGGGG Explain Plan

Let's look at the relevant portions of this explain plan. We see in red that in the reducer stage, 14.5 TB of data, across 13 million rows, is processed. This is a lot of data to funnel through just two reducers. The final output of the reducers is just 190944 bytes (in yellow), after the initial group-bys of count, min and max. We need to increase the number of reducers.

-------------------------------------------
3. Set Tez Performance Tuning Parameters

When Tez executes a query, it initially determines the number of reducers it needs and automatically adjusts it as needed based on the number of bytes processed.

- Manually setting the number of reducers (not recommended)

To manually set the number of reducers we can use the parameter mapred.reduce.tasks. By default it is set to -1, which lets Tez automatically determine the number of reducers. You can manually set it to the number of reducer tasks (not recommended):
> set mapred.reduce.tasks = 38;
It is better to let Tez determine this and make the proper changes within its framework, instead of using this brute-force method:
> set mapred.reduce.tasks = -1;
- How to Properly Set the Number of Reducers

First we double-check that auto reducer parallelism is on. The parameter is hive.tez.auto.reducer.parallelism (see https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.tez.auto.reducer.parallelism). It is set to true.

# Turn on Tez' auto reducer parallelism feature. When enabled, Hive will still estimate data sizes and set parallelism estimates. Tez will sample the source vertices' output sizes and adjust the estimates at runtime as necessary.
> set hive.tez.auto.reducer.parallelism;
> set hive.tez.auto.reducer.parallelism = true;

This is the first property that determines the initial number of reducers once Tez starts the query. Then there are two boundary parameters:
hive.tez.min.partition.factor
hive.tez.max.partition.factor

# When auto reducer parallelism is enabled, this factor will be used to put a lower limit on the number of reducers that Tez specifies.
> hive.tez.min.partition.factor=0.25;
# When auto reducer parallelism is enabled, this factor will be used to over-partition data in shuffle edges.
> hive.tez.max.partition.factor=2.0;
More on this parameter later.
The third property is hive.exec.reducers.max, which determines the maximum number of reducers. By default it is 1099. The final parameter that determines the initial number of reducers is hive.exec.reducers.bytes.per.reducer. By default it is set to 256 MB, specifically 258998272 bytes.

The FORMULA

So, to put it all together, Hive/Tez estimates the number of reducers using the following formula and then schedules the Tez DAG:

Max(1, Min(hive.exec.reducers.max [1099], ReducerStage estimate / hive.exec.reducers.bytes.per.reducer)) x hive.tez.max.partition.factor [2]

------------------
So in our example, since the reducer-stage output is 190944 bytes, the number of reducers will be:
> Max(1, Min(1099, 190944/258998272)) x 2
> Max(1, Min(1099, 0.00073)) x 2 = 1 x 2 = 2
Hence the 2 reducers we initially observe.

---------------------
4. Increasing the Number of Reducers, the Proper Way

Let's lower hive.exec.reducers.bytes.per.reducer to 10432 bytes. The new reducer count is:
> Max(1, Min(1099, 190944/10432)) x 2
> Max(1, Min(1099, 18.3)) x 2 = 19 (rounded up) x 2 = 38
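The two calculations above can be scripted as a sanity check. This is my own sketch of the estimate, using ceiling division for the "rounded up" step, with hive.exec.reducers.max=1099 and hive.tez.max.partition.factor=2 as in this article:

```shell
#!/bin/sh
# est_reducers <reducer_stage_bytes> <bytes_per_reducer>
# Max(1, Min(1099, ceil(estimate / bytes_per_reducer))) x 2
est_reducers() {
  rs=$1; bpr=$2
  reducers_max=1099    # hive.exec.reducers.max
  factor=2             # hive.tez.max.partition.factor
  n=$(( (rs + bpr - 1) / bpr ))            # ceiling division, the "rounded up"
  if [ "$n" -gt "$reducers_max" ]; then n=$reducers_max; fi
  if [ "$n" -lt 1 ]; then n=1; fi
  echo $(( n * factor ))
}

est_reducers 190944 258998272   # default 256 MB per reducer -> 2
est_reducers 190944 10432       # lowered per-reducer bytes  -> 38
```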
The query now takes 32.69 seconds, an improvement.

---------------------------------------------------
5. More Reducers Does Not Always Mean Better Performance

Let's set hive.exec.reducers.bytes.per.reducer to 15872 bytes. The new reducer count is:
> Max(1, Min(1099, 190944/15872)) x 2
> Max(1, Min(1099, 12)) x 2 = 12 x 2 = 24
Performance is BETTER with 24 reducers than with 38 reducers.
----------------------------
6. Reducing the Number of Reducer Stages

Since we have BOTH a GROUP BY and an ORDER BY in our query, looking at the explain plan, perhaps we can combine those into one reducer stage. The parameter for this is hive.optimize.reducededuplication.min.reducer, which is 4 by default. Setting it to 1 and executing the query, performance is BETTER with ONE reducer stage, at 15.88 s.

NOTE: This also worked because we had a LIMIT 20 in the statement. When the LIMIT was removed, we had to resort to estimating the right number of reducers instead to get better performance.
------------------------------------------------
Summary

While we can manually set the number of reducers with mapred.reduce.tasks, this is NOT RECOMMENDED:
set mapred.reduce.tasks = 38;

Tez does not actually have a reducer count when a job starts; it always has a maximum reducer count, and that's the number you see in the initial execution. This is controlled by 4 parameters in Hive:

hive.tez.auto.reducer.parallelism=true;
hive.tez.min.partition.factor=0.25;
hive.tez.max.partition.factor=2.0;
hive.exec.reducers.bytes.per.reducer=1073741824; // 1 GB

You can get a wider or narrower distribution by adjusting the last 3 parameters (preferably only the min/max factors, which are merely guard rails to prevent bad guesses). Hive/Tez estimates the number of reducers using the following formula and then schedules the Tez DAG:

Max(1, Min(hive.exec.reducers.max [1099], ReducerStage estimate / hive.exec.reducers.bytes.per.reducer)) x hive.tez.max.partition.factor [2]

Then, as map tasks finish, it inspects the output size counters of the tasks to estimate the final output size, and reduces that number to a lower number by combining adjacent reducers. The total number of mappers which have to finish before it decides and runs the reducers of the next stage is determined by the following parameters:

tez.shuffle-vertex-manager.min-src-fraction=0.25;
tez.shuffle-vertex-manager.max-src-fraction=0.75;

This indicates that the decision will be made between 25% of mappers finishing and 75% of mappers finishing, provided there's at least 1 GB of data being output (i.e. if 25% of mappers don't send 1 GB of data, we will wait till at least 1 GB is sent out). Once a decision has been made, it cannot be changed, as some reducers will already be running and might lose state if we did that. You can get more and more accurate predictions by increasing the fractions.

------------------------------------
APPENDIX

Hive 2.0 (only) improvements

Now that we have a total number of reducers, we might not have the capacity to run all of them at the same time, so we need to pick a few to run first. The ideal situation would be to start the reducers which (already) have the most data to fetch first, so that they can start doing useful work, instead of starting reducer #0 first (like MRv2), which may have very little data pending.

tez.runtime.report.partition.stats=true;
tez.runtime.pipelined-shuffle.enabled=true;

The first flag there is pretty safe, but the second one is a bit more dangerous, as it allows reducers to fetch from tasks which haven't even finished (i.e. mappers failing causes reducer failure, which is optimistically fast, but slower when there are failures; bad for consistent SLAs).

Finally, we have the sort buffers, which are usually tweaked and tuned to fit, but you can make them much faster by making those allocations lazy (i.e. allocating 1800 MB contiguously in a 4 GB container will cause a 500-700 ms GC pause, even if there are only 100 rows to be processed):

tez.runtime.pipelined.sorter.lazy-allocate.memory=true;

References:
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
http://hortonworks.com/blog/apache-tez-dynamic-graph-reconfiguration/
http://www.slideshare.net/t3rmin4t0r/hivetez-a-performance-deep-dive
http://www.slideshare.net/ye.mikez/hive-tuning (Mandatory)

See also:
http://www.slideshare.net/AltorosBY/altoros-practical-steps-to-improve-apache-hive-performance
http://www.slideshare.net/t3rmin4t0r/data-organization-hive-meetup
http://www.slideshare.net/InderajRajBains/using-apache-hive-with-high-performance

Special thanks to Gopal for assisting me with understanding this.
03-10-2016
06:14 PM
16 Kudos
Microsoft Azure General Sizing Guidelines

You need to size and price machine and storage separately.
Use Linux VMs on Azure (not to be confused with the Ubuntu beta offering on HDInsight).
If performance is a must, especially with Kafka and Storm, use Premium Storage, not Standard. Make sure to request Premium Storage (see link below).
Do not use A8 machines. Use either A10s or A11s. The A8 is backed by InfiniBand, which is more expensive and unnecessary for Hadoop.
Recommended: the D series, and the newer D_v2 series for solid state drives if needed. For Premium Storage use the DS_v2 series.
It is recommended to use Page Blob Storage for HBase, as opposed to Block Blob Storage. See link below.
Both options will need attached Blob Storage. The 382 GB local disk that comes with the VM is just for temp storage. Blob Storage comes in 1023 GB sizes, and each VM has a maximum number of Blob Storage disks that can be attached; e.g. A10 VMs can have a maximum of 16 x 1 TB storage. See the links below for more details.
Use availability sets for master and worker nodes.
Use one storage account for every node in the cluster in order to bypass IOPS limits for multiple VMs on the same storage account.
You can also try Azure Data Lake Store (with adl://) in order to check the performance of the new Azure service.
You also need to keep in mind the maintenance windows of each Azure region relative to your customers: some regions could be a good choice for new service availability (e.g. US East 2) but not from a maintenance point of view (especially for European customers).

---------------------------------------
Recommendation 1 - Best Compute Performance for Batch and Real-Time Use Cases
For head/master nodes use:
Standard_D13_v2 (8 CPU, 56 GB), Standard_D5_v2 (16 CPU, 56 GB) or Standard_D14_v2 (16 CPU, 112 GB)
For data nodes use:
Standard_D14_v2 (16 CPU, 112 GB), Standard_DS14_v2 (16 CPU, 112 GB with Premium Storage) or Standard_DS15_v2 (20 CPU, 140 GB with Premium Storage)
If testing Kafka and Storm, use Standard_DS13_v2, Standard_DS14_v2 or Standard_DS15_v2 with Premium Storage, especially if performance is needed to meet SLAs.
Pros: CPU is 35% faster than the D series; local SSD disks; VMs are cheaper per hour than the A or D series.

Recommendation 2 - Good Compute Performance
Use Standard_D13 (8 CPU, 56 GB) or Standard_D14 (16 CPU, 112 GB) for head/master nodes, and Standard_D14 (16 CPU, 112 GB) for data nodes.
If testing Kafka and Storm, use Standard_DS13 (8 CPU, 56 GB) or Standard_DS14 (16 CPU, 112 GB) with Premium Storage, especially if performance is needed to meet SLAs.
Pros: 60% faster than the A series; local SSD disks. Why pick this, though, if it is slightly more expensive per hour than the D_v2 series?

Recommendation 3 - Mostly for Batch Performance
Use A10 or A11 for head/master nodes and A11 for data nodes. Microsoft is pricing effectively so that you use the D_v2 series.

------------
Microsoft Links

Storage pricing: https://azure.microsoft.com/en-us/pricing/details/storage/
Premium Storage: https://azure.microsoft.com/en-us/documentation/articles/storage-premium-storage/
VM pricing: https://azure.microsoft.com/en-us/pricing/details/virtual-machines/#Linux
VM size specs: https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/
Page vs. Block Blob Storage: https://hadoop.apache.org/docs/current/hadoop-azure/index.html#Page_Blob_Support_and_Configuration
Azure Data Lake: https://azure.microsoft.com/en-us/blog/introducing-azure-data-lake/
02-29-2016
03:02 PM
Additional white papers from EMC: http://www.criticism.com/white-papers/white-papers.php
Latest EMC best practices (January 2015 version): https://www.emc.com/collateral/white-papers/h13926-wp-emc-isilon-hadoop-best-practices-onefs72.pdf
02-13-2016
04:50 PM
57 Kudos
This article is for those who want a cheat sheet for a smooth installation of HDP in a Dev or Test environment, with one or more of the following requirements:
Place all the log data into a different directory, not /var/log
All your service user names must be prefixed with the cluster name. The requirement is that these users must be centrally managed by AD or an LDAP.
You do not have any local users in the Hadoop cluster, including Hadoop service users. This becomes important if you wish to have Centrify deployed also, or if you would be deploying multiple clusters with a single LDAP/ AD integration. Once again, these service names should have a cluster-prefix.
You want to set appropriate YARN, Tez, MapReduce and Ambari Metrics memory parameters during install.
Side Note: It is always prudent to get Professional Services assistance to install or configure your production deployment, to make sure all the pre-requisites unique to your environment are covered and met.
--------------------------------------------------------------------------------------------------------
Step 1: Do Your Research..... Plan, Plan, Plan, Do it Right the First time, or Risk Doing it Over, and Over Again
This article is not intended to replace the Hortonworks docs or all the excellent resources here in HCC or elsewhere.
Apart from the Hortonworks docs, review:
Hortonworks Operational Best Practices Webcast and Slides
Typical Hadoop Cluster Networking Practices
Best Practice Linux File System for Hadoop and ext4 vs. XFS
Yarn Directories Recommended Size and Disk.
Best Practice Zookeeper Placement
Best Practice for Storm and Kafka Deployment and Unofficial Storm and Kafka Best practices Guide
Name Node Garbage Collection Best Practice
Tools to test the Performance, Scale and Reliability of Your Cluster
--------------------------------------------------------------------------------------------------------
Step 2: Get your Disk partitions Right
See the following for some guidance. Take note of the hadoop properties and default locations. You need to have this done ahead of time.
Disk Partition Baseline
--------------------------------------------------------------------------------------------------------
Name Nodes Disk Partitioning
--------------------------------------------------------------------------------------------------------
Data Nodes Disk Partition
--------------------------------------------------------------------------------------------------------
Ambari/ Edge/ Ranger/ Knox Nodes Disk Partition
--------------------------------------------------------------------------------------------------------
Storm and Kafka Nodes Disk Partition
--------------------------------------------------------------------------------------------------------
Step 3: Don't Scrimp on Master Nodes. Know the Placement of Your Master Services
If you want to do yourself an injustice, just allocate one or two master nodes.
If you want to do things properly, and you want to be set for up to 50 nodes, then please have at least 3 master nodes, better 4 if you are doing HA, with at least 1 edge node and 1 Admin/Ambari server.
It is a PAIN, and some effort is involved, to move master services if you don't get this right.
Figure out where you are placing your master services. Use the following as a guide:
--------------------------------------------------------------------------------------------------------
Step 4: Get a Dedicated Database Server with HA for Ambari, Hive, Metastore, Oozie, Ranger
Oozie by default installs on Derby. You do not want Derby in your cluster.
Ambari by default installs on Postgres. You can decide to keep it there.
Hive's metastore uses MySQL. You can use a dedicated MySQL Database for Hive, Ranger Admin, and Oozie. Bear in mind though that if you restart Hive's metastore, it may affect Ranger and Oozie.
The instructions for setting up the databases before an Ambari install are located at Using Non-Default Databases.
--------------------------------------------------------------------------------------------------------
Step 5: Create Service Accounts Beforehand in your LDAP
Decide what your cluster prefix will be. Do not put an underscore "_" or a hyphen "-" in your prefix.
The list of service accounts you need to create are located here.
Solr is missing from the list. You need this user if you want to install Ranger, for Ranger uses Solr from HDP 2.3 and above for auditing and to show audit events in the UI.
Create a solr user with default group solr, with membership in the hadoop group also.
IMPORTANT: On each node, get the AD or LDAP UID for hdfs and the GID for group hadoop; edit /etc/passwd and /etc/group and add the users there with the CORRECT UID from AD or LDAP. I have found that even if you choose the Skip Group Modifications option so that the Linux groups in the cluster are not modified, and you tell Ambari not to manage the HDFS user, some of the yum installs still try to create them; Ambari will respect your wishes, but yum will not.
Make sure the entries in your /etc/passwd and /etc/group have your cluster prefix.
When you install through Ambari, it is very important that you configure the right properties so that Ambari is aware of your centrally managed, cluster-prefixed service names:
Set Skip Group Modifications
Tell Ambari NOT to manage the HDFS user
Follow the instructions at
Setting properties that depend on service usernames/groups
There is one property missing from the doc:
Also set the HDFS User to your <cluster-prefix>-hdfs in Advanced hadoop-env.
--------------------------------------------------------------------------------------------------------
Step 6: Use Hortonworks Handy Scripts to Automatically Prepare the Environment Across all Nodes
So you have your disk partitions, your network is setup, you have decided on your master services placement, you have created the service names in LDAP with a cluster prefix, you have edited your /etc/passwd and /etc/groups.
Here comes the fun part.
Go to your Ambari node and perform the following:
# Install Hortonworks Public Tools
> yum install wget
> wget -qO- --no-check-certificate https://github.com/hortonworks/HDP-Public-Utilities/raw/master/Installation/install_tools.sh | bash
>./install.sh
>cd hdp
#Everything will be installed to /root/hdp; create the /root/hdp/Hostdetail.txt file with the hostnames of all the nodes in your cluster.
# On each node: hostname -f
vi /root/hdp/Hostdetail.txt
#To set up Password-less SSH
> ssh-keygen
>chmod 700 ~/.ssh
>chmod 600 ~/.ssh/id_rsa
# Distribute the keys to the other nodes. The copy command is needed because the ./distribute_ssh_keys.sh script expects the private key at /tmp/ec2_keypair. Alternatively, if you set up your nodes with a root password, just enter it when prompted by the script.
> cp <your nodes private key> /tmp/ec2_keypair
> ./distribute_ssh_keys.sh ~/.ssh/id_rsa.pub
#Optional: Copy the private key to all nodes if you want password-less ssh from any node to any node. Don't do this if you want password-less ssh ONLY from the Ambari node. Password-less ssh is only needed for Ambari to install the agents on all nodes; without it you need to install and configure the agents yourself.
>./copy_file.sh ~/.ssh/id_rsa ~/.ssh/id_rsa
# Test passwordless SSH
> ssh <node>
#Now run a script to set all the OS pre-requisites for a cluster install. You may have to edit ./run_command.sh and add -tty to the ssh command, since the ./hdp_preinstall.sh script contains sudo commands.
> ./run_command.sh 'mkdir /root/hdp'
> ./copy_file.sh /root/hdp/hdp_preinstall.sh /root/hdp/hdp_preinstall.sh
> vi run_command.sh (add "-tty" to the ssh call)
# Now set the OS parameters in one swoop
> ./run_command.sh '/root/hdp/hdp_preinstall.sh'
REBOOT ALL NODES
#DOUBLE CHECK That all the Nodes retain all the OS Environment Configuration Changes for HDP Install
> ./pre_install_check.sh | tee report.txt
#View the report. Ignore the repo warnings for Ambari and HDP if you are connected to the internet and will pull the repos from there during install.
> vi report.txt
# Now get your YARN Parameters to use when you install the cluster via Ambari
# Download Hortonworks Companion files
> wget http://public-repo-1.hortonworks.com/HDP/tools/2.3.4.0/hdp_manual_install_rpm_helper_files-2.3.4.0.3485.tar.gz
> tar -zxvf hdp_manual_install_rpm_helper_files-2.3.4.0.3485.tar.gz
> cd hortonworks-HDP-Public-Utilities-d617f44
# Now run the script to determine the memory parameters that you will set in Ambari during the Customize Services step. Put your number of cores (-c), memory per node in GB (-m), disks per node for HDFS (-d), and whether HBase will be installed or not (-k) into the python call:
> python yarn-utils.py -c 16 -m 64 -d 4 -k True
See Determine YARN and HDP memory
Make a note of these memory settings to plug in during the Ambari install.
--------------------------------------------------------------------------------------------------------
Step 7: Installing Ambari
Now you can start installing Ambari and HDP from the doc at
http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Installing_HDP_AMB/content/_using_a_local_repository.html
Don't forget about setting your cluster-prefixed service name for hdfs and hbase
Don't choose a cluster name that has an underscore (_) because HDFS HA does not like it.
Don't forget to change all the directory locations as per the disk partition diagrams above.
You can change the directory for Hadoop logs upon install if you wish. See https://community.hortonworks.com/questions/4329/log-file-location-is-there-a-way-to-change-varlog.html
Don't forget to set the YARN and MapReduce Memory Parameters found from the python script.
Don't forget to set the Name Node garbage collection.
You can do the following to get Ambari running better during install: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_ambari_reference_guide/content/ch_tuning_ambari_performance.html
During Install you can configure Ambari Metrics: See https://cwiki.apache.org/confluence/display/AMBARI/Configurations+-+Tuning and http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_ambari_reference_guide/content/_ams_general_guidelines.html
You can follow this to tune Tez During the Install. See https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
IMPORTANT: For fewer than 10 data nodes,
set mapred.submit.replication=3 in mapred-site.xml.
This prevents the job-related staging files from being created with the default replication factor of 10, which would lead to under-replicated block warnings.
--------------------------------------------------------------------------------------------------------
Step 8: Install SmartSense, Only Offered by Hortonworks
Finally, INSTALL SMARTSENSE if you are a Hortonworks customer. If you are not, why not? You are missing all the value from SmartSense to auto-tune your cluster. (In Ambari 2.2 it is available as a service.)
--------------------------------------------------------------------------------------------------------
Step 9 Security Tips
If you plan to install Ranger, INSTALL SOLR FIRST. Don't add the Ranger service right away after you install the cluster.
Make sure that you use the <cluster-prefix>-solr user in your install, so that the process runs under that user.
Enable Kerberos if you can BEFORE adding Ranger. If not, that is fine; you will have to configure Ranger and all the plug-ins after the fact, but it is easier if you enable Kerberos first.
Storm, Kafka and Solr need Kerberos before you authorize with Ranger.
There is no Security without Kerberos.
--------------------------------------------------------------------------------------------------------
Finally
Most issues are due to a rogue process running with a local UID rather than the LDAP/AD UID, so double-check using ps -ef. If you set up your /etc/passwd and /etc/group properly beforehand, you should not have this issue.
Some issues come up if your files and/or logs are owned by the local hdfs user. Again, if you did not choose the 'Skip Group Modifications' option, did not tell Ambari to not manage the HDFS user, did not set the hdfs user properly during install to <cluster-prefix>-hdfs, or did not set up your /etc/passwd and /etc/group, you will get this problem.
Remember, some yum installs do not care what you set in Ambari for the hdfs user, so you may have to run those manually; look out for that.
--------------------------------------------------------------------------------------------------------
Update:
A good resource:
https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=45580306
https://community.hortonworks.com/questions/21405/where-to-write-fsimage-files-when-running-qjm-nn-h.html