Member since: 09-19-2015
Posts: 15
Kudos Received: 16
Solutions: 4

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5728 | 08-25-2016 01:13 PM
 | 2465 | 07-14-2016 06:14 PM
 | 1724 | 12-17-2015 04:22 PM
 | 1484 | 11-23-2015 02:33 PM
05-05-2017
01:22 PM
In the Zeppelin notebook there is a versioning button that I have been using to commit my script, but I have not found a way to revert to a previous version of the script from within the notebook. How can I access this feature?
Labels:
- Apache Zeppelin
04-08-2017
07:27 PM
1 Kudo
Log in as root, then switch to the postgres user:

[root@Node0 ~]$ cd /var/lib/pgsql
[root@Node0 pgsql]$ sudo su - postgres
-bash-4.2$ grep listen /var/lib/pgsql/data/postgresql.conf
#listen_addresses = 'localhost'    # what IP address(es) to listen on;
-bash-4.2$ vi /var/lib/pgsql/data/postgresql.conf

Find the listen_addresses = 'localhost' line, uncomment it, and change it to listen_addresses = '*'. Save and exit vi, exit back to root, and restart PostgreSQL:

[root@Node0 ~]$ service postgresql restart
Redirecting to /bin/systemctl restart postgresql.service

Then run ambari-server setup and accept all prompts. When setup completes successfully, start Ambari:

[root@Node0 ~]$ ambari-server setup
[root@Node0 ~]$ ambari-server start
Ambari Server 'start' completed successfully.
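For anyone who prefers to script the edit instead of using vi, here is a minimal non-interactive sketch (assuming the default data directory /var/lib/pgsql/data and the stock commented-out line):

# uncomment listen_addresses (if needed) and set it to '*', then verify and restart
sed -i "s/^#\{0,1\}listen_addresses\s*=\s*'localhost'/listen_addresses = '*'/" /var/lib/pgsql/data/postgresql.conf
grep '^listen_addresses' /var/lib/pgsql/data/postgresql.conf
service postgresql restart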
08-25-2016
01:13 PM
B) Create a Hive table. The Hive table should have all the columns stated in your hive2parquet.csv file; assume (col1, col2, col3), and assume your CSV file is in the /tmp directory inside HDFS.

1- Log into Hive and, at the hive command prompt, execute steps 2, 3, and C below.

2- Create the staging Hive table:
create table temp_txt (col1 string, col2 string, col3 string) row format delimited fields terminated by ',';

3- Load the staging table from the hive2parquet.csv file (note: inpath, since the file is already in HDFS):
load data inpath '/tmp/hive2parquet.csv' into table temp_txt;

C) Insert from table temp_txt into table table_parquet_file:
insert into table table_parquet_file select * from temp_txt;
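For reference, here is a rough end-to-end sketch of the same flow run from the shell. The table_parquet_file DDL is an assumption (step A is not shown in this post), so adjust the columns and types to match your data:

# sketch only: assumes the hive CLI is on the PATH and hive2parquet.csv is already in HDFS under /tmp
hive -e "
CREATE TABLE IF NOT EXISTS table_parquet_file (col1 STRING, col2 STRING, col3 STRING) STORED AS PARQUET;
CREATE TABLE IF NOT EXISTS temp_txt (col1 STRING, col2 STRING, col3 STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA INPATH '/tmp/hive2parquet.csv' INTO TABLE temp_txt;
INSERT INTO TABLE table_parquet_file SELECT * FROM temp_txt;
"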
07-14-2016
06:14 PM
1 Kudo
Nick - Re-run the install (there should be a drop-down arrow next to the Upgrade button >> Reinstall). Let me know how it goes. Alex -
07-14-2016
06:04 PM
Nick - Can you quickly check how much disk space you have on the edge nodes? Also, did all the other DataNodes complete the installation properly? Is your problem only on the edge nodes? Alex -
04-15-2016
01:37 PM
@emaxwell Great thread Eric. Cheers.
04-15-2016
01:09 PM
Labels:
- Apache Ambari
01-27-2016
01:31 PM
6 Kudos
Microsoft HDInsight provides four options under Cluster Type: Hadoop, HBase, Storm, and Spark (Preview).
Some HDP components are not standard in the chosen HDInsight distribution. In this particular case, the customer was interested in the Spark / Spark notebook components, and especially Zeppelin.
When choosing the Spark option, only these services are provided under Ambari:
HDFS, MR2, YARN, Tez, Hive, Pig, Sqoop, Oozie, ZooKeeper, Ambari Metrics, Jupyter, Livy (the remote job submission server for Spark), and Spark.
Zeppelin is missing from the HDI Spark distro at the time of writing this article.
So, to install Zeppelin when choosing 'Spark' as the HDI Cluster Type, go to 'New HDInsight Cluster' >> 'Optional Configuration' >> 'Script Actions' and enter the following URI:
https://hdiconfigactions.blob.core.windows.net/linuxincubatorzeppelinv01/install-zeppelin-spark151-v01.sh
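Under the hood, the Script Action simply runs that shell script on the cluster nodes. A rough manual equivalent, assuming SSH access to the headnode, would be the following (a sketch, not the official procedure):

wget https://hdiconfigactions.blob.core.windows.net/linuxincubatorzeppelinv01/install-zeppelin-spark151-v01.sh
chmod +x install-zeppelin-spark151-v01.sh
sudo ./install-zeppelin-spark151-v01.sh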
Further services can be added to the Spark cluster type through Ambari (Add Service): Accumulo, Atlas, Mahout, Ranger, Ranger KMS, and Slider.
(Note that HBase is not a default install when choosing the Spark HDI option; it has to be installed separately, or by choosing the Hadoop cluster type instead of Spark.)
Here are other 'Script Actions' URLs for installing Spark, R, Solr, and Giraph while installing HDInsight:
Install Spark >> https://hdiconfigactions.blob.core.windows.net/sparkconfigactionv03/spark-installer-v03.ps1
Install R >> https://hdiconfigactions.blob.core.windows.net/rconfigactionv02/r-installer-v02.ps1
Install Solr >> https://hdiconfigactions.blob.core.windows.net/solrconfigactionv01/solr-installer-v01.ps1
Install Giraph >> https://hdiconfigactions.blob.core.windows.net/giraphconfigactionv01/giraph-installer-v01.ps1
12-17-2015
04:22 PM
3 Kudos
Root Cause -
1. Milliseconds were used in the OpenTSDB metrics, which may generate over 32,000 metrics in one hour. Each column of millisecond metrics uses 4 bytes, so when compacting, the combined column size may exceed 128 KB (hfile.index.block.max.size).
2. If the size of (rowkey + columnfamily:qualifier) is greater than hfile.index.block.max.size, the memstore flush can enter an infinite loop while writing the HFile index. That is why the compaction hangs, the .tmp folders of the regions on HDFS keep growing, and the region server goes down. When HBase starts, it creates that huge file in a .tmp directory in one of the subdirectories under the tsdb directory.

Solution -
1. Shut down HBase.
2. Found the .tmp disk usage by HBase in the OpenTSDB keyspaces and deleted those directories completely.
3. If the parent directory contains a directory called recovered.edits, delete the recovered.edits directory or rename it to something like recovered.edits.bak.
4. Modified hbase-site.xml in Ambari and increased hfile.index.block.max.size to 1024 KB (from the default of 128 KB).
5. Restarted HBase, followed by OpenTSDB.

The cluster was immediately stable (no more increasing disk space problem), old data could be seen in OpenTSDB and viewed in Grafana without issue, dataflow was turned on, and everything appears to be working normally again.

Further Solution Details - Here is what the OpenTSDB structure looked like before the solution was applied (per-region HDFS usage under the tsdb table):
5.7 G     /apps/hbase/data/data/default/tsdb/08bfcc080d15d1127a0ebe664fdb1d80
5.0 G     /apps/hbase/data/data/default/tsdb/0a55f4589b4e4bc9d7f71957a5795b4f
7.9 G     /apps/hbase/data/data/default/tsdb/10066f9f83ac300e955ab9d0129ebf22
3.6 G     /apps/hbase/data/data/default/tsdb/1cbf2fbf04b1e276b3de3615b95dc68f
5.5 G     /apps/hbase/data/data/default/tsdb/1ed939038b8902a9c391d6d6d5a519f4
9.2 G     /apps/hbase/data/data/default/tsdb/25b4dadf6621b09a63d2b1b9401203b9
6.5 G     /apps/hbase/data/data/default/tsdb/2919b649b3b4a027ce8aece9a3e5ffd9
967.2 G   /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099
4.6 G     /apps/hbase/data/data/default/tsdb/39ab2524f8aaf2ee5685fc85a8dc1543
1022.7 M  /apps/hbase/data/data/default/tsdb/39e3ce75e7225805534595c8c7e03305
4.1 G     /apps/hbase/data/data/default/tsdb/4356d552eacfa526df24b400fd8007c7
9.9 G     /apps/hbase/data/data/default/tsdb/4f2c1cc6d7c650c3f822136614921076
5.9 G     /apps/hbase/data/data/default/tsdb/57eb8cbb2099bd6e1746cd4c8e007207
6.8 G     /apps/hbase/data/data/default/tsdb/5e26da2eacba074a132edefca38017a3
1.2 T     /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116
4.1 G     /apps/hbase/data/data/default/tsdb/6ccc256f9721216bc9afa29e7d056bd4
7.5 G     /apps/hbase/data/data/default/tsdb/6d43c524d221f3f54356e716c8f8849d
6.1 G     /apps/hbase/data/data/default/tsdb/70f9f9ee045cde2823cf8ab485662a63
3.2 G     /apps/hbase/data/data/default/tsdb/75f232ce81d5de2efb3f763d09d9c76f
8.7 G     /apps/hbase/data/data/default/tsdb/7b4f9c05d64151d3f54558c70e5e9811
5.5 G     /apps/hbase/data/data/default/tsdb/7fa9d913fd9a059733e9bb7a31b03e22
8.5 G     /apps/hbase/data/data/default/tsdb/840f9f977262f1fbf9f4c06a8014c44b
3.0 G     /apps/hbase/data/data/default/tsdb/9a3e7dbad294134eae934af980dd8c1c
6.5 G     /apps/hbase/data/data/default/tsdb/a161383c9ae7df3c0cb15da093312908
4.0 G     /apps/hbase/data/data/default/tsdb/b5e22e68e8f93ac50c9d9d3a3ca3f029
7.1 G     /apps/hbase/data/data/default/tsdb/c9a9f105e8c4a44e8fd9172131b929d0
4.7 G     /apps/hbase/data/data/default/tsdb/cc27e8c5a020291b7b3ac010dc50e25b
5.6 G     /apps/hbase/data/data/default/tsdb/cc7f6645bac92ec514f8545fdb39b617
9.3 G     /apps/hbase/data/data/default/tsdb/ce6426bfac53fb06fe8d320f3de150ee
1.8 G     /apps/hbase/data/data/default/tsdb/d2ee226094556e6a90599e91bcba70f4
6.8 G     /apps/hbase/data/data/default/tsdb/df6e88e27be5d3f0759d477812ab9277
3.0 G     /apps/hbase/data/data/default/tsdb/efde9cca49c0f23a4e39e80e4040ac5a
5.9 G     /apps/hbase/data/data/default/tsdb/f82790d90791b30aacd2bd990a1d4655
7.5 G     /apps/hbase/data/data/default/tsdb/fcbb1b8f04f4e74a80882ef074244173
4.8 G     /apps/hbase/data/data/default/tsdb/fece7565715c791028581022b70672e7

We attacked the two largest offenders, found that all of the space was in their .tmp folders, and recovered all of the lost disk space:

hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116/.tmp
hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116/recovered.edits
hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099/.tmp
hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099/recovered.edits

Then we went into the Ambari configuration for HBase and added the setting hfile.index.block.max.size = 1024 KB, which increases it from the default of 128 KB. We will keep monitoring and will report any further anomalies.
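As a quick sanity check (not part of the original runbook), one can confirm that the new value reached the region servers and that the tsdb footprint has stopped growing; this assumes the standard HDP client config location:

grep -A1 'hfile.index.block.max.size' /etc/hbase/conf/hbase-site.xml    # value is in bytes, so expect 1048576
hadoop fs -du -s -h /apps/hbase/data/data/default/tsdb                  # should stay flat once compactions complete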
12-17-2015
01:41 PM
2 Kudos
Background: The customer has an 8-node cluster on AWS with ephemeral storage, 5 of the nodes running HBase. OpenTSDB and Grafana were installed on the cluster as well. The customer was ingesting time series data with OpenTSDB at a rate of ~50k records/second.

Symptom: In a span of a couple of hours, the HDFS disk utilization skyrocketed from a few hundred GB to over 6 TB, all of it in HBase / OpenTSDB. While troubleshooting, we turned off all data ingest and stopped OpenTSDB, running HBase alone, yet the disk utilization continued to grow unabated and out of control, dozens of GB per minute, even with OpenTSDB completely shut down.
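To pin down where the growth was happening, a check along these lines helps (a sketch, assuming the default HDP HBase root directory /apps/hbase/data):

# total footprint of the OpenTSDB table, then the five largest regions
hadoop fs -du -s -h /apps/hbase/data/data/default/tsdb
hadoop fs -du /apps/hbase/data/data/default/tsdb | sort -n | tail -5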
Labels:
- Apache HBase