Member since
09-19-2015
15
Posts
16
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5632 | 08-25-2016 01:13 PM
 | 2373 | 07-14-2016 06:14 PM
 | 1688 | 12-17-2015 04:22 PM
 | 1451 | 11-23-2015 02:33 PM
05-05-2017
01:22 PM
In the Zeppelin Notebook, there is a versioning button that I have been using to commit my script, but I have not been able to find a way to revert to a previous version of the script within the notebook. How can I access this feature?
... View more
Labels:
- Apache Zeppelin
04-08-2017
07:27 PM
1 Kudo
Log in as root:
[root@Node0 ~]$ cd /var/lib/pgsql
[root@Node0 pgsql]$ sudo su - postgres
-bash-4.2$ grep listen /var/lib/pgsql/data/postgresql.conf
#listen_addresses = 'localhost' # what IP address(es) to listen on;
-bash-4.2$ vi /var/lib/pgsql/data/postgresql.conf
Find the listen_addresses = 'localhost' line, uncomment it, and change it to listen_addresses = '*'. Save and exit vi, then exit back to root.
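If you'd rather make the change non-interactively, a sed one-liner along these lines should also work (a sketch, assuming the default commented-out line shown by the grep above):

# uncomment listen_addresses and have PostgreSQL listen on all interfaces
sed -i "s/^#listen_addresses = 'localhost'/listen_addresses = '*'/" /var/lib/pgsql/data/postgresql.conf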
[root@Node0 ~]$ service postgresql restart
Redirecting to /bin/systemctl restart postgresql.service
[root@Node0 ~]$ ambari-server setup
Accept all prompts, and when ambari-server setup completes successfully:
[root@Node0 ~]$ ambari-server start
Ambari Server 'start' completed successfully.
... View more
08-25-2016
01:13 PM
B) Create a Hive table. The Hive table should have all the columns stated in your hive2parquet.csv file; assume (col1, col2, col3). Also assume your csv file is in the /tmp dir inside HDFS.
1- Log into Hive and, at the hive command prompt, execute steps 2, 3, and C below.
// create the hive table
2- create table temp_txt (col1 string, col2 string, col3 string) row format delimited fields terminated by ',';
// load the hive table with the hive2parquet.csv file
3- load data inpath '/tmp/hive2parquet.csv' into table temp_txt;
// insert from table 'temp_txt' into table 'table_parquet_file'
C- insert into table table_parquet_file select * from temp_txt;
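If the Parquet target table from part A hasn't been created yet, a minimal sketch of its DDL could look like this (assuming the same three string columns; adjust the column types to match your data):

// hypothetical DDL for the Parquet-backed target table, matching the columns above
create table table_parquet_file (col1 string, col2 string, col3 string) stored as parquet;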
... View more
07-14-2016
06:14 PM
1 Kudo
Nick - Re-run the install (there should be a drop-down arrow next to the Upgrade button >> Reinstall); Let me know how it goes. Alex -
... View more
07-14-2016
06:04 PM
Nick - Can you quickly check how much disk space you have on the edge nodes? Also, did all the other data nodes complete the installation properly? Is your problem only on the edge nodes? Alex -
... View more
04-15-2016
01:37 PM
@emaxwell Great thread Eric. Cheers.
... View more
04-15-2016
01:09 PM
Labels:
- Apache Ambari
01-27-2016
01:31 PM
6 Kudos
Microsoft HDInsight provides four different options under Cluster Type:
Hadoop, HBase, Storm, and Spark (Preview).
Some HDP components are not standard in the HDInsight distribution of choice.
In this particular case, the customer was interested in the Spark / Spark Notebook components, and especially Zeppelin.
When choosing the Spark option, only these services will be provided under Ambari:
HDFS, MR2, YARN, Tez, Hive, Pig, Sqoop, Oozie, ZooKeeper, Ambari Metrics, Jupyter, Livy (Livy is the remote job submission server for Spark), and Spark.
Zeppelin is missing from the HDI Spark distro at the time of writing this article.
So, in order to install Zeppelin when choosing 'Spark' as the HDI Cluster Type, go to 'New HDInsight Cluster' >> 'Optional Configuration' >> 'Script Actions' and enter the following URI:
https://hdiconfigactions.blob.core.windows.net/linuxincubatorzeppelinv01/install-zeppelin-spark151-v01.sh
Further services can be added for the Spark Cluster Type through Ambari (Add Service):
Accumulo, Atlas, Mahout, Ranger, Ranger KMS, and Slider.
(Notice that HBase is not a default install when choosing the Spark HDI install option, and has to be installed separately or by choosing the Hadoop Cluster Type instead of Spark.)
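As a quick sanity check after the cluster (and any script action) finishes deploying, you can list the services Ambari actually knows about via its REST API; a rough sketch, where CLUSTERNAME and PASSWORD are placeholders for your own values:

# list the services registered in Ambari on the new HDInsight cluster
curl -u admin:PASSWORD "https://CLUSTERNAME.azurehdinsight.net/api/v1/clusters/CLUSTERNAME/services"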
Here are other 'Script Actions' URLs for installing Spark, R, Solr, and Giraph while installing HDInsight:
Install Spark >> https://hdiconfigactions.blob.core.windows.net/sparkconfigactionv03/spark-installer-v03.ps1
Install R >> https://hdiconfigactions.blob.core.windows.net/rconfigactionv02/r-installer-v02.ps1
Install Solr >> https://hdiconfigactions.blob.core.windows.net/solrconfigactionv01/solr-installer-v01.ps1
Install Giraph >> https://hdiconfigactions.blob.core.windows.net/giraphconfigactionv01/giraph-installer-v01.ps1
... View more
12-17-2015
04:22 PM
3 Kudos
Root Cause -
1. Milliseconds were used in the OpenTSDB metrics, which may generate over 32,000 metrics in one hour. Each millisecond-metric column uses a 4-byte qualifier, so when compacting, the integrated column size (over 32,000 columns x 4 bytes) may exceed the 128 KB hfile.index.block.max.size.
2. If the size of (rowkey + columnfamily:qualifier) is greater than hfile.index.block.max.size, this may cause the memstore flush to loop infinitely while writing the HFile index. That's why the compaction hangs, the .tmp folders of regions on HDFS keep growing, and the region server goes down. When HBase is starting, it will create that huge file in a .tmp directory in one of the subdirectories under the tsdb directory.

Solution -
1. Shut down HBase.
2. Found the .tmp disk usage by HBase in the OpenTSDB keyspaces and deleted it completely.
3. If the parent directory contains a directory called recovered.edits, delete the recovered.edits directory or rename it to something like recovered.edits.bak.
4. Modified hbase-site.xml in Ambari and increased hfile.index.block.max.size to 1024 KB (from the default of 128 KB).
5. Then restarted HBase followed by OpenTSDB.

The cluster was immediately stable (no more increasing disk space problem) and old data could be seen in OpenTSDB and viewed in Grafana without issue. Dataflow was turned on and everything appears to be working normally again.

Further Solution Details -
Here's what the OpenTSDB structure looked like before the solution was applied:

5.7 G     /apps/hbase/data/data/default/tsdb/08bfcc080d15d1127a0ebe664fdb1d80
5.0 G     /apps/hbase/data/data/default/tsdb/0a55f4589b4e4bc9d7f71957a5795b4f
7.9 G     /apps/hbase/data/data/default/tsdb/10066f9f83ac300e955ab9d0129ebf22
3.6 G     /apps/hbase/data/data/default/tsdb/1cbf2fbf04b1e276b3de3615b95dc68f
5.5 G     /apps/hbase/data/data/default/tsdb/1ed939038b8902a9c391d6d6d5a519f4
9.2 G     /apps/hbase/data/data/default/tsdb/25b4dadf6621b09a63d2b1b9401203b9
6.5 G     /apps/hbase/data/data/default/tsdb/2919b649b3b4a027ce8aece9a3e5ffd9
967.2 G   /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099
4.6 G     /apps/hbase/data/data/default/tsdb/39ab2524f8aaf2ee5685fc85a8dc1543
1022.7 M  /apps/hbase/data/data/default/tsdb/39e3ce75e7225805534595c8c7e03305
4.1 G     /apps/hbase/data/data/default/tsdb/4356d552eacfa526df24b400fd8007c7
9.9 G     /apps/hbase/data/data/default/tsdb/4f2c1cc6d7c650c3f822136614921076
5.9 G     /apps/hbase/data/data/default/tsdb/57eb8cbb2099bd6e1746cd4c8e007207
6.8 G     /apps/hbase/data/data/default/tsdb/5e26da2eacba074a132edefca38017a3
1.2 T     /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116
4.1 G     /apps/hbase/data/data/default/tsdb/6ccc256f9721216bc9afa29e7d056bd4
7.5 G     /apps/hbase/data/data/default/tsdb/6d43c524d221f3f54356e716c8f8849d
6.1 G     /apps/hbase/data/data/default/tsdb/70f9f9ee045cde2823cf8ab485662a63
3.2 G     /apps/hbase/data/data/default/tsdb/75f232ce81d5de2efb3f763d09d9c76f
8.7 G     /apps/hbase/data/data/default/tsdb/7b4f9c05d64151d3f54558c70e5e9811
5.5 G     /apps/hbase/data/data/default/tsdb/7fa9d913fd9a059733e9bb7a31b03e22
8.5 G     /apps/hbase/data/data/default/tsdb/840f9f977262f1fbf9f4c06a8014c44b
3.0 G     /apps/hbase/data/data/default/tsdb/9a3e7dbad294134eae934af980dd8c1c
6.5 G     /apps/hbase/data/data/default/tsdb/a161383c9ae7df3c0cb15da093312908
4.0 G     /apps/hbase/data/data/default/tsdb/b5e22e68e8f93ac50c9d9d3a3ca3f029
7.1 G     /apps/hbase/data/data/default/tsdb/c9a9f105e8c4a44e8fd9172131b929d0
4.7 G     /apps/hbase/data/data/default/tsdb/cc27e8c5a020291b7b3ac010dc50e25b
5.6 G     /apps/hbase/data/data/default/tsdb/cc7f6645bac92ec514f8545fdb39b617
9.3 G     /apps/hbase/data/data/default/tsdb/ce6426bfac53fb06fe8d320f3de150ee
1.8 G     /apps/hbase/data/data/default/tsdb/d2ee226094556e6a90599e91bcba70f4
6.8 G     /apps/hbase/data/data/default/tsdb/df6e88e27be5d3f0759d477812ab9277
3.0 G     /apps/hbase/data/data/default/tsdb/efde9cca49c0f23a4e39e80e4040ac5a
5.9 G     /apps/hbase/data/data/default/tsdb/f82790d90791b30aacd2bd990a1d4655
7.5 G     /apps/hbase/data/data/default/tsdb/fcbb1b8f04f4e74a80882ef074244173
4.8 G     /apps/hbase/data/data/default/tsdb/fece7565715c791028581022b70672e7

Attacked the two largest offenders, found that all of the space was in the .tmp folder, and recovered all of the lost disk space:

hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116/.tmp
hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116/recovered.edits
hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099/.tmp
hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099/recovered.edits

Then went into the Ambari configuration for HBase and added the setting that increases hfile.index.block.max.size from the default of 128 KB to 1024 KB. Will keep monitoring and will report any further anomalies.
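For reference, a sketch of how that property would look in hbase-site.xml (1024 KB expressed in bytes; in Ambari this is added under the HBase configs as a custom hbase-site property):

<property>
  <name>hfile.index.block.max.size</name>
  <!-- increased from the default 131072 (128 KB) to 1048576 (1024 KB) -->
  <value>1048576</value>
</property>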
... View more
12-17-2015
01:41 PM
2 Kudos
Background: Customer has an 8-node cluster on AWS with ephemeral storage, 5 of which are HBase nodes. OpenTSDB and Grafana were installed on the cluster as well. Customer was ingesting time series data with OpenTSDB at a rate of ~50k records/second.
Symptom: In a span of a couple of hours, the disk utilization of HDFS skyrocketed from a few hundred GB to over 6 TB, all of it in HBase / OpenTSDB. While troubleshooting (with all data ingest turned off, OpenTSDB stopped, and just HBase running), the disk utilization continued to grow unabated and out of control, dozens of GB per minute, even though OpenTSDB was completely shut down.
... View more
Labels:
- Apache HBase