Member since: 02-02-2016
Posts: 31
Kudos Received: 41
Solutions: 6

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3810 | 03-03-2016 06:05 AM
 | 2030 | 03-01-2016 12:30 PM
 | 25015 | 02-23-2016 09:19 AM
 | 1384 | 02-18-2016 09:12 AM
 | 10839 | 02-15-2016 09:49 AM
01-27-2020
12:16 PM
Thanks for the information. Using this command caused serious performance degradation when writing to HDFS: every 128 MB block took about 20-30 seconds to write. The issue was the attempt to compress the tar file, so it's better to remove the "z" flag from tar and skip compression. To give some numbers: writing almost 1 TB of data from local disk to HDFS took 13+ hours with compression (z), and it would actually eventually fail due to Kerberos ticket expiration. After removing the "z" flag, the copy of the same 1 TB of data to HDFS took less than an hour!
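For reference, a minimal sketch of the two variants (paths and file names are placeholders; the uncompressed form is the fast one described above):

```bash
# Compressed: gzip runs single-threaded on the client and becomes the
# bottleneck (~20-30 s per 128 MB block in our case). Paths are hypothetical.
tar -czf - /local/data | hdfs dfs -put - /user/me/backup.tar.gz

# Uncompressed: drop the "z" flag; the same ~1 TB copy finished in
# under an hour instead of 13+ hours.
tar -cf - /local/data | hdfs dfs -put - /user/me/backup.tar
```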
06-12-2016
07:35 PM
The easiest way to do it: just log in to Ambari with these credentials: User: admin, Password: 4o12t0n. Cheers!
02-14-2017
04:54 PM
1 Kudo
TensorFlowOnSpark by Yahoo:
http://yahoohadoop.tumblr.com/post/157196317141/open-sourcing-tensorflowonspark-distributed-deep
https://github.com/yahoo/TensorFlowOnSpark
https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_YARN
https://github.com/yahoo/TensorFlowOnSpark/wiki/Conversion
https://github.com/yahoo/TensorFlowOnSpark/tree/master/examples/slim
https://github.com/tensorflow/models/tree/master/slim
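A rough sketch of launching one of the examples on YARN, based on the GetStarted_YARN wiki above; the script name, paths, and arguments here are placeholders, so check the wiki for the exact invocation:

```bash
# Hypothetical paths and arguments; see the GetStarted_YARN wiki for the
# real command for your TensorFlowOnSpark version.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --py-files TensorFlowOnSpark/tfspark.zip \
  TensorFlowOnSpark/examples/mnist/spark/mnist_spark.py \
  --images mnist/csv/train/images \
  --labels mnist/csv/train/labels \
  --mode train \
  --model mnist_model
```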
01-03-2019
01:25 PM
1 Kudo
Hi, I'd like to share a situation we encountered where 99% of our HDFS blocks were reported missing and we were able to recover them.

We had a system with 2 NameNodes with high availability enabled. For some reason, under the data folders of the DataNodes (e.g. /data0x/hadoop/hdfs/data/current) we had 2 block pool folders listed (an example of such a folder is BP-1722964902-1.10.237.104-1541520732855). One folder contained the IP of NameNode 1 and the other contained the IP of NameNode 2. All the data was under the block pool of NameNode 1, but inside the VERSION files of the NameNodes (/data0x/hadoop/hdfs/namenode/current/) the block pool ID and the namespace ID were those of NameNode 2, so the NameNode was looking for blocks in the wrong block pool folder. I don't know how we got to the point of having 2 block pool folders, but we did.

To fix the problem and get HDFS healthy again, we just needed to update the VERSION file on all the NameNode disks (on both NN machines) and on all the JournalNode disks (on all JN machines) to point to NameNode 1. We then restarted HDFS and made sure all the blocks were reported and there were no more missing blocks.
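For illustration, a sketch of how to inspect the IDs involved; the disk paths are hypothetical, and the relevant fields live in the VERSION files mentioned above:

```bash
# List block pool folders on a DataNode disk; with the problem described
# above, two BP-* folders show up instead of one. Paths are hypothetical.
ls /data01/hadoop/hdfs/data/current/

# Inspect a NameNode VERSION file; blockpoolID and namespaceID must
# match the block pool that actually holds the data.
cat /data01/hadoop/hdfs/namenode/current/VERSION

# After fixing the VERSION files on all NN and JN disks and restarting,
# verify that no blocks are reported missing.
hdfs fsck / | tail
```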
05-30-2016
06:03 AM
Hi @Rushikesh Deshmukh, this post provides a good overview for quickly comparing the different approaches: http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/

I used distcp as well, but it did not work for me: the data was copied, but I ran into issues when running hbck afterwards. If you want to create a backup on the same cluster, CopyTable and snapshots are very easy; for inter-cluster backups, snapshots work well. A quick sketch of the snapshot approach is below. Let me know if you need more details.

This link is also very useful and clear: http://hbase.apache.org/0.94/book/ops.backup.html
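As a quick illustration of the snapshot approach (table names, snapshot names, and the target cluster address are placeholders):

```bash
# In the HBase shell: take a snapshot of a table.
# > snapshot 'mytable', 'mytable-snapshot-20160530'
# > list_snapshots

# Export the snapshot to another cluster (run from the OS shell,
# not the HBase shell); hdfs://cluster2:8020 is a placeholder.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot mytable-snapshot-20160530 \
  -copy-to hdfs://cluster2:8020/hbase \
  -mappers 16
```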