Member since: 07-08-2013
Posts: 35
Kudos Received: 11
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5774 | 08-05-2015 08:18 AM
 | 2355 | 06-18-2015 04:33 PM
 | 16120 | 05-10-2015 08:10 PM
 | 51778 | 05-10-2015 07:34 PM
 | 5146 | 05-08-2015 09:09 AM
04-02-2019
07:48 AM
Hi Vinod, Can you please start a different thread and share your Master logs with us? JMS
04-07-2017
05:41 AM
In fact, the issue was that HBase was not installed.
02-20-2017
07:09 PM
Hi, When I run fsck on my cluster, it reports several under-replicated blocks with a target replication of 3, even though I changed dfs.replication to 2 on the NameNode, the DataNodes, and the client servers, and also changed mapred.submit.replication to 2. I also tried:
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <final>true</final>
</property>
I also restarted all the services on my cluster, including Oozie. Looking at the configuration of one of the running jobs, I still see the following properties with a replication factor of 3: mapreduce.client.submit.file.replication, s3.replication, kfs.replication, dfs.namenode.replication.interval, ftp.replication, s3native.replication
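One note on that setup: dfs.replication only applies to files written after the change; files that already exist keep their previous factor (hdfs dfs -setrep -w 2 /path can fix those), and the job client can override the value at submit time. As a minimal sketch (assuming the per-job properties are what still report 3), both values can be pinned in the job configuration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class ReplicationConf {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Replication factor for files the job writes.
        conf.setInt("dfs.replication", 2);
        // Replication factor for the job files staged at submit time.
        conf.setInt("mapreduce.client.submit.file.replication", 2);
        Job job = Job.getInstance(conf, "replication-check");
        System.out.println(job.getConfiguration().get("dfs.replication"));
      }
    }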
08-05-2015
08:18 AM
Hi Asif, You will need to write some code for that. Create a small table, put some data in it, and call the bulk delete. Here is an example: https://github.com/apache/hbase/blob/master/hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example/TestBulkDeleteProtocol.java You might be able to re-use most of this code. JM
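If the coprocessor endpoint is more than you need, a plain client-side scan-and-delete achieves the same result for small tables. A minimal sketch, assuming an HBase 1.x client and a hypothetical table name "test_table" (a simpler alternative, not the BulkDeleteProtocol from the linked test):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;

    public class ScanAndDelete {
      public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("test_table"))) {
          // Collect the row keys from a full scan, then delete them in one batch.
          List<Delete> deletes = new ArrayList<>();
          try (ResultScanner scanner = table.getScanner(new Scan())) {
            for (Result result : scanner) {
              deletes.add(new Delete(result.getRow()));
            }
          }
          table.delete(deletes);
        }
      }
    }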
06-18-2015
04:33 PM
1 Kudo
Hi Neil, As you said in your message, it all depends on what you want to run; however, here are some guidelines. I'm not talking about the "data" drives, since you can have as many as you want, with the size you need.
1) CM requirements will apply to CDH, like for the /var folder.
2) CM will start to alert you when the Journal Node, NameNode, and other process directories have less than 10GB free. Therefore, allocating at least 20GB per service for the "meta" (logs, configs, binaries, etc.) is a good idea. So if you have YARN + DN + Spark on a node, give them at least 60GB of disk space for that.
3) Master processes will use space based on the size of the cluster. Indeed, the bigger the cluster is, the more data, the more blocks, and the more space used in the NN and JN directories. So for clusters bigger than 30 nodes, you might want to think about giving them a bit more.
Now, it is not recommended to run any service on the OS disk (the disk, not just the partition). And since disks are getting bigger and bigger, you might end up with something like 1TB available on the partition for the CM agent + CDH services (on worker nodes). If that's the case, I don't think you should really worry about the available space; just share it between the different mount points (if split into partitions). Let me know if I can provide any more details or information, or if this doesn't answer your question. JM
05-10-2015
04:04 PM
Hi Jean-Marc, Thanks for your thorough analysis! Making sure the HFiles stay around makes perfect sense, so it is just a permissions issue. And hopefully this will be fixed with HBase 1.2 then? I will use a permissions workaround in the meantime. Best regards, Jost
05-06-2015
09:59 PM
Hi, The "clean" was is to decommission the node to make sure nothing is going into this node anymore and not any block is at risk. However, if what you want to have is 2 blocks only, then increasing the replication to 3, waiting for all the blocks to be fully replicated and then just stopping a datanode before moving it will allow you to always have the minimum of 2 blocks you are looking for. And this will be way faster. But not as clean as doing the decommission. Also, if you move the datanode fast enought, most of the blocks on it will simply be re-used when it will be re-connected. JM