Member since
01-18-2018
34
Posts
3
Kudos Received
0
Solutions
06-25-2018
06:49 AM
1 Kudo
@karthik nedunchezhiyan Always stick to the Hortonworks documentation when preparing the environment; people tend to go off script and run into problems. Our assumption was that the HW documentation had been followed and that the issues encountered were software-integration related, but now that we learn the passwordless configuration was skipped, who knows what else is off under the hood.
06-20-2018
12:42 PM
You may want to take a step back and look at your access patterns. If this is the only way you are planning on accessing this data, you can move key2 into the value section since you are not searching on it. If you will sometimes be searching on key1+key2 then it makes sense to keep it as a key. You shouldn't have a problem with the value being null but usually in a key-value system there is some sort of value you are looking for, so be sure to understand what it is you're searching for in this scenario and structure your data properly.
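A minimal sketch of the trade-off above, using an in-memory dict in place of an HBase table (the separator byte, keys, and values are all hypothetical):

```python
# Hypothetical rowkey design sketch; a dict stands in for an HBase table.

SEP = b"\x00"  # assumed separator byte between key parts

def make_rowkey(key1: bytes, key2: bytes) -> bytes:
    """Composite rowkey, for when you sometimes search on key1+key2."""
    return key1 + SEP + key2

table = {}  # rowkey -> value

# If key2 is never searched on, it could instead live in the value;
# here we keep it in the key to support both access patterns.
table[make_rowkey(b"user42", b"2018-06-20")] = b"event-payload"

def prefix_scan(table, prefix: bytes):
    """Emulate an HBase prefix scan: all rows whose key starts with prefix."""
    return {k: v for k, v in sorted(table.items()) if k.startswith(prefix)}

# Searching on key1 alone still works with the composite key:
matches = prefix_scan(table, b"user42" + SEP)
```

The point of the composite key is that a prefix scan on key1 alone remains cheap, while key1+key2 lookups stay exact.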
06-16-2018
02:43 AM
@Josh Elser Thanks for your reply! Can you please explain how the query matcher in HBase works, or share a good link about it? That would help me understand. Thank you.
06-12-2018
09:19 PM
2 Kudos
The zookeeper.out file contains the log for the ZooKeeper server. You can refer to the thread below to enable log rotation for ZooKeeper; this way you can avoid overly large log files. https://community.hortonworks.com/questions/39282/zookeeper-log-file-not-rotated.html
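For reference, rotation is enabled through ZooKeeper's log4j configuration; a fragment along the lines of the stock conf/log4j.properties (the file size and backup count here are illustrative starting points):

```properties
# Route the root logger to a rolling file appender instead of zookeeper.out
zookeeper.root.logger=INFO, ROLLINGFILE

log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=INFO
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/zookeeper.log
# Roll at 10 MB and keep up to 10 old files (illustrative values)
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
```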
06-07-2018
03:50 AM
@Geoffrey Shelton Okot Thanks, now I understand.
06-05-2018
02:34 PM
First, remember you need an odd number of JournalNodes (3, 5, 7, etc.). All the JournalNodes should be in sync. Tail the log file across all the JNs to verify they are on the same edit:
tail -f /var/log/hadoop/hdfs/hadoop-hdfs-journalnode*.log
Note: if you comment on this post, make sure you tag my name. And if you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
05-30-2018
11:39 AM
1 Kudo
But I am not using Ambari; is there any other way to add it without Ambari?
05-30-2018
09:44 AM
@Artem Ervits Yes, we are using SSDs. Is mileage the only thing affected by the multi-WAL feature?
05-22-2018
07:30 PM
JMX metrics can provide you with compaction-related parameters: http://<region server>:16030/jmx
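A small sketch of consuming that endpoint, assuming the /jmx JSON shape with a top-level "beans" array; the metric names in the sample payload are illustrative, not an exhaustive list:

```python
import json

# Sample shaped like a /jmx response; bean and metric names are illustrative.
sample = json.dumps({
    "beans": [
        {"name": "Hadoop:service=HBase,name=RegionServer,sub=Server",
         "compactionQueueLength": 3,
         "flushQueueLength": 0}
    ]
})

def compaction_metrics(payload: str) -> dict:
    """Pull compaction-related fields out of a /jmx JSON response."""
    out = {}
    for bean in json.loads(payload).get("beans", []):
        for key, val in bean.items():
            if "compaction" in key.lower():
                out[key] = val
    return out

metrics = compaction_metrics(sample)
```

Against a live region server you would fetch the same JSON from http://<region server>:16030/jmx and feed it to the same function.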
05-12-2018
05:52 AM
It depends on your usage pattern and on how acceptable data loss is for you. How many writes (puts) do you have in your scenario? If you are unsure, I would start with the default config and tweak it if you see poor performance. For more information on the caching and flushing parameters: http://hadoop-hbase.blogspot.de/2014/07/about-hbase-flushes-and-compactions.html http://gbif.blogspot.de/2012/07/optimizing-writes-in-hbase.html
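If you do end up tweaking, these are two of the write-path knobs most often adjusted in hbase-site.xml; the values shown are the usual defaults, kept here only as a starting point:

```xml
<!-- Illustrative starting points; tune only after observing performance. -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value> <!-- 128 MB: memstore size at which a flush triggers -->
</property>
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.4</value> <!-- fraction of heap all memstores together may use -->
</property>
```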
03-30-2018
12:21 PM
1 Kudo
---->How will the region server use heap, and which parameters occupy heap memory?

Region server heap consumption mainly depends on three things:
- Block cache (a buffer maintained in heap for reads)
- Memstore (a buffer maintained in heap for writes and flushes)
- Other objects created within the region server during various operations

The following two parameters control the maximum percentage of heap the block cache and the memstores may consume:
hfile.block.cache.size
hbase.regionserver.global.memstore.size

These links will help in understanding region server configuration for block cache/memstore and region server sizing:
https://hbase.apache.org/book.html#perf.rs.memstore.size
https://hbase.apache.org/book.html#ops.capacity.regions.count
https://hbase.apache.org/book.html#block.cache.usage

-----> Will the region server use only the 16 GB heap, or other RAM as well?

The region server will consume at most 16 GB of heap plus the -XX:MaxDirectMemorySize you have configured. The JVM has a kind of memory called direct memory, distinct from the normal JVM heap, which can run out. You can increase the direct buffer memory either by increasing the maximum heap size (see the JVM heap size above), which raises both the maximum heap and the maximum direct memory, or by increasing only the maximum direct memory using -XX:MaxDirectMemorySize. For example, adding the following parameter to the Java application startup increases the maximum direct memory size to 256 megabytes:
-XX:MaxDirectMemorySize=256m
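The arithmetic above can be sketched as a back-of-envelope calculation (the 0.4 fractions are the usual HBase defaults, assumed here):

```python
# Rough sizing for the figures in this thread: a 16 GB heap plus a
# 256 MB direct-memory allowance. Fractions are the assumed defaults.

HEAP_BYTES = 16 * 1024**3            # -Xmx16g
BLOCK_CACHE_FRACTION = 0.4           # hfile.block.cache.size (default 0.4)
MEMSTORE_FRACTION = 0.4              # hbase.regionserver.global.memstore.size
DIRECT_BYTES = 256 * 1024**2         # -XX:MaxDirectMemorySize=256m

block_cache_max = HEAP_BYTES * BLOCK_CACHE_FRACTION   # heap the block cache may use
memstore_max = HEAP_BYTES * MEMSTORE_FRACTION         # heap all memstores may use

# Process footprint is heap + direct memory (JVM overhead not counted here).
total_process_max = HEAP_BYTES + DIRECT_BYTES

gb = 1024**3
```

So with these defaults, the block cache and the memstores may each grow to about 6.4 GB of the 16 GB heap, and the process as a whole tops out a bit above the heap size.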
02-21-2018
12:25 PM
The basic idea is that the ZooKeeper server will tell the client that it is lagging behind. In your scenario, s3 would be a follower of s5 (the leader). Any change is requested from the leader (s5), which only reports success to the client once the followers have acknowledged the change. But of course the network between s3 and s5 can fail during this phase; after a timeout, s5 will drop s3 as a follower and still send a success to the client. At this point s3 also notices that it is no longer a follower, because either:
1. the network failed in both directions and s3 has also missed the heartbeat from s5, so it now knows the leader is gone; or
2. the network failed only in the direction s5->s3, so s3 has still processed the update, but s5 drops s3 as a follower because it never receives an acknowledgement. s5 then stops sending heartbeats and sync messages, so for s3 the same situation as in 1. occurs.
Beyond this, the most interesting part is the new leader election when communication between the current leader and a follower breaks: it is important to ensure that two leaders are never elected.
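A toy model of the commit rule described above; note that in practice ZooKeeper reports success once a quorum (a majority of the ensemble), not necessarily every follower, has acknowledged the change:

```python
# Toy quorum-commit rule: the leader reports success to the client only
# once a majority of the ensemble (leader included) has acknowledged.

def quorum_commit(ensemble_size: int, acks: set) -> bool:
    """acks = ids of servers, leader included, that acknowledged the change."""
    return len(acks) > ensemble_size // 2

# 5-node ensemble with s5 leading; s3's ack is lost because its link failed.
acks = {"s5", "s1", "s2", "s4"}          # s3 missing
committed = quorum_commit(5, acks)       # 4 of 5 is still a quorum
```

This is why the client can be told "success" even while s3 is partitioned: losing one follower does not break the majority.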
02-13-2018
10:44 PM
1 Kudo
I'm guessing you've already seen http://hbase.apache.org/0.94/book/secondary.indexes.html, which basically tells you that you'll need a second table whose rowkey is your "secondary index" and which is only used to find the rowkey needed in the actual table. The coprocessor strategy, as I understand it, just formalizes and automates the "dual-write secondary index" strategy. Good luck and happy Hadooping!
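The dual-write strategy can be sketched with two in-memory dicts standing in for the two tables (all names and records here are hypothetical):

```python
# Dual-write secondary index sketch: every put to the main table also
# writes a row to an index table whose rowkey is the secondary value.

main_table = {}    # rowkey -> record
index_table = {}   # secondary value -> main-table rowkey

def put(rowkey: str, record: dict, secondary_field: str):
    """Write the record and keep the secondary index in step (dual write)."""
    main_table[rowkey] = record
    index_table[record[secondary_field]] = rowkey

def get_by_secondary(value: str):
    """Two lookups: index table for the rowkey, then the main table."""
    rowkey = index_table.get(value)
    return main_table.get(rowkey) if rowkey is not None else None

put("row-001", {"email": "a@example.com", "name": "Ann"}, "email")
found = get_by_secondary("a@example.com")
```

A coprocessor-based index automates exactly the `put` step here, so application code cannot forget the second write.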
01-20-2018
06:01 AM
Thanks, bro...
01-29-2018
02:32 PM
1 Kudo
@karthik nedunchezhiyan A simplified explanation of the process: whenever NameNode HA is set up, there are two NNs, one Active and one Standby.
1) DataNodes send heartbeats to both NNs, so both the Active and the Standby know where the blocks are placed.
2) JournalNodes maintain the shared edits. Whenever there is a write operation, the edits are updated in the JNs, not in the Active or Standby NN directly. Once the edits are updated by the JNs, the Standby updates its FsImage.
3) This way, at any point in time both the Active and the Standby have the same up-to-date FsImage.
4) ZooKeeper is responsible for holding the lock for the Active NN.
5) There are two ZooKeeper Failover Controllers, responsible for monitoring the health of the NNs.
6) Whenever ZooKeeper stops receiving communication from a ZooKeeper FC, it releases the lock; the lock is then acquired by the other ZooKeeper FC, and the Standby NN becomes the Active NN.
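Steps 4-6 above can be sketched as a toy lock service standing in for the ZooKeeper ephemeral lock (the class and names are hypothetical, not the real ZKFC API):

```python
# Toy model of the lock handoff: whichever ZKFC holds the lock has the
# Active NN; if its heartbeat stops, the lock is released and the
# standby's ZKFC acquires it.

class LockService:
    """Stand-in for the ZooKeeper ephemeral lock znode."""
    def __init__(self):
        self.holder = None

    def acquire(self, zkfc: str) -> bool:
        """First caller wins; everyone else must wait for a release."""
        if self.holder is None:
            self.holder = zkfc
            return True
        return False

    def heartbeat_missed(self, zkfc: str):
        """ZooKeeper times out the session and deletes the ephemeral lock."""
        if self.holder == zkfc:
            self.holder = None

zk = LockService()
zk.acquire("zkfc-nn1")            # nn1's ZKFC wins: nn1 is Active
zk.heartbeat_missed("zkfc-nn1")   # nn1 dies or is partitioned
failover = zk.acquire("zkfc-nn2") # standby's ZKFC grabs the lock
```

The ephemeral nature of the real znode is what makes step 6 automatic: a dead session deletes the lock without any explicit release call.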