Why Hbase database integrated in hadoop and what is the advantage ?

Expert Contributor

Why is the HBase database integrated into Hadoop, and what is the advantage?

9 REPLIES

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Super Collaborator

The main question is: what makes you think you 'need to use' HBase? If you want MongoDB or Cassandra, what stops you from using it?

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Expert Contributor

These are the points I considered for HBase in Hadoop:

1) It is the only NoSQL database that is integrated with HDFS (the Hadoop file system) for storing its data.

2) We can scale the cluster horizontally by adding more machines, for more storage and better performance, without bringing down the cluster or its services.

3) It supports batch MapReduce jobs on top of HDFS.

4) It offers real-time query capabilities as a key/value store (organized into column families), and it even allows querying individual records.

Kindly add any other features that should be included.

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Super Collaborator

OK, but what do you want answered with your post? You don't have to use HBase; you can use Cassandra or MongoDB if you want. With both you can build a scalable cluster and integrate it with other jobs running on your Hadoop platform. By 'real time' I guess you mean low latency? And by key-value store I guess you really mean one without a strict schema (unlike SQL DBs)?

Just keep in mind that Cassandra is a BigTable-style DB, like HBase, whereas MongoDB is a document DB that stores JSON documents. So MongoDB implements a different concept.

  • Do you need Cassandra or MongoDB? Then describe your scenario and get to the details of your issue, so that we can help you.
  • Do you need a supported DB within Hadoop and Hortonworks? My proposal: stick to HBase.
  • Do you just want to challenge the decision for HBase within Hadoop and Hortonworks? No issue with that, but I am out in this case, as this will be subjective anyway.

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Expert Contributor

Thanks for your comments; they made me rephrase my question. :)

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Super Guru

HBase runs on your existing HDFS infrastructure and is faster, more secure, and more scalable than those other NoSQL choices. You also get Phoenix for SQL on top of that.

HBase is the best choice especially if you have an existing Hadoop cluster.

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Not sure I ever thought I'd see "traditional" and "NoSQL" in the same sentence. ;-)

Seriously, @Timothy Spann is correct that HBase & Phoenix are part of HDP (thus easily stood up and managed if you run HDP) and available for your use, AND @Harald Berghoff is also correct that you don't "need to use" HBase -- if your problem is best solved with another NoSQL database (notice I didn't say "traditional" -- wasn't HBase here before MongoDB anyway? hehe! I ~think~ they started in 2006 and 2007, respectively), then by all means leverage it.

Good luck and happy Hadooping (or whatever Big Data-y framework you are using)!!

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Expert Contributor

My concern is why HBase is the only database we need to use in Hadoop!

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Expert Contributor

Thanks for pointing out the problem with the question; I have corrected it. :)

Re: Why Hbase database integrated in hadoop and what is the advantage ?

Expert Contributor

These are the points I considered for HBase in Hadoop:

1) It is the only NoSQL database that is integrated with HDFS (the Hadoop file system) for storing its data.

2) We can scale the cluster horizontally by adding more machines, for more storage and better performance, without bringing down the cluster or its services.

3) It supports batch MapReduce jobs on top of HDFS.

4) It offers real-time query capabilities as a key/value store (organized into column families), and it even allows querying individual records.

5) Real-time data replication using Kafka and HBase in Hadoop.

Kindly add any other features that should be included.
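
To make point 4 a bit more concrete, here is a rough sketch of what "key/value store with column families" and "query for an individual record" mean in practice. This is not the HBase client API: `TinyWideColumnTable` and its `put`, `get`, and `scan` methods are hypothetical names for a toy in-memory model, and the row keys and values are made up for illustration. It only mirrors the general idea that rows are kept sorted by row key, so single-row lookups and row-key range scans are cheap.

```python
from bisect import bisect_left, insort


class TinyWideColumnTable:
    """Toy in-memory model of a table with column families (hypothetical, not HBase)."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed up front
        self._row_keys = []            # row keys kept in sorted order
        self._rows = {}                # row key -> {(family, qualifier): value}

    def put(self, row_key, family, qualifier, value):
        """Store one cell, addressed by row key, column family and qualifier."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if row_key not in self._rows:
            insort(self._row_keys, row_key)  # keep row keys sorted on insert
            self._rows[row_key] = {}
        self._rows[row_key][(family, qualifier)] = value

    def get(self, row_key):
        """Point lookup of a single record; returns {} if the row is absent."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_row, stop_row):
        """Range scan over sorted row keys: start inclusive, stop exclusive."""
        lo = bisect_left(self._row_keys, start_row)
        hi = bisect_left(self._row_keys, stop_row)
        return [(key, dict(self._rows[key])) for key in self._row_keys[lo:hi]]


# Made-up example data.
table = TinyWideColumnTable(["info"])
table.put("user#002", "info", "name", "Bob")
table.put("user#001", "info", "name", "Alice")
table.put("user#001", "info", "city", "Pune")
table.put("user#010", "info", "name", "Carol")

single_record = table.get("user#001")             # lookup of one row by key
key_range = table.scan("user#001", "user#010")    # rows user#001 and user#002
```

In this model, how well the queries work depends mostly on how the row key is designed, since both the point lookup and the scan go through the sorted row keys.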
