Sharding in HDFS

Rising Star

Hi,

Is it possible to achieve sharding in HDFS, given that HDFS already has its own replication mechanism? What if I want to use a sharding mechanism in HDFS? If it is possible, how can one achieve it?

4 REPLIES

Contributor

I would answer this question by asking what you are trying to achieve. Sharding (as I understand it) is used in traditional databases to do some of the distributed stuff that Hadoop does ... but in a different way.

The "horizontal partitioning" in shards sounds similar to column-oriented storage. See ORC files in Hive.

The "distributing tables across servers to spread the load" part of sharding is what HDFS does natively.
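To make the "HDFS does this natively" point concrete, here is a rough sketch of what happens to a file when it is written to HDFS: it is cut into fixed-size blocks, and each block is stored on several DataNodes. The 128 MB block size and replication factor of 3 are the stock defaults for `dfs.blocksize` and `dfs.replication`. Everything else is an assumption for illustration: the function names, the node names `dn1` to `dn4`, and especially the round-robin placement, which stands in for the rack-aware policy real HDFS uses.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # default dfs.blocksize (128 MB) in Hadoop 2.x and later
REPLICATION = 3                 # default dfs.replication


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) for each block of a file that is file_size bytes long."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)  # the last block may be shorter
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Toy placement (hypothetical): replicas of block b go to consecutive nodes.

    Real HDFS placement is rack-aware; this only shows how blocks end up
    spread over the cluster.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    return {
        b: [datanodes[(b + r) % len(datanodes)] for r in range(replication)]
        for b in range(num_blocks)
    }


# A 300 MB file on a hypothetical four-node cluster.
blocks = split_into_blocks(300 * 1024 * 1024)
nodes = ["dn1", "dn2", "dn3", "dn4"]
layout = place_replicas(len(blocks), nodes)

for block_id, replicas in layout.items():
    offset, length = blocks[block_id]
    print(f"block {block_id}: offset={offset} length={length} -> {replicas}")
```

The 300 MB file ends up as three blocks (two of 128 MB and one of 44 MB), each held by three different nodes. Note that the split is by byte offset, not by a key in the data, which is the main difference from key-based sharding in a database.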

If you are trying to do in Hadoop what you do in a relational database, then I would advise that you take a deeper look at the way that Hadoop works.

It is also possible that I've misunderstood your question, and what you are trying to achieve.

Super Guru

@Justin Watkins

I would not associate sharding with traditional RDBMS databases, where it is mostly the exception, but with NoSQL databases such as MongoDB, where it is mostly the rule.

@Viraj Vekaria

What are you trying to achieve?

Rising Star

@Constantin Stanca: I am trying to find out whether HDFS does sharding automatically. If not, how can we achieve it? Going by Justin's answer, though, I think it is done natively.

So basically I just want sharding, no matter how it is done!
