Where does Hadoop store its data?

New Contributor

Where does Hadoop store its data?

5 REPLIES

Re: Where does Hadoop store its data?

New Contributor

HDFS (Hadoop Distributed File System) is Hadoop's storage layer. It stores very large files across a cluster of commodity hardware, and it is designed to hold a small number of large files rather than a huge number of small files. It stores data reliably even when hardware fails.

In HDFS, files are split into blocks that are distributed across the cluster and replicated according to the replication factor. The default replication factor is 3, so each block is stored three times. With the default rack-aware placement policy, the first replica is stored on the writer's local DataNode (when the client runs on one), the second on a DataNode in a different rack, and the third on a different DataNode in the same rack as the second. This ensures the data is not lost even if an entire rack fails, while limiting cross-rack write traffic.

The NameNode keeps the metadata: the blocks that make up each file, the DataNodes holding their replicas, the replication factor, and other details. The DataNodes store the actual data and perform block creation, deletion, and replication as instructed by the NameNode.
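
To make the block and replica arithmetic above concrete, here is a minimal sketch in Python. It assumes a 128 MiB block size (the `dfs.blocksize` default in recent Hadoop releases; the thread itself does not state one), and `block_layout` is a hypothetical helper written for illustration, not part of any Hadoop API.

```python
import math

# Assumption: 128 MiB blocks (dfs.blocksize default in recent Hadoop releases).
BLOCK_SIZE = 128 * 1024 * 1024
# Default replication factor mentioned in the reply above.
REPLICATION = 3


def block_layout(file_size_bytes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (number of blocks, total block replicas, raw bytes stored)."""
    if file_size_bytes == 0:
        return 0, 0, 0
    num_blocks = math.ceil(file_size_bytes / block_size)
    # The last block only occupies as much disk as the data it holds, so raw
    # usage is file size times replication, not blocks times block size.
    return num_blocks, num_blocks * replication, file_size_bytes * replication


blocks, replicas, raw = block_layout(1024 ** 3)  # a 1 GiB file
print(blocks, replicas, raw // 1024 ** 2)  # prints: 8 24 3072
```

So a 1 GiB file would be split into 8 blocks, stored as 24 block replicas, and would use roughly 3 GiB of raw disk across the cluster under these assumptions.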

Re: Where does Hadoop store its data?

Contributor

@Harshali Patel

HDFS data is distributed across the DataNodes and stored on each node's local file system. You can configure the list of storage directories with dfs.datanode.data.dir in hdfs-site.xml.

dfs.datanode.data.dir - Determines where on the local filesystem an HDFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
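
As a rough illustration of the comma-delimited format described above, the following Python sketch reads `dfs.datanode.data.dir` from an hdfs-site.xml document and splits it into individual directories. The sample XML and its paths (`/data/1/dfs/dn`, `/data/2/dfs/dn`) are made up for the example, and `datanode_data_dirs` is a hypothetical helper, not a Hadoop tool. It does not handle `file://` prefixes or storage-type tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; the directory paths are examples only.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
  </property>
</configuration>
"""


def datanode_data_dirs(xml_text):
    """Return the directories listed in dfs.datanode.data.dir, or [] if unset."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.datanode.data.dir":
            value = prop.findtext("value", "")
            # Split the comma-delimited list and drop empty entries.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


print(datanode_data_dirs(SAMPLE_HDFS_SITE))
# prints: ['/data/1/dfs/dn', '/data/2/dfs/dn']
```

On a real DataNode, the block files would live under each of the listed directories, typically one directory per physical disk.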

Re: Where does Hadoop store its data?

Super Collaborator

These are spam accounts, by the way. Look at all the "answers" from the other users for every question, and they all link back to dataflair's website.

Re: Where does Hadoop store its data?

Contributor

Oops! It looks like these are spam accounts. Thanks @Jordan Moore for the info.

Re: Where does Hadoop store its data?

New Contributor

Btw, they spam us with granular, top-notch resources. I think it's worth the spam. ^.^