
HDFS for Installing Hadoop

New Contributor

Hi everyone, hope you are doing well. I have a virtual machine and I've added a new ext4 disk. I want to convert this disk to HDFS (the Hadoop file system). Can anyone please provide me with information on how I can do this?


3 REPLIES


If you want the entire HDP stack (which includes major Hadoop components such as HDFS) for learning or tutorial purposes, you can download and install the Hortonworks Sandbox:

https://hortonworks.com/products/sandbox/

There is another article that covers a custom install of HDP:

https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto....

New Contributor

Thanks Namit. So I can use ext4 disks to store Hadoop files? I was wondering whether there is any option to convert an existing file system to the Hadoop file system through an rpm or similar, like we do for ASM.


Yes, you can use ext4 disks to store HDFS data. Once you have installed HDP, you can set the property dfs.datanode.data.dir in hdfs-site.xml to point to any locations where you want the data to be stored.
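
For example, assuming the new ext4 disk is mounted at /grid/hadoop (a hypothetical mount point), a minimal hdfs-site.xml entry might look like the sketch below:

  <!-- hdfs-site.xml (sketch): point the DataNode at a directory on the ext4 mount -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <!-- /grid/hadoop is an assumed mount point for the new ext4 disk; multiple
         directories can be listed as a comma-separated set of file:// URIs -->
    <value>file:///grid/hadoop/hdfs/data</value>
  </property>

After changing this property (typically via Ambari on HDP), restart the affected DataNodes so the new data directory is picked up.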

The post below discusses the same topic:

https://community.hortonworks.com/questions/89786/file-uri-required-for-dfsdatanodedatadir.html

For your second question, I am not aware of any rpm that can convert existing data to HDFS. You would have to migrate the existing data into HDFS. There are many possible ways to migrate the data, depending on where it resides (on disk, in a database, etc.). Below is one basic example of loading data:

https://hortonworks.com/hadoop-tutorial/loading-data-into-the-hortonworks-sandbox/
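
As a simple illustration of the "copy the data in" approach (the paths here are only examples), you could push existing files from the local ext4 file system into HDFS with the standard HDFS shell:

  # Create a target directory in HDFS (/user/hdfs/mydata is a hypothetical path)
  hdfs dfs -mkdir -p /user/hdfs/mydata
  # Copy existing files from the local disk into HDFS
  hdfs dfs -put /mnt/olddisk/mydata/* /user/hdfs/mydata/
  # Verify the files landed in HDFS
  hdfs dfs -ls /user/hdfs/mydata

For data sitting in a database, tools such as Sqoop are the usual route instead of a plain copy.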