Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
HDFS for Installing Hadoop

Hi everyone, I hope you are doing well. I have a virtual machine and I've added a new ext4 disk. I want to convert it to HDFS (the Hadoop file system). Can anyone please tell me how I can do this?

1 ACCEPTED SOLUTION

Yes, you can use ext4 disks to store HDFS data. Once you have installed HDP, you can set the property dfs.datanode.data.dir in hdfs-site.xml to point to the locations where you want the data to be stored.

The post below discusses the same topic:

https://community.hortonworks.com/questions/89786/file-uri-required-for-dfsdatanodedatadir.html
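
As a rough illustration of the setting described above, here is a minimal Python sketch that prints the property entry as it would appear in hdfs-site.xml. The mount point /mnt/newdisk/hadoop/hdfs/data and the helper name datanode_dir_property are hypothetical and only serve as an example; substitute the directory on your own ext4 disk. Several directories can be listed as one comma-separated value.

```python
import xml.etree.ElementTree as ET


def datanode_dir_property(paths):
    """Build the <property> element for dfs.datanode.data.dir as a string.

    `paths` is a list of local directories (hypothetical examples here)
    where the DataNode should keep its block data.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.datanode.data.dir"
    # Multiple directories are written as a single comma-separated value.
    ET.SubElement(prop, "value").text = ",".join(paths)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Assumed mount point of the new ext4 disk; adjust to your system.
    print(datanode_dir_property(["/mnt/newdisk/hadoop/hdfs/data"]))
```

The printed element belongs inside the configuration section of hdfs-site.xml, and the DataNode would need a restart to pick up the change.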

For your second question, I am not aware of any RPM that can convert existing data to HDFS. You would have to migrate the existing data into HDFS. There are many ways to do this, depending on where the data resides (on disk, in a database, etc.). Below is one basic example of loading data:

https://hortonworks.com/hadoop-tutorial/loading-data-into-the-hortonworks-sandbox/
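
For data that already sits on a local disk, one common way to migrate it is the hdfs dfs -put command. The sketch below wraps that command in Python; it assumes the hdfs client is installed and on the PATH, and the file path /data/export/sales.csv, the target directory /user/hadoop/import, and the helper names are hypothetical examples. It is one possible approach, not the only way to load data.

```python
import subprocess


def build_put_command(local_path, hdfs_dir):
    """Return the argument list for copying a local file into HDFS."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_dir]


def copy_to_hdfs(local_path, hdfs_dir):
    """Create the target directory in HDFS, then upload the local file.

    Requires a working HDFS client; raises CalledProcessError on failure.
    """
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(build_put_command(local_path, hdfs_dir), check=True)


if __name__ == "__main__":
    # Only print the command here; call copy_to_hdfs() on a real cluster.
    print(" ".join(build_put_command("/data/export/sales.csv",
                                     "/user/hadoop/import")))
```

Data held in a database would need a different route, so treat this as covering the on-disk case only.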

3 REPLIES

If you want the entire HDP stack (which includes major Hadoop components such as HDFS) for learning or tutorial purposes, you can download and install the Hortonworks Sandbox:

https://hortonworks.com/products/sandbox/

Another article covers a custom install of HDP:

https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto....

Thanks, Namit. So I can use the ext4 disk to store Hadoop files? I was wondering whether there is any option to convert an existing file system to the Hadoop file system through an RPM or similar, as we do for ASM.
