HDFS over object storage for Hadoop on demand?

Guru

The Azure HDInsight service provides the capability to create a Hadoop cluster that can be torn down and brought back up without losing any data (including the metastore). Can this setup be achieved with OpenStack Swift and Cloudbreak? If so, what are the steps and considerations for implementing this architecture?

3 REPLIES

Re: HDFS over object storage for Hadoop on demand?

Expert Contributor

Hi,

We have not tested Cloudbreak with Swift, but Hadoop supports Swift out of the box, so it should work on a Hadoop cluster that has been installed with Cloudbreak.

Attila
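
For anyone looking for the concrete wiring: enabling the Swift connector is mostly a matter of core-site.xml configuration, plus having the hadoop-openstack jar on the classpath. The snippet below is a minimal, illustrative sketch only; the service name myswift, the Keystone endpoint, tenant, and credentials are placeholder values you would replace with your own OpenStack details.

<!-- core-site.xml: minimal sketch of the Hadoop Swift connector settings.
     "myswift" is an example service name; the auth URL, tenant, username
     and password are placeholders. -->
<property>
  <name>fs.swift.impl</name>
  <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
</property>
<property>
  <name>fs.swift.service.myswift.auth.url</name>
  <value>https://keystone.example.com:5000/v2.0/tokens</value>
</property>
<property>
  <name>fs.swift.service.myswift.tenant</name>
  <value>demo</value>
</property>
<property>
  <name>fs.swift.service.myswift.username</name>
  <value>hadoop</value>
</property>
<property>
  <name>fs.swift.service.myswift.password</name>
  <value>secret</value>
</property>
<property>
  <name>fs.swift.service.myswift.public</name>
  <value>true</value>
</property>

Once a service is defined this way, Hadoop tools can address Swift paths of the form swift://CONTAINER.myswift/..., where the suffix after the dot selects which configured service to use.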

Re: HDFS over object storage for Hadoop on demand?

Guru

@Attila Kanto So if you have object storage and use Cloudbreak to install Hadoop, HDFS will sit on top of the object storage once everything is installed; I get that. But will the data show up in HDFS if the cluster is taken down and then brought back up with Cloudbreak, or will it have to be reloaded?

Re: HDFS over object storage for Hadoop on demand?

Expert Contributor

@Vadim, no, HDFS and the Swift object storage are two separate storage systems, and both work in parallel on the same cluster.

1.) HDFS: the HDFS components will be installed by Ambari, and you can use HDFS storage as usual; you can store data on it and access it as usual, e.g.:

hdfs dfs -ls /some_dir/ 

2.) You can connect to Swift with the aid of the Swift connector mentioned above. The Swift connector basically allows you to communicate with Swift through the HDFS API: it translates HDFS API calls into API calls that the Swift object storage understands. Therefore, commands like this will work:

hdfs dfs -ls swift://some_container/ 
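
Coming back to the original Hadoop-on-demand question: since HDFS and Swift are separate, data that should survive a cluster teardown has to be copied out to Swift (and copied back, or read directly from swift:// paths, after Cloudbreak recreates the cluster). A typical way to do the copy is DistCp; the commands below are an illustrative sketch, and backup_container, myswift and /some_dir are made-up names.

# Before tearing the cluster down: persist the HDFS data into a Swift container
hadoop distcp /some_dir swift://backup_container.myswift/some_dir

# After Cloudbreak has recreated the cluster: copy the data back into HDFS
hadoop distcp swift://backup_container.myswift/some_dir /some_dir

Alternatively, jobs can read and write the swift:// paths directly and skip the copy step.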