Estimate cost for Data Lake architecture

New Contributor

Hello,

I am new to the world of big data and am doing research to understand and, hopefully, implement a data lake solution for my company. I am looking to build a data lake on a 25 TB platform. I cannot find anything on how to estimate the costs of this architecture, which I need to submit to upper management to determine the feasibility of the project. Can anyone help me estimate the cost of the entire structure? A ballpark estimate would be fine. Any guidance or links to useful information would be greatly appreciated.

Accepted Solutions

Re: Estimate cost for Data Lake architecture

@Alpha3645

This could be an entire questionnaire. However, if I were an enterprise architect and needed to provide a 100,000 ft view number, I would assume a basic data lake that supports 25 TB and grows by another 25 TB (data replication factor of 3), with average workloads across several services, e.g. HDFS, Hive, HBase. With 3 master nodes (16 cores, 128 GB RAM, 2 x 2 TB) + 7 data nodes (16 cores, 256 GB RAM, 12 x 2 TB) and a 10 Gbps network, the cost for hardware would be anywhere between 60K and 100K. Add the cost of the Hortonworks Data Subscription (check with a sales rep for a better number) and your budget (excluding labor) would be anywhere between 100K and 150K.
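To sanity-check the sizing in this estimate, here is a rough sketch of the storage arithmetic in Python. The node counts, disk sizes, replication factor, and data volumes come from the figures above. The simplification that every data-node disk is fully available to HDFS is my own assumption. In practice, space reserved for the OS, logs, and intermediate job data reduces the usable figure, and compression can increase it.

```python
# Rough HDFS capacity check using the figures quoted in the answer.
# Assumption (not from the answer): all data-node disk space goes to HDFS,
# with no allowance for OS, logs, temp/shuffle space, or compression.

DATA_NODES = 7            # data nodes in the proposed cluster
DISKS_PER_NODE = 12       # disks per data node
DISK_TB = 2               # capacity of each disk, in TB
REPLICATION_FACTOR = 3    # HDFS replication factor
INITIAL_DATA_TB = 25      # data to load at the start
GROWTH_TB = 25            # expected additional growth


def raw_capacity_tb(nodes: int, disks_per_node: int, disk_tb: float) -> float:
    """Total raw disk capacity across all data nodes."""
    return nodes * disks_per_node * disk_tb


def usable_capacity_tb(raw_tb: float, replication_factor: int) -> float:
    """Capacity left for unique data once every block is stored N times."""
    return raw_tb / replication_factor


raw_tb = raw_capacity_tb(DATA_NODES, DISKS_PER_NODE, DISK_TB)
usable_tb = usable_capacity_tb(raw_tb, REPLICATION_FACTOR)
required_tb = INITIAL_DATA_TB + GROWTH_TB
headroom_tb = usable_tb - required_tb

print(f"Raw capacity:    {raw_tb} TB")
print(f"Usable capacity: {usable_tb} TB")
print(f"Required:        {required_tb} TB")
print(f"Headroom:        {headroom_tb} TB")
```

Under these assumptions the 7 data nodes give 168 TB raw, or about 56 TB after 3x replication, against the 50 TB target (25 TB now plus 25 TB of growth). The margin is thin, so treat the node count as a floor and not as a comfortable fit.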

Check cluster capacity planning here: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_cluster-planning-guide/content/ch_hardwar...

More: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/index.html

If this is the ballpark you were looking for, vote for this answer and accept it as the best answer.

Re: Estimate cost for Data Lake architecture

New Contributor

Straight to the point, exactly what I wanted. Thanks so much.
