Estimate cost for Data Lake architecture

New Contributor

Hello,

I am new to big data and am researching how to implement a data lake solution for my company. I am looking to build a data lake on a 25 TB platform. I cannot find any guidance on how to estimate the cost of this architecture, which I need to submit to upper management so they can assess the feasibility of the project. Can anyone help me estimate the cost of the entire setup? A ballpark estimate would be fine. Any guidance or links to useful information would be greatly appreciated.

1 ACCEPTED SOLUTION

Super Guru

@Alpha3645

This could be an entire questionnaire. However, if I were an enterprise architect and needed to provide a 100,000 ft view number, I would assume a basic data lake that supports 25 TB and grows by another 25 TB (with a data replication factor of 3), running average workloads on several services, e.g. HDFS, Hive, and HBase. For a cluster of 3 master nodes (16 cores, 128 GB RAM, 2 x 2 TB) + 7 data nodes (16 cores, 256 GB RAM, 12 x 2 TB) on a 10 Gbps network, the hardware cost would be anywhere between 60K and 100K. Add the cost of the Hortonworks Data Subscription (check with a sales rep for a better number), and your budget (excluding labor) would be anywhere between 100K and 150K.
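As a rough sanity check on the sizing above, the storage arithmetic can be sketched in Python. The function names are hypothetical, and the calculation is a simplification: it assumes all data-node disks are used for storage and ignores filesystem overhead and temporary or intermediate job space, which a real capacity plan would need to budget for.

```python
def raw_capacity_tb(data_nodes: int, disks_per_node: int, disk_tb: float) -> float:
    """Total raw disk capacity across all data nodes, in TB."""
    return data_nodes * disks_per_node * disk_tb


def required_raw_tb(initial_tb: float, growth_tb: float, replication: int) -> float:
    """Raw capacity consumed once every block is stored `replication` times."""
    return (initial_tb + growth_tb) * replication


# Figures taken from the answer: 7 data nodes with 12 x 2 TB disks each,
# 25 TB of data growing by another 25 TB, replication factor of 3.
raw = raw_capacity_tb(data_nodes=7, disks_per_node=12, disk_tb=2)
needed = required_raw_tb(initial_tb=25, growth_tb=25, replication=3)
utilization = needed / raw  # fraction of raw disk used after full growth

print(f"raw capacity:  {raw} TB")
print(f"required:      {needed} TB")
print(f"utilization:   {utilization:.0%}")
```

Under these assumptions the 7 data nodes provide 168 TB raw against 150 TB of replicated data, so the cluster would be close to full once the extra 25 TB arrives. Treat the hardware list as a starting point, not a configuration with much headroom.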

Check cluster capacity planning here: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_cluster-planning-guide/content/ch_hardwar...

More: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/index.html

If this is the ballpark you were looking for, vote for this answer and accept it as the best answer.


New Contributor

Straight to the point, exactly what I wanted. Thanks so much.