
How much data volume can my cluster handle?



I have a 4-node cluster configured with 1 NameNode and 3 DataNodes. I'm running a TPC-H benchmark and I would like to know how much data you think my cluster can handle without affecting query response times. Each node has 16 GB of RAM and 8 cores. My total available disk space is ~700 GB.

Thank you


@mÁRIO Rodrigues

It's not that simple with the information provided.

We would need to know the block size, how the nodes are arranged, the network traffic and speed, and how complex the transformation logic is. Are you trying to perform joins? Does the data have unique key-value pairs? How is the data stored?

These are a few basic questions you need to ask in order to understand cluster performance.
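
For the disk side alone, a rough back-of-the-envelope estimate is possible. The sketch below is only that: it assumes an HDFS replication factor of 3 and a 128 MB block size (the stock defaults, which your cluster may override), and a hypothetical 25% of disk held back for intermediate query data. The function names and the 25% figure are illustrative, not from any Hadoop API, and the result says nothing about query response times, which depend on the factors listed above.

```python
import math


def usable_capacity_gb(total_disk_gb, replication=3, scratch_fraction=0.25):
    """Rough upper bound on raw (pre-replication) data that fits on disk.

    replication: assumed HDFS replication factor (default 3 is an assumption).
    scratch_fraction: hypothetical share of disk reserved for temp/shuffle data.
    """
    available = total_disk_gb * (1.0 - scratch_fraction)
    return available / replication


def block_count(data_gb, block_size_mb=128):
    """Number of HDFS blocks needed for data_gb, assuming 128 MB blocks."""
    return math.ceil(data_gb * 1024 / block_size_mb)


if __name__ == "__main__":
    total_disk_gb = 700  # figure quoted in the question
    capacity = usable_capacity_gb(total_disk_gb)
    print(f"Approx. raw data that fits: {capacity:.0f} GB")
    print(f"Approx. blocks at 128 MB: {block_count(capacity)}")
```

Treat the number this prints as a ceiling on what fits on disk, not as a recommendation. The practical way to find the limit for response times is to run TPC-H at increasing scale factors and watch where query times start to degrade.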
