I want to know which big data stack suits storing about 1 TB of data per month. How many NameNodes and DataNodes are required for that? I want to store structured (MySQL), unstructured (PCAP), and binary file data.
Let's assume you produce 12 TB of data annually. With a replication factor of 3 that is 36 TB, so plan for about 40 TB of storage and decide your cluster size accordingly. The cluster can also be scaled horizontally as and when your data grows. You would need 1 NameNode and 10 DataNodes of 10 TB each. Hope this helps!
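If it helps, here is a rough sketch of that sizing arithmetic in Python. The function names and the 10% headroom value are my own assumptions for illustration, not anything Hadoop itself prescribes; adjust them to your environment.

```python
import math

def raw_storage_tb(monthly_tb, months, replication=3, headroom=0.10):
    """Raw disk needed: ingested data times replication, plus headroom.

    headroom is an assumed safety margin (10% here), not a Hadoop setting.
    """
    logical_tb = monthly_tb * months
    return logical_tb * replication * (1 + headroom)

def data_nodes_needed(raw_tb, node_capacity_tb):
    """Smallest number of DataNodes whose combined disk covers raw_tb."""
    return math.ceil(raw_tb / node_capacity_tb)

# 1 TB per month for one year, replication factor 3
one_year_tb = raw_storage_tb(monthly_tb=1, months=12)
print(round(one_year_tb, 1), "TB raw for one year")
print(data_nodes_needed(one_year_tb, node_capacity_tb=10), "DataNodes of 10 TB each")
```

Under these assumptions one year of data fits in fewer than ten 10 TB nodes, so the ten-node figure above leaves room for growth. Rerun the calculation with your own retention period and node size.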
Thanks for your support. I want to know the best configuration for that cluster.
If possible, kindly describe the big data ecosystem.
What is the role of the edge node and the administration node?
Edge nodes, also known as gateway nodes, are the interface between the Hadoop cluster and the outside network. They are used to run client applications, and they should be the only nodes accessible to developers and the only place for other non-administrative tasks such as running ingestion jobs.
Ambari usually sits on the admin node, since monitoring and administration tools are considered admin-node services.