We are planning to put a Cloudera (Impala) cluster into production, using a mix of VMware virtual machines for the master services and physical hosts for the DataNodes, in the following configuration:
NameNode Active (VMware Virtual Machine)
NameNode Standby (VMware Virtual Machine)
PostgreSQL Database (VMware Virtual Machine)
7 x DataNode with Impala Daemon (DELL Physical hosts)
We are fairly new to Hadoop and Cloudera, and I am wondering whether running only the NameNodes and the database machine in VMware could dramatically impact production. Each DataNode will have good resources: 2 sockets, 12 cores, 2.8 GHz, 128 GB RAM, 4 x 1 TB SATA 10K disks, and a 10 Gbps Ethernet connection.
Your comments would be very valuable to us.