Is the attached figure a decent architecture, and are the components to be installed on each node a good way to do it?
Is it okay to have VMware master nodes and physical worker nodes?
I'll also wait for the best answer, plus a recommendation for the technical spec of each node.
We don't know what kind of access you are going to have to your cluster or the security concerns that come with it, but as a general recommendation I think it would be better to put the management and edge nodes (green) in an external DMZ LAN, separated from the HDP/Hadoop internal network (masters + workers). You should also co-locate any KDC/AD and/or database server used by the HDP cluster in this internal network.
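To make the placement rule concrete, here is a minimal Python sketch that checks a node inventory against it. Everything specific in it is an assumption for illustration: the two subnets (`10.0.1.0/24` for the DMZ, `10.0.2.0/24` for the internal network), the role names, and the hostnames are hypothetical and should be replaced with your own addressing plan.

```python
import ipaddress

# Hypothetical subnets (assumption): substitute your real DMZ / internal ranges.
ZONES = {
    "dmz": ipaddress.ip_network("10.0.1.0/24"),
    "internal": ipaddress.ip_network("10.0.2.0/24"),
}

# Zone each role should live in, following the recommendation above:
# management/edge in the DMZ, everything else on the internal network.
EXPECTED_ZONE = {
    "management": "dmz",
    "edge": "dmz",
    "master": "internal",
    "worker": "internal",
    "kdc": "internal",
    "database": "internal",
}


def zone_of(ip):
    """Return the name of the zone containing ip, or None if it matches no zone."""
    addr = ipaddress.ip_address(ip)
    for name, network in ZONES.items():
        if addr in network:
            return name
    return None


def misplaced(nodes):
    """Return (hostname, role, actual_zone) for every node outside its expected zone."""
    problems = []
    for hostname, role, ip in nodes:
        actual = zone_of(ip)
        if actual != EXPECTED_ZONE[role]:
            problems.append((hostname, role, actual))
    return problems


if __name__ == "__main__":
    # Hypothetical inventory: (hostname, role, ip)
    inventory = [
        ("edge1", "edge", "10.0.1.10"),
        ("master1", "master", "10.0.2.10"),
        ("worker1", "worker", "10.0.2.21"),
        ("kdc1", "kdc", "10.0.1.20"),  # KDC left in the DMZ by mistake
    ]
    for hostname, role, actual in misplaced(inventory):
        print(f"{hostname} ({role}) is in zone {actual}, expected {EXPECTED_ZONE[role]}")
```

A check like this only validates the addressing plan; the actual separation still has to be enforced with firewall rules or VLANs between the two networks.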