Hi! I am writing my diploma thesis, "Learning Big Data Services". I would like to create a simple, minimal cluster to test and learn how different services (e.g. Hive) work. Let's suppose that I am not going to process a huge amount of data. How many virtual machines do I need to configure the environment and test some Hive features?
Ideally, a 4-node cluster set up locally with VirtualBox is good enough to test various features in an HDFS/YARN HA-enabled cluster.
However, if you want to quickly test some features on a single host machine, you can also have a look at the HDP Sandbox.
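To make the 4-node suggestion a bit more concrete, here is a rough Python sketch of how the roles might be spread across four VMs, with a small check of the usual HA minimums. The host names (`node1` to `node4`), the exact placement, and the helper names are my own assumptions for illustration, not an official layout. The minimums reflect common guidance: two NameNodes and two ResourceManagers for HA, and at least three JournalNodes and three ZooKeeper servers so a quorum survives the loss of one node.

```python
# Hypothetical role placement for a 4-VM test cluster (host names are made up).
LAYOUT = {
    "node1": {"NameNode", "ResourceManager", "JournalNode", "ZooKeeper"},
    "node2": {"NameNode", "ResourceManager", "JournalNode", "ZooKeeper",
              "DataNode", "NodeManager"},
    "node3": {"JournalNode", "ZooKeeper", "DataNode", "NodeManager",
              "HiveMetastore", "HiveServer2"},
    "node4": {"DataNode", "NodeManager"},
}

# Assumed minimum number of hosts per role for an HDFS/YARN HA test setup.
MIN_COUNTS = {
    "NameNode": 2,         # active + standby
    "ResourceManager": 2,  # active + standby
    "JournalNode": 3,      # quorum for shared edit log
    "ZooKeeper": 3,        # quorum for automatic failover
    "DataNode": 1,
}


def hosts_with(layout, role):
    """Return the sorted list of hosts that carry the given role."""
    return sorted(host for host, roles in layout.items() if role in roles)


def check_ha(layout):
    """Return a list of human-readable problems; empty if all minimums are met."""
    problems = []
    for role, minimum in MIN_COUNTS.items():
        found = len(hosts_with(layout, role))
        if found < minimum:
            problems.append(f"{role}: found {found}, want at least {minimum}")
    return problems


if __name__ == "__main__":
    for role in MIN_COUNTS:
        print(role, "->", ", ".join(hosts_with(LAYOUT, role)))
    print("problems:", check_ha(LAYOUT) or "none")
```

With small VMs you may prefer to keep the master roles off the DataNode hosts where memory allows; the sketch only checks role counts, not resource sizing.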
In addition to @jsensharma's answer, you can also refer to the documentation below for cluster hardware requirements and for the Cloudera Reference Architecture.