I'm a new user. I need to install Cloudera for a study on big data. What are the limitations of Cloudera when it is installed on VMware? Are the results of case studies run on data in that setup reliable?
Thanks to all in advance.
Are you referring to an installation of the VMware flavor of the QuickStart VM? If so, the software inside is precisely the same as you would use for a "direct" install. So yes, that is a reliable test bed.
I have installed the QuickStart VM. However, I do not have large amounts of data (Big Data). Which features can I test without large amounts of data and still get acceptable results (reasonably close to those obtained in a normal Big Data context)?
Can the QuickStart VM also simulate a cluster of nodes?
Thanks in advance.
You can run any Hadoop example of your choosing on the QuickStart VM. It should have all the components installed and ready to run MapReduce, HBase, or other Hadoop examples of any kind. (There are lots of examples available in the various O'Reilly books and on the internet, including GitHub.)
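As an illustration of the kind of example that works fine on small data, here is a minimal sketch of the classic word-count job written in plain Python. It only simulates the map, shuffle, and reduce phases locally; it does not call any Hadoop API, and the function names (mapper, shuffle, reducer, word_count) are hypothetical names chosen for this sketch. The logic is the same on ten lines as on ten billion, which is why a small input is still a valid functional test.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence on an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Tiny sample input standing in for a real data set (assumption for the demo).
    sample = ["big data on a small VM", "small data is still data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the mapper and reducer would run as separate tasks over HDFS input, but the results for a given input should match what this local version computes.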
You could also add VMs to make a miniature (e.g. 3-node) cluster using the QuickStart VM as the master node. I would not expect chaining three QuickStart VMs together into a cluster to work, though, because Cloudera Manager and the management services would be installed on each of the nodes, and that is not a proper configuration. Instead, create some generic Linux VMs, use Cloudera Manager to deploy CDH to them, and add them to your cluster.
Please note that the QuickStart VM is intended only to demo Hadoop features. It is not a good way to performance-test Hadoop or to gather benchmark data about production cluster throughput and the like.