07-15-2014 05:07 AM
How can we connect two machines when one has vanilla Hadoop installed (this should be the NameNode) and the other has Cloudera CDH4 installed (this should work as a DataNode)?
07-15-2014 10:47 AM
I wouldn't recommend doing that. You should always run the same version of Hadoop across all machines in the same cluster. This is for software compatibility reasons: not all versions are tested against each other to verify compatibility, and different major versions will almost certainly not be compatible.
I'm curious what factors are preventing you from either installing the Apache Hadoop release on the DataNode or upgrading your cluster to CDH4?
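As a rough way to check the version-matching advice above, here is a minimal Python sketch that compares the first line printed by `hadoop version` on two machines. The function names and the sample version strings are made up for illustration, and it assumes the first line has the form `Hadoop <major>.<minor>.<patch>[-suffix]`. It only tells you whether the reported versions differ; it does not prove that two builds are compatible.

```python
import re


def parse_hadoop_version(first_line):
    """Parse the first line of `hadoop version` output.

    Assumes a line such as 'Hadoop 2.0.0-cdh4.7.0' or 'Hadoop 1.2.1'.
    Returns (major, minor, patch, suffix), where suffix is '' if absent.
    """
    match = re.match(r"Hadoop\s+(\d+)\.(\d+)\.(\d+)(?:-(\S+))?", first_line.strip())
    if match is None:
        raise ValueError("unrecognized version line: %r" % first_line)
    major, minor, patch, suffix = match.groups()
    return int(major), int(minor), int(patch), suffix or ""


def same_major(line_a, line_b):
    """True if both nodes report the same major version number."""
    return parse_hadoop_version(line_a)[0] == parse_hadoop_version(line_b)[0]


def same_release(line_a, line_b):
    """True only if version numbers and vendor suffix match exactly."""
    return parse_hadoop_version(line_a) == parse_hadoop_version(line_b)


if __name__ == "__main__":
    # Illustrative strings only; paste the real output from each machine.
    namenode = "Hadoop 1.2.1"
    datanode = "Hadoop 2.0.0-cdh4.7.0"
    print("same major:", same_major(namenode, datanode))
    print("same release:", same_release(namenode, datanode))
```

You would run `hadoop version` on each machine and paste the first line of each output into the two variables. Anything other than an exact match is a reason to align the installations before joining the nodes into one cluster.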
07-16-2014 01:54 AM
07-17-2014 01:26 AM
So let's see if I understand: you want more capacity in your vanilla cluster, and you are wondering whether the vanilla Hadoop installation and the Cloudera CDH installation can both run on the same server, effectively giving you two DataNodes on one server?
If so, I have never had to try that. I think it would be tricky to keep them from using each other's config files and environment settings.
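To illustrate the environment problem mentioned above, here is an untested sketch in Python of one way to give each installation its own environment. `HADOOP_HOME`, `HADOOP_CONF_DIR`, `HADOOP_LOG_DIR` and `HADOOP_PID_DIR` are the usual Hadoop environment variables; the helper name and every path below are hypothetical. It does not deal with the two DataNodes competing for the same ports or data directories, which would still have to be separated in each installation's own config files.

```python
import os


def isolated_env(home, conf_dir, log_dir, pid_dir, base_env=None):
    """Build an environment for one Hadoop installation.

    Every inherited HADOOP_* variable is dropped first, so settings
    exported for the other installation cannot leak in.
    """
    env = dict(os.environ if base_env is None else base_env)
    for key in list(env):
        if key.startswith("HADOOP_"):
            del env[key]
    env.update({
        "HADOOP_HOME": home,
        "HADOOP_CONF_DIR": conf_dir,
        "HADOOP_LOG_DIR": log_dir,
        "HADOOP_PID_DIR": pid_dir,
    })
    return env


if __name__ == "__main__":
    # Hypothetical paths; substitute the real install locations.
    vanilla = isolated_env("/opt/hadoop-vanilla", "/opt/hadoop-vanilla/conf",
                           "/var/log/hadoop-vanilla", "/var/run/hadoop-vanilla")
    cdh = isolated_env("/opt/hadoop-cdh4", "/opt/hadoop-cdh4/conf",
                       "/var/log/hadoop-cdh4", "/var/run/hadoop-cdh4")
    print(vanilla["HADOOP_CONF_DIR"])
    print(cdh["HADOOP_CONF_DIR"])
```

Each dictionary could then be passed as the `env` argument of `subprocess.Popen` when launching that installation's own start script, so that neither daemon picks up the other's settings from the shell.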