I want to read data from Oracle and MySQL, compare the two data sets, and store the output in MS SQL Server. I want to do all of this with a MapReduce job written in C#.
If I have configured a multi-node Hadoop cluster and I run a MapReduce job to read data from Oracle and MySQL, will the data that MapReduce reads be distributed across the memory of multiple nodes in the cluster?
Yes, you can use MapReduce to do everything you described, or you can use some of the built-in tools.
Sqoop is a wrapper over MapReduce that pulls data from and pushes data to a database. You can always write your own custom MapReduce job to do the same.
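If you go the custom route, the comparison step might look roughly like the sketch below. It is plain Python rather than C# or the Hadoop API, and it only simulates the map, shuffle, and reduce phases in one process to show the logic. The function names (`map_record`, `reduce_key`, `compare`), the `(key, value)` row shape, and the assumption that each key appears at most once per source are all made up for illustration.

```python
from collections import defaultdict


def map_record(source, record):
    """Map phase: tag each row with its source so rows sharing a key meet in one reducer."""
    key, value = record
    yield key, (source, value)


def reduce_key(key, tagged_values):
    """Reduce phase: compare the values collected for one key from both sources."""
    by_source = dict(tagged_values)  # assumes at most one row per key per source
    oracle = by_source.get("oracle")
    mysql = by_source.get("mysql")
    if "oracle" not in by_source:
        return (key, "missing_in_oracle", oracle, mysql)
    if "mysql" not in by_source:
        return (key, "missing_in_mysql", oracle, mysql)
    if oracle != mysql:
        return (key, "mismatch", oracle, mysql)
    return None  # rows agree, nothing to report


def compare(oracle_rows, mysql_rows):
    """Simulate map, shuffle (group by key), and reduce in a single process."""
    grouped = defaultdict(list)
    for source, rows in (("oracle", oracle_rows), ("mysql", mysql_rows)):
        for record in rows:
            for key, tagged in map_record(source, record):
                grouped[key].append(tagged)
    results = []
    for key in sorted(grouped):
        outcome = reduce_key(key, grouped[key])
        if outcome is not None:
            results.append(outcome)
    return results


if __name__ == "__main__":
    # Hypothetical sample rows standing in for data pulled from each database.
    oracle_rows = [(1, "alice"), (2, "bob"), (3, "carol")]
    mysql_rows = [(1, "alice"), (2, "bobby"), (4, "dave")]
    for row in compare(oracle_rows, mysql_rows):
        print(row)
```

In a real job the rows returned by `compare` would be the reducer output, which you would then write to MS SQL Server, for example with a Sqoop export.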