11-20-2016 08:58 PM
We want to refresh HDFS with Oracle data in real time. I have the option of using Oracle GoldenGate with Flume; however, Oracle GG generates a lot of redo, which is problematic for Oracle performance. Is there any other option for real-time data sync?
11-28-2016 07:17 AM
If you need to get your data from Oracle to HDFS in *near real time*, then a GoldenGate solution is probably the best option. While it may need some additional redo logging to work, it probably has the lowest overall impact on the database.
If you can tolerate some lag in the data appearing in HDFS, you could set up jobs that use Sqoop to pull data from Oracle at regular intervals. With the correct indexes and a way for Sqoop to identify new records, this can be done quite efficiently.
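As a rough illustration of the Sqoop approach, here is a minimal sketch that builds an incremental-append import command. The connection string, user, table, column, and paths are all made-up placeholders, and it assumes the table has a monotonically increasing, indexed column (here `ORDER_ID`) that Sqoop can use to find new rows. For rows that get updated rather than only inserted, `--incremental lastmodified` with a timestamp column would be the mode to look at instead.

```python
import shlex
from typing import List


def build_sqoop_incremental_import(
    jdbc_url: str,
    username: str,
    password_file: str,
    table: str,
    check_column: str,
    last_value: int,
    target_dir: str,
    num_mappers: int = 4,
) -> List[str]:
    """Build the argument list for a Sqoop incremental (append) import.

    Only rows where check_column > last_value are pulled, so an index on
    check_column keeps the query cheap on the Oracle side.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # file on HDFS holding the password
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", "append",
        "--check-column", check_column,
        "--last-value", str(last_value),
        "--num-mappers", str(num_mappers),
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    cmd = build_sqoop_incremental_import(
        jdbc_url="jdbc:oracle:thin:@//dbhost:1521/ORCL",
        username="etl_user",
        password_file="/user/etl/.oracle_pw",
        table="ORDERS",
        check_column="ORDER_ID",
        last_value=1000,
        target_dir="/data/orders",
    )
    # Print the command so it can be reviewed or handed to a scheduler.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

If you define this as a saved job with `sqoop job --create`, Sqoop keeps track of the last imported value between runs, so you do not have to pass `--last-value` yourself each time.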
Depending on how the application that writes the data to Oracle is architected, there could also be an option to change that application to write to Oracle and also write to Flume or Kafka, but that requires significant application changes, which may not be feasible in your case.
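To give an idea of what that dual-write change might look like, here is a minimal sketch. It is only an illustration: `sqlite3` stands in for the Oracle connection, the `publish` callable stands in for a Kafka producer or Flume client, and the `orders` table and `save_and_publish` function are made-up names. Note that the two writes are not atomic, so if the publish fails after the database commit, the event has to be retried or reconciled later.

```python
import json
import sqlite3
from typing import Callable, Dict, List


def save_and_publish(
    conn: sqlite3.Connection,
    publish: Callable[[str], None],
    order: Dict[str, float],
) -> None:
    """Write a row to the database, then send the same record to a stream."""
    # Commit to the database first; "with conn" commits on success and
    # rolls back if the INSERT raises.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, amount) VALUES (?, ?)",
            (order["id"], order["amount"]),
        )
    # Only publish once the row is durably stored.
    publish(json.dumps(order, sort_keys=True))


if __name__ == "__main__":
    # In-memory stand-ins for Oracle and for the message stream.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    sent: List[str] = []

    save_and_publish(conn, sent.append, {"id": 1, "amount": 9.5})

    print(conn.execute("SELECT id, amount FROM orders").fetchall())
    print(sent)
```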