I'm looking to extract a large amount of data from a few Oracle tables and transfer it to an HDFS file system. There appear to be two possible ways of achieving this:
Clearly the second option is more work, but that isn't the issue. It's been suggested that, because Sqoop copies data across the network, the locks on the Oracle table might be held for longer than would otherwise be required. I'll be extracting large amounts of data and copying it to a remote location (so there will be significant network latency).
Does anybody know if this is correct?