I have a Spark Streaming program that aggregates and processes data in a 15-minute window. The output of this needs to be pushed to Oracle tables.
What would be the best approach here?
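For context, the aggregation itself is just tumbling-window bucketing: each event timestamp is truncated to the start of its 15-minute interval and the values in that interval are combined, which is what Spark's `window(col("ts"), "15 minutes")` does. A minimal pure-Python illustration of that bucketing (the event data and the sum aggregation are made-up examples, not my actual job):

```python
from collections import defaultdict
from datetime import datetime

def window_start(ts: datetime) -> datetime:
    # Truncate a timestamp to the start of its 15-minute tumbling window,
    # the same bucketing Spark applies for window("15 minutes").
    return ts.replace(minute=(ts.minute // 15) * 15, second=0, microsecond=0)

def aggregate(events):
    # events: iterable of (timestamp, value) pairs; sums values per window.
    totals = defaultdict(int)
    for ts, value in events:
        totals[window_start(ts)] += value
    return dict(totals)

events = [
    (datetime(2023, 5, 1, 10, 2), 5),
    (datetime(2023, 5, 1, 10, 14), 3),
    (datetime(2023, 5, 1, 10, 20), 7),
]
# The 10:02 and 10:14 events land in the 10:00 window; 10:20 lands in 10:15.
print(aggregate(events))
```

Each finished 15-minute window produces one row per group key, and it is these rows that need to end up in Oracle.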
If I write my data into Hive and then use Sqoop to push it to Oracle, I would have to schedule the Sqoop job at some frequency, and Sqoop would somehow need to know which data it already pushed and which delta it should now pull from Hive. I am not sure whether Sqoop can do that.
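As I understand it, Sqoop's incremental mode (`--incremental append` with `--check-column`/`--last-value`) applies to *imports* into Hadoop; `sqoop export` just pushes whatever directory you point it at, so the delta would have to be managed by me, e.g. by writing each 15-minute batch to its own partition directory, or by tracking a high-water mark myself. A runnable sketch of the high-water-mark idea, using in-memory sqlite purely as a stand-in for both Hive (source) and Oracle (target); the table and column names are invented for illustration:

```python
import sqlite3

# Stand-in for the Hive table holding the windowed aggregates.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE agg_15min (batch_ts INTEGER, metric TEXT, total INTEGER)")
src.executemany(
    "INSERT INTO agg_15min VALUES (?, ?, ?)",
    [(100, "clicks", 8), (100, "views", 20), (200, "clicks", 7)],
)

def export_delta(conn, watermark):
    """Select only rows newer than the watermark; return them plus the new watermark.

    In a real job the watermark would be persisted between runs (e.g. in a
    small control table) so each scheduled export picks up where the last
    one stopped.
    """
    rows = conn.execute(
        "SELECT batch_ts, metric, total FROM agg_15min"
        " WHERE batch_ts > ? ORDER BY batch_ts",
        (watermark,),
    ).fetchall()
    new_watermark = rows[-1][0] if rows else watermark
    return rows, new_watermark

last_exported = 0
rows, last_exported = export_delta(src, last_exported)  # first run: all 3 rows
rows, last_exported = export_delta(src, last_exported)  # second run: nothing new
```

This is essentially what I would have to bolt on around Sqoop, which is why I am wondering whether a more direct route (e.g. writing each batch to Oracle over JDBC from the streaming job itself) would be simpler.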