I have a requirement to process and store changes from an Oracle database into a Hadoop data lake.
GoldenGate replicates the changes and drops files directly into HDFS (I don't see an option to write to a Hive table directly). Each file contains inserts, updates, and deletes.
How can I quickly process these changes into a Hive table? Is Spark an option for processing the files?
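For context, this is roughly the kind of Spark job I was imagining (a rough PySpark sketch, not a working setup: the paths, table names, the key column order_id, and the op_type/op_ts fields are my assumptions about what the GoldenGate output would look like):

```python
# Rough sketch: read a batch of GoldenGate change files from HDFS and
# apply them into a Hive table. Assumes each change record carries the
# full row image plus an operation type ('I'/'U'/'D') and a commit
# timestamp -- all of that is assumed, not confirmed.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = (SparkSession.builder
         .appName("gg-cdc-apply")
         .enableHiveSupport()
         .getOrCreate())

# Assumed location and format of the GoldenGate HDFS output.
changes = spark.read.json("hdfs:///data/gg/orders/*.json")

# Keep only the latest change per primary key (assumed key: order_id).
latest = Window.partitionBy("order_id").orderBy(F.col("op_ts").desc())
deduped = (changes
           .withColumn("rn", F.row_number().over(latest))
           .filter("rn = 1")
           .drop("rn"))

# Rebuild the target table: existing rows whose keys were not touched in
# this batch, plus the latest insert/update images; deleted keys are
# simply left out of the result.
current = spark.table("warehouse.orders")
keys = deduped.select("order_id")
survivors = current.join(keys, "order_id", "left_anti")
upserts = (deduped
           .filter(F.col("op_type").isin("I", "U"))
           .drop("op_type", "op_ts"))

# Write to a new table name to avoid overwriting a table being read.
(survivors.unionByName(upserts)
 .write.mode("overwrite")
 .saveAsTable("warehouse.orders_new"))
```

Is a periodic full rebuild like this reasonable, or is there a better incremental approach?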
I need to capture the changes from the individual files into Hive, and also denormalize multiple transaction tables (e.g. header and detail) into one table for faster querying.
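For the denormalization step I was thinking of something along these lines (again just a sketch; warehouse.order_headers, warehouse.order_details, and the join key are placeholders):

```python
# Rough sketch: join the header and detail tables into one wide table
# for querying. Table and column names are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("denormalize-orders")
         .enableHiveSupport()
         .getOrCreate())

headers = spark.table("warehouse.order_headers")
details = spark.table("warehouse.order_details")

# One row per detail line, with the header attributes repeated on each.
flat = details.join(headers, "order_id", "inner")

(flat.write.mode("overwrite")
 .saveAsTable("warehouse.orders_flat"))
```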
Has anyone implemented a good solution for this kind of problem in Hadoop? Any suggestions are appreciated.