We used Sqoop to migrate the data for all tables from the RDBMS to Hadoop. Then we performed query analysis using Hive, and finally we are generating reports using Tableau. Everything is working so far.
Initially, for the Oracle RDBMS, we used the Change Capture stage in DataStage ETL to capture delta changes between two tables. I am not sure how to capture delta changes between two tables in Hive now that we have migrated the data for all tables from the RDBMS to Hadoop.
Can anybody suggest the steps for migrating Oracle RDBMS tables to Hadoop?
Sqoop has an incremental import option (--incremental). Please see the following link, which describes doing CDC in Hadoop using Spark DataFrames.
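To make the idea of change capture between two tables concrete, here is a minimal sketch in plain Python rather than Spark or Hive. It assumes each table is a list of row dictionaries sharing a unique key column; the function name capture_changes and the change codes "insert", "update", and "delete" are hypothetical and chosen only for illustration. In Hive or Spark the same comparison would typically be expressed as a full outer join on the key column.

```python
from typing import Dict, Hashable, List, Tuple

Row = Dict[str, object]


def capture_changes(before: List[Row], after: List[Row], key: str) -> List[Tuple[str, Row]]:
    """Compare two snapshots of a table and return (change_code, row) pairs.

    Assumption: `key` names a column that uniquely identifies a row
    in both snapshots (for example, the primary key).
    """
    before_by_key: Dict[Hashable, Row] = {row[key]: row for row in before}
    after_by_key: Dict[Hashable, Row] = {row[key]: row for row in after}

    changes: List[Tuple[str, Row]] = []

    # Rows in the new snapshot: either new (insert) or modified (update).
    for k, row in after_by_key.items():
        old = before_by_key.get(k)
        if old is None:
            changes.append(("insert", row))
        elif old != row:
            changes.append(("update", row))

    # Rows that exist only in the old snapshot were deleted.
    for k, row in before_by_key.items():
        if k not in after_by_key:
            changes.append(("delete", row))

    return changes


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    old_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
    new_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "B"}, {"id": 4, "name": "d"}]
    for code, row in capture_changes(old_rows, new_rows, "id"):
        print(code, row)
```

This sketch holds both snapshots in memory, so it only illustrates the logic. For tables of realistic size the comparison would need to run as a distributed join inside Hive or Spark.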
@vamsi valiveti, here is an example of importing with Sqoop:
sqoop-import --connect jdbc:oracle:thin:@//hostname:1521/<service_name> --username <user_name> --password <pwd> --table <table_name> --hive-table schemaname.tablename --hive-import --hive-home /home/Hadoop/hive-0.11.0 -m 5
I am already using Sqoop, and I need input on the point below.
Clarification 2: Can anybody suggest the steps for migrating Oracle RDBMS tables to Hadoop?
Thanks for the input. This is a new project. After importing the data into HDFS, what is the next step? I am aware of that Sqoop option. Currently I am in the POC phase. We want to migrate the tables from the RDBMS to Hadoop; the company does not want to use any RDBMS going forward.