Hi Team,
We need to move terabytes of data from an RDBMS to HBase. We are using HDP 2.6.5 on AWS. What is the best way to do this?
So far I have found the following options:
Option-1: Use Sqoop to migrate. It can be slow because it loads data with HBase "put" operations, but the advantage is that it is simple to write and can handle the newline characters that we have in some RDBMS fields.
Option-2: Create delimited files from the RDBMS, taking care of newlines somehow (not sure how). Then use the HBase bulk load utility to load the data into HBase.
Option-3: Create delimited files from the RDBMS, taking care of newlines somehow (not sure how). Then use the Spark HBase connector (SHC) to load the data into HBase.
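For the newline problem in Option-2 and Option-3, one idea I am considering is to escape the problem characters when writing the delimited files, so that every record stays on a single line. Below is only a rough sketch of that idea in Python, not something I have tested at scale. The function names (escape_field, unescape_field, to_tsv_line) are made up for illustration, and I am assuming a tab delimiter and that a NULL column can be written as an empty string, which loses the difference between NULL and empty.

```python
def escape_field(value):
    """Escape backslash, tab, newline and carriage return so a field fits on one line."""
    if value is None:
        return ""  # assumption: NULL is written as an empty string
    return (str(value)
            .replace("\\", "\\\\")  # escape backslashes first so escapes stay unambiguous
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r"))


def unescape_field(text):
    """Reverse escape_field, for use on the reading side."""
    mapping = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in mapping:
            out.append(mapping[text[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def to_tsv_line(row):
    """Join the escaped fields of one record with tabs (no trailing newline)."""
    return "\t".join(escape_field(v) for v in row)


if __name__ == "__main__":
    sample = ["row1", "first line\nsecond line", "a\tb"]
    line = to_tsv_line(sample)
    print(line)
    print([unescape_field(f) for f in line.split("\t")] == sample)
```

The catch is that the values would land in HBase in escaped form unless whatever reads the files unescapes them again, so I am not sure this is better than what Sqoop already does.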
Can you please advise on the best option, or point out any other option I have missed?
Thanks