How to automatically sync a Hive external table with a MySQL table without using Sqoop?

New Contributor

I already have a MySQL table on my local Linux machine, and a Hive external table with the same schema. I want to sync the Hive external table whenever a record is inserted or updated in MySQL. A batch update, say hourly, is fine. What is the best approach to achieve this without using Sqoop?

Thanks,

Sumit

3 REPLIES

Re: How to automatically sync a Hive external table with a MySQL table without using Sqoop?

New Contributor

This can be achieved quite easily with NiFi. Check the QueryDatabaseTable or ExecuteSQL processor.
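The incremental polling idea behind this suggestion can be sketched outside NiFi as well: remember the largest value seen in a change-tracking column and fetch only rows beyond it on each run. This is a rough sketch, not how NiFi is implemented. The `customers` table, its `updated_at` column, and the `fetch_new_rows` helper are assumptions made for illustration, and an in-memory SQLite database stands in for MySQL so the example runs on its own.

```python
import sqlite3

def fetch_new_rows(conn, last_seen):
    """Return rows changed after last_seen, plus the new watermark."""
    cur = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    rows = cur.fetchall()
    # Keep the old watermark when nothing changed since the last run.
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark

# Stand-in source table (hypothetical schema); MySQL would be used in practice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "alice", "2024-01-01 10:00:00"),
        (2, "bob", "2024-01-01 10:30:00"),
    ],
)

# First hourly run: everything is new.
first_batch, mark = fetch_new_rows(conn, "1970-01-01 00:00:00")

# A record is updated in the source before the next run.
conn.execute(
    "UPDATE customers SET name = 'bobby', updated_at = '2024-01-01 11:15:00' "
    "WHERE id = 2"
)

# Second hourly run: only the changed row comes back.
second_batch, mark = fetch_new_rows(conn, mark)
```

Note that this only works if every insert and update touches the tracking column, and it does not see deleted rows at all.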

Re: How to automatically sync a Hive external table with a MySQL table without using Sqoop?

New Contributor

If I use the QueryDatabaseTable or ExecuteSQL processor in NiFi, it will create multiple files when a record is updated.

I also want to merge the data into the target Hive table. How can I achieve that?

Re: How to automatically sync a Hive external table with a MySQL table without using Sqoop?

Hi @Sumit Deshmukh!
I think you have a few options for retrieving the MySQL data, such as:

- Use CDC to capture changes from MySQL non-invasively (by reading the binlog). You can do this with NiFi (recommended) or with another CDC tool such as Alibaba's Canal. Take a look at the link below:
https://community.hortonworks.com/articles/113941/change-data-capture-cdc-with-apache-nifi-version-1...

- Or use a JDBC-based tool such as Kafka Connect and push the data straight into Kafka.
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
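Whichever route delivers the changes, the target ends up holding several versions of the same row, so a reconcile step is needed that keeps only the newest version per primary key. In Hive that is usually expressed as a query over the base and incremental data; the logic itself can be sketched in Python as below. The `id` and `updated_at` column names and the `merge_latest` helper are assumptions for illustration only.

```python
def merge_latest(base_rows, delta_rows, key="id", version="updated_at"):
    """Keep one row per key: the one with the highest version value."""
    latest = {}
    for row in list(base_rows) + list(delta_rows):
        current = latest.get(row[key])
        # On a tie the later row wins, so delta rows override base rows.
        if current is None or row[version] >= current[version]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])

# Rows already in the target (hypothetical data).
base = [
    {"id": 1, "name": "alice", "updated_at": "2024-01-01 10:00:00"},
    {"id": 2, "name": "bob", "updated_at": "2024-01-01 10:30:00"},
]
# Rows delivered by the latest hourly batch: one update, one insert.
delta = [
    {"id": 2, "name": "bobby", "updated_at": "2024-01-01 11:15:00"},
    {"id": 3, "name": "carol", "updated_at": "2024-01-01 11:20:00"},
]

merged = merge_latest(base, delta)
```

Here `merged` holds one row each for ids 1, 2 and 3, with id 2 carrying the updated name. Deletes in MySQL are not handled by this sketch; they would need a delete marker in the change records.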

Hope this helps! :)