I am trying to move data from MySQL to Snowflake. Can someone recommend best practices and tools for doing this? Also, it would be great if someone could recommend how I can move the data in real time.
Hello @Henry2410 ,
Thank you for raising the question about how to migrate data from MySQL and how to move data in real time between clusters.
For real-time data streaming we recommend NiFi and Kafka on Cloudera Data Platform.
Here is a great blog article about NiFi and Kafka on CDP:
"Kafka and NiFi’s availability in CDP Data Hub allows organizations to build the foundation for their Data Movement and Stream Processing use cases in the cloud. CDP Data Hub provides a cloud-native service experience built to meet the security and governance needs of large enterprises."
One way of exporting a MySQL database is to stream the data out, which you can do with the products above.
Please take a look at our Cloudera Data Warehouse, an all-in-one place that can replace solutions from multiple vendors.
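To make the "stream the data out" idea concrete, here is a minimal sketch in plain Python, independent of NiFi or Kafka. It reads a query result in fixed-size batches and writes each batch to its own CSV file, so the whole table never has to fit in memory. The function name `export_in_batches`, the file naming, and the `orders` table are hypothetical, and `sqlite3` is used only as a stand-in so the example runs anywhere. With MySQL you would pass a cursor from your MySQL driver instead, assuming it follows the standard Python DB-API.

```python
import csv
import sqlite3  # stand-in for a MySQL driver so the sketch runs anywhere
import tempfile
from pathlib import Path


def export_in_batches(cursor, query, out_dir, batch_size=10_000):
    """Run `query` and write the result as numbered CSV files,
    each holding at most `batch_size` rows plus a header row."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cursor.execute(query)
    # DB-API cursors expose column names as the first item of each description entry.
    header = [col[0] for col in cursor.description]
    paths = []
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        path = out_dir / f"part_{len(paths):05d}.csv"
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(rows)
        paths.append(path)
    return paths


def demo():
    """Export a hypothetical 25-row table in batches of 10 and return the file names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)",
        [(i * 1.5,) for i in range(25)],
    )
    with tempfile.TemporaryDirectory() as tmp:
        paths = export_in_batches(
            conn.cursor(),
            "SELECT id, amount FROM orders ORDER BY id",
            tmp,
            batch_size=10,
        )
        return [p.name for p in paths]
```

Files produced this way could then be staged and bulk-loaded into the target warehouse. That loading step is not shown here because it depends on the tooling you choose.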
For data migration into CDP, please check out this documentation.
For backup and disaster recovery you can use the Replication Manager.
Which Cloudera product are you currently using?
MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software. On the other hand, Snowflake is detailed as "The data warehouse built for the cloud".
There's not really an equivalence between MySQL and Snowflake use cases. What you are asking really is whether Snowflake can play the role of an OLTP database. Snowflake is not an OLTP database. It is an OLAP database. So generally speaking I would say no.
Snowflake is a cloud-based data warehouse and is used mostly for OLAP purposes. Coming back to your question, Snowflake can be used under the following conditions:
If you have only inserts into the target table and not many updates to it.
If you can achieve good performance by using CLUSTER BY and inline views.
Having said that, to explore your use case a little more, I would suggest asking yourself or your stakeholders the following questions:
If you said yes to ANY of 1, 2, or 3, then go with MySQL. If you said NO to ALL of 1, 2, and 3, then Snowflake might be viable.
But even then I would not recommend it, as that is not what Snowflake was built for.
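The rule of thumb above can be written down as a small check. This is only a sketch: the function name `recommend` is hypothetical, and the three answers are passed in as plain yes/no values in the order of the questions, whatever those questions are in your case.

```python
def recommend(answers):
    """Apply the rule of thumb: any 'yes' means MySQL,
    all 'no' means Snowflake might be viable."""
    if len(answers) != 3:
        raise ValueError("expected exactly three yes/no answers")
    if any(answers):
        return "MySQL"
    return "Snowflake might be viable"


# Example: yes to question 2 only, then no to all three.
print(recommend([False, True, False]))
print(recommend([False, False, False]))
```

Even when the result is "Snowflake might be viable", the caveat above still applies: it is not what Snowflake was built for.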