Sqoop Metastore - Usage, Supported databases, backup

Contributor

1. It's not very clear from the HDP documentation whether Sqoop supports MySQL or any database other than HSQLDB for the metastore. Has anyone tried using MySQL or Postgres for the Sqoop metastore?

2. I think the metastore is required only for saved Sqoop jobs and is not required otherwise. Is this correct?

3. If only HSQLDB is supported, what are the backup and DR strategies for it?

1 ACCEPTED SOLUTION

Mentor

MySQL support for the Sqoop metastore has not been released yet, but there are JIRAs open for it. There are tutorials out there, but I never managed to get it working.

Yes, the metastore holds metadata about jobs: the last record processed, the job name, etc.

There really aren't any backup strategies for HSQLDB. It's a file on disk, and as long as some in-memory data has not been flushed to disk, you risk data loss. Maybe put it on RAID 1, use ECC memory, etc.

5 REPLIES 5

Mentor

I should mention that you can store the job definition in source control and write the last record processed to an additional destination such as HDFS as a precaution. I would do that with the Java HDFS API at some point, to make sure the row sequence is not lost.
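As a rough illustration of keeping the job definition outside the metastore, the sketch below rebuilds a `sqoop job --create` command line from a definition held in source control plus a separately saved last value. It is written in Python rather than with the Java HDFS API, and it does not touch HDFS at all. The JSON layout, the field names, and the function `build_sqoop_job_args` are assumptions made for this example; only the Sqoop flags themselves (`--create`, `--incremental`, `--check-column`, `--last-value`) are standard.

```python
import json


def build_sqoop_job_args(definition, last_value):
    """Build the argv for `sqoop job --create` from a stored job definition.

    `definition` is a dict loaded from a version-controlled file
    (hypothetical layout); `last_value` is the checkpoint saved
    outside the metastore.
    """
    return [
        "sqoop", "job", "--create", definition["job_name"],
        "--", "import",
        "--connect", definition["connect"],
        "--table", definition["table"],
        "--incremental", "append",
        "--check-column", definition["check_column"],
        "--last-value", str(last_value),
    ]


# Hypothetical job definition, as it might be kept in source control.
definition_json = """
{
  "job_name": "orders_import",
  "connect": "jdbc:mysql://db.example.com/sales",
  "table": "orders",
  "check_column": "id"
}
"""

definition = json.loads(definition_json)
args = build_sqoop_job_args(definition, 1000)
print(" ".join(args))
```

If the metastore file is lost, the job could then be recreated from these two pieces instead of from memory.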

Contributor

Sounds good, thanks. I am looking to store the last-record info in an enterprise-level metadata store in DB2.

Mentor

Sure, whatever works. You can show the info for a completed Sqoop job in the metastore, parse that, and then store it by whatever means is convenient for you.
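A minimal sketch of that parsing step, in Python, assuming `sqoop job --show <job>` prints its saved options as `key = value` lines and that the incremental checkpoint appears under `incremental.last.value`. The sample output and the helper names `parse_job_show` and `last_value_of` are made up for illustration, and the exact output may differ between Sqoop versions.

```python
import subprocess


def parse_job_show(output):
    """Collect `key = value` lines from `sqoop job --show` output into a dict."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # skip headers and separator lines that carry no key/value pair
            props[key.strip()] = value.strip()
    return props


def last_value_of(job_name):
    """Run `sqoop job --show` and return the saved incremental value, if any."""
    result = subprocess.run(
        ["sqoop", "job", "--show", job_name],
        capture_output=True, text=True, check=True,
    )
    return parse_job_show(result.stdout).get("incremental.last.value")


# Hypothetical sample of the command's output, for illustration only.
sample = """Job: orders_import
Tool: import
Options:
----------------------------
incremental.col = id
incremental.last.value = 1000
db.table = orders
"""

props = parse_job_show(sample)
print(props["incremental.last.value"])
```

The value returned by `last_value_of` could then be written to DB2, HDFS, or any other store. Note that `sqoop job --show` may prompt for a password depending on how the metastore is configured, which would need handling in a non-interactive script.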

Contributor

We are currently using a MySQL metastore in our test environment, so it is possible (we run HDP 2.5). The only catch is that Oozie requires the sqoop-site.xml file to be placed somewhere in HDFS in order to access the metastore. We don't really like the idea that passwords are stored in plain sight like that.
