Sqoop Metastore - Usage, Supported databases, backup

Expert Contributor

1. It's not very clear from the HDP documentation whether Sqoop supports MySQL or any DB other than HSQLDB. Has anyone tried using MySQL or Postgres for the Sqoop metastore?

2. I think the metastore is required only for saved Sqoop jobs and is not needed otherwise. Is this correct?

3. If only HSQLDB is supported, what are the backup and DR strategies for it?

1 ACCEPTED SOLUTION

Master Mentor

A Sqoop metastore on MySQL is not available yet, but there are JIRAs open for it. There are tutorials out there, but I was never able to make it work.

Yes, the metastore holds metadata about saved jobs: the job name, the last record processed, and so on.

There really aren't any backup strategies for HSQLDB. It's a file on disk, and as long as some in-memory data has not been flushed to disk you have potential data loss. Maybe put it on RAID 1, use ECC memory chips, etc.


5 REPLIES

Master Mentor

I should mention that you can store the job definitions in source control and, as a precaution, write the last record processed to an additional destination such as HDFS. I would do that with the Java HDFS API at some point in the flow, to make sure the row sequence is not lost.
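As a rough sketch of that precaution (in Python rather than the Java HDFS API, and shelling out to the standard `hdfs dfs -put -f` command instead of calling an API), something like the following could work. The function names, the tab-separated checkpoint format and the paths are my own assumptions for illustration, not anything Sqoop defines.

```python
import os
import tempfile


def write_checkpoint(path, job_name, last_value):
    """Write 'job_name<TAB>last_value' to a local file (hypothetical format).

    The data goes to a temporary file first and is then renamed over the
    target, so a crash mid-write does not leave a half-written checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{job_name}\t{last_value}\n")
    os.replace(tmp_path, path)


def hdfs_put_command(local_path, hdfs_path):
    """Build the argument list that copies the checkpoint to HDFS.

    '-f' overwrites an existing destination file.
    """
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_path]
```

On a node with the Hadoop client installed, the command list can be passed to `subprocess.run(cmd, check=True)` right after each successful import, so the copy in HDFS never lags more than one run behind the metastore.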

Expert Contributor

Sounds good, thanks. I am looking to store the last-record info in an enterprise-level metadata store in DB2.

Master Mentor

Sure, whatever works. You can show the info for a completed Sqoop job from the metastore, parse that output, and then store it by whatever means is convenient for you.
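A minimal sketch of the parsing step, assuming the output of `sqoop job --show <job-id>` lists the saved options as `key = value` lines and that an incremental import exposes `incremental.last.value`. The sample text and the job name `daily_orders` are made up for illustration; check them against what your Sqoop version actually prints.

```python
import subprocess


def parse_job_show(text):
    """Collect 'key = value' lines into a dict; other lines are ignored."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep and key.strip():
            props[key.strip()] = value.strip()
    return props


def show_job(job_name):
    """Run 'sqoop job --show' and return its stdout (needs the sqoop CLI)."""
    result = subprocess.run(
        ["sqoop", "job", "--show", job_name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample output, used here instead of calling show_job().
SAMPLE = """Job: daily_orders
Tool: import
Options:
----------------------------
incremental.col = id
incremental.last.value = 48213
incremental.mode = AppendRows
"""

props = parse_job_show(SAMPLE)
print(props["incremental.last.value"])
```

The value comes back as a string, so it can be written as-is to DB2 or any other store without the parser having to know whether the check column is numeric or a timestamp.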

Rising Star

We are currently using a MySQL metastore in our test environment, so it is possible (we run HDP 2.5). The only catch is that Oozie requires the sqoop-site.xml file to be placed somewhere in HDFS so that it can access the metastore. We don't really like the idea that passwords are stored in plain sight like that.