
Sqoop Metastore - Usage, Supported databases, backup


Contributor

1. It's not very clear from the HDP documentation whether the Sqoop metastore supports MySQL or any database other than HSQLDB. Has anyone tried using MySQL or Postgres for the Sqoop metastore?

2. I think the metastore is required only for saved Sqoop jobs and is not needed otherwise. Is this correct?

3. If only HSQLDB is supported, what are the backup and DR strategies for it?

1 ACCEPTED SOLUTION


Re: Sqoop Metastore - Usage, Supported databases, backup

Mentor

Sqoop metastore on MySQL is not out yet, but there are JIRAs open. There are tutorials out there, but I was never able to make it work.

Yes, the metastore holds metadata about jobs: the last record processed, the job name, etc.

There really aren't any backup strategies for HSQLDB. It's a file on disk, and while some in-memory data has not yet been flushed to disk you have potential data loss. Maybe put it on RAID 1? Use ECC memory chips, etc.


Re: Sqoop Metastore - Usage, Supported databases, backup

Mentor

I should mention that you can store the job definition in source control and write the last record processed to an additional destination such as HDFS as a precaution. I would do that with the Java HDFS API at some point, to make sure you don't lose the row sequence.
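
As a rough illustration of the "write it to HDFS as well" idea, here is a minimal sketch. Note that it swaps the Java HDFS API for the `hdfs dfs -put` command-line client called from Python, just to keep it short. The function names, the `.last_value` file suffix and the directory layout are all hypothetical, not something Sqoop provides.

```python
import subprocess
from typing import Callable, List


def build_put_command(hdfs_path: str) -> List[str]:
    # "hdfs dfs -put -f - <path>" reads from stdin and overwrites the target.
    return ["hdfs", "dfs", "-put", "-f", "-", hdfs_path]


def save_checkpoint(job_name: str, last_value, hdfs_dir: str,
                    run: Callable = subprocess.run) -> str:
    """Write the last processed value to <hdfs_dir>/<job_name>.last_value.

    The path layout is an assumption made for this sketch. `run` is
    injectable so the function can be exercised without a cluster.
    """
    path = f"{hdfs_dir.rstrip('/')}/{job_name}.last_value"
    payload = f"{last_value}\n".encode("utf-8")
    # check=True raises CalledProcessError if the hdfs client fails.
    run(build_put_command(path), input=payload, check=True)
    return path
```

Calling something like `save_checkpoint("orders_import", 1234, "/checkpoints/sqoop")` after each successful run would leave a second copy of the value outside the metastore.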

Re: Sqoop Metastore - Usage, Supported databases, backup

Contributor

Sounds good, thanks. I am looking to store the last record info in an enterprise-level metadata store in DB2.


Re: Sqoop Metastore - Usage, Supported databases, backup

Mentor

Sure, whatever works. You can show the info on a completed Sqoop job in the metastore and parse that, then store it by whatever means is convenient for you.
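
A minimal sketch of the "show and parse" step, assuming the `sqoop job --show <job>` output lists its options as `key = value` lines and that the incremental state appears under `incremental.last.value`. The function names are hypothetical, and the command may prompt for a password if one is not stored with the job.

```python
import subprocess
from typing import Dict, Optional


def parse_job_show(output: str) -> Dict[str, str]:
    """Collect 'key = value' lines; headers and separator lines are skipped."""
    props: Dict[str, str] = {}
    for line in output.splitlines():
        # Split on the first '=' only, so values such as JDBC URLs stay intact.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def last_value(job_name: str) -> Optional[str]:
    """Run 'sqoop job --show' and return the stored last value, if present."""
    result = subprocess.run(
        ["sqoop", "job", "--show", job_name],
        capture_output=True, text=True, check=True,
    )
    return parse_job_show(result.stdout).get("incremental.last.value")
```

The value returned by `last_value` can then be written to whatever store you prefer, DB2 included.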


Re: Sqoop Metastore - Usage, Supported databases, backup

Contributor

We are currently using a MySQL metastore in our test environment, so it is possible (we run HDP 2.5). The only catch is that Oozie requires the sqoop-site.xml file to be placed somewhere in HDFS in order to access the metastore. We don't really like the idea that passwords are stored in plain text like that.
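
To see at a glance which entries of a sqoop-site.xml would expose a password before the file is copied to HDFS, something along these lines could help. It is only a sketch: it assumes the usual Hadoop-style `<configuration>/<property>/<name>/<value>` layout, the function name is hypothetical, and matching on the word "password" in the property name is a heuristic, not an official check.

```python
import xml.etree.ElementTree as ET
from typing import List


def plaintext_password_properties(xml_text: str) -> List[str]:
    """Return names of properties that look like they hold a literal password."""
    root = ET.fromstring(xml_text)
    flagged: List[str] = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        # Ignore empty values and boolean switches that merely mention "password".
        if "password" in name.lower() and value and value.lower() not in ("true", "false"):
            flagged.append(name)
    return flagged
```

Any name it returns is a candidate for tighter HDFS permissions on the file or for moving the secret elsewhere.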
