Created 06-23-2016 01:53 PM
EDITED:
Just in case somebody ends up in this question trying to set MySQL as the Sqoop metastore, the instructions to do that are described in this post:
https://community.hortonworks.com/articles/55937/using-sqoop-with-mysql-as-metastore.html
Hi,
I have a single Hortonworks node, using Sqoop to import some tables from an SQLServer database. Whenever I try to execute two Oozie workflows simultaneously (each of which performs a Sqoop import job), I get an exception related to the hsqldb metastore.
3606 [main] ERROR org.apache.sqoop.tool.JobTool - I/O error performing job operation: java.io.IOException: Exception creating SQL connection at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:216) at org.apache.sqoop.metastore.hsqldb.AutoHsqldbStorage.open(AutoHsqldbStorage.java:112) at org.apache.sqoop.tool.JobTool.run(JobTool.java:274) at org.apache.sqoop.Sqoop.run(Sqoop.java:148) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235) at org.apache.sqoop.Sqoop.main(Sqoop.java:244) at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197) at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177) at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47) at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.sql.SQLException: The database is already in use by another process: org.hsqldb.persist.NIOLockFile@9a5ed4be[file =/home/badevelop/.sqoop/metastore.db.lck, exists=true, locked=false, valid=false, fl =null]: java.lang.Exception: checkHeartbeat(): lock file [/home/david/.sqoop/metastore.db.lck] is presumably locked by another process. at org.hsqldb.jdbc.Util.sqlException(Unknown Source) at org.hsqldb.jdbc.jdbcConnection.<init>(Unknown Source) at org.hsqldb.jdbcDriver.getConnection(Unknown Source) at org.hsqldb.jdbcDriver.connect(Unknown Source) at java.sql.DriverManager.getConnection(DriverManager.java:664) at java.sql.DriverManager.getConnection(DriverManager.java:247) at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:176) ... 25 more
I assume that when one of the processes gets the .lock file in order to access to the hsqldb database, it prevents the other from getting into the metastore, raising the exception. Is this behaviour normal using the default Sqoop configuration? (I haven't modified any config property in the sqoop-site.xml file or in the metastore.db.* scripts). Since the metastore is going to be used by many Oozie workflows, I don't understand why this message seems to imply that the access to the database has to be sequential.
Will changing the metastore to another database (for instance MySQL) solve the problem?
Any hint would be greatly appreciated.
Created 06-24-2016 09:18 AM
Yes you need to move to non-default database to avoid such issues. derby is not meant for concurrent access.
Created 06-24-2016 09:18 AM
Yes you need to move to non-default database to avoid such issues. derby is not meant for concurrent access.
Created 06-24-2016 09:46 AM
Would implement an own database to handle sqoop metastore have any potential trade off (beyond implementation itself) or everything that sqoop stores and retrieves from the derby default metastore can be implemented on a ad hoc one?
Do you think a MySql can be a good choice to implement a custom sqoop metastore or would you rather pick any other database format?
Created 06-24-2016 10:01 AM
Thanks, Artem.
As I read from another question you answered, you were not able to use MySql as a metastore (https://community.hortonworks.com/questions/22152/sqoop-metastore-usage-supported-databases-backup.h...
Do you think it is worth having a try with MySQL, or would you use another database? Maybe Postgres?
Created 08-03-2016 07:14 PM
the database flavor has nothing to do with it, it's a matter of Sqoop having that functionality, which at this moment it does not.