Member since: 10-26-2017
Posts: 5
Kudos Received: 0
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3303 | 11-11-2017 04:22 PM
 | 1686 | 10-31-2017 03:23 PM
11-11-2017
04:22 PM
Finally figured this out with the help of this Stack Overflow post! The default engine when creating a table in MySQL is InnoDB, which introduces transactions, hence the error. I simply recreated the SQOOP_SESSIONS table using MyISAM instead of InnoDB.
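A minimal sketch of that fix, assuming the metastore lives in a MySQL database named `sqoop` (adjust the database name to your setup); `ALTER TABLE ... ENGINE` has the same effect as dropping and recreating the table with the new engine:

```sql
-- Switch the sqoop metastore table from InnoDB to the
-- non-transactional MyISAM engine to avoid the lock wait timeout.
-- Database name is an assumption; the table name is sqoop's default.
USE sqoop;
ALTER TABLE SQOOP_SESSIONS ENGINE = MyISAM;

-- Verify the engine actually changed.
SHOW TABLE STATUS LIKE 'SQOOP_SESSIONS';
```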
11-03-2017
12:55 PM
This didn't solve the issue unfortunately.
10-31-2017
03:34 PM
We switched sqoop to use a MySQL backed metastore.
We then created a job as follows:
sqoop job --create myjob \
  -- import \
  --append \
  --connect "jdbc:sqlserver://<server>;database=<database>" \
  --username <databaseUser> \
  --password <databasePassword> \
  --table <table> \
  --target-dir "<someDataDirectory>" \
  --hive-import \
  --hive-database <hiveDatabase> \
  --incremental append \
  --check-column <column> \
  --last-value -9999999 \
  --autoreset-to-one-mapper \
  --hive-delims-replacement " " \
  --outdir <outputdir> \
  --bindir <bindir> \
  -- -- --schema <databaseSchema>
The job is created successfully in the metastore. I queried MySQL and can see that it was created with the properties specified.
However, when I run the job, the incremental.last.value is not updated.
What can I do to start debugging this issue?
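One starting point is to inspect what sqoop actually persisted for the job. A hedged sketch, assuming the default SQOOP_SESSIONS layout (job_name / propname / propval columns) and the job name from the command above:

```sql
-- Show the saved incremental state for the job, if any.
SELECT propname, propval
FROM SQOOP_SESSIONS
WHERE job_name = 'myjob'
  AND propname = 'incremental.last.value';

-- Or dump every property saved for the job to compare
-- against what --create was given.
SELECT propname, propval
FROM SQOOP_SESSIONS
WHERE job_name = 'myjob';
```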
** UPDATE **
I dug a little deeper. I've enabled verbose logging and removed all --hive-import arguments.
I found the error that occurs when sqoop tries to update the metastore. Below is a dump of the log, starting when sqoop connects to the application servers. The error is at the bottom:
17/11/11 07:46:18 INFO client.RMProxy: Connecting to ResourceManager at <location>
17/11/11 07:46:19 INFO client.AHSProxy: Connecting to Application History server at <location>
17/11/11 07:46:54 DEBUG db.DBConfiguration: Fetching password from job credentials store
17/11/11 07:46:54 INFO db.DBInputFormat: Using read commited transaction isolation
17/11/11 07:46:54 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN([Id]), MAX([Id]) FROM [schema].[table] WHERE ( [Id] > -9999999 AND [Id] <= 500512 )
17/11/11 07:46:54 INFO db.IntegerSplitter: Split size: 125127; Num splits: 4 from: 1 to: 500512
17/11/11 07:46:54 DEBUG db.IntegerSplitter: Splits: [ 1 to 500,512] into 4 parts
17/11/11 07:46:54 DEBUG db.IntegerSplitter: 1
17/11/11 07:46:54 DEBUG db.IntegerSplitter: 125,129
17/11/11 07:46:54 DEBUG db.IntegerSplitter: 250,257
17/11/11 07:46:54 DEBUG db.IntegerSplitter: 375,385
17/11/11 07:46:54 DEBUG db.IntegerSplitter: 500,512
17/11/11 07:46:54 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '[Id] >= 1' and upper bound '[Id] < 125129'
17/11/11 07:46:54 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '[Id] >= 125129' and upper bound '[Id] < 250257'
17/11/11 07:46:54 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '[Id] >= 250257' and upper bound '[Id] < 375385'
17/11/11 07:46:54 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '[Id] >= 375385' and upper bound '[Id] <= 500512'
17/11/11 07:46:55 INFO mapreduce.JobSubmitter: number of splits:4
17/11/11 07:46:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1509726871681_3374
17/11/11 07:46:58 INFO impl.YarnClientImpl: Submitted application application_1509726871681_3374
17/11/11 07:46:58 INFO mapreduce.Job: The url to track the job: <url>
17/11/11 07:46:58 INFO mapreduce.Job: Running job: job_1509726871681_3374
17/11/11 07:47:05 INFO mapreduce.Job: Job job_1509726871681_3374 running in uber mode : false
17/11/11 07:47:05 INFO mapreduce.Job: map 0% reduce 0%
17/11/11 07:47:13 INFO mapreduce.Job: map 25% reduce 0%
17/11/11 07:47:14 INFO mapreduce.Job: map 75% reduce 0%
17/11/11 07:47:15 INFO mapreduce.Job: map 100% reduce 0%
17/11/11 07:47:15 INFO mapreduce.Job: Job job_1509726871681_3374 completed successfully
17/11/11 07:47:15 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=674124
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=476
HDFS: Number of bytes written=175543
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=44554
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=22277
Total vcore-milliseconds taken by all map tasks=22277
Total megabyte-milliseconds taken by all map tasks=34217472
Map-Reduce Framework
Map input records=863
Map output records=863
Input split bytes=476
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=367
CPU time spent (ms)=15950
Physical memory (bytes) snapshot=1008590848
Virtual memory (bytes) snapshot=13268086784
Total committed heap usage (bytes)=847773696
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=175543
17/11/11 07:47:15 INFO mapreduce.ImportJobBase: Transferred 171.4287 KB in 57.0946 seconds (3.0025 KB/sec)
17/11/11 07:47:15 INFO mapreduce.ImportJobBase: Retrieved 863 records.
17/11/11 07:47:15 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@9629756
17/11/11 07:47:15 INFO util.AppendUtils: Appending to directory table
17/11/11 07:47:15 INFO util.AppendUtils: Using found partition 4
17/11/11 07:47:16 DEBUG util.AppendUtils: Filename: _SUCCESS ignored
17/11/11 07:47:16 DEBUG util.AppendUtils: Filename: part-m-00000 repartitioned to: part-m-00004
17/11/11 07:47:16 DEBUG util.AppendUtils: Filename: part-m-00001 repartitioned to: part-m-00005
17/11/11 07:47:16 DEBUG util.AppendUtils: Filename: part-m-00002 repartitioned to: part-m-00006
17/11/11 07:47:16 DEBUG util.AppendUtils: Filename: part-m-00003 repartitioned to: part-m-00007
17/11/11 07:47:16 DEBUG util.AppendUtils: Deleting temporary folder 99599ad0da924af5979d77bb3e4fdfa1_table
17/11/11 07:47:16 INFO tool.ImportTool: Saving incremental import state to the metastore
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Checking for table: SQOOP_ROOT
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Found table: SQOOP_ROOT
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Looking up property sqoop.hsqldb.job.storage.version for version null
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: => 0
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Looking up property sqoop.hsqldb.job.info.table for version 0
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: => SQOOP_SESSIONS
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Checking for table: SQOOP_SESSIONS
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Found table: SQOOP_SESSIONS
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Creating job: myjob
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Job: myjob; Setting property sqoop.tool with class schema => import
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: Job: myjob; Getting property sqoop.tool with class schema
17/11/11 07:47:16 DEBUG hsqldb.HsqldbJobStorage: => import
17/11/11 07:48:07 DEBUG hsqldb.HsqldbJobStorage: Flushing current transaction
17/11/11 07:48:07 DEBUG hsqldb.HsqldbJobStorage: Closing connection
17/11/11 07:48:07 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error communicating with database
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:426)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.update(HsqldbJobStorage.java:445)
at org.apache.sqoop.tool.ImportTool.saveIncrementalState(ImportTool.java:164)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:528)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:243)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:298)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1078)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2159)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2462)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2379)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2363)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.setV0Property(HsqldbJobStorage.java:707)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:391)
... 12 more
17/11/11 07:48:07 DEBUG hsqldb.HsqldbJobStorage: Flushing current transaction
17/11/11 07:48:07 DEBUG hsqldb.HsqldbJobStorage: Closing connection
In short, there appears to be a lock held while sqoop is trying to update the MySQL metastore:
Lock wait timeout exceeded; try restarting transaction
It is locking on this statement:
UPDATE SQOOP_SESSIONS SET propval = 'import' WHERE job_name = 'myjob'
This doesn't seem like a valid statement, as 62 rows match that WHERE clause. Any idea what could be causing this, or where to start debugging further?
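To see which transaction holds the conflicting lock, MySQL's InnoDB introspection views can help. A sketch assuming MySQL 5.x information_schema names, run from a second session while the sqoop update is blocked:

```sql
-- In-flight transactions and the statements they are running.
SELECT trx_id, trx_state, trx_started, trx_query
FROM information_schema.INNODB_TRX;

-- Which transaction is blocking which.
SELECT * FROM information_schema.INNODB_LOCK_WAITS;

-- Full engine status, including the latest lock wait details.
SHOW ENGINE INNODB STATUS;
```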
Labels:
- Apache Sqoop
10-31-2017
03:23 PM
I figured this out. I was trying to run concurrent sqoop jobs against the local metastore, which is just a file on disk. The concurrent runs corrupted that file, so the metastore no longer functioned. We have since created a MySQL-backed metastore, which mostly works; the last issue we are trying to resolve is that it doesn't update incremental.last.value for jobs.
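For reference, the MySQL side of that switch is just a dedicated database and user for sqoop to connect to. A hedged sketch; the database name, user, host wildcard, and password here are illustrative, not from the original post:

```sql
-- Create a database and account for the sqoop metastore.
-- Sqoop creates SQOOP_ROOT / SQOOP_SESSIONS in it on first use.
CREATE DATABASE sqoop;
CREATE USER 'sqoop'@'%' IDENTIFIED BY 'changeme';
GRANT ALL PRIVILEGES ON sqoop.* TO 'sqoop'@'%';
FLUSH PRIVILEGES;
```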
10-26-2017
01:37 PM
We are running sqoop to import data from an RDBMS. We have several databases to import, with thousands of tables.
We generate sqoop jobs for each table so we can use the incremental import feature of sqoop.
We are running these jobs in parallel.
Suddenly, after many successful runs, sqoop fails to connect to what I assume is the metastore.
Running any sqoop command causes this error, for example sqoop job --list. We haven't explicitly set users on the metastore, and the job definition doesn't include a metastore connect string. Any idea what could cause this and how to fix it? Here is the error after running sqoop job --list:
Warning: /usr/hdp/2.6.2.0-205/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/hdp/2.6.2.0-205/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/10/26 09:27:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.2.0-205
17/10/26 09:27:03 ERROR tool.JobTool: I/O error performing job operation: java.io.IOException: Exception creating SQL connection
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:216)
at org.apache.sqoop.metastore.hsqldb.AutoHsqldbStorage.open(AutoHsqldbStorage.java:112)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:289)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
Caused by: java.sql.SQLException: User not found: SA
at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
at org.hsqldb.jdbc.jdbcConnection.<init>(Unknown Source)
at org.hsqldb.jdbcDriver.getConnection(Unknown Source)
at org.hsqldb.jdbcDriver.connect(Unknown Source)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:176)
... 8 more
Labels:
- Apache Sqoop