Created 05-04-2017 11:58 PM
Hi
I am trying to import data from a Sybase ASE table into Hive.
When running the following command:
sqoop import "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" --driver net.sourceforge.jtds.jdbc.Driver --connect jdbc:jtds:sybase://11.9.179.5:4000/BAS_csrp --username user--password xxxxxx --table csrp_total_bas --split-by cpr_nr --hcatalog-database default --hcatalog-table csrp_total_baseline --hcatalog-home /tmp/simon/std/CPRUPD/BASELINE --create-hcatalog-table --hcatalog-storage-stanza "stored as orcfile"
I get this error:
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/05/04 17:20:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.0.0-1245
17/05/04 17:20:12 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/05/04 17:20:12 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/05/04 17:20:12 INFO manager.SqlManager: Using default fetchSize of 1000
17/05/04 17:20:12 INFO tool.CodeGenTool: Beginning code generation
17/05/04 17:22:19 ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: Network error IOException: Connection timed out
java.sql.SQLException: Network error IOException: Connection timed out
    at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:436)
    at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:184)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:904)
    at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
    at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
    at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:246)
    at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:328)
    at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1853)
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1653)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at net.sourceforge.jtds.jdbc.SharedSocket.createSocketForJDBC3(SharedSocket.java:288)
    at net.sourceforge.jtds.jdbc.SharedSocket.<init>(SharedSocket.java:251)
    at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:331)
    ... 22 more
17/05/04 17:22:19 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1659)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
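The timeout happens during code generation, before any MapReduce job is started, so one thing worth ruling out first is plain network reachability of the Sybase host from the node running Sqoop. Note also that the failing command points at 11.9.179.5 while the working command below uses 10.9.179.5; if that is not just a typo in the post, the timeout may be nothing more than an unreachable host. A minimal connectivity check might look like this (assuming nc is available on the edge node, and reusing the host/port from the failing command):

# Verify the Sybase port is reachable from the node where sqoop runs
nc -vz 11.9.179.5 4000

# Smoke-test the JDBC connection itself, outside of any import logic
sqoop eval --driver net.sourceforge.jtds.jdbc.Driver \
  --connect jdbc:jtds:sybase://11.9.179.5:4000/BAS_csrp \
  --username user -P \
  --query "SELECT 1"

If both succeed from the same node where the HCatalog import was launched, the problem lies elsewhere; if they hang, it is a network or firewall issue rather than anything Hive- or HCatalog-specific.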
When running Sqoop without HCatalog, I get some warnings, but it runs against exactly the same database and table:
sqoop import "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" --driver net.sourceforge.jtds.jdbc.Driver --connect jdbc:jtds:sybase://10.9.179.5:4000/BAS_csrp --username user --password xxxxxx --table csrp_total_bas --split-by cpr_nr --target-dir /tmp/simon/std/CPRUPD/BASELINE_file/
This is the output for the Sqoop command that works:
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/05/04 17:14:47 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.0.0-1245
17/05/04 17:14:47 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/05/04 17:14:47 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/05/04 17:14:47 INFO manager.SqlManager: Using default fetchSize of 1000
17/05/04 17:14:47 INFO tool.CodeGenTool: Beginning code generation
17/05/04 17:14:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM csrp_total_bas AS t WHERE 1=0
17/05/04 17:14:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM csrp_total_bas AS t WHERE 1=0
17/05/04 17:14:47 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.0.0-1245/hadoop-mapreduce
Note: /tmp/sqoop-w20960/compile/ccd3eec0b22d23dc5103e861bf57e345/csrp_total_bas.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/05/04 17:14:49 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-w20960/compile/ccd3eec0b22d23dc5103e861bf57e345/csrp_total_bas.jar
17/05/04 17:14:49 INFO mapreduce.ImportJobBase: Beginning import of csrp_total_bas
17/05/04 17:14:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM csrp_total_bas AS t WHERE 1=0
17/05/04 17:14:50 INFO impl.TimelineClientImpl: Timeline service address: http://sktudv01hdp02.ccta.dk:8188/ws/v1/timeline/
17/05/04 17:14:50 INFO client.AHSProxy: Connecting to Application History server at sktudv01hdp02.ccta.dk/172.20.242.53:10200
17/05/04 17:14:51 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 8899 for w20960 on ha-hdfs:hdpudv01
17/05/04 17:14:51 INFO security.TokenCache: Got dt for hdfs://hdpudv01; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpudv01, Ident: (HDFS_DELEGATION_TOKEN token 8899 for w20960)
17/05/04 17:14:51 WARN token.Token: Cannot find class for token kind kms-dt
17/05/04 17:14:51 INFO security.TokenCache: Got dt for hdfs://hdpudv01; Kind: kms-dt, Service: 172.20.242.53:9292, Ident: 00 06 57 32 30 39 36 30 04 79 61 72 6e 00 8a 01 5b d4 07 2a 3d 8a 01 5b f8 13 ae 3d 3b 09
17/05/04 17:14:51 WARN ipc.Client: Failed to connect to server: sktudv01hdp02.ccta.dk/172.20.242.53:8032: retries get failed due to exceeded maximum allowed retries number: 0
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
    at org.apache.hadoop.ipc.Client.call(Client.java:1449)
    at org.apache.hadoop.ipc.Client.call(Client.java:1396)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy22.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:221)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy23.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:225)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:233)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:188)
    at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:231)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:153)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
17/05/04 17:14:51 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
17/05/04 17:14:51 WARN token.Token: Cannot find class for token kind kms-dt
17/05/04 17:14:51 INFO security.TokenCache: Got dt for hdfs://hdpudv01; Kind: kms-dt, Service: 172.20.242.54:9292, Ident: 00 06 57 32 30 39 36 30 04 79 61 72 6e 00 8a 01 5b d4 07 2a c5 8a 01 5b f8 13 ae c5 8e 04 f8 2b
17/05/04 17:14:52 INFO db.DBInputFormat: Using read commited transaction isolation
17/05/04 17:14:52 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(cpr_nr), MAX(cpr_nr) FROM csrp_total_bas
17/05/04 17:17:30 WARN db.TextSplitter: Generating splits for a textual index column.
17/05/04 17:17:30 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
17/05/04 17:17:30 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
17/05/04 17:17:30 INFO mapreduce.JobSubmitter: number of splits:6
17/05/04 17:17:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491967685506_0645
17/05/04 17:17:31 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpudv01, Ident: (HDFS_DELEGATION_TOKEN token 8899 for w20960)
17/05/04 17:17:31 WARN token.Token: Cannot find class for token kind kms-dt
17/05/04 17:17:31 WARN token.Token: Cannot find class for token kind kms-dt
Kind: kms-dt, Service: 172.20.242.53:9292, Ident: 00 06 57 32 30 39 36 30 04 79 61 72 6e 00 8a 01 5b d4 07 2a 3d 8a 01 5b f8 13 ae 3d 3b 09
17/05/04 17:17:31 WARN token.Token: Cannot find class for token kind kms-dt
17/05/04 17:17:31 WARN token.Token: Cannot find class for token kind kms-dt
Kind: kms-dt, Service: 172.20.242.54:9292, Ident: 00 06 57 32 30 39 36 30 04 79 61 72 6e 00 8a 01 5b d4 07 2a c5 8a 01 5b f8 13 ae c5 8e 04 f8 2b
17/05/04 17:17:32 INFO impl.YarnClientImpl: Submitted application application_1491967685506_0645
17/05/04 17:17:32 INFO mapreduce.Job: The url to track the job: http://sktudv01hdp01.ccta.dk:8088/proxy/application_1491967685506_0645/
17/05/04 17:17:32 INFO mapreduce.Job: Running job: job_1491967685506_0645
17/05/04 17:17:39 INFO mapreduce.Job: Job job_1491967685506_0645 running in uber mode : false
17/05/04 17:17:39 INFO mapreduce.Job: map 0% reduce 0%
17/05/04 17:20:47 INFO mapreduce.Job: map 33% reduce 0%
17/05/04 17:20:48 INFO mapreduce.Job: map 83% reduce 0%
17/05/04 17:36:50 INFO mapreduce.Job: map 100% reduce 0%
17/05/04 17:36:50 INFO mapreduce.Job: Job job_1491967685506_0645 completed successfully
17/05/04 17:36:51 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=1037068
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=774
        HDFS: Number of bytes written=47578984831
        HDFS: Number of read operations=24
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=12
    Job Counters
        Launched map tasks=6
        Other local map tasks=6
        Total time spent by all maps in occupied slots (ms)=2083062
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=2083062
        Total vcore-milliseconds taken by all map tasks=2083062
        Total megabyte-milliseconds taken by all map tasks=4266110976
    Map-Reduce Framework
        Map input records=46527615
        Map output records=46527615
        Input split bytes=774
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=14955
        CPU time spent (ms)=1281100
        Physical memory (bytes) snapshot=1798893568
        Virtual memory (bytes) snapshot=22238547968
        Total committed heap usage (bytes)=1817706496
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=47578984831
17/05/04 17:36:51 INFO mapreduce.ImportJobBase: Transferred 44.3114 GB in 1,320.7926 seconds (34.3543 MB/sec)
17/05/04 17:36:51 INFO mapreduce.ImportJobBase: Retrieved 46527615 records.
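As an aside, the TextSplitter warnings in this output come from splitting on cpr_nr, which is a text column. If the table has an integral column to split on, those warnings (and the risk of a partial or duplicate import) go away; failing that, a single mapper avoids splitting entirely. A sketch of the single-mapper variant, based on the working command above:

sqoop import --driver net.sourceforge.jtds.jdbc.Driver \
  --connect jdbc:jtds:sybase://10.9.179.5:4000/BAS_csrp \
  --username user -P \
  --table csrp_total_bas \
  --num-mappers 1 \
  --target-dir /tmp/simon/std/CPRUPD/BASELINE_file/

With --num-mappers 1 no --split-by column is required, so the -Dorg.apache.sqoop.splitter.allow_text_splitter=true override can be dropped as well; the trade-off is that the 44 GB import runs in one task instead of six.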
I really don't understand it. I guess it might be a configuration issue in Hive or Sqoop, but I am just guessing.
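Separately, two warnings that appear in both runs are easy to clean up regardless of the main problem: the plain-text password on the command line and the connection-manager fallback. A sketch, assuming the user's HDFS home directory is /user/w20960 (the username visible in the delegation tokens above):

# Keep the password in a restricted HDFS file instead of on the command line
echo -n 'xxxxxx' > sybase.password
hdfs dfs -put sybase.password /user/w20960/sybase.password
hdfs dfs -chmod 400 /user/w20960/sybase.password
rm sybase.password

Then --password xxxxxx can be replaced with --password-file /user/w20960/sybase.password, and the ConnFactory warning goes away if the manager is named explicitly with --connection-manager org.apache.sqoop.manager.GenericJdbcManager.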
Created 05-17-2017 08:22 AM
The message "WARN ipc.Client:Failed to connect to server: sktudv01hdp02.ccta.dk/172.20.242.53:8032: retries get failed due to exceeded maximum allowed retries number:0" is basically when the sqoop command tries to connect to standby Resource Manager with RM HA enabled and continues to connect to rm2 for spawning the application.
"17/05/0417:14:51 INFO client.ConfiguredRMFailoverProxyProvider:Failing over to rm2"
Created 05-19-2017 08:20 AM
@Sindhu Thank you very much.