Created 05-01-2017 01:11 PM
Sqoop Code:
export HADOOP_CLASSPATH=$(hcat -classpath)
export HIVE_HOME=/usr/hdp/current/hive-client
export HCAT_HOME=/usr/hdp/current/hive-webhcat
export LIB_JARS=$HCAT_HOME/share/hcatalog/hive-hcatalog-core.jar,$HIVE_HOME/lib/hive-metastore.jar,$HIVE_HOME/lib/libthrift-0.9.3.jar,$HIVE_HOME/lib/hive-exec.jar,$HIVE_HOME/lib/libfb303-0.9.3.jar,$HIVE_HOME/lib/jdo-api-3.0.1.jar,$HIVE_HOME/lib/hive-cli.jar
sqoop import \
  -libjars $LIB_JARS \
  --username aaa -P \
  --connect jdbc:teradata://teradata.com \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --delete-target-dir \
  --hive-drop-import-delims \
  --target-dir /user/aaa/filename \
  --fields-terminated-by '\001' \
  --num-mappers 1 \
  --hive-import \
  --query "SELECT id, cast(CREATE_DTS as date) FROM db.tbl WHERE cast(create_dts as date) BETWEEN '2016-02-11' AND '2017-04-27'" \
  --hive-table db.tbl
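One quick way to sanity-check the shell quoting of the --query value (a minimal sketch; the query text is simply the one from the command above) is to echo the exact argument before running the import:

echo "SELECT id, cast(CREATE_DTS as date) FROM db.tbl WHERE cast(create_dts as date) BETWEEN '2016-02-11' AND '2017-04-27'"

If the date literals print without their single quotes, the shell has consumed them and Teradata would see bare arithmetic like 2016-02-11 instead of date strings, which silently changes what the WHERE clause matches.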
Error Log:
17/05/01 09:01:32 INFO common.ConnectorPlugin: load plugins in jar:file:/usr/hdp/2.5.3.0-37/sqoop/lib/teradata-connector.jar!/teradata.connector.plugins.xml
17/05/01 09:01:32 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
17/05/01 09:01:32 INFO hive.metastore: Trying to connect to metastore with URI thrift://hpchdp2.hpc.ford.com:9083
17/05/01 09:01:32 INFO hive.metastore: Connected to metastore.
17/05/01 09:01:32 INFO processor.TeradataInputProcessor: input preprocessor com.teradata.connector.teradata.processor.TeradataSplitByPartitionProcessor starts at: 1493643692818
17/05/01 09:01:33 INFO utils.TeradataUtils: the input database product is Teradata
17/05/01 09:01:33 INFO utils.TeradataUtils: the input database version is 15.10
17/05/01 09:01:33 INFO utils.TeradataUtils: the jdbc driver version is 15.10
17/05/01 09:01:33 INFO processor.TeradataInputProcessor: the teradata connector for hadoop version is: 1.5.1
17/05/01 09:01:33 INFO processor.TeradataInputProcessor: input jdbc properties are jdbc:teradata://tera07.dearborn.ford.com/DATABASE=FAST_VIEW
17/05/01 09:01:33 INFO processor.TeradataInputProcessor: the number of mappers are 1
17/05/01 09:01:33 INFO processor.TeradataInputProcessor: input preprocessor com.teradata.connector.teradata.processor.TeradataSplitByPartitionProcessor ends at: 1493643693834
17/05/01 09:01:33 INFO processor.TeradataInputProcessor: the total elapsed time of input preprocessor com.teradata.connector.teradata.processor.TeradataSplitByPartitionProcessor is: 1s
17/05/01 09:01:33 INFO hive.metastore: Trying to connect to metastore with URI thrift://hpchdp2.hpc.ford.com:9083
17/05/01 09:01:33 INFO hive.metastore: Connected to metastore.
17/05/01 09:01:34 INFO impl.TimelineClientImpl: Timeline service address: http://hpchdp2.hpc.ford.com:8188/ws/v1/timeline/
17/05/01 09:01:34 INFO client.RMProxy: Connecting to ResourceManager at hpchdp2.hpc.ford.com/19.1.2.65:8050
17/05/01 09:01:34 INFO client.AHSProxy: Connecting to Application History server at hpchdp2.hpc.ford.com/19.1.2.65:10200
17/05/01 09:01:34 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 24252766 for bpanneer on ha-hdfs:hdp2cluster
17/05/01 09:01:34 INFO security.TokenCache: Got dt for hdfs://hdp2cluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdp2cluster, Ident: (HDFS_DELEGATION_TOKEN token 24252766 for bpanneer)
17/05/01 09:01:39 INFO impl.TimelineClientImpl: Timeline service address: http://hpchdp2.hpc.ford.com:8188/ws/v1/timeline/
17/05/01 09:01:39 INFO client.RMProxy: Connecting to ResourceManager at hpchdp2.hpc.ford.com/19.1.2.65:8050
17/05/01 09:01:39 INFO client.AHSProxy: Connecting to Application History server at hpchdp2.hpc.ford.com/19.1.2.65:10200
17/05/01 09:01:39 WARN mapred.ResourceMgrDelegate: getBlacklistedTrackers - Not implemented yet
17/05/01 09:01:39 INFO mapreduce.JobSubmitter: number of splits:1
17/05/01 09:01:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491131890127_1216905
17/05/01 09:01:39 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdp2cluster, Ident: (HDFS_DELEGATION_TOKEN token 24252766 for bpanneer)
17/05/01 09:01:39 INFO impl.YarnClientImpl: Submitted application application_1491131890127_1216905
17/05/01 09:01:39 INFO mapreduce.Job: The url to track the job: http://hpchdp2.hpc.ford.com:8088/proxy/application_1491131890127_1216905/
17/05/01 09:01:39 INFO mapreduce.Job: Running job: job_1491131890127_1216905
17/05/01 09:01:54 INFO mapreduce.Job: Job job_1491131890127_1216905 running in uber mode : false
17/05/01 09:01:54 INFO mapreduce.Job: map 0% reduce 0%
17/05/01 09:02:09 INFO mapreduce.Job: map 100% reduce 0%
17/05/01 09:02:09 INFO mapreduce.Job: Job job_1491131890127_1216905 completed successfully
17/05/01 09:02:09 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=207049
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=1488
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=2
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=1
	Job Counters
		Launched map tasks=1
		Rack-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=25094
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=12547
		Total vcore-milliseconds taken by all map tasks=12547
		Total megabyte-milliseconds taken by all map tasks=51392512
	Map-Reduce Framework
		Map input records=0
		Map output records=0
		Input split bytes=1488
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=68
		CPU time spent (ms)=4410
		Physical memory (bytes) snapshot=514166784
		Virtual memory (bytes) snapshot=5195005952
		Total committed heap usage (bytes)=554696704
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=0
17/05/01 09:02:09 INFO hive.metastore: Trying to connect to metastore with URI thrift://hpchdp2.hpc.ford.com:9083
17/05/01 09:02:09 INFO hive.metastore: Connected to metastore.
17/05/01 09:02:09 INFO utils.HiveUtils: load data into hive table
17/05/01 09:02:09 INFO hive.metastore: Trying to connect to metastore with URI thrift://hpchdp2.hpc.ford.com:9083
17/05/01 09:02:09 INFO hive.metastore: Connected to metastore.
17/05/01 09:02:11 INFO session.SessionState: Created local directory: /tmp/06532b73-4079-4dbc-8aca-eaeac27d1a1d_resources
17/05/01 09:02:11 INFO session.SessionState: Created HDFS directory: /tmp/hive/bpanneer/06532b73-4079-4dbc-8aca-eaeac27d1a1d
17/05/01 09:02:11 INFO session.SessionState: Created local directory: /tmp/bpanneer/06532b73-4079-4dbc-8aca-eaeac27d1a1d
17/05/01 09:02:11 INFO session.SessionState: Created HDFS directory: /tmp/hive/bpanneer/06532b73-4079-4dbc-8aca-eaeac27d1a1d/_tmp_space.db
17/05/01 09:02:11 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:11 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:11 INFO ql.Driver: We are setting the hadoop caller context from CLI to bpanneer_20170501090211_af34564c-708e-4ae7-b2eb-e69ffe9937c1
17/05/01 09:02:11 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:11 INFO parse.ParseDriver: Parsing command: Load data inpath '/user/bpanneer/temp_090133' into table gdiastg1.fva_gfstwb1_a_fva_jrnl_dtl_vw_test0428
17/05/01 09:02:12 INFO parse.ParseDriver: Parse Completed
17/05/01 09:02:12 INFO log.PerfLogger: </PERFLOG method=parse start=1493643731904 end=1493643732653 duration=749 from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:12 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
FAILED: SemanticException Line 1:17 Invalid path ''/user/bpanneer/temp_090133'': No files matching path hdfs://hdp2cluster/user/bpanneer/temp_090133
17/05/01 09:02:12 ERROR ql.Driver: FAILED: SemanticException Line 1:17 Invalid path ''/user/bpanneer/temp_090133'': No files matching path hdfs://hdp2cluster/user/bpanneer/temp_090133
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:17 Invalid path ''/user/bpanneer/temp_090133'': No files matching path hdfs://hdp2cluster/user/bpanneer/temp_090133
	at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraintsAndGetFiles(LoadSemanticAnalyzer.java:146)
	at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:224)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:230)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:464)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1219)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1260)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1146)
	at com.teradata.connector.hive.utils.HiveUtils.loadDataintoHiveTable(HiveUtils.java:336)
	at com.teradata.connector.hive.processor.HiveOutputProcessor.outputPostProcessor(HiveOutputProcessor.java:274)
	at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:162)
	at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:58)
	at org.apache.sqoop.teradata.TeradataSqoopImportHelper.runJob(TeradataSqoopImportHelper.java:374)
	at org.apache.sqoop.teradata.TeradataConnManager.importQuery(TeradataConnManager.java:532)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:509)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
17/05/01 09:02:12 INFO log.PerfLogger: </PERFLOG method=compile start=1493643731863 end=1493643732719 duration=856 from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:12 INFO metadata.Hive: Dumping metastore api call timing information for : compilation phase
17/05/01 09:02:12 INFO metadata.Hive: Total time spent in this metastore function was greater than 1000ms : getFunctions_(String, String, )=1808
17/05/01 09:02:12 INFO ql.Driver: We are resetting the hadoop caller context to CLI
17/05/01 09:02:12 INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:12 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1493643732720 end=1493643732720 duration=0 from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:12 ERROR teradata.TeradataSqoopImportHelper: Exception running Teradata import job
com.teradata.connector.common.exception.ConnectorException: FAILED: SemanticException Line 1:17 Invalid path ''/user/bpanneer/temp_090133'': No files matching path hdfs://hdp2cluster/user/bpanneer/temp_090133
	at com.teradata.connector.hive.processor.HiveOutputProcessor.outputPostProcessor(HiveOutputProcessor.java:286)
	at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:162)
	at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:58)
	at org.apache.sqoop.teradata.TeradataSqoopImportHelper.runJob(TeradataSqoopImportHelper.java:374)
	at org.apache.sqoop.teradata.TeradataConnManager.importQuery(TeradataConnManager.java:532)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:509)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
17/05/01 09:02:12 INFO teradata.TeradataSqoopImportHelper: Teradata import job completed with exit code 1
17/05/01 09:02:12 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Exception running Teradata import job
	at org.apache.sqoop.teradata.TeradataSqoopImportHelper.runJob(TeradataSqoopImportHelper.java:377)
	at org.apache.sqoop.teradata.TeradataConnManager.importQuery(TeradataConnManager.java:532)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:509)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
Caused by: com.teradata.connector.common.exception.ConnectorException: FAILED: SemanticException Line 1:17 Invalid path ''/user/bpanneer/temp_090133'': No files matching path hdfs://hdp2cluster/user/bpanneer/temp_090133
	at com.teradata.connector.hive.processor.HiveOutputProcessor.outputPostProcessor(HiveOutputProcessor.java:286)
	at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:162)
	at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:58)
	at org.apache.sqoop.teradata.TeradataSqoopImportHelper.runJob(TeradataSqoopImportHelper.java:374)
	... 9 more
17/05/01 09:02:12 INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
17/05/01 09:02:12 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1493643732742 end=1493643732742 duration=0 from=org.apache.hadoop.hive.ql.Driver>
Please Help!
Thanks
Created 05-01-2017 02:18 PM
Hi,
From the error message it looks like it is trying to load data from a non-existent directory, hdfs://hdp2cluster/user/bpanneer/temp_090133:
Invalid path ''/user/bpanneer/temp_090133'': No files matching path hdfs://hdp2cluster/user/bpanneer/temp_090133
Is this path available on HDFS?
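You can check it directly with the HDFS CLI, for example (path copied from your log; adjust as needed):

# list the staging directory the LOAD DATA statement pointed at
hdfs dfs -ls hdfs://hdp2cluster/user/bpanneer/temp_090133
# or test whether it exists as a directory
hdfs dfs -test -d /user/bpanneer/temp_090133 && echo "exists" || echo "missing"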
Created 05-01-2017 03:09 PM
Hi Ward,
Yes, it is available, and each time I run the job a temp file is created with a new temp_id like the one above (temp_090133), but it turns out the temporary data is not being stored there.
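One way to confirm whether the connector ever writes files there is to poll the listing while the job runs (a rough sketch; the staging prefix /user/bpanneer and the temp_ naming are taken from the log above):

# watch for staging files appearing and disappearing during the import
while true; do
  hdfs dfs -ls /user/bpanneer | grep temp_
  sleep 5
done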