Member since 12-03-2018 · 3 Posts · 0 Kudos Received · 0 Solutions
12-27-2018 05:07 AM
Hello Sir,

Thank you very much for your reply. I am trying to import data from Oracle into Hive using Sqoop. The Oracle table has two columns: the first holds ordinary data and the second holds XML data (XMLTYPE). To extract the XML as text I used the getStringVal function, and that expression is what causes the error when Sqoop creates the Hive table.

sqoop import --connect "jdbc:oracle:thin:@xxx:1521:dcpdb" --username "nblt24" --password "nblt24" --query 'SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where $CONDITIONS' --target-dir /home/hduser/hadoop/share/hadoop/hdfsa --delete-target-dir --num-mappers 1 --hive-import --hive-table hr.f_company_xml_tab --verbose

Is there any other way to import XML data?

Regards,
Arman
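A common workaround is to alias the XMLTYPE expression inside --query, so the column Sqoop hands to Hive is a plain identifier rather than the raw expression `XMLTYPE.getStringVal(XMLRECORD)`. Below is a hedged sketch of the same import with an `AS XMLRECORD` alias added; the (masked) host, SID, and credentials are taken from the post, the alias name is my choice, and the command is only printed here because running it needs a live Oracle and Hadoop setup:

```shell
# Same free-form query as in the post, but the XMLTYPE expression now has a
# plain alias (AS XMLRECORD), so the generated Hive column name is legal.
# $CONDITIONS must stay literal (single quotes) for Sqoop to substitute it.
QUERY='SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) AS XMLRECORD FROM F_COMPANY WHERE $CONDITIONS'

# -P prompts for the password instead of passing it on the command line,
# as the Sqoop warning in the log itself suggests.
CMD="sqoop import \
  --connect jdbc:oracle:thin:@xxx:1521:dcpdb \
  --username nblt24 -P \
  --query \"$QUERY\" \
  --target-dir /home/hduser/hadoop/share/hadoop/hdfsa \
  --delete-target-dir --num-mappers 1 \
  --hive-import --hive-table hr.f_company_xml_tab"

# Printed rather than executed; run it on a host with Sqoop and the Oracle
# JDBC driver on the classpath.
echo "$CMD"
```

Note that getStringVal returns at most VARCHAR2(4000); for larger XML documents, getClobVal with Sqoop's large-object handling would be the direction to look, but that is beyond what this command shows.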
12-24-2018 01:54 PM
I am stuck importing XML data from an Oracle database. Please help me.

hduser@hadoop:~$ sqoop import --connect "jdbc:oracle:thin:@xxx:1521:dcpdb" --username "nblt24" --password "nblt24" --query 'SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where $CONDITIONS' --target-dir /home/hduser/hadoop/share/hadoop/hdfsa --delete-target-dir --num-mappers 1 --hive-import --hive-table hr.f_company_xml_tab --verbose

Warning: /home/hduser/sqoop/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hduser/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hduser/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hduser/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/12/24 12:01:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
18/12/24 12:01:00 DEBUG tool.BaseSqoopTool: Enabled debug logging.
18/12/24 12:01:00 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/12/24 12:01:00 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/12/24 12:01:00 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/12/24 12:01:00 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
18/12/24 12:01:00 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
18/12/24 12:01:00 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
18/12/24 12:01:00 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
18/12/24 12:01:00 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/12/24 12:01:00 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
18/12/24 12:01:00 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@xx:1521
18/12/24 12:01:00 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache.
18/12/24 12:01:00 INFO manager.SqlManager: Using default fetchSize of 1000
18/12/24 12:01:00 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@4df50bcc
18/12/24 12:01:00 INFO tool.CodeGenTool: Beginning code generation
18/12/24 12:01:00 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where (1 = 0)
18/12/24 12:01:00 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@xxx.45:1521:dcpdb, using username: nblt24
18/12/24 12:01:00 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection.
18/12/24 12:01:00 INFO manager.OracleManager: Time zone has been set to GMT
18/12/24 12:01:00 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
18/12/24 12:01:00 INFO manager.SqlManager: Executing SQL statement: SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where (1 = 0)
18/12/24 12:01:00 DEBUG manager.SqlManager: Found column RECID of type [12, 255, 0]
18/12/24 12:01:00 DEBUG manager.SqlManager: Found column XMLTYPE.GETSTRINGVAL(XMLRECORD) of type [12, 4000, 0]
18/12/24 12:01:00 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
18/12/24 12:01:00 INFO manager.SqlManager: Executing SQL statement: SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where (1 = 0)
18/12/24 12:01:00 DEBUG manager.SqlManager: Found column RECID
18/12/24 12:01:00 DEBUG manager.SqlManager: Found column XMLTYPE.GETSTRINGVAL(XMLRECORD)
18/12/24 12:01:00 DEBUG orm.ClassWriter: selected columns:
18/12/24 12:01:00 DEBUG orm.ClassWriter:   RECID
18/12/24 12:01:00 DEBUG orm.ClassWriter:   XMLTYPE_GETSTRINGVAL_XMLRECORD_
18/12/24 12:01:00 DEBUG orm.ClassWriter: Writing source file: /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.java
18/12/24 12:01:00 DEBUG orm.ClassWriter: Table name: null
18/12/24 12:01:00 DEBUG orm.ClassWriter: Columns: RECID:12, XMLTYPE.GETSTRINGVAL(XMLRECORD):12,
18/12/24 12:01:00 DEBUG orm.ClassWriter: sourceFilename is QueryResult.java
18/12/24 12:01:00 DEBUG orm.CompilationManager: Found existing /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/
18/12/24 12:01:00 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hduser/hadoop
18/12/24 12:01:00 DEBUG orm.CompilationManager: Returning jar file path /home/hduser/hadoop/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-common-2.8.0.jar:/home/hduser/hadoop/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/hadoop-common-2.8.0.jar
18/12/24 12:01:00 DEBUG orm.CompilationManager: Current sqoop classpath = /home/hduser/hadoop/etc/hadoop:/home/hduser/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/home/hduser/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hduser/hadoop/contrib/capacity-scheduler/*.jar
18/12/24 12:01:00 DEBUG orm.CompilationManager: Adding source file: /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.java
18/12/24 12:01:00 DEBUG orm.CompilationManager: Invoking javac with args:
18/12/24 12:01:00 DEBUG orm.CompilationManager:   -sourcepath
18/12/24 12:01:00 DEBUG orm.CompilationManager:   /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/
18/12/24 12:01:00 DEBUG orm.CompilationManager:   -d
18/12/24 12:01:00 DEBUG orm.CompilationManager:   /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/
18/12/24 12:01:00 DEBUG orm.CompilationManager:   -classpath
18/12/24 12:01:00 DEBUG orm.CompilationManager:   /home/hduser/hadoop/etc/hadoop:/home/hduser/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/home/hduser/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.8.0.jar:/home/hduser/sqoop/sqoop-1.4.6.jar
Note: /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/12/24 12:01:01 DEBUG orm.CompilationManager: Could not rename /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.java to /home/hduser/./QueryResult.java
18/12/24 12:01:01 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.jar
18/12/24 12:01:01 DEBUG orm.CompilationManager: Scanning for .class files in directory: /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c
18/12/24 12:01:01 DEBUG orm.CompilationManager: Got classfile: /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.class -> QueryResult.class
18/12/24 12:01:01 DEBUG orm.CompilationManager: Finished writing jar file /tmp/sqoop-hduser/compile/a5b10acdb0159723f9a2d3fb7b9fc84c/QueryResult.jar
18/12/24 12:01:02 INFO tool.ImportTool: Destination directory /home/hduser/hadoop/share/hadoop/hdfsa deleted.
18/12/24 12:01:02 INFO mapreduce.ImportJobBase: Beginning query import.
18/12/24 12:01:02 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
18/12/24 12:01:02 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/12/24 12:01:02 DEBUG db.DBConfiguration: Securing password into job credentials store
18/12/24 12:01:02 DEBUG mapreduce.DataDrivenImportJob: Using table class: QueryResult
18/12/24 12:01:02 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat: class com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat
18/12/24 12:01:02 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@xxxx:1521:dcpdb/nblt24
18/12/24 12:01:02 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/12/24 12:01:02 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hduser/sqoop/sqoop-1.4.6.jar
18/12/24 12:01:02 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hduser/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
18/12/24 12:01:02 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/12/24 12:01:06 DEBUG db.DBConfiguration: Fetching password from job credentials store
18/12/24 12:01:06 INFO db.DBInputFormat: Using read commited transaction isolation
18/12/24 12:01:06 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '1=1' and upper bound '1=1'
18/12/24 12:01:06 INFO mapreduce.JobSubmitter: number of splits:1
18/12/24 12:01:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1545568287772_0004
18/12/24 12:01:07 INFO impl.YarnClientImpl: Submitted application application_1545568287772_0004
18/12/24 12:01:07 INFO mapreduce.Job: The url to track the job: http://hadoop:8088/proxy/application_1545568287772_0004/
18/12/24 12:01:07 INFO mapreduce.Job: Running job: job_1545568287772_0004
18/12/24 12:01:13 INFO mapreduce.Job: Job job_1545568287772_0004 running in uber mode : false
18/12/24 12:01:13 INFO mapreduce.Job: map 0% reduce 0%
18/12/24 12:01:19 INFO mapreduce.Job: map 100% reduce 0%
18/12/24 12:01:20 INFO mapreduce.Job: Job job_1545568287772_0004 completed successfully
18/12/24 12:01:20 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=154736
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=784988
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=3847
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=3847
		Total vcore-milliseconds taken by all map tasks=3847
		Total megabyte-milliseconds taken by all map tasks=3939328
	Map-Reduce Framework
		Map input records=203
		Map output records=203
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=56
		CPU time spent (ms)=1570
		Physical memory (bytes) snapshot=209362944
		Virtual memory (bytes) snapshot=1988337664
		Total committed heap usage (bytes)=114294784
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=784988
18/12/24 12:01:20 INFO mapreduce.ImportJobBase: Transferred 766.5898 KB in 17.9024 seconds (42.8204 KB/sec)
18/12/24 12:01:20 INFO mapreduce.ImportJobBase: Retrieved 203 records.
18/12/24 12:01:20 DEBUG hive.HiveImport: Hive.inputTable: null
18/12/24 12:01:20 DEBUG hive.HiveImport: Hive.outputTable: hr.f_company_xml_tab
18/12/24 12:01:20 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where (1 = 0)
18/12/24 12:01:20 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@xxx:1521:dcpdb/nblt24
18/12/24 12:01:20 INFO manager.OracleManager: Time zone has been set to GMT
18/12/24 12:01:20 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
18/12/24 12:01:20 INFO manager.SqlManager: Executing SQL statement: SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where (1 = 0)
18/12/24 12:01:20 DEBUG manager.SqlManager: Found column RECID of type [12, 255, 0]
18/12/24 12:01:20 DEBUG manager.SqlManager: Found column XMLTYPE.GETSTRINGVAL(XMLRECORD) of type [12, 4000, 0]
18/12/24 12:01:20 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
18/12/24 12:01:20 INFO manager.SqlManager: Executing SQL statement: SELECT RECID, XMLTYPE.getStringVal(XMLRECORD) FROM F_COMPANY where (1 = 0)
18/12/24 12:01:20 DEBUG manager.SqlManager: Found column RECID
18/12/24 12:01:20 DEBUG manager.SqlManager: Found column XMLTYPE.GETSTRINGVAL(XMLRECORD)
18/12/24 12:01:20 DEBUG hive.TableDefWriter: Create statement: CREATE TABLE IF NOT EXISTS `hr.f_company_xml_tab` ( `RECID` STRING, `XMLTYPE.GETSTRINGVAL(XMLRECORD)` STRING) COMMENT 'Imported by sqoop on 2018/12/24 12:01:20' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE
18/12/24 12:01:20 DEBUG hive.TableDefWriter: Load statement: LOAD DATA INPATH 'hdfs://localhost:9000/home/hduser/hadoop/share/hadoop/hdfsa' INTO TABLE `hr.f_company_xml_tab`
18/12/24 12:01:20 INFO hive.HiveImport: Loading uploaded data into Hive
18/12/24 12:01:20 DEBUG hive.HiveImport: Using external Hive process.
18/12/24 12:01:25 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
18/12/24 12:01:25 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hduser/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
18/12/24 12:01:25 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hduser/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
18/12/24 12:01:25 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hduser/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
18/12/24 12:01:25 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
18/12/24 12:01:25 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
18/12/24 12:01:26 INFO hive.HiveImport:
18/12/24 12:01:26 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/hduser/hive/lib/hive-common-2.3.3.jar!/hive-log4j2.properties Async: true
18/12/24 12:01:28 INFO hive.HiveImport: FailedPredicateException([., :] can not be used in column name in create table statement.,{}?)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.hive.ql.parse.HiveParser.throwColumnNameException(HiveParser.java:1256)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:32991)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrPKOrFK(HiveParser.java:33432)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrPKOrFKList(HiveParser.java:29513)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:6175)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:3808)
18/12/24 12:01:28 INFO hive.HiveImport: 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
18/12/24 12:01:28 INFO hive.HiveImport: 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
18/12/24 12:01:28 INFO hive.HiveImport: 	at java.lang.reflect.Method.invoke(Method.java:498)
18/12/24 12:01:28 INFO hive.HiveImport: 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
18/12/24 12:01:28 INFO hive.HiveImport: FAILED: ParseException line 1:108 Failed to recognize predicate ')'. Failed rule: '[., :] can not be used in column name in create table statement.' in column specification
18/12/24 12:02:26 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 64
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:389)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:339)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:240)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
18/12/24 12:02:26 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@xxx:1521:dcpdb/nblt24
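The log pinpoints the failure: the import itself succeeds (203 records land in HDFS), but Hive rejects the generated CREATE TABLE because the column name `XMLTYPE.GETSTRINGVAL(XMLRECORD)` contains `.` and parentheses, which are not legal in a Hive column name. Notice that Sqoop's own code generator already sanitized the same expression into a legal Java identifier (`XMLTYPE_GETSTRINGVAL_XMLRECORD_` in the ClassWriter lines above) by mapping every non-identifier character to `_`. A minimal sketch reproducing that mapping, to show exactly which characters trip Hive:

```shell
# The raw column name Sqoop passed to Hive's CREATE TABLE:
col='XMLTYPE.GETSTRINGVAL(XMLRECORD)'

# Map every character outside [A-Za-z0-9_] to '_', the same sanitization the
# Sqoop ClassWriter applied for the generated Java field name.
safe=$(printf '%s' "$col" | sed 's/[^A-Za-z0-9_]/_/g')

echo "$safe"   # XMLTYPE_GETSTRINGVAL_XMLRECORD_
```

Rather than renaming the column after the fact, the cleaner fix is to alias the expression in the --query itself (e.g. `XMLTYPE.getStringVal(XMLRECORD) AS XMLRECORD`), so Hive only ever sees a plain identifier.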
Labels:
- Apache Hadoop
- Apache Hive
- Apache Sqoop