Created on 06-04-2021 04:42 PM - last edited on 06-07-2021 07:34 AM by VidyaSargur
Hello everybody,
I am looking for ways to increase speed of data ingestion in my Company's Datalake with, specifically, use of Oracle Data Connector in Sqoop.
Our HortonWorks(Now Cloudera) environment uses Sqoop 1.4.6.
Following theses docs in sqoop site I asked our dba's Team a user with this features:
A) create and alter session privileges
B) Object privileges:
Playing with this Sqoop feature I ran this command:
Sqoop import
-Doraoop.jdbc.url.verbatim=true
--direct
--connect
jdbc:oracle:thin:@myserver.myserver.myserver:1521/mydb
--username
myuser
--password-alias
********
--table
myschema.mytable
--target-dir
/mypath/mytable
--verbose
And get this output:
>>> Invoking Sqoop command line now >>> 2021-06-04 18:05:04,323 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. 2021-06-04 18:05:04,365 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.6.5.0-292 2021-06-04 18:05:04,388 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging. 2021-06-04 18:05:04,390 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Reflecting credential provider classes and methods 2021-06-04 18:05:04,391 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Found org.apache.hadoop.security.alias.CredentialProvider 2021-06-04 18:05:04,399 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Found org.apache.hadoop.security.alias.CredentialProviderFactory 2021-06-04 18:05:04,400 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Found CredentialProvider#createCredentialEntry 2021-06-04 18:05:04,400 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Found CredentialProvider#flush 2021-06-04 18:05:04,401 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Found Configuration#getPassword 2021-06-04 18:05:04,401 [main] DEBUG org.apache.sqoop.util.password.CredentialProviderHelper - Found CredentialProviderFactory#getProviders 2021-06-04 18:05:04,402 [main] DEBUGorg.apache.sqoop.util.password.CredentialProviderHelper - Resolving alias with credential provider path set to jceks://hdfs/mypath/my.jceks 2021-06-04 18:05:04,672 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. 2021-06-04 18:05:04,685 [main] DEBUG org.apache.sqoop.ConnFactory - Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory 2021-06-04 18:05:04,709 [main] DEBUG org.apache.sqoop.ConnFactory - Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory 2021-06-04 18:05:04,709 [main] DEBUG org.apache.sqoop.ConnFactory - Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory 2021-06-04 18:05:04,752 [main] DEBUG org.apache.sqoop.manager.oracle.OraOopManagerFactory - Data Connector for Oracle and Hadoop can be called by Sqoop! 2021-06-04 18:05:04,771 [main] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000 2021-06-04 18:05:05,699 [main] DEBUG org.apache.sqoop.manager.oracle.OraOopUtilities - The Oracle table context has been derived from:oracleConnectionUserName = myuser
tableStr = myschema.mytable
as:
owner : myschema
table : mytable 2021-06-04 18:05:06,333 [main] INFO org.apache.sqoop.manager.oracle.OraOopManagerFactory - ************************************************** *** Using Data Connector for Oracle and Hadoop *** ************************************************** 2021-06-04 18:05:06,334 [main] DEBUG org.apache.sqoop.manager.oracle.OraOopUtilities - Updated mapred.child.java.opts from "-Xmx200m" to "-Djava.security.egd=file:///dev/urandom -Xmx200m" 2021-06-04 18:05:06,353 [main] ERROR org.apache.sqoop.manager.oracle.OraOopManagerFactory - Unable to obtain the Oracle database version. java.sql.SQLException: Result set after last row at oracle.jdbc.driver.GeneratedScrollableResultSet.getInt(GeneratedScrollableResultSet.java:570) [...] 2021-06-04 18:05:06,377 [main] INFO org.apache.sqoop.manager.oracle.OraOopManagerFactory - This Oracle database is a RAC. 2021-06-04 18:05:06,377 [main] INFO org.apache.sqoop.manager.oracle.OraOopManagerFactory - Data Connector for Oracle and Hadoop will not use dynamically generated JDBC URLs - this feature has been disabled. 2021-06-04 18:05:06,377 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts 2021-06-04 18:05:06,377 [main] DEBUG org.apache.sqoop.ConnFactory - Instantiated ConnManager org.apache.sqoop.manager.oracle.OraOopConnManager@20134094 2021-06-04 18:05:06,378 [main] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation 2021-06-04 18:05:06,391 [main] DEBUG org.apache.sqoop.manager.oracle.OraOopOracleQueries - getTableColumns() : sql = SELECT column_name, data_type FROM dba_tab_columns WHERE owner = ? and table_name = ? and (DATA_TYPE IN ('BINARY_DOUBLE','BINARY_FLOAT','BLOB','CHAR','CLOB','DATE','FLOAT','LONG','NCHAR','NCLOB','NUMBER','NVARCHAR2','RAW','ROWID','URITYPE','VARCHAR2') OR DATA_TYPE LIKE 'INTERVAL YEAR(%) TO MONTH' OR DATA_TYPE LIKE 'INTERVAL DAY(%) TO SECOND(%)' OR DATA_TYPE LIKE 'TIMESTAMP(%)' OR DATA_TYPE LIKE 'TIMESTAMP(%) WITH TIME ZONE' OR DATA_TYPE LIKE 'TIMESTAMP(%) WITH LOCAL TIME ZONE') ORDER BY column_id 2021-06-04 18:05:06,605 [main] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.RuntimeException: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist <<< Invocation of Sqoop command completed <<<
After some googling about the message "Unable to obtain the Oracle database version." I arrived on this link where I can see how OraOop identifys Oracle Database Version between lines 163 and 190 with this query:
SELECT
v.banner,
rtrim(v.version) full_version,
rtrim(v.version_bit) version_bit,
substr(v.version, 1, instr(v.version, '.', 1, 1) - 1) major,
substr(v.version, instr(v.version, '.', 1, 1) + 1, instr(v.version, '.', 1, 2) - instr(v.version, '.', 1, 1) - 1) minor,
substr(v.version, instr(v.version, '.', 1, 2) + 1, instr(v.version, '.', 1, 3) - instr(v.version, '.', 1, 2) - 1) version,
substr(v.version, instr(v.version, '.', 1, 3) + 1, instr(v.version, '.', 1, 4) - instr(v.version, '.', 1, 3) - 1) patch,
decode(instr(v.banner, '64bit'), 0, 'False', 'True') isdb64bit,
decode(instr(b.banner, 'HPUX'), 0, 'False', 'True') ishpux
FROM (
SELECT
ROWNUM row_num,
banner,
substr(substr(banner, instr(banner, 'Release ') + 8), 1) version_bit,
substr(substr(banner, instr(banner, 'Release ') + 8), 1, instr(substr(banner, instr(banner, 'Release ') + 8), ' ')) version
FROM
v$version
WHERE
banner LIKE 'Oracle%'
OR banner LIKE 'Personal Oracle%'
) v,
v$version b
WHERE v.row_num = 1 and b.banner like 'TNS%' ;
When I run the query in Database it returned me no lines, but when I run the query without the bold part above i receve this line as aswer:
BANNER FULL_VERSION VERSION_BIT MAJOR MINOR VERSION PATCH ISDB6 ISHPU
------------------------------------------------------------------------- -------------- ------------------------- ------- ------- --------- ------ ----- -----
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production 18.0.0.0.0 18.0.0.0.0 - Production 18 0 0 0 False False
So the OraOop code will only work when this query return at least one line, but I cannot modidy this class code for my needs.... after this long way now I m looking for a compatibility matrix between sqoop 1.4.6 versus Oracle Database versions, but couldn't find one.
Anyone can help me with it?
I'm in the right way to solve this problem?
Any thoughts will be appreciated.
Thanks!
Created on 06-14-2021 11:34 PM - edited 06-14-2021 11:38 PM
Hello @cardozogp ,
I am just wondering if you've seen the below documentation regarding the Sqoop - Oracle compatibility:
Hope it helps you to move forward with your new requirement!
Kind regards:
Ferenc
Ferenc Erdelyi, Technical Solutions Manager
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:
Created 06-15-2021 11:41 AM
Hi @Bender,
Thanks for your replay. The Faq wasn't useful. It's aswer is:
"With BDA V2.0 Sqoop automatically supports Oracle Database and MySQL. Hence connect strings beginning with jdbc:oracle or jdbc:mysql:// are handled with no additional setup."
😞