Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

sqoop import of BLOB columns from oracle database

sqoop import of BLOB columns from oracle database

Explorer

I have to import table from oracle that contains both blob and clob

 

--map-column-java CLOB_column=String works fine

--map-column-java BLOB_column=String throws error

 

How do we import blob data type columns from oracle using sqoop ?

 

Query used:

 

sqoop import --connect 'jdbc:oracle:thin:@serveri_ip:port/database' --username <> -password <> --query "SELECT *  FROM table_name WHERE \$CONDITIONS" --map-column-java CMID=Integer,DATAPROP=String --as-parquetfile --delete-target-dir --target-dir /user/test/ --m 1

 

DATAPROP --> is the BLOB column

 

4 REPLIES 4

Re: sqoop import of BLOB columns from oracle database

New Contributor

@sow 

 

Actually i guess that you have to map the blob datatype to binary instead of a string. Try the below command hope it works:

sqoop import --connect 'jdbc:oracle:thin:@serveri_ip:port/database' --username <> -password <> --query "SELECT *  FROM table_name WHERE \$CONDITIONS" --map-column-java CMID=Integer,DATAPROP=binary --as-parquetfile --delete-target-dir --target-dir /user/test/ --m 1

 

Re: sqoop import of BLOB columns from oracle database

Explorer

@pavan_kumar 

 

I tried with --map-column-java column_name=binary /Binary

Below is the errors that i get : 

20/01/05 10:04:58 ERROR tool.ImportTool: Import failed: No ResultSet method for Java type binary

 

20/01/05 10:04:07 ERROR tool.ImportTool: Import failed: No ResultSet method for Java type Binary

 

i tried map-column-hive column_name=binary

The sqoop import was success but when i query the table in impala it shows the below error

AnalysisException: Unsupported type 'BINARY' in Table

All other columns are accessible but the blob one is corrupted and hence unable to query

 

 

 

 

 

Highlighted

Re: sqoop import of BLOB columns from oracle database

Mentor

@sow 

 

Impala does not allow binary data. What you can do is use a serialize-deserialize methodology. This means you convert your image to a String format that still contains all the information necessary to transform it back. Once you need to retrieve an image on HDFS you will need to deserialize, meaning converting the string to the original format.

Found this example using Python it  would work like this:

 

import base64

def img_to_string(image_path):
with open(image_path, "rb") as imageFile:
image_string= base64.b64encode(imageFile.read())
print image_string

 

def string_to_img(image_string):
with open("new_image.png", "wb") as imageFile:
imageFile.write(str.decode('base64'))

 

 

 

Re: sqoop import of BLOB columns from oracle database

Explorer

@Shelton 

 

Thank you for your inputs.

The above shows serialize-deserializing of a file that contains images.

The one I am trying is to import a column in oracle table to HDFS using sqoop.
Other documentations shows mapping the column to string using  --map-column-java as the solution.

But the same throws error:

Caused by: java.sql.SQLException: Invalid column type: getString/getNString not implemented for class oracle.jdbc.driver.T4CBlobAccessor

 

Don't have an account?
Coming from Hortonworks? Activate your account here