How can I convert the BLOB data to the actual file when importing the Oracle table data to HDFS using Sqoop

Hi Team,

I have a table in an Oracle DB with one column holding BLOB data. When I import it with Sqoop, the BLOB content is stored as a binary value in HDFS. I want to see the original content of the BLOB data. How can I do that?

Please help.

Thanks in Advance !!

2 REPLIES

Super Collaborator

Hi @jyothi k,

While doing a migration from an RDBMS to Hive, I came across the same scenario with BLOB and CLOB data.

My approach was to convert the BLOB and CLOB data using Base64 encoding (which converts any kind of binary data into a readable text format) and store it in Hive.

select UTL_ENCODE.BASE64_ENCODE(blob_column) from oracle_tbl;  -- from Oracle

This gives the Base64-encoded string, which you can store as a string in Hive/HDFS and which Sqoop can ingest as a string column.
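
For reference, a rough sketch of what the Sqoop side could look like with a free-form query (the connection string, credentials, paths, and column names below are placeholders, not taken from the original post):

sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott \
  --password-file /user/scott/.oracle.pwd \
  --query 'SELECT id, UTL_ENCODE.BASE64_ENCODE(blob_column) AS blob_b64 FROM oracle_tbl WHERE $CONDITIONS' \
  --target-dir /user/hive/warehouse/oracle_tbl_b64 \
  -m 1

Note that Sqoop requires the literal $CONDITIONS token in a free-form --query, even when running with a single mapper.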

To convert back to the original BLOB, you can use Hive's unbase64() function,

or the Java Base64 package (which can be used in native Java apps, Spark, etc.).

Example:

select unbase64(converted_blob_column) from hive_table;

For native apps, you may refer to the Java docs for Base64 conversion.
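
For a native Java app, a minimal sketch of decoding the stored string back into the original file could look like the following (the class name, arguments, and output path are hypothetical, just for illustration):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class BlobDecoder {
    public static void main(String[] args) throws Exception {
        // args[0]: the Base64 string read from the Hive/HDFS column
        // args[1]: path of the file to write the original binary content to
        String encoded = args[0];

        // Decode the Base64 text back into the original BLOB bytes
        byte[] original = Base64.getDecoder().decode(encoded);

        // Write the bytes out as the actual file (PDF, image, etc.)
        Files.write(Paths.get(args[1]), original);
    }
}

If the encoded text contains line breaks (MIME-style Base64 output often does), Base64.getMimeDecoder() is the more forgiving choice, since it ignores line separators.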

Hope this helps !!

It seems BASE64_ENCODE has a limitation on byte size; I got the error below when trying to import a BLOB of about 127 KB.

java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 129892, maximum: 2000)

Sundar Gampa