I have a table in Oracle DB with one column holding BLOB data. Using Sqoop import the blob content is stored as binary value in HDFS. I want to see the original content of the blob data. How can I do that?
Thanks in Advance !!
Hi @jyothi k,
While I was doing a migration from RDBMS to Hive I did come across the same scenario, that BLOB, CLOB data.
I did approch in a manner that covert the BLOB and CLOB data using bese64 encoding (convert any kind of binary data into readable text format) store in Hive.
select UTL_ENCODE.BASE64_ENCODE(blob_column) from oracle_tbl; -- from Orcale
This gives the base64 encoding string, so that you can store this as String in Hive/hdfs, which fed to Sqoop as string.
on the other hand to convert back to BLOB you can use hive unbase64 ().
or the java Base64 package (can be used in either native java Apps or Spark etc etc..)
select unbase64(converted_blob_column) from hive_table;
for native apps you may refer the java docs for Base64 conversion here
Hope this helps !!
Seems BASE64_ENCODE has limitation on byte size, I got below error while trying import with BLOB data size ~127KB.
java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 129892, maximum: 2000)