Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

SQOOP import of "image" data type into hive

Highlighted

SQOOP import of "image" data type into hive

Explorer

I am trying to import a table from MSSQL which has few columns with data type "Image"

The value in the columns are of binary format(LOBS) .

Ex

database type : MSSQL

column name : Settings

Data_type : image

value of the column:

( 0x1F8B0800000400EDBD07601C499625262FD7E0CC188CDE692EC1D69472329AB2A81CA6501320CCE74A10880604010ED9D....)

 

When I import using sqoop the COLUMNS is automatically taken as string in hive but the data is corrupted.

 

COMMAND USED :

 

sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect 'jdbc:sqlserver://IP:PORT;database=DB;' --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select GUID,Name,cast(Settings AS binary) AS Settings FROM my_table_name where \$CONDITIONS; " --m 1

 

 

 

5 REPLIES 5
Highlighted

Re: SQOOP import of "image" data type into hive

Explorer

@VidyaSargur 

 

Any sqoop expert who can throw some light on this issue ?

 

Re: SQOOP import of "image" data type into hive

Community Manager

@EricL Is this something that you can help with?

Thanks,

Vidya


Vidya Sargur, Community Manager

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Highlighted

Re: SQOOP import of "image" data type into hive

Guru
@sow,

Sorry, did not noticed your question here. I am wondering if you have tried using "--map-column-hive" option of Sqoop, which is mentioned in the doc here:
https://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#_controlling_type_mapping

So in your case, the command will look like below:

sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect 'jdbc:sqlserver://IP:PORT;database=DB;' --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select GUID,Name,Settings FROM my_table_name where \$CONDITIONS; " --m 1 --map-column-hive Settings=binary

Note that you do not need to CAST anymore from MSSQL side. See if it helps.

Cheers
Eric
Highlighted

Re: SQOOP import of "image" data type into hive

Explorer

@EricL 

 

I tried the above. the map-column-hive is not making any impact.

the column data type is taken as string in hive after import and the data is still corrupted.

 

Using hive-import and hive-table with it helped in getting the data type as binary in hive. But the data is still corrupted

cmd used :
sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect <> --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --hive-import --hive-table testc --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select * FROM my_table where \$CONDITIONS; " --m 1 --map-column-hive Settings=binary

 

Highlighted

Re: SQOOP import of "image" data type into hive

Explorer

@EricL 

 

I tried -map-column-java Settings=String and got the data type as string in hive.

however the data doesnt look corrupted but the hexadecimal representation of the value is not retained

 

cmd used : 

sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect <> --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --table 'my_table' --as-parquetfile --delete-target-dir --target-dir /user/test/ -map-column-java Settings=String --m 1 

 

Result of above query: 

column data type in source : image

column value in source : 0xFFffDeDJBF.......dDf.      --> ( A hexadecimal value )

column data type in hive: string 

column value in hive: FFffDeDJBF.......dDf

(0x is not retained)

 

 

 

 

Don't have an account?
Coming from Hortonworks? Activate your account here