
SQOOP import of "image" data type into hive


I am trying to import a table from MSSQL which has a few columns with the data type "image".

The values in these columns are in binary format (LOBs).


database type: MSSQL

column name: Settings

data type: image

value of the column:

(0x1F8B0800000400EDBD07601C499625262FD7E0CC188CDE692EC1D69472329AB2A81CA6501320CCE74A10880604010ED9D....)


When I import using Sqoop, the column is automatically taken as a string in Hive, but the data is corrupted.




sqoop import --driver '' --connect 'jdbc:sqlserver://IP:PORT;database=DB;' --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select GUID,Name,cast(Settings AS binary) AS Settings FROM my_table_name where \$CONDITIONS; " --m 1
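The "corruption" is consistent with raw LOB bytes being interpreted as text: a Hive STRING column implies UTF-8, and arbitrary binary data is generally not valid UTF-8. A minimal Python sketch illustrating this, using the first few bytes of the sample value from the question (this is an illustration of the encoding problem, not of what Sqoop itself does internally):

```python
# First bytes of the sample value from the question: 0x1F8B08...
raw = bytes.fromhex("1F8B0800000400")

# Treating arbitrary binary as UTF-8 text (what a STRING column implies)
# fails outright, because 0x8B is not valid in a UTF-8 stream:
try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    print("not valid UTF-8:", e)

# A lossy decode/encode round trip silently corrupts the bytes:
mangled = raw.decode("utf-8", errors="replace").encode("utf-8")
print(mangled != raw)  # True: the original bytes are lost
```

This is why a binary column needs a binary target type (or an explicit, reversible text encoding such as hex or Base64) rather than a plain string mapping.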








Can any Sqoop expert shed some light on this issue?


Community Manager

@EricL Is this something that you can help with?






Sorry, I did not notice your question here. I am wondering if you have tried the "--map-column-hive" option of Sqoop, which is mentioned in the doc here:

So in your case, the command would look like the below:

sqoop import --driver '' --connect 'jdbc:sqlserver://IP:PORT;database=DB;' --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select GUID,Name,Settings FROM my_table_name where \$CONDITIONS; " --m 1 --map-column-hive Settings=binary

Note that you no longer need the CAST on the MSSQL side. See if it helps.





I tried the above. The --map-column-hive option is not making any impact:

the column data type is still taken as string in Hive after import, and the data is still corrupted.


Using --hive-import and --hive-table along with it helped in getting the data type as binary in Hive, but the data is still corrupted.

cmd used :
sqoop import --driver '' --connect <> --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --hive-import --hive-table testc --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select * FROM my_table where \$CONDITIONS; " --m 1 --map-column-hive Settings=binary





I tried -map-column-java Settings=String and got the data type as string in Hive.

This time the data doesn't look corrupted, but the hexadecimal representation of the value is not fully retained:


cmd used : 

sqoop import --driver '' --connect <> --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --table 'my_table' --as-parquetfile --delete-target-dir --target-dir /user/test/ -map-column-java Settings=String --m 1 


Result of the above query:

column data type in source: image

column value in source: 0xFFffDeDJBF.......dDf  (a hexadecimal value)

column data type in hive: string

column value in hive: FFffDeDJBF.......dDf

(the "0x" prefix is not retained)
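If the STRING Hive ends up with really is just the hex digits of the original value minus the "0x" prefix (that depends on how the JDBC driver renders the binary column as a string, so treat this as an assumption), then nothing is actually lost: the raw bytes can be recovered downstream with a plain hex decode. A minimal sketch, using a truncated sample value from earlier in the thread:

```python
# Sketch: recover the raw LOB bytes from the hex string stored in Hive,
# assuming the STRING column holds the hex digits without the leading "0x".
hive_value = "1F8B0800000400"         # truncated sample value from the question
original = bytes.fromhex(hive_value)  # back to the raw binary bytes
print(original[:2])                   # b'\x1f\x8b'
```

The "0x" prefix is only SQL Server's display convention for binary literals; it carries no data, so dropping it does not by itself corrupt anything.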




