Created on 11-21-2019 12:06 AM - last edited on 11-21-2019 01:00 AM by VidyaSargur
I am trying to import a table from MSSQL which has few columns with data type "Image"
The value in the columns are of binary format(LOBS) .
Ex
database type : MSSQL
column name : Settings
Data_type : image
value of the column:
( 0x1F8B0800000400EDBD07601C499625262FD7E0CC188CDE692EC1D69472329AB2A81CA6501320CCE74A10880604010ED9D....)
When I import using sqoop the COLUMNS is automatically taken as string in hive but the data is corrupted.
COMMAND USED :
sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect 'jdbc:sqlserver://IP:PORT;database=DB;' --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select GUID,Name,cast(Settings AS binary) AS Settings FROM my_table_name where \$CONDITIONS; " --m 1
Created 11-26-2019 06:03 AM
Created 11-27-2019 02:16 AM
@EricL Is this something that you can help with?
Thanks,
Vidya
Regards,
Vidya Sargur,Created 11-27-2019 04:00 AM
Created 11-27-2019 04:20 AM
I tried the above. the map-column-hive is not making any impact.
the column data type is taken as string in hive after import and the data is still corrupted.
Using hive-import and hive-table with it helped in getting the data type as binary in hive. But the data is still corrupted
cmd used :
sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect <> --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --hive-import --hive-table testc --as-parquetfile --delete-target-dir --target-dir '/user/test/' --query "select * FROM my_table where \$CONDITIONS; " --m 1 --map-column-hive Settings=binary
Created on 11-27-2019 04:29 AM - edited 11-27-2019 04:40 AM
I tried -map-column-java Settings=String and got the data type as string in hive.
however the data doesnt look corrupted but the hexadecimal representation of the value is not retained
cmd used :
sqoop import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver' --connect <> --connection-manager 'org.apache.sqoop.manager.SQLServerManager' --username <> -password <> --table 'my_table' --as-parquetfile --delete-target-dir --target-dir /user/test/ -map-column-java Settings=String --m 1
Result of above query:
column data type in source : image
column value in source : 0xFFffDeDJBF.......dDf. --> ( A hexadecimal value )
column data type in hive: string
column value in hive: FFffDeDJBF.......dDf
(0x is not retained)
Created 09-25-2023 02:06 PM
Created 09-25-2023 03:17 PM
@DaveNepal Hello! Welcome to the Welcome to the Cloudera Community!
As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks.
Regards,
Diana Torres,