Created on 09-18-2018 05:47 PM - edited 08-17-2019 10:42 PM
I am having trouble importing data from a SQL Server database into Hive with Sqoop. I am using Hive through Ambari (Hortonworks), and after I sqooped a SQL Server table into Hive, the data in the Hive table does not match the data in the source table.
Here is the sqoop command used:
sqoop import --connect "jdbc:sqlserver://{server_name};database=DRE; username=sqoop;password=hadoop" --table NCentralAlerts --hive-import
I have attached screenshots of the data in SQL Server and in Hive.
Can anybody provide some insight into what I am doing wrong here? Why is there such a big difference between the data in the SQL Server table and the Hive table?
Thanks in advance.
Created 09-18-2018 05:53 PM
Comparing the Hive result with the SQL Server result, it looks like the delimiters are not set correctly.
1. Check the delimiter of the Hive table and set the same delimiter in sqoop.
2. If that does not help, import the data into HDFS first and check how it looks before loading it into Hive.
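The two suggestions above could be sketched roughly as follows. This is only a sketch under assumptions: it guesses that the mismatch comes from delimiter or newline characters embedded in string columns, so it adds --hive-drop-import-delims and sets the field delimiter explicitly to Hive's default (\001). The -m 1 setting (a single mapper, so no split column is needed) is also an assumption, not something from the original command. The command is assembled in a variable and printed so it can be reviewed before running.

```shell
# Connection string copied from the original post; {server_name} is a placeholder.
JDBC_URL="jdbc:sqlserver://{server_name};database=DRE; username=sqoop;password=hadoop"

# Assumed fix: strip \n, \r and \01 from string fields and use Hive's default
# field delimiter explicitly. -m 1 is an assumption to avoid needing a split column.
SQOOP_CMD="sqoop import \
  --connect \"${JDBC_URL}\" \
  --table NCentralAlerts \
  --hive-import \
  --hive-drop-import-delims \
  --fields-terminated-by '\001' \
  -m 1"

# Print the command for review; run it with: eval "$SQOOP_CMD"
printf '%s\n' "$SQOOP_CMD"
```

To validate in HDFS first, the same command without --hive-import and with a --target-dir of your choice writes plain text files that can be inspected with hdfs dfs -cat before anything is loaded into Hive.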
Created 09-18-2018 07:16 PM
When I sqooped the data into HDFS, I got the following error:
18/09/18 19:14:01 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.io.IOException: Generating splits for a textual index column allowed only in case of "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" property passed as a parameter
How do I set this property to true in the following sqoop command?
sqoop import-all-tables --connect "jdbc:sqlserver://192.168.109.69;database=DRE; username=sqoop;password=hadoop" --warehouse-dir "/user/DRE"
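As a sketch of where the property named in the error message would go: Sqoop accepts generic Hadoop arguments such as -D only directly after the tool name, before tool-specific options like --connect. The connection string and warehouse directory below are copied from the command above. Whether splitting on a text column gives acceptable results for these tables is not something this thread confirms. The command is built in a variable and printed for review rather than executed.

```shell
# Connection string copied verbatim from the command in the post.
JDBC_URL="jdbc:sqlserver://192.168.109.69;database=DRE; username=sqoop;password=hadoop"

# The -D property must come right after the tool name (import-all-tables)
# and before options such as --connect.
SQOOP_CMD="sqoop import-all-tables \
  -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
  --connect \"${JDBC_URL}\" \
  --warehouse-dir \"/user/DRE\""

# Print the command for review; run it with: eval "$SQOOP_CMD"
printf '%s\n' "$SQOOP_CMD"
```

If splitting on a text column is not wanted, running with a single mapper (-m 1) is another way to avoid generating splits at all, at the cost of parallelism.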