I'm trying to import a large database with tons of tables and columns from MySQL to Hive using Sqoop.
MySQL 8.0, Sqoop 1.4.7, Hive 3.1.1
To achieve that, I first used the following command (IP, port, and database name are placeholders):
sqoop-import-all-tables --connect jdbc:mysql://IP:port/mydatabase -P --hive-import
But there is some incompatibility between the data types in MySQL and Hive, and I received the following error message:
ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.io.IOException: Hive does not support the SQL type for column rowguid
So, looking at the Sqoop 1.4.7 documentation, I noticed there is a parameter --map-column-hive <column-name>=<hive-type>. I added --map-column-hive rowguid=binary to my command line and got this new error:
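For reference, the full command I ran at this point looked roughly like this (IP, port, and database name are placeholders):

```shell
sqoop-import-all-tables \
  --connect jdbc:mysql://IP:port/mydatabase \
  -P \
  --hive-import \
  --map-column-hive rowguid=binary
```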
ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.IllegalArgumentException: No column by the name rowguid found while importing data
java.lang.IllegalArgumentException: No column by the name rowguid found while importing data
My database has many tables and tons of columns. With the command above, the import runs normally until Sqoop moves on to the table that comes after the one actually containing the rowguid column: since --map-column-hive applies to every table in an import-all-tables run, it fails as soon as it hits a table without that column.
Of course, this mapping should apply only to the specific table where the column exists, not to all tables.
Is there a way to specify table-name + column-name in the --map-column-hive parameter? Or is there another way around this issue, perhaps importing all tables while converting the data types automatically?
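One workaround I'm considering (untested; mytable is a placeholder for the table that actually holds rowguid) is to exclude that table from the bulk import and then load it separately with the column mapping:

```shell
# Step 1: import everything except the table with the incompatible column
sqoop-import-all-tables \
  --connect jdbc:mysql://IP:port/mydatabase \
  -P \
  --hive-import \
  --exclude-tables mytable

# Step 2: import that single table, mapping rowguid explicitly
sqoop import \
  --connect jdbc:mysql://IP:port/mydatabase \
  -P \
  --hive-import \
  --table mytable \
  --map-column-hive rowguid=binary
```

But this gets tedious if several tables have incompatible columns, so a single-command solution would be preferable.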