Created on 01-30-2017 02:43 PM - edited 09-16-2022 03:59 AM
I have 1000 tables in MySQL, each with more than 100,000 records and 300-500 columns. Some tables have columns with special characters, like a dot (.) or a space, in the column names. Now I want to run a Sqoop import that creates a Hive table in HDFS as a textfile in a single shot, with a command like the one below:
sqoop import --connect ${domain}:${port}/${database} --username ${username} --password ${password} \
--table ${table} -m 1 --hive-import --hive-database ${hivedatabase} --hive-table ${table} --create-hive-table \
--target-dir /user/hive/warehouse/${hivedatabase}.db/${table}
After this the Hive table is created, but when I run a SELECT * query on the table it fails with an error like the one below (a sample of the output):
Error while compiling statement: FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp from [0:emp.id, 1:emp.name, 2:emp.salary, 3:emp.dno]
However, when I do the import as a Parquet file, the table is created and a SELECT query returns the expected results. Even the dots (.) in the column names are replaced with underscores (_) automatically.
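For reference, a Parquet import can be run as the same command with the --as-parquetfile flag added (a sketch using the same placeholders as above; --as-parquetfile is the standard Sqoop flag for Parquet output, the rest is unchanged):

sqoop import --connect ${domain}:${port}/${database} --username ${username} --password ${password} \
--table ${table} -m 1 --as-parquetfile --hive-import --hive-database ${hivedatabase} --hive-table ${table} \
--target-dir /user/hive/warehouse/${hivedatabase}.db/${table}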
Why is this happening?
Created on 02-04-2017 09:17 PM - edited 02-04-2017 09:19 PM
If you are having trouble passing the special characters, try using --map-column-java and you should be able to suppress those errors; type-cast to your needs. Or you can use --map-column-hive.
sqoop import ... --map-column-java id=String,value=Integer
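With the --map-column-hive variant the value is a Hive type rather than a Java type; a minimal sketch using the same placeholder columns as above:

sqoop import ... --map-column-hive id=STRING,value=INT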
Created 02-04-2017 09:22 PM
Hi Csguna,
Doing this for one or two columns is fine, but doing it for more than 200 columns is where I am stuck.
sqoop import ... --map-column-java id=String,value=Integer
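One way to avoid typing hundreds of mappings by hand is to generate the --map-column-java argument from MySQL's information_schema. A minimal sketch, not from the thread: it assumes the mysql CLI client is available, maps every column to String purely for illustration (adjust the type logic to your needs), and raises group_concat_max_len because 300-500 columns would overflow the default 1024-byte limit:

# Build "col1=String,col2=String,..." for every column of ${table}.
MAPPING=$(mysql -h "${domain}" -P "${port}" -u "${username}" -p"${password}" -N -B -e "
  SET SESSION group_concat_max_len = 1000000;
  SELECT GROUP_CONCAT(CONCAT(column_name, '=String'))
  FROM information_schema.columns
  WHERE table_schema = '${database}' AND table_name = '${table}';")

# Pass the generated mapping to the same import command.
sqoop import --connect ${domain}:${port}/${database} --username ${username} --password ${password} \
--table ${table} -m 1 --map-column-java "${MAPPING}" \
--hive-import --hive-database ${hivedatabase} --hive-table ${table} --create-hive-table \
--target-dir /user/hive/warehouse/${hivedatabase}.db/${table}

Note that column names containing dots or spaces may still need special handling in the mapping keys, which is the original problem here, so treat this only as a way to avoid the manual typing.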