Using Parquet in sqoop import automatically converts the datatypes in Hive table

Champion

Hi

 

We are planning to use the Parquet format in our existing Sqoop import process, but it automatically converts the data types of some columns in the Hive table, as shown below:

 

Table_Name    Column_Name    Data_Type (before Parquet)    Data_Type (after Parquet)
Table1        actiontime     string                        bigint
Table1        createdate     string                        bigint
Table1        action         string                        string
Table1        studentid      double                        string

The Sqoop import feeds our staging load. If Parquet automatically converts the data types, it will impact the subsequent code, so we want to keep the source data types in Hive as they are. How can we customize the data types in a Sqoop import when using Parquet?

 

 

Thanks

Kumar

 

1 REPLY

Re: Using Parquet in sqoop import automatically converts the datatypes in Hive table

Champion

Would you consider using --map-column-hive <mapping> in your Sqoop import execution?

Example:

sqoop import ... --map-column-hive actiontime=string,createdate=string
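
For reference, a fuller command might look like the sketch below. The JDBC URL, credentials, database, and table names are placeholders for illustration only; adjust them to your environment. The --map-column-hive argument takes a comma-separated list of column=hive-type pairs and overrides the Hive types Sqoop would otherwise generate.

# Placeholder connection details; replace the JDBC URL, credentials,
# source table, and target Hive database/table with your own values.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/source_db \
  --username sqoop_user \
  --password-file /user/sqoop/.password \
  --table Table1 \
  --hive-import \
  --hive-table staging.table1 \
  --as-parquetfile \
  --map-column-hive actiontime=string,createdate=string,studentid=double

Columns not listed in the mapping keep Sqoop's default type mapping, and the list must not contain spaces around the commas.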