Issues With Sqoop Ingestion


Hi,

I am trying to ingest data from Oracle into S3 in Avro format and to create Hive tables on top of that location using the auto-generated AVSC schema file. However, I found datatype mismatches between Oracle and Hive. By default, DATE columns are converted to bigint (epoch), and NUMBER, DECIMAL, and VARCHAR columns are converted to string.
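
To illustrate what the epoch values look like, here is a minimal sketch in plain Python (not Sqoop or Hive itself). It assumes the bigint holds milliseconds since 1970-01-01 UTC, which should be verified against the actual data; the helper name and the sample value are hypothetical.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(millis: int) -> datetime:
    """Convert an epoch value in milliseconds to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, leftover_ms = divmod(millis, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=leftover_ms * 1000)


# Hypothetical sample value, assumed to be in milliseconds.
sample = 1_546_300_800_000
print(epoch_millis_to_utc(sample).isoformat())
```

If the stored values turn out to be in seconds rather than milliseconds, the division by 1000 would have to be dropped.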

Our requirement is that date columns must not be stored as epoch values. To handle this, during the Sqoop ingestion we map date columns to String, NUMBER columns to Integer, and DECIMAL columns to Double.

As a result, in Hive the date columns are now of type string and the decimal columns are of type double.
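
One side effect of the Decimal-to-Double mapping worth keeping in mind is that doubles are binary floating point, so exact decimal arithmetic is no longer guaranteed. The following is a small plain-Python sketch of that general behavior, not of Hive or Sqoop; the amounts are made-up values.

```python
from decimal import Decimal

# Made-up amounts, first as exact decimals, then as binary doubles.
exact_total = Decimal("0.10") + Decimal("0.20")
double_total = 0.10 + 0.20

# The decimal sum is exactly 0.30; the double sum is only close to it.
print(exact_total == Decimal("0.30"))
print(double_total == 0.30)
print(abs(double_total - 0.30) < 1e-9)
```

Whether this matters depends on how the decimal columns are used downstream, for example in equality joins or financial totals.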

 

1. Would keeping date columns as strings affect Spark processing time?

2. Can these string columns still be used to compare results, for example to perform aggregations between two dates?

3. Or would casting all date columns to timestamp at a later stage in Spark solve these issues?
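
To make questions 2 and 3 concrete, here is a minimal plain-Python sketch (not Spark code) of how date strings behave when compared. It assumes the strings are written as `yyyy-MM-dd HH:mm:ss`; the format constant, helper name, and sample values are hypothetical.

```python
from datetime import datetime

# Assumed string layout; adjust to whatever format the ingestion actually writes.
ISO_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_ts(text: str) -> datetime:
    """Parse a date string in the assumed layout into a datetime."""
    return datetime.strptime(text, ISO_FORMAT)


earlier = "2019-02-01 00:00:00"
later = "2019-11-15 12:30:00"

# With year-first, zero-padded strings, text order matches chronological order.
iso_order_matches = (earlier < later) == (parse_ts(earlier) < parse_ts(later))

# With a day-first layout, text order can disagree with chronological order.
day_first_2018 = "15-02-2018"
day_first_2019 = "01-12-2019"
text_says_2018_first = day_first_2018 < day_first_2019

# Date arithmetic needs real datetime values, not strings.
days_between = (parse_ts(later) - parse_ts(earlier)).days

print(iso_order_matches, text_says_2018_first, days_between)
```

Under these assumptions, simple ordering comparisons work on the strings only when the layout is year-first and zero-padded, while differences between two dates require converting to a date or timestamp type first.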

 

 

 

 

But as per our requirement we need Date columns in timestamp in hive 
