TIMESTAMP on HIVE table

New Contributor

I'm trying to load data from Oracle into Hive as Parquet. Every time I load a table with a DATE/TIMESTAMP column into Hive, those columns are automatically converted to BIGINT. Is it possible to load timestamp/date values into Hive using Sqoop as a Parquet file?

I already tried creating the table first in Hive and then using Impala to LOAD DATA INPATH the Parquet file. It still failed, with the error:

"file XX has an incompatible Parquet schema for column XX column: TIMESTAMP"


Champion

@helpdk

Can you share your script, so that you can get a more targeted answer?

Have you tried this option?
--query 'select col1, date_function(col2), etc from table_name where $CONDITIONS'

instead of

--table table_name

Note:
1. date_function is not an actual function name; substitute whichever conversion function applies. Since the free-form query runs against the source database, that means an Oracle function here (see the sketch below).
2. Your query must end with "where $CONDITIONS"; Sqoop requires that literal token in every free-form query so it can substitute its split conditions.
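
For illustration, a minimal sketch of that approach (host, table, and column names are placeholders; TO_CHAR is used because the free-form query executes on the Oracle side):

sqoop import \
--connect "jdbc:oracle:thin:@//dbhost:1521/XE" \
--username "xxxxx" \
--password-file /user/cloudera/credentials/ora.password \
--as-parquetfile \
--hive-import \
--query "select col1, TO_CHAR(col2, 'YYYY-MM-DD HH24:MI:SS') as col2 from table_name where \$CONDITIONS" \
--hive-table table_name \
-m 1

The converted column then lands in Hive as a STRING rather than a BIGINT, and can be cast back to TIMESTAMP at query time.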

New Contributor

@saranvisa

I already tried using --query and cast the column as TIMESTAMP, but it still converted the TIMESTAMP to BIGINT in the Hive table.

Here's my query:

sqoop import \
--connect "jdbc:oracle:thin:@192.168.1.17:1521:XE" \
--username "xxxxx" \
--password-file /user/cloudera/credentials/ora.password \
--as-parquetfile \
--hive-import \
--hive-overwrite \
--target-dir /user/cloudera/oracle \
--compression-codec snappy \
--query "select EMPLOYEE_ID, START_DATE, END_DATE, JOB_ID, DEPARTMENT_ID, cast(TSTAMP as TIMESTAMP) from SQOOPDB.JOB_HISTORY where \$CONDITIONS" \
--hive-table job_history9 \
--hive-database oracle \
-m 1
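
As far as I know, this is expected behavior with Sqoop 1.x's --as-parquetfile import: the Parquet output is generated from an Avro schema that has no timestamp type, so Oracle DATE/TIMESTAMP values are written as epoch-millisecond longs and surface in Hive as BIGINT no matter what the query casts them to; Impala then reports the "incompatible Parquet schema" error when the table declares the column as TIMESTAMP. Two common workarounds are to import the column as text (for example with --map-column-java TSTAMP=String) and cast at query time, or to keep the BIGINT and convert it through a view. A minimal HiveQL sketch of the view approach, assuming the stored values are epoch milliseconds and reusing this thread's table and column names (the view name is hypothetical):

CREATE VIEW oracle.job_history9_ts AS
SELECT
  EMPLOYEE_ID,
  START_DATE,
  END_DATE,
  JOB_ID,
  DEPARTMENT_ID,
  -- from_unixtime() expects seconds, so divide the millisecond value by 1000
  CAST(from_unixtime(TSTAMP DIV 1000) AS TIMESTAMP) AS TSTAMP
FROM oracle.job_history9;

from_unixtime() returns a 'yyyy-MM-dd HH:mm:ss' string, which the outer CAST turns back into a proper TIMESTAMP; the same conversion would apply to START_DATE and END_DATE if they were also imported as BIGINT.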
