Created on 09-21-2016 05:25 AM - edited 09-16-2022 03:40 AM
Hello guys,
I am using Sqoop to transfer data from PostgreSQL to Hive, using Parquet storage.
I have a problem with the PostgreSQL type "timestamp without time zone", which is mapped to BIGINT and becomes unusable for querying.
I found the import option: --map-column-hive varname=type.
I have tried with the types STRING and TIMESTAMP; both are ignored, and I do not get any message about it.
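For reference, the command looks roughly like this (connection details and the table/column names here are placeholders):

sqoop import \
  --connect jdbc:postgresql://dbhost:5432/mydb \
  --username myuser -P \
  --table events \
  --hive-import \
  --hive-table events \
  --as-parquetfile \
  --map-column-hive created_at=STRING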
Versions used: Sqoop 1.4.6, PostgreSQL 9.4, CDH 5.8.
Any idea is welcome.
Created 09-23-2016 12:58 AM
Is nobody else facing this problem?
Created 09-30-2016 12:26 PM
Did you find a solution for this problem?
I am having the same issue. I've scoured the internet, and I believe that in my case --map-column-hive is not working in combination with the --as-parquetfile option, though I am not sure why.
A workaround that I am trying is to import the data as a regular text file, create a Hive table over the imported file, and cast the column to TIMESTAMP there.
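As a rough sketch (connection details, paths, and the table/column names are all placeholders, and I am assuming the timestamps arrive as strings like 2016-10-13 04:43:00):

# 1) Import as plain text (no --as-parquetfile), so the timestamp
#    comes across as a string
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/mydb \
  --username myuser -P \
  --table events \
  --target-dir /user/hive/staging/events

-- 2) In Hive: an external table over the text files, then CTAS into
--    Parquet with an explicit cast
CREATE EXTERNAL TABLE events_staging (id BIGINT, created_at STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hive/staging/events';

CREATE TABLE events_parquet STORED AS PARQUET AS
SELECT id, CAST(created_at AS TIMESTAMP) AS created_at
FROM events_staging;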
Created 10-07-2016 04:20 AM
Do we have any solution for this? I am facing the same issue.
Created on 10-08-2016 05:02 AM - edited 10-08-2016 05:05 AM
Could you let us know the timestamp format? Is it something like yyyy-mm-dd hh:mm:ss?
Created 10-13-2016 04:43 AM
Hello,
In PostgreSQL, I have my datetime stored as TIMESTAMP WITHOUT TIME ZONE.
I have tried with other column types as well; the option is not applied when the import creates a Hive table in Parquet format.
Created 10-13-2016 06:37 AM
I would consider trying to cast the BIGINT to TIMESTAMP.
Also, please refer to this document, which I read a while back. I am quoting it from the Cloudera manual:
If you use Sqoop to convert RDBMS data to Parquet, be careful with interpreting any resulting values from DATE, DATETIME, or TIMESTAMP columns.
The underlying values are represented as the Parquet INT64 type, which is represented as BIGINT in the Impala table.
The Parquet values represent the time in milliseconds, while Impala interprets BIGINT as the time in seconds.
Therefore, if you have a BIGINT column in a Parquet table that was imported this way from Sqoop, divide the values by 1000 when interpreting as the TIMESTAMP type.
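For example, in Impala that division plus cast would look like this (created_at is a hypothetical BIGINT column holding milliseconds, in a hypothetical table my_parquet_table):

SELECT CAST(created_at / 1000 AS TIMESTAMP) AS created_ts
FROM my_parquet_table;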
I guess there is an underlying problem with TIMESTAMP when you use Parquet files.
Created 10-13-2016 06:42 AM
Hi csguna,
that is not the point: there is an option for changing/forcing the type, and it is being ignored when importing a table from PostgreSQL to Hive in Parquet format.
Created 10-13-2016 07:52 AM