Created 07-01-2016 09:33 AM
Hi,
I have a table where I have defined all the number fields as DECIMAL(38,0). Every time I do inserts into this table, it automatically rounds the numbers. I would like to see the decimal values; e.g., 15.56 automatically becomes 16 when inserted into this table.
I have tried recreating the tables with the INT datatype as well, but it does the same thing.
P.S. - The table is defined as ORC so that I can do updates and deletes too.
Any suggestions on another datatype I could use?
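To illustrate, this is roughly what I am doing (the table and column names below are simplified examples, not my real ones):

-- number column defined with DECIMAL(38,0)
create table example_amounts (amount DECIMAL(38,0))
clustered by (amount) into 4 buckets
stored as orc
TBLPROPERTIES ('transactional' = 'true');

insert into table example_amounts values (15.56);
select * from example_amounts;  -- comes back as 16, not 15.56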
Created 07-01-2016 10:20 AM
Decimal declaration format is decimal(precision,scale). Precision is the total number of digits, including the digits after the decimal point. So, if you want a number like 1001.23, the declaration should be decimal(6,2), or decimal(4,2) for your given example of 15.56.
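For example (the table name here is just for illustration), a value like 1001.23 fits in a column declared with precision 6 and scale 2:

create table price_example (price DECIMAL(6,2));  -- 6 digits in total, 2 of them after the decimal point
insert into table price_example values (1001.23);
select * from price_example;  -- returns 1001.23 unchanged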
Created 07-01-2016 11:32 AM
Hi @apathak,
Thank you for your quick response.
I created two tables with the same structure and data types, one an ORC table for ACID transactions and the other a normal table, and this time I used DECIMAL(38,2).
When I do inserts into both of them, the normal table retains the decimals but the ORC table always rounds the numbers.
Is there some issue with ORC tables, or am I missing some settings?
ORC TABLE :-
create table unica.FACT_PUNTER_ALL_MARKETS (
  punter_key DECIMAL(38,2),
  user_no DECIMAL(38,2),
  user_name STRING,
  provider STRING,
  product_name STRING,
  sub_product_name STRING,
  channel STRING,
  indicator_name STRING,
  indicator_value DECIMAL(38,2),
  brand STRING,
  country STRING,
  time_zone STRING,
  time_key DECIMAL(38,2),
  calendar_date STRING,
  jurisdiction STRING,
  insert_time STRING,
  source_table STRING
)
clustered by (TIME_KEY) into 50 buckets
stored as orc
TBLPROPERTIES ('transactional' = 'true');
NORMAL TABLE :-
create table unica.FACT_PUNTER_ALL_MARKETS_INCR (
  punter_key DECIMAL(38,2),
  user_no DECIMAL(38,2),
  user_name STRING,
  provider STRING,
  product_name STRING,
  sub_product_name STRING,
  channel STRING,
  indicator_name STRING,
  indicator_value DECIMAL(38,2),
  brand STRING,
  country STRING,
  time_zone STRING,
  time_key DECIMAL(38,2),
  calendar_date STRING,
  jurisdiction STRING,
  insert_time STRING,
  source_table STRING
);
INSERT STATEMENTS USED
-- normal table
insert into table unica.FACT_PUNTER_ALL_MARKETS_INCR
select * from unica.FACT_PUNTER_ALL_MARKETS_TEMP;

-- orc table
insert into table unica.FACT_PUNTER_ALL_MARKETS
select * from unica.FACT_PUNTER_ALL_MARKETS_TEMP;
Created 07-01-2016 01:06 PM
38 digits is the max for Decimal. You cannot do Decimal(38,2); you could do Decimal(36,2) or Decimal(10,10) or whatever.
Created 07-01-2016 04:00 PM
I tried with Decimal(36,2), but I'm still seeing the numbers getting rounded.
Created 07-02-2016 04:35 PM
I tried with Hive ver. 1.2.1 and I couldn't reproduce your issue, as shown below.
What version of Hive are you using?
hive> set hive.support.concurrency=true;
hive> set hive.enforce.bucketing=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;
hive> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> set hive.compactor.initiator.on=true;
hive> create table test_decimal(i int, time_key DECIMAL(38,2)) clustered by (TIME_KEY) into 16 buckets stored as orc TBLPROPERTIES ('transactional' = 'true');
OK
Time taken: 0.557 seconds
hive> insert into table test_decimal values(1, 15.56);
Query ID = hrt_qa_20160702163059_7e2c279f-158e-4477-9123-357db69075b9
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1467208466253_0268)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED     16         16        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 44.74 s
--------------------------------------------------------------------------------
Loading data to table default.test_decimal
Table default.test_decimal stats: [numFiles=1, numRows=1, totalSize=655, rawDataSize=0]
OK
Time taken: 49.152 seconds
hive> select * from test_decimal;
OK
1       15.56
Time taken: 0.282 seconds, Fetched: 1 row(s)
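If you are not sure which version you have, one way to check it is from the command line, e.g.:

$ hive --version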
Created 07-04-2016 10:33 AM
Thank you guys, @Takahiko Saito, @Benjamin Leonhardi, @apathak! I tried changing the number fields to Decimal(36,2) again, and this time it works without any problem.
Thanks once again for your quick response.