Support Questions

Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

Hive is rounding the number columns automatically.

Explorer

Hi,

I have a table where I have defined all the number fields as DECIMAL(38,0). Everytime I do inserts in this table it somehow automatically rounds the numbers. I would like to see the decimal values. eg 15.56 automatically becomes 16 when inserted into this table.

I have tried recreating the tables with INT datatype as well but it does the same.

P.S - The table is defined as an ORC, so that I can do updates and deletes too

Any suggestions if I can define another datatype?

1 ACCEPTED SOLUTION

Rising Star

Hi @Parinita Kothari,

I tried with hive ver. 1.2.1 and I couldn't reproduce your issue as shown in the below.

What version of hive are you using?

hive> set hive.support.concurrency=true;
hive> set hive.enforce.bucketing=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;
hive> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> set hive.compactor.initiator.on=true;
hive> create table test_decimal(i int, time_key         DECIMAL(38,2)) clustered by (TIME_KEY) into 16 buckets stored as orc TBLPROPERTIES ('transactional' = 'true');
OK
Time taken: 0.557 seconds
hive> insert into table test_decimal values(1, 15.56);
Query ID = hrt_qa_20160702163059_7e2c279f-158e-4477-9123-357db69075b9
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1467208466253_0268)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED     16         16        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 44.74 s
--------------------------------------------------------------------------------
Loading data to table default.test_decimal
Table default.test_decimal stats: [numFiles=1, numRows=1, totalSize=655, rawDataSize=0]
OK
Time taken: 49.152 seconds
hive> select * from test_decimal;
OK
115.56
Time taken: 0.282 seconds, Fetched: 1 row(s)

View solution in original post

6 REPLIES 6

Cloudera Employee

Decimal declaration format is decimal(precision,scale). Precision is the total number of digits including the digits after the decimal point. So , if you want a number for example 1001.23 the the decimal declaration should be decimal(6,2). Or decimal(4,2) for your given example of 15.56.

Explorer

Hi @apathak,

Thank you for your quick response.

I created 2 tables with same structure and data type, one of them was an ORC table for ACID transactions and the other one was a normal table and I used DECIMAL(38,2) this time

When I do inserts in both of them, the normal table retains the decimals but the ORC table always rounds the number.

Is there some issue with ORC tables or am I missing out on some settings.

ORC TABLE :-

create table unica.FACT_PUNTER_ALL_MARKETS
(
  punter_key       DECIMAL(38,2),
  user_no          DECIMAL(38,2),
  user_name        STRING,
  provider         STRING,
  product_name     STRING,
  sub_product_name STRING,
  channel          STRING,
  indicator_name   STRING,
  indicator_value  DECIMAL(38,2),
  brand            STRING,
  country          STRING,
  time_zone        STRING,
  time_key         DECIMAL(38,2),
  calendar_date    STRING,
  jurisdiction     STRING,
  insert_time      STRING,
  source_table     STRING
)
clustered by (TIME_KEY) into 50 buckets stored as orc TBLPROPERTIES ('transactional' = 'true');

NORMAL TABLE :-

create table unica.FACT_PUNTER_ALL_MARKETS_INCR
(
  punter_key       DECIMAL(38,2),
  user_no          DECIMAL(38,2),
  user_name        STRING,
  provider         STRING,
  product_name     STRING,
  sub_product_name STRING,
  channel          STRING,
  indicator_name   STRING,
  indicator_value  DECIMAL(38,2),
  brand            STRING,
  country          STRING,
  time_zone        STRING,
  time_key         DECIMAL(38,2),
  calendar_date    STRING,
  jurisdiction     STRING,
  insert_time      STRING,
  source_table     STRING
);

INSERT STATEMENTS USED

-- normal table
insert into table unica.FACT_PUNTER_ALL_MARKETS_INCR
select * from unica.FACT_PUNTER_ALL_MARKETS_TEMP

-- orc table
insert into table unica.FACT_PUNTER_ALL_MARKETS
select * from unica.FACT_PUNTER_ALL_MARKETS_TEMP

38 digits is the max for Decimal. You cannot do Decimal(38,2) You could do Decimal(36,2) or Decimal (10,10 ) or whatever.

Explorer

tried with Decimal(36,2), still seeing the numbers getting rounded.

Rising Star

Hi @Parinita Kothari,

I tried with hive ver. 1.2.1 and I couldn't reproduce your issue as shown in the below.

What version of hive are you using?

hive> set hive.support.concurrency=true;
hive> set hive.enforce.bucketing=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;
hive> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> set hive.compactor.initiator.on=true;
hive> create table test_decimal(i int, time_key         DECIMAL(38,2)) clustered by (TIME_KEY) into 16 buckets stored as orc TBLPROPERTIES ('transactional' = 'true');
OK
Time taken: 0.557 seconds
hive> insert into table test_decimal values(1, 15.56);
Query ID = hrt_qa_20160702163059_7e2c279f-158e-4477-9123-357db69075b9
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1467208466253_0268)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED     16         16        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 44.74 s
--------------------------------------------------------------------------------
Loading data to table default.test_decimal
Table default.test_decimal stats: [numFiles=1, numRows=1, totalSize=655, rawDataSize=0]
OK
Time taken: 49.152 seconds
hive> select * from test_decimal;
OK
115.56
Time taken: 0.282 seconds, Fetched: 1 row(s)

Explorer

Thank you guys! @Takahiko Saito @Benjamin Leonhardi @apathak I tried changing the number fields to Decimal(36,2) again and this time it works without any problem.

Thanks once again for your quick response.

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.