Support Questions

Find answers, ask questions, and share your expertise

Decimal Data Type returning slightly incorrect results when dividing decimals

avatar
Contributor

Hi all,

 

Has anybody ran into problems when dividing decimal types in Impala?

 

We run the following query:

 

select a.var1, a.var2, a.var3,

a.var1 * (1 + (a.var2- a.var3) /1200 ) as problem_statement,

150 * (1 + (5.56 - 0.36) /1200) as raw_values

from

(

select * from table_a

) a,

(select * from table_b

) b

 

Our results are:

 

a.var1         a.var2          a.var3       problem_statement         raw_values

150              5.56              0.36         150.6499999995             150.65

 

Even though the problem_statement and raw_value are the same select statement, they get back slightly different values. Because we round some of our variables and call them later in the query, this error is altering our entire query results. The correct value is being returned by the raw_results query.

 

I will note that the data types for problem_statement and raw_value are different. One is a decimal data type and the other is a double

 

lv_hmbs_participation_bal       decimal(12,2)

grosscoupon                            decimal(10,6)

lv_hmbs_servicing_fee            decimal(10,6)

problem_statement_no_round decimal(30,13)

raw_values                               double

 

Has anybody else ran into this issue?

                

1 ACCEPTED SOLUTION

avatar

I think there are a couple of things going on. You may be running into some of the rounding issues covered by https://issues.apache.org/jira/browse/IMPALA-4810. In Impala 2.9 there will be a DECIMAL_V2 query option that will switch to a decimal mode that addresses some of these problem.

 

The expression (5.56 - 0.36) / 1200 also is treated as DOUBLE rather than DECIMAL, which is confusing and we plan to fix https://issues.apache.org/jira/browse/IMPALA-3437. If you change 1200 to 1200.0 you'll actually get a DECIMAL result. You can use the typeof() function to inspect the result types of functions:

 

[localhost:21000] > select typeof((5.56 - 0.36)  / 1200), typeof((5.56 - 0.36) / 1200.0);
+------------------------------+--------------------------------+
| typeof((5.56 - 0.36) / 1200) | typeof((5.56 - 0.36) / 1200.0) |
+------------------------------+--------------------------------+
| DOUBLE                       | DECIMAL(11,8)                  |
+------------------------------+--------------------------------+
Fetched 1 row(s) in 0.01s

View solution in original post

4 REPLIES 4

avatar

I think there are a couple of things going on. You may be running into some of the rounding issues covered by https://issues.apache.org/jira/browse/IMPALA-4810. In Impala 2.9 there will be a DECIMAL_V2 query option that will switch to a decimal mode that addresses some of these problem.

 

The expression (5.56 - 0.36) / 1200 also is treated as DOUBLE rather than DECIMAL, which is confusing and we plan to fix https://issues.apache.org/jira/browse/IMPALA-3437. If you change 1200 to 1200.0 you'll actually get a DECIMAL result. You can use the typeof() function to inspect the result types of functions:

 

[localhost:21000] > select typeof((5.56 - 0.36)  / 1200), typeof((5.56 - 0.36) / 1200.0);
+------------------------------+--------------------------------+
| typeof((5.56 - 0.36) / 1200) | typeof((5.56 - 0.36) / 1200.0) |
+------------------------------+--------------------------------+
| DOUBLE                       | DECIMAL(11,8)                  |
+------------------------------+--------------------------------+
Fetched 1 row(s) in 0.01s

avatar

I should clarify that DECIMAL_V2 is currently just an experimental flag and the behaviour may change. Likely its behaviour will be in flux until all the subtasks of https://issues.apache.org/jira/browse/IMPALA-4924 are finished.

avatar
Contributor

Hi TIm, thank you for your response.

 

I am currently running CDH 5.8.3 and Impala 2.6. Will Impala 2.9 be included in CDH 5.11?

 

So it looks like my options are to either use the double type instead of the decimal type, or use 1200.0 

 

I want to also note I am getting this issue in Hive as well as Impala 

avatar
Contributor

Hi Tim,

 

I'd like to resond and say that we were running into the issues that you brought up, I will also note that changing our double values of 1200 to 1200.0 does seem to remedy that particular problem.

 

Thank you for your response.