Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Impala round function does not return expected result

avatar
Explorer

Hi,

 

I'm using impala-2.7.0-cdh5.10.0, and I find execute round() in impala-shell may give unexpected result, while hive can give correct result.

 

e.g.

 

impala> select round(269586/334026 * 100, 2);
80.70999999999999

hive > select round(269586/334026 * 100, 2);
80.71

Is this a known issue?

1 ACCEPTED SOLUTION

avatar

Looks like a client display issue. Both Impala and Hive return a DOUBLE for round(). A DOUBLE cannot precisely represent 80.71 so I'm guessing that technically the results are the same, it's just that the Hive client "pretty prints" the results in a different way than Impala's.

 

If you require precise rouding results it's recommended you use decimal:

 

select round(cast(269586 / 334026 * 100 as decimal(17, 15)), 2);

80.71

View solution in original post

7 REPLIES 7

avatar

Looks like a client display issue. Both Impala and Hive return a DOUBLE for round(). A DOUBLE cannot precisely represent 80.71 so I'm guessing that technically the results are the same, it's just that the Hive client "pretty prints" the results in a different way than Impala's.

 

If you require precise rouding results it's recommended you use decimal:

 

select round(cast(269586 / 334026 * 100 as decimal(17, 15)), 2);

80.71

avatar
New Contributor

Hi Alex,

 

it works using the solution you provided, however if i use variable/column to indicate the number of digits after decimal point then it throws an error, which is :

 

"AnalysisException: round() must be called with a constant second argument."

 

e.g Below query works perfectly fine

 

select round(cast(test1 as decimal(17,15)),2) as test3
from (

select round(269586/334026 * 100,2) as test1,
1 as test2
)A

 

 

However this query does not work:

 

select round(cast(test1 as decimal(17,15)),test2) as test3
from (

select round(269586/334026 * 100,2) as test1,
1 as test2
)A

 

the problem lies with the decimal typecast, if i remove it then issue will disappaear however the original issue will appear.

 

Please help, i have been struggling for quite some time with this issue. for my requirement i need to use column name to indicate numbers of digits to round, because the same field(column) may have numbers rounded off to various decimal digits.

 

Thanks in advance.

 

 

Regards

Nitin

avatar
Hi @nitinagr. You pretty much have two options here right now: * Use a constant second argument * Cast the first input to DOUBLE (which means you may lose some precision if the values can't be exactly represented in a DOUBLE type). I think supporting this seems like a valid ask so I went ahead and filed a JIRA: https://issues.apache.org/jira/browse/IMPALA-7613 . I remember a while back debating with someone about whether users were likely to do data-driven rounding, so I think you just settled that argument!

avatar
New Contributor

Thank you Tim.

 

Double is not working in my case, currently i am using case statement to get the desired result (i need maximum 2 digits after decimal, so i have only 3 scenarios in case statement).

 

Looking forward to the fix.

 

 

Thanks,

Nitin

avatar
New Contributor

Hi Nitin

 

I was working on similar problem of dynamic rounding and found a workaround. In my case I do average price statistics and I want to round average prices to 3 significant digits, in order to group similar observations. Now every price has different number of digits, so e.g. I must round 1.495 to 1.50, but also 1099 should be rounded to 1100.

 

Following expression did the trick for me:

 

SELECT ROUND(`price` / POWER(10.0, (FLOOR(LOG10(`price`)) - 2)), 0) * POWER(10.0, (FLOOR(LOG10(`price`)) - 2)) AS `price_rounded`

The -2 constant is related to my specific problem, i.e. 3 significant digits, so that could be adjusted or omitted in other use case.

Conceptually what I do is that I reflect number of digits to round on by transforming the first argument, then I can always round to zero digits no matter what, and finally transform data back.

 

May work better than CASE WHEN in situation when you don't know dimensions of your data upfront.

 

BR,

Mirek

 

avatar
New Contributor
impala> select 269586/334026 * 100;
80.708088591906019

impala> select round(269586/334026 * 100, 2);
80.70999999999999

I'm curious as to why this isn't 80.71000000000000

avatar

Hi @scuffster

 

There are some interesting issues here with the different numeric data types here - INT, DOUBLE, DECIMAL, etc. The behaviour you're seeing is because the first input to round() is a DOUBLE expression, which cannot exactly represent all decimal values. Generally the output type of the round() function is the same as the input type.

 

Impala does support precise decimal arithmetic with the DECIMAL type. If you are operating on DECIMAL columns or you cast the input to a decimal type with the right precision and scale, you may get the behaviour you're hoping for. Here's a query showing the type of your expressions and an alternative version with a cast to DECIMAL:

 

> select typeof(269586/334026 * 100), typeof(round(269586/334026 * 100, 2)), round(269586/334026 * 100, 2), round(cast(269586/334026 * 100 as DECIMAL(20, 8)), 2);

+-------------------------------+-----------------------------------------+---------------------------------+--------------------------------------------------------+
| typeof(269586 / 334026 * 100) | typeof(round(269586 / 334026 * 100, 2)) | round(269586 / 334026 * 100, 2) | round(cast(269586 / 334026 * 100 as decimal(20,8)), 2) |
+-------------------------------+-----------------------------------------+---------------------------------+--------------------------------------------------------+
| DOUBLE                        | DOUBLE                                  | 80.70999999999999               | 80.71                                                  |
+-------------------------------+-----------------------------------------+---------------------------------+--------------------------------------------------------+