# Support Questions

Announcements
Celebrating as our community reaches 100,000 members! Thank you!

## Hive: cast String to Float alters decimal part

New Contributor

Hi,

I have a column where float values were stored as STRING type, e.g. 9989.64

When I cast that column to obtain a FLOAT like so "SELECT a, cast(a AS FLOAT) ... " I obtain the following:

 a cast(a AS FLOAT) 5724.95 5724.9501953125

Why does Hive alter the decimal part? How can I get the exact number?

BTW, casting the STRING to DOUBLE does render the expected behaviour

BR

1 ACCEPTED SOLUTION
Expert Contributor

The behavior you're observing is related to the precision differences between STRING and FLOAT data types. When you cast a STRING to a FLOAT, Hive attempts to interpret and represent the value as accurately as possible within the constraints of a FLOAT data type.

FLOATs are limited in precision, and the fractional part might not be represented exactly. In your example, "5724.95" in FLOAT was stored as "5724.9501953125." This discrepancy is due to the way binary floating-point numbers work and how they might not be able to precisely represent certain decimal values.

If you need exact decimal representation, you should consider using a DECIMAL data type instead of FLOAT. DECIMAL provides higher precision and is better suited for scenarios where you need to maintain the exact decimal value without potential loss of precision.

Here's how you can cast your STRING column to DECIMAL to preserve the exact decimal value:

``SELECT a, CAST(a AS DECIMAL(20, 10)) AS exact_value FROM your_table;``

In this example, DECIMAL(20, 10) indicates a decimal type with a total width of 20 digits and 10 decimal places. This will preserve the exact decimal representation you need.

Keep in mind that DECIMAL has higher storage requirements compared to FLOAT because it maintains precision, so choose the appropriate data type based on your requirements.

Example :

``````0: jdbc:hive2://nightly-71x-zg-2.nightly-71x-> SELECT CAST('5724.9501953125' AS DECIMAL(20, 10)) AS decimal_value ;
INFO  : Compiling command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf): SELECT CAST('5724.9501953125' AS DECIMAL(20, 10)) AS decimal_value
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:decimal_value, type:decimal(20,10), comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf); Time taken: 0.062 seconds
INFO  : Executing command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf): SELECT CAST('5724.9501953125' AS DECIMAL(20, 10)) AS decimal_value
INFO  : Completed executing command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf); Time taken: 0.006 seconds
INFO  : OK
+------------------+
|  decimal_value   |
+------------------+
| 5724.9501953125  |
+------------------+``````

3 REPLIES 3
Community Manager

Regards,

Vidya Sargur,
Community Manager

If you find a reply useful, say thanks by clicking on the thumbs up button.
Expert Contributor

The behavior you're observing is related to the precision differences between STRING and FLOAT data types. When you cast a STRING to a FLOAT, Hive attempts to interpret and represent the value as accurately as possible within the constraints of a FLOAT data type.

FLOATs are limited in precision, and the fractional part might not be represented exactly. In your example, "5724.95" in FLOAT was stored as "5724.9501953125." This discrepancy is due to the way binary floating-point numbers work and how they might not be able to precisely represent certain decimal values.

If you need exact decimal representation, you should consider using a DECIMAL data type instead of FLOAT. DECIMAL provides higher precision and is better suited for scenarios where you need to maintain the exact decimal value without potential loss of precision.

Here's how you can cast your STRING column to DECIMAL to preserve the exact decimal value:

``SELECT a, CAST(a AS DECIMAL(20, 10)) AS exact_value FROM your_table;``

In this example, DECIMAL(20, 10) indicates a decimal type with a total width of 20 digits and 10 decimal places. This will preserve the exact decimal representation you need.

Keep in mind that DECIMAL has higher storage requirements compared to FLOAT because it maintains precision, so choose the appropriate data type based on your requirements.

Example :

``````0: jdbc:hive2://nightly-71x-zg-2.nightly-71x-> SELECT CAST('5724.9501953125' AS DECIMAL(20, 10)) AS decimal_value ;
INFO  : Compiling command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf): SELECT CAST('5724.9501953125' AS DECIMAL(20, 10)) AS decimal_value
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:decimal_value, type:decimal(20,10), comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf); Time taken: 0.062 seconds
INFO  : Executing command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf): SELECT CAST('5724.9501953125' AS DECIMAL(20, 10)) AS decimal_value
INFO  : Completed executing command(queryId=hive_20231106110627_aaa98b66-4db6-4307-9be3-598018c13fbf); Time taken: 0.006 seconds
INFO  : OK
+------------------+
|  decimal_value   |
+------------------+
| 5724.9501953125  |
+------------------+``````

Guru

To add to the point of @ggangadharan, there are lots of good articles/posts why the float and even the double datatype has these problems. Note that this is not Hive / Hadoop or Java specific.

https://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency

Miklos

Announcements
Community Announcements
Community Announcements
Community Announcements
What's New @ Cloudera
Community Announcements