Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How to compare float datatype columns with string value

avatar
New Contributor

I have a table `exchangedata` in hive with the following columns

day STRING
euro float
yen float
dollar float

and the tasks to perform on this table are

1) Load data from '/user/practiceuser/exchangedata.csv' in HDFS into `exchangedata` table

2) Remove All the rows where value of euro, yen and dollar is equal to string '#N/A'

3) Load the result into `exchangedata` table

My Solution -------

As the `exchangedata` table is not in ORC format and is also not clustered(For DELETE query), I created a temporary table `exchange_temp` with the same columns and then used below query to insert data into `exchangedata`

INSERT INTO TABLE exchangedata SELECT day,euro,yen,dollar FROM exchange_temp WHERE euro != '#N/A' AND yen != '#N/A' AND dollar != '#N/A';

But after executing this query I am not getting any row in output. My question is how can I compare float datatype column with String '#N/A'?

The data in the CSV file looks like this

FRIDAY,1.234,2.456,1.059
1 ACCEPTED SOLUTION

avatar
Rising Star

That is always a night mare in JAVA based tools. Hive relies on JAVA (plus SQL) so it respects the IEEE standard for number semantics. That means especially NaN (not a number) values in float columns are a tricky thing.

First of all: Have you tested what is returned for the '#N/A' columns when you do a select? I guess it is rather 'NaN' than '#N/A'.

So after testing the return value, I would suggest to test two approaches. Either try to use cast():

	cast(dollar as String) <>'NaN' 

(because all possible NaN values are displayed as "NaN" even if they are not strictly "equal" in the arithmetical sense)

or do the old trick and test the value of the column to fit a mathematical operation like e.g.

dollar +1.0 > dollar

View solution in original post

3 REPLIES 3

avatar
Rising Star

That is always a night mare in JAVA based tools. Hive relies on JAVA (plus SQL) so it respects the IEEE standard for number semantics. That means especially NaN (not a number) values in float columns are a tricky thing.

First of all: Have you tested what is returned for the '#N/A' columns when you do a select? I guess it is rather 'NaN' than '#N/A'.

So after testing the return value, I would suggest to test two approaches. Either try to use cast():

	cast(dollar as String) <>'NaN' 

(because all possible NaN values are displayed as "NaN" even if they are not strictly "equal" in the arithmetical sense)

or do the old trick and test the value of the column to fit a mathematical operation like e.g.

dollar +1.0 > dollar

avatar
New Contributor

Thanks for your replay Peter. Actually, this was EXACTLY the question asked during my certification exam and the column was a string '#N/A'. I didn't know about the cast() function, so I was directly comparing float datatype to a string value using '!='. Also the SELECT wasn't returning any rows.

avatar
New Contributor

Anyways, your answer has given an approach to solve this. Shall accept your answer.