Support Questions

Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

Hive in built function greatest is not working

I have a requirement to calculate greatest value in multiple columns.

Input:

namesub1sub2sub3
st1101215
st2503035
st3151813

expected output:

namemax_sub
st115
st250
st318

When I try to use greatest function as below, I'm getting

Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating greatest(sub1,sub2,sub3)

select name,greatest(sub1,sub2,sub3) from students;

any other functions or simple udf available?

Note: I cant' use CASE statement as I have to consider around 15 columns.

1 ACCEPTED SOLUTION

Expert Contributor
@Hanu V

To work around your issue, you can explicitly cast to string or int (or whatever your data type should be).

I could reproduce your case:

select greatest(month, billed) from sales limit 5;
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating greatest(month,billed) (state=,code=0)

with workaround:

select greatest(int(month), int(billed)) from sales limit 5;
+------+--+
| _c0  |
+------+--+
| 8    |
| 8    |
| 8    |
| 8    |
| 2    |
+------+--+

or

select greatest(string(month), string(billed)) from sales limit 5;
+------+--+
| _c0  |
+------+--+
| 8    |
| 8    |
| 8    |
| 8    |
| 2    |
+------+--+

View solution in original post

6 REPLIES 6

Rising Star

@Hanu V

Can you attach a desc extended for your table. Greatest should work provided all the columns have the same datatype and are only primitive types.

@rtrivedi

Attached the screenshot FYR.

greatest.jpg

Expert Contributor
@Hanu V

To work around your issue, you can explicitly cast to string or int (or whatever your data type should be).

I could reproduce your case:

select greatest(month, billed) from sales limit 5;
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating greatest(month,billed) (state=,code=0)

with workaround:

select greatest(int(month), int(billed)) from sales limit 5;
+------+--+
| _c0  |
+------+--+
| 8    |
| 8    |
| 8    |
| 8    |
| 2    |
+------+--+

or

select greatest(string(month), string(billed)) from sales limit 5;
+------+--+
| _c0  |
+------+--+
| 8    |
| 8    |
| 8    |
| 8    |
| 2    |
+------+--+

Expert Contributor
@Hanu V

This is related to HIVE-12082. This issue exist in Hive1

1) You can either use Hive2 (Interactive HS2)

2) Workaround using explicitly cast to int.

select name, greatest(int(sub1),int(sub2),int(sub3)) from students;

3) Use orc format

Thanks @Naresh P R. Can you please let me know where can I track all the issues like this?

Thanks in advance!

Expert Contributor

You can check for the relevant issue in HIVE Jira

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.