LAG, LEAD, and Other Analytic Functions in HiveQL

I manually installed Cloudera's Hadoop and Hive RPMs on RHEL. I have Sqooped data into Hive and can run ordinary HiveQL queries on it without problems. However, I cannot run LAG or LEAD on it.

This site suggests that it is possible: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.0.2/ds_Hive/language_manual/ptf-window.html

But this site says that Cloudera doesn't provide the version of Hive that comes with LAG, LEAD, etc.: http://www.justinjworkman.com/big-data/hive-0-11-0-on-cloudera/#building

Is there any documentation on how to run these types of functions in HiveQL?

Thank you.

Accepted Solutions

Re: LAG, LEAD, and Other Analytic Functions in HiveQL

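These functions are available in the base version of Apache Hive shipped with CDH5 (currently in beta). CDH4 ships a stable Hive release that predates the addition of these features upstream (windowing and analytic functions arrived in Apache Hive 0.11), so LAG and LEAD are not available there.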

If you need them on CDH4, you could follow Justin's guide to get a custom build of a newer version of Hive running on your CDH4 cluster. This is possible to do.
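For reference, here is a rough sketch of what a LAG/LEAD query looks like once you are on a Hive build that supports windowing. The table `prices` and its columns (`ticker`, `trade_date`, `close_price`) are made up for illustration and are not from the original question. To keep the example runnable without a cluster, it executes the query against an in-memory SQLite database through Python's `sqlite3` module (this assumes the bundled SQLite is 3.25 or newer, which is when it gained window functions). The `OVER (PARTITION BY ... ORDER BY ...)` clause is written the same way in HiveQL, but treat this as an approximation of Hive's behaviour rather than a substitute for testing on your cluster.

```python
import sqlite3

# Hypothetical sample data: daily closing prices per ticker.
ROWS = [
    ("AAA", "2013-11-01", 10.0),
    ("AAA", "2013-11-02", 12.0),
    ("AAA", "2013-11-03", 11.0),
    ("BBB", "2013-11-01", 50.0),
    ("BBB", "2013-11-02", 55.0),
]

# LAG looks one row back and LEAD one row ahead within each ticker,
# with rows ordered by trade_date. Missing neighbours come back as NULL.
QUERY = """
SELECT ticker,
       trade_date,
       close_price,
       LAG(close_price, 1)  OVER (PARTITION BY ticker ORDER BY trade_date) AS prev_close,
       LEAD(close_price, 1) OVER (PARTITION BY ticker ORDER BY trade_date) AS next_close
FROM prices
ORDER BY ticker, trade_date
"""


def run_window_query():
    """Load the sample rows into an in-memory table and run the windowing query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE prices (ticker TEXT, trade_date TEXT, close_price REAL)"
        )
        conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_window_query():
        print(row)
```

On a Hive version without windowing support (such as the stock CDH4 build), the `OVER` clause is what the parser will reject, so a failure on that keyword is a quick sign that the installed Hive is too old.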