Support Questions
Find answers, ask questions, and share your expertise

Hive query for functions

Explorer

I'm a newbie, I want to calculate month on month percentage change in price change in my table. Im using hive and using query:

 

select year(stock_date),month(stock_date),sum(stock_price) over (partition by year(stock_date),month(stock_date) ORDER BY month(stock_date)), PERCENT_RANK() OVER(ORDER BY month(stock_date) DESC) AS per from table4;

 

it's generating, It's showing July 2019 at the top and goes descending thru the numbers. how do I get the months in correct order and the percentage change with it?

 

mystefied__0-1648012945794.png

can anyone help me understand what am I doing wrong?

 

12 REPLIES 12

Master Collaborator

@mystefied_ ,

 

Would you be able to describe in words what is the desired results of the query?

The query above doesn't make much sense to me but if you explain what you want to achieve I should be able to help you get the right query.

 

Cheers,

André

 

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer
Sure, I want to compare pricing across months by viewing percentage change across months.

Master Collaborator

@mystefied_ ,

 

What's the granularity of the data in the table? Do you have one row per day, multiple rows per day?

What's the "month price" that you need to consider? Average of the intra-month prices, max, min?

 

ANdré

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

There is one row per date. to compare the difference over months, I was to sum the amount of each month, then find percentage change of the total price each month. 

 

I'm trying the date_trunc function, but it isnt working. 

Master Collaborator

@mystefied_ ,

 

Please try this:

select
  month,
  month_total,
  month_total / lag(month_total, 1) over (order by month) as percentage_over_previous_month,
  sum(month_total) over (order by month) as running_sum,
  sum(month_total) over (order by month) / sum(month_total) over (partition by 1) as running_percentage
from (
  select
    trunc(stock_date, 'MONTH') as month,
    sum(stock_price) as month_total
  from table4
  group by trunc(stock_date, 'MONTH')
) x

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

Im getting this error when executing the above query, I might be running an old version of hive

 

FAILED: SemanticException [Error 10011]: Line 4:9 Invalid function 'trunc'

Master Collaborator

@mystefied_ ,

 

Which version of Hive are you using? That query works well on my cluster.

Nevertheless, you should be able to run the below, which is pretty much the same:

select
  yr,
  mth,
  month_total,
  month_total / lag(month_total, 1) over (order by yr, mth) as percentage_over_previous_month,
  sum(month_total) over (order by yr, mth) as running_sum,
  sum(month_total) over (order by yr, mth) / sum(month_total) over (partition by 1) as running_percentage
from (
  select
    year(stock_date) as yr,
    month(stock_date) as mth,
    sum(stock_price) as month_total
  from table4
  group by year(stock_date), month(stock_date)
) x

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

I am running version Hive 1.1.0-cdh5.4.3

 

When I ran the most recent query, It's giving me the following error:

 

mystefied__0-1648272163481.jpeg

 

 

Master Collaborator

Woah! CDH 5.4.3 is *really* old. Unfortunatelly I don't have a cluster running that version here to test.

Hive has come a long way since then. We're already using Hive 3 on CDP 7.x.

I'd recommend you upgrade your system, if possible.

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

Im using a VM that is running this version. I'll try if I can update it. In case I can then I;ll try your queries again. 

 

Thanks a lot for trying to help. 

Explorer

Hello Andre,

 

Can you share the steps to upgrade the CDH / Hive to version 2 or 3 so at least the above commands can execute.


Also advise if I can download CDH 7 or 6 VM from Cloudera website, I do not have an active subscription but need trial for few days.

 

Thanks a lot for your help.

Master Collaborator

@mystefied_ ,

 

You can download the CDP Trial version from the Cloudera website below:

 

https://www.cloudera.com/downloads/cdp-private-cloud-trial/cdp-private-cloud-base-trial.html

 

Cheers,

André

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.