Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Hive query for functions

avatar
Explorer

I'm a newbie, I want to calculate month on month percentage change in price change in my table. Im using hive and using query:

 

select year(stock_date),month(stock_date),sum(stock_price) over (partition by year(stock_date),month(stock_date) ORDER BY month(stock_date)), PERCENT_RANK() OVER(ORDER BY month(stock_date) DESC) AS per from table4;

 

it's generating, It's showing July 2019 at the top and goes descending thru the numbers. how do I get the months in correct order and the percentage change with it?

 

mystefied__0-1648012945794.png

can anyone help me understand what am I doing wrong?

 

12 REPLIES 12

avatar
Super Guru

@mystefied_ ,

 

Would you be able to describe in words what is the desired results of the query?

The query above doesn't make much sense to me but if you explain what you want to achieve I should be able to help you get the right query.

 

Cheers,

André

 

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

avatar
Explorer
Sure, I want to compare pricing across months by viewing percentage change across months.

avatar
Super Guru

@mystefied_ ,

 

What's the granularity of the data in the table? Do you have one row per day, multiple rows per day?

What's the "month price" that you need to consider? Average of the intra-month prices, max, min?

 

ANdré

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

avatar
Explorer

There is one row per date. to compare the difference over months, I was to sum the amount of each month, then find percentage change of the total price each month. 

 

I'm trying the date_trunc function, but it isnt working. 

avatar
Super Guru

@mystefied_ ,

 

Please try this:

select
  month,
  month_total,
  month_total / lag(month_total, 1) over (order by month) as percentage_over_previous_month,
  sum(month_total) over (order by month) as running_sum,
  sum(month_total) over (order by month) / sum(month_total) over (partition by 1) as running_percentage
from (
  select
    trunc(stock_date, 'MONTH') as month,
    sum(stock_price) as month_total
  from table4
  group by trunc(stock_date, 'MONTH')
) x

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

avatar
Explorer

Im getting this error when executing the above query, I might be running an old version of hive

 

FAILED: SemanticException [Error 10011]: Line 4:9 Invalid function 'trunc'

avatar
Super Guru

@mystefied_ ,

 

Which version of Hive are you using? That query works well on my cluster.

Nevertheless, you should be able to run the below, which is pretty much the same:

select
  yr,
  mth,
  month_total,
  month_total / lag(month_total, 1) over (order by yr, mth) as percentage_over_previous_month,
  sum(month_total) over (order by yr, mth) as running_sum,
  sum(month_total) over (order by yr, mth) / sum(month_total) over (partition by 1) as running_percentage
from (
  select
    year(stock_date) as yr,
    month(stock_date) as mth,
    sum(stock_price) as month_total
  from table4
  group by year(stock_date), month(stock_date)
) x

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

avatar
Explorer

I am running version Hive 1.1.0-cdh5.4.3

 

When I ran the most recent query, It's giving me the following error:

 

mystefied__0-1648272163481.jpeg

 

 

avatar
Super Guru

Woah! CDH 5.4.3 is *really* old. Unfortunatelly I don't have a cluster running that version here to test.

Hive has come a long way since then. We're already using Hive 3 on CDP 7.x.

I'd recommend you upgrade your system, if possible.

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.