Simple hive query taking forever

Contributor

I've made a fresh install of HDP 2.6.1.0, and I'm trying to run the same query that I ran on 2.5:

ADD JAR hdfs://192.168.1.11:8020/user/admin/oozie-workflows/lib/json-serde-1.3.8-jar-with-dependencies.jar;

SELECT
  t.retweeted_screen_name,
  sum(retweets) AS total_retweets,
  count(*) AS tweet_count
FROM (SELECT
        retweeted_status.user.screen_name as retweeted_screen_name,
        retweeted_status.text,
         max(retweeted_status.retweet_count) as retweets
      FROM tweets
      GROUP BY retweeted_status.user.screen_name,
               retweeted_status.text) t
GROUP BY t.retweeted_screen_name
ORDER BY total_retweets DESC
LIMIT 10;

The tweets table is an external table with only a few hundred rows.
The problem is that the query seems to run forever and never finishes.

Can you please help?
Many thanks in advance

2 REPLIES

Re: Simple hive query taking forever

Master Collaborator

Can you check the state of the YARN application corresponding to the query in the YARN ResourceManager UI? Perhaps YARN doesn't have the capacity to run the application.
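If the RM UI is not handy, the same check can be sketched against the ResourceManager REST API (`/ws/v1/cluster/apps`). This is only a rough illustration: the helper names `fetch_apps` and `pending_apps` are made up for this example, and the RM address is an assumption (8088 is the usual default web port, your cluster may differ). An application that stays in the ACCEPTED state has been admitted by the scheduler but has not yet been given a container, which is typically what a capacity shortage looks like.

```python
import json
from urllib.request import urlopen


def fetch_apps(rm_url, states=("ACCEPTED", "RUNNING")):
    """Fetch applications in the given states from the RM REST API.

    rm_url is an assumption, e.g. "http://<rm-host>:8088".
    """
    url = f"{rm_url}/ws/v1/cluster/apps?states={','.join(states)}"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def pending_apps(payload):
    """Return (id, name, queue) for every app still waiting in ACCEPTED."""
    # The RM returns {"apps": null} when nothing matches the filter.
    apps = (payload.get("apps") or {}).get("app", [])
    return [
        (app["id"], app["name"], app["queue"])
        for app in apps
        if app["state"] == "ACCEPTED"
    ]
```

Calling `pending_apps(fetch_apps("http://<rm-host>:8088"))` while the query hangs would list any application that is queued but not yet running. An empty list suggests the cause is something other than YARN capacity.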

Re: Simple hive query taking forever

Contributor

The problem turned out to be a lack of permissions. I deactivated permissions on Hive, and the problem is solved.