I've done a fresh install of HDP 18.104.22.168 and I'm trying to run the same query I ran on 2.5:
ADD JAR hdfs://192.168.1.11:8020/user/admin/oozie-workflows/lib/json-serde-1.3.8-jar-with-dependencies.jar;

SELECT t.retweeted_screen_name,
       sum(retweets) AS total_retweets,
       count(*) AS tweet_count
FROM (SELECT retweeted_status.user.screen_name AS retweeted_screen_name,
             retweeted_status.text,
             max(retweeted_status.retweet_count) AS retweets
      FROM tweets
      GROUP BY retweeted_status.user.screen_name, retweeted_status.text) t
GROUP BY t.retweeted_screen_name
ORDER BY total_retweets DESC
LIMIT 10;
The table tweets is an external table with a few hundred rows...
The problem is that the query seems to run forever...
Can you please help?
Many thanks in advance
Can you check the state of the YARN application corresponding to the query (in the YARN ResourceManager UI)? Perhaps YARN doesn't have the capacity to run the application.
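If the UI is awkward to reach, the same check can be scripted against the ResourceManager REST API. Below is a rough sketch, not a definitive tool: the RM address is an assumption (8088 is the default RM web port, and the RM may not be on the same host as your NameNode), and the helper names `fetch_json`, `waiting_apps` and `report` are made up for illustration. An application that stays in the ACCEPTED state while the cluster reports little or no available memory usually means YARN cannot allocate a container for it.

```python
import json
from urllib.request import urlopen

# Assumption: adjust to wherever your ResourceManager actually runs.
RM_URL = "http://192.168.1.11:8088"


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def waiting_apps(apps_payload):
    """Return (id, name, queue) for every app still in ACCEPTED state.

    The RM returns {"apps": null} when nothing matches, so guard for that.
    """
    apps = (apps_payload.get("apps") or {}).get("app") or []
    return [(a["id"], a["name"], a["queue"])
            for a in apps if a["state"] == "ACCEPTED"]


def report(rm_url=RM_URL):
    """Print apps waiting for resources plus overall cluster headroom."""
    apps = fetch_json(rm_url + "/ws/v1/cluster/apps?states=ACCEPTED,RUNNING")
    metrics = fetch_json(rm_url + "/ws/v1/cluster/metrics")["clusterMetrics"]
    for app_id, name, queue in waiting_apps(apps):
        print("waiting:", app_id, name, "queue=" + queue)
    print("available MB:", metrics["availableMB"],
          "pending apps:", metrics["appsPending"])
```

Calling `report()` from a machine that can reach the RM prints the waiting applications and the free memory. From a cluster node, `yarn application -list -appStates ACCEPTED` gives a similar list without any scripting.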