Member since 08-30-2018
2 Posts
0 Kudos Received
0 Solutions
01-11-2021 09:35 PM
Later versions of Hive have a "sys" DB that, under the hood, connects back to the Hive metastore database (e.g. Postgres or whatever), and you can query that. Impala doesn't seem to be able to see this sys DB, though. There is also an "information_schema" DB with a smaller and cleaner subset, but it points back to sys and is also not visible from Impala if you do a "show databases;". You can use "show" statements in impala-shell, but I'm not sure there is a DB to query through SQL via ODBC/JDBC. Still looking for a way to do this in Impala.
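As a stopgap, the "show" statements can be scripted. Below is a rough Python sketch that shells out to impala-shell and collects the output of SHOW TABLES. It assumes impala-shell is on the PATH, that `-B` output is tab-delimited (the default delimiter), and that the first column of each row is the table name. The host name and the helper names (`impala_shell_cmd`, `parse_rows`, `list_tables`) are made up for illustration.

```python
import subprocess


def impala_shell_cmd(statement, host="impalad.example.com"):
    """Build an impala-shell command line for a single statement.

    -i: impalad host to connect to (placeholder host, an assumption)
    -B: plain delimited output instead of the pretty-printed table
    -q: run one statement and exit
    """
    return ["impala-shell", "-i", host, "-B", "-q", statement]


def parse_rows(stdout):
    """Split tab-delimited output into one list of fields per row."""
    return [line.split("\t") for line in stdout.splitlines() if line.strip()]


def list_tables(database, host="impalad.example.com"):
    """Run SHOW TABLES IN <database> and return the table names."""
    result = subprocess.run(
        impala_shell_cmd(f"SHOW TABLES IN {database}", host),
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumes the table name is the first field of every row.
    return [row[0] for row in parse_rows(result.stdout)]
```

Something like `list_tables("default")` would then return the table names as a Python list, which you could load into whatever catalog you need. It is a workaround, not a real metadata DB you can query through ODBC/JDBC.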
08-30-2018 03:07 PM
This has me a little confused as well. There are three counters for this:
- (1) The Ambari Dashboard has "CPU Usage", which always seems to look low (makes sense for our env).
- (2) YARN has "CPU Utilization" for "% of total cores assigned to containers", which never goes under 40% (makes sense for our env).
- (3) YARN has "Cluster CPU" for "% of CPU utilization across node manager hosts", which is always between 50% and 95% (this makes no sense in our env).

So, when our jobs run, 1 doesn't climb much, 2 doesn't move much, and 3 quickly hits 95%. We want to run our jobs faster, but between 1 and 3 I can't tell whether our CPU utilization is high or not. The host page CPU seems to trend with 1 (quiet). But am I hitting a ceiling? 3 can't go any higher.
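To keep the counters apart, it may help to separate "cores assigned" from "cores actually busy". Here is a small Python sketch of that distinction. It assumes counter (2) corresponds to allocated vcores over total vcores, using the `allocatedVirtualCores` and `totalVirtualCores` fields that the YARN ResourceManager REST API returns under `clusterMetrics` at `/ws/v1/cluster/metrics`. The per-host busy percentages are made-up sample numbers, and I don't know how Ambari actually computes (1) or (3), so this is only an illustration.

```python
def allocation_pct(cluster_metrics):
    """Percent of vcores currently assigned to containers (counter 2, by assumption)."""
    allocated = cluster_metrics["allocatedVirtualCores"]
    total = cluster_metrics["totalVirtualCores"]
    return 100.0 * allocated / total if total else 0.0


def mean_busy_pct(host_busy_pcts):
    """Average of per-host CPU busy percentages, e.g. as read from each host page."""
    return sum(host_busy_pcts) / len(host_busy_pcts) if host_busy_pcts else 0.0


# Sample values only, not taken from a real cluster.
sample_metrics = {"allocatedVirtualCores": 48, "totalVirtualCores": 120}
sample_host_busy = [10, 20, 30]

print(f"vcores assigned to containers: {allocation_pct(sample_metrics):.1f}%")
print(f"average host CPU busy:         {mean_busy_pct(sample_host_busy):.1f}%")
```

A container can hold a vcore without burning CPU, so a high assigned percentage alongside a quiet host CPU is not contradictory by itself. That still does not explain why (3) sits so high.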