
Ambari view vs beeline

New Contributor

Problem:

We want to give multiple users (created through the Ambari console) access to the HiveServer2 instance installed in the cluster. To do that, we created the users and groups with the admin login and also created a home directory /user/<username> in HDFS for each user.

When a user logs in to the console, opens the Ambari Hive View 2.0 instance, and runs queries, the view stops responding after a few queries have been executed. Even trying to stop the execution has no effect.

The same users can log in through beeline and run the same queries, and there they always get a response.

Since we need to provide a GUI for users to run queries (unlike the Hive CLI or beeline), we planned to use Ambari views.

Is there any difference in how Ambari views and beeline access HiveServer2? Any help with tuning Ambari views is appreciated. Thanks in advance!

4 REPLIES

Re: Ambari view vs beeline

Super Mentor

@Saravanan Ramaraj

Hive View runs inside Ambari Server, so depending on the number of concurrent users you may also want to increase the Ambari client API thread pool size.

If Ambari Server hosts views (such as Hive View or File View) that are used by many concurrent users, or if many users access the Ambari UI or make Ambari REST API calls concurrently, we should also increase the "client.threadpool.size.max" property (default value: 25) in "/etc/ambari-server/conf/ambari.properties".

"client.threadpool.size.max": The size of the Jetty connection pool used for handling incoming REST API requests. This should be large enough to handle requests from both web browsers and embedded Views.
# grep 'client.threadpool.size.max' /etc/ambari-server/conf/ambari.properties
client.threadpool.size.max=100
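
A minimal sketch of making that change from the shell, assuming the property is stored as a plain `key=value` line. `set_ambari_property` is a hypothetical helper, not an Ambari command. It keeps a backup, rewrites the line if the key exists, and appends it otherwise. Ambari Server has to be restarted (`ambari-server restart`) before the new value is picked up.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ambari): set KEY=VALUE in a properties file.
# Usage: set_ambari_property FILE KEY VALUE
set_ambari_property() {
    file=$1 key=$2 value=$3
    cp "$file" "$file.bak" || return 1          # keep a backup of the original
    # Escape dots so the key is matched literally by grep and sed.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}[ ]*=" "$file"; then
        # Key already present: rewrite its value, reading from the backup.
        sed "s|^${pattern}[ ]*=.*|${key}=${value}|" "$file.bak" > "$file"
    else
        # Key absent: append it as a new line.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root on the Ambari Server host, then restart the server):
# set_ambari_property /etc/ambari-server/conf/ambari.properties client.threadpool.size.max 100
# ambari-server restart
```

The helper assumes simple values such as numbers; a value containing `|` or `&` would need extra escaping in the `sed` expression.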

Please check the following link for Ambari Server tuning to see if that helps: https://community.hortonworks.com/articles/131670/ambari-server-performance-tuning-troubleshooting-c...

It would still be good to collect a few thread dumps while a Hive View user's query is stuck. The thread dumps will help show where the queries are getting stuck. The following link explains how to collect thread dumps using jcmd along with CPU details: https://community.hortonworks.com/articles/72319/how-to-collect-threaddump-using-jcmd-and-analyse-i....
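
Taking a few dumps several seconds apart could look like the sketch below. `jcmd <pid> Thread.print` is the standard JDK command; the helper name `collect_thread_dumps`, the output directory, and the PID file path in the example are assumptions to adjust for your installation. Run it as the OS user that owns the Ambari Server process.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT thread dumps of process PID,
# INTERVAL seconds apart, into OUTDIR.
# Usage: collect_thread_dumps PID COUNT INTERVAL OUTDIR
collect_thread_dumps() {
    pid=$1 count=$2 interval=$3 outdir=$4
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # Thread.print dumps the stacks of all Java threads in the target JVM.
        jcmd "$pid" Thread.print > "$outdir/threaddump_$i.txt" || return 1
        # Sleep between dumps, but not after the last one.
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}

# Example (the PID file location is an assumption):
# collect_thread_dumps "$(cat /var/run/ambari-server/ambari-server.pid)" 5 10 /tmp/ambari_dumps
```

Threads that appear in the same stack across several consecutive dumps are the ones worth looking at first.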

If you encounter any error or exception while running queries from Hive View, the following logs will also be useful (sharing them here would help):

# ls -l /var/log/ambari-server/ambari-server.log
# ls -l /var/log/ambari-server/hive20-view/hive20-view.log     (for hive View 2.0)
# ls -l /var/log/ambari-server/hive-next-view/hive-view.log    (for hive View 1.5)

Re: Ambari view vs beeline

New Contributor

Thank you Jay, I will post an update on how it goes. Meanwhile, is it possible to audit the Hive queries that users run through the Ambari view, or do we need to install Ranger for audit logs?

Re: Ambari view vs beeline

New Contributor

When I run Hive queries with a couple of users on the AUTO_HIVE20_VIEW instance in Ambari, even a simple query takes a long time to respond. I have tried tuning the following parameters in the ambari.properties file.

client.threadpool.size.max = 100

views.ambari.request.read.timeout.millis=12000

views.request.read.timeout.millis=120000

views.ambari.hive.<HIVE_VIEW_INSTANCE_NAME>.result.fetch.timeout=120000

However, it does not help. Moreover, the memory utilization of the ambari-server instance is low. I observed the following lines in hive20-view.log:

07 Nov 2017 07:16:04,703 ERROR [HiveViewActorSystem-akka.actor.default-dispatcher-1829] [HIVE 2.0.0 AUTO_HIVE20_INSTANCE] OperationController:174 - Cannot update Dag Information for job. Job with id: 271 for instance: AUTO_HIVE20_INSTANCE has either not started or has expired.
07 Nov 2017 07:16:07,716 INFO [ambari-client-thread-16477] [HIVE 2.0.0 AUTO_HIVE20_INSTANCE] Aggregator:328 - Saving DAG information via actor system for job id: 271
07 Nov 2017 07:16:07,716 ERROR [HiveViewActorSystem-akka.actor.default-dispatcher-1829] [HIVE 2.0.0 AUTO_HIVE20_INSTANCE] OperationController:174 - Cannot update Dag Information for job. Job with id: 271 for instance: AUTO_HIVE20_INSTANCE has either not started or has expired.
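
To see how many jobs are affected, these messages can be counted per job id. This is only a sketch that relies on the message text shown above, with one log entry per line; `count_dag_errors` is a hypothetical helper name, and the path in the example is the Hive View 2.0 log location given earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count "Cannot update Dag Information" errors per job id.
# Extracts the numeric id from "Job with id: 271 for instance" on each matching line.
# Usage: count_dag_errors LOGFILE
count_dag_errors() {
    grep 'Cannot update Dag Information' "$1" |
        sed -n 's/.*Job with id: \([0-9][0-9]*\) for instance.*/\1/p' |
        sort -n | uniq -c
}

# Example:
# count_dag_errors /var/log/ambari-server/hive20-view/hive20-view.log
```

Each output line is a count followed by a job id, so a job that logs the error repeatedly stands out from one that logged it once.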

Kindly help me to resolve this issue. Thanks!!

Re: Ambari view vs beeline

New Contributor

@Geoffrey Shelton Okot Kindly help. I'm also facing the same Hive query performance issues with Ambari U

09 May 2018 09:45:45,952 ERROR [HiveViewActorSystem-akka.actor.default-dispatcher-97] [HIVE 2.0.0 AUTO_HIVE_INSTANCE] OperationController:174 - Cannot update Dag Information for job. Job with id: 2060 for instance: AUTO_HIVE_INSTANCE has either not started or has expired.