I have 3 edge nodes in my cluster and was wondering whether it is advisable to start HiveServer2 and the Metastore on all 3 of them. I will be running the Hive client on all 3 servers, but does it make sense to also start HiveServer2 and the Metastore on each one, or is it enough to start them on just 1 edge node? Please let me know.
@Leenurs Quadras You can install multiple HiveServer2 instances when you have multiple workloads or applications; each HiveServer2 instance can then have its own settings for Hive and Tez. You can refer to this document
P.S. If the answers help, please accept and upvote them.
In addition to what @Ishan mentioned, multiple HiveServer2 and Metastore instances are generally recommended for a High Availability setup. That way, when one instance goes down, another is still available for clients to connect to.
Typically edge nodes are client facing nodes, so the client configs and tools are installed on them rather than services.
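To illustrate the High Availability point, here is a minimal sketch of how a client on an edge node might address several HiveServer2 instances through one connection string. It assumes the instances are registered in ZooKeeper for dynamic service discovery, which the thread does not confirm for this cluster. The helper name `hs2_ha_jdbc_url`, the `example.com` hostnames, and the `hiveserver2` namespace are placeholders for illustration.

```python
def hs2_ha_jdbc_url(zk_hosts, zk_port=2181, namespace="hiveserver2", database=""):
    """Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.

    Assumption: HiveServer2 instances register themselves under `namespace`
    in ZooKeeper, so the client picks a live instance at connect time.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    # Comma-separated ZooKeeper quorum, e.g. "zk1:2181,zk2:2181"
    quorum = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return (
        f"jdbc:hive2://{quorum}/{database}"
        f";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    )


# Hypothetical ZooKeeper hosts; replace with the cluster's own quorum.
url = hs2_ha_jdbc_url(["zk1.example.com", "zk2.example.com", "zk3.example.com"])
print(url)
```

The resulting string could be passed to Beeline with `-u` or used by any JDBC client. Because the URL names the ZooKeeper quorum rather than a single HiveServer2 host, clients do not need to be reconfigured when one instance goes down.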
Hi Leenurs Quadras,
An edge node is mainly used for users to log in and use the clients installed on it to connect to the different Hadoop components, for example Beeline or the Hive CLI.
HiveServer2 and the Metastore are part of the Hive service, which serves queries submitted from Hive clients, whether via the Hive CLI, Beeline, or JDBC/ODBC clients.
You can enable high availability for the Hive service by creating one more instance each of HiveServer2 and the Metastore.
It is better to install these components on separate servers rather than on an edge node. But if you are building a learning cluster of only 3 servers, install the Hive service on one of the edge nodes.
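For the Metastore side of such a setup, clients are usually pointed at every Metastore instance through the `hive.metastore.uris` property, which accepts a comma-separated list of Thrift URIs. The sketch below only builds that value. The helper name `metastore_uris` and the hostnames are hypothetical, and port 9083 is assumed as the usual Metastore default.

```python
def metastore_uris(hosts, port=9083):
    """Return a comma-separated value for the hive.metastore.uris property.

    Assumption: every Metastore instance listens on the same Thrift port.
    """
    if not hosts:
        raise ValueError("at least one Metastore host is required")
    return ",".join(f"thrift://{host}:{port}" for host in hosts)


# Hypothetical hosts running the two Metastore instances.
value = metastore_uris(["master1.example.com", "master2.example.com"])
print(f"hive.metastore.uris={value}")
```

With more than one URI listed, a client can fall back to another Metastore if the one it tried is unreachable.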