Here's a weird scenario I'm trying to understand:
We have an on-premises cluster running HDP 2.6.X.
We haven't changed any of the settings or hardware in a long time.
Suddenly, when we try to open the "hive" CLI from a data node or an edge node, it frequently fails.
No error. It just hangs.
This happens when there is NOTHING else going on on the cluster.
No queries. No full queues. No running applications.
HDFS is only about 70% full.
When I look at the application manager, it creates an application that is "accepted".
If I drill into it, I see this:
"Application is Activated, waiting for resources to be assigned for AM. Last Node which was processed for the application : server.name:45454 ( Partition : , Total resource : <memory:193024, vCores:16>, Available resource : <memory:193024, vCores:16> ). Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:2702336, vCores:224> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 4.187192 % ; Queue's Absolute max capacity = 100.0 % ; "
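For anyone reading along, here is a minimal sketch of how the numbers in that diagnostics message can be pulled apart and sanity-checked. It only uses the text quoted above; the function name `parse_diagnostics` and the dictionary keys are made up for illustration, and the node-count estimate assumes all nodes are the same size as the one named in the message.

```python
import re

# Diagnostics text copied from the message quoted above
# (trimmed to the fields this sketch actually reads).
DIAGNOSTICS = (
    "Total resource : <memory:193024, vCores:16>, "
    "Available resource : <memory:193024, vCores:16> ). "
    "Details : AM Partition = <DEFAULT_PARTITION> ; "
    "Partition Resource = <memory:2702336, vCores:224> ; "
    "Queue's Absolute capacity = 100.0 % ; "
    "Queue's Absolute used capacity = 4.187192 % ; "
    "Queue's Absolute max capacity = 100.0 % ; "
)

RESOURCE = r"<memory:(\d+), vCores:(\d+)>"


def parse_diagnostics(text):
    """Extract node, partition and queue figures from the diagnostics string.

    Hypothetical helper for illustration; memory values are in MB, as YARN reports them.
    """
    def resource(label):
        match = re.search(re.escape(label) + r"\s*[:=]\s*" + RESOURCE, text)
        if match is None:
            raise ValueError(f"field not found: {label}")
        return {"memory_mb": int(match.group(1)), "vcores": int(match.group(2))}

    used = re.search(r"Absolute used capacity = ([\d.]+) %", text)
    if used is None:
        raise ValueError("queue used capacity not found")

    node = resource("Total resource")
    available = resource("Available resource")
    partition = resource("Partition Resource")
    used_pct = float(used.group(1))
    return {
        "node": node,
        "node_available": available,
        "partition": partition,
        "queue_used_pct": used_pct,
        # Used capacity expressed in MB of the whole partition.
        "queue_used_mb": partition["memory_mb"] * used_pct / 100.0,
        # Assumption: homogeneous nodes, so partition / node approximates the node count.
        "estimated_nodes": partition["memory_mb"] // node["memory_mb"],
    }


if __name__ == "__main__":
    for key, value in parse_diagnostics(DIAGNOSTICS).items():
        print(f"{key}: {value}")
```

Read this way, the last node processed reports all 193024 MB and 16 vCores as available, and the queue is only about 4% used (roughly 113152 MB of the 2702336 MB partition), so the message itself does not show the queue or that node as full.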
Suddenly this week, it just started being really inconsistent.
We think there must be some kind of weird networking issue going on behind the scenes - we're at the mercy of IT to know what might have changed there.
But I would really appreciate some help troubleshooting.
The above was originally posted in the Community Help track. On Fri May 10 20:03 UTC 2019, the HCC moderation staff moved it to the Cloud & Operations track because, although the question concerns an "on premises" cluster, the problem is an operational issue with the OP's Hive server. The Community Help track is appropriate for questions about using the HCC Community site itself.
Have you installed the correct Hive client versions on the edge node and data node? If so, check whether your edge node and data node have static or dynamic IP addresses.
Try comparing the /etc/hosts entries and, if this is not a production environment, shut down the cluster and reboot.
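If it helps, the /etc/hosts comparison suggested above could be done with something like the sketch below. It assumes you have copied the contents of each node's /etc/hosts somewhere you can read them; the host names and addresses in the sample are invented placeholders, and `parse_hosts` and `diff_hosts` are hypothetical helper names, not part of any Hadoop tooling.

```python
def parse_hosts(text):
    """Map each host name or alias (lowercased) to the set of IPs it is listed under."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), set()).add(ip)
    return mapping


def diff_hosts(a, b):
    """Return {name: (ips_in_a, ips_in_b)} for every name whose addresses differ."""
    diffs = {}
    for name in sorted(set(a) | set(b)):
        ips_a, ips_b = a.get(name, set()), b.get(name, set())
        if ips_a != ips_b:
            diffs[name] = (sorted(ips_a), sorted(ips_b))
    return diffs


# Placeholder contents standing in for two nodes' /etc/hosts files.
EDGE_HOSTS = """\
127.0.0.1   localhost
10.0.0.5    edge01.example.com edge01
10.0.0.11   dn01.example.com dn01   # data node
"""
DATA_HOSTS = """\
127.0.0.1   localhost
10.0.0.5    edge01.example.com edge01
10.0.0.21   dn01.example.com dn01
"""

if __name__ == "__main__":
    differences = diff_hosts(parse_hosts(EDGE_HOSTS), parse_hosts(DATA_HOSTS))
    for name, (edge_ips, data_ips) in differences.items():
        print(f"{name}: edge node has {edge_ips}, data node has {data_ips}")
```

Any name that shows up in the output resolves differently on the two nodes, which would be worth raising with IT before going as far as a restart.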