Created on 11-01-2019 11:49 AM - last edited on 11-01-2019 07:48 PM by ask_bill_brooks
In working with a particular HDP 3.1 cluster with Spark 2.3 installed, I am finding that the Spark client libraries (e.g., the spark-cli command, as well as the libraries under jars) are not available on every node. They are only installed on the nodes the customer refers to as "client nodes" (I believe this is analogous to "edge nodes"). The cluster also has data nodes, which are able to run Spark executors (and, in fact, YARN does distribute tasks to executors on them), but those nodes do not have the Spark client libraries installed.
Is this a normal setup? Can I not assume that the Spark client is installed on every node, even if it is generally available on the cluster? Thanks for any insight.
Created 11-01-2019 12:19 PM
@JeffEvans I think the thread below answers the same question about Spark client libs on worker nodes.
We don't need Spark clients installed on all the worker nodes; they should be installed only on edge nodes.
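Since the client cannot be assumed to exist on every node, a script that might run anywhere in the cluster can check for it before trying to submit a job. Below is a minimal sketch of such a check, not an official HDP or Spark utility. The function name `spark_client_available` and its `env` parameter are hypothetical, and it assumes a client install can be recognized either by `SPARK_HOME` containing `bin/spark-submit` or by `spark-submit` being on `PATH`.

```python
import os
import shutil


def spark_client_available(env=None):
    """Best-effort check for a Spark client install on the local node.

    `env` is a hypothetical injection point for testing; it defaults to
    the real process environment.
    """
    env = os.environ if env is None else env

    # Check 1: SPARK_HOME points at a directory containing bin/spark-submit.
    spark_home = env.get("SPARK_HOME")
    if spark_home:
        candidate = os.path.join(spark_home, "bin", "spark-submit")
        if os.path.isfile(candidate):
            return True

    # Check 2: spark-submit is an executable found on PATH.
    return shutil.which("spark-submit", path=env.get("PATH", "")) is not None


if __name__ == "__main__":
    if spark_client_available():
        print("Spark client found on this node")
    else:
        print("No Spark client here; submit from an edge node instead")
```

On a data node set up as described in the question, this would be expected to report that no client is present, even though YARN can still run executors there.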