- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Should HDP data nodes have Spark client libraries installed?
Created on
‎11-01-2019
11:49 AM
- last edited on
‎11-01-2019
07:48 PM
by
ask_bill_brooks
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In working with a particular HDP 3.1 cluster, with Spark 2.3 installed, I am finding that the Spark client libraries (ex: spark-cli command, as well as libraries under jars) are not available on every node. They are only installed on the nodes the customer refers to as "client nodes" (I believe this is analogous to "edge nodes"). They also have data nodes in the cluster, which are able to run Spark executors (and, in fact, YARN does distribute tasks to executors on them), but those nodes do not have Spark client libraries installed.
Is this a normal setup? Can I not assume that the Spark client is installed on every node, even if it is generally available on the cluster? Thanks for any insight.
Created ‎11-01-2019 12:19 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@JeffEvans I think below thread answers the same question about spark client libs on worker nodes.
We dont need spark clients installed on all the worker nodes, should be installed only on edge nodes.
Created ‎11-01-2019 12:19 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@JeffEvans I think below thread answers the same question about spark client libs on worker nodes.
We dont need spark clients installed on all the worker nodes, should be installed only on edge nodes.
