Dedicated edge nodes

Contributor

Hi

We are considering a dedicated edge node per project for our data lake. Each project would get a VM on which we install the required clients.

Is anyone doing this? Are there any problems or issues we should be aware of with this configuration?

1 ACCEPTED SOLUTION

Super Guru

@Adel Ouazani

As you already know, edge nodes are for running your client processes. They do not run cluster processes and usually do not store data, unless you ingest data through the edge node and stage it there. So the edge node configuration can be customized quite a bit to fit your needs.

I have not seen customers use a separate edge node for each project, but I don't see anything particularly wrong with it, except that it increases the number of ways your cluster can be accessed, which increases the chance of security holes.

One main consideration, however, is to make sure you have good network connectivity and bandwidth between your cluster and all of the edge nodes.

Other than that, provided each node has reasonable resources (CPU, memory, and disk, especially if you are staging data for ingest), this should be fine.
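If it helps with sizing, here is a rough Python sketch of the two checks I would run per edge node: how long a staged dataset would take to cross the link to the cluster, and whether the staging disk has room for it. The function names, the 0.7 link-efficiency factor and the 20% disk reserve are my own assumptions for illustration, not Hadoop settings or recommendations, so adjust them to your environment.

```python
import shutil


def transfer_seconds(dataset_bytes: int, link_gbps: float, efficiency: float = 0.7) -> float:
    """Rough time in seconds to move dataset_bytes over a link of link_gbps (gigabits/s).

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable; it is a guess, not a measured value.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    usable_bytes_per_sec = link_gbps * 1e9 / 8 * efficiency
    return dataset_bytes / usable_bytes_per_sec


def has_staging_headroom(path: str, dataset_bytes: int, reserve_fraction: float = 0.2) -> bool:
    """True if staging dataset_bytes under path still leaves reserve_fraction of the disk free.

    reserve_fraction is an assumed safety margin for illustration.
    """
    usage = shutil.disk_usage(path)
    reserve = usage.total * reserve_fraction
    return usage.free - dataset_bytes >= reserve


if __name__ == "__main__":
    dataset = 500 * 10**9  # hypothetical 500 GB staged per ingest run
    print(f"estimated transfer time: {transfer_seconds(dataset, link_gbps=10):.0f} s")
    print(f"enough staging space here: {has_staging_headroom('.', dataset)}")
```

Running this on each project VM against its staging directory gives a quick first check before you commit to a VM size; it is no substitute for measuring real throughput between the edge node and the cluster.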

I would also recommend reading the accepted answer on this thread for more details to help you make a decision.

https://community.hortonworks.com/questions/34872/staging-on-edge-nodes.html

