Created 11-28-2019 12:47 AM
I agree with you: applications and third-party tools/components should definitely be installed outside the cluster, or on a separate new node, to avoid major performance impacts.
How to manage those components when the Hadoop version changes is more of a DevOps question, I feel.
You always need to keep an inventory of the applications running alongside your ecosystem components, together with their dependencies.
Alongside that, you can use Nexus as a centralized repository from which to fetch the new versions that need to be deployed on your application side (i.e. Oracle Data Integrator and JupyterHub), with the help of Jenkins or another deployment tool.
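To make the inventory idea a bit more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the `AppEntry` fields, the `needs_review` helper, the host names and the version strings are made-up placeholders, not real compatibility data. It only shows how such a list could tell you which applications still need validation before a Hadoop upgrade.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AppEntry:
    """One row of the application inventory (hypothetical structure)."""
    name: str
    version: str                      # placeholder version string
    host: str                         # node outside the cluster where the app runs
    tested_hadoop: Tuple[str, ...]    # Hadoop versions this release was validated against


def needs_review(inventory: List[AppEntry], target_hadoop: str) -> List[str]:
    """Return the names of apps not yet validated against the target Hadoop version."""
    return [app.name for app in inventory if target_hadoop not in app.tested_hadoop]


# Illustrative entries only; the versions and hosts are invented for this example.
INVENTORY = [
    AppEntry("Oracle Data Integrator", "x.y", "app-node-1", ("2.7", "3.1")),
    AppEntry("JupyterHub", "x.y", "app-node-2", ("2.7",)),
]

if __name__ == "__main__":
    # Apps to re-test (and redeploy via your deployment tool) before moving to Hadoop 3.1
    print(needs_review(INVENTORY, "3.1"))
```

In practice you would probably keep this list in a file under version control and let the Jenkins job read it, but that part depends on your setup.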
In my experience, installing applications on edge nodes leads to resource-related problems, so I would not recommend it.
Do reply if you have further points to raise.