
Mounting Edge Node Directories to External Applications

New Contributor

Hello All,

We have applications in our data organization, such as Oracle Data Integrator and JupyterHub.

Those applications require up-to-date JARs, core-site.xml, hdfs-site.xml, etc., to integrate properly with the cluster.
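For example, a minimal client-side core-site.xml only needs to point the application at the cluster; the address below is illustrative:

    <?xml version="1.0"?>
    <configuration>
      <!-- Illustrative only: replace with the cluster's actual NameNode address -->
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://namenode.example.com:8020</value>
      </property>
    </configuration>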

Our application architects suggest that we install those applications' clients and/or agents on the edge nodes of our cluster.

I don't want to do this because:

* I might need to deal with conflicting environment requirements.

* I will have limited control over server resource consumption.

* Application-level operations such as upgrades might have adverse effects on the cluster.

Instead, I am considering offering the application servers read-only mounts of the edge nodes' JAR and configuration directories.
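As a rough sketch of what I mean (all hostnames and paths below are made up), the edge node would export the directories via NFS and each application server would mount them read-only:

    # On the edge node, /etc/exports: share the conf and jar directories read-only
    /etc/hadoop/conf   odi01.example.com(ro,sync,no_subtree_check)
    /opt/hadoop/jars   odi01.example.com(ro,sync,no_subtree_check)

    # On the application server, /etc/fstab: mount them read-only
    edge01.example.com:/etc/hadoop/conf  /etc/hadoop/conf  nfs  ro,hard,noatime  0 0
    edge01.example.com:/opt/hadoop/jars  /opt/hadoop/jars  nfs  ro,hard,noatime  0 0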

Any suggestions or evaluations of either architecture, as well as alternatives, are more than welcome.

Best regards

1 ACCEPTED SOLUTION

Expert Contributor

Hi @laplacesdemon 

 

I agree with you; applications and third-party tools/components should definitely be installed outside the cluster, or on a dedicated new node, to avoid major performance impacts.

 

As for how to manage those components when the Hadoop version changes, that is really more of a DevOps question, I feel.

You always need to maintain an inventory of the applications running alongside your ecosystem components, together with their dependencies.
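Even a simple hand-maintained inventory file goes a long way; the format and entries below are purely illustrative:

    # Hypothetical inventory of external applications and their cluster dependencies
    - application: Oracle Data Integrator
      host: odi01.example.com
      hadoop_client_version: 3.1.1
      depends_on: [hadoop-client JARs, core-site.xml, hdfs-site.xml]
    - application: JupyterHub
      host: jhub01.example.com
      hadoop_client_version: 3.1.1
      depends_on: [spark client, hdfs-site.xml]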

In addition, you can use Nexus as a centralized repository from which the new versions that need to be deployed on your application side (i.e., Oracle Data Integrator and JupyterHub) are fetched, with the help of Jenkins or some other deployment tool.
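As a minimal sketch of that deployment step (the Nexus URL, repository name, version, and destination path are all hypothetical; it assumes the standard Maven repository layout that Nexus serves):

    #!/usr/bin/env python3
    # Fetch a specific Hadoop client JAR from a Nexus-hosted Maven repository
    # and drop it into the application's lib directory. All names are illustrative.
    import pathlib
    import urllib.request

    NEXUS_BASE = "https://nexus.example.com/repository/releases"  # hypothetical
    GROUP = "org/apache/hadoop"               # Maven groupId written as a path
    ARTIFACT = "hadoop-client"
    VERSION = "3.1.1"                         # bump when the cluster is upgraded
    DEST_DIR = pathlib.Path("/opt/odi/lib")   # hypothetical application lib dir

    def fetch_jar() -> pathlib.Path:
        jar_name = f"{ARTIFACT}-{VERSION}.jar"
        url = f"{NEXUS_BASE}/{GROUP}/{ARTIFACT}/{VERSION}/{jar_name}"
        DEST_DIR.mkdir(parents=True, exist_ok=True)
        dest = DEST_DIR / jar_name
        tmp = dest.with_suffix(".part")       # download under a temporary name
        with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
            out.write(resp.read())
        tmp.rename(dest)                      # then rename into place atomically
        return dest

    if __name__ == "__main__":
        print(f"Deployed {fetch_jar()}")

A Jenkins job (or whichever deployment tool you pick) can then run a script like this against each application host whenever a new client version lands in Nexus.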

 

In my experience, installing applications on edge nodes leads to resource-related problems, so I would suggest that it is not a good idea.

 

Do revert if you have further points to highlight.


3 REPLIES


New Contributor

First of all, thanks for the reply.

 

We are at the very beginning of our journey, and most of our workload right now consists of architectural decisions like the one in my question.

 

We were also considering building a DevOps pipeline; Nexus, as you mentioned, is under consideration, as is GitLab, with or without Jenkins.

 

But it had not occurred to me, before you mentioned it, that we could cover this problem with a DevOps pipeline setup.

 

Your suggestion has been more than helpful to me.

 

Best regards

Expert Contributor

Hi @laplacesdemon 

 

Thank you for the response and the appreciation.

I will be happy to contribute and share my experiences going forward. Thank you for accepting the answer.