How to stop auto-distribution of parcels when adding a host to a Cloudera cluster

New Contributor

We are trying to optimize cluster provisioning by preloading parcels into our VM images. The goal is to have parcels already available on each host before they are added to the cluster, so we can avoid re-distribution during host onboarding.

Here is what we tested and observed:

  1. We baked parcels into the VM image under /opt/cloudera/parcels/.

  2. When we add a host created from this image into the cluster, Cloudera Manager still triggers parcel distribution.

  3. During this process, CM deletes the preloaded parcels on the new host and redistributes them again.

  4. We have tested the following without success:

    • Setting Auto Distribution = false

    • Verifying parcel existence manually

    • Creating/updating the .distributed_parcels file with the correct versions

  5. Our conclusion so far is that Cloudera Manager does not trust preloaded parcels, but instead enforces consistency against its internal parcel state database, which causes redistribution and deletion even if parcels are already present.

Our question:

  • Is there any supported way to have parcels preloaded on all hosts (via a baked image or other method) so that Cloudera Manager recognizes them as already distributed?

  • If not, what is the recommended approach instead?

5 REPLIES

Master Collaborator

Hello @ishashrestha 

Thank you for reaching out to the Cloudera community.

Are you adding the hosts through the Cloudera Manager wizard? Can you try adding the host to Cloudera Manager manually, that is, just install the packages and configure config.ini?

Usually, the parcels will be downloaded based on the configuration in /var/lib/cloudera-scm-agent/active_parcels.json

New Contributor

@upadhyayk04 I am adding a host through the API (using Ansible), but even when I add it manually, it starts distributing.
My current /var/lib/cloudera-scm-agent/active_parcels.json contains the required parcel:
{"CDH": "7.1.9-1.cdh7.1.9.p0.44702451"}
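
A quick way to confirm that the baked image really contains what the agent expects is to compare active_parcels.json against the parcel directory. The sketch below is only a presence check and does not change what Cloudera Manager considers distributed. It assumes each parcel is extracted to a directory named `<PRODUCT>-<VERSION>` under the parcel root; the function name `missing_parcel_dirs` is hypothetical.

```python
import json
from pathlib import Path


def missing_parcel_dirs(active_json, parcels_root):
    """Return the expected parcel directories that are absent on disk.

    Assumption: parcels are extracted to <parcels_root>/<PRODUCT>-<VERSION>.
    """
    active = json.loads(Path(active_json).read_text())
    root = Path(parcels_root)
    missing = []
    for product, version in sorted(active.items()):
        dir_name = f"{product}-{version}"
        if not (root / dir_name).is_dir():
            missing.append(dir_name)
    return missing


if __name__ == "__main__":
    # Paths taken from this thread; adjust them if your layout differs.
    print(missing_parcel_dirs(
        "/var/lib/cloudera-scm-agent/active_parcels.json",
        "/opt/cloudera/parcels",
    ))
```

An empty list only means the directories exist; it says nothing about the state Cloudera Manager has recorded for the host.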


Master Collaborator

Thanks for the update, @ishashrestha. What happens if you remove the parcel entry from the file, that is, keep it empty? Ideally, the agent would pick it up automatically and try to distribute the parcels. The way to skip that would be to add the host to Cloudera Manager but not to the cluster, because adding it to the cluster will try to enable the activated parcels.

New Contributor

Thank you for the response @upadhyayk04 . To clarify, I have a baked image that already includes Cloudera Manager, the agent, and the parcel. The hosts are already added in Cloudera Manager, and after that, they are added to my cluster. However, my requirement is for them to be added as parcel-ready. The problem is that it defeats the purpose when Cloudera Manager starts redistributing the parcel again. I was wondering if there’s a solution to this.

Rising Star

@ishashrestha Cloudera Manager tracks parcel state centrally in its database (AVAILABLE_REMOTELY, DOWNLOADED, DISTRIBUTED, ACTIVATED, etc.). So even if the parcel bits are already present on the host, the agent will redistribute them. Managing parcels is also the agent's responsibility: it performs many background tasks during the parcel lifecycle.
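
To see which stage Cloudera Manager has recorded for each parcel, the state can be read through the CM REST API. This is a minimal sketch, assuming the cluster parcels endpoint (`/api/<version>/clusters/<cluster>/parcels`) returns an `items` list whose entries carry `product`, `version`, and `stage`; check the API documentation for your CM release. The function names, and the URL, API version, and credentials passed in, are placeholders.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def fetch_parcels(base_url, api_version, cluster, user, password):
    """GET the parcel list for a cluster (endpoint path is an assumption)."""
    url = f"{base_url}/api/{api_version}/clusters/{quote(cluster)}/parcels"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def parcel_stages(payload):
    """Map (product, version) to the stage recorded by Cloudera Manager."""
    return {
        (item["product"], item["version"]): item["stage"]
        for item in payload.get("items", [])
    }
```

Calling `parcel_stages(fetch_parcels(...))` before and after adding a host shows the stage that CM itself holds, which is the state it acts on rather than the files on disk.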

In the distribution phase, it compares the .sha file (from the repo) with the .parcel file in the cache. This ensures there is no corruption or mismatch, and it then extracts the parcel (a .parcel file is a tarball).
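
The integrity check described above can be reproduced by hand on a baked image. This is a rough sketch, not the agent's actual code: it assumes the .sha file holds a hex digest as its first token and that the digest is SHA-1 (pass a different `algorithm` if your repo uses another one). The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Hash a file in chunks so large parcels are not loaded into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parcel_matches_sha(parcel_path, sha_path, algorithm="sha1"):
    """Compare a .parcel file against the digest stored in its .sha file."""
    tokens = Path(sha_path).read_text().split()
    if not tokens:
        return False  # an empty .sha file cannot confirm anything
    return file_digest(parcel_path, algorithm) == tokens[0].lower()
```

A mismatch here would be one concrete reason for the agent to discard a preloaded parcel and fetch it again.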

In the activation phase, it creates or updates symlinks in the parcel directory, /etc/alternatives, and /var/lib/alternatives. It also creates the service users that are needed for service installation.

If any of these don’t line up, CM will re-trigger distribution/activation even if the bits are there.