Created on 08-24-2025 11:15 PM - edited 08-25-2025 12:41 AM
We are trying to optimize cluster provisioning by preloading parcels into our VM images. The goal is to have parcels already available on each host before they are added to the cluster, so we can avoid re-distribution during host onboarding.
Here is what we tested and observed:
We baked parcels into the VM image under /opt/cloudera/parcels/.
When we add a host created from this image into the cluster, Cloudera Manager still triggers parcel distribution.
During this process, CM deletes the preloaded parcels on the new host and redistributes them again.
We have tested the following without success:
Setting Auto Distribution = false
Verifying parcel existence manually
Creating/updating the .distributed_parcels file with the correct versions
Our conclusion so far is that Cloudera Manager does not trust preloaded parcels, but instead enforces consistency against its internal parcel state database, which causes redistribution and deletion even if parcels are already present.
Our question:
Is there any supported way to have parcels preloaded on all hosts (via a baked image or other method) so that Cloudera Manager recognizes them as already distributed?
If not, what is the recommended approach instead?
Created 08-25-2025 12:29 AM
Hello @ishashrestha
Thank you for reaching out to the Cloudera community.
Are you adding the hosts through the Cloudera Wizard? Can you try adding the host to Cloudera Manager manually, that is, just install the packages and configure config.ini?
Usually, the parcels are downloaded based on the configuration in /var/lib/cloudera-scm-agent/active_parcels.json.
Created 08-25-2025 11:51 PM
@upadhyayk04 I am adding a host through the API (using Ansible), but even when I add it manually, it starts distributing.
My current /var/lib/cloudera-scm-agent/active_parcels.json contains the required parcel:
{"CDH": "7.1.9-1.cdh7.1.9.p0.44702451"}
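For anyone comparing the agent's file against what is actually on disk, here is a minimal Python sketch. It assumes each parcel is unpacked into a directory named PRODUCT-VERSION under the parcel root (for example CDH-7.1.9-1.cdh7.1.9.p0.44702451); the function name missing_parcel_dirs is hypothetical and not part of any Cloudera tooling. Passing this check only shows that the bits are present; it does not make Cloudera Manager treat the parcel as distributed.

```python
import json
from pathlib import Path


def missing_parcel_dirs(active_json, parcels_root):
    """Return PRODUCT-VERSION names listed in active_parcels.json
    that have no matching directory under parcels_root."""
    active = json.loads(Path(active_json).read_text())
    root = Path(parcels_root)
    missing = []
    for product, version in active.items():
        # Assumption: the unpacked parcel directory is named PRODUCT-VERSION.
        name = f"{product}-{version}"
        if not (root / name).is_dir():
            missing.append(name)
    return missing


# Example call on a host (paths taken from this thread):
# missing_parcel_dirs("/var/lib/cloudera-scm-agent/active_parcels.json",
#                     "/opt/cloudera/parcels")
```

An empty list means every parcel named in the file has a directory on the host.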
Created 08-26-2025 12:18 AM
Thanks for the update, @ishashrestha. What happens if you remove the parcel entry from the file, that is, keep it empty? Ideally, it would pick it up automatically and try to distribute the parcels. The way to skip distribution would be to add the host to Cloudera Manager but not to the cluster, since adding it to the cluster will try to enable the activated parcels.
Created 08-26-2025 01:14 AM
Thank you for the response @upadhyayk04 . To clarify, I have a baked image that already includes Cloudera Manager, the agent, and the parcel. The hosts are already added in Cloudera Manager, and after that, they are added to my cluster. However, my requirement is for them to be added as parcel-ready. The problem is that it defeats the purpose when Cloudera Manager starts redistributing the parcel again. I was wondering if there’s a solution to this.
Created 08-29-2025 10:01 AM
@ishashrestha Cloudera Manager tracks parcel state centrally in its DB (AVAILABLE_REMOTELY, DOWNLOADED, DISTRIBUTED, ACTIVATED, etc.). So even if the parcel bits are already present on the host, the agent will redistribute them. Also, managing parcels is the agent's responsibility. It performs a lot of background tasks in the parcel lifecycle.
In the distribution phase, it compares the .sha file (from the repo) with the .parcel file in the cache. This ensures there is no corruption or mismatch, and then it extracts the parcel (a .parcel file is a tarball).
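To check a baked-in .parcel file by hand in the same spirit, a sketch along these lines may help. It assumes the .sha file holds a hex digest as its first token and that the digest is SHA-1; both are assumptions to confirm against your repository, and the helper name parcel_matches_sha is hypothetical. This is not the agent's actual code path.

```python
import hashlib
from pathlib import Path


def parcel_matches_sha(parcel_path, sha_path, algorithm="sha1"):
    """Hash the parcel file and compare it with the digest in the .sha file."""
    digest = hashlib.new(algorithm)
    with open(parcel_path, "rb") as fh:
        # Read in 1 MiB chunks so large parcels are not loaded into memory.
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    tokens = Path(sha_path).read_text().split()
    if not tokens:
        return False  # empty .sha file: nothing to compare against
    # Assumption: the first whitespace-separated token is the hex digest.
    return digest.hexdigest() == tokens[0].lower()
```

A False result would suggest the cached file is corrupt or does not match the repo, which is one reason distribution could be re-triggered.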
In the activation phase, it creates or updates symlinks in the parcel directory, /etc/alternatives, and /var/lib/alternatives. It also creates the service users needed for service installation.
If any of these don’t line up, CM will re-trigger distribution/activation even if the bits are there.
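Since the centrally tracked state is what decides this, it can be useful to read the stage Cloudera Manager itself reports for a parcel. The sketch below assumes the CM REST API parcel resource at /api/<version>/clusters/<cluster>/parcels/products/<product>/versions/<version> and a "stage" field in the JSON response; the API version, host, and credentials are placeholders, and the function names are hypothetical. Please verify the path against the API documentation for your CM release.

```python
import base64
import json
import urllib.parse
import urllib.request


def parcel_url(base_url, api_version, cluster, product, version):
    """Build the (assumed) CM API URL for one parcel of one cluster."""
    cluster, product, version = (
        urllib.parse.quote(part, safe="") for part in (cluster, product, version)
    )
    return (
        f"{base_url.rstrip('/')}/api/{api_version}/clusters/{cluster}"
        f"/parcels/products/{product}/versions/{version}"
    )


def fetch_parcel_stage(url, user, password):
    """Return the parcel stage reported by CM, e.g. DISTRIBUTED or ACTIVATED."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        # Assumption: the response body is JSON with a top-level "stage" key.
        return json.load(response).get("stage")
```

Comparing this stage before and after adding a host shows what CM believes, independent of what is on the host's disk.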
Created 09-07-2025 10:48 PM
@ishashrestha Did the response assist in resolving your query? If it did, please mark the relevant reply as the solution, as it will help others locate the answer more easily in the future.
Regards,
Vidya Sargur
Created 09-24-2025 08:11 PM
Hello
Copying a parcel to a node and adding the node to the cluster does not mean Cloudera Manager will recognize that the parcel is already present and skip distribution. Cloudera Manager manages parcel distribution through its centralized management system. When a node is added to the cluster, Cloudera Manager will handle the parcel distribution process according to its internal procedures, ensuring that the correct and complete parcel metadata and components are properly configured and unpacked on the node.
This process ensures consistency and reliability in the cluster configuration. Therefore, even if a parcel is manually copied to a node, Cloudera Manager is likely to still perform its distribution process for verification and consistency purposes.
Copying the parcels to the nodes before adding them to the cluster is not the proper or suggested way to do it, and is therefore not supported.
The recommended method is to distribute the parcels through the Wizard as usual, since Cloudera supports parcel distribution through the UI only.
I hope this helps.