I'm using CDH 4.4.0-1.cdh4.4.0.p0.39 on CentOS 6 and am having trouble changing the parcel path from /opt/cloudera to /data/cloudera. (I know I can symlink it, but the field is there in CM, so why not try to do it the right way, right?)
I went ahead and changed the parcel path via the CM UI to /data/cloudera/parcel-repo. Then I added a new host to the cluster, but to my surprise, I see the /opt/cloudera tree has been created and populated with parcels on the new host and nothing is in /data.
Am I missing an option to completely move the parcel path?
Created 09-17-2013 06:29 AM
One point of clarification on your first post, since you're setting parcel directories for two distinct uses:
1. Parcels originate in a Remote Repository (e.g. archive.cloudera.com, or your own mirror web server).
2. The Cloudera Manager server retrieves parcels and checksums from the Remote Repository and hosts them in /opt/cloudera/parcel-repo. This location is configurable via CM UI > Hosts > Parcels > Edit Settings > Local Parcel Repository Path.
3. Cluster nodes retrieve parcels from the CM server's Local Repository. Each node uses
/opt/cloudera/parcel-cache
/opt/cloudera/parcels
These locations are configurable via /etc/cloudera-scm-agent/config.ini on each respective node.
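To check which parcel directory a node's agent is actually configured with, something like the following Python sketch may help. It is only an illustration: it parses config.ini-style text and looks for a parcel_dir key in any section, falling back to the /opt/cloudera/parcels default mentioned above. The [General] section name and the sample contents are assumptions made for the example, not a copy of a real agent config.

```python
import configparser

# Default node-side parcel directory named in this thread.
DEFAULT_PARCEL_DIR = "/opt/cloudera/parcels"

def parcel_dir_from_config(text, default=DEFAULT_PARCEL_DIR):
    """Return the parcel_dir value found in INI-style text, or the default."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Search every section so the sketch does not depend on a section name.
    for section in parser.sections():
        if parser.has_option(section, "parcel_dir"):
            return parser.get(section, "parcel_dir")
    return default

# Hypothetical sample; the section name is an assumption for illustration.
sample = """
[General]
server_host=localhost
parcel_dir=/data/cloudera/parcels
"""

print(parcel_dir_from_config(sample))
```

On a real node you would pass in the contents of /etc/cloudera-scm-agent/config.ini instead of the sample string.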
Created 09-16-2013 09:45 PM
Setting parcel_dir=/data/cloudera/parcels in /etc/cloudera-scm-agent/config.ini fixed the problem (a service cloudera-scm-agent restart is required). Currently the agent does not properly follow a symlink from /opt/cloudera -> /data/cloudera.
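For anyone who wants to script that change across several nodes, here is a rough Python sketch of the edit itself. It works on the config text line by line so that other comments in the file are left alone. The assumption that the file already contains a parcel_dir line (active or commented out) is mine, as is the sample text; the function raises an error rather than guessing where to add the line.

```python
import re

# Matches an active or commented-out parcel_dir line, without spanning lines.
PARCEL_DIR_LINE = re.compile(r"^[ \t]*#?[ \t]*parcel_dir[ \t]*=.*$", re.MULTILINE)

def set_parcel_dir(config_text, new_dir):
    """Return config_text with its first parcel_dir line set to new_dir."""
    replacement = "parcel_dir=" + new_dir
    # A function replacement avoids backslash interpretation in the new path.
    updated, count = PARCEL_DIR_LINE.subn(lambda m: replacement, config_text, count=1)
    if count == 0:
        raise ValueError("no parcel_dir line found; add one by hand")
    return updated

# Hypothetical config text, for illustration only.
before = "[General]\nserver_host=localhost\n# parcel_dir=/opt/cloudera/parcels\n"
print(set_parcel_dir(before, "/data/cloudera/parcels"))
```

The agent still has to be restarted after the file is written back.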
Before updating config.ini I couldn't deploy any additional parcels besides CDH, and these errors were showing up in /var/log/cloudera-scm-agent/cloudera-scm-agent.log:
[16/Sep/2013 18:43:21 +0000] 47072 Thread-13 parcel_cache INFO Unpacking /opt/cloudera/parcel-cache/HADOOP_LZO-0.4.15-1.gplextras.p0.24-el6.parcel into /opt/cloudera/parcels
[16/Sep/2013 18:43:21 +0000] 47072 MainThread parcel INFO Loading parcel manifest for: CDH-4.4.0-1.cdh4.4.0.p0.39
[16/Sep/2013 18:43:21 +0000] 47072 MainThread parcel INFO Loading parcel manifest for: HADOOP_LZO-0.4.15-1.gplextras.p0.24
[16/Sep/2013 18:43:21 +0000] 47072 MainThread parcel INFO Loading parcel manifest for: CDH
[16/Sep/2013 18:43:21 +0000] 47072 MainThread parcel ERROR Exception while reading parcel: CDH
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/parcel.py", line 103, in refresh
    pid = ParcelId.dir(child)
  File "/usr/lib64/cmf/agent/src/cmf/parcel_id.py", line 74, in dir
    raise Exception("Invalid parcel directory: %s" % (dir))
Exception: Invalid parcel directory: CDH
[16/Sep/2013 18:43:21 +0000] 47072 MainThread parcel ERROR Exception while refreshing parcel repository.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/parcel.py", line 680, in process_default
    self.repo.refresh()
  File "/usr/lib64/cmf/agent/src/cmf/parcel.py", line 128, in refresh
    pid = ParcelId.dir(child)
  File "/usr/lib64/cmf/agent/src/cmf/parcel_id.py", line 74, in dir
    raise Exception("Invalid parcel directory: %s" % (dir))
Exception: Invalid parcel directory: CDH
[16/Sep/2013 21:36:52 +0000] 47072 MainThread agent INFO Stopping agent...
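The parcel names in the log all have the form NAME-VERSION (CDH-4.4.0-1.cdh4.4.0.p0.39, HADOOP_LZO-0.4.15-1.gplextras.p0.24), while the entry that fails is a bare CDH with no version part. The Python sketch below is a hypothetical check along those lines; it is not the code in cmf/parcel_id.py, and both function names are made up for the example. It only shows which entries in a directory listing would fail a simple NAME-VERSION split.

```python
def split_parcel_dir(entry):
    """Split 'NAME-VERSION' at the first hyphen; raise if either part is missing."""
    name, sep, version = entry.partition("-")
    if not (sep and name and version):
        raise ValueError("Invalid parcel directory: %s" % entry)
    return name, version

def unparseable_entries(entries):
    """Return the entries that do not look like NAME-VERSION."""
    bad = []
    for entry in entries:
        try:
            split_parcel_dir(entry)
        except ValueError:
            bad.append(entry)
    return bad

# Entry names taken from the log above.
entries = [
    "CDH-4.4.0-1.cdh4.4.0.p0.39",
    "HADOOP_LZO-0.4.15-1.gplextras.p0.24",
    "CDH",
]
print(unparseable_entries(entries))
```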
Created 09-17-2013 06:12 AM
Thanks for your report. This is a confirmed issue, and a very near-term release will address parcel_dir= not properly enumerating parcels when a symlink is present.
You likely already discovered that you can in the short term change parcel_dir= to the end target and restart the agent, though we'll get this resolved ASAP.
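If it helps to find the "end target" of a symlinked path, here is a small Python sketch using os.path.realpath from the standard library. The helper names are hypothetical, and the demo builds a throwaway directory tree that only mimics /opt/cloudera -> /data/cloudera, so it does not touch real paths.

```python
import os
import tempfile

def end_target(path):
    """Resolve every symlink in path and return the final location."""
    return os.path.realpath(path)

def crosses_symlink(path):
    """True when resolving path lands somewhere other than path itself."""
    return os.path.realpath(path) != os.path.abspath(path)

# Demo in a temporary tree that mimics /opt/cloudera -> /data/cloudera.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)  # avoid symlinks in the temp location itself
    real_parcels = os.path.join(root, "data", "cloudera", "parcels")
    os.makedirs(real_parcels)
    os.makedirs(os.path.join(root, "opt"))
    os.symlink(os.path.join(root, "data", "cloudera"),
               os.path.join(root, "opt", "cloudera"))

    configured = os.path.join(root, "opt", "cloudera", "parcels")
    via_symlink = crosses_symlink(configured)
    resolved_ok = end_target(configured) == real_parcels

print(via_symlink)   # whether the configured path goes through a symlink
print(resolved_ok)   # whether the resolved path is the real parcels directory
```

On a node, end_target("/opt/cloudera/parcels") would give the value to put in parcel_dir=.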
Best,
--
Created 07-23-2019 06:13 AM
@smark is following symlinks in the parcel directory config available yet?
I'm using CDH 5.14 and it seems not.
In CM, in the host config, I have set Parcel Directory = /opt/cloudera/parcels
On the system: /opt/cloudera -> /data/cloudera
In the Spark config files, spark-env.sh and spark-defaults, /opt/cloudera is replaced with /data/cloudera everywhere.
Is it somehow possible to work around that?
Thanks!
Created 07-23-2019 09:36 AM
This thread covers a different issue that is quite old.
Let's continue the conversation in the other thread you opened if that is ok:
Please explain what problem you are facing if there is one.
Created 10-23-2017 11:00 AM
Hi Smark,
Is it ok to host the local repository in one of the cluster nodes?
Thanks,
Priyanka