Experiencing another weird error.
I've created an EC2 instance for Cloudera Manager and went to its web console. I was presented with the add hosts screen, which recommended that I add the manager host, which I did. However, upon distribution of parcels, I am getting a repeated error untarring CDH 5.1.0. By repeated, I mean it is essentially stuck in a loop: it tries to distribute, fails, then restarts the distribution.
Thanks in advance if you've already seen this!
So is this a single host cluster? To troubleshoot parcel distribution, check the directories under /opt/cloudera. There could be a dangling symlink in one of those directories.
# ls -l /opt/cloudera/parcel*
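A dangling symlink can be hard to spot in a plain listing. As a minimal sketch (the helper name `find_dangling` is made up for illustration, and /opt/cloudera is simply the location mentioned above), the following prints every symlink whose target does not exist:

```shell
# Print symlinks under a directory whose targets cannot be resolved.
# find_dangling is a hypothetical helper name, not a Cloudera tool.
find_dangling() {
    dir="${1:-/opt/cloudera}"
    # Nothing to check if the directory is absent.
    [ -d "$dir" ] || return 0
    # "test -e" follows the link, so it fails only for broken links.
    find "$dir" -type l ! -exec test -e {} \; -print
}

find_dangling /opt/cloudera
```

No output means no broken links were found under that directory.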
Also check the agent logs, where you'll see the real cause of the failure. They are under /var/log/cloudera-scm-agent/*.log. If you can, please paste the log snippet here (or use a service like pastebin or GitHub gists).
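To pull a snippet worth pasting, something like the sketch below may help. The function name and the search patterns (error, exception, traceback) are assumptions chosen for illustration; the actual failure message in your logs may be worded differently.

```shell
# Show the most recent error-like lines from the agent logs.
# recent_agent_errors and the grep patterns are illustrative assumptions.
recent_agent_errors() {
    log_dir="${1:-/var/log/cloudera-scm-agent}"
    # Nothing to show if the log directory is absent.
    [ -d "$log_dir" ] || return 0
    # -h drops file names, -i ignores case; keep only the last 50 matches.
    grep -hiE 'error|exception|traceback' "$log_dir"/*.log 2>/dev/null | tail -n 50
}

recent_agent_errors
```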
Thank you for the hints.
The parcel directory looks OK to me, although two of the directories are owned by root rather than cloudera-scm. However, making it into a symbolic link made the problem go away. Thanks!
root@ip-10-133-2-15:/opt# ls -l /opt/cloudera/parcel*
drwx------ 3 root root 4096 Aug 14 00:10 tmpYThWAd
-rw-r----- 1 cloudera-scm cloudera-scm 1726036186 Aug 13 21:47 CDH-5.1.0-1.cdh5.1.0.p0.53-precise.parcel
-rw-r----- 1 cloudera-scm cloudera-scm 41 Aug 13 21:47 CDH-5.1.0-1.cdh5.1.0.p0.53-precise.parcel.sha
The permissions look fine. The temp directory (tmpYThWAd), though, looks like a leftover from a previously aborted download, and you should delete it. It's possible your network is slow and the downloads are timing out.
What do the agent logs say? That's where you'll see the real cause of problems.
Sadly, the logs for that particular day have been blown away. The dangling symlink was the issue: it was not pointing to the bigger storage device, so the root filesystem was probably filling up in the middle of the operation.
Thanks for your help!
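For anyone hitting the same loop, a quick way to check the disk-space theory is to see which filesystem actually backs the parcel directories and how much room it has. This is only a sketch: `show_free_space` is a made-up helper name, and the paths are the ones discussed in this thread.

```shell
# Show what each path is (including any symlink target) and the free
# space on the filesystem behind it. show_free_space is a hypothetical name.
show_free_space() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            ls -ld "$p"      # reveals whether the path is a symlink
            df -P -h "$p"    # filesystem and free space for that path
        fi
    done
    return 0
}

show_free_space / /opt/cloudera/parcel*
```

If the parcel directories report the same filesystem as /, the parcels are being unpacked onto the root volume rather than the larger device.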