
cloudera-scm-agent deleting Cloudera installed files during EC2 instance cloning via AMI creation

New Contributor

Not too long ago we installed HDFS, MapReduce and HBase clients on a CentOS EC2 instance in AWS using Cloudera Manager 4.6.2. Recently we had the need to clone that box for some installation testing. We tried imaging the instance using the AMI creation wizard in the AWS console and then launching an EC2 instance from that AMI.

 

After the new instance rebooted we found that the Cloudera files and folders, including those in /usr/bin and /opt/cloudera/parcels, had been removed. The original instance had, for example, these folders under /opt/cloudera/parcels:

 

lrwxrwxrwx 1 root root 25 Jan 30 17:53 CDH -> CDH-4.2.1-1.cdh4.2.1.p0.5
drwxr-xr-x 9 1106 592 4096 Dec 14 2012 CDH-4.1.2-1.cdh4.1.2.p0.30
drwxr-xr-x 9 root root 4096 Apr 22 2013 CDH-4.2.1-1.cdh4.2.1.p0.5

In the new instance the folder looks like this:

lrwxrwxrwx 1 root root 25 Jan 30 17:53 CDH -> CDH-4.2.1-1.cdh4.2.1.p0.5


After some investigation we found that the cloudera-scm-agent init.d script may be responsible for making these changes on the new instance at reboot time. When we disable that script, the changes do not happen. When we re-enabled it and looked at the audit trail for changes in the /opt/cloudera/parcels folder, we saw entries like the following:

--------------------------------------------

----
type=PATH msg=audit(02/04/14 07:32:26.512:4583) : item=1 name=/opt/cloudera/parcels/CDH-4.1.2-1.cdh4.1.2.p0.30/share/hue/build/env/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-x86_64.egg/EGG-INFO/scripts/mktap inode=296125 dev=ca:41 mode=file,755 ouid=unknown(1106) ogid=unknown(592) rdev=00:00
type=PATH msg=audit(02/04/14 07:32:26.512:4583) : item=0 name=/opt/cloudera/parcels/CDH-4.1.2-1.cdh4.1.2.p0.30/share/hue/build/env/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-x86_64.egg/EGG-INFO/scripts/ inode=296121 dev=ca:41 mode=dir,755 ouid=unknown(1106) ogid=unknown(592) rdev=00:00
type=CWD msg=audit(02/04/14 07:32:26.512:4583) : cwd=/
type=SYSCALL msg=audit(02/04/14 07:32:26.512:4583) : arch=x86_64 syscall=unlink success=yes exit=0 a0=267a700 a1=1 a2=7f8b99646a88 a3=67652e34365f3638 items=2 ppid=1 pid=1608 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=python exe=/usr/lib64/cmf/agent/build/env/bin/python key=my-cloudera-changes

----
type=PATH msg=audit(02/04/14 07:32:26.512:4584) : item=1 name=/opt/cloudera/parcels/CDH-4.1.2-1.cdh4.1.2.p0.30/share/hue/build/env/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-x86_64.egg/EGG-INFO/scripts/tapconvert inode=296130 dev=ca:41 mode=file,755 ouid=unknown(1106) ogid=unknown(592) rdev=00:00
type=PATH msg=audit(02/04/14 07:32:26.512:4584) : item=0 name=/opt/cloudera/parcels/CDH-4.1.2-1.cdh4.1.2.p0.30/share/hue/build/env/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-x86_64.egg/EGG-INFO/scripts/ inode=296121 dev=ca:41 mode=dir,755 ouid=unknown(1106) ogid=unknown(592) rdev=00:00
type=CWD msg=audit(02/04/14 07:32:26.512:4584) : cwd=/
type=SYSCALL msg=audit(02/04/14 07:32:26.512:4584) : arch=x86_64 syscall=unlink success=yes exit=0 a0=2ea80b0 a1=1 a2=7f8b99646a88 a3=67652e34365f3638 items=2 ppid=1 pid=1608 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=python exe=/usr/lib64/cmf/agent/build/env/bin/python key=my-cloudera-changes

---------------------------------------------
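Each event above pairs a SYSCALL record (syscall=unlink, exe=/usr/lib64/cmf/agent/build/env/bin/python, ppid=1, i.e. the agent started from init) with PATH records naming the deleted file and its parent directory. For reference, here is a minimal Python sketch of how we summarise these records; it is only illustrative, and assumes auditd/ausearch are available and that the watch uses the my-cloudera-changes key seen above.

# Illustrative only: summarise which files were unlinked, and by which
# executable, from the interpreted audit records shown above.
import re
import subprocess

# "ausearch -i" prints interpreted records like the ones quoted above
proc = subprocess.Popen(["ausearch", "-k", "my-cloudera-changes", "-i"],
                        stdout=subprocess.PIPE)
out = proc.communicate()[0].decode("utf-8", "replace")

exes = {}   # audit event id -> executable that issued the unlink
files = {}  # audit event id -> files named in the event's PATH records
for line in out.splitlines():
    event = re.search(r"msg=audit\([^)]*:(\d+)\)", line)
    if not event:
        continue
    ev = event.group(1)
    if "type=SYSCALL" in line and "syscall=unlink" in line:
        m = re.search(r"exe=(\S+)", line)
        if m:
            exes[ev] = m.group(1)
    elif "type=PATH" in line and "mode=file" in line:
        m = re.search(r"name=(\S+)", line)
        if m:
            files.setdefault(ev, []).append(m.group(1))

for ev in sorted(exes):
    for path in files.get(ev, []):
        print("event %s: %s unlinked %s" % (ev, exes[ev], path))

In our case every unlink under /opt/cloudera/parcels traces back to /usr/lib64/cmf/agent/build/env/bin/python, which is why we suspect the agent.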

 

I was wondering if somebody could give me some insight into this behavior during the EC2 cloning process, and whether there is any way to clone the instance without running into this problem?

 

We are trying to test the update/installation of a piece of code that we have written, without messing up the live box.

 

2 REPLIES

Re: cloudera-scm-agent deleting Cloudera installed files during EC2 instance cloning via AMI creation

Master Collaborator

Can you clarify what version of Cloudera Manager this is, and whether the AMI imaging process might have erased the /opt/cloudera/parcel-repo directory on your server? The agents try to keep their nodes in sync with the server, and if you wipe that directory it might trigger the agents to mirror that change. You can disable and re-enable that parcel to reinstall CDH in all the places you mentioned. But I'm curious how re-imaging a node avoids disrupting it, since you indicated it's a live node and you'd prefer not to disrupt anything.
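
To make the sync idea above concrete, here is a rough Python sketch of the behaviour being described. This is not the actual cloudera-scm-agent code, just an illustration of why locally unpacked parcels would disappear if the server no longer advertises them; parcels_expected_by_server stands in for whatever the agent receives from the Cloudera Manager server.

# Hypothetical sketch, not the real agent: keep /opt/cloudera/parcels in
# line with the set of parcels the server says should be present.
import os
import shutil

PARCELS_DIR = "/opt/cloudera/parcels"

def sync_parcels(parcels_expected_by_server):
    # Parcel directories unpacked locally (skip symlinks such as CDH -> ...)
    local = set(d for d in os.listdir(PARCELS_DIR)
                if not os.path.islink(os.path.join(PARCELS_DIR, d)))
    # Anything the server no longer knows about gets removed, file by
    # file, which is what the audit trail in the question shows.
    for extra in local - set(parcels_expected_by_server):
        shutil.rmtree(os.path.join(PARCELS_DIR, extra))

If the server-side record of a parcel went missing, for example because the parcel-repo directory was wiped, that kind of reconciliation would produce exactly the unlinks captured in the audit trail.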


Re: cloudera-scm-agent deleting Cloudera installed files during EC2 instance cloning via AMI creation

New Contributor

We have a Hadoop cluster that was set up using Cloudera Manager 4.6.2. The box that we are trying to clone is a client to that cluster, so the various client libraries on it were installed using the same Cloudera Manager.

 

In the original instance that we are trying to clone, I see only these folders and files:

 

/opt/cloudera/parcel-cache/CDH-4.1.2-1.cdh4.1.2.p0.30-el6.parcel

/opt/cloudera/parcel-cache/CDH-4.2.1-1.cdh4.2.1.p0.5-el6.parcel

/opt/cloudera/parcels/CDH -> CDH-4.2.1-1.cdh4.2.1.p0.5

/opt/cloudera/parcels/CDH-4.1.2-1.cdh4.1.2.p0.30/

/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/

 

I do not see any /opt/cloudera/parcel-repo folder. Should there be one?

 

When we were trying to debug this problem, we attached the snapshot created by the AMI creation process as an extra volume to a temporary EC2 instance to see whether any of the above content had been removed by the AMI creation, and it had not. I have been working with the AWS folks on this, and so far we have narrowed it down to something around the scm agent script.

 

Regarding the last part of your comment, I am not sure I understand it correctly, but we are imaging with the no-reboot option, so the instance keeps running. We have tried it with the reboot option too, with the same results.

 

If I remember correctly, we did not have this problem with an older version of Cloudera Manager, where we did clone client nodes for this kind of testing.

 

If I could disrupt the client node for some time, is there any temporary config change I could make to get around this problem?

 

Thanks for replying

 
