Member since: 07-29-2013
Posts: 12
Kudos Received: 4
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 7077 | 12-17-2013 10:14 AM |
| | 3372 | 10-18-2013 08:49 AM |
12-17-2013 10:14 AM
1 Kudo
Ok, so the way I understand it: when you do an upgrade, it puts the previous filesystem state into a 'previous' subdirectory of each dfs.name.dir. That way the upgrade can be rolled back if something goes wrong (basically by doing a rename(prevDir, curDir)). Once the upgrade has proven successful you can either finalize it, which removes the prevDir, or roll back, which does the rename I mentioned. For this reason, starting a new upgrade checks that the prevDir does not exist; otherwise you would lose the ability to roll back a previously unfinalized upgrade.

Based on all of that and your exception, it seems an upgrade was or is already in progress and was never finalized. Have you run this upgrade command a few times? If so, do you have the output from the very first run? All state is kept within these directories, so you should be able to take a manual backup like I mentioned and then try some things out. For instance, you might try finalizing or rolling back the previous upgrade and go from there.

It would be nice if someone else chimed in, because while I have some understanding of how this works, it is all based on my own reading of the docs and the code, not any real-world experience (like I said, I've never had to do an fs upgrade because we always migrate to a new cluster). Hope this helps.
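To make that concrete (the path below is just a placeholder for your dfs.name.dir, and the exact startup options can vary a bit by version), the layout and commands look roughly like this:

```bash
# Placeholder layout -- substitute your actual dfs.name.dir.
# While an upgrade is pending (not yet finalized), both trees exist:
#   /data/1/dfs/nn/current/    <- upgraded state
#   /data/1/dfs/nn/previous/   <- pre-upgrade state kept for rollback

# Start an upgrade (refuses to run if a leftover 'previous' dir exists):
hadoop namenode -upgrade

# Keep the upgrade: removes 'previous' and with it the ability to roll back.
hadoop dfsadmin -finalizeUpgrade

# Or abandon it: restores the old state (roughly the rename(prevDir, curDir) above).
hadoop namenode -rollback
```

Either way, take the manual backup of dfs.name.dir before you try any of this.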
12-17-2013 09:02 AM
I did not have to do this, since I did a direct copy of the data to a new cluster. A couple of notes, though:

1) You should only need the latest fsimage file. If an older one is causing issues, you could delete it as long as you still have another. Don't delete the edits files, though.
2) You should make a copy of your fsimage and edits files, in fact of the entire dfs.name.dir directory. This is recommended in general as a fallback, and it would also let you experiment safely.

I still may not be able to help much, but can you post the full stack trace and some of the surrounding log lines for the failure?
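For reference, by "manual backup" I mean something along these lines (the path is a made-up example; use your actual dfs.name.dir, and do it with the NameNode stopped):

```bash
# Example only -- /data/1/dfs/nn stands in for your dfs.name.dir.
NAME_DIR=/data/1/dfs/nn

# With the NameNode stopped, snapshot the whole directory (fsimage, edits, VERSION, ...)
# so you can safely experiment and fall back to this copy if needed.
tar czf /backup/dfs-name-dir-$(date +%Y%m%d).tar.gz \
    -C "$(dirname "$NAME_DIR")" "$(basename "$NAME_DIR")"
```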
12-11-2013 12:20 PM
The migration went pretty smoothly for us. However, keep in mind that CDH4 clients can't talk to CDH3 servers, and vice versa, except through hftp. So you're going to need to coordinate the upgrade of your library code with the upgrade of the cluster. For us this involved quite a bit of work, since we can't afford much downtime for any part of our product. I hope for your sake that's not the case for you 🙂

We have a ton of custom infrastructure around build, deploy, etc., so it's hard to give generic advice there. If you have any specific questions, though, I can try to answer them as they come up.
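For what it's worth, the hftp copy looks roughly like this (hostnames, ports, and paths are placeholders; we used a modified distcp, but plain distcp follows the same pattern):

```bash
# Run distcp on the CDH4 (destination) cluster, reading from the CDH3 cluster over hftp.
# 50070 is the usual NameNode web UI port that hftp goes through.
hadoop distcp hftp://cdh3-namenode:50070/user/myapp/data \
              hdfs://cdh4-namenode:8020/user/myapp/data
```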
12-11-2013 11:55 AM
1 Kudo
The repos are at http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/. You can see the various CDH4 minor versions listed there, and clicking into one you will find the repodata, etc. You shouldn't have a problem reposync'ing that to a local yum server.

I've done this migration myself, but in a quite different way: we didn't reposync, and we basically spun up a brand-new cluster with the new software and then synchronized the data over using a modified distcp.

Have you seen this documentation? http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_6.html Unless I'm mistaken, it should step you through an upgrade without Cloudera Manager (and without parcels).
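If you do go the local-mirror route, it's roughly the following (the repo id depends on how you named it in your .repo file; 'cloudera-cdh4' here is just an assumption):

```bash
# Assumes the CDH4 repo file from archive.cloudera.com is installed locally
# and its repo id is 'cloudera-cdh4' -- check with 'yum repolist'.
reposync --repoid=cloudera-cdh4 --download_path=/var/www/html/repos
createrepo /var/www/html/repos/cloudera-cdh4
```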
10-18-2013 08:49 AM
1 Kudo
No, that isn't how it works. If you use the same configuration on every host (/mnt, /mnt2, /mnt3, /mnt4, /mnt5), a host that only has 3 drives (/mnt, /mnt2, /mnt3) will fail to start, depending on the value of dfs.datanode.failed.volumes.tolerated (default 0). You're going to need to set up each server with the right value of dfs.data.dir for its hardware. For this reason (among others), a homogeneous cluster setup is usually preferred.
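To illustrate, the hdfs-site.xml on the 3-drive host would list only the directories that actually exist there; this is just a sketch using the CDH3/CDH4-era property names:

```xml
<!-- hdfs-site.xml on the host that only has /mnt, /mnt2, /mnt3 -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt,/mnt2,/mnt3</value>
</property>
<!-- Optionally tolerate failed volumes instead of refusing to start (default is 0) -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>1</value>
</property>
```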