Created on 07-21-2014 04:29 PM - edited 09-16-2022 02:02 AM
Hi,
I tried a search on this, but the topic is a bit squishy:
I'm trying to find a good, non-destructive way to install Cloudera Manager and integrate an *existing* and working cluster. Said cluster was manually installed (for the learning experience), as per the CDH5 docs and the Hadoop Operations book.
Some first minor attempts, using one of the datanodes/nodemanagers as a guinea pig, scared me: CM tried to force-install all sorts of packages without ever checking what was already there, and never attempted to read the existing config files and import them first.
I have local repositories (for yum) all set up and working. I know how to point CM at them to use them.
However, I would like to be able to point CM at my existing nodes (namenodes/resourcemanagers, datanodes/nodemanagers, etc.) and have it integrate their existing config, adjusting what's needed and adding packages only when I expressly tell CM to.
Is there a safe way?
Thanks
Mike
Created 07-22-2014 07:28 PM
Mike,
Unfortunately, there is no magic "ingest CDH-only cluster" pre-made toolset in the product. The discrete XML configuration files you have set up for the raw CDH cluster would generally need to be introduced manually through the CM configuration UI. There was early work attempting to do this, but it was found to not be reliable across the wide variation
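As a starting point for that manual step, here is a minimal sketch that reads a Hadoop-style `*-site.xml` document into a flat name/value mapping, so you can see exactly which settings you would have to carry over. It assumes the standard `<configuration>/<property>/<name>/<value>` layout; the function name `load_site_xml` and the sample values are made up for illustration and are not part of any Cloudera tool.

```python
import xml.etree.ElementTree as ET


def load_site_xml(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries that have no <name>
        # A missing or empty <value> is stored as an empty string.
        settings[name.strip()] = (prop.findtext("value") or "").strip()
    return settings


# Illustrative sample only; replace with the contents of your own hdfs-site.xml.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/1/dfs/nn</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for key, value in sorted(load_site_xml(SAMPLE).items()):
        print(key, "=", value)
```

To use it on a real node, read the file into a string first, for example `load_site_xml(open(path).read())`, where `path` points at your own config file.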
We do have an API for Cloudera Manager that allows you to inspect and set cluster configuration values programmatically (documented here http://cloudera.github.io/cm_api/apidocs/v6/ and here for examples and historical version reference http://cloudera.github.io/cm_api/).
Most of what is generally "default" in the configuration is handled by CM; only the specific non-default settings you are using in your CDH cluster would need to be brought over. The API can be used to "dump" the current configuration from the CM database (where config ends up in a CM-managed cluster) on a new install, so you can look at what is set by default and compare it with a test cluster.
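A rough sketch of that dump-and-compare idea follows. It assumes the CM server listens on the default port 7180, accepts HTTP basic auth, and exposes the deployment dump at `/api/v6/cm/deployment` as described in the v6 API docs linked above. The helper names are hypothetical. Note that, as far as I know, CM uses its own setting names rather than the Hadoop property keys, so the comparison function assumes you have already mapped both sides onto the same flat names.

```python
import base64
import json
import urllib.request


def fetch_deployment(host, user, password, port=7180, api_version="v6"):
    """GET the CM deployment dump as a dict. Requires a reachable CM server."""
    # Assumed endpoint: /api/<version>/cm/deployment (check your CM version's docs).
    url = "http://%s:%d/api/%s/cm/deployment" % (host, port, api_version)
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def non_default_settings(existing, cm_defaults):
    """Return settings from `existing` that are missing from, or differ in, `cm_defaults`.

    Both arguments are flat {name: value} dicts that use the same naming scheme.
    """
    return {
        name: value
        for name, value in existing.items()
        if cm_defaults.get(name) != value
    }
```

For example, `non_default_settings({"dfs.replication": "2"}, {"dfs.replication": "3"})` returns `{"dfs.replication": "2"}`, which is the kind of setting you would then enter by hand in the CM configuration UI.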
Thanks
Todd
Created 07-23-2014 02:45 AM
Thanks, Todd (oh and, hi, to a former fellow Navidecer! Been a while... 😉)
I'm going to give the API a look, though first I'll try the config dump. I wonder if a staged migration would be possible. First, temporarily lower replication to a minimum, maybe two. Then start phasing out datanodes and letting them be marked as offline. Finally put those under CM control and migrate data from the other remaining nodes.
Need to have a think about this...
Created 07-29-2014 01:15 AM
@Tgrayson wrote: Mike,
[...]
We do have an API for Cloudera Manager that allows you to inspect and set cluster configuration values programmatically (documented here http://cloudera.github.io/cm_api/apidocs/v6/ and here for examples and historical version reference http://cloudera.github.io/cm_api/).
[...]
Did those links go dead? Getting 404s now?
Created 07-29-2014 07:56 AM
I believe the community tool munged the hyperlinks when Todd posted the URLs, but the URLs are still correct:
http://cloudera.github.io/cm_api/apidocs/v6/
and
Created 07-29-2014 03:25 PM