Member since: 07-30-2013
Posts: 509
Kudos Received: 113
Solutions: 123
My Accepted Solutions
Views | Posted |
---|---|
2989 | 07-09-2018 11:54 AM |
2486 | 05-03-2017 11:03 AM |
6087 | 03-28-2017 02:27 PM |
2326 | 03-27-2017 03:17 PM |
2038 | 03-13-2017 04:30 PM |
07-09-2018
11:54 AM
1 Kudo
Hi,

When creating your cluster, Cloudera Manager should automatically detect the directories on each host, then use Role Configuration Groups to set distinct configurations for the 10-disk nodes and the 20-disk nodes, and divide roles appropriately between those groups. dfs.data.dir isn't a global setting; it is a role config, so it is usually set in the Role Config Group for a role. You can read more about configuration management here: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_intro_primer.html#concept_fgj_tny_jk

When you add new DataNodes, I suggest creating a host template and applying it to your new nodes, so they can easily join the correct DataNode group as well as pick up any other roles you may be running on that node (like a YARN NodeManager). You can read about host templates here: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_host_templates.html

Thanks,
Darren
05-03-2017
11:03 AM
1 Kudo
Hi,

api.get_all_hosts() returns just basic information about a host by default (a SUMMARY view). You probably want the FULL view. See the docs here: https://cloudera.github.io/cm_api/epydoc/5.11.0/cm_api.api_client.ApiResource-class.html#get_all_hosts

If you do a simple HTTP GET on {CM_HOST:PORT}/api/v{version}/hosts, you can easily see the kind of data returned by api.get_all_hosts(), and if you look at {CM_HOST:PORT}/api/v{version}/hosts?view=FULL you'll see that you can get more details, like the role refs.

For your use case of replacing a failed node, getting the steps just right is tricky. You may want to look into Cloudera Director, which can repair worker or gateway nodes, among many other features. Here's the doc page for repairing a node: https://www.cloudera.com/documentation/director/latest/topics/director_ui_cluster_shrink.html

Thanks,
Darren
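To make the SUMMARY versus FULL difference concrete, here is a minimal sketch that calls the hosts endpoint directly with the Python standard library rather than through cm_api. The helper names, the plain-http scheme, and the use of HTTP basic auth with your CM credentials are assumptions for illustration, so adjust them to your deployment.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def hosts_url(cm_host_port, api_version, view=None):
    """Build the hosts endpoint URL (hypothetical helper).

    With view=None no query string is added, so the server returns
    its default view, which the post describes as SUMMARY.
    """
    url = "http://{0}/api/v{1}/hosts".format(cm_host_port, api_version)
    if view:
        url += "?" + urlencode({"view": view})
    return url


def fetch_hosts(cm_host_port, api_version, username, password, view="FULL"):
    """GET the hosts list and return the decoded JSON body.

    Assumes HTTP basic auth and an http:// endpoint; use https:// if
    TLS is enabled on your Cloudera Manager server.
    """
    request = Request(hosts_url(cm_host_port, api_version, view))
    credentials = "{0}:{1}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Comparing the output of fetch_hosts(..., view=None) with the FULL result should show the extra detail, such as the role refs, mentioned above.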
03-28-2017
02:27 PM
Thanks for this report! This does indeed appear to be a bug (Paolo dug into it internally, credit to him), and we'll get a fix out in a future release. The abrupt stop step should be skipped when there are no started roles, rather than failing with an error. Thanks, Darren
03-27-2017
03:17 PM
1 Kudo
Hi Shant, That's not possible today. Why do you want that? Usually admins don't want to deal with so many certs. You can use additionalConfigs to emit parameters to more places, but be careful to read the caveat about passwords. Thanks, Darren
03-27-2017
09:59 AM
1 Kudo
1. You can't prevent the abrupt stop, but you shouldn't need to. Is it actually causing a problem? It may just be skipped. Can you show any error message, or post a screenshot?
2. No, that's not possible.
03-24-2017
06:21 PM
1 Kudo
Hi,

Custom stop runners at the role level are planned for a future release. Stay tuned! Until then, the only ways to stop roles are:

1) Standard stop, included by default. CM will basically send a SIGTERM to your process, and if it doesn't die after 30s, it will send a group SIGKILL. You can stop individual roles this way (select what you want on the Instances page, then choose Actions for Selected -> Stop), but there's no reasonable way to run a custom stop script.

2) Service-level graceful stop. CM will run a custom script on a master role in your service, which must instruct the workers to exit normally (exit code 0). Once those have exited, CM will consider the service-level stop command successful. This is only helpful if your master role can orchestrate the stop, and it will always stop all roles.

Thanks,
Darren
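Since the standard stop gives a process roughly 30 seconds between SIGTERM and the kill, one way to approximate a custom stop is to do the cleanup inside the role process itself. The following is only a sketch, assuming the role is a Python process; the names stop_requested, handle_sigterm, install_stop_handler, and run_worker are hypothetical.

```python
import signal
import threading

# Set by the signal handler; the main loop checks it between units of work.
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the stop request; keep the handler itself tiny."""
    stop_requested.set()


def install_stop_handler():
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(do_work=lambda: None, cleanup=lambda: None, poll_seconds=0.1):
    """Loop until a stop is requested, then clean up and return exit code 0.

    Each unit of work plus the cleanup should finish well within the
    30s window described above, before the kill arrives.
    """
    while not stop_requested.is_set():
        do_work()
        stop_requested.wait(poll_seconds)
    cleanup()
    return 0
```

A role's entry point would call install_stop_handler() once at startup and then exit with the value returned by run_worker().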
03-13-2017
04:30 PM
1 Kudo
Hi Shant,

The documentation there is incomplete. Here's the information you're looking for:

certificateLocationConfigName
Optional. Config name to emit when ssl_server_certificate_location is used in a config file. If null, ssl_server_certificate_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_certificate_location}.

certificateLocationDefault
Optional. Default value for ssl_server_certificate_location.

caCertificateLocationConfigName
Optional. Config name to emit when ssl_server_ca_certificate_location is used in a config file. If null, ssl_server_ca_certificate_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_ca_certificate_location}.

caCertificateLocationDefault
Optional. Default value for ssl_server_ca_certificate_location.

(Sorry, I couldn't get the formatting nicer; the forum doesn't seem to like width in the HTML.)

I'll get this added to the wiki in a future update.
03-02-2017
01:25 PM
Hi,

It's best for systems (especially distributed systems) not to require careful ordering at startup. Instead, each process should wait a while for any process it depends on (like the master) to come up. If possible, I also suggest making this wait period configurable, with a default of at least 2 minutes.

There's no way for CSDs to control the ordering of start commands, since we prefer robustness to ordering.

Thanks,
Darren
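As an illustration of that wait-for-dependency approach, here is a minimal sketch in Python. It assumes the dependency (for example the master) exposes a TCP port that accepts connections once it is up; the function name and parameters are hypothetical, and the 120-second default mirrors the 2-minute suggestion above.

```python
import socket
import time


def wait_for_dependency(host, port, timeout_seconds=120.0, retry_interval=2.0):
    """Return True once host:port accepts a TCP connection, or False
    if the configurable timeout expires first."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            # A successful connect means the dependency is listening.
            with socket.create_connection((host, port), timeout=retry_interval):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Sleep before retrying, but never past the deadline.
            time.sleep(min(retry_interval, remaining))
```

A worker could call this at startup and exit with a non-zero code if it returns False, so the failure is visible instead of the process hanging indefinitely.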
02-14-2017
04:16 PM
Depending on how you are running the job, you may be able to override the topology script parameter and/or replace the topology.py script with one that is Python 3 compatible. If you're submitting jobs from the command line, you'd usually copy /etc/hadoop/conf to some custom directory /path/to/customized/conf, make changes there, then set HADOOP_CONF_DIR=/path/to/customized/conf and run your job. Assuming you can change that topology script, here's the relevant portion of the diff that you can apply:
@@ -1,8 +1,8 @@
#!/usr/bin/env python
#
-# Copyright (c) 2010-2012 Cloudera, Inc. All rights reserved.
+# Copyright (c) 2016 Cloudera, Inc. All rights reserved.
#
-
+
'''
This script is provided by CMF for hadoop to determine network/rack topology.
It is automatically generated and could be replaced at any time. Any changes
@@ -12,8 +12,13 @@ made to it will be lost when this happens.
import os
import sys
import xml.dom.minidom
-from string import join
-
+
+try:
+ xrange
+except NameError:
+ # support for python3, which basically renamed xrange to range
+ xrange = range
+
def main():
MAP_FILE = '{{CMF_CONF_DIR}}/topology.map'
DEFAULT_RACK = '/default'
@@ -40,14 +45,14 @@ def main():
map[node.getAttribute("name")] = node.getAttribute("rack")
except:
default_rack = "".join([ DEFAULT_RACK for _ in xrange(max_elements)])
- print default_rack
+ print(default_rack)
return -1
-
+
default_rack = "".join([ DEFAULT_RACK for _ in xrange(max_elements)])
if len(sys.argv)==1:
- print default_rack
+ print(default_rack)
else:
- print join([map.get(i, default_rack) for i in sys.argv[1:]], " ")
+ print(" ".join([map.get(i, default_rack) for i in sys.argv[1:]]))
return 0
if __name__ == "__main__":
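The compatibility idioms the diff relies on (a fallback for xrange, print called as a function, and str.join in place of string.join) can be sketched standalone as follows. This is not the generated topology script itself; the helper names and the sample rack map are hypothetical.

```python
try:
    xrange
except NameError:
    # Python 3 has no xrange; range is the equivalent there.
    xrange = range


def build_default_rack(default_rack, max_elements):
    """Repeat the default rack path so it has max_elements components."""
    return "".join([default_rack for _ in xrange(max_elements)])


def resolve_racks(rack_map, hosts, default_rack):
    """Join the rack of each host with spaces using str.join, which
    works on both Python 2 and 3 (string.join exists only on Python 2)."""
    return " ".join([rack_map.get(host, default_rack) for host in hosts])


if __name__ == "__main__":
    sample_map = {"host1.example.com": "/rack1"}  # hypothetical data
    default = build_default_rack("/default", 1)
    # A single-argument print(...) call behaves the same on 2 and 3.
    print(resolve_racks(sample_map, ["host1.example.com", "host2.example.com"], default))
```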