Member since: 02-18-2014
Posts: 94
Kudos Received: 23
Solutions: 23
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1459 | 08-29-2019 07:56 AM
 | 1858 | 07-09-2019 08:22 AM
 | 945 | 07-01-2019 02:21 PM
 | 1676 | 03-19-2019 07:42 AM
 | 2026 | 09-04-2018 05:29 AM
09-07-2019
01:23 PM
Still very strange. That line and column number places the error at the very end of the JSON, which again looks fine. Is there maybe some odd whitespace involved, like Windows line endings? Another tactic here is to try the SDKs that we provide, either Java or Python: https://github.com/cloudera/director-sdk . You could use one of them to send the same data into Director to see what happens. If it accepts it, which it should, then you could compare the JSON emitted by the SDK to what you're trying to send directly. Or maybe there is some HTTP header that needs to be sent as well, such as "Content-Type: application/json". (I don't curl Director myself much at all, so I don't remember off-hand details like required headers.)
08-29-2019
07:56 AM
Hi Da, You're correct, you can edit the Altus Director server's logback.xml file to have it not emit log messages for that Java class. It would look like this:

```
<logger name="com.cloudera.api.ext.ClouderaManagerClientProxy" level="INFO" />
```

More details on editing logback.xml are here: https://www.cloudera.com/documentation/director/latest/topics/director_troubleshoot.html#director-intro__d72814e37

While those debug lines are noisy, they can be helpful when problems arise. So, in the future, if you are troubleshooting problems with Altus Director and Cloudera Manager, flipping that logger back on might reveal some useful information.
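For orientation, here's a minimal sketch of where that element sits in the file; everything else a real logback.xml contains (appenders, the root logger, and so on) is elided:

```
<configuration>
  <!-- existing appenders and other loggers elided -->
  <logger name="com.cloudera.api.ext.ClouderaManagerClientProxy" level="INFO" />
</configuration>
```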
08-29-2019
07:51 AM
Hi CK, Here are some tips to help answer your question.

First, use cloud storage services, like S3 on AWS or ADLS on Azure, for keeping data like Navigator lineage information. Those services provide availability and reliability automatically. Hadoop and other services can be configured to use cloud storage in various ways instead of local block (hard drive) storage.

Sometimes it's not efficient or high-performing enough to exclusively use cloud storage. For example, a typical data analysis job may have several stages where data is read, processed, and then written back, and those round trips to storage services can be slower and cost more money than local drive access. So, think about adjusting how data is managed, so that intermediate data resides in local block storage, but final results are sent to cloud storage for safekeeping.

Once all of the important data is safe in cloud storage, it becomes less important to keep cluster instances running. You can even destroy entire clusters, including master / manager nodes, knowing that the data is safe in cloud storage. At this point, you will want to use automation tools, like Altus Director or Cloudbreak, so that you can easily spin up new clusters that are configured to pull their initial data from cloud storage. Then, you only run clusters when you need them.

If that isn't feasible, you can still do something like what you suggest, with clusters that have some permanent nodes and some transient ones. If so, ensure that those transient nodes do not keep important state that isn't safe elsewhere. For example, YARN node managers are stateless, so scaling nodes only housing those ("worker" nodes) is an easy goal to achieve. By contrast, HDFS datanodes store file data, so those aren't as easy to scale down - you can, though, as long as they are decommissioned properly using Cloudera Manager or Ambari, which the cloud automation tools handle for you.
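As a concrete illustration of pointing Hadoop at cloud storage, here's a minimal sketch of the s3a connector settings in core-site.xml. The property names are standard Hadoop; the values are placeholders, and setups based on IAM instance profiles avoid embedding keys entirely:

```
<!-- Sketch of a core-site.xml fragment enabling s3a:// paths.
     Credential values are placeholders. -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_ACCESS_KEY</value>
</property>
```

With those in place, jobs and services can read and write s3a://bucket/path URIs directly.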
08-28-2019
02:49 PM
Altus, both Director and the service-based offerings, passes the enterprise license through to Cloudera Manager as is. So, it's sufficient that the license enables auto-TLS in Cloudera Manager. In other words, Altus doesn't add any additional constraints.
08-28-2019
02:46 PM
Hi AdilsonAt, It should be possible to install NiFi on a CDH cluster using Altus Director. (I've never tried it.) This article describes how to include NiFi in a CDH cluster using a parcel: https://community.cloudera.com/t5/Community-Articles/Install-Nifi-on-one-node-using-Cloudera-Flow-Management/ta-p/244281

Altus Director lets you specify additional parcels for a new cluster, so you could take the information from the article and include it in an Altus Director configuration file so that NiFi is downloaded and installed. Here's the general Altus Director documentation for that: https://www.cloudera.com/documentation/director/latest/topics/director_third-party_products.html

Our reference configuration files for Altus Director document exactly how to add the necessary information, so you can use those as a basis for trying it out. Here's the AWS reference file: https://github.com/cloudera/director-scripts/blob/master/configs/aws.reference.conf

If you try it, do let us know if it works out! Thanks, Bill
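As a rough sketch of the two pieces involved — the parcel repository list and the product/version entry in the cluster section — it might look like this. The CFM repository URL and version below are placeholders, not the real ones from the article:

```
cluster {
    products {
      CDH: 5
      CFM: 1.0        # hypothetical product/version for the NiFi parcel
    }
    parcelRepositories: [
      "https://archive.cloudera.com/cdh5/parcels/5.15/",
      "https://example.com/cfm/parcels/1.0/"   # placeholder repo hosting the NiFi parcel
    ]
    ...
}
```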
08-28-2019
02:39 PM
Hi YuriiL, The JSON you posted looks OK to me. You say you're doing a PUT, but for that API path, there shouldn't be any JSON in the response body. The HTTP status code is the only meaningful response. Do you know what Director is sending back for the PUT response? 202 would indicate a successful update. Is there any stack trace information around the single log line you posted that you can share? Java tends to be kind of bad at reporting on null pointer exceptions, so I'm hoping there are more hints in the log. Thanks, Bill
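If you're testing with curl, one way to surface just the status code is curl's -w option; the API path below is a placeholder for the one you're calling:

```
# Print only the HTTP status code of the PUT (URL and payload file are placeholders)
curl -s -o /dev/null -w "%{http_code}\n" \
  -X PUT -H "Content-Type: application/json" \
  -d @payload.json \
  "http://director-host:7189/api/..."
```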
07-15-2019
06:57 AM
The link to the key has been fixed, so please give it another try. The archive.cloudera.com host is served by a CDN of some sort, so it'll take time for the change to propagate - you may continue to hit 404 errors until the update gets everywhere and caches clear out. Do let us know if you hit any other URL problems, in case there are other links that are still busted.
07-15-2019
05:50 AM
Thanks for raising this issue, I'll check on it. Sometimes the configuration of archive.cloudera.com gets changed, and it's possible that something went awry with a recent update.
07-10-2019
06:10 AM
Thanks for the update!
07-09-2019
08:22 AM
1 Kudo
I might need to see the whole file, but in case that's not feasible for you: try working backwards from the deployment and cluster templates. For example, you might have this in the "cloudera-manager" / deployment template section (I say "might" because you've made modifications, which is fine):

```
cloudera-manager {
    instance: ${instances.edge} {
        ...
```

This means that the instance template used for CM inherits from the "instances.edge" object, with some overrides in the block that I've elided here with "...". So, you'd go back to the "instances" section, "edge" subsection. Anything there is available here. Does the "instances.edge" subsection include values for "useCustomManagedImage" and "customImagePlan"? That section in our older sample Azure files looked like this:

```
instances {
    ....
    edge {
        image: ${?common-instanceTemplate.base.image}
        type: ${?common-instanceTemplate.base.type}
        computeResourceGroup: ${?common-instanceTemplate.edge.computeResourceGroup}
        networkSecurityGroupResourceGroup: ${?common-instanceTemplate.base.networkSecurityGroupResourceGroup}
        networkSecurityGroup: ${?common-instanceTemplate.base.networkSecurityGroup}
        virtualNetworkResourceGroup: ${?common-instanceTemplate.base.virtualNetworkResourceGroup}
        virtualNetwork: ${?common-instanceTemplate.base.virtualNetwork}
        subnetName: ${?common-instanceTemplate.base.subnetName}
        instanceNamePrefix: ${?common-instanceTemplate.edge.instanceNamePrefix}
        hostFqdnSuffix: ${?common-instanceTemplate.base.hostFqdnSuffix}
        availabilitySet: ${?common-instanceTemplate.edge.availabilitySet}
        publicIP: ${?common-instanceTemplate.edge.publicIP}
        storageAccountType: ${?common-instanceTemplate.edge.storageAccountType}
        dataDiskCount: ${?common-instanceTemplate.edge.dataDiskCount}
        dataDiskSize: ${?common-instanceTemplate.edge.dataDiskSize}
        managedDisks: ${?common-instanceTemplate.edge.managedDisks}
        tags: ${?common-instanceTemplate.base.tags}
        bootstrapScripts: [ ${?bootstrap-script.os-generic} ]
    }
```

You can see that this lacks "useCustomManagedImage" and "customImagePlan". You'd need to add those two properties here for the deployment template to get their values. You can either add them literally, or refer to the values in "common-instanceTemplate.base" or "common-instanceTemplate.edge" or anywhere else they are actually defined. Or you can just put them right into the deployment template, in the "instance" subsection under the "cloudera-manager" section, and not worry about how the inheritance should be working. You can hopefully see how the multiple layers of indirection don't help with a clear configuration.
07-09-2019
07:00 AM
Having seen a customer HOCON configuration file with this problem, I have a good guess as to the problem for you. Until recently, our example Azure configuration files used two layers of indirection to specify configuration properties for instance templates. There was a first set in a section called "common-instanceTemplate" where the actual values were defined, and then a second set "instances" where properties were defined based on the values in the first set. The deployment and cluster templates toward the bottom of the file used the properties in the "instances" section.

What might be happening for you (and happened for the other customer) is that the "useCustomManagedImage" and "customImagePlan" properties aren't defined in the "instances" section. So, they don't carry through from their initial definition in the "common-instanceTemplate" section to the deployment and cluster templates. If this is the case, then adding lines like these to each of the subsections under "instances" will fix the problem:

```
useCustomManagedImage: ${?common-instanceTemplate.base.useCustomManagedImage}
customImagePlan: ${?common-instanceTemplate.base.customImagePlan}
```

Even though these are just mistakes in the HOCON, really the problem is that our example Azure configuration files were needlessly complex. They've been updated recently to eliminate this indirection, so I also suggest taking a look and seeing if that pattern suits your uses better. https://github.com/cloudera/director-scripts/blob/master/configs/azure.simple.conf

Please let me know if this solves your problem.
07-09-2019
06:30 AM
Hi dturner, This should be supported. I'm actually currently looking into this problem on a support escalation. customImagePlan definitely does not need to be supplied when using a custom image. I think it used to be required in the past, but it no longer is, and that validation error message is just out of date. The error message seems to be triggered because Altus Director has not noticed that the useCustomManagedImage field is set to Yes. When it sees that correctly, then Altus Director skips the portion of validation that triggers the error message you are getting. My only guess at the moment is that the HOCON parsing of the actual instance templates - not the base one, but those that inherit from it - is somehow not picking up the useCustomManagedImage field. So, maybe try repeating the useCustomManagedImage field across instance templates to see if it helps. That's not a satisfactory final solution, but it might avoid whatever the real problem is. In the meantime, I'm continuing to investigate, so stay tuned 🙂
07-01-2019
02:34 PM
Hi Ra, I should note that it's "Director", not "Dictator", although I do like your name because it sounds really assertive. 😉 Altus Director generates a log with much more detailed information about what is happening, and that's the first spot you'd want to look. Since you're using "bootstrap-remote", you want to look at the server log, which is usually at /var/log/cloudera-director-server/application.log. It seems that the single GCE instance for Cloudera Manager came up successfully, but something went wrong with the instances for the cluster. The server log should reveal more. If you have a support contract with us, it's a good idea to open a case and include the entirety of the server log with it. Our support team can take a look and possibly help get you moving.
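To dig in on the Director server host, something along these lines works (the log path is the default mentioned above):

```
# Look at recent Director server activity
sudo tail -n 500 /var/log/cloudera-director-server/application.log

# Or search for errors with surrounding context, including stack traces
sudo grep -B2 -A20 "ERROR" /var/log/cloudera-director-server/application.log | less
```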
07-01-2019
02:29 PM
1 Kudo
Hi GaryS, Thanks for following up that you were able to resolve your issue! For others, to clarify: If you change the username and password for Cloudera Manager (for example, from the default admin/admin), then you do need to update Altus Director with the new credentials. That way, Altus Director can continue to work with Cloudera Manager to do things like add new hosts to a cluster. There is an option in the dropdown for a deployment in Altus Director to update the credentials. In case there is still a problem in communications, a workaround is to set Cloudera Manager (and Altus Director) back to admin/admin, then do what you need to do, and then switch Cloudera Manager back. There is in fact one scenario in Altus Director where this is necessary, which we're working on fixing. To add to what Asif said about the ways to add a new cluster node through Director: Besides the UI, you can use the Altus Director server API as well. The UI is just a special client for the API, anyway. Visit http://yourdirectorhost.example:7189/api-console/ for an interactive (Swagger) console to experiment. You can also try using the Java or Python SDKs, available on GitHub. https://github.com/cloudera/director-sdk
07-01-2019
02:21 PM
1 Kudo
Hi CK71, This is kind of a wide-open question, so I'll give you a wide-open answer with some ideas for implementing "transient" clusters.

To start out with, you may want to think about having a cluster with a core of non-transient nodes that house management information, so-called "master" nodes that host the HDFS namenode, YARN resource manager, and so on. You also would want to keep around enough stateful nodes, like HDFS datanodes and Kudu tablet servers, so that fundamental data stays available (e.g., to stay above your chosen HDFS replication factor, which defaults to 3). Then, you have the ability to scale out with stateless "compute" nodes, like YARN node managers and Spark workers, when the cluster workload increases, and then tear them down when the load is lighter.

Next, a good goal is to store important data on cloud storage services, like S3 for AWS and ADLS for Azure. Hadoop and other services have the ability to reference data in those services directly - for example, Hive and HDFS can be backed by S3 - or you can establish ways to copy data into a cluster from the services to work on it, and then copy final result data back out to the services. (You'd want to avoid saving intermediate data, like temporary HDFS files or Hive tables that only matter in the middle of a workflow, because that data can be regenerated.) Once you can persist data to cloud storage services, then you have a basis for making new clusters from nothing, and then pulling in data for them to work on.

Saving off metadata, such as Hive table mappings (metastore) and Sentry / Ranger authorization rules, to cloud storage is also a good idea. You can use cloud database services, like RDS on AWS, for that, or else general block storage services like S3 or ADLS. Metadata usually needs to apply to all of your clusters, transient or not, because they define common business rules. The idea behind SDX is to make saving common metadata an easy and usual thing, so that it's easier to hook up new clusters to it.

Automating the creation of clusters is really important, especially for transient clusters that you'd bring up and tear down all the time. That's the purpose for tools like Altus Director and Cloudbreak. We also have customers who use general tools like Ansible, Chef, Puppet, or the like, since they are more familiar with them, or have standardized on them.

If you have automated cluster creation, and important data and metadata persisted in cloud storage services, then you've got the ingredients for successfully working with transient clusters in the cloud. I know this isn't a precise answer for how to do transient clusters, but hopefully I've given you some avenues to explore.
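For the copy-in / copy-out pattern, a minimal sketch using DistCp — the bucket and paths are placeholders:

```
# Copy final results from HDFS out to S3 for safekeeping (placeholder paths)
hadoop distcp hdfs:///user/etl/results/latest s3a://my-company-bucket/results/latest

# Later, seed a fresh transient cluster from the same bucket
hadoop distcp s3a://my-company-bucket/results/latest hdfs:///user/etl/input/
```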
03-29-2019
06:38 AM
Hi Tomas79, Try starting with this file: https://github.com/cloudera/director-scripts/blob/master/configs/gcp.simple.conf It's not a "full-featured" CDH, but a simple one. However, you could then carry over deployment ("cloudera-manager") and cluster configurations from aws.reference.conf to construct a more comprehensive configuration for GCP. At those levels, the configuration properties are the same no matter what provider you are on.

P.S. We're working on updating the configuration files to correct minor errors and ensure they run with minimal replacements. Some changes you should make for better odds of success:
- put Solr server roles on workers, not masters
- put the Flume agent on a gateway
- for HA, specify "oozie_load_balancer" and "oozie_load_balancer_http_port" as separate configuration properties for the OOZIE service (see the sketch below)
- include SPARK_ON_YARN and HIVE GATEWAY roles where Hive and Spark roles run so that Hive-on-Spark works
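For the Oozie HA item, a hedged sketch of what the service configs might look like in the cluster section of a Director configuration file; the hostname and port values are placeholders:

```
cluster {
    ...
    configs {
      OOZIE {
        oozie_load_balancer: "oozie-lb.example.internal"   # placeholder load balancer host
        oozie_load_balancer_http_port: 11000               # placeholder port
      }
    }
}
```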
03-21-2019
02:30 PM
Hi -Rana-, I don't think your attachments made it through. That's just as well: there are limits to how much help we can provide at that level of detail over a community forum anyway. We can offer general guidance and feedback on highly specific issues, but general troubleshooting is best performed by our support organization, so they can work with you over time and as new issues crop up. Otherwise, in general, I would check that the Director instance is able to consistently reach the CM instance over the network, specifically over SSH. Check over any network / firewall rules that may interrupt the connectivity, especially if the instances are in separate networks or subnetworks.
03-19-2019
08:30 AM
Hi WZ, I have recently been reviewing the Director reference configuration files, and I had the same problem with "Default Master Group" for the Kudu configuration properties. I also got around it by copying the values to that master group. This does look like a problem with how Director is manipulating role configuration groups in Cloudera Manager. I did not have the same issue with the namenode or node manager configuration properties, but the reference configuration does not set any of those. In your cluster, did you have the same "Default Master Group" problem with those roles as with the Kudu master? Bill
03-19-2019
08:18 AM
Hi airhead, While a bootstrap script can issue a reboot, I don't think Director will reliably handle all of the ways that could play out. For example, if the reboot happens so quickly that it cuts off script execution, then Director will think the script failed, and retry it, and possibly run out of retries before the instance comes back. And if it does come back, it'll run the script again, and reboot again. If the bootstrap script is just adding disk space, you do have other options. An EC2 instance template can specify extra EBS volumes, for example, and you can update mount points (without rebooting) to hook them in where necessary in the directory tree. Also, Director automatically attempts to resize the root partition to take up all the available disk space, so a separate script isn't necessary for that work. Bill
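For the extra-EBS-volume route, here's a hedged sketch of an EC2 instance template fragment; the property names follow the Director AWS plugin's EBS settings as I recall them, so double-check them against aws.reference.conf, and the type/AMI values are placeholders:

```
instances {
    worker {
        type: m4.xlarge          # placeholder instance type
        image: ami-xxxxxxxx      # placeholder AMI
        ebsVolumeCount: 2        # attach two extra EBS volumes per instance
        ebsVolumeSizeGiB: 500
        ebsVolumeType: gp2
        ...
    }
}
```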
03-19-2019
07:42 AM
1 Kudo
You appear to be using the username "scm" as an administrative user on the MySQL instance. The "scm" user must have permission to create and delete databases on the server. Normally, one would use the default MySQL "root" user, which already should have all of the necessary permissions.
- The string starting with "scm_" and ending with a random string is the generated name of the Cloudera Manager server database.
- The string "uxnlmrno" is the username for the user that will be used to access the new database. Apparently, in your configuration file, you do not specify a usernamePrefix for the database. It is optional.

Director is running a script on the CM instance to perform the database work. (The script uses CM code, so it needs to run where CM is installed.) It is trying to reach the database server at the full hostname ending in ".internal", and it seems that the connectivity there is working. You can double-check by running a MySQL client from the CM instance itself. You say you can create and drop databases from the MySQL console. Does that use the "scm" user or, perhaps, the "root" user?
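If you do want to keep using "scm" as the administrative database user, it needs broad privileges. A sketch for MySQL 5.x — the host pattern and password are placeholders, and you should tighten the host pattern to your network in practice:

```
-- Create the admin user and grant it rights to create and drop databases
CREATE USER 'scm'@'%' IDENTIFIED BY 'placeholder-password';
GRANT ALL PRIVILEGES ON *.* TO 'scm'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
```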
02-20-2019
10:48 AM
Hi john1, To my surprise, I think Altus Director can already work with partition placement groups. It is already possible to name a placement group for EC2 instances in an Altus Director instance template. From my interpretation of the AWS docs, if that placement group has a "partition" strategy, instances can be placed into it even without specifying a partition number (and today, Altus Director doesn't support picking a partition number). I don't know how EC2 decides which partition in the placement group each instance should then be placed into. Maybe it's random or round robin, which might be good enough for spreading out HDFS datanodes if there are enough partitions. I encourage you to give it a shot and see what happens.

Explicit support isn't on the roadmap right now for Altus Director, but at first glance it doesn't seem too difficult to add it. The AWS plugin is open source, so you could also try adding it in yourself. Essentially, it would involve adding a new configuration property for EC2InstanceTemplate for the partition number and then including its value in RunInstancesRequest objects. Good luck! If you do some experiments, I'd be interested to know how they turn out. Bill
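If you want to experiment, creating a partition placement group up front with the AWS CLI looks like this (the group name and partition count are placeholders); you'd then reference the group name in the Director instance template:

```
# Create a placement group using the partition strategy
aws ec2 create-placement-group \
  --group-name hdfs-datanode-pg \
  --strategy partition \
  --partition-count 7
```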
02-19-2019
08:32 AM
Hi Rana, The errors for "Opening `direct-tcpip` channel failed: Connection refused" happen under normal circumstances and can be ignored. I thought we'd suppressed them by now in Director logging, but maybe not. 🙂

Still, I see there's a failure to connect to Cloudera Manager at the end of the log snippet. That means that Director was trying to check on the status of a running command and couldn't reach Cloudera Manager. The IP address for that instance of Cloudera Manager is present in prior logging lines, so apparently the instance is at least reachable. Director did have to establish an SSH tunnel to talk to it:

```
Successfully established tunnel to server 10.142.0.59 at 7180
```

There's not enough context in the logs here for me to go much further. The pipeline thread for the Cloudera Manager failure, "p-66c81a04d10f-BootstrapClouderaManagerAgent", doesn't appear elsewhere in the sample, but other threads are working with the instance successfully. Also, the beginning of line 68 is cut off, so I'm concerned that multiple Director instances are running at the same time.

General ideas to troubleshoot:
- Run Director inside GCP so that it can make direct connections to all of the new instances. Once that's working, try running Director outside of it, which is where I think it's running right now.
- At first, just do one thing at a time in Director. It makes troubleshooting the logs easier. (We're actually working on a way to split logging out by cluster to help here.)
- If that instance is still up, see if Cloudera Manager is in fact running. Maybe it died for some reason? Logs in /var/log/cloudera-scm-server are usually informative. (See the commands below.)

Bill
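For that last check, on the Cloudera Manager instance itself, something like:

```
# Is the Cloudera Manager server process up?
sudo service cloudera-scm-server status

# Recent server log activity, including any fatal startup errors
sudo tail -n 200 /var/log/cloudera-scm-server/cloudera-scm-server.log
```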
02-04-2019
06:22 AM
1 Kudo
Hello Liran, My guess is that Spring Boot, which Altus Director uses extensively, is stripping out the ampersand when reading the last component of the property key, "R&D". According to Spring documentation, it should be possible to surround the key value with square brackets to preserve all of the characters. https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-external-config.html

So, hopefully one of these alternatives works:

```
[lp.security.ldapConfig.activeDirectory.roleMapping.R&D]: ADMIN
# or
lp.security.ldapConfig.activeDirectory.roleMapping.[R&D]: ADMIN
```
11-13-2018
05:33 AM
Hi ratek20, The blog post uses the "bootstrap" command, not the "bootstrap-remote" command:

```
cloudera-director bootstrap spot-director.conf
```

The "bootstrap" command does all the work in the client, while "bootstrap-remote" makes the client call on the server to do the work. If you want to use "bootstrap-remote" instead, then start the Director server so it is listening on port 7189. Then, when running the client, pass the username and password for admin access to the server, e.g.:

```
cloudera-director bootstrap-remote spot-director.conf --lp.remote.username=admin --lp.remote.password=admin
```
11-08-2018
05:14 AM
1 Kudo
Hello yarivgraf, Good news: A community member implemented subnetwork support in the Google plugin, and the work was merged several weeks ago. https://github.com/cloudera/director-google-plugin/pull/150

The next 6.x release of Cloudera Altus Director will come packaged with a new plugin release that includes this change. In the meantime, you can build the plugin and install it into your existing Altus Director 2.8 or 6.0 installations and it should work.

* For Director 2.8, build and install the plugin from the v1.x branch.
* For Director 6.0, build and install the plugin from the v2.0.x branch.

The plugin's README describes the build process and links to some docs on installing the plugin in Altus Director. For the latter, you basically place the plugin JAR into Altus Director's plugins directory, replacing the prior Google plugin JAR.
09-06-2018
04:50 AM
Hi Tomas79, Cloudera Director 2.8 doesn't support working with Cloudera Manager 6.0 or CDH 6.0. It supports only up through CM 5.15.x and CDH 5.15.x at this time. Usually, Cloudera Director can't support versions of CM or CDH that lie in the future at the time of Director's release. However, we're working on CM / CDH 6 support, so look for news in the near future. 🙂
09-04-2018
05:29 AM
Hi dturner, The bootstrap-remote command will remain in Cloudera Director for the foreseeable future. What we're planning to remove is the bootstrap command, without the "-remote" bit at the end. The bootstrap-remote command communicates with a Director server, while the bootstrap command does all of the work locally within the client. Using the non-remote commands is what we call running the client in "standalone" mode, and it's those non-remote commands that are going away.

We realized that having such similar client commands was pretty confusing. For example, some folks would use the bootstrap command (non-remote) to spin up a cluster, but then be surprised that their Director server elsewhere wouldn't know about it. Overall, it's just better to use the server and remote client commands anyway. For one thing, the server API and Java / Python SDKs are available for servers, but don't work for clusters created from the "standalone" client bootstrap command. I hope that helps clear up the confusion.
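Side by side, the two commands look like this; the configuration file name and credentials are placeholders:

```
# Standalone mode (going away): the client does all the work itself
cloudera-director bootstrap my-cluster.conf

# Server mode (staying): the client asks a running Director server to do the work
cloudera-director bootstrap-remote my-cluster.conf --lp.remote.username=admin --lp.remote.password=admin
```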
08-13-2018
10:30 AM
1 Kudo
Hi iasindev, For case #1, when using bootstrap-remote, most of the work is done on the associated Director server, so check the logs there for problems. It could be the same problem that surfaced in #2. In #2, the error is occurring when Director is failing to access the CDH parcel repository over https. Are you hosting your own repository, or have you customized the Director truststore to no longer include common trusted certificates? Older Director releases are available under https://archive.cloudera.com/director/ . Just drill down for your operating system to find the version you need.
08-06-2018
06:16 AM
Hi Tomas79, Thanks for bringing up this issue! I've filed an internal trouble ticket for this failure to handle Unicode, so that we can address it in a future release.
07-09-2018
07:06 AM
Hello ssankarau, I did a little searching internally and it appears that this problem has been seen before when using Cloudera Manager with MariaDB 10.2.8 and higher. There is a workaround that involves editing a file CM uses to construct the schema in its database, named 05003_cmf_schema.mysql.ddl:

```
....
alter table CONFIGS
  drop column REVISION_ID;
ALTER TABLE ROLE_CONFIG_GROUPS DROP INDEX IDX_UNIQUE_ROLE_CONFIG_GROUP;
ALTER TABLE ROLE_CONFIG_GROUPS DROP INDEX IDX_ROLE_CONFIG_GROUP_CONFIG_REVISION;
alter table ROLE_CONFIG_GROUPS
  drop column REVISION_ID;
....
```

Adding the two "DROP INDEX" lines as shown above should allow the file to be executed successfully. Supposedly the statements should not be necessary according to the semantics of DROP COLUMN; perhaps there is some issue with MariaDB. Unfortunately, Director does not have a script hook available to apply this workaround automatically as part of cluster bootstrap. You could try creating an image with CM already installed and with the workaround in place, so that Director will simply start CM instead of trying to install it fresh. I'm going to assign this question over to the Cloudera Manager team to make sure they see that another instance of this problem has occurred. Thank you!