06-02-2017 06:24 PM | 7 Kudos
Overview
To control access, Azure uses Azure Active Directory (Azure AD), a multi-tenant cloud-based directory and identity management service. To learn more, refer to https://docs.microsoft.com/en-us/azure/active-directory/active-directory-whatis.
In short, to configure authentication with ADLS using the client credential, you must register a new application with Active Directory service and then give your application access to your ADL account. After you've performed these steps, you can configure your core-site.xml.
Note for Cloudbreak users: When you create a cluster with Cloudbreak, you can configure authentication with ADLS on the "Add File System" page of the create cluster wizard, and then you must perform one additional step as described in the Cloudbreak documentation. If you do this, you do not need to perform the steps below. If you have already created a cluster with Cloudbreak but did not configure ADLS on the "Add File System" page of the create cluster wizard, follow the steps below.
Prerequisites
1. To use ADLS storage, you must have a subscription for Data Lake Storage.
2. To access ADLS data in HDP, you must have an HDP version that supports it. I am using HDP 2.6.1, which supports connecting to ADLS using the ADL connector.
Step 1: Register an application
1. Log in to the Azure Portal at https://portal.azure.com/.
2. Navigate to your Active Directory and then select App Registrations.
3. Create a new web application by clicking +New application registration.
4. Specify an application name, type (Web app/API), and sign-on URLs.
Remember the application name: you will later add it to your ADLS account as an authorized user.
5. Once the application is created, navigate to the application's settings and find Keys.
6. Create a key by entering a key description, selecting a key duration, and then clicking Save. Make sure to copy and save the key value; you won't be able to retrieve it after you leave the page.
7. Write down the properties that you will need to authenticate: the application ID, the key, and the token endpoint.
Step 2: Assign permissions to your application
1. Log in to the Azure Portal.
2. If you don't have an ADL account, create one.
3. Navigate to your ADL account and then select Access Control (IAM).
4. Click +Add to add role-based permissions.
5. Under Role, select "Owner". Under Select, select your application. This grants the "Owner" role for this ADL account to your application.
Note: If you are not able to assign the "Owner" role, you can instead set fine-grained RWX ACL permissions for your application, allowing it access to the files and folders of your ADLS account, as documented here.
Note: If you are using a corporate Azure account, you may be unable to perform the role assignment step. In this case, ask your Azure admin to perform this step for you.
Step 3: Configure core-site.xml
1. Add the following four properties to your core-site.xml. While "fs.adl.oauth2.access.token.provider.type" must be set to "ClientCredential", you can obtain the remaining three values from step 7 above.
<property>
  <name>fs.adl.oauth2.access.token.provider.type</name>
  <value>ClientCredential</value>
</property>
<property>
  <name>fs.adl.oauth2.client.id</name>
  <value>APPLICATION-ID</value>
</property>
<property>
  <name>fs.adl.oauth2.credential</name>
  <value>KEY</value>
</property>
<property>
  <name>fs.adl.oauth2.refresh.url</name>
  <value>TOKEN-ENDPOINT</value>
</property>
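For context, the connector uses these three values to obtain an OAuth2 token via the client-credentials grant against the token endpoint. A minimal sketch of the kind of request it makes (illustrative only, not the connector's actual code; the placeholder values and the resource audience are assumptions):

```python
from urllib.parse import urlencode

def build_token_request(refresh_url, client_id, credential):
    """Build the OAuth2 client-credentials token request body (sketch)."""
    payload = {
        "grant_type": "client_credentials",
        "client_id": client_id,        # fs.adl.oauth2.client.id
        "client_secret": credential,   # fs.adl.oauth2.credential
        # ADLS resource audience -- an assumption for this sketch:
        "resource": "https://datalake.azure.net/",
    }
    return refresh_url, urlencode(payload)

# Placeholder values -- substitute the ones you wrote down in step 7.
url, body = build_token_request(
    "https://login.microsoftonline.com/TENANT-ID/oauth2/token",  # TOKEN-ENDPOINT
    "APPLICATION-ID",
    "KEY",
)
```

POSTing that body to the token endpoint returns a bearer token; the connector handles this (and token refresh) for you once the four properties are set.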
2. (Optional) It is recommended that you protect your credentials with credential providers. For instructions, refer to https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_cloud-data-access/content/adls-protecting-credentials.html.
Step 4: Validate access to ADLS
To make sure that the authentication works, try accessing data. To test access, SSH to any cluster node, switch to the hdfs user by using sudo su hdfs and then try accessing your data. The URL structure is:
adl://<data_lake_store_name>.azuredatalakestore.net/dir/file
For example, to access "testfile" located in a directory called "testdir", stored in a data lake store called "mytest", the URL is:
adl://mytest.azuredatalakestore.net/testdir/testfile
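As a quick illustration, the URL structure can be assembled programmatically (a sketch; the store and path names are the example values above):

```python
def adl_url(store_name, path):
    """Build an adl:// URL for a file or directory in a Data Lake Store."""
    return "adl://{}.azuredatalakestore.net/{}".format(store_name, path.lstrip("/"))

print(adl_url("mytest", "testdir/testfile"))
# adl://mytest.azuredatalakestore.net/testdir/testfile
```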
The following FileSystem shell commands demonstrate access to a data lake store named mytest:
hadoop fs -ls adl://mytest.azuredatalakestore.net/
hadoop fs -mkdir adl://mytest.azuredatalakestore.net/testDir
hadoop fs -put testFile adl://mytest.azuredatalakestore.net/testDir/testFile
hadoop fs -cat adl://mytest.azuredatalakestore.net/testDir/testFile
test file content
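If access fails, one quick check is that all four properties actually made it into core-site.xml. A small sketch that parses the file content and verifies the keys (the inline XML here stands in for your real core-site.xml, whose location varies by install, commonly /etc/hadoop/conf/core-site.xml):

```python
import xml.etree.ElementTree as ET

REQUIRED = {
    "fs.adl.oauth2.access.token.provider.type",
    "fs.adl.oauth2.client.id",
    "fs.adl.oauth2.credential",
    "fs.adl.oauth2.refresh.url",
}

def adls_props(core_site_xml):
    """Return the ADLS-related name->value pairs found in a core-site.xml string."""
    root = ET.fromstring(core_site_xml)
    props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
    return {k: v for k, v in props.items() if k in REQUIRED}

# Stand-in for the contents of your core-site.xml:
sample = """<configuration>
  <property><name>fs.adl.oauth2.access.token.provider.type</name><value>ClientCredential</value></property>
  <property><name>fs.adl.oauth2.client.id</name><value>APPLICATION-ID</value></property>
  <property><name>fs.adl.oauth2.credential</name><value>KEY</value></property>
  <property><name>fs.adl.oauth2.refresh.url</name><value>TOKEN-ENDPOINT</value></property>
</configuration>"""

missing = REQUIRED - set(adls_props(sample))
print("missing:", sorted(missing) or "none")
```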
Learn more
For more information about working with ADLS, refer to Getting Started with ADLS in Hortonworks documentation.
06-02-2017 04:25 PM | 1 Kudo
We are excited to introduce the new Cloud Data Access guide for HDP 2.6.1. The goal of this guide is to provide the information and steps required for configuring, using, securing, tuning performance, and troubleshooting access to cloud storage services using the HDP cloud storage connectors available for Amazon Web Services (Amazon S3) and Microsoft Azure (ADLS, WASB). To learn about the architecture of the cloud connectors, refer to Introducing the Cloud Storage Connectors. To get started with your chosen cloud storage service, refer to:
Getting Started with Amazon S3
Getting Started with ADLS
Getting Started with WASB
Once you have configured authentication with the chosen cloud storage service, you can start working with the data. To get started, refer to:
Accessing Cloud Data with Hive
Accessing Cloud Data with Spark
Copying Cloud Data with Hadoop
If you have comments, suggestions, corrections, or updates regarding our documentation, let us know on HCC. Help us continue to improve our documentation! Thanks!
Hortonworks Technical Documentation Team
06-01-2017 09:42 PM
@jeff Can you answer this? By the way, you get better visibility by posting a question as a separate thread rather than commenting below an article.
06-01-2017 05:37 PM
@Sameer Bhatnagar They are not currently included. We only include the following services.
05-31-2017 10:07 PM
Updated for HDCloud for AWS version 1.14.4. Check it out!
05-31-2017 09:24 PM
Updated for the latest release 1.14.4. No major changes though. Check it out!
05-31-2017 09:19 PM
Updated for the latest HDCloud version 1.14.4. No major changes, just updated screenshots and links. Check it out!
04-20-2017 06:42 PM
Hi @Namit Maheshwari Setting fs.defaultFS permanently to s3a is not recommended.
04-05-2017 05:25 PM | 5 Kudos
HDCloud for AWS general availability version 1.14.1 is now available, including six new HDP 2.6 and Ambari 2.5 cluster configurations and new cloud controller features. If you are new to HDCloud, you can get started using this tutorial (updated for 1.14.1). Official HDCloud for AWS documentation is available here.
HDP 2.6 and Ambari 2.5
The following HDP 2.6 configurations are now available. For the list of all available HDP 2.5 and HDP 2.6 configurations, refer to the Cluster Configurations documentation.
Resource Tagging
When creating a cluster, you can optionally add custom tags that will be displayed on the CloudFormation stack and on EC2 instances, allowing you to keep track of the resources that cloud controller creates on your behalf. For more information, refer to the Resource Tagging documentation.
Node Auto Repair
The cloud controller monitors clusters by checking for the Ambari Agent heartbeat on all cluster nodes. If the Ambari Agent heartbeat is lost on a node, a failure is reported for that node. Once the failure is reported, it is fixed automatically (if auto repair is enabled), or options are available for you to fix the failure manually (if auto repair is disabled). You can configure auto repair settings for each cluster when you create it. For more information, refer to the Node Auto Repair documentation.
Auto Scaling
Auto Scaling provides the ability to increase or decrease the number of nodes in a cluster according to the auto scaling policies that you define. After you create an auto scaling policy, cloud controller will execute the policy when the conditions that you specify are met. You can create an auto scaling policy when creating a cluster, or you can manage auto scaling settings and policies on a running cluster. For more information, refer to the Auto Scaling documentation.
Protected Gateway
HDCloud now configures a protected gateway on the cluster master node. This gateway is designed to provide access to various cluster resources from a single network port.
Shared Druid Metastore (Technical Preview)
When creating an HDP 2.6 cluster based on the BI configuration, you can have a Druid metastore database created with the cluster, or you can use an external Druid metastore backed by Amazon RDS. Using an external Amazon RDS database for the Druid metastore allows you to preserve the Druid metastore metadata and reuse it between clusters. For more information, refer to the Managing Shared Metastores documentation.
These features are available via the cloud controller UI or CLI.