Created on 04-03-2018 08:40 PM
In this tutorial, I create a NiFi cluster from the default blueprint provided with Cloudbreak. The post was originally written for Cloudbreak 2.5.0 TP and has been updated to reflect what is available in the Cloudbreak 2.9.0 general availability release.
Cloudbreak 2.5.0 TP introduced support for creating HDF Flow Management (NiFi) clusters. The subsequent release, Cloudbreak 2.6.0 TP, introduced support for creating HDF Messaging Management (Kafka) clusters. Cloudbreak 2.7.0 GA made both generally available. Cloudbreak 2.9.0 GA includes two HDF 3.3 blueprints by default, one for Flow Management (NiFi) and one for Messaging Management (Kafka).
Cloudbreak simplifies the provisioning, management, and monitoring of on-demand HDP and HDF clusters in virtual and cloud environments. It leverages cloud infrastructure to create host instances, and uses Apache Ambari via Ambari blueprints to provision and manage HDP and HDF clusters.
Cloudbreak allows you to create HDP and HDF clusters using the Cloudbreak web UI, Cloudbreak CLI, and Cloudbreak REST API. Clusters can be launched on public cloud infrastructure platforms Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform, and on the private cloud infrastructure platform OpenStack.
Support for creating NiFi clusters was introduced in Cloudbreak 2.5.0 Technical Preview and became generally available in Cloudbreak 2.7.0.
In order to use Cloudbreak, you must have access to a cloud provider account (AWS, Azure, Google Cloud, OpenStack) on which resources can be provisioned.
If you are part of Hortonworks, you have access to the hosted Cloudbreak instance. If you do not have access to this instance, you must launch Cloudbreak on your chosen cloud platform.
The instructions for launching Cloudbreak are available here:
Once your Cloudbreak instance is running, you must create a Cloudbreak credential before you can start creating HDP or HDF clusters. Why do you need this? By creating a Cloudbreak credential, you provide Cloudbreak with the means to authenticate with your cloud provider account and provision resources (virtual networks, VMs, and so on) on that account. Creating a Cloudbreak credential is always required, even if your Cloudbreak instance is running on your cloud provider account.
The instructions for creating a Cloudbreak credential are available here:
> Tip: When using a corporate cloud provider account, you are unlikely to be able to perform all the required prerequisite steps by yourself, so you may need to ask your IT team to perform some of the steps for you. For related tips, refer to this HCC post.
You can create clusters from the Cloudbreak web UI and the Cloudbreak CLI. It’s best to get started with the UI before attempting to use the CLI.
1. Log in to the Cloudbreak UI.
2. Click Create Cluster to display the Create Cluster wizard. By default, the Basic view is displayed. You can click Advanced to see more options, but this post covers only the basic parameters.
3. On the General Configuration page, specify the following general parameters for your cluster:
4. When done, click Next.
5. On the Hardware and Storage page, Cloudbreak pre-populates the recommended instance types and counts and the storage types and sizes. You may adjust these depending on how many nodes and how much storage you want, and of what type. By default, a 2-node cluster is created with one node in each host group (Services host group and NiFi host group).
6. Before proceeding to the next page, you must select the host group on which Ambari Server will be installed. Under the “Services” host group, check “Ambari Server” so that Ambari Server is installed on that host group:
7. When done, click Next.
8. On the Network and Availability page, you can proceed with the default settings or adjust the settings in the following way:
9. On the Gateway Configuration page, do not change anything. Just click Next.
10. On the Network Security Groups page,
Your configuration should look like this:
> Tip: If you are planning to use NiFi processors that require additional ports, add additional rules to open these ports.
> Tip: By default, Cloudbreak creates a new security group for each host group. Ports 9443, 22, and 443 are open to all (0.0.0.0/0) on the Services host group (because this is where Ambari Server is installed), and port 22 is open to all (0.0.0.0/0) on the NiFi host group. These settings are not suitable for production. If you are planning to leave your cluster running for longer than a few hours, review the guidelines documented here and limit access by (1) deleting the default rules and (2) adding new rules with the CIDR set to “My IP” and “Custom” (use “Custom” to specify the Cloudbreak instance IP).
11. On the Security page, provide the following information:
> Warning: Make sure not to disable Kerberos. If you don’t have an existing KDC, select the option to create a test KDC. If you use the default Flow Management blueprint without enabling Kerberos, the NiFi UI will be inaccessible unless you configure an SSL certificate or you register and use an existing LDAP.
12. At this point, you have provided all parameters required to create your cluster. Click CREATE CLUSTER to start the cluster creation process.
13. You will be redirected to the cluster dashboard, and the cluster status shown on the corresponding tile will be “Create in progress” (blue color). When the cluster is ready, its status will change to “Running”:
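The wait for the status to change can also be scripted. The following is a minimal, generic polling sketch, not a Cloudbreak feature: `wait_for_status` is a hypothetical helper, and the command that prints the current cluster status is deliberately left to the caller, since how you obtain it (CLI or REST API) is an assumption outside this post.

```shell
# Hypothetical helper: poll a caller-supplied command until it prints the
# wanted status, or give up after a maximum number of attempts.
# Usage: wait_for_status <wanted-status> <max-attempts> <command> [args...]
# POLL_INTERVAL (seconds between checks) defaults to 30.
wait_for_status() {
  wanted="$1"
  max_attempts="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Run the caller-supplied status command and capture what it prints.
    current="$("$@")"
    if [ "$current" = "$wanted" ]; then
      return 0
    fi
    sleep "${POLL_INTERVAL:-30}"
    attempt=$((attempt + 1))
  done
  # The wanted status was never observed.
  return 1
}
```

For example, `wait_for_status "Running" 60 my_status_command` would check up to 60 times, 30 seconds apart by default. Here `my_status_command` is a placeholder for whatever prints the cluster status in your setup.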
Once the status of your cluster changes to “Running”, click on the cluster tile to view cluster details where you can find information related to your cluster and access cluster-related options.
Note the following options:
1. Click on the link under Ambari URL to access the Ambari web UI in the browser:
2. Log in to the Ambari web UI by using the cluster user and password that you specified when creating the cluster. Since the Ambari web UI is set up with a self-signed SSL certificate, the first time you access it your browser will warn you about an untrusted connection and will ask you to confirm a security exception. Once you have logged in, you can access the NiFi service from the Ambari dashboard:
The NiFi UI link is available from Quick Links:
3. To access cluster nodes via SSH, use:
For example, on Mac OS X:
ssh -i "mytest-kp.pem" email@example.com
4. The Cloudbreak web UI provides options to Stop, Start, and Sync the cluster with the cloud provider. When you no longer need the cluster, you can terminate it by using the Terminate option available in the cluster details.
> Resizing and autoscaling: In general, downscaling NiFi clusters is not supported, because removing a node that has not yet processed all of its data can result in data loss. Upscaling is supported, but there is a known issue which requires you to manually update the newly added hosts (see Known Issues).
5. The Show CLI command option allows you to generate a JSON template for the existing cluster; the template can later be used to automate cluster creation with the Cloudbreak CLI.
6. You can download the Cloudbreak CLI by selecting Download CLI from the navigation pane. The CLI is available for Mac OS X, Windows, and Linux.
7. You only need to configure the CLI once so that it can be used with your Cloudbreak instance:
./cb configure --server <cloudbreak IP> --username <cloudbreak-user> --password <cloudbreak-password>
8. Once it has been configured, you can view available commands by using:
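As a rough sketch of how steps 5 to 7 fit together, the JSON template saved from Show CLI command could be passed back to the configured CLI from a small shell wrapper. The `cluster create --cli-input-json ... --name ...` invocation is assumed from the Cloudbreak 2.x CLI and should be verified against the CLI's built-in help for your version; the function name, the `CB_BIN` variable, and the file and cluster names are hypothetical.

```shell
# Path to the Cloudbreak CLI binary; defaults to ./cb as used in this post.
CB_BIN="${CB_BIN:-./cb}"

# Hypothetical wrapper: create a cluster from a JSON template that was saved
# from the "Show CLI command" option.
# Usage: create_cluster_from_template <template.json> <cluster-name>
create_cluster_from_template() {
  template="$1"
  name="$2"
  # Fail early if the template file is missing or empty.
  if [ ! -s "$template" ]; then
    echo "template not found or empty: $template" >&2
    return 1
  fi
  # Assumed invocation; check the CLI help for your Cloudbreak version.
  "$CB_BIN" cluster create --cli-input-json "$template" --name "$name"
}
```

A call such as `create_cluster_from_template my-nifi-template.json my-nifi` would then submit the saved template under a new cluster name, provided the CLI has already been configured as in step 7.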
Cloudbreak includes additional advanced options, some of which are cloud platform-specific. To review them, refer to the following docs: