
Provisioning clusters with MiNiFi C++ Bootstrap.

This article references code from https://github.com/apache/nifi-minifi-cpp/pull/218

MiNiFi C++ has a new bootstrap process that lets you select dependencies through a menu-guided approach. In this article, we'll discuss using the bootstrap script to bypass the menu and install every dependency that your system is capable of installing.

The major advantage is that you can run MiNiFi C++, which doesn't incur a startup cost and runs within a very small memory footprint, on any device you can reach with SCP (to distribute the source) and SSH (to run the bootstrap command).

Overview of the bootstrap process.

The bootstrap process is a simple set of Bash scripts that run the dependency installation, CMake setup, and build. You then use bootstrap to run a make install. The bootstrap.sh script normally prompts for agreement at every step, but for the purposes of this article we'll supply the -n argument, which suppresses all prompts and runs the bootstrap in headless mode. We'll also supply the -p argument, which runs a parallel make build and make package.

Alternatives

Readers may wonder why we can't run make package on a node of similar architecture and ship a portable binary. We can do this with GCC by supplying a generic -march setting; however, the purpose of this article is to distribute the source and run the build on devices that may have different architectures, without the need for cross compilation. By using the bootstrap script, we can provision a cluster of any size with mixed architectures without centralizing cross compilation. Bootstrapping is by no means the more efficient approach. Instead, it supports various architectures across many different OS distributions. Note that since the bootstrap script detects the OS version and type, it can self-limit which features are installed on each node. With cross compilation this becomes a greater task, as you must build separately for each target architecture.

Running

Running the bootstrap is very simple. To test, I built ten VMs on AWS running a mix of Ubuntu 16, RHEL 7, and CentOS 6. I then used pdsh with a genders file to distribute commands to each host, which minimizes typing. You can just as easily use plain SSH and run the commands on each host manually.

Once I distributed the tar.gz file to each node, I set up a genders file that associated a name with each IP. I called these hosts minifitest00 through minifitest09 and assigned them the group name minifitestnodes. I first extracted the tarred gzip:
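For reference, a genders file is plain text with one node name per line followed by that node's attributes. The sketch below writes such a file for the ten hosts and the group name used in this article. It writes to a scratch file (the GENDERS_FILE variable is a name chosen here for illustration), and it assumes the host names already resolve to the instance IPs, for example through /etc/hosts or ~/.ssh/config.

```shell
#!/usr/bin/env bash
# Sketch: generate a genders file with ten hosts that share one group attribute.
set -euo pipefail

# Assumption: write to a scratch file rather than a system-wide location.
GENDERS_FILE="${GENDERS_FILE:-$(mktemp)}"
GROUP="minifitestnodes"

# Start from an empty file, then add one "<nodename> <attribute>" line per host.
: > "$GENDERS_FILE"
for ((i = 0; i < 10; i++)); do
  printf 'minifitest%02d %s\n' "$i" "$GROUP" >> "$GENDERS_FILE"
done

# Show the result for inspection.
cat "$GENDERS_FILE"
```

The pdsh genders module typically reads /etc/genders by default, so the generated file would need to be installed there or otherwise made known to pdsh before the -g option can select the group.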

$ pdsh -g minifitestnodes 'tar -xzf ~/minify.tar.gz'

This extracted to a directory named nifi-minifi-cpp in the connecting user's home directory. I then ran the bootstrap command:

$ pdsh -g minifitestnodes 'cd ~/nifi-minifi-cpp ; ./bootstrap.sh -p -n -e'

This downloads the dependencies for each distribution, accepts all updates, and runs the build followed by make package. With MiNiFi C++ built and provisioned, we can run our agents. In future posts we'll discuss using C2 to control these agents.
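If pdsh is not available, the same steps can be driven with plain scp and ssh. The following is a minimal sketch, not a tested provisioning tool: the run and provision_host helpers and the DRY_RUN and TARBALL variables are names invented for this example, and the host names are the ones used above. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: copy, extract, and bootstrap on each host using scp and ssh.
set -euo pipefail

# Assumption: DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
# Archive name as used in this article; adjust to match your own tarball.
TARBALL="${TARBALL:-minify.tar.gz}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Perform the three steps from this article against a single host.
provision_host() {
  local host="$1"
  run scp "$TARBALL" "$host:"
  run ssh "$host" "tar -xzf ~/$(basename "$TARBALL")"
  run ssh "$host" "cd ~/nifi-minifi-cpp && ./bootstrap.sh -p -n -e"
}

for ((i = 0; i < 10; i++)); do
  provision_host "$(printf 'minifitest%02d' "$i")"
done
```

Setting DRY_RUN=0 executes the commands for real, one host at a time. Unlike pdsh, this loop is sequential, so it will be slower on a large cluster.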

Next Steps for bootstrapping

The next step for the bootstrapping process is to limit the number of build nodes. Since many nodes will undoubtedly share the same build profile (architecture, OS version, and OS type), bootstrap should be able to identify this automatically, build only once per profile, and distribute the executables within the cluster.
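To make the idea of a build profile concrete, the sketch below derives one string per node from its architecture, OS type, and OS version. This is an illustration only and not how the bootstrap script works today; the build_profile function name and the output format are assumptions, and it relies on /etc/os-release, which some older distributions may not provide.

```shell
#!/usr/bin/env bash
# Sketch: derive a "build profile" string from architecture, OS type, and OS version.
set -euo pipefail

build_profile() {
  local arch os_id os_version
  arch="$(uname -m)"
  os_id="unknown"
  os_version="unknown"
  # Read ID and VERSION_ID in a subshell so the file's variables do not leak
  # into this script; fall back to "unknown" when the file is missing.
  if [ -r /etc/os-release ]; then
    os_id="$(. /etc/os-release && printf '%s' "${ID:-unknown}")"
    os_version="$(. /etc/os-release && printf '%s' "${VERSION_ID:-unknown}")"
  fi
  printf '%s-%s-%s\n' "$arch" "$os_id" "$os_version"
}

build_profile
```

Nodes that report the same string could, in principle, share a single build: one node per distinct profile runs the bootstrap, and the resulting package is copied to the others.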

Conclusion.

This article discussed bootstrapping nodes with little to no knowledge of the underlying distribution. Since the bootstrap script is Bash-based, no additional dependency is needed to start it. The script bootstraps the node with all the components needed to build and run the agent.
