Member since: 04-27-2016
Posts: 218
Kudos Received: 133
Solutions: 25
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1938 | 08-31-2017 03:34 PM |
 | 4224 | 02-08-2017 03:17 AM |
 | 1528 | 01-24-2017 03:37 AM |
 | 6523 | 01-19-2017 03:57 AM |
 | 3699 | 01-17-2017 09:51 PM |
01-22-2019
04:45 PM
Ananya, the script was updated a while back to take care of this. You should be able to use an existing VPC and subnet. The only issue you might face is if an internet gateway is already attached to the VPC, since the script prefers to add a new internet gateway.
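As a quick sanity check before reusing a VPC, here is a minimal sketch (Python/boto3; the region and VPC ID are placeholders, and this is not part of the quickstart script) that reports whether an internet gateway is already attached:

import boto3

def vpc_has_internet_gateway(vpc_id, region="us-east-1"):
    # List internet gateways whose attachment points at the given VPC.
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.describe_internet_gateways(
        Filters=[{"Name": "attachment.vpc-id", "Values": [vpc_id]}]
    )
    return len(resp["InternetGateways"]) > 0

# Example: print(vpc_has_internet_gateway("vpc-0123456789abcdef0"))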
12-14-2018
10:59 PM
4 Kudos
Overview
There are many useful articles, and the official Cloudbreak documentation covers everything in great depth. This short article walks you through how to deploy a Cloudbreak instance within an existing VPC and subnet using the AWS Quickstart deployment.
Cloudbreak deployment options
The Cloudbreak deployment options are explained in detail here. Looking at the AWS-specific networking options, the Quickstart creates a new VPC by default, and the production install is recommended for a custom VPC. If you are doing a PoC and want to quickly try the Quickstart option while still using an existing VPC, you can do that by enhancing the CloudFormation template, as described in the next section.
CloudFormation template changes
When you launch the CloudFormation template for the AWS Quickstart, it selects the existing CloudFormation template https://s3.amazonaws.com/cbd-quickstart/cbd-quickstart-2.7.0.template by default.
Instead of using the default template, use the following template:
https://github.com/mpandithw/cloudbreak/blob/master/CloudFormation_aws_QuickStart-Template
The main change to the original template is the addition of the following two parameters, VpcId and SubnetId:
"VpcId": {
"Type": "AWS::EC2::VPC::Id",
"Description": "VpcId of your existing Virtual Private
Cloud (VPC)"
},"SubnetId": {
"Type": "AWS::EC2::Subnet::Id",
"Description": "SubnetId of your existing Virtual Private
Cloud (VPC)"
}
I am not walking through all the detailed steps, which are already covered in the Cloudbreak documentation. The only modification to the original process is to select your own CloudFormation template, as described above. You will then get drop-down lists of your existing VPCs and subnets. Complete the rest of the process as explained in the Cloudbreak documentation.
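If you prefer to script the stack launch instead of using the console, here is a minimal sketch using Python and boto3 (the stack name, VPC/subnet IDs, and template path are placeholders, and the remaining Quickstart parameters still need to be supplied):

import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")  # pick your region
with open("CloudFormation_aws_QuickStart-Template") as f:      # the modified template downloaded locally
    template_body = f.read()

cfn.create_stack(
    StackName="cbd-quickstart-existing-vpc",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "VpcId", "ParameterValue": "vpc-0123456789abcdef0"},        # existing VPC
        {"ParameterKey": "SubnetId", "ParameterValue": "subnet-0123456789abcdef0"},  # existing subnet
        # ...plus the other parameters the Quickstart template expects (key pair, email, etc.)
    ],
    Capabilities=["CAPABILITY_IAM"],  # needed if the template creates IAM resources
)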
Benefits
You can use the AWS Quickstart deployment of Cloudbreak within your existing VPC/subnet.
Document References
AWS CloudFormation User Guide
- Find more articles tagged with:
- aws
- Cloudbreak
- cloudformation
- How-ToTutorial
- QuickStart
- Sandbox & Learning
- vpc
04-09-2018
08:52 PM
5 Kudos
This is the continuation of the Part 1 article on provisioning HDP/HDF clusters on Google Cloud. Now that we have the Google credential created, we can provision the HDP/HDF cluster. Let's first start with an HDP cluster.
Log in to the Cloudbreak UI and click Create Cluster, which opens the create cluster wizard with both basic and advanced options. On the General Configuration page, select the previously created Google credential, enter a name for the cluster, select a region as shown below, and select either the HDP or HDF version. For the cluster type, select the appropriate cluster blueprint based on your requirements. The blueprint options available in the Cloudbreak 2.5 tech preview are shown below.
Next, configure the hardware and storage. Select the Google VM instance type from the dropdown and enter the number of instances for each group. You must select one node for the Ambari server: for one of the host groups, the Group Size should be set to "1".
Next, set up the network. You can select an existing network, or you have the option to create a new network.
On the Security configuration page, provide the cluster admin username and password. Select either the new SSH public key option or the existing SSH public key option; you will use the matching private key to access your nodes via SSH.
Finally, click Create Cluster, which redirects you to the Cloudbreak dashboard. The left image below shows the cluster creation in progress, and the right image shows the successful creation of the HDP cluster on Google Cloud.
Once the HDP cluster is deployed successfully, you can log in to the HDP nodes using your SSH private key with the tool of your choice. The following image shows a node login using the Google Cloud browser option; a scripted alternative is sketched at the end of this article.
Similarly, you can provision an HDF (NiFi: Flow Management) cluster using Cloudbreak, which is included as part of the 2.5 tech preview. Following are some key screenshots for reference. The network, storage, and security configuration is similar to what we saw in the HDP section earlier. Due to a limitation of my Google Cloud account subscription, I ran into an exception while creating the HDF cluster, which was correctly surfaced in Cloudbreak; I had to select a different region to resolve it. The NiFi cluster was then created successfully, as shown below.
Conclusion: Cloudbreak provides an easy button to provision and monitor the connected data platform (HDP and HDF) on the cloud vendor of your choice to build modern data applications.
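As referenced above, here is a minimal sketch (Python/paramiko) of logging in to a provisioned node with your SSH private key instead of the browser option; the hostname, key path, and the "cloudbreak" login user are assumptions, not something prescribed by Cloudbreak:

import paramiko

# Load the private key matching the public key supplied in the Security configuration step.
key = paramiko.RSAKey.from_private_key_file("/path/to/your-private-key.pem")

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("node-public-ip-or-hostname", username="cloudbreak", pkey=key)  # assumed login user

stdin, stdout, stderr = client.exec_command("hostname -f")  # simple smoke test
print(stdout.read().decode())
client.close()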
- Find more articles tagged with:
- Cloud & Operations
- Cloudbreak
- FAQ
- gcp
- how-to-tutorial
- How-ToTutorial
04-09-2018
05:42 PM
5 Kudos
Cloudbreak Overview
Overview
Cloudbreak enables enterprises to provision Hortonworks platforms in public (AWS, GCP, Azure) and private (OpenStack) cloud environments. It simplifies the provisioning, management, and monitoring of on-demand HDP and HDF clusters in virtual and cloud environments. The primary use cases for Cloudbreak are:
- Dynamically configure and manage clusters on public or private clouds.
- Seamlessly manage elasticity requirements as cluster workloads change.
- Support configurations that define network boundaries and configure security groups.
This article focuses on deploying HDP and HDF clusters on Google Cloud.
Cloudbreak Benefits
You can spin up the connected data platform (HDP and HDF clusters) on the cloud vendor of your choice using open source Cloudbreak 2.0, which addresses the following scenarios:
- Defining a comprehensive data strategy irrespective of deployment architecture (cloud or on-premise).
- Addressing hybrid (on-premise and cloud) requirements.
- Supporting the key multi-cloud approach requirements.
- Consistent and familiar security and governance across on-premise and cloud environments.
Cloudbreak 2 Enhancements
Hortonworks recently announced the general availability of the Cloudbreak 2.4 release. Some of the major enhancements in Cloudbreak 2.4 are:
- New UX/UI: a greatly simplified and streamlined user experience.
- New CLI: a new CLI that eases automation, an important capability for cloud DevOps.
- Custom Images: advanced support for “bring your own image”, a critical feature to meet enterprise infrastructure requirements.
- Kerberos: the ability to enable Kerberos security on your clusters, a must for any enterprise deployment.
You can check the following HCC article for a detailed overview of Cloudbreak 2.4: https://community.hortonworks.com/articles/174532/overview-of-cloudbreak-240.html
Also check the following article for the Cloudbreak 2.5 tech preview details: https://community.hortonworks.com/content/kbentry/182293/whats-new-in-cloudbreak-250-tp.html
Prerequisites for Google Cloud Platform
This article assumes that you have already installed and launched the Cloudbreak instance, either on your own custom VM image or on Google Cloud Platform. You can follow the Cloudbreak documentation, which describes both options:
https://docs.hortonworks.com/HDPDocuments/Cloudbreak/Cloudbreak-2.5.0/content/index.html
https://docs.hortonworks.com/HDPDocuments/Cloudbreak/Cloudbreak-2.5.0/content/gcp-launch/index.html
To launch Cloudbreak and provision clusters, make sure you have a Google Cloud account; you can create one at https://console.cloud.google.com.
- Create a new project in GCP (e.g. the GCPIntegration project, as shown below).
- To launch clusters on GCP, you must have a service account that Cloudbreak can use. Assign the admin roles for Compute Engine and Storage; you can check the required service account admin roles at Admin Roles.
- Make sure you create the P12 key and store it safely (a scripted alternative is sketched below).
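Here is a minimal sketch (google-api-python-client and google-auth; the service account email and output path are placeholders) of generating the P12 key programmatically, as an alternative to the console:

import base64
import google.auth
from googleapiclient import discovery

# Application Default Credentials must be available (e.g. via `gcloud auth application-default login`).
credentials, project_id = google.auth.default()
iam = discovery.build("iam", "v1", credentials=credentials)

# Create a P12-format key for the service account that Cloudbreak will use.
key = iam.projects().serviceAccounts().keys().create(
    name="projects/-/serviceAccounts/cloudbreak-sa@your-project.iam.gserviceaccount.com",  # placeholder email
    body={"privateKeyType": "TYPE_PKCS12_FILE"},
).execute()

# The key material is returned base64-encoded; save it as the .p12 file you upload to Cloudbreak.
with open("cloudbreak-gcp.p12", "wb") as f:
    f.write(base64.b64decode(key["privateKeyData"]))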
This article assumes that you have successfully met the prerequisites and are able to launch the Cloudbreak UI, as shown on the left below, by visiting https://<IP_Addr or HostName>. Upon successful login you are redirected to the dashboard, which looks like the image on the right.
Create Cloudbreak Credential for GCP
The first step before provisioning a cluster is to create the Cloudbreak credential for GCP. Cloudbreak uses this GCP credential to create the required resources on GCP. The steps to create the GCP credential are:
- In the Cloudbreak UI, select Credentials from the navigation pane and click Create Credential. Under cloud provider, select Google Cloud Platform.
- As shown below, provide the Google project ID and the service account email ID from the Google project, and upload the P12 key that you created in the section above.
Once you provide all the right details, Cloudbreak will create the GCP credential, and it should be displayed in the Credentials pane. The next article, Part 2, covers in detail how to provision HDP and HDF clusters using the GCP credential.
- Find more articles tagged with:
- Cloud & Operations
- Cloudbreak
- FAQ
- gcp
- how-to-tutorial
04-04-2018
01:34 PM
1 Kudo
Please confirm whether you tried deleting the flowfile repository at $nifi_home/flowfile_repository. Also take a backup of the flow.xml.gz file, delete it, and try again; the file is in your conf directory.
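A minimal sketch (Python; assumes the NIFI_HOME environment variable points at your NiFi install and NiFi is stopped) of backing up flow.xml.gz before removing it:

import os
import shutil

conf_dir = os.path.join(os.environ["NIFI_HOME"], "conf")
flow = os.path.join(conf_dir, "flow.xml.gz")

shutil.copy2(flow, flow + ".bak")  # keep a restorable copy next to the original
os.remove(flow)                    # remove only after the backup succeeds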
04-04-2018
12:55 PM
1 Kudo
Please tell us the Cloudbreak image location you used to import the image, and the value you set for the CB_LATEST_IMAGE variable.
04-04-2018
12:42 PM
Can you please confirm your classpath setting and make sure it's pointing to the correct version of NiFi.
04-04-2018
12:21 PM
1 Kudo
To define the right set of policies, it is important to understand how the Ranger policy engine evaluates them. Once the list of tags for the requested resource is found, the Apache Ranger policy engine evaluates the tag-based policies applicable to those tags:
1. If a policy for one of these tags results in deny, the access is denied.
2. If none of the tags are denied and a policy allows one of the tags, the access is allowed.
3. If there is no result for any tag, or if there are no tags for the resource, the policy engine evaluates the resource-based policies to make the authorization decision.
For masking: to exclude specific users/groups from column masking, create a policy item for those users/groups with ‘Unmasked’ as the masking option, and ensure that the policy item is the first one to appear in the list for those users/groups. I hope this helps.
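To make the evaluation order concrete, here is a minimal sketch in Python of the logic described above; the policy-store helpers are hypothetical, not actual Ranger APIs:

def authorize(resource_tags, tag_policies, resource_policies, request):
    """Mirror the tag-first evaluation order: deny wins, then allow, then fall back."""
    allowed_by_tag = False
    for tag in resource_tags:
        decision = tag_policies.evaluate(tag, request)   # hypothetical helper returning "DENY", "ALLOW", or None
        if decision == "DENY":
            return False                                 # 1. any tag deny ends evaluation
        if decision == "ALLOW":
            allowed_by_tag = True
    if allowed_by_tag:
        return True                                      # 2. no deny and at least one allow
    # 3. no result from tags (or no tags): resource-based policies decide
    return resource_policies.evaluate(request) == "ALLOW"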
12-21-2017
03:10 PM
You can achieve this with a single NiFi flow: get the CSV file using the GetFTP processor ---> PutHDFS processor.
12-21-2017
02:29 PM
NiFi does not replicate data. If you lose a node, the flow can be directed to an available node; flowfiles queued for the failed node will either wait until the node comes back up or must be manually sent to another working node. There is a feature proposal for this: https://cwiki.apache.org/confluence/display/NIFI/Data+Replication
12-21-2017
02:17 PM
What NiFi version are you using? You might be running into https://issues.apache.org/jira/browse/NIFI-516, which is already fixed. If you want to merge groups of 1400 flowfiles into a single file every time, then you should set Minimum Number of Entries to 1400. You can significantly lower your Maximum Number of Bins based on system resources. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeContent/index.html
12-21-2017
01:56 PM
Verify your HDP/Ambari version to see if you are running into this issue. https://issues.apache.org/jira/browse/AMBARI-22005
09-05-2017
04:13 AM
Not sure if any additional configuration is required for beeline, as Hive View jobs are showing in Tez View.
08-31-2017
09:17 PM
1 Kudo
What's the parameter I should be setting in beeline so that I can view all the jobs in Tez View? If I use Hive View, I am able to see all the jobs.
Labels:
- Apache Ambari
- Apache Hive
- Apache Tez
08-31-2017
03:34 PM
It started working after I disabled vectorized execution mode: set hive.vectorized.execution.enabled = false;
08-31-2017
03:04 PM
Thanks @Sindhu, I tried that but got the same exception. The same query works fine if I turn on LLAP mode.
08-31-2017
12:47 AM
I am getting the following exception while running one of the TPCDS benchmarking queries in non-LLAP mode. Please advise. ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1504136552198_0002_33_05, diagnostics=[Task failed, taskId=task_1504136552198_0002_33_05_000001, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async Initialization failed. abortRequested=false
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:464)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:398)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:564)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:516)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:384)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:335)
... 15 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.allocateBucketArray(VectorMapJoinFastLongHashTable.java:265)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.<init>(VectorMapJoinFastLongHashTable.java:279)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashMap.<init>(VectorMapJoinFastLongHashMap.java:113)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:115)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:108)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:315)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:187)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:183)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:91)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:108)
... 4 more
, errorMessage=Cannot recover from this error:java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async Initialization failed. abortRequested=false
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:464)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:398)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:564)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:516)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:384)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:335)
... 15 more
Labels:
- Apache Hive
- Apache Tez
05-08-2017
10:00 PM
My Storm service is down; here is the screenshot from Ambari. I am getting a connection failed exception for all the services: Connection failed: [Errno 111] Connection refused to <host>:3772.
- Tags:
- Hadoop Core
- hdp-2.5.0
Labels:
- Hortonworks Data Platform (HDP)
04-21-2017
02:27 AM
@Adda Fuentes Looks like a similar issue. Can you please check which NAR file you have in the <nifi_home>/lib folder? e.g. nifi-hive-nar-1.1.0.2.1.2.0-10.nar
04-04-2017
08:35 PM
@jwitt PutSQL:
Insert into PCSOR (policy_num, prem_amount, LOB, AdminSystem, Name, DOB, Phone, Address) values ('123456','2000.00','PC','PC SOR','John Smith','1/1/1970','212-010-2345','316 Lincoln Rd. Brooklyn, NY 11225') ON DUPLICATE KEY UPDATE Address='316 Lincoln Rd. Brooklyn, NY 11225';
Sample Flow: Connection Pool.
04-04-2017
08:21 PM
1 Kudo
I am getting the following exception with the PutSQL processor: "2017-04-04 20:18:00,746 ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.standard.PutSQL PutSQL[id=aff94ebf-739d-1936-ffff-ffff9e38bbdf] Failed to update database due to a failed batch update. There were a total of 1 FlowFiles that failed, 0 that succeeded, and 0 that were not execute and will be routed to retry;" The same SQL works fine if I run it through the command line. There are no additional details for the exception.
Labels:
- Apache NiFi
03-29-2017
08:30 PM
2 Kudos
The idea is to add search criteria based on a custom field (part of the event), using some unique identifier (e.g. an order number). This will help in identifying the unique transaction details. Please suggest if there are any other alternatives to achieve this.
Labels:
- Apache NiFi
03-10-2017
04:12 AM
1 Kudo
Currently Cloudbreak supports AWS, Google Cloud, Azure, OpenStack, etc. If I would like to make it work with a different cloud vendor, what can be done?
Labels:
- Hortonworks Cloudbreak
02-21-2017
04:14 AM
You might be missing the port forwarding step. Verify this https://hortonworks.com/hadoop-tutorial/port-forwarding-azure-sandbox/
02-15-2017
06:10 PM
@Tomas Safarik I haven't tried it, but you can look into procrun: http://commons.apache.org/proper/commons-daemon/procrun.html
02-15-2017
05:00 PM
@Dawid Glowacki Looking at the exception, it seems you have nested records with the same name. Avro does not allow two records with the same name within a schema. Try using a namespace to make the full record names unique and avoid this issue.
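For illustration, here is a minimal sketch (Python/fastavro; record and field names are made up) showing how distinct namespaces let two nested records share the short name "Address" without a conflict:

from fastavro import parse_schema

schema = {
    "type": "record",
    "name": "Customer",
    "namespace": "com.example",
    "fields": [
        # The two nested records may reuse the name "Address" because their namespaces differ,
        # giving the distinct full names com.example.home.Address and com.example.work.Address.
        {"name": "home", "type": {
            "type": "record", "name": "Address", "namespace": "com.example.home",
            "fields": [{"name": "street", "type": "string"}]}},
        {"name": "work", "type": {
            "type": "record", "name": "Address", "namespace": "com.example.work",
            "fields": [{"name": "street", "type": "string"}]}},
    ],
}

parse_schema(schema)  # parses cleanly because the full names differ; a duplicate full name would be rejected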