Member since: 04-10-2019
Posts: 16
Kudos Received: 30
Solutions: 0
10-23-2018
06:49 PM
2 Kudos
If the MySQL JDBC driver does not exist by default, or is not installed along with MySQL, the provisioned cluster fails with errors like the following. In this case, create a recipe in Cloudbreak that installs the MySQL JDBC driver at pre-ambari-start time. During Create Cluster, use the Advanced option, pick the recipe pk-install-mysql-driver-pre from the drop-down, and click Attach. This ensures the JDBC driver is installed before Ambari is started, which should result in a clean install.
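A pre-ambari-start recipe is just a shell script that Cloudbreak runs on each node before Ambari starts. A minimal sketch of what pk-install-mysql-driver-pre could contain (assuming a RHEL/CentOS base image where the driver is available as the mysql-connector-java package, and that the jar is expected under /usr/share/java):

    #!/bin/bash
    # Install the MySQL JDBC driver before Ambari starts (pre-ambari-start recipe).
    # Assumption: RHEL/CentOS image with the mysql-connector-java package in the enabled repos.
    set -euo pipefail
    yum install -y mysql-connector-java
    # Sanity check: the jar Ambari will look for
    ls -l /usr/share/java/mysql-connector-java.jar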
10-23-2018
06:29 PM
2 Kudos
If you see errors related to the YARN Registry DNS Bind Port, where YARN Registry DNS is stopped, it is most probably due to a port conflict. Go into the Advanced configurations, look for the parameter RegistryDNS Bind Port, and change it to a port that does not have a conflict; I changed it from 53 to 553. Save the configuration changes and restart all impacted components. This should take care of the issue, and YARN Registry DNS will start successfully.
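Before changing the port, it can help to confirm what is already bound to port 53 on the host running the Registry DNS component. This is a generic Linux check, not an HDP-specific command:

    # Show the process, if any, currently listening on port 53
    sudo ss -tulpn | grep ':53 '
    # On older systems without ss:
    sudo netstat -tulpn | grep ':53 '

If something else owns the port (a local DNS service, for example), changing RegistryDNS Bind Port as described above is the simplest fix.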
10-16-2018
01:31 AM
2 Kudos
While it is possible to span a cluster across multiple Availability Zones, we don't generally recommend it, for the following reasons:
1. Latency: In AWS, there is natural latency involved in moving data across multiple AZs, which leads to performance problems or other issues in the cluster. Especially as the cluster size and workload increase, the performance issues become more pronounced.
2. Double billing: As a natural part of cluster operation, there will be data transfer. According to the AWS FAQs, each instance is charged for its data in and data out at the corresponding Data Transfer rates. Therefore, if data is transferred between two such instances, it is charged at "Data Transfer Out from EC2 to Another AWS Region" for the first instance and at "Data Transfer In from Another AWS Region" for the second instance. Please refer to this page for detailed data transfer pricing: https://aws.amazon.com/ec2/faqs/
However, mission requirements and/or regulations may require spanning multiple Availability Zones, which is possible and supported. I would also like to add that we support multiple AZs, but not multiple Regions, within a single cluster.
09-19-2018
06:33 PM
I would recommend using Cloudbreak to provision a cluster on AWS. It is simple, intuitive, and fast. Cloudbreak uses Ambari blueprints; you can customize them or use the ones that come out of the box. Detailed steps to create an HDP/HDF cluster: https://github.com/purn1mak/HadoopSummitCloudbreak/blob/master/README.md Create a basic HDF cluster: https://youtu.be/enLrboB0aKo Let us know if you run into any issues.
08-16-2018
09:18 PM
1 Kudo
Launch Cloudbreak on AWS

Meet the Prerequisites
Before launching Cloudbreak on AWS, you must meet the following prerequisites: AWS Account, AWS Region, SSH Key Pair, Key Based Authentication.

AWS Account
In order to launch Cloudbreak on AWS, you must log in to your AWS account. If you don't have an account, you can create one at https://aws.amazon.com/.

AWS Region
Decide in which AWS region you would like to launch Cloudbreak; only the AWS regions supported by Cloudbreak can be used.

SSH Key Pair
Import an existing key pair or generate a new key pair in the AWS region you are planning to use for launching Cloudbreak and clusters. You need this SSH key pair to SSH to the Cloudbreak instance and start Cloudbreak. To do this, use the following steps:
1. Navigate to the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
2. Check the region listed in the top right corner to make sure that you are in the correct region.
3. In the left pane, find NETWORK & SECURITY and click Key Pairs.
4. Do one of the following:
To generate a new key pair: click Create Key Pair. Your private key file will be automatically downloaded onto your computer. Make sure to save it in a secure location; you will need it to SSH to the cluster nodes. You may want to change the access settings for the file using chmod 400 my-key-pair.pem.
To import an existing public key: click Import Key Pair, select the public key, and click Import. Make sure that you have access to its corresponding private key.

Key Based Authentication
If you are using key-based authentication for Cloudbreak on AWS, you must be able to provide your AWS Access Key and Secret Key pair. Cloudbreak will use these keys to launch the resources. You provide the Access and Secret Keys later in the Cloudbreak web UI when creating a credential. If you choose this option, all you need to do at this point is check your AWS account and make sure that you can access this key pair. You can generate new access and secret keys from the IAM console. To do this, go to the IAM service:
1. In the left pane, click Users and select a user.
2. Click the Security credentials tab.
3. Create an access key or use an existing one. There is a limit of two access keys per user.
To create a new user, go to the IAM service:
1. In the left pane, click Users, then click Add user.
2. Enter a user name, select Programmatic access, and click Next: Permissions.
3. For Set permissions, keep the default "Add user to group".
4. In Add user to group, select all the groups, then click Next: Review.
5. Review your choices and click Create user.
6. If your user has been created successfully, you will see a confirmation similar to the image below.
Once you are done with these steps, you are ready to launch Cloudbreak.

Cloudbreak from quickstart template
Follow the documentation provided here to install Cloudbreak. Next go to Cloudbreak on AWS and go straight to "Log into the Cloudbreak application".
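If you prefer the command line over the console, the key pair and access key steps above can also be done with the AWS CLI. This is a rough sketch, assuming AWS CLI v2 is installed and configured, and using a hypothetical key name cb-key, region us-west-2, and IAM user cb-user:

    # Import an existing public key as an EC2 key pair in the region you will use (AWS CLI v2)
    aws ec2 import-key-pair --key-name cb-key \
        --public-key-material fileb://$HOME/.ssh/id_rsa.pub --region us-west-2

    # Create an access key / secret key pair for an existing IAM user
    # (remember the limit of two access keys per user)
    aws iam create-access-key --user-name cb-user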
07-23-2018
03:15 PM
3 Kudos
NiFi flow for writing to S3, WASB, and Google Storage. Run the flow and watch as the Twitter messages are captured and then aggregated before being put into storage.
Azure Storage: Go to your Azure portal and look in the container; you should see the aggregated messages organized by year/month/day.
Google Storage: Open Google Cloud Platform and go to your Storage service. Google Storage will now contain the aggregated messages organized by year/month/day.
AWS Storage: The S3 bucket in your AWS account will now have the aggregated Twitter messages organized by year/month/day.
Now let's see what's happening here. I will only focus on the three most important processors, as the others make up the simple flow. The entire flow template is available as an XML file that you can download: nificloudstorage.xml
PutAzureStorage processor: In Azure, create a Storage Account and get the Storage Account name and key, as shown in this screenshot. These are needed in the processor's properties.
PutS3Object processor: From the AWS dashboard, go to Users, pick your user, and click on Security Credentials. If you have not saved the Secret Access Key, use the Create Access Key button to generate it again. There is a limit of only two keys.
PutGCSObject processor: Setting up GCS credentials is slightly different; a Controller Service is used. Click on the arrow in GCPCredentialsControllerService, which takes you to the next screenshot. Under Controller Services, click on the gear icon to get to the properties and use the JSON file created from your GCS credentials. You can follow the article "Creating GCS credentials" to find out how to get this JSON. Click on the lightning icon to enable the Controller Service.
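To verify that the flow is landing data where you expect, you can list the objects from the command line. This is just a quick check, assuming the respective CLIs are installed and authenticated, and using a hypothetical bucket/container name twitter-agg and storage account mystorageacct:

    # AWS S3: list the aggregated Twitter messages
    aws s3 ls s3://twitter-agg/ --recursive

    # Google Cloud Storage: list objects recursively
    gsutil ls -r gs://twitter-agg/

    # Azure Blob Storage: list blobs in the container (requires az login or an account key)
    az storage blob list --container-name twitter-agg --account-name mystorageacct -o table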
07-21-2018
11:29 PM
This could also be the solution for this "hive script does not exist" question: https://community.hortonworks.com/questions/25588/hive-script-error-script-does-not-exists-error.html
07-21-2018
11:27 PM
1 Kudo
Environment: HDP on AWS EC2 instances. Problem: When trying to execute a Hive DDL script, we get a "file does not exist" error, even though the file exists and has read/write privileges. Root cause: The directory where the script resides does not grant the user the necessary permissions. Solution: Open up the directory permissions to give the user the required access. The following screenshot shows the problem and the solution used to fix it.
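As a generic illustration (the directory and script names here are hypothetical), the check and fix from the shell look like this:

    # Check the current permissions on the directory holding the DDL script
    ls -ld /home/ec2-user/scripts

    # Give other users read and execute access so they can reach the script
    chmod o+rx /home/ec2-user/scripts

    # Re-run the script
    hive -f /home/ec2-user/scripts/create_tables.hql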
07-20-2018
07:20 PM
5 Kudos
Use S3 as storage for Zeppelin notebooks.

Step 1. Use external storage to point to an S3 bucket in the Cloudbreak advanced options. This uses an S3 access instance profile and AWS credentials; Cloudbreak takes care of that setup.
Step 2. In addition, change these three settings in zeppelin-env.sh:
export ZEPPELIN_NOTEBOOK_S3_BUCKET=yourBucketName
export ZEPPELIN_NOTEBOOK_S3_ENDPOINT="http://s3.amazonaws.com/yourBucketName"
export ZEPPELIN_NOTEBOOK_S3_USER=admin
Step 3. Change one property in zeppelin-site.xml: point zeppelin.notebook.storage to org.apache.zeppelin.notebook.repo.S3NotebookRepo.

Detailed steps below:
1. Complete the AWS prerequisites.
2. Create AWS credentials in Cloudbreak.
3. Launch an HDP cluster on AWS using Cloudbreak.
4. Enable Zeppelin storage on S3.

3. Launch an HDP cluster on AWS using Cloudbreak.
Not all the screenshots are included; only the screenshots focusing on the key advanced features that enable the required Zeppelin storage are captured.
a. Use the Advanced tab in Cloudbreak.
b. Cloudbreak uses AWS credentials that provide the necessary AWS Access Key and Secret Access Key for the S3 storage setup.
a. Provide an instance profile created in AWS that has access to your S3 bucket.
b. Provide your bucket name for base storage.

4. Enable Zeppelin storage on S3.
Once Ambari starts and all services are started, we need to make some configuration changes to enable Zeppelin. In zeppelin-config, change the following properties:
zeppelin_notebook.s3.bucket
zeppelin_notebook.s3.user
zeppelin_notebook.storage
Or you could change them in zeppelin-env:
export ZEPPELIN_NOTEBOOK_S3_BUCKET=bucketName
export ZEPPELIN_NOTEBOOK_S3_ENDPOINT="http://s3.amazonaws.com/bucketName"
export ZEPPELIN_NOTEBOOK_S3_USER=admin
Here is an example path: bucket/user/notebook/2A94M5J1Z/note.json
Now when you create and save notebooks in Zeppelin, they will be saved to S3. You will be able to see the notebooks in the AWS portal, in your S3 bucket. Zeppelin uses a 9-character hash as the name of each notebook folder, with a note.json file inside that folder.
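As an optional sanity check (assuming the AWS CLI is available on a node or your workstation, and using the same yourBucketName and admin user as above), you can list the stored notebooks directly:

    # List Zeppelin notebooks saved under the configured S3 user prefix
    aws s3 ls s3://yourBucketName/admin/notebook/ --recursive
    # Expect keys like admin/notebook/2A94M5J1Z/note.json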
07-06-2018
10:34 PM
2 Kudos
Launching Cloudbreak on GCP

Meet the Prerequisites
Before launching Cloudbreak on GCP, you must meet the following prerequisites: GCP Account, Service Account, SSH Key Pair, Region and zone.

GCP account
In order to launch Cloudbreak on GCP, you must log in to your GCP account. If you don't have an account, you can create one at https://console.cloud.google.com. Once you log in, you must either create a project or use an existing project. To create a new project, provide a name and choose an organization, or leave it as No Organization. Then select the newly created project. On the main dashboard page you will find the Project ID; you will need this to define your credential in Cloudbreak in a later step.

GCP - APIs & Services dashboard
Go to the APIs & Services dashboard by (1) clicking the menu in the top left, (2) hovering over APIs & Services, and (3) clicking Dashboard. Verify that the Google Compute Engine API is listed and enabled. If it is not, click the Enable APIs button to search for and enable it.

Service account
Go to the Service Accounts screen by (1) clicking the menu in the top left, (2) hovering over IAM & Admin, and (3) clicking Service Accounts. Then:
1. Click "Create Service Account".
2. Give the service account a name.
3. Check the "Furnish a new key" box. This will download a key to your computer when you finish creating the account. If you are using Cloudbreak 2.7 or later, select the JSON format key.
4. Click the "Select a Role" dropdown and select the required Compute Engine roles: Compute Image User, Compute Instance Admin (v1), Compute Network Admin, Compute Security Admin, Compute Storage Admin.
5. Select the Storage Admin role under Storage.
6. Click outside of the role selection dropdown to reveal the "Create" button. All six of the roles shown are required for the service account.

Access to Google Storage
If you also want to be able to use GCP storage, you need to add one more role to the service account: "Service Account User", which you can find under Service Accounts. You should now have the following roles.

SSH key pair
Generate a new SSH key pair or use an existing one; you will be required to provide it when launching the VM. On Linux or macOS workstations, you can generate a key with the ssh-keygen tool: open a terminal and run ssh-keygen to generate a new key (see the example command at the end of this post). This generates a private SSH key file and a matching public SSH key with the following structure:
ssh-rsa [KEY_VALUE] [USERNAME]
where [KEY_VALUE] is the key value that you generated and [USERNAME] is the user that this key applies to.

Editing public SSH key metadata
Add or remove project-wide public SSH keys from the GCP Console. In the Google Cloud Platform Console, go to the metadata page for your project: click (1) the menu, (2) Compute Engine, (3) Metadata, (4) the SSH Keys tab, then click Edit, Add item, and then Save. To modify the project-wide public SSH keys:
To add a public SSH key, click Add item at the bottom of the page. This produces a text box; copy the contents of your public SSH key file and paste them into the text box. Repeat this process for each public SSH key that you want to add.
To remove a public SSH key, click the removal button next to it.

Region and zone
Decide in which region and zone you would like to launch Cloudbreak. You can launch Cloudbreak and provision your clusters in all regions supported by GCP. Clusters created via Cloudbreak can be in the same or a different region as Cloudbreak; when you launch a cluster, you select the region in which to launch it.
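For the SSH key pair section above, a typical ssh-keygen invocation looks like the following. The file name and comment are placeholders; substitute your own:

    # Generate a new RSA key pair for use with GCP
    ssh-keygen -t rsa -b 4096 -f ~/.ssh/gcp-cloudbreak -C your_username
    # The public key to paste into the GCP metadata page
    cat ~/.ssh/gcp-cloudbreak.pub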