Cloudera Employee

Custom parcels and Custom Service Descriptors (CSDs) allow partners and customers to bring their own libraries and packages to CDP. Traditionally, customers used recipes to install parcels and CSDs, or copied them manually to the appropriate nodes. CDP Public Cloud calls for a more modular approach, so Cloudera has developed the capability to specify the parcel and CSD information as part of the cluster template (blueprint). With this approach, the parcel and CSD are downloaded, distributed, and activated as part of the cluster build process.

In the example below, FLINK 1.12 would be installed by default, but a different version of the FLINK parcel and CSD is passed for Cloudera Manager to download, distribute, and activate. The FLINK service is added to the cluster during the cluster import.

The high-level steps are as follows:

  • Modify the cluster template to define the role type, service type, and host group configurations.
  • Get the image catalog ID.
  • Create a custom request template and add the parcels and CSDs (passed as a CLI parameter).
  • Create or update the manifest JSON.
  • Provision the Data Hub (DH) cluster using the CDP CLI.

Modify Cluster Template

CDP UI > Management Console > Shared Resources > Create Cluster Template.

For example:

{
  "cdhVersion": "7.2.12",
  "displayName": "dataengineering",
  "blueprintUpgradeOption": "MAINTENANCE_UPGRADE_GA",
  "services": [
    {
      "refName": "$partner_solution",
      "serviceType": "$custom_solution_name",
      "serviceConfigs": [
        {
          "name": "partner_solution_license_specification",
          "value": "XXXXXXXXX"
        },
        {
          "name": "$partner_solution_dashboard_realm",
          "value": "cloudera-cdp-test-1"
        }
      ],
      "roleConfigGroups": [
        {
          "refName": "$partner_solution-AGENT-BASE",
          "roleType": "$partner_solution_AGENT",
          "base": true
        },
        {
          "refName": "$partner_solution-SUPERVISOR-BASE",
          "roleType": "$partner_solution_SUPERVISOR",
          "base": true
        },
        {
          "refName": "$partner_solution-COLLECTOR-BASE",
          "roleType": "$partner_solution_COLLECTOR",
          "base": true
        }
      ]
    }
  ],
  "hostTemplates": [
    {
      "refName": "worker",
      "cardinality": 3,
      "roleConfigGroupsRefNames": [
        "hdfs-DATANODE-BASE",
        "yarn-NODEMANAGER-WORKER",
        "$partner_solution-AGENT-BASE",
        "$partner_solution-COLLECTOR-BASE"
      ]
    }
  ]
}

This excerpt shows only the partner service and the worker host group; the rest of the template is omitted. Modify the compute, master, worker, and gateway host groups as needed.

Note the cluster template name (for example, 7.2.12 Data Engineering: Customer template).

Get the Image Catalog ID

  1. Get the AMI ID of the Data Lake (DL). For a new or recently created environment, log in to CDP and go to Data Lake > Image Details. If the Data Lake was created a while ago, create a temporary Data Hub and use its AMI ID instead.
  2. Open the image catalog JSON.
  3. Search for the AMI ID in the image catalog JSON.
  4. Note the UUID of the matching image. Make sure the image is for the right region and cloud provider (AWS/Azure).
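
Steps 2 through 4 can also be scripted. The following is a minimal Python sketch, not an official tool. It assumes the image catalog has been downloaded as JSON and that each image entry is an object with a "uuid" field whose nested values include the AMI ID as a plain string. The function names and the trimmed sample catalog are hypothetical.

```python
import json
from typing import Any, Iterator


def contains_value(node: Any, target: str) -> bool:
    """Return True if `target` appears as a string value anywhere under `node`."""
    if isinstance(node, str):
        return node == target
    if isinstance(node, dict):
        return any(contains_value(v, target) for v in node.values())
    if isinstance(node, list):
        return any(contains_value(v, target) for v in node)
    return False


def find_image_uuids(catalog: Any, ami_id: str) -> Iterator[str]:
    """Yield the "uuid" of every object in the catalog that references `ami_id`."""
    if isinstance(catalog, dict):
        if "uuid" in catalog and contains_value(catalog, ami_id):
            yield catalog["uuid"]
            return  # do not descend further into a matched image entry
        for value in catalog.values():
            yield from find_image_uuids(value, ami_id)
    elif isinstance(catalog, list):
        for item in catalog:
            yield from find_image_uuids(item, ami_id)


# Hypothetical, heavily trimmed catalog used only to exercise the function.
sample_catalog = json.loads("""
{
  "images": {
    "cdh-images": [
      {"uuid": "11111111-aaaa", "images": {"aws": {"us-west-1": "ami-0aaa"}}},
      {"uuid": "22222222-bbbb", "images": {"aws": {"us-west-1": "ami-0bbb"}}}
    ]
  }
}
""")

matches = list(find_image_uuids(sample_catalog, "ami-0bbb"))
print(matches)
```

In practice, load the real catalog with json.load from the file opened in step 2, and still confirm the region and cloud provider by hand as described in step 4.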

Create a Custom Request Template

Add the environment name, which is where the cluster will be deployed, and the blueprint name from the first step (7.2.12 Data Engineering: Customer template):

"name": "DEFINITION_NAME",
"environmentName": "CDP_ENV_NAME",
"cluster": {
  "databases": [],
  "blueprintName": "7.2.12 Data Engineering: Customer template"


Add the Cloudera Manager (CM) base URL and, for each product, its name, version, and parcel location.

CM:

"cm": {
  "enableAutoTls": true,
  "repository": {
    "baseUrl": "http://cloudera-build-us-west-1.vpc.cloudera.com/cdh/7.x/parcels/",
    "version": "7.2.2"
  },


Add the products: CDH, FLINK, and any custom parcels and CSDs.

  "products": [
    {
      "name": "CDH",
      "version": "7.2.2-1.cdh7.2.2.p1.6575992",
      "parcel": "https://archive.cloudera.com/p/cdp-public/7.2.2.1/parcels/"
    },
    {
      "name": "FLINK",
      "version": "1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562",
      "parcel": "http://XXXXXXXXXXXXX/parcels/",
      "csd": [
        "http://XXXXXXXXXX/parcels/FLINK-1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562.jar"
      ]
    }
  ]
}


Add the image ID at the end.

"image": {
  "catalog": "cdp-default",
  "id": "eb1d458a-bf82-434b-bc2a-ac06b8cd6c0a"
}


A sample request template:

{
  "instanceGroups": [
    {
      "nodeCount": 1,
      "name": "manager",
      "type": "GATEWAY",
      "recoveryMode": "MANUAL",
      "template": {
        "aws": {
          "encryption": {
            "type": "NONE",
            "key": null
          }
        },
        "instanceType": "m5.2xlarge",
        "rootVolume": {
          "size": 50
        },
        "attachedVolumes": [
          {
            "size": 100,
            "count": 1,
            "type": "standard"
          }
        ],
        "cloudPlatform": "AWS"
      },
      "recipeNames": []
    },
    {
      "nodeCount": 2,
      "name": "master",
      "type": "CORE",
      "recoveryMode": "MANUAL",
      "template": {
        "aws": {
          "encryption": {
            "type": "NONE",
            "key": null
          }
        },
        "instanceType": "m5.2xlarge",
        "rootVolume": {
          "size": 50
        },
        "attachedVolumes": [
          {
            "size": 100,
            "count": 1,
            "type": "standard"
          }
        ],
        "cloudPlatform": "AWS"
      },
      "recipeNames": []
    },
    {
      "nodeCount": 3,
      "name": "worker",
      "type": "CORE",
      "recoveryMode": "MANUAL",
      "template": {
        "aws": {
          "encryption": {
            "type": "NONE",
            "key": null
          }
        },
        "instanceType": "m5.2xlarge",
        "rootVolume": {
          "size": 50
        },
        "attachedVolumes": [
          {
            "size": 100,
            "count": 1,
            "type": "standard"
          }
        ],
        "cloudPlatform": "AWS"
      },
      "recipeNames": []
    }
  ],
  "name": "pse-flink-cust",
  "environmentName": "pse-722-cdp-env",
  "cluster": {
    "databases": [],
    "exposedServices": [
      "ALL"
    ],
    "blueprintName": "7.2.2 - Streaming Analytics Light Duty with Apache Flink",
    "validateBlueprint": false,
    "cm": {
        "enableAutoTls": true,
        "repository": {
          "baseUrl": "http://cloudera-build-us-west-1.vpc.cloudera.com/s3/build/4763198/cdh/7.x/parcels/",
          "version": "7.2.2"
        },
        "products": [
          {
            "name": "CDH",
            "version": "7.2.2-1.cdh7.2.2.p1.6575992",
            "parcel": "https://archive.cloudera.com/p/cdp-public/7.2.2.1/parcels/"
          },
          {
            "name": "FLINK",
            "version": "1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562",
            "parcel": "http://54.153.84.214/parcels/",
            "csd": [
              "http://54.153.84.214/parcels/FLINK-1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562.jar"
            ]
          }
        ]
    },
    "Xcm": {
        "enableAutoTls": true,
        "repository": {
          "baseUrl": "https://archive.cloudera.com/p/cm-public/7.2.2-6458542/redhat7/yum/",
          "version": "7.2.2"
        },
        "products": [
          {
            "name": "CDH",
            "version": "7.2.2-1.cdh7.2.2.p1.6575992",
            "parcel": "https://archive.cloudera.com/p/cdp-public/7.2.2.1/parcels/"
          }
        ]
    }
  },
  "image": {
      "catalog": "cdp-default",
      "id": "eb1d458a-bf82-434b-bc2a-ac06b8cd6c0a"
  },
  "inputs": {}
}
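
Before submitting a request template like the one above, a quick structural check can catch typos early. The sketch below is an illustration only. The function name check_products and the rules it enforces (each product has a name, version, and parcel; each csd entry ends in .jar) are assumptions drawn from the sample, not a documented schema.

```python
import json
from typing import List


def check_products(template: dict) -> List[str]:
    """Return a list of human-readable problems found in cluster.cm.products."""
    problems = []
    products = template.get("cluster", {}).get("cm", {}).get("products", [])
    if not products:
        problems.append("cluster.cm.products is missing or empty")
    for index, product in enumerate(products):
        label = product.get("name", f"product #{index}")
        for key in ("name", "version", "parcel"):
            if not product.get(key):
                problems.append(f"{label}: missing '{key}'")
        for url in product.get("csd", []):
            if not url.endswith(".jar"):
                problems.append(f"{label}: CSD URL does not end in .jar: {url}")
    return problems


# Trimmed copy of the products section from the sample request template.
template = json.loads("""
{ "cluster": { "cm": { "products": [
    {"name": "CDH", "version": "7.2.2-1.cdh7.2.2.p1.6575992",
     "parcel": "https://archive.cloudera.com/p/cdp-public/7.2.2.1/parcels/"},
    {"name": "FLINK", "version": "1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562",
     "parcel": "http://XXXXXXXXXXXXX/parcels/",
     "csd": ["http://XXXXXXXXXX/parcels/FLINK-1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562.jar"]}
] } } }
""")

print(check_products(template))
```

Parsing the file with json.loads is useful on its own: it rejects curly quotes and unbalanced brackets, both of which are easy to introduce when copying fragments from a web page.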

 

 

 

 

Provision the Cluster Using the CDP CLI

$ cdp datahub --profile pse-demo create-aws-cluster --request-template file:///Users/njlamsal/flink-pse-demo.json --cluster-name Custom-flink-1215

profile: The CDP CLI profile to use. Configure the profile with cdpcli beforehand.

request-template: The request template created above. Note the file:// prefix for a local file.

cluster-name: The name of the cluster.
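
If the command is assembled by a script, the file:// prefix is easy to get wrong. The following Python sketch builds the same argument list as the command above. The helper name build_create_command is hypothetical, and it uses only the flags that appear in that command.

```python
from typing import List


def build_create_command(profile: str, template_path: str, cluster_name: str) -> List[str]:
    """Build the argument list for the `cdp datahub create-aws-cluster` call shown above."""
    # An absolute path such as /Users/... becomes file:///Users/... (three slashes).
    if not template_path.startswith("/"):
        raise ValueError("template_path must be an absolute path")
    template_uri = "file://" + template_path
    return [
        "cdp", "datahub", "--profile", profile,
        "create-aws-cluster",
        "--request-template", template_uri,
        "--cluster-name", cluster_name,
    ]


command = build_create_command(
    "pse-demo", "/Users/njlamsal/flink-pse-demo.json", "Custom-flink-1215"
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where the CDP CLI is installed and the profile is configured.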

 

NOTE: Manifest File And Naming Of The Parcel

Parcel filenames must follow a specific format: [name]-[version]-[distro suffix].parcel. The list of valid distro suffixes is available here. A parcel with an improperly formatted name is not recognized during cluster installation.
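
A filename can be checked against this format before it is uploaded. The sketch below is a rough illustration under stated assumptions: the name contains no hyphen, the version may contain hyphens, and the distro suffix is the last hyphen-separated field. The set of suffixes is supplied by the caller, and the two suffixes used in the example are an illustrative subset, not the full list.

```python
import re
from typing import Iterable, Optional, Tuple

# [name]-[version]-[distro suffix].parcel
# Assumption: the name has no hyphen, the version may contain hyphens,
# and the distro suffix is the last hyphen-separated field.
PARCEL_RE = re.compile(r"^(?P<name>[^-]+)-(?P<version>.+)-(?P<distro>[^-]+)\.parcel$")


def parse_parcel_filename(
    filename: str, valid_suffixes: Iterable[str]
) -> Optional[Tuple[str, str, str]]:
    """Return (name, version, distro) if the filename fits the format, else None."""
    match = PARCEL_RE.match(filename)
    if match is None or match.group("distro") not in set(valid_suffixes):
        return None
    return match.group("name"), match.group("version"), match.group("distro")


suffixes = {"el7", "el8"}  # illustrative subset, not the full list
good = parse_parcel_filename(
    "FLINK-1.10.0-csa1.2.1.0-cdh7.2.1.0-240-4844562-el7.parcel", suffixes
)
bad = parse_parcel_filename("FLINK_1.10.0.parcel", suffixes)
print(good)
print(bad)
```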

It is also necessary to have a manifest file in the parcel directory. The manifest creator will create a manifest.json file for a directory of parcels, so that directory can be served as a parcel repository.
Manifest Creator : https://github.com/cloudera/cm_ext/blob/master/make_manifest/README.md
Cloudera Schema/Parcel Validator : https://github.com/cloudera/cm_ext/blob/master/validator/README.md
Cloudera Manager Extensions Wiki : https://github.com/cloudera/cm_ext/wiki
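
Once the manifest has been generated, it is worth confirming that it lists every parcel in the directory. The following Python sketch assumes that manifest.json has a top-level "parcels" list whose entries carry a "parcelName" field; verify that layout against the output of the manifest creator before relying on it. The function name and the demo file names are made up.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def parcels_missing_from_manifest(repo_dir: Path) -> List[str]:
    """List .parcel files in repo_dir that manifest.json does not mention."""
    manifest = json.loads((repo_dir / "manifest.json").read_text())
    listed = {entry.get("parcelName") for entry in manifest.get("parcels", [])}
    on_disk = sorted(p.name for p in repo_dir.glob("*.parcel"))
    return [name for name in on_disk if name not in listed]


# Self-contained demonstration in a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "DEMO-1.0-el7.parcel").write_bytes(b"")
    (repo / "DEMO-1.0-el8.parcel").write_bytes(b"")
    (repo / "manifest.json").write_text(
        json.dumps({"parcels": [{"parcelName": "DEMO-1.0-el7.parcel"}]})
    )
    missing = parcels_missing_from_manifest(repo)

print(missing)
```

An empty result means every parcel on disk is listed; any name that is returned points to a manifest that needs to be regenerated.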

 

 

ERROR(s):

- An error occurred: {"message":"Image or image catalog settings are incorrect: CloudbreakImageCatalogException: Inconsistent request, base images are disabled but custom repo information is submitted!"} (Status Code: 400; Error Code: INVALID_ARGUMENT; Service: datahub; Operation: createAWSCluster; Request ID: a7feb7a9-5547-43b2-8cfd-9543067677cc;)

Check the image IDs of the Data Lake (DL) and the Data Hub (DH). If they differ, take the DH AMI ID and look up its catalog ID (UUID) in the image catalog.

 

- Failed to deploy parcels (see the section above on the manifest file and the naming of the parcel).

 

 

 

Comments
Explorer

Hi. Thanks for this article. Which version of CDP Platform is this feature available in?

Cloudera Employee

@FRG96  CDP runtime 7.2.2

Explorer

@nij23, thanks. I created a cluster definition on CDP 7.2.6 (the latest available version) following the steps above and was able to create a Data Hub with my custom CSD and parcels distributed and activated. I used the CDP CLI to create the Data Hub cluster successfully.

I wanted to save this definition for use from the Management Console too, so I used the "cdp datahub create-cluster-definition" command. But when I viewed the cluster definition JSON in the Management Console (Shared Resources section), the "enableAutoTls" property was set to false even though I had explicitly set it to true in my JSON file.
When I tried creating a cluster with this definition, I wasn't able to access the CM UI or other service UIs after the cluster was created and in a running state. I just wanted to point out that I can create the cluster via cdpcli, but via the Management Console UI it fails. Any suggestions on this?

Additionally, can you point me to any documentation for the model/schema of the CDP cluster definition?

Cloudera Employee

@FRG96 You will need to adjust the user's permissions to access the UI. Access is blocked by default.

Review account-level roles and resource roles here: https://docs.cloudera.com/management-console/cloud/user-management/topics/mc-understanding-roles-res...

After adding the user to an appropriate admin role, sync the user to FreeIPA. More on that here: https://docs.cloudera.com/management-console/cloud/user-management/topics/mc-sync.html

 

Explorer

@nij23  I have the correct account and resource role in my environment. I restarted the cluster created from the custom cluster definition and when I opened the CM-UI I got this message:

The error text, for reference:

HTTP ERROR 500 javax.servlet.ServletException: javax.servlet.ServletException: java.lang.NullPointerException: Custom client SSL context, if set, must not be null.
URI: /test5/cdp-proxy/cmf/home/
STATUS: 500
MESSAGE: javax.servlet.ServletException: javax.servlet.ServletException: java.lang.NullPointerException: Custom client SSL context, if set, must not be null.
SERVLET: cdp-proxy-knox-gateway-servlet
CAUSED BY: javax.servlet.ServletException: javax.servlet.ServletException: java.lang.NullPointerException: Custom client SSL context, if set, must not be null.
CAUSED BY: javax.servlet.ServletException: java.lang.NullPointerException: Custom client SSL context, if set, must not be null.
CAUSED BY: java.lang.NullPointerException: Custom client SSL context, if set, must not be null.

 

 

When I try to open the CM-UI the second time and in any subsequent tries, I get the 403 Forbidden message. I see the 500 Servlet Exception only the first time when I try to access CM-UI.

 

Could it be because the "enableAutoTls" property was automatically set to false when I registered the custom cluster definition even though I had explicitly set it to true in my cluster definition JSON?