StreamSets setup on CDH 5.9.1

New Contributor

Hi,

 

We have installed and configured CDH 5.9.1 using Cloudera Director. We are trying to deploy StreamSets Data Collector 2.2 as a cluster service using a parcel. I was able to download, distribute, and activate the parcel from Cloudera Manager. However, when I tried to add the service, I could not find an option to choose StreamSets Data Collector. I also tried StreamSets version 2.1, with no luck.

 

Your help will be appreciated. 

 

Thanks

SP

Accepted Solution

Re: StreamSets setup on CDH 5.9.1

Contributor

I'm not sure if there's official support with Director, but it can be done using the conf file with the help of a bootstrap script:

 

Specify a bootstrap script (for the CM instance only) that downloads the CSD JAR and places it in the CSD directory, /opt/cloudera/csd.

 

instances {

  cminstance {
    type: m4.xlarge
    image: ami-ac5f2fcc

    tags {
      owner: ${?USER}
    }

    bootstrapScript: """#!/bin/sh
yum -y install wget
wget https://archives.streamsets.com/datacollector/2.3.0.0/csd/STREAMSETS-2.3.0.0.jar
mkdir -p /opt/cloudera/csd
mv STREAMSETS-2.3.0.0.jar /opt/cloudera/csd/
"""
  }

  ...
}

You also have to specify the product, service, and role names, along with the parcel repository URL, in the conf file. The following worked for me (I went through a manual install to get these values):

 

cluster {

  # add the StreamSets Data Collector product
  products {
    CDH: 5
    STREAMSETS_DATACOLLECTOR: 2.3
  }


  # add the StreamSets parcel repository
  parcelRepositories: ["http://archive.cloudera.com/cdh5/parcels/5.9/",
                       "https://archives.streamsets.com/datacollector/latest/parcel/"]

  # add the service
  services: [HDFS, YARN, STREAMSETS,...]

  ...

  workers {
    ...

    # add the Data Collector role to the StreamSets service
    roles {
      HDFS: [DATANODE]
      YARN: [NODEMANAGER]
      STREAMSETS: [DATACOLLECTOR]
      ...
    }
  }
}

 

Re: StreamSets setup on CDH 5.9.1

New Contributor

Many thanks, AARMAN! It worked.

Re: StreamSets setup on CDH 5.9.1

Explorer

Hello @aarman,
could you please give me further assistance with the config files you mentioned?
I have an issue similar to @SAKTIPADA's.
I successfully installed and distributed the StreamSets Data Collector 3.5.0 parcel with Cloudera Manager on our CDH 5.14.1 cluster, but I cannot add the service in CM because there is no StreamSets option.

 

Thanks. 

Re: StreamSets setup on CDH 5.9.1

Explorer

My fault. I forgot to add the Custom Service Descriptor (CSD) to /opt/cloudera/csd/.
After restarting the Cloudera Manager server (service cloudera-scm-server restart), I was able to find StreamSets on the Add Service page.
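
For anyone hitting the same problem on a manually managed Cloudera Manager host, the fix described in this reply can be sketched roughly as the shell helper below. The function name install_csd and its arguments are hypothetical, chosen for illustration; the JAR path in the usage comment is a placeholder for whichever CSD file matches your Data Collector version. The 644 permission follows the usual Cloudera guidance for CSD files as far as I recall, so treat it as an assumption and check it against the documentation for your CM release.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cloudera Manager): copy a downloaded CSD
# JAR into the directory that Cloudera Manager scans for service descriptors.
#   $1 = path to the downloaded CSD JAR
#   $2 = CSD directory (defaults to /opt/cloudera/csd, the path used in this thread)
install_csd() {
  jar="$1"
  csd_dir="${2:-/opt/cloudera/csd}"

  # Refuse to continue if the JAR was never downloaded.
  [ -f "$jar" ] || { echo "CSD jar not found: $jar" >&2; return 1; }

  mkdir -p "$csd_dir" || return 1
  cp "$jar" "$csd_dir/" || return 1

  # Assumption: the CSD only needs to be world-readable.
  chmod 644 "$csd_dir/$(basename "$jar")"
}

# Example, run as root on the Cloudera Manager host:
#   install_csd /tmp/STREAMSETS-<version>.jar
#   service cloudera-scm-server restart
```

The restart is left out of the function on purpose, since the server only picks up a new CSD after it restarts and you may want to schedule that separately.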

Re: StreamSets setup on CDH 5.9.1

Contributor

@Baris Glad you got it working. With Director 2.4 and later, you don't need the bootstrap script from my previous example. You can specify the CSD URL in the conf file, and Director will automatically download the CSD and place it in /opt/cloudera/csd/.

 

See the documentation here: https://www.cloudera.com/documentation/director/latest/topics/director_non-cdh_products_custom_descr...
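
As a rough illustration of the approach described in this reply, the conf fragment below lists the CSD URL from the earlier bootstrap example instead of downloading it in a script. The property name csds and its placement inside the cloudera-manager section are my recollection of the Director reference conf files, not something stated in this thread, so treat both as assumptions and verify them against the linked documentation for your Director version.

```hocon
cloudera-manager {

  # other Cloudera Manager settings omitted

  # Assumed property name: CSD JARs for Director to download onto the
  # Cloudera Manager instance (replaces the wget/mv bootstrap script)
  csds: [
    "https://archives.streamsets.com/datacollector/2.3.0.0/csd/STREAMSETS-2.3.0.0.jar"
  ]
}
```

The products, parcelRepositories, services, and roles entries from the accepted solution would still be needed in the cluster section; only the bootstrap script is replaced.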