Created 02-07-2017 01:18 AM
Hi,
We have installed and configured CDH 5.9.1 using Cloudera Director. We are trying to deploy StreamSets Data Collector 2.2 as a cluster service using a parcel. I could download, distribute, and activate the parcel from Cloudera Manager. However, when I tried to add the service, I could not find any option to choose StreamSets Data Collector. I also tried StreamSets version 2.1, but no luck.
Your help will be appreciated.
Thanks
SP
Created 02-09-2017 03:16 PM
I'm not sure if there's official support with Director, but it can be done using the conf file with the help of a bootstrap script:
Specify a bootstrap script (only for the CM instance) to download and place the csd jar in the appropriate csd directory.
instances {
  cminstance {
    type: m4.xlarge
    image: ami-ac5f2fcc
    tags {
      owner: ${?USER}
    }
    bootstrapScript: """#!/bin/sh
yum -y install wget
wget https://archives.streamsets.com/datacollector/2.3.0.0/csd/STREAMSETS-2.3.0.0.jar
mkdir -p /opt/cloudera/csd
mv STREAMSETS-2.3.0.0.jar /opt/cloudera/csd/
"""
  }
  ...
}
You also have to specify the Product, Service, and Role name along with the Parcel Repository URL in the conf file. The following worked for me (I went through a manual install to get these values):
cluster {
  # add the streamset data collector product
  products {
    CDH: 5
    STREAMSETS_DATACOLLECTOR: 2.3
  }
  # add the streamset parcel repository
  parcelRepositories: ["http://archive.cloudera.com/cdh5/parcels/5.9/",
                       "https://archives.streamsets.com/datacollector/latest/parcel/"]
  # add the service
  services: [HDFS, YARN, STREAMSETS,...]
  ...
  workers {
    ...
    # add the data collector role to the streamset service
    roles {
      HDFS: [DATANODE]
      YARN: [NODEMANAGER]
      STREAMSETS: [DATACOLLECTOR]
      ...
    }
  }
}
Created 03-02-2017 07:48 PM
Created 10-08-2018 04:43 AM
Hello @aarman,
could you please give me further assistance regarding the config files you mentioned?
I have a similar issue to @SAKTIPADA.
I successfully installed and distributed the StreamSets Data Collector 3.5.0 parcel with Cloudera Manager on our CDH 5.14.1 cluster, but I am not able to add the service in CM because there is no StreamSets option.
Thanks.
Created on 10-08-2018 06:15 AM - edited 10-08-2018 06:17 AM
My fault. I forgot to add the Custom Service Descriptor (CSD) to /opt/cloudera/csd/.
After restarting the Cloudera Manager server with: service cloudera-scm-server restart, I was able to find StreamSets on the Add Service page.
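For anyone hitting the same thing, here is a rough sketch of the manual steps, run as root on the Cloudera Manager host. The helper names csd_url and install_csd are made up for illustration, and the download URL pattern is only an assumption based on the 2.3.0.0 link posted earlier in this thread, so check that the jar for your version actually exists at that address before relying on it.

```shell
#!/bin/sh
# Sketch only: csd_url and install_csd are hypothetical helper names.
# The URL pattern is assumed from the 2.3.0.0 link earlier in the thread;
# verify it for your StreamSets version.

# Print the assumed CSD download URL for a given StreamSets version.
csd_url() {
  printf 'https://archives.streamsets.com/datacollector/%s/csd/STREAMSETS-%s.jar\n' "$1" "$1"
}

# Download the CSD jar into the CSD directory (default /opt/cloudera/csd).
install_csd() {
  version="$1"
  csd_dir="${2:-/opt/cloudera/csd}"
  mkdir -p "$csd_dir" &&
    wget -O "$csd_dir/STREAMSETS-$version.jar" "$(csd_url "$version")"
}

# Usage (as root on the Cloudera Manager host):
#   install_csd 3.5.0 && service cloudera-scm-server restart
```

Once the server is back up, StreamSets should be listed when adding a service, provided the parcel is already distributed and activated.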
Created 10-08-2018 03:50 PM
@Baris Glad you got it working. With Director version 2.4+ you actually don't need to use a bootstrap script as shown in my previous example. You can just specify the CSD URL in the conf file and Director will automatically download it and place it in '/opt/cloudera/csd/'.
See the documentation here: https://www.cloudera.com/documentation/director/latest/topics/director_non-cdh_products_custom_descr...