Kafka connect-distributed.sh auto start (re-post)

Hi,

I posed this question back in February of this year, before I knew Cloudera had bought Hortonworks.

Original post:

-----------------------------------------------------------------------------------------------------------------------------------------------------

I have successfully installed and set up Kafka (KAFKA-3.1.1-1.3.1.1.p0.2) in Cloudera Manager (Cloudera Enterprise 5.14.3). I have also configured and set up a Splunk connector to allow Splunk to consume Cloudera Audit data.

However, I have to manually launch the connect-distributed.sh script and register the Splunk Sink connector if something fails. If the server is restarted, I have to log in to the server and manually run the 2 commands (curl) to get the distributed service (or maybe I should call it a role) running and to register it with the Splunk service.

Is there a way to run scripts automatically when Cloudera Manager is used to restart Kafka?

If not, I'm thinking I will create a Python-based framework that runs in cron, checks the health of the connect-distributed.sh service, and re-runs it if it is down.
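
As a rough illustration of that cron idea, here is a minimal sketch rather than a tested implementation. It assumes the Connect worker exposes its REST interface on the default port 8083, and it uses the standard Kafka Connect endpoints GET /connectors and POST /connectors. The paths, the connector name "splunk-sink" and the JSON config file are hypothetical placeholders, not values from the actual setup.

```python
#!/usr/bin/env python3
"""Cron watchdog sketch for a Kafka Connect distributed worker."""
import json
import subprocess
import time
import urllib.request

# Everything below is a placeholder assumption; adjust to the real install.
CONNECT_URL = "http://localhost:8083"          # Connect REST default port
START_CMD = ["/path/to/bin/connect-distributed.sh",
             "/path/to/connect-distributed.properties"]
CONNECTOR_NAME = "splunk-sink"                 # hypothetical connector name
CONNECTOR_JSON = "/path/to/splunk-sink.json"   # {"name": ..., "config": {...}}


def list_connectors(base_url, timeout=5):
    """Return connector names from GET /connectors, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/connectors", timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):  # connection refused, timeout, HTTP error, bad JSON
        return None


def needs_registration(connectors, name):
    """True only when the worker answered and the connector is not listed."""
    return connectors is not None and name not in connectors


def start_worker(cmd):
    """Launch the worker detached from this script so it outlives the cron run."""
    subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
                     start_new_session=True)


def wait_for_worker(base_url, attempts=12, delay=5):
    """Poll the REST API until it answers; return the connector list or None."""
    for _ in range(attempts):
        connectors = list_connectors(base_url)
        if connectors is not None:
            return connectors
        time.sleep(delay)
    return None


def register_connector(base_url, json_path):
    """POST the connector definition to /connectors and return the HTTP status."""
    with open(json_path, "rb") as fh:
        body = fh.read()
    req = urllib.request.Request(base_url + "/connectors", data=body,
                                 headers={"Content-Type": "application/json"},
                                 method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def main():
    connectors = list_connectors(CONNECT_URL)
    if connectors is None:                     # worker is down: restart it
        start_worker(START_CMD)
        connectors = wait_for_worker(CONNECT_URL)
    if needs_registration(connectors, CONNECTOR_NAME):
        register_connector(CONNECT_URL, CONNECTOR_JSON)


if __name__ == "__main__":
    main()
```

The script re-checks the connector list after a restart instead of always re-posting, because a distributed worker may restore previously registered connectors on its own. A crontab entry such as `*/5 * * * * /usr/bin/python3 /path/to/connect_watchdog.py` (hypothetical path) would run it every five minutes; the interval should stay longer than the roughly one-minute startup wait so that two runs do not overlap.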

Thanks!

-----------------------------------------------------------------------------------------------------------------------------------------------------

Now the question is: given that Hortonworks had its DataFlow products, which Cloudera now owns, along with the ability to manage schemas in Kafka, does the Schema Manager manage the connect-distributed service, or is that service even needed with CDF now?

Thanks!