How can I set Druid to ingest streaming data from Kerberized Kafka?

New Contributor

I have a Kerberized Kafka broker.

I am trying to set up Druid to ingest streaming data from that Kafka cluster:

  • by submitting a supervisor spec to the Druid Overlord.
    I do see that there is a field for the Kafka server URL (bootstrap.servers), but I don't see any field for Kerberos information.

How can I set up Druid so that it can ingest streaming data from Kerberized Kafka?

2 REPLIES

Expert Contributor

You can set the Kerberos credentials as part of the consumer properties. Have a look at this thread and let me know if it works for you:

https://groups.google.com/forum/#!topic/druid-user/W2SiPnNsy0U
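For reference, a minimal sketch of what that can look like inside the supervisor spec's "consumerProperties", assuming the broker exposes a SASL_PLAINTEXT (Kerberos/GSSAPI) listener; the host, port, and service name are placeholders:

"consumerProperties":{
   "bootstrap.servers":"<myHost:6667>",
   "security.protocol":"SASL_PLAINTEXT",
   "sasl.mechanism":"GSSAPI",
   "sasl.kerberos.service.name":"kafka"
}

The Kerberos login itself (principal and keytab) is typically supplied through a JAAS file passed to the JVMs that run the indexing tasks, e.g. -Djava.security.auth.login.config=/path/to/kafka_client_jaas.conf in the MiddleManager's druid.indexer.runner.javaOpts.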

New Contributor

@David Han,

I tried an alternative approach and successfully implemented the Kafka ingestion service in a production environment with a Kerberized Kafka cluster.

Step #1: Configure/set up a "PLAINTEXT" listener on the Kafka broker.
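
A rough sketch of the broker-side change, assuming the existing Kerberized listener stays on port 6667 and the new plaintext listener is added on 6668 (the port used in the ioConfig below); hostnames and ports are placeholders:

# kafka broker server.properties (repeat on every broker)
listeners=SASL_PLAINTEXT://<myHost>:6667,PLAINTEXT://<myHost>:6668
advertised.listeners=SASL_PLAINTEXT://<myHost>:6667,PLAINTEXT://<myHost>:6668

Keep in mind that the plaintext listener bypasses Kerberos entirely, so it is worth restricting it at the network/firewall level to the Druid ingestion hosts.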

Step #2: Add the following "ioConfig" section to the Druid supervisor spec (the Druid indexing task):

"ioConfig":{
      "topic":"<Kafka_topic_name>",
      "consumerProperties":{
         "bootstrap.servers":"<myHost:6668(comma_separated_list_for_more_than_one_host)>",
	 "group.id": "<Kafka_topic_consumer_group>",
	 "security.protocol": "PLAINTEXT"
      }
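
For context, here is a rough sketch of how that "ioConfig" fits into the full supervisor-spec.json; the dataSource name, schema fields, and task settings are placeholders, and the exact "dataSchema" layout depends on your data and Druid version:

{
   "type":"kafka",
   "dataSchema":{
      "dataSource":"<druid_datasource_name>",
      "parser":{
         "type":"string",
         "parseSpec":{
            "format":"json",
            "timestampSpec":{ "column":"timestamp", "format":"auto" },
            "dimensionsSpec":{ "dimensions":[] }
         }
      },
      "metricsSpec":[ { "name":"count", "type":"count" } ],
      "granularitySpec":{ "type":"uniform", "segmentGranularity":"HOUR", "queryGranularity":"NONE" }
   },
   "tuningConfig":{ "type":"kafka" },
   "ioConfig":{
      "topic":"<Kafka_topic_name>",
      "consumerProperties":{
         "bootstrap.servers":"<myHost:6668>",
         "group.id":"<Kafka_topic_consumer_group>",
         "security.protocol":"PLAINTEXT"
      },
      "taskCount":1,
      "replicas":1,
      "taskDuration":"PT1H"
   }
}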

Step #3: Submit the supervisor spec as follows:

curl -X POST -H 'Content-Type: application/json' -d @supervisor-spec.json http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/supervisor
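
Once the spec is accepted, you can verify the supervisor is running against the same Overlord API, for example:

# list running supervisors (the new one should show up under its dataSource name)
curl http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/supervisor

# check the status and consumed offsets of a specific supervisor
curl http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/supervisor/<supervisor_id>/status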

This indexing task reads events using Kafka's own partition and offset mechanism, providing guaranteed exactly-once ingestion.
