Kafka Connect/HDF Schema Registry with Confluent Avro Converter Plugin

Folks:

We are in the process of migrating from Confluent Kafka to HDF Kafka. One of the key features we require is the ability to land the Kafka data into HDFS. All data in Kafka is in the Avro format, and we are planning to use the HDF Schema Registry to manage the Avro schemas.

While getting the Connect environment up and running, we have noticed a few things.

  1. As far as we can tell, no additional connectors are included in the HDF distribution (like, say, one that could write to HDP). Odd, but that's OK; we'll go with the open-source ones.
  2. The included Apache JSON connector works as expected in standalone and distributed mode.
  3. Both the open-source HDFS and JDBC connectors from Confluent work as expected with JSON documents in standalone and distributed mode.
  4. When we add Confluent's open-source Avro converter, we get this error message:

    [2019-01-15 00:34:33,705] INFO Kafka version : 2.0.0.3.3.0.0-165 (org.apache.kafka.common.utils.AppInfoParser:109)
    [2019-01-15 00:34:33,705] INFO Kafka commitId : bd037de41b621a69 (org.apache.kafka.common.utils.AppInfoParser:110)
    [2019-01-15 00:34:33,848] INFO Kafka cluster ID: IH7zs9ZPQoaoCeNcBQZQxw (org.apache.kafka.connect.util.ConnectUtils:59)
    [2019-01-15 00:34:33,862] INFO Logging initialized @7446ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log:193)
    [2019-01-15 00:34:33,898] INFO Added connector for http://:8083 (org.apache.kafka.connect.runtime.rest.RestServer:119)
    [2019-01-15 00:34:33,917] INFO Advertised URI: http://IP-ADDRESS:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:267)
    [2019-01-15 00:34:33,926] INFO Kafka version : 2.0.0.3.3.0.0-165 (org.apache.kafka.common.utils.AppInfoParser:109)
    [2019-01-15 00:34:33,926] INFO Kafka commitId : bd037de41b621a69 (org.apache.kafka.common.utils.AppInfoParser:110)
    [2019-01-15 00:34:33,936] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed:117)
    io.confluent.common.config.ConfigException: Missing required configuration "schema.registry.url" which has no default value.
    at io.confluent.common.config.ConfigDef.parse(ConfigDef.java:251)
    at io.confluent.common.config.AbstractConfig.<init>(AbstractConfig.java:78)
    at io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig.<init>(AbstractKafkaAvroSerDeConfig.java:105)
    at io.confluent.connect.avro.AvroConverterConfig.<init>(AvroConverterConfig.java:27)
    at io.confluent.connect.avro.AvroConverter.configure(AvroConverter.java:60)
    at org.apache.kafka.connect.runtime.isolation.Plugins.newConverter(Plugins.java:266)
    at org.apache.kafka.connect.runtime.Worker.<init>(Worker.java:115)
    at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:88)

The schema.registry.url property is set in the connect-distributed.properties file for the connector, and the worker starts without incident as long as the internal.key.converter/internal.value.converter values are set to JsonConverter rather than AvroConverter. As soon as we comment out the Avro converter lines, everything goes back to the expected behaviour.
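
For reference, the relevant part of our connect-distributed.properties looks roughly like the sketch below; the broker, registry host, and ports are placeholders rather than our real values:

    # Rough sketch of the relevant worker settings (placeholder hosts/ports).
    bootstrap.servers=KAFKA-BROKER:PORT

    # Registry URL, present at the top level of the worker config
    schema.registry.url=http://REGISTRY-HOST:PORT

    # With the JSON converters the worker starts cleanly:
    #internal.key.converter=org.apache.kafka.connect.json.JsonConverter
    #internal.value.converter=org.apache.kafka.connect.json.JsonConverter

    # Swapping in the Avro converter produces the ConfigException above:
    internal.key.converter=io.confluent.connect.avro.AvroConverter
    internal.value.converter=io.confluent.connect.avro.AvroConverter

With the JSON converter lines active instead, this same file runs the HDFS and JDBC connectors without issue.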

I suspect that the problem is somewhere in the Confluent configuration, as the fifth INFO statement indicates that a connector was added for a null host (the "Added connector for http://:8083" line above), but I'm posting this in case other people have seen similar issues while we work through the opaque Confluent configuration documentation.
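
In case it is relevant, we have not pinned the worker's REST endpoint explicitly; a hypothetical sketch of the settings involved (WORKER-HOST is a placeholder) is below, and leaving them at their defaults is presumably why the host portion of that INFO line is empty:

    # Hypothetical REST settings for the Connect worker (placeholder host);
    # currently left unset in our config, so the worker binds to the default
    # port on all interfaces.
    #listeners=http://WORKER-HOST:8083
    #rest.advertised.host.name=WORKER-HOST
    #rest.advertised.port=8083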
