Folks: We are in the process of migrating from Confluent Kafka to HDF Kafka. One of the key features we require is landing the Kafka data into HDFS. All data in Kafka is in Avro format, and we are planning to use the HDF Schema Registry to manage the Avro schemas. While getting the Connect environment up, we have noticed a few things:

- As far as we can tell, no additional connectors are included in the HDF distribution (like, say, one that could write to HDP). Odd, but that's OK; we'll go with the open source ones.
- The included Apache JSON converter works as expected in standalone and distributed mode.
- Both the HDFS and JDBC open source connectors from Confluent work as expected with JSON documents in standalone and distributed mode.
- When we add Confluent's open source Avro converter, we get this error message:

[2019-01-15 00:34:33,705] INFO Kafka version : 18.104.22.168.3.0.0-165 (org.apache.kafka.common.utils.AppInfoParser:109)
[2019-01-15 00:34:33,705] INFO Kafka commitId : bd037de41b621a69 (org.apache.kafka.common.utils.AppInfoParser:110)
[2019-01-15 00:34:33,848] INFO Kafka cluster ID: IH7zs9ZPQoaoCeNcBQZQxw (org.apache.kafka.connect.util.ConnectUtils:59)
[2019-01-15 00:34:33,862] INFO Logging initialized @7446ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log:193)
[2019-01-15 00:34:33,898] INFO Added connector for http://:8083 (org.apache.kafka.connect.runtime.rest.RestServer:119)
[2019-01-15 00:34:33,917] INFO Advertised URI: http://IP-ADDRESS:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:267)
[2019-01-15 00:34:33,926] INFO Kafka version : 22.214.171.124.3.0.0-165 (org.apache.kafka.common.utils.AppInfoParser:109)
[2019-01-15 00:34:33,926] INFO Kafka commitId : bd037de41b621a69 (org.apache.kafka.common.utils.AppInfoParser:110)
[2019-01-15 00:34:33,936] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed:117)
io.confluent.common.config.ConfigException: Missing required configuration "schema.registry.url" which has no default value.
    at io.confluent.common.config.ConfigDef.parse(ConfigDef.java:251)
    at io.confluent.common.config.AbstractConfig.<init>(AbstractConfig.java:78)
    at io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig.<init>(AbstractKafkaAvroSerDeConfig.java:105)
    at io.confluent.connect.avro.AvroConverterConfig.<init>(AvroConverterConfig.java:27)
    at io.confluent.connect.avro.AvroConverter.configure(AvroConverter.java:60)
    at org.apache.kafka.connect.runtime.isolation.Plugins.newConverter(Plugins.java:266)
    at org.apache.kafka.connect.runtime.Worker.<init>(Worker.java:115)
    at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:88)

The schema.registry.url is set in the connect-distributed.properties file for the connector, and it works without incident so long as the internal.key.converter/internal.value.converter values are set to JsonConverter rather than AvroConverter. As soon as we comment out the Avro converter lines, everything goes back to the expected behaviour. I suspect that the problem is somewhere in the Confluent configuration, as the fifth INFO statement shows a connector being added for a null server ("Added connector for http://:8083"), but I'm posting this in case other people have seen similar issues while we're working through the opaque Confluent configuration documentation.
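One thing worth checking: Kafka Connect only hands a converter the properties that carry that converter's prefix (key.converter., value.converter., internal.key.converter., internal.value.converter.), so a bare schema.registry.url at the top level of connect-distributed.properties is never seen by the AvroConverter and the "Missing required configuration" error results. A sketch of the relevant fragment, assuming a Schema Registry at the hypothetical address http://registry-host:8081:

```properties
# connect-distributed.properties (fragment; registry-host is a placeholder)
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://registry-host:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://registry-host:8081

# Internal converters serialize Connect's own offsets/configs; JsonConverter
# is the common choice here. If AvroConverter is used instead, it needs its
# own prefixed schema.registry.url as well.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
```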
@quilkpoac If your SAN supports the AWS authentication mechanisms, then yes, you can use it. I'll call out the Western Digital store as one I know works: they've been very busy on the open source side of things. For other stores, tuning the authentication options is the usual trouble spot.

Start by pointing the clients at your local store by setting fs.s3a.endpoint to the hostname of the service. You should probably also set fs.s3a.path.style.access to true, unless your system creates a DNS entry for every bucket.

After that, it's down to playing with authentication. The property fs.s3a.signing-algorithm is passed straight down to the AWS SDK here; a quick glance at its implementation implies it can be one of: NoOpSignerType, AWS4UnsignedPayloadSignerType, AWS3SignerType, AWS4SignerType and QueryStringSignerType. The v4 signing API is new and unlikely to work; the S3A default is the v3 one.
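Pulling those settings together, a minimal core-site.xml sketch might look like the following; the endpoint hostname, credentials, and signer choice are all placeholders you'd substitute for your own store:

```xml
<!-- core-site.xml fragment (hostname, keys, and signer are placeholders) -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>sanstore.example.com</value>
</property>
<property>
  <!-- use bucket-in-path URLs unless your store has per-bucket DNS entries -->
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
<property>
  <!-- passed straight to the AWS SDK; try the signer names listed above -->
  <name>fs.s3a.signing-algorithm</name>
  <value>AWS3SignerType</value>
</property>
```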