Kafka HDFS connection

Dear All,

I want to transfer data from Kafka to HDFS.

I searched online and found that this can be done with Camus or Gobblin.

I would like to know whether HDP 2.6 ships with a default HDFS connector that is ready to use with minimal coding or configuration. A useful link on how to use it would also be appreciated.

Thanks in advance.

Best Regards,

Gagan

2 REPLIES

This can be done easily with Apache NiFi using three out-of-the-box processors:

- ConsumeKafka

- MergeContent (to merge together small messages from Kafka to a larger file for HDFS)

- PutHDFS
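
To illustrate why the merge step matters, here is a minimal Python sketch of the batching idea behind it: many small Kafka messages are grouped into larger newline-delimited chunks before being written, so HDFS ends up with fewer, bigger files. This is not NiFi code and not how MergeContent is implemented; the function name `merge_messages` and the `max_bytes` parameter are made up for the example.

```python
from typing import Iterable, Iterator, List


def merge_messages(messages: Iterable[bytes], max_bytes: int) -> Iterator[bytes]:
    """Group small messages into newline-delimited chunks of roughly max_bytes.

    Hypothetical helper for illustration only; it mimics the idea of merging
    small records before writing them to HDFS.
    """
    batch: List[bytes] = []
    size = 0
    for msg in messages:
        # Flush the current batch if adding this message would exceed the limit.
        if batch and size + len(msg) + 1 > max_bytes:
            yield b"\n".join(batch) + b"\n"
            batch, size = [], 0
        batch.append(msg)
        size += len(msg) + 1  # +1 accounts for the newline delimiter
    # Emit whatever is left once the input is exhausted.
    if batch:
        yield b"\n".join(batch) + b"\n"


chunks = list(merge_messages([b"aa", b"bb", b"cc"], max_bytes=6))
```

In NiFi itself you would not write this; you would set the minimum/maximum size or entry count on the MergeContent processor and let it do the grouping.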

Super Collaborator

If you don't have NiFi: yes, Camus is deprecated in favor of Gobblin, but Confluent also provides Kafka Connect connectors for moving data between Kafka and various sources and sinks, including HDFS.

https://www.confluent.io/product/connectors/

https://docs.confluent.io/current/connect/connect-hdfs/docs/hdfs_connector.html
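
As a rough sketch of what using the HDFS sink connector involves, the Python snippet below builds the JSON payload you would POST to a Kafka Connect worker's `/connectors` REST endpoint. The connector name, topic, HDFS URL, and flush size are placeholder values I made up, and the helper `hdfs_sink_payload` is hypothetical; the property names are the ones I recall from the connector documentation linked above, so please verify them there for your version.

```python
import json
from typing import Dict


def hdfs_sink_payload(name: str, topics: str, hdfs_url: str, flush_size: int) -> Dict[str, object]:
    """Build a Kafka Connect REST payload for an HDFS sink (illustrative helper)."""
    return {
        "name": name,
        "config": {
            # Connector class of the Confluent HDFS sink connector.
            "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
            "tasks.max": "1",
            # Comma-separated list of Kafka topics to copy.
            "topics": topics,
            # NameNode URL; the host and port here are assumptions.
            "hdfs.url": hdfs_url,
            # Number of records written before a file is committed to HDFS.
            "flush.size": str(flush_size),
        },
    }


payload = hdfs_sink_payload("hdfs-sink", "test_hdfs", "hdfs://localhost:8020", 1000)
print(json.dumps(payload, indent=2))
```

The printed JSON can then be sent to the Connect worker, for example with `curl -X POST -H "Content-Type: application/json"` against the worker's `/connectors` endpoint (port 8083 by default).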