I want to transfer data from Kafka to HDFS.
I searched online and found that this can be done with Camus or Gobblin.
I would like to know whether HDP 2.6 comes with a default HDFS connector that is ready to use with minimal coding/configuration. A useful link on how to use it would also be appreciated.
Thanks in advance.
This can be done easily with Apache NiFi using a couple of out-of-the-box processors...
- MergeContent (to merge together small messages from Kafka to a larger file for HDFS)
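
To show why the merge step matters (HDFS handles a few large files much better than many tiny ones), here is a minimal sketch of the batching idea in Python. It is not NiFi code and not how MergeContent is implemented; `merge_messages`, `max_bytes` and `delimiter` are hypothetical names chosen for illustration.

```python
from typing import Iterable, Iterator, List


def merge_messages(
    messages: Iterable[bytes],
    max_bytes: int,
    delimiter: bytes = b"\n",
) -> Iterator[bytes]:
    """Group small messages into larger bundles of at most max_bytes.

    A single message larger than max_bytes is emitted on its own.
    """
    batch: List[bytes] = []
    size = 0
    for msg in messages:
        # A delimiter is only needed between messages, not before the first.
        added = len(msg) + (len(delimiter) if batch else 0)
        if batch and size + added > max_bytes:
            # Current bundle is full: emit it and start a new one.
            yield delimiter.join(batch)
            batch, size = [], 0
            added = len(msg)
        batch.append(msg)
        size += added
    if batch:
        # Emit whatever is left at the end of the stream.
        yield delimiter.join(batch)


if __name__ == "__main__":
    for bundle in merge_messages([b"aa", b"bb", b"cc"], max_bytes=5):
        print(bundle)
```

In NiFi you would get the same effect by configuring the processor rather than writing code; the sketch only shows what "merging small messages into a larger file" amounts to.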
If you don't have NiFi: yes, Camus is deprecated in favor of Gobblin, but Confluent has packaged Kafka Connect specifically for transferring data between various sources and sinks, such as HDFS.
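
As a rough idea of how little configuration the Kafka Connect route needs, here is a sketch in Python that builds the JSON you would submit to a Connect worker's REST API (`POST /connectors`). It assumes the Confluent HDFS sink connector is already installed on the Connect workers. The connector name, topic, and NameNode URL are placeholders, and `hdfs_sink_payload` is a hypothetical helper, so check the property names against the connector documentation for your version.

```python
import json
from typing import Dict, List


def hdfs_sink_payload(
    name: str,
    topics: List[str],
    hdfs_url: str,
    flush_size: int = 1000,
    tasks_max: int = 1,
) -> Dict[str, object]:
    """Build a connector definition for the Confluent HDFS sink (assumed installed)."""
    return {
        "name": name,
        "config": {
            # Connector class of the Confluent HDFS sink connector.
            "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
            "tasks.max": str(tasks_max),
            # Comma-separated list of Kafka topics to copy.
            "topics": ",".join(topics),
            # Placeholder NameNode address; replace with your cluster's URL.
            "hdfs.url": hdfs_url,
            # Number of records written before a file is committed to HDFS.
            "flush.size": str(flush_size),
        },
    }


if __name__ == "__main__":
    payload = hdfs_sink_payload(
        name="hdfs-sink",                   # placeholder connector name
        topics=["my_topic"],                # placeholder topic
        hdfs_url="hdfs://namenode:8020",    # placeholder NameNode URL
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON can be saved to a file and posted to the Connect worker, for example with curl. A larger `flush.size` gives fewer, larger files in HDFS.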