Created 10-06-2017 07:20 PM
I have a scenario where my Kafka cluster receives events from a remote (source) Kafka cluster, and I want those events pushed to Hadoop HDFS (sink). From searching Google, I found that the kafka-connect-hdfs connector can send messages or events from Kafka to HDFS directly. I am not sure how to get this connector and make it work with HDP. How do you think we can handle this situation effectively?
Created 08-20-2018 10:02 PM
Ranjit Nagi - Do you have a working implementation for the above scenario? If so, what does the implementation look like?
Created 08-20-2018 11:26 PM
On a non-Hadoop machine, install Java, Maven, and Git, then build the connector:
git clone https://github.com/confluentinc/kafka-connect-hdfs
cd kafka-connect-hdfs
git fetch --all --tags --prune
git checkout tags/v4.1.2 # This is a Confluent Release number, which corresponds to a Kafka release number
mvn clean install -DskipTests
This should generate some files under the target folder in that directory.
So, using the 4.1.2 example, I would ZIP up the "target/kafka-connect-hdfs-4.1.2-package/share/java/" folder that was built, then copy that archive to every HDP server that should run Kafka Connect and extract it there, for example into /opt/kafka-connect-hdfs/share/java
From there, find your "connect-distributed.properties" file and add this line:
plugin.path=/opt/kafka-connect-hdfs/share/java
Now, run something like this (I don't know the full location of the property files)
connect-distributed /usr/hdp/current/kafka/.../connect-distributed.properties
Once that starts, you can query http://connect-server:8083/connector-plugins, and you should see an entry for "io.confluent.connect.hdfs.HdfsSinkConnector"
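If you would rather script that check than eyeball the response, here is a rough Python sketch. It assumes the endpoint returns a JSON list of objects that each carry a "class" field, which is how the Kafka Connect REST API reports plugins. The helper names (has_plugin, fetch_plugins) are made up for illustration, and "connect-server" is the same placeholder host as above.

```python
import json
import urllib.request

HDFS_SINK_CLASS = "io.confluent.connect.hdfs.HdfsSinkConnector"


def has_plugin(plugins, class_name=HDFS_SINK_CLASS):
    """Return True if the parsed /connector-plugins list contains class_name."""
    return any(p.get("class") == class_name for p in plugins)


def fetch_plugins(base_url):
    """GET <base_url>/connector-plugins and return the parsed JSON list."""
    url = base_url.rstrip("/") + "/connector-plugins"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Example (needs a running Connect worker; the host name is a placeholder):
#   print(has_plugin(fetch_plugins("http://connect-server:8083")))
```

If has_plugin comes back False, the worker most likely did not pick up the plugin.path directory, so I would recheck the properties file and restart the worker before going further.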
If successful, continue to read the HDFS Connector documentation, then POST the JSON configuration body to the Connect Server endpoint. (or use Landoop's Connect UI tool)
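As a rough idea of what that POST could look like, here is a minimal Python sketch. The connector name, topic, and hdfs.url values in the usage comment are placeholders I made up, and the helper functions are hypothetical. The config keys shown (connector.class, tasks.max, topics, hdfs.url, flush.size) are the basic ones from the HDFS connector docs, but check them against the documentation for the release you built.

```python
import json
import urllib.request


def build_hdfs_sink_config(name, topics, hdfs_url, flush_size=1000, tasks_max=1):
    """Build the JSON body for POST /connectors (all values here are examples)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
            "tasks.max": str(tasks_max),
            "topics": ",".join(topics),
            "hdfs.url": hdfs_url,
            # Number of records written to a file before it is committed to HDFS
            "flush.size": str(flush_size),
        },
    }


def build_create_request(base_url, payload):
    """Prepare (but do not send) the POST request to the Connect REST API."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_connector(base_url, payload):
    """Send the request and return the parsed JSON response from the worker."""
    with urllib.request.urlopen(build_create_request(base_url, payload), timeout=10) as resp:
        return json.load(resp)


# Example (placeholder names; needs a running Connect worker and HDFS):
#   body = build_hdfs_sink_config("hdfs-sink", ["events"], "hdfs://namenode:8020")
#   print(create_connector("http://connect-server:8083", body))
```

Building the request separately from sending it lets you print and inspect the JSON body before anything hits the cluster.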
Created 08-21-2018 12:28 AM
Will this work for Apache Kafka in a Hortonworks cluster?
Created 08-21-2018 07:46 PM
Yes. "Confluent" is not some custom version of Apache Kafka.
In fact, this process is very repeatable for all other Kafka Connect plugins.
Created 11-13-2019 03:21 AM
@JordanMoore I am getting this error
java.lang.ClassNotFoundException: io.confluent.connect.storage.StorageSinkConnectorConfig
when trying to add the connector using the REST API. I am following this documentation.