How can I read data from HDFS and publish it to a Kafka topic (without using Scala/Spark)? Are there any utilities or methods?
- Labels: HDFS
Created 04-16-2021 04:14 PM
We are looking for some kind of utility or tool to read data from HDFS and place it in a Kafka topic. We'd appreciate your input.
From the community section, we came across this: "You could use Apache NiFi with a ListHDFS + FetchHDFS processor followed by PublishKafka." Can you provide more insight into how this can be achieved?
Thank you
Srinu
Created 04-26-2021 09:58 PM
Hello @sriven
Found this - https://community.cloudera.com/t5/Support-Questions/How-to-insert-parquet-file-to-Kafka-and-pass-the...
Please let me know if it helps.
Thanks & Regards,
Nandini
Created 04-28-2021 11:46 AM
Created 04-28-2021 02:34 PM
Please try Kafka Connect then; that seems to be the best-suited option.
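For reference, a Kafka Connect source connector is registered by POSTing a JSON config to the Connect REST API. Below is a minimal sketch for Confluent's HDFS 2 Source Connector; the connector class, property names, and all host/path values are illustrative and should be checked against the Confluent documentation for your connector version before use.

```json
{
  "name": "hdfs2-source",
  "config": {
    "connector.class": "io.confluent.connect.hdfs2.Hdfs2SourceConnector",
    "tasks.max": "1",
    "hdfs.url": "hdfs://namenode:8020",
    "format.class": "io.confluent.connect.hdfs2.format.parquet.ParquetFormat",
    "topics.dir": "/topics",
    "confluent.topic.bootstrap.servers": "broker:9092"
  }
}
```

Note the caveat raised later in this thread: this source connector is designed to read back files laid out by the HDFS 2 Sink Connector, not arbitrary HDFS files.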
Created on 04-30-2021 03:36 PM - edited 04-30-2021 03:39 PM
How can we read parquet files using Kafka Connect?
Put simply, we just want to read the parquet files on HDFS using Kafka Connect, without Spark jobs.
Please let us know whether there is a solution.
Created 05-03-2021 07:44 AM
As you know, we have a limitation with the Kafka source connector: it works only for HDFS objects/files created by the HDFS 2 Sink Connector for Confluent Platform.
How can we pull files that were created by Spark, MapReduce, or any other jobs on HDFS?
The use case of the HDFS source connector is only to mirror the same data back to Kafka.
Created 05-06-2021 10:34 PM
Please try NiFi with Kafka.
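To expand on the ListHDFS + FetchHDFS + PublishKafka suggestion from the original question, a NiFi flow for this looks roughly like the sketch below. ListHDFS emits one FlowFile per HDFS file listed (attributes only), FetchHDFS pulls the file content, and a PublishKafka processor writes each FlowFile's content to the topic. Processor names and property keys are from NiFi's stock processors, but the paths, broker address, and topic name are placeholders; adjust the PublishKafka variant to match your Kafka version.

```
ListHDFS
  Hadoop Configuration Resources: /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
  Directory: /data/input
  Recurse Subdirectories: true

FetchHDFS
  Hadoop Configuration Resources: /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
  HDFS Filename: ${path}/${filename}

PublishKafka_2_0
  Kafka Brokers: broker1:9092
  Topic Name: my-topic
```

Connect ListHDFS's success relationship to FetchHDFS, and FetchHDFS's success relationship to PublishKafka. One caveat for the parquet use case: this publishes each file's raw bytes as a single Kafka message, so large parquet files may need record-oriented processors (e.g. a record reader/writer based flow) instead.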
