
Data ingestion from crawlers using Kafka




I am trying to use Kafka for data ingestion, but being new to it, I am rather confused.


I have multiple crawlers that extract data for me from web platforms. I want to ingest that extracted data into Hadoop using Kafka, without any intermediate scripts or service files. The main complication is that the platforms are disparate in nature: one web platform provides real-time data, while the other is batch-based. Can I integrate my crawlers with Kafka producers somehow, so that they keep running on their own? Is that possible? I think it is, but I am not heading in the right direction. Any help would be appreciated.
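To make the question concrete, here is a rough sketch of what I imagine calling from inside each crawler. It assumes the crawlers are written in Python, that each extracted record is a dict, and that the kafka-python client is installed; the topic name `crawled-data`, the `platform` field, and the broker address are made up for illustration. The same `publish` function would be called with a single record by the real-time crawler and with a whole list by the batch one. Getting the topic into Hadoop would still need a consumer or connector on the other side, which is not shown here.

```python
import json


def to_message(record):
    """Serialize one crawled record (a dict) to UTF-8 encoded JSON bytes."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def publish(producer, topic, records):
    """Send every record to the topic and return how many were sent.

    `records` can hold one item (real-time crawler) or many (batch crawler).
    """
    count = 0
    for record in records:
        # Hypothetical choice: key by source platform, so that records from
        # the same platform end up in the same partition.
        key = str(record.get("platform", "")).encode("utf-8")
        producer.send(topic, key=key, value=to_message(record))
        count += 1
    # Block until the buffered messages have actually been handed to Kafka.
    producer.flush()
    return count


def make_producer(bootstrap_servers="localhost:9092"):
    """Create a real producer. Assumes kafka-python and a reachable broker."""
    from kafka import KafkaProducer  # imported lazily; only needed here

    return KafkaProducer(bootstrap_servers=bootstrap_servers)
```

In a crawler this would be used roughly as `publish(make_producer(), "crawled-data", [record])`, creating the producer once at startup and reusing it for every call.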



