Code Repositories
Repo Description

A web crawler built on Apache Spark, with Kafka and Tika, intended as a replacement for Apache Nutch. It renders JavaScript and processes fetched files with Tika.

https://github.com/uscdataScience/sparkler/wiki/sparkler-0.1

Repo Info
GitHub Repo URL: https://github.com/USCDataScience/sparkler
GitHub account name: USCDataScience
Repo name: sparkler