I'm working on this repo.
There is something I don't understand: how does Spark know that it should re-execute a function on new data at every batch interval?
To make it clear: the main job creates a Spark StreamingContext (ssc) instance, which has a batch interval in its configuration. After ssc.start(), Spark starts running the same processing repeatedly, once per interval, on whatever new data has arrived.
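To illustrate the behavior I'm describing, here is a toy sketch in plain Python. It is not Spark code: `ToyStreamingContext`, `register`, and `run_batches` are hypothetical names I made up. My rough guess, which may be wrong, is that functions get attached once before the start call and are then replayed on every batch:

```python
from typing import Callable, Iterable, List

Batch = List[int]


class ToyStreamingContext:
    """Hypothetical stand-in for a streaming context; not Spark's API."""

    def __init__(self, batch_interval_seconds: float) -> None:
        self.batch_interval_seconds = batch_interval_seconds
        self._registered: List[Callable[[Batch], None]] = []

    def register(self, func: Callable[[Batch], None]) -> None:
        # Functions are attached once, before the loop starts.
        self._registered.append(func)

    def run_batches(self, batches: Iterable[Batch]) -> None:
        # A real system would wait batch_interval_seconds between batches;
        # this toy just iterates so that it finishes immediately.
        for batch in batches:
            for func in self._registered:
                func(batch)


totals: List[int] = []


def sum_batch(batch: Batch) -> None:
    # Receives only the batch data; the context itself is not passed in.
    totals.append(sum(batch))


ssc = ToyStreamingContext(batch_interval_seconds=1.0)
ssc.register(sum_batch)
ssc.run_batches([[1, 2, 3], [10, 20], []])
print(totals)  # prints [6, 30, 0]
```

In this toy version the repeated function never sees the context, only the data for the current interval. I don't know whether the real StreamingContext works the same way.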
How does Spark know which functions to repeat? I need to write a new function that should also execute in each interval. How can I do that?
Do I need to pass ssc to that function as an argument, or is that irrelevant?