
Is there a way to get time-based ticks/triggers in a Spark Streaming job?

 
1 ACCEPTED SOLUTION

That is one of the things that is more natural in Storm. I think your only option is to set a fairly low base batch interval and then either check for the time/trigger event yourself (in code that gets executed per batch, e.g. inside mapPartitions) or to use a trigger input (for example a Kafka topic with control commands) and join it with your main data stream.

In pseudo code, the first approach would be:

inputStream.mapPartitions { records =>
  val command = <load trigger from a database, HBase, or similar>
  // transform the partition's records based on the command
  ...
}
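To make the idea above concrete without a running Spark cluster, here is a minimal Spark-free sketch in Python: each simulated micro-batch partition checks an external control source once, then transforms its records accordingly. The `load_command` helper and the `CONTROL` store are hypothetical stand-ins for the database/HBase lookup the answer describes, not real Spark APIs.

```python
# Stand-in for an external trigger store (HBase, a database, etc.).
CONTROL = {"command": "upper"}

def load_command():
    # In a real job this would be a query issued inside mapPartitions.
    return CONTROL["command"]

def process_partition(records):
    # One lookup per partition, not per record -- the point of mapPartitions.
    command = load_command()
    if command == "upper":
        return [r.upper() for r in records]
    return list(records)

batch = ["a", "b", "c"]
print(process_partition(batch))  # -> ['A', 'B', 'C']
```

Changing the value in the control store between batches changes how subsequent batches are transformed, which is how the time/trigger event reaches the running stream.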


2 REPLIES


I ended up creating an additional source upstream that generates "tick" events at my specified interval, and then joined the two streams. In every interval, the RDD element from the "tick" stream has a non-zero value.
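The tick-stream approach can be sketched without Spark as follows: a second source emits a non-zero "tick" once per interval, and the per-batch handler joins it with the main data to decide when the timed action fires. The interval length and the `tick_source`/`handle_batch` names are made up for illustration.

```python
TICK_EVERY = 3  # assumed interval: emit a tick on every 3rd batch

def tick_source(batch_index):
    # Non-zero only on tick batches, mirroring the reply above.
    return 1 if batch_index % TICK_EVERY == 0 else 0

def handle_batch(batch_index, records):
    tick = tick_source(batch_index)
    if tick:
        # the time-based action would fire here
        return ("tick", records)
    return ("data", records)

results = [handle_batch(i, [i]) for i in range(4)]
print([tag for tag, _ in results])  # -> ['tick', 'data', 'data', 'tick']
```

In a real job the tick source would be an external producer (e.g. a small Kafka topic written to on a timer) rather than a modulo check, but the join-and-test-for-non-zero pattern is the same.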