Is there a way to get time-based ticks/triggers in a Spark Streaming job?

1 ACCEPTED SOLUTION

Master Guru

That is one of the things that is more natural in Storm. I think your only option is to set a fairly low batch interval and then either check for the time/trigger event yourself (in code that gets executed every batch, e.g. inside a mapPartitions), or to use a trigger input (for example a Kafka topic with control commands) and join it with your main data stream.

The first approach, in pseudo code:

inputStream.mapPartitions { partition =>
  // load the current trigger/command from an external store
  val command: String = <load trigger from database, HBase, whatever...>
  // transform the records in this partition based on the command
  <transform your data flow based on command>
}
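
For the second approach, here is a minimal sketch. It assumes `mainStream` is the main data DStream and `controlStream` is a separate DStream read from a Kafka control topic; those names and the placeholder transformation are illustrative, not part of the original answer. It uses transformWith so each micro-batch can look at whatever commands arrived in the same interval:

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// assuming mainStream: DStream[String] and controlStream: DStream[String] already exist
val controlled: DStream[String] =
  mainStream.transformWith(controlStream, (data: RDD[String], commands: RDD[String]) => {
    // the control topic is tiny, so collecting it on the driver each batch is cheap
    val latestCommand = commands.collect().lastOption.getOrElse("noop")
    data.map(record => latestCommand + ":" + record)  // placeholder transformation
  })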


2 REPLIES



I ended up creating an additional source upstream that generates "tick" events at my specified interval, then joining the two streams. Every interval, the RDD element from the "tick" stream has a non-zero value.
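
For anyone looking for a concrete version of this, here is a rough sketch of such a tick source as a custom Spark Streaming receiver. The class name, interval, and usage lines are illustrative assumptions rather than the poster's actual code, and it checks for the tick with transformWith instead of a literal join:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Emits one timestamp every tickIntervalMs, so the micro-batch covering that moment
// contains a non-empty "tick" RDD which can be combined with the main stream.
class TickReceiver(tickIntervalMs: Long) extends Receiver[Long](StorageLevel.MEMORY_ONLY) {

  override def onStart(): Unit = {
    new Thread("tick-generator") {
      override def run(): Unit = {
        while (!isStopped()) {
          store(System.currentTimeMillis())  // one tick event per interval
          Thread.sleep(tickIntervalMs)
        }
      }
    }.start()
  }

  override def onStop(): Unit = { }  // the loop exits once isStopped() turns true
}

// usage (illustrative):
//   val ticks = ssc.receiverStream(new TickReceiver(60000L))  // tick every minute
//   val withTicks = mainStream.transformWith(ticks, (data: RDD[String], tick: RDD[Long]) =>
//     if (tick.isEmpty()) data else data /* trigger your time-based work here */)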