- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Is there a way to get time-based ticks/triggers in a Spark Streaming job?
- Labels:
-
Apache Spark
Created ‎04-20-2016 05:17 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created ‎04-20-2016 10:01 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
That is one of the things that is more natural in storm. I think your only chance is to set a pretty low base frequency and then either check for the time/trigger event yourself ( in code that gets executed Iike a mappartitions. ) or to use a trigger input ( for example a kafka topic wirh control commands) and join with your main data stream.
the first approach would be in pseudo code
Inputstream.mappartitions{
String command=<load trigger from database hbase whatever...>
Transform your data flow based on command
}
Created ‎04-20-2016 10:01 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
That is one of the things that is more natural in storm. I think your only chance is to set a pretty low base frequency and then either check for the time/trigger event yourself ( in code that gets executed Iike a mappartitions. ) or to use a trigger input ( for example a kafka topic wirh control commands) and join with your main data stream.
the first approach would be in pseudo code
Inputstream.mappartitions{
String command=<load trigger from database hbase whatever...>
Transform your data flow based on command
}
Created ‎08-05-2016 07:30 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I ended up creating an additional source upstream that generates "tick" events at my specified interval, then joined the two RDDs. Every interval, the RDD element from the "tick" stream has a non-zero value.
