How to join two streams coming at different times in Streaming Analytics Manager (Storm)

I have two streams that are published from NiFi to two Kafka topics, and I am trying to 'time synchronize' them so I can do some simple analytics on them.

The streams are time series data, but the messages in the two streams are generated at different times. For example:

Stream 1 will look like this:

Time     Value_1
0:00:01  10
0:00:05  20
0:00:10  30
.....

Stream 2 will look like this:

Time     Value_2
0:00:01  100
0:00:02  200
0:00:03  300
0:00:04  400
0:00:05  500
0:00:06  600
0:00:07  700
0:00:08  800
0:00:09  900
0:00:10  1000

and so on

When I join these two streams I want something like this:

Time     Value_1  Value_2
0:00:02  10       200
0:00:03  10       300
0:00:04  10       400
0:00:05  20       500
0:00:06  20       600
0:00:07  20       700
0:00:08  20       800
0:00:09  20       900
0:00:10  30       1000
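
To make the semantics I am after concrete: each Stream 2 row should be paired with the most recent Stream 1 value at or before its timestamp (sometimes called an "as-of" or last-value-carried-forward join). The following is only a minimal Python sketch of that logic on the sample data, not SAM or Storm code; the names `to_seconds` and `asof_join` are made up for illustration. Note that it also emits a row for 0:00:01 (10, 100), which the table above does not list.

```python
from bisect import bisect_right


def to_seconds(ts):
    """Convert an 'H:MM:SS' timestamp string to seconds."""
    h, m, s = (int(part) for part in ts.split(":"))
    return h * 3600 + m * 60 + s


def asof_join(slow, fast):
    """Pair each (time, value) row of `fast` with the latest `slow`
    value whose timestamp is at or before the row's timestamp."""
    slow = sorted(slow, key=lambda row: to_seconds(row[0]))
    fast = sorted(fast, key=lambda row: to_seconds(row[0]))
    slow_times = [to_seconds(t) for t, _ in slow]

    joined = []
    for t, value_2 in fast:
        # Index of the last slow row with time <= t, or -1 if none yet.
        i = bisect_right(slow_times, to_seconds(t)) - 1
        if i >= 0:
            joined.append((t, slow[i][1], value_2))
    return joined


# Sample data from the tables above.
stream_1 = [("0:00:01", 10), ("0:00:05", 20), ("0:00:10", 30)]
stream_2 = [("0:00:%02d" % s, s * 100) for s in range(1, 11)]

rows = asof_join(stream_1, stream_2)
for time, value_1, value_2 in rows:
    print(time, value_1, value_2)
```

This batch version assumes both streams are fully available and sortable by time. In a streaming topology the same idea would presumably need the last seen Stream 1 value to be kept as state across tuples rather than relying on a window-scoped inner join.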

I tried an inner join in SAM with window_interval 2 and sliding_interval 0, which gets close, but what I get instead is this:

Time     Value_1  Value_2
0:00:01  10       200
0:00:05  20       400
0:00:10  30       1000

As you can see, I am missing the rows in between, which I need for my analysis. Changing the order in which the streams are joined makes no difference.

1 REPLY

Bumping for visibility.

Can anyone help with this?