
Do we have Analytical function like RANK/LAG in Spark Structured streaming



I would like to use an analytical function like RANK/LAG in Spark Structured Streaming.

  1. Does Spark Structured Streaming support analytical window functions by any chance? I am aware that org.apache.spark.sql.expressions.Window is available, but it probably works only with batch DataFrames, not with streaming ones. Also, the org.apache.spark.sql.functions.window function is used purely for time-based (e.g. sliding) windows over the data in Structured Streaming.

  2. Also, any suggestions on how to pick the record with the max date in each group in Spark Structured Streaming, if we don't have a rank option?

    val groupedOutput = joinedDf.groupBy(
      window(col("timestampIpfr"), "2 minutes"),
      col("userid"), col("timestampIpfr")
    ).agg(
      max("timestampMME").alias("maxTimestampMME"),
      // max on locationMME doesn't work here; looking for help on how to
      // find the locationMME of the maxTimestampMME record
      max("locationMME").alias("locationMME")
    )
    

Expected: grouped output containing the record with the max timestampMME and its corresponding locationMME.
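
One workaround that is often suggested for this "value at the max" problem, sketched here only as a possibility: take the max of a struct whose first field is timestampMME. Structs are compared field by field, so the winning struct carries the locationMME from the same row. The object name LatestLocationSketch, the function name latestLocationPerGroup, and the alias latest are made up for illustration; the column names are the ones from the snippet above, and joinedDf is assumed to have exactly those columns.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, max, struct, window}

// Hypothetical helper, not part of Spark: names are illustrative only.
object LatestLocationSketch {
  // Assumes joinedDf has the columns used in the question:
  // timestampIpfr (timestamp), userid, timestampMME, locationMME.
  def latestLocationPerGroup(joinedDf: DataFrame): DataFrame =
    joinedDf
      .groupBy(
        window(col("timestampIpfr"), "2 minutes"),
        col("userid"),
        col("timestampIpfr"))
      // Structs compare field by field, left to right, so the max struct
      // is the one with the greatest timestampMME (ties fall back to
      // locationMME). Both fields come from the same input row.
      .agg(max(struct(col("timestampMME"), col("locationMME"))).alias("latest"))
      // Unpack the struct back into flat columns.
      .select(
        col("window"),
        col("userid"),
        col("timestampIpfr"),
        col("latest.timestampMME").alias("maxTimestampMME"),
        col("latest.locationMME").alias("locationMME"))
}
```

Because this is an ordinary aggregation rather than a RANK over org.apache.spark.sql.expressions.Window, it should be usable on a streaming DataFrame as well; whether a watermark is needed depends on the output mode chosen for the query.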