
Do we have analytical functions like RANK/LAG in Spark Structured Streaming?


I would like to use an analytical function such as RANK or LAG in Spark Structured Streaming.

  1. Are analytical window functions available in Spark Structured Streaming? I am aware that org.apache.spark.sql.expressions.Window exists, but it probably works only with static (batch) DataFrames, not with streaming ones. The window function in org.apache.spark.sql.functions, on the other hand, is used only for time-based (tumbling or sliding) windows in Structured Streaming.

  2. Also, any suggestions on how to pick the record with the max date in each group in Spark Structured Streaming, if a rank option is not available?

    import org.apache.spark.sql.functions.{col, max, window}

    val groupedOutput = joinedDf
      .groupBy(
        window(col("timestampIpfr"), "2 minutes"),
        col("userid"),
        col("timestampIpfr")
      )
      .agg(
        max("timestampMME").alias("maxTimestampMME"),
        // max on locationMME doesn't work here: it is computed independently of
        // maxTimestampMME. Looking for help finding the locationMME of the
        // maxTimestampMME record.
        max("locationMME").alias("locationMME")
      )
    

Expected output: for each group, the record with the max timestampMME and its corresponding locationMME.
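
One possible way to get that output without RANK is to aggregate a struct instead of two separate columns. Spark compares structs field by field from left to right, so taking max over struct(timestampMME, locationMME) keeps the locationMME that belongs to the latest timestampMME. The sketch below is only an illustration under assumptions: the object and helper names (LatestPerGroup, latestPerGroup) and the intermediate column name "latest" are made up here, the grouping keys are copied from the snippet above, and a real streaming query would probably also need a watermark and a suitable output mode, which are not shown.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, max, struct, window}

object LatestPerGroup {
  // Hypothetical helper (not part of Spark): for each group, returns the
  // latest timestampMME together with the locationMME from that same record.
  def latestPerGroup(joinedDf: DataFrame): DataFrame =
    joinedDf
      .groupBy(
        window(col("timestampIpfr"), "2 minutes"),
        col("userid"),
        col("timestampIpfr")
      )
      // Structs are ordered by their first field, then the second, so the
      // max struct is the one with the greatest timestampMME.
      .agg(max(struct(col("timestampMME"), col("locationMME"))).alias("latest"))
      // Unpack the winning struct back into flat columns.
      .select(
        col("window"),
        col("userid"),
        col("timestampIpfr"),
        col("latest.timestampMME").alias("maxTimestampMME"),
        col("latest.locationMME").alias("locationMME")
      )
}
```

With the DataFrame from the question this would be called as `val groupedOutput = LatestPerGroup.latestPerGroup(joinedDf)`. Because max is an ordinary aggregate, the same call should work on a static DataFrame too, which makes it easy to try on sample data first.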