Spark: pandas-style split-apply-combine?

Hi, is there anything in Spark like pandas groupby (split-apply-combine)?


I would like to split a big DataFrame into many small DataFrames according to the values of some columns.

On each group, I would like to apply a predefined function that returns another data object.

Finally, I would like to return a map of <key, result data object>. Or, if each group returns a DataFrame, I would like to combine them into one big DataFrame.

In short, something like:

DataFrame.groupby(columns...).foreach(rows => ...)


DataFrame.groupby(columns...).foreach(rows => ...).collect(...)
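
To make the semantics I am after concrete, here is a minimal plain-Python sketch. It is not Spark code and only illustrates the intended behavior on an in-memory list of rows. The names split_apply, combine, and demean, and the sample data, are made up for illustration.

```python
from collections import defaultdict

def split_apply(rows, key_columns, func):
    """Split rows by the values of key_columns, apply func to each group,
    and return a dict mapping each key tuple to func's result."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        groups[key].append(row)
    return {key: func(group) for key, group in groups.items()}

def combine(results):
    """Concatenate per-group lists of rows into one flat list of rows."""
    return [row for group_rows in results.values() for row in group_rows]

# Hypothetical sample data: each dict stands in for one DataFrame row.
rows = [
    {"dept": "a", "x": 1},
    {"dept": "a", "x": 3},
    {"dept": "b", "x": 10},
]

# Case 1: each group yields an arbitrary object; the result is a map key -> object.
totals = split_apply(rows, ["dept"], lambda g: sum(r["x"] for r in g))

# Case 2: each group yields rows; the per-group results are combined into one table.
def demean(group):
    mean = sum(r["x"] for r in group) / len(group)
    return [{**r, "x_centered": r["x"] - mean} for r in group]

combined = combine(split_apply(rows, ["dept"], demean))
```

Case 1 corresponds to the first line above (a map of key to result), and case 2 to the second one (per-group results combined into one big table).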




Any ideas?