Spark: Equivalent of RDD action foreach


We are new to Spark and are looking for an RDD action similar to foreach, one that applies a function to each element.


Expert Contributor

There's not really an RDD equivalent of "foreachRDD" in core Spark, but there is foreachPartition, which lets you apply a function to each partition of the RDD.

In Spark Streaming there is foreachRDD, which lets you apply a function to each RDD in the DStream.

//Core Spark
foreachPartition(f: (Iterator[T]) ⇒ Unit): Unit
//Spark Streaming
foreachRDD(foreachFunc: (RDD[T]) ⇒ Unit): Unit
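
To make the difference in call shape concrete, here is a minimal plain-Python sketch that does not use Spark at all. It models an RDD as a list of partitions, which is an assumption made purely for illustration, and `foreach` and `foreach_partition` are hypothetical helper names that only mimic how a per-element action and a per-partition action hand data to your function.

```python
# Plain-Python model, NOT the Spark API: an "RDD" here is just a list of partitions.
partitions = [[1, 2], [3, 4]]

def foreach(parts, f):
    """Call f once per element (models the call shape of a per-element action)."""
    for part in parts:
        for x in part:
            f(x)

def foreach_partition(parts, f):
    """Call f once per partition, passing an iterator over that partition."""
    for part in parts:
        f(iter(part))

# Per-element: the function sees one value at a time.
element_calls = []
foreach(partitions, element_calls.append)

# Per-partition: the function sees an iterator and decides how to consume it.
partition_calls = []
foreach_partition(partitions, lambda it: partition_calls.append(list(it)))

print(element_calls)    # [1, 2, 3, 4]
print(partition_calls)  # [[1, 2], [3, 4]]
```

The per-partition form tends to be chosen when there is setup cost, such as opening a connection, that you would rather pay once per partition than once per element.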

New Contributor

Interesting. I assumed that something like the following:

accum = sc.accumulator(0)
sc.parallelize([1, 2, 3, 4]).foreach(lambda x: accum.add(x))

would, as per the Spark manual, execute in parallel, applying the function to each element. But perhaps that is not quite what the line above is doing.
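
For what it's worth, `foreach` on an RDD is itself a per-element action in core Spark, so the line above should behave as the manual describes: the function runs on the executors, roughly one task per partition, and the accumulator collects the side effects. Below is a rough, Spark-free analogy of that execution model, using threads in place of executors. The `Accumulator` class and `parallel_foreach` helper are hypothetical stand-ins written for this sketch, not Spark APIs.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

class Accumulator:
    """Hypothetical stand-in for a Spark accumulator: an add-only shared counter."""
    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def add(self, x):
        # The lock keeps concurrent updates from different workers consistent.
        with self._lock:
            self.value += x

def parallel_foreach(parts, f):
    """Run f on every element; each partition is handled by its own worker task."""
    def run(part):
        for x in part:
            f(x)
    with ThreadPoolExecutor() as pool:
        # list() waits for every task and re-raises any worker exception.
        list(pool.map(run, parts))

accum = Accumulator(0)
parallel_foreach([[1, 2], [3, 4]], lambda x: accum.add(x))
print(accum.value)  # 10
```

Like the real action, `parallel_foreach` returns nothing; it exists only for its side effects, which is why the result has to be read back from the accumulator.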