02-09-2017 12:31 PM
We have a Spark Streaming application running on a cluster managed by YARN. I see that the job is very slow and that the distribution is uneven,
as shown in the picture, where I have highlighted one node taking most of the RDDs.
From my research, I found that we can use repartition or coalesce, but I am a little lost on which one to use.
I would greatly appreciate it if any of the experts in the community could help out, as we are new to Spark.
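
For anyone comparing the two options, here is a rough sketch of how they differ. It is a toy model in plain Python, not Spark code: `coalesce_model`, `repartition_model`, and the skewed input are made-up names and data for illustration. It assumes that coalesce (without shuffle) only merges whole existing partitions, while repartition does a full shuffle that spreads individual records across the new partitions.

```python
def coalesce_model(partitions, n):
    """Toy model of coalesce without shuffle: whole input partitions are
    merged into n groups, so a large partition is never split up."""
    groups = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Assign each input partition, as a whole, to one output group.
        groups[i * n // len(partitions)].extend(part)
    return groups


def repartition_model(partitions, n):
    """Toy model of repartition: every record is redistributed
    round-robin across n output partitions (a full shuffle)."""
    groups = [[] for _ in range(n)]
    i = 0
    for part in partitions:
        for record in part:
            groups[i % n].append(record)
            i += 1
    return groups


def sizes(partitions):
    """Number of records held by each partition."""
    return [len(p) for p in partitions]


# Hypothetical skewed input: one partition holds 900 of 1000 records.
skewed = [list(range(900))] + [list(range(25)) for _ in range(4)]

print("input:      ", sizes(skewed))
print("coalesce:   ", sizes(coalesce_model(skewed, 4)))
print("repartition:", sizes(repartition_model(skewed, 4)))
```

Under these assumptions the skew survives the coalesce model, because the large partition is kept whole, while the repartition model evens it out at the cost of moving every record. In a real job the equivalent calls would be `repartition(n)` and `coalesce(n)` on the RDD, and whether the shuffle cost is worth it depends on the workload.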
02-09-2017 12:56 PM