Spark streaming job doesn't release resources while idling

Hi All,

Is there a way for a Spark Streaming job to release resources while it is idle and waiting for data to arrive?

I would expect it to request resources from YARN again once data arrives and then continue running, but it never releases its resources back to YARN.

Our Spark jobs are designed to pick up data as soon as a file is dropped, which happens every half hour.

While idle, the driver doesn't release resources, so the streaming resource pool is always full. As a result, all other jobs have to wait, or the settings for this pool (such as getting resources immediately) are never used.

Thanks

Abhishek

Re: Spark streaming job doesn't release resources while idling

This is just what dynamic allocation is for, and you can enable it for a streaming job to add/remove executors in response to demand.


It's not as great an idea for streaming because your job will be mostly idle, and then need to quickly process a new batch of data, and it will take some time to reacquire new executors to do the work. Still, it is entirely possible.
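
As a rough illustration of what enabling dynamic allocation could look like, here is a minimal Python sketch that assembles the relevant `--conf` flags for `spark-submit` on YARN. The property names are standard Spark settings, but the helper functions, the executor limits, the idle timeout, and the script name `streaming_job.py` are assumptions made up for this example, not values from the thread. On YARN, dynamic allocation has traditionally also required the external shuffle service, so the sketch turns that on as well.

```python
from typing import Dict, List


def dynamic_allocation_conf(min_executors: int = 0,
                            max_executors: int = 10,
                            idle_timeout: str = "60s") -> Dict[str, str]:
    """Build Spark properties that let idle executors be released.

    The default values are illustrative assumptions; tune them for your pool.
    """
    if min_executors < 0 or max_executors < min_executors:
        raise ValueError("need 0 <= min_executors <= max_executors")
    return {
        # Let Spark add and remove executors in response to demand.
        "spark.dynamicAllocation.enabled": "true",
        # External shuffle service, so shuffle files outlive removed executors.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        # How long an executor may sit idle before it is released.
        "spark.dynamicAllocation.executorIdleTimeout": idle_timeout,
    }


def spark_submit_args(conf: Dict[str, str], app: str) -> List[str]:
    """Turn the properties into a spark-submit argument list for YARN."""
    args = ["spark-submit", "--master", "yarn"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)  # hypothetical application script
    return args


if __name__ == "__main__":
    conf = dynamic_allocation_conf(min_executors=0, max_executors=8,
                                   idle_timeout="120s")
    print(" ".join(spark_submit_args(conf, "streaming_job.py")))
```

A low `minExecutors` frees the pool between file drops, at the cost of the ramp-up delay described above; raising it trades some idle usage for faster batch starts.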
