
YARN default queue full of unknown jobs

New Contributor

I administer a small Hortonworks Hadoop cluster that consistently runs two custom Spark applications, each in its own queue (dev and prod). The default queue is used for services like the Thrift server, Zeppelin, etc. For the last two weeks some process has been submitting several applications each hour to the root.default queue on behalf of the users hadoop, getsss and some others. More than 12 thousand apps have been submitted to date, and the Node Managers die after several minutes whenever I restart YARN or the whole cluster.

I know nothing about these applications. They are all in either the ACCEPTED or FAILED state. What bothers me is:

1. Who (which service or user, and from which machine) keeps submitting these apps? The cluster is hosted in the cloud on an internal network and is accessible only via several ports forwarded through an edge gateway (SSH and the web UIs of Hue, Ambari, YARN and Zeppelin).

2. How do I stop this from happening? I see the following solutions:

- Block the default queue from any submissions after cluster startup, and clear the default queue (a config sketch follows below).

I didn't find a decent way to clear the queue of 12,000 apps at once, and it takes ages to kill them one by one.

I would still have to reopen the queue to restart e.g. Zeppelin, and blocking it each time seems like a bad idea.

- Delete the default queue and reconfigure all services to use another queue as the default.

This also seems like a painful and ugly solution.

- Find out what submits the apps and kill it with fire.

I would still have to clear the queue either way, so the last question is:

3. How do I clear the queue of this mess?
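
For the first option under question 2, I guess something along these lines in the Capacity Scheduler configuration (Ambari > YARN > Capacity Scheduler, or capacity-scheduler.xml) would close root.default to everyone but an admin user. This is only a sketch; "yarn" as the admin user is a placeholder for whatever user your services run as. Since submit ACLs are inherited from parent queues, root has to be locked down as well:

yarn.scheduler.capacity.root.acl_submit_applications=yarn
yarn.scheduler.capacity.root.default.acl_submit_applications=yarn

Or, to reject all new submissions while letting running apps finish:

yarn.scheduler.capacity.root.default.state=STOPPED

Either change can be applied without a restart via yarn rmadmin -refreshQueues.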

Thanks in advance for your help!

1 ACCEPTED SOLUTION

New Contributor

Hello,

Last month I had the same trouble in my cluster.

The temporary solution was to block the ResourceManager port 8088.

However, this is not a definitive solution.
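
On the ResourceManager host this can be done with iptables, for example; a minimal sketch, assuming the internal subnet is 10.0.0.0/8 (substitute your own, and persist the rules however your distribution supports):

iptables -A INPUT -p tcp --dport 8088 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8088 -j DROP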

Regards


6 REPLIES

Expert Contributor

Hi @Oleg Parkhomenko,

You should be able to kill all the queued jobs with this script:

for app in $(yarn application -list -appStates ACCEPTED | awk '$1 ~ /^application_/ { print $1 }'); do yarn application -kill "$app"; done

Just put it in a .sh script and run it with a user that is allowed to kill the apps.
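
If you want to gauge the backlog before killing anything, something like this counts the matching apps first (-appStates also accepts a comma-separated list, e.g. ACCEPTED,RUNNING):

yarn application -list -appStates ACCEPTED 2>/dev/null | grep -c '^application_'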

Best regards,
Michel

New Contributor

@msumbul

Hi! Thank you for the script. It solves the part about the messy default queue, but even after I clear it, new applications get submitted in place of the killed ones. A dirty solution would be to put your script on cron, but I want to stop receiving them for good. I found out that they are all submitted on behalf of the user dr.who, the YARN UI and WebHDFS user. What may be the cause of so many apps being submitted by this particular user? Can I block this user from submitting apps, or would that break my YARN UI? If I can, how do I block a user from submitting to the queue?
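
Since dr.who is the identity Hadoop assigns to unauthenticated HTTP users (the hadoop.http.staticuser.user default), I assume these submissions come in through the ResourceManager web UI / REST API on port 8088 rather than from a shell user. To see which machines are hitting that port, I am thinking of something like this on the ResourceManager host (a sketch; the log path is a guess for a stock HDP layout, and the audit line may not always record a useful IP):

tcpdump -nn -i any port 8088

grep 'Submit Application Request' /var/log/hadoop-yarn/yarn/yarn-yarn-resourcemanager-*.log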

Expert Contributor

Hi @Oleg Parkhomenko,

The following link describes how you can secure YARN queues so that only specific users can submit jobs to specific queues; it is done with Ranger:
https://community.hortonworks.com/articles/10797/apache-ranger-and-yarn-setup-security.html

Normally, if you are in a Kerberized environment, you should not have jobs running as dr.who.

Michel


New Contributor

@sidoine kakeuh fosso Thank you, I actually did this earlier today and it stopped those "spam" apps. I still don't know the source; the only guess I have is that somebody discovered our public IP address and practically DDoSed YARN for some reason.

Did you investigate your YARN/Hive/WebHCat logs for any alien IPs or queries? Did you manage to find anything?

I tried but gave up; no trace of who it might be.

Anyway, thanks for your answer; this is the closest to a definitive solution.

Best regards

New Contributor

Hi @Oleg Parkhomenko,

Hope you are doing great. I would just like to know whether your issue was resolved, and how? I had the same issue too and am completely fed up with it.

Thanks in advance,

Sanjay