Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

No. I'm the only user connected. And while my cluster is not kerberized, my Ambari connection is made through HTTPS.

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

Our jobs are indeed stuck in "ACCEPTED" status and eventually fail due to a time-out. I can't get any further useful log information. Checking the RM UI logs for "FAILED" jobs, I noticed that it started on April 30, ran for 3 hours straight, then stopped. It started again on May 1 and has continued until today.

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

Super Collaborator

This typically happens when the available resources are exhausted. The limit could be the memory of the node, but also the capacity of the queue itself. Can you check whether the jobs getting stuck are all submitted to the same queue?

https://community.hortonworks.com/questions/96750/yarn-application-stuck-in-accepted-state.html
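
One way to answer the queue question is to ask the ResourceManager REST API (`/ws/v1/cluster/apps`) for applications in the ACCEPTED state and count them per queue. The following is only a rough sketch: the function names are made up for illustration, and it assumes the RM web service is reachable over plain HTTP without authentication.

```python
import json
from collections import Counter
from urllib.request import urlopen


def count_by_queue(payload):
    """Count applications per queue in a /ws/v1/cluster/apps response."""
    # The "apps" value can be null when nothing matches, hence the fallbacks.
    apps = (payload.get("apps") or {}).get("app") or []
    return Counter(app.get("queue", "unknown") for app in apps)


def fetch_accepted_apps(rm_url):
    """Fetch applications currently in the ACCEPTED state from the RM."""
    url = f"{rm_url}/ws/v1/cluster/apps?states=ACCEPTED"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Calling `count_by_queue(fetch_accepted_apps("http://rm-host:8088"))`, where `rm-host` is a placeholder for your ResourceManager host, gives a per-queue count. If everything piles up in one queue, that queue's capacity is the first thing to look at.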

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

I'm having the exact same issue. All of a sudden yesterday, a cluster that has been up and running for weeks started spawning six of these at a time for no apparent reason. I kill them and they come back. I've pored over every single log, checked every nook and cranny, and cannot figure it out. I have no idea where they are coming from. It is most definitely not a resource issue - these jobs shouldn't even be running - and it's not cron. They suck up major CPU when they run.

If anyone has any thoughts I'd be grateful to hear them!

The other odd thing is that in the past I would see one of these jobs - but only one - never like this.

I'm having the exact same problem. All of a sudden, a cluster that has not changed is spawning these jobs, which sit in ACCEPTED status as user dr.who and are called MYYARN. I've pored over every single log and bounced my cluster several times; there are no cron jobs, and it is most definitely not a resource issue. Looking at old logs, it appears to have happened periodically before, but only once or twice, and then it stopped. Yesterday it started running wild, and as quickly as I kill them off it starts another 6 of the exact same job. If anyone has any insight at all I'd be grateful.

And I'm not even using HDP - this is standard Apache Hadoop/YARN/Spark 2.7.5.

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

I am wondering if this is a security loophole, since my cluster is not yet kerberized!

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

I have the same question (for the same reason, i.e. not being kerberized yet).

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

A temporary workaround could be to set hadoop.http.staticuser.user=testuser (the default value is dr.who).

Then assign testuser to a queue, say testqueue, with 1% of the resources?

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

(this might be the real answer)

It looks like some kind of attack. I have seen it on 2 clusters, 1 running HDP and 1 running Hadoop 2.7.4.

Re: Why are there dr.who "MYYARN" applications running and all failing in what seems to be a loop?

New Contributor

Using the iptables firewall, I blocked port 8088 and the situation improved. Too soon to tell if this is a real fix.
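
To confirm from the outside that a firewall rule like this is actually in effect, one option is a plain TCP connect test run from a machine that should no longer have access. This is a minimal sketch using only the standard library; the function name is made up for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("rm-host", 8088)` (with `rm-host` standing in for the ResourceManager's public address) should return `False` from an outside machine once the rule is in place. A `True` result means the RM web port is still exposed.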