YARN Applications display wrong formatted duration

Expert Contributor

Hello,

 

I am having a problem for which I can't find any logical solution. Every job that requires YARN shows up in the "YARN Applications" UI in Cloudera Manager. Although I can see all the running jobs in the YARN Applications UI, the ResourceManager UI, or the Spark UI, I have to widen my time selector to a year or two to see the finished jobs.

 

I think this has something to do with the displayed time. All the running jobs show the same static duration, `17540.7d`:

 

Screenshot 2018-01-09 18.14.26.png

 

At the same time, these applications show up in the `ResourceManager` UI with the correct date/time:

 

Screenshot 2018-01-15 19.58.56.png

As you can see, this makes it really hard to monitor and track anything in the YARN Applications view in Cloudera Manager.

 

Cloudera Manager Express: 5.13.1

CDH: 5.13.1

Ubuntu Server 16.04

I also checked the date/time on all the machines to see whether they were out of sync, but I couldn't find any issue in my cluster.

 

 

NOTE: There is only one similar issue here, but I guess that poster can't see any jobs even after widening the time window (I can see jobs with a wider time window of 1-2 years):

http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Completed-YARN-applications-not-visib...

 

Best,

Maziyar

 
1 ACCEPTED SOLUTION

Expert Contributor

Hi Maziyar and Li,

 

You are definitely on the right track here. I see that the top-level root queue has "aclAdministerApps=maziyar,admin", which limits YARN API access to some of the time metrics for the application. If you're not one of these users when you make the API call, you won't get the correct values returned for:

 

  1. startedTime
  2. finishedTime
  3. elapsedTime
  4. logAggregationStatus
  5. amHostHttpAddress
  6. usedResources
  7. allocatedMB
  8. allocatedVCores
  9. runningContainers

You will still get some basic application info, so you'll see the application, but the metrics will be wrong, just as you reported.
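
As a sanity check on the reported `17540.7d`, here is a small sketch of the arithmetic. It assumes (this is an interpretation, not something confirmed in the thread) that a hidden startedTime comes back as 0, so the UI ends up measuring the duration from the Unix epoch. The helper name `epoch_plus_days` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch, i.e. what a start time of 0 would mean.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_plus_days(tenths_of_day: int) -> datetime:
    """Return the instant that lies the given duration after the epoch.

    The duration is passed in tenths of a day so the arithmetic stays in
    whole seconds (0.1 day = 8640 s) and avoids float rounding.
    """
    return EPOCH + timedelta(seconds=tenths_of_day * 8640)


# 17540.7d, the duration shown for every running job in the report.
print(epoch_plus_days(175407).isoformat())  # 2018-01-09T16:48:00+00:00
```

The result falls on 2018-01-09, the same day as the first screenshot in the question, which is consistent with the duration being computed as "now minus 0".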

 

So, the question is which user is the CM service using to interact with the YARN API.  I did some testing and reproduced your issue by limiting the aclAdministerApps property to "yarn".  I then found that when I add "dr.who" to aclAdministerApps at the root level, it starts working properly.

 

So, try modifying your root level ACLs to be "aclAdministerApps=maziyar,admin,dr.who", refresh the Dynamic Resource Pool (DRP) configuration, and see if it resolves the issue for you.
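
To check whether the change took effect, one option is to look at what the ResourceManager REST API returns for each application. The sketch below is only an illustration: the host name is a placeholder, and it assumes that a masked startedTime shows up as 0 or is missing from the response. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def apps_with_hidden_times(payload: dict) -> list:
    """Return the IDs of apps whose startedTime looks masked (missing or <= 0).

    `payload` is the parsed JSON from the ResourceManager's
    /ws/v1/cluster/apps endpoint. An empty cluster returns {"apps": null},
    which is treated as "no applications".
    """
    apps = (payload.get("apps") or {}).get("app") or []
    return [app.get("id") for app in apps if app.get("startedTime", 0) <= 0]


def fetch_apps(rm_base_url: str) -> dict:
    """Fetch the application list from a ResourceManager web endpoint."""
    with urlopen(f"{rm_base_url}/ws/v1/cluster/apps", timeout=10) as resp:
        return json.load(resp)


# Example usage (placeholder host; 8088 is the usual RM web port):
# hidden = apps_with_hidden_times(fetch_apps("http://rm-host.example.com:8088"))
# print(hidden or "all applications report a start time")
```

If the list is empty after refreshing the DRP configuration, the caller used by this request can read the time metrics again.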

 

Nick


17 REPLIES

Expert Contributor
Hi Nick,

Thanks for the advice. I have added "dr.who" to the list and now everything is back to normal! Many thanks mate 🙂

Expert Contributor

Maziyar,

 

I was discussing this issue internally and adding "dr.who" to the adminACL has the side effect of allowing all users to have access, so we don't want that.  I know we're on the right track here, we just need to get the correct user or group added to the adminACL for CM.  I'm researching and will update as soon as I have the answer!
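
To make the side effect concrete, here is a rough sketch of how a Hadoop-style ACL string ("users groups", with `*` meaning everyone) could be evaluated. It is a simplification written for illustration, not the ResourceManager's actual code, and both function names are made up. The key assumption is that on a non-kerberized cluster unauthenticated web callers are all treated as Hadoop's default static web user, "dr.who", so allowing that name effectively allows anyone.

```python
def parse_acl(acl: str):
    """Split an ACL string of the form "user1,user2 group1,group2".

    Returns None for the wildcard "*", otherwise a (users, groups) pair of sets.
    A single space (no users, no groups) yields two empty sets.
    """
    if acl.strip() == "*":
        return None
    users_part, _, groups_part = acl.partition(" ")
    users = {u.strip() for u in users_part.split(",") if u.strip()}
    groups = {g.strip() for g in groups_part.split(",") if g.strip()}
    return users, groups


def is_allowed(acl: str, user: str, user_groups=()) -> bool:
    """Return True if the user, or one of the user's groups, is in the ACL."""
    parsed = parse_acl(acl)
    if parsed is None:
        return True  # wildcard: everyone is allowed
    users, groups = parsed
    return user in users or any(g in groups for g in user_groups)


print(is_allowed("maziyar,admin", "dr.who"))         # False
print(is_allowed("maziyar,admin,dr.who", "dr.who"))  # True
```

With the original ACL the anonymous caller is rejected, and with "dr.who" added every anonymous caller is accepted, which is the trade-off described above.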

 

Nick

Expert Contributor
Fantastic! Many thanks Nick and looking forward to the right solution 🙂

Expert Contributor

Hi Maziyar,

 

I've found that CM uses the "hue" user to interact with the YARN API, so try changing the root level ACL to be "aclAdministerApps=maziyar,admin,hue", refresh the Dynamic Resource Pool (DRP) configuration, and test if it still resolves the issue for you.  This will be much more restricted than using "dr.who" but allow the CM Web UI to function properly.

 

Nick

Expert Contributor
Hi Nick,

Unfortunately, removing dr.who and adding hue brought back the same problem I had initially. I agree that adding hue would be much safer and more restrictive than dr.who, but it didn't work.

I am looking forward to something similar to hue to solve this issue 🙂

Many thanks,
Maziyar

Expert Contributor

Hi Maziyar,

 

I'm digging into this again. I clearly see messages in the RM log showing that "dr.who" is the user accessing the YARN API. I'm researching further so that I can hopefully provide the correct answer!

 

Nick

Expert Contributor

Hi Maziyar,

 

The information I had about the "hue" user being used by CM to access the YARN API is correct for kerberized clusters, but in your case we know that the cluster is not kerberized and we see that "dr.who" is used by CM. Consequently, I think that adding "dr.who" to the aclAdministerApps property is the only solution for now.

 

I am creating an internal improvement request for Cloudera Manager (CM) to also use the "hue" user if ACLs are turned on in a non-kerberized cluster. That way the behavior will be consistent and will provide some level of restriction on who can administer queues in a non-kerberized environment.

 

EDIT: Internal Improvement JIRA created for CM - look for this change in a future release of CDH (no guarantees, but I hope we implement this change)

 

Nick

Expert Contributor
Hi Nick,

That would be great! Thank you so much for your time and for helping me with this matter. I really appreciate it 🙂