There's no control in place to restrict access to the job tracker portal (YARN logs). The URL is open to anyone who knows it. I am looking for a way to secure these log URLs (YARN application history, job history, Spark history). What is the best way to lock these URLs down to specific groups, or to force some kind of authentication (provide login credentials), so they are not open to everyone who is aware of them?
Any thoughts or suggestions on the best way to do this?
Usually, UI security for YARN and Spark is done via Kerberos. If you have enabled Kerberos authentication in your cluster and you are using Cloudera Manager, the following can be set to enable SPNEGO authentication for the YARN and HDFS UIs:
Enable Kerberos Authentication for HTTP Web-Consoles
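If you are not on Cloudera Manager, roughly the same thing can be done by hand in core-site.xml on a plain Apache Hadoop install. This is only a sketch: the realm EXAMPLE.COM and the keytab path are placeholders I made up, and Cloudera Manager normally generates these settings for you when the checkbox above is ticked, so don't edit them manually on a managed cluster.

```xml
<!-- core-site.xml (sketch; realm and keytab path are placeholders) -->
<configuration>
  <!-- Install the authentication filter on the Hadoop web consoles -->
  <property>
    <name>hadoop.http.filter.initializers</name>
    <value>org.apache.hadoop.security.AuthenticationFilterInitializer</value>
  </property>
  <!-- Require Kerberos (SPNEGO) instead of the default "simple" mode -->
  <property>
    <name>hadoop.http.authentication.type</name>
    <value>kerberos</value>
  </property>
  <!-- Reject requests that carry no identity -->
  <property>
    <name>hadoop.http.authentication.simple.anonymous.allowed</name>
    <value>false</value>
  </property>
  <!-- HTTP service principal; _HOST is replaced with the local hostname -->
  <property>
    <name>hadoop.http.authentication.kerberos.principal</name>
    <value>HTTP/_HOST@EXAMPLE.COM</value>
  </property>
  <!-- Keytab holding the HTTP principal (placeholder path) -->
  <property>
    <name>hadoop.http.authentication.kerberos.keytab</name>
    <value>/etc/security/keytabs/spnego.service.keytab</value>
  </property>
</configuration>
```

After a restart, a browser without a valid Kerberos ticket should get a 401 from the UI instead of the page.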
To provide authorization, I think you can enable ACLs and then specify the admins via yarn.admin.acl.
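For example, something like this in yarn-site.xml. The user yarnadmin and the group hadoop-admins are made-up names, so substitute your own. The value format is a comma-separated list of users, then a space, then a comma-separated list of groups.

```xml
<!-- yarn-site.xml (sketch; user and group names are hypothetical) -->
<configuration>
  <!-- Turn on ACL checks in YARN -->
  <property>
    <name>yarn.acl.enable</name>
    <value>true</value>
  </property>
  <!-- Cluster admins: "users groups"; the default "*" means everyone -->
  <property>
    <name>yarn.admin.acl</name>
    <value>yarnadmin hadoop-admins</value>
  </property>
</configuration>
```

As far as I know, ACLs only mean something once the UI knows who the caller is, so this goes together with the Kerberos/SPNEGO authentication above rather than replacing it.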
This documentation may help:
For Spark see:
(see the spark section)