Created on 12-20-2019 09:26 AM
Introduction to Application Timeline Server
Metrics for applications, both running and historical, can be retrieved from YARN through the Application Timeline Server. This includes information such as the number of map tasks, reduce tasks, counters, and so on. Application developers can publish application-specific information to the Timeline Server via TimelineClient in the Application Master and/or the application's containers. The information is then queryable via REST APIs. Everything described above is information specific to the application itself.
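As a minimal sketch of the REST query step, the snippet below builds and fetches the URL for a single application from the Timeline Reader. It assumes Timeline Service v2 and an unsecured cluster (no Kerberos/SPNEGO). The host, port, and application ID are placeholders, and the helper names are hypothetical. The /ws/v2/timeline/apps/{app id} path and the fields parameter follow the Apache Hadoop Timeline Service v2 documentation and should be verified against your cluster's version.

```python
import json
import urllib.parse
import urllib.request


def app_entity_url(reader_address, app_id, fields=None):
    """Build the Timeline Reader URL for one application entity.

    reader_address is "host:port" of the Timeline Reader (placeholder value;
    take the real one from your cluster configuration).
    """
    url = f"http://{reader_address}/ws/v2/timeline/apps/{urllib.parse.quote(app_id)}"
    if fields:
        # e.g. fields=["METRICS", "INFO"] asks the reader to include those sections
        url += "?" + urllib.parse.urlencode({"fields": ",".join(fields)})
    return url


def fetch_app_entity(reader_address, app_id, fields=None, timeout=10):
    """GET the application entity and decode the JSON response body."""
    url = app_entity_url(reader_address, app_id, fields)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, fetch_app_entity("reader.example.com:8198", "application_1576000000000_0001", fields=["METRICS"]) would request the metrics published for that (made-up) application ID.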
In addition, generic information about completed applications, such as the queue name, user information, the list of application attempts that ran for an application, and details of each application attempt, can be stored in the Application Timeline Server.
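A rough sketch of how this generic information might be addressed over REST is shown below. It assumes the Timeline Server (v1/v1.5) generic history endpoints under /ws/v1/applicationhistory, as described in the Apache Hadoop Timeline Server documentation. The server address, the filter values, and the helper names are placeholders introduced here for illustration.

```python
import urllib.parse

# Base path of the generic application history REST API (assumed, per Hadoop docs)
HISTORY_BASE = "/ws/v1/applicationhistory"


def apps_url(server_address, **filters):
    """URL listing applications; filters are passed through as query parameters."""
    url = f"http://{server_address}{HISTORY_BASE}/apps"
    if filters:
        url += "?" + urllib.parse.urlencode(sorted(filters.items()))
    return url


def attempts_url(server_address, app_id):
    """URL listing the application attempts that ran for one application."""
    return f"http://{server_address}{HISTORY_BASE}/apps/{urllib.parse.quote(app_id)}/appattempts"
```

The returned URLs can be fetched with any HTTP client; on a secured cluster the request would additionally need to authenticate.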
Let us delve into a live example of an Application Timeline Server. We are using an HDP 3.1 cluster in this example.
Note: The following is for Hadoop Version:
The following shows the Timeline Service and the Timeline Service Reader in the Ambari UI:
Note that the Timeline Server and the Timeline Reader run as separate processes in the cluster and may or may not be collocated on the same host. The following is the output of ps -ef | grep 'timeline server' from the machine on which the Timeline Server is installed:
Timeline Server
In the example demonstrated in this document, we restrict ourselves to embedded HBase storage. (Note: Other storage options, such as standalone or external HBase, are beyond the scope of this document.)
This means that YARN creates an embedded HBase instance by default:
Process capture of Embedded HBase
Flow Chart of Metrics Collection to Storage (HBase)
(Image Source: Publishing_application_specific_data)
Demystifying the Flow Chart of Metrics Flow
The general concept is that the application submission client submits an application to the Resource Manager (RM) via a YarnClient object to request the resources it needs. The RM launches the Application Master (AM) in an allocated container. From this point onwards, the AM becomes the actual owner of the job: it communicates with the YARN cluster and handles application execution. At application launch time, the main tasks of the AM are communicating with the RM to negotiate and allocate resources for future containers and, after container allocation, communicating with the YARN Node Managers (NMs) to launch application containers on them.
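The hand-off described above can be pictured with a small toy model. This is not the YARN API. The class names, the container ID format, and the round-robin placement are simplifications invented for illustration, and a real RM uses a pluggable scheduler. It only mirrors the order of events: the RM places the AM in a first container, then the AM asks the RM for more containers and has the NMs launch them.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    name: str
    launched: list = field(default_factory=list)

    def launch(self, container_id, owner):
        # Record which container was started here and on whose behalf
        self.launched.append((container_id, owner))


@dataclass
class ResourceManager:
    node_managers: list
    next_id: int = 1

    def allocate(self):
        """Hand out a container id on a NodeManager chosen round-robin (toy policy)."""
        container_id = f"container_{self.next_id:02d}"
        nm = self.node_managers[(self.next_id - 1) % len(self.node_managers)]
        self.next_id += 1
        return container_id, nm

    def submit_application(self, app_name, num_tasks):
        # Step 1: the RM launches the Application Master in an allocated container
        container_id, nm = self.allocate()
        nm.launch(container_id, f"{app_name}:AM")
        # Step 2: from here on the AM owns the job
        am = ApplicationMaster(app_name, self)
        am.run(num_tasks)
        return am


@dataclass
class ApplicationMaster:
    app_name: str
    rm: ResourceManager
    containers: list = field(default_factory=list)

    def run(self, num_tasks):
        for i in range(num_tasks):
            # Negotiate a container with the RM, then ask the NM to launch it
            container_id, nm = self.rm.allocate()
            nm.launch(container_id, f"{self.app_name}:task{i}")
            self.containers.append(container_id)
```

Submitting a two-task job to a ResourceManager with two NodeManager objects places the AM in the first container and the two task containers after it, alternating between the nodes.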
Created on 02-27-2020 10:35 AM
This is very good information. Where can I look for more documentation like this, specifically regarding the modes other than "embedded"?
Thank you in advance,
David