Created on 07-02-2016 01:19 PM - edited 08-17-2019 11:48 AM
- Here is a small demo of how NiFi can help you monitor YARN and alert on application failures.
- Here you can view the screen recording that demonstrates how it works!
- Make sure you have your HDP cluster/Sandbox up and running.
- NiFi 0.6.1 / HDF 1.2 is installed and running.
1. Assuming you have the NiFi UI available, let's drop a GetHTTP processor onto the canvas to pull data from the YARN REST API:
Configure the processor with the URL below, which returns all applications in the KILLED or FAILED state (node1 is my Resource Manager):
http://node1:8088/ws/v1/cluster/apps?states=KILLED,FAILED
Let's schedule the processor to run only every 10 seconds so that you don't query too often.
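To try the same query outside NiFi, here is a minimal Python sketch of what the GetHTTP processor does in this step. The host `node1:8088` comes from the article; the function names `build_apps_url` and `fetch_apps` are only illustrative, and `fetch_apps` assumes the Resource Manager is reachable without authentication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

RM_BASE = "http://node1:8088"  # Resource Manager address used in the article


def build_apps_url(states, base=RM_BASE):
    """Return the cluster apps URL filtered by a list of application states."""
    # safe="," keeps the comma-separated state list unescaped, as in the URL above
    query = urlencode({"states": ",".join(states)}, safe=",")
    return f"{base}/ws/v1/cluster/apps?{query}"


def fetch_apps(url, timeout=10):
    """GET the URL and return the decoded JSON document (needs a reachable RM)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


print(build_apps_url(["KILLED", "FAILED"]))
```

Calling `fetch_apps(build_apps_url(["KILLED", "FAILED"]))` every 10 seconds would roughly mirror the processor schedule described above.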
2. As the REST call returns the application details in JSON format, let's use a SplitJson processor to separate the individual application details.
Set the “JsonPath Expression” property to “$.apps.app” in the configuration.
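As a rough illustration of what the `$.apps.app` expression selects, the sketch below splits a response into one record per application. The sample payload is made up for the example, and the handling of an empty result (an `apps` value of `null`) is an assumption about how the Resource Manager answers when nothing matches.

```python
import json


def split_apps(payload):
    """Return the list of application records found under $.apps.app."""
    doc = json.loads(payload)
    # Assumption: "apps" may be null or missing when no application matches
    apps = doc.get("apps") or {}
    return apps.get("app") or []


# Hypothetical sample response with two applications
sample = json.dumps({
    "apps": {
        "app": [
            {"id": "application_1_0001", "state": "FAILED"},
            {"id": "application_1_0002", "state": "KILLED"},
        ]
    }
})

for app in split_apps(sample):
    print(app["id"], app["state"])
```

Each returned dictionary corresponds to one flow file coming out of SplitJson.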
3. Connect GetHTTP to SplitJson for the success relationship and auto-terminate the rest.
4. Let's add an EvaluateJsonPath processor to extract the required fields and add them as flow-file attributes. Configure it as below:
5. Connect SplitJson to EvaluateJsonPath for the split relationship (SplitJson routes each individual application to split rather than success).
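The EvaluateJsonPath configuration is not reproduced in the text, so the mapping below is only a guess at what it might contain. The JSON field names (`id`, `name`, `user`, `state`, `finalStatus`, `diagnostics`) are fields of the YARN application record; the attribute names on the left are hypothetical.

```python
# Hypothetical attribute name -> field of one YARN application record
ATTRIBUTE_FIELDS = {
    "app.id": "id",
    "app.name": "name",
    "app.user": "user",
    "app.state": "state",
    "app.finalStatus": "finalStatus",
    "app.diagnostics": "diagnostics",
}


def extract_attributes(app, fields=ATTRIBUTE_FIELDS):
    """Copy selected fields of one application into string attributes."""
    # Flow-file attributes are strings, so every value is converted;
    # a missing field becomes an empty string
    return {attr: str(app.get(field, "")) for attr, field in fields.items()}


example = {"id": "application_1_0001", "name": "wordcount",
           "user": "hdfs", "state": "FAILED", "finalStatus": "FAILED"}
print(extract_attributes(example))
```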
6. Create and start two controller services, DistributedMapCacheClientService and DistributedMapCacheServer, so that we keep track of all the applications and don't send out duplicate alerts for the same application.
7. Add a PutDistributedMapCache processor to update the cache with the latest applications that failed or were killed. Configure it as below, adding the distributed cache service.
8. Let's auto-terminate the failure relationship and connect the success relationship to a PutEmail processor, which will send out an email for any new failed/killed application.
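The de-duplication idea in steps 6 to 8 can be sketched with an in-memory stand-in for the distributed cache. This assumes the cache is keyed by application id and that an already-present key is treated as a failure (the branch that is auto-terminated), so only first-time entries move on to the email step. `SeenCache` and `new_failures` are illustrative names, not NiFi APIs.

```python
class SeenCache:
    """In-memory stand-in for the distributed map cache."""

    def __init__(self):
        self._entries = {}

    def put_if_absent(self, key, value):
        """Store the entry and return True only if the key was not cached yet."""
        if key in self._entries:
            return False  # duplicate: corresponds to the auto-terminated branch
        self._entries[key] = value
        return True  # new entry: corresponds to the branch feeding PutEmail


def new_failures(apps, cache):
    """Return only the applications that have not been alerted on before."""
    return [app for app in apps if cache.put_if_absent(app["id"], app["state"])]


cache = SeenCache()
polled = [{"id": "application_1_0001", "state": "FAILED"},
          {"id": "application_1_0002", "state": "KILLED"}]
first_poll = new_failures(polled, cache)   # both applications are new
second_poll = new_failures(polled, cache)  # nothing new on the next poll
print(len(first_poll), len(second_poll))
```

Because the REST query returns every killed or failed application on each poll, some form of "seen before" check like this is what keeps the alerts from repeating every 10 seconds.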
9. Make sure you have formatted the email body and subject to include all the information about the failed job:
10. Auto-terminate the success and failure relationships of the PutEmail processor. Once you start the flow, you will get alerts for each killed/failed YARN application. My alert looks like the one below:
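The original alert format is not shown in the text, so the following is only one possible layout, built with Python's standard `email` package. The subject wording, the addresses, and the set of fields in the body are all assumptions.

```python
from email.message import EmailMessage


def build_alert(app, sender, recipient):
    """Build an alert email for one failed or killed application."""
    msg = EmailMessage()
    msg["Subject"] = f"YARN application {app['id']} {app['state']}"
    msg["From"] = sender
    msg["To"] = recipient
    # Body lists the fields an operator is likely to need first
    msg.set_content(
        f"Application: {app['id']}\n"
        f"Name: {app.get('name', '')}\n"
        f"User: {app.get('user', '')}\n"
        f"State: {app['state']}\n"
        f"Diagnostics: {app.get('diagnostics', '')}\n"
    )
    return msg


alert = build_alert(
    {"id": "application_1_0001", "name": "wordcount",
     "user": "hdfs", "state": "FAILED", "diagnostics": "AM container exited"},
    sender="nifi@example.com",      # hypothetical address
    recipient="ops@example.com",    # hypothetical address
)
print(alert["Subject"])
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(alert)` against your own mail server.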
Note: You can also configure your GetHTTP processor to query YARN for long-running applications.
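One way to approach the long-running case, sketched in Python: query applications in the RUNNING state (for example `http://node1:8088/ws/v1/cluster/apps?states=RUNNING`) and keep those whose `elapsedTime`, which YARN reports in milliseconds, exceeds a limit. The 60-minute threshold and the function name are assumptions for illustration.

```python
def long_running(apps, max_minutes):
    """Return the RUNNING applications whose elapsed time exceeds max_minutes."""
    limit_ms = max_minutes * 60 * 1000  # elapsedTime is reported in milliseconds
    return [
        app for app in apps
        if app.get("state") == "RUNNING"
        and int(app.get("elapsedTime", 0)) > limit_ms
    ]


# Hypothetical records: one running for 2 hours, one for 1 minute
running = [
    {"id": "application_1_0007", "state": "RUNNING", "elapsedTime": 7_200_000},
    {"id": "application_1_0008", "state": "RUNNING", "elapsedTime": 60_000},
]
slow = long_running(running, max_minutes=60)
print([app["id"] for app in slow])
```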
Thanks,
Jobin George
Created on 07-04-2016 04:07 AM
@Jobin George This is real cool. I have tried to build a similar functionality tool using Flink as my Stream Processor. I call it Applerts :).
Created on 07-05-2016 04:54 PM
Applerts are way cooler!!! 🙂
Created on 07-25-2016 06:49 PM
can you share the flow XML?
Created on 08-03-2016 03:38 AM
Sorry for the late reply; I didn't get the update as I was not tagged.
Attaching it here: yarn-application-monitor.xml.
Thanks,
Jobin George
Created on 07-26-2018 02:26 PM
My YARN UI is Kerberos-enabled. GetHTTP is complaining about a 401 authentication error. Is there any workaround for this?