
Introduction

- Here is a small demo of how NiFi can help you monitor and alert on YARN application failures.

- Here you can view the screen recording that demonstrates how it works!

Prerequisites

- Make sure you have your HDP cluster/Sandbox up and running.

- NiFi 0.6.1 / HDF 1.2 is installed and running.

Steps:

1. Assuming you have the NiFi UI available, let's drop a GetHTTP processor onto the canvas to pull data from the YARN REST API:

Configure the processor with the URL given below, which pulls all applications in the KILLED and FAILED states. node1 is my ResourceManager host:

http://node1:8088/ws/v1/cluster/apps?states=KILLED,FAILED

5424-1.jpg

Let's schedule the processor to run only every 10 seconds so that you don't query the ResourceManager too often.
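For readers who want to see what this step amounts to outside NiFi, the poll can be sketched in Python. This is only an illustration and not part of the flow: the function names `build_apps_url`, `fetch_apps`, and `poll` are made up for the example, while the host, port, and path come from the URL above.

```python
import json
import time
import urllib.request


def build_apps_url(rm_host, states=("KILLED", "FAILED"), port=8088):
    """Build the ResourceManager URL that GetHTTP is pointed at in this step."""
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps?states={','.join(states)}"


def fetch_apps(url, timeout=10):
    """Fetch and decode the JSON document returned by the ResourceManager."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll(rm_host, interval_seconds=10):
    """Yield one response per interval, mirroring the 10-second run schedule."""
    url = build_apps_url(rm_host)
    while True:
        yield fetch_apps(url)
        time.sleep(interval_seconds)
```

Calling `build_apps_url("node1")` reproduces the URL used in the processor; `poll` is a generator, so nothing is requested until you iterate over it.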

2. As the REST call outputs the application details in JSON format, let's use a SplitJson processor to separate the individual application details.

Set the “JsonPath Expression” property to “$.apps.app” in the configuration.

5425-2.jpg
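To make the effect of the “$.apps.app” expression concrete, here is a small Python sketch of the same split. The helper name `split_apps` and the sample application IDs are invented for illustration, and treating a missing or null "apps" value as an empty result is a defensive assumption of this sketch.

```python
import json


def split_apps(response_text):
    """Return one JSON string per application, like splitting on $.apps.app."""
    document = json.loads(response_text)
    # Defensive assumption: a missing or null "apps" value means no results.
    apps = (document.get("apps") or {}).get("app") or []
    return [json.dumps(app) for app in apps]


# Made-up sample response with two applications.
sample = json.dumps({
    "apps": {
        "app": [
            {"id": "application_0000000000000_0001", "state": "FAILED"},
            {"id": "application_0000000000000_0002", "state": "KILLED"},
        ]
    }
})

for flow_file in split_apps(sample):
    print(flow_file)
```

Each printed line corresponds to one flow file leaving SplitJson, carrying the details of a single application.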

3. Connect GetHTTP to SplitJson on the success relationship and auto-terminate the remaining relationships.

4. Let's add an EvaluateJsonPath processor to extract the required fields and add them as flow-file attributes. Configure it as below:

5426-3.jpg
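The extraction in this step can be pictured with the following Python sketch. The attribute names (`app.id` and so on) are hypothetical, since the actual mapping is only shown in the screenshot; the fields `id`, `name`, `user`, `state`, and `diagnostics` are ones the YARN applications API returns for each application. The helper handles only simple top-level paths of the form `$.field`, and returning an empty string for an absent field is a choice made for this sketch.

```python
import json

# Hypothetical attribute-to-JsonPath mapping; the article's own mapping is in the screenshot.
ATTRIBUTE_PATHS = {
    "app.id": "$.id",
    "app.name": "$.name",
    "app.user": "$.user",
    "app.state": "$.state",
    "app.diagnostics": "$.diagnostics",
}


def evaluate_paths(flow_file_content, attribute_paths=ATTRIBUTE_PATHS):
    """Resolve simple top-level paths ($.field) into string attributes."""
    app = json.loads(flow_file_content)
    attributes = {}
    for attribute, path in attribute_paths.items():
        if not path.startswith("$."):
            raise ValueError(f"unsupported path in this sketch: {path}")
        value = app.get(path[2:])  # drop the leading "$."
        # Sketch-only choice: an absent field becomes an empty string.
        attributes[attribute] = "" if value is None else str(value)
    return attributes
```

The returned dictionary plays the role of the flow-file attributes that later processors can reference.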

5. Connect SplitJson to EvaluateJsonPath on the split relationship (SplitJson emits the individual applications on split rather than success).

6. Create and start two controller services, DistributedMapCacheClientService and DistributedMapCacheServer, so that we keep track of all the applications and don't send out duplicate alerts for the same application.

5427-4.jpg

7. Add a PutDistributedMapCache processor to update the cache with the latest applications that have failed or been killed. Configure it as below, selecting the distributed cache service created above.

5428-5.jpg
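The de-duplication idea behind the cache can be sketched in Python as a put-if-absent lookup keyed by application ID. This is an in-memory stand-in, not the NiFi service itself: the class and function names are invented, and it assumes the processor is configured so that an application ID already in the cache is routed to failure (the exact settings are only visible in the screenshot).

```python
class SeenApplications:
    """In-memory stand-in for the distributed map cache, keyed by application ID."""

    def __init__(self):
        self._entries = {}

    def put_if_absent(self, app_id, value):
        """Store value and return True only the first time app_id is seen."""
        if app_id in self._entries:
            return False
        self._entries[app_id] = value
        return True


def route(cache, app_id, state):
    """Return "success" for a new application and "failure" for a repeat."""
    return "success" if cache.put_if_absent(app_id, state) else "failure"


cache = SeenApplications()
print(route(cache, "application_0000000000000_0001", "FAILED"))  # first sighting
print(route(cache, "application_0000000000000_0001", "FAILED"))  # repeat from a later poll
```

Because the same failed application is returned by every poll, only the first sighting should continue down the flow; repeats are the ones that get dropped.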

8. Let's auto-terminate the failure relationship and connect the success relationship to a PutEmail processor, which will send out an email for any new failed/killed application.

9. Make sure you have formatted the email subject and body to include all the information about the failed job:

5429-6.jpg
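As a rough picture of what a formatted alert contains, the Python sketch below builds (but does not send) a message from the extracted attributes. The attribute names, the subject wording, and the addresses are all assumptions for illustration; in the flow itself the subject and body are configured on the PutEmail processor as shown in the screenshot.

```python
from email.message import EmailMessage


def build_alert(attributes, sender, recipient):
    """Build (but do not send) an alert email from extracted attributes."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    # Hypothetical attribute names; use whatever your EvaluateJsonPath step defines.
    message["Subject"] = (
        f"YARN application {attributes['app.id']} is {attributes['app.state']}"
    )
    # One "name: value" line per attribute, sorted for a stable layout.
    message.set_content(
        "\n".join(f"{name}: {value}" for name, value in sorted(attributes.items()))
    )
    return message


alert = build_alert(
    {"app.id": "application_0000000000000_0001", "app.state": "FAILED", "app.user": "demo"},
    sender="nifi@example.com",
    recipient="ops@example.com",
)
print(alert["Subject"])
print(alert.get_content())
```

Putting the application ID and state in the subject makes alerts easy to scan, while the body carries the remaining details.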

10. Auto-terminate the success and failure relationships of the PutEmail processor. Once you start the flow, you will get an alert for each killed/failed YARN application. My alert looks like this:

5430-7.jpg

Note: You can now also configure your GetHTTP processor to query YARN for long-running applications.
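One way to picture the long-running check is to query applications in the RUNNING state and compare each application's `elapsedTime` (reported by the YARN API in milliseconds) against a threshold. The sketch below is a minimal Python illustration with made-up sample data; the one-hour threshold and the helper name `long_running` are assumptions, not something prescribed by the flow above.

```python
def long_running(apps, max_elapsed_ms):
    """Return the applications whose elapsedTime exceeds max_elapsed_ms."""
    return [app for app in apps if app.get("elapsedTime", 0) > max_elapsed_ms]


# Same endpoint as before, but asking for running applications instead.
RUNNING_URL = "http://node1:8088/ws/v1/cluster/apps?states=RUNNING"

ONE_HOUR_MS = 60 * 60 * 1000  # hypothetical threshold

# Made-up sample of what the "app" list might contain.
sample_apps = [
    {"id": "application_0000000000000_0003", "elapsedTime": 2 * ONE_HOUR_MS},
    {"id": "application_0000000000000_0004", "elapsedTime": 5 * 60 * 1000},
]

print([app["id"] for app in long_running(sample_apps, ONE_HOUR_MS)])
```

Only the first sample application exceeds the threshold, so it is the only one that would trigger an alert.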

Thanks,

Jobin George

Comments

@Jobin George This is really cool. I have tried to build a tool with similar functionality using Flink as my stream processor. I call it Applerts :).

@Hemant Kumar Dindi

Applerts are way cooler!!! 🙂


Can you share the flow XML?


@Timothy Spann,

Sorry for the late reply; I didn't get the update as I was not tagged.

Attaching it here: yarn-application-monitor.xml.

Thanks,

Jobin George


My YARN UI is Kerberos-enabled, and GetHTTP is complaining about a 401 authentication error. Is there any workaround for this?