Community Articles

Find and share helpful community-sourced technical articles.
Labels (2)
avatar
Master Guru

Use Case: Store log data in Hadoop Data Lake and Send Curated, Reduced Set to Sumologic via REST API

9625-sumo6.png

Integration

The integration point for sending log data to Sumologic is their HTTP Source. To send data you must setup an HTTP Source in Sumologic from your web console as shown below. Take the HTTP string they give to you and put into an InvokeHTTP processor with POST, it will look something like this: https://endpoint1.collection.us2.sumologic.com/receiver/v1/http/ZaLongCodeLong

9627-sumo2.png

9628-sumo3.png

9626-sumo4.png

I noticed IP Address in the data, so I decided to parse it out:

${regex.6:substringAfterLast('source ip: '):replaceAll('\)','')}

Then send it to MaxMind for processing. The MaxMind GeoIP free database is easy to download and use with NiFi. Just add the GeoIP processor and connect the field and the file location.

Finally displaying and charting data is up next, easy as pie in Zeppelin. Just query my Phoenix data.

9629-log1.png

The flow is a bit long as I am using RegEx to convert the logs from NiFi Log4J format to individual attributes then make them into a JSON file and convert to SQL upsert for Phoenix insert. I log all failures to a file.

9630-log2.png

9631-log3.png

9632-log9.png

Transmitting Log Data

It's pretty easy to integrate with Sumologic. They have a nice HTTP endpoint to send this data. They will accept JSON and many other text formats. They have a native agent, which can be interfaced with as well via several logging mechanisms. I asked them about it and I may work on that in the future.

Apache Phoenix Table for Log Data

0: jdbc:phoenix:tspannserver> !describe nifilogs
+------------+--------------+-------------+--------------------+------------+------------+--------------+----------------+-----------------+-----------------+-----------+---------+
| TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME  |    COLUMN_NAME     | DATA_TYPE  | TYPE_NAME  | COLUMN_SIZE  | BUFFER_LENGTH  | DECIMAL_DIGITS  | NUM_PREC_RADIX  | NULLABLE  | REMARKS |
+------------+--------------+-------------+--------------------+------------+------------+--------------+----------------+-----------------+-----------------+-----------+---------+
|            |              | NIFILOGS    | SDATE              | 12         | VARCHAR    | null         | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | FRAGID             | 12         | VARCHAR    | null         | null           | null            | null            | 0         |         |
|            |              | NIFILOGS    | MSG                | 12         | VARCHAR    | null         | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | MODULE             | 12         | VARCHAR    | null         | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | STIME              | 12         | VARCHAR    | null         | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | STYPE              | 12         | VARCHAR    | null         | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | SCLASS             | 12         | VARCHAR    | null         | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | GEOCITY            | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | GEOLATITUDE        | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | GEOLONGITUDE       | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | GEOCOUNTRY         | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | GEOPOSTALCODE      | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | GEOCOUNTRYISOCODE  | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
|            |              | NIFILOGS    | IPADDRESS          | 12         | VARCHAR    | 255          | null           | null            | null            | 1         |         |
+------------+--------------+-------------+--------------------+------------+------------+--------------+----------------+-----------------+-----------------+-----------+---------+

NiFi Apache Phoenix (HBase) SQL Upsert (ReplaceText)

upsert into nifilogs (sdate,fragid,msg,module,stime,stype,sclass, geocity, geolatitude, geolongitude, geocountry, geopostalcode, geocountryisocode, ipaddress, geostate) 
values ('${'date'}','${'fragment.identifier'}', '${'msg'}','${'module'}','${'time'}','${'type'}','${'class'}','${'ipaddress.geo.city'}',
'${'ipaddress.geo.latitude'}','${'ipaddress.geo.longitude'}','${'ipaddress.geo.country'}','${'ipaddress.geo.postalcode'}','${'ipaddress.geo.country.isocode'}','${'ipaddress'}','${'ipaddress.geo.subdivision.isocode.0'}')

Note the use of stime, stype, sclass, sdate; I am trying to avoid using built-in SQL keywords. I added some fields for the geo encoding that will come from MaxMind database. I parse out IP Address from the main log record.

References:

4,186 Views