Nifi ListenSyslog howto ingest into HAWQ


New Contributor

Hi,

1. In the first step, we ingest syslog messages with ListenSyslog and PutHDFS into our data lake.

2. In the second step, we ingest them into HAWQ using PXF external table technology.

CREATE EXTERNAL TABLE ext_syslog_hist
(message TEXT)
LOCATION ('pxf://xxx.xxx.com:8020/apps/nifi/syslog/history/*?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E'\t');

Every minute, thousands of small (~100 byte) files are created in HDFS.
So I'm looking for a way to ingest the incoming ListenSyslog data directly into a HAWQ table.

Regards
Timo

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Nifi ListenSyslog howto ingest into HAWQ

You could use PutSQL to issue INSERT statements directly against HAWQ. Assign the data item values to attributes, as described in the PutSQL documentation. Then put the INSERT statement in the content of the Flow File, with '?' placeholders for the individual values.
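A minimal sketch of that pattern (the table name and sample values are assumptions, not from the thread): the flow file content carries the parameterized INSERT, and `sql.args.N.type` / `sql.args.N.value` attributes supply the values, as described in the PutSQL documentation.

```sql
-- Flow file content (the statement PutSQL executes):
INSERT INTO syslog_hist (message) VALUES (?);

-- Flow file attributes read by PutSQL:
--   sql.args.1.type  = 12      (JDBC type code for VARCHAR)
--   sql.args.1.value = <30>Aug 11 06:27:26 xxx.xxx.com systemd[28892]: Stopping Timers.
```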


Re: Nifi ListenSyslog howto ingest into HAWQ

New Contributor

Thanks,
that was really easy.
I just generate "INSERT INTO table" statements containing the flow file content with the ReplaceText processor and route them to the PutSQL processor.
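For reference, a ReplaceText configuration along those lines might look like this (the processor property names are real; the table name and regex are assumptions):

```sql
-- ReplaceText processor (sketch):
--   Replacement Strategy : Regex Replace
--   Search Value         : (?s)(^.*$)
--   Replacement Value    : INSERT INTO syslog_hist (message) VALUES ('$1')
-- The rewritten content then flows to PutSQL, which executes it as-is.
-- Note: embedding raw syslog text in a literal this way requires
-- escaping any single quotes in the message.
```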

regards
Timo

Re: Nifi ListenSyslog howto ingest into HAWQ

New Contributor

Is this equivalent to: INSERT INTO <table> (col1, col2, col3) VALUES (?, ?, ?);

Re: Nifi ListenSyslog howto ingest into HAWQ

You could also use MergeContent before PutHDFS so that you can merge together all of the small files based on size or time before writing to HDFS.
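As an illustration (thresholds here are assumptions, not recommendations from the thread), MergeContent can batch the small syslog flow files before PutHDFS:

```sql
-- Example MergeContent settings for batching before PutHDFS:
--   Merge Strategy            : Bin-Packing Algorithm
--   Minimum Number of Entries : 1000
--   Minimum Group Size        : 64 MB
--   Max Bin Age               : 5 min
-- Whichever threshold is reached first closes the bin and
-- emits one merged flow file instead of thousands of tiny ones.
```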

Re: Nifi ListenSyslog howto ingest into HAWQ

New Contributor

Hi Bryan,

how can I merge the flow files into a multiline file?
Using the MergeContent processor I got a single concatenated line.
input:

FlowData 1: <30>Aug 11 06:27:26 xxx.xxx.com systemd[28892]: Stopping Timers.

FlowData 2: <30>Aug 11 06:27:15 xxx.xxx.com systemd[24517]: Stopping Paths

etc...

MergeContent output:

<30>Aug 11 06:27:26 xxx.xxx.com systemd[28892]: Stopping Timers.<30>Aug 11 06:27:15 xxx.xxx.com systemd[24517]:Stopping Paths

I need the following multiline structure:

<30>Aug 11 06:27:26 xxx.xxx.com systemd[28892]: Stopping Timers.
<30>Aug 11 06:27:15 xxx.xxx.com systemd[24517]:Stopping Paths

any solution?

regards
Timo

Re: Nifi ListenSyslog howto ingest into HAWQ

Just set the "Delimiter Strategy" in MergeContent to "Text" and then set the "Demarcator" property to a newline (Shift+Enter in the property editor). It will insert the demarcator between each flow file that it merges.
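Concretely, the two properties in question (names as they appear in MergeContent):

```sql
--   Delimiter Strategy : Text
--   Demarcator         : (a single newline, entered with Shift+Enter)
-- Result: each merged record lands on its own line in the output flow file.
```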

Re: Nifi ListenSyslog howto ingest into HAWQ

New Contributor

You definitely do not want to execute an INSERT for every row of data. That will be really, really slow.

I suggest you use the gpfdist protocol to load the log files directly into HAWQ. You start a gpfdist process where the files are located, create an external table that uses the gpfdist location, and then execute a single INSERT statement to load all of the data from the files into your HAWQ table.
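A sketch of that approach (host, port, and paths are assumptions for illustration):

```sql
-- On the host holding the log files, start the file server:
--   gpfdist -d /data/nifi/syslog -p 8081 &

-- External table pointing at the gpfdist location:
CREATE EXTERNAL TABLE ext_syslog_gpfdist (message TEXT)
LOCATION ('gpfdist://etl-host.xxx.com:8081/*.log')
FORMAT 'TEXT' (delimiter E'\t');

-- A single statement loads all files in parallel:
INSERT INTO syslog_hist SELECT * FROM ext_syslog_gpfdist;
```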

Re: Nifi ListenSyslog howto ingest into HAWQ

New Contributor

The PutSQL option is by far the worst option. This is the wrong answer. Please, don't do that. Hellmer clearly doesn't use HAWQ.

I wrote a blog post over a year ago on this subject. The singleton INSERT solution that is common for OLTP databases is the absolute worst solution for HAWQ. In other words, "INSERT INTO table_name (columns...) VALUES (...);"

Here is my blog post covering loading data into HAWQ: http://www.pivotalguru.com/?p=1425


Re: Nifi ListenSyslog howto ingest into HAWQ

New Contributor

Great, it works!
Thanks!
