New Contributor
Posts: 4
Registered: ‎09-03-2014

Flume scaling best practices

Hi all!

I’m using the flume-snmp-source plugin [1] to query a managed host with the following configuration:

 

agent.sources.source1.type = org.apache.flume.source.SNMPQuerySource
agent.sources.source1.host = 23.23.52.11
agent.sources.source1.port = 161
agent.sources.source1.delay = 30
agent.sources.source1.oid1 = 1.3.6.1.4.1.2000.1.2.5.1.3
agent.sources.source1.oid2 = 1.3.6.1.4.1.2000.1.2.5.1.7
agent.sources.source1.oid3 = 1.3.6.1.4.1.2000.1.2.5.1.9
agent.sources.source1.oid4 = 1.3.6.1.4.1.2000.1.2.5.1.10
agent.sources.source1.oid5 = 1.3.6.1.4.1.2000.1.2.5.1.12
agent.sources.source1.oid6 = 1.3.6.1.4.1.2000.1.2.5.1.13
….
agent.sources.source1.oidN = N.N.N.N …

 

The plugin is a PollableSource that queries the managed host “source1.host” every “source1.delay” seconds. The message passed to the Flume channel has this format:

 

“current date, IP of managed device, oid1 answer, oid2 answer, …., oidN answer”
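
To make the message format above concrete, here is a minimal Python sketch of how such a line could be assembled. It is an illustration only, not the plugin's actual code (the plugin is written in Java): the function name build_event_body and the timestamp layout are assumptions, since the post does not say how the date is rendered.

```python
from datetime import datetime


def build_event_body(when, host, answers):
    """Join timestamp, device IP and OID answers into one comma-separated line.

    Hypothetical helper: the date format used here is an assumption.
    """
    fields = [when.isoformat(sep=" ", timespec="seconds"), host]
    # One field per OID answer, in the same order as oid1..oidN in the config.
    fields.extend(str(answer) for answer in answers)
    return ", ".join(fields)


if __name__ == "__main__":
    print(build_event_body(datetime.now(), "23.23.52.11", [10, "up", 3.5]))
```

Keeping the answers in the same order as the oid1..oidN entries means a downstream consumer can map each column back to its OID by position.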

 

The query is made using SNMP GETBULK for performance reasons.

 

The plugin works fine (development is at the alpha stage). I’m now moving on to more robust tests, and at this point I have the following question about Flume scaling:

 

I have to query more than 1,000 managed devices with the same SNMP query, so I have to create a “source” entry in the flume.conf file for each host. Is maintaining a huge flume.conf file with thousands of entries the correct way to do this, or is there a better strategy for problems at this scale?
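
If one source entry per host is kept, the file at least does not have to be written by hand. The following is a minimal sketch, assuming the host list lives in a Python list and that every host uses the same OIDs. The function name render_sources and the second host address (192.0.2.10, a documentation address) are hypothetical. The property names (type, host, port, delay, oidN) are the ones shown in the configuration above. Channels and sinks are left out on purpose.

```python
def render_sources(hosts, oids, agent="agent", port=161, delay=30):
    """Render one SNMPQuerySource block per host as flume.conf text.

    Hypothetical generator: it only emits the source properties shown
    in the post, not the channel or sink wiring a real agent needs.
    """
    names = []
    lines = []
    for i, host in enumerate(hosts, start=1):
        name = f"source{i}"
        names.append(name)
        prefix = f"{agent}.sources.{name}"
        lines.append(f"{prefix}.type = org.apache.flume.source.SNMPQuerySource")
        lines.append(f"{prefix}.host = {host}")
        lines.append(f"{prefix}.port = {port}")
        lines.append(f"{prefix}.delay = {delay}")
        for j, oid in enumerate(oids, start=1):
            lines.append(f"{prefix}.oid{j} = {oid}")
    # Flume expects the list of source names to be declared on the agent.
    header = f"{agent}.sources = {' '.join(names)}"
    return "\n".join([header] + lines) + "\n"


if __name__ == "__main__":
    oids = ["1.3.6.1.4.1.2000.1.2.5.1.3", "1.3.6.1.4.1.2000.1.2.5.1.7"]
    print(render_sources(["23.23.52.11", "192.0.2.10"], oids))
```

Generating the file from a single host inventory keeps the list of devices in one place, but it does not reduce the number of sources a single agent has to run.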

 

Thanks a lot!

 

[1] https://github.com/javiroman/flume-snmp-source

New Contributor
Posts: 4
Registered: ‎09-03-2014

Re: Flume scaling best practices

Is anyone dealing with Flume scaling problems? Or are there any useful references (blogs, guides) with advice on running Flume in production environments?

New Contributor
Posts: 4
Registered: ‎09-03-2014

Re: Flume scaling best practices

My first approach would be to create a multi-layer topology where each agent takes control of a subset of those thousands of sources, wouldn't it?

Thus, I'd need a configuration management server (e.g. Puppet) to automate the configuration of these hundreds of agents.
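
The split of hosts across first-tier agents described above could be computed before handing the result to the configuration management tool. This is a minimal sketch, assuming a simple round-robin assignment is good enough. The function name partition_hosts and the agent and host names are hypothetical.

```python
def partition_hosts(hosts, agents):
    """Assign hosts to agents round-robin and return {agent: [hosts]}.

    Hypothetical helper: a real deployment might weight agents by
    capacity or group hosts by network location instead.
    """
    if not agents:
        raise ValueError("at least one agent is required")
    assignment = {agent: [] for agent in agents}
    for i, host in enumerate(hosts):
        assignment[agents[i % len(agents)]].append(host)
    return assignment


if __name__ == "__main__":
    hosts = [f"device-{n}" for n in range(1, 6)]
    for agent, subset in partition_hosts(hosts, ["agent-a", "agent-b"]).items():
        print(agent, subset)
```

Each subset could then be fed into a per-agent configuration template, so every agent only carries the source entries for its own devices.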


New Contributor
Posts: 4
Registered: ‎09-03-2014

Re: Flume scaling best practices

I have opened a discussion on the Flume user list [1].

 

[1] - https://www.mail-archive.com/user@flume.apache.org/msg04069.html

 

Regards.
