I have deployed the automated 10-node Amazon AWS setup using the Ansible playbook.
Everything looks to be working OK, but a new data source I've added isn't showing up in Elasticsearch. I'm a bit stumped as to how to start debugging this.
The incoming flow looks like:
AuditD -> SYSLOG -> NiFi -> Kafka
It's then picked up from Kafka by the parser topology.
Appreciate any tips to help figure out where my data is disappearing to!
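For anyone debugging a similar pipeline, the first hop worth checking is whether the raw events are actually landing in the Kafka topic. A sketch, assuming an HDP-style install with the broker on node1 and a parser topic named auditd (adjust paths, hosts, and topic name to your deployment):

```shell
# Confirm the sensor's topic exists (ZooKeeper assumed on node1:2181):
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh \
  --zookeeper node1:2181 --list

# Tail the topic to confirm NiFi is delivering raw events into it
# (broker assumed on node1:6667):
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --bootstrap-server node1:6667 --topic auditd --max-messages 5
```

If nothing shows up here, the problem is upstream in NiFi or syslog; if messages do appear, the parser or indexing side is the place to look next.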
@Laurens Vets - unfortunately auditd* didn't work; Elasticsearch is not listing any new indexes.
I can confirm that adding the new data type creates a new auditd topology, and it appears to be running without errors and acknowledging messages. The enrichment topology is also running without issues (which I believe is responsible for indexing?)
@Oliver Fletcher Were you able to solve this issue? I'm currently facing a very similar issue.
DataSource 1 parsed/indexed just fine with the same workflow: CEF(logger) -> SYSLOG -> NiFi -> Kafka.
When I added the second source, DataSource 2 (CEF(logger) -> SYSLOG -> NiFi -> Kafka), it seems to parse just fine, but I'm not able to see any indexes in ES. I checked the ES logs, Storm logs, Storm UI and ES UI - I don't see any errors either.
The weird part is, the HDFS index writer has records available (hdfs dfs -ls /apps/metron/indexing/indexed/) for DataSource 2. I did double-check the ES writer and it's enabled. Yet when I look at the shards, there are no indexes for DataSource 2.
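Since HDFS has the records but ES doesn't, it's worth checking each hop between the indexing topology and ES explicitly. A sketch, assuming default Metron paths, a sensor named datasource2, and ES reachable on localhost:9200 (all of these are assumptions; adjust to your deployment):

```shell
# Confirm the HDFS writer is landing records for the sensor:
hdfs dfs -ls /apps/metron/indexing/indexed/datasource2

# List all ES indices and look for one named after the sensor;
# also check index health in the output:
curl -s 'http://localhost:9200/_cat/indices?v'

# Peek at the indexing topology's Kafka input topic to confirm
# enriched records for the sensor are arriving there at all:
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:6667 --topic indexing --max-messages 5
```

If records appear on the indexing topic but never as an ES index, the usual suspects are the sensor's indexing config (ES writer batch size/enabled flag) or a document that ES silently rejects, e.g. a missing or unparseable timestamp.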
Yes - going back to basics fixed the issue:
1. I updated the Ambari parsers configuration to include only the auditd parser
2. Ensured timestampField: timestamp was included in parserConfig
3. Restarted the parser service in Ambari's Metron service
I then started to see data being added to the enrichments Kafka topic.
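For reference, the parser config change in step 2 amounts to something like the following (a minimal sketch; the grok pattern path and label here are placeholders, not taken from the thread):

```json
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "auditd",
  "parserConfig": {
    "grokPath": "/patterns/auditd",
    "patternLabel": "AUDITD",
    "timestampField": "timestamp"
  }
}
```

Without timestampField pointing at a valid field, downstream indexing can fail silently even though the parser topology itself runs clean, which matches the symptoms described above.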