Created on 02-18-2017 05:20 PM - edited 08-17-2019 02:25 PM
Imagine we have a use case of exchanging data with an external party via AWS S3 buckets: we want to push each incoming object into our internal Kafka cluster as a message, after enriching it with additional metadata, in an event-driven fashion. In AWS, this is supported by associating notifications with an S3 bucket.
These notifications can be delivered to a few different destinations, namely SQS, SNS, and Lambda. We'll focus on the SQS approach, and will make use of NiFi's GetSQS processor.
To configure this in AWS, navigate to the S3 bucket and then to the Properties tab, and scroll down to Advanced settings > Events. You'll need to create an SQS queue for this purpose.
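The same event configuration can also be applied programmatically. A minimal sketch of the notification configuration payload is shown below; the queue ARN (region, account ID, and queue name) is a placeholder you would replace with your own:

```json
{
  "QueueConfigurations": [
    {
      "Id": "new-object-to-sqs",
      "QueueArn": "arn:aws:sqs:us-east-1:123456789012:s3-event-queue",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
```

This subscribes the queue to all ObjectCreated event variants (Put, Post, Copy, and multipart completion), which matches the "object is created" behavior described below.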
With this configured, a new SQS message will appear any time an object is created within our S3 bucket.
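Each such SQS message carries a JSON body describing the S3 event, which is what our flow will parse and enrich downstream. As a sketch of what that body contains, here is a small Python example that pulls the bucket and object key out of a sample ObjectCreated notification (the bucket and key names are illustrative placeholders):

```python
import json

# Example body of an SQS message produced by an S3 ObjectCreated event.
# The bucket name and object key here are illustrative placeholders.
sample_body = json.dumps({
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "nifi"},
                "object": {"key": "incoming/data-001.json", "size": 2048},
            },
        }
    ]
})

def extract_created_objects(message_body):
    """Return (bucket, key) pairs for each ObjectCreated record in the body."""
    payload = json.loads(message_body)
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in payload.get("Records", [])
        if rec.get("eventName", "").startswith("ObjectCreated")
    ]

print(extract_created_objects(sample_body))
```

In the NiFi flow, the same extraction could be done with an EvaluateJsonPath processor against the FlowFile content produced by GetSQS.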
We need to configure some IAM policies in order for our NiFi data flow to be authorized to read from the S3 bucket and to read from the SQS queue. We will authenticate from NiFi using the Access Key and Secret Key associated with a particular IAM user.
First, the IAM policy for reading from the S3 bucket called nifi:
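A minimal read-only sketch of such a policy might look like the following (this is an assumed shape, not the exact policy; note that ListBucket applies to the bucket ARN while GetObject applies to the objects under it):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::nifi"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::nifi/*"
    }
  ]
}
```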
Second, the IAM policy for reading from the SQS queue, as well as deleting from it, since we'll configure GetSQS to auto-delete received messages. We'll need the ARN and URL associated with our SQS queue; these can be retrieved from the SQS Management Console by navigating to the queue we created above.
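A sketch of such a policy is shown below; the region, account ID, and queue name in the ARN are placeholders for the values retrieved from the SQS Management Console:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": "arn:aws:sqs:us-east-1:123456789012:s3-event-queue"
    }
  ]
}
```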
Note: we could harden this by restricting the permitted SQS actions further.