Created on 05-25-2022 01:41 PM
The ListS3 and FetchS3Object processors in Apache NiFi (called ListS3 and FetchS3 below) are commonly used to retrieve objects from Amazon S3 buckets, but they can just as easily be configured to retrieve objects from IBM Cloud buckets.
Assume I have an IBM Cloud bucket that contains three CSV files:
First, get the following from your IBM Cloud bucket configuration:
- Bucket Name
- Private Endpoint
Then, from the Service Credentials of your Cloud Object Storage, get:
- Access Key ID
- Secret Access Key
Note: If you don't have Service Credentials for the storage instance, create a new one with HMAC set to "true" (https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-uhc-hmac-credentials-main)
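To keep the four values together while you configure the processors, they can be gathered into one small structure and sanity-checked first. The following is only a minimal sketch: the class name `CosSettings`, its field names, and the sample values in the usage note are hypothetical and are not part of NiFi or IBM Cloud.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class CosSettings:
    """Illustrative holder for the four values collected above (hypothetical names)."""

    bucket: str             # Bucket Name from the bucket configuration
    endpoint: str           # Private Endpoint from the bucket configuration
    access_key_id: str      # HMAC Access Key ID from the Service Credentials
    secret_access_key: str  # HMAC Secret Access Key from the Service Credentials

    def endpoint_url(self) -> str:
        """Return the endpoint as a full URL, adding https:// if no scheme was given."""
        raw = self.endpoint.strip()
        url = raw if "://" in raw else f"https://{raw}"
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a usable endpoint: {self.endpoint!r}")
        return url.rstrip("/")

    def missing(self) -> list[str]:
        """Return the names of any settings that are still empty."""
        return [name for name, value in vars(self).items() if not value.strip()]
```

For example, `CosSettings("my-bucket", "s3.private.us-south.cloud-object-storage.appdomain.cloud", "<access key id>", "<secret access key>").endpoint_url()` returns the endpoint with `https://` prepended, which is the form an endpoint override field would typically expect. The bucket name and endpoint here are placeholders; use the values shown for your own bucket.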
Confirm that your IBM Cloud user has the Bucket Access Policy needed to view and download objects, or create one if it does not (https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-iam-bucket-permissions):
With this setup information confirmed, add ListS3 and FetchS3 processors to your NiFi canvas and connect them, similar to the following:
In the ListS3 configuration, enter the Bucket, Access Key ID, Secret Access Key, and Endpoint Override URL:
Note: The Region property is ignored when the Endpoint Override URL property is used.
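If the listing comes back empty or fails, it can help to check the same bucket, HMAC keys, and endpoint outside NiFi. The sketch below is one way to do that with the third-party `boto3` library, which is an assumption of this example and is not used by the article's flow. The helper names `client_kwargs`, `list_keys`, and `list_bucket_with_boto3` are hypothetical. As in the processor configuration, the endpoint is passed as an override and no region is supplied.

```python
def client_kwargs(endpoint_url, access_key_id, secret_access_key):
    """Build keyword arguments for boto3.client() that point S3 calls at a custom endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,  # plays the role of Endpoint Override URL
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }


def list_keys(client, bucket):
    """Return every object key in the bucket, following pagination."""
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # A page for an empty bucket has no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def list_bucket_with_boto3(bucket, endpoint_url, access_key_id, secret_access_key):
    """Convenience wrapper; boto3 is imported lazily because it is an optional dependency."""
    import boto3  # third-party: pip install boto3

    client = boto3.client(**client_kwargs(endpoint_url, access_key_id, secret_access_key))
    return list_keys(client, bucket)
```

Calling `list_bucket_with_boto3(...)` with the values entered in the processor should return the same object keys that ListS3 is expected to emit as FlowFiles. It needs network access to the endpoint, so run it from a host that can reach the private endpoint.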
Run the ListS3 processor and you will see a FlowFile generated for each of the bucket objects:
Looking at the queue details:
Now configure the FetchS3 processor the same way, with the Bucket, Access Key ID, Secret Access Key, and Endpoint Override URL:
Run the FetchS3 processor and the three CSV files are retrieved from the IBM Cloud bucket:
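To cross-check what the fetch step should return, the same download can be sketched outside NiFi. This is a rough illustration, not how the processor is implemented. It assumes `client` is an S3 client object exposing `get_object`, for example one created with the third-party `boto3` library using `boto3.client("s3", endpoint_url=..., aws_access_key_id=..., aws_secret_access_key=...)`. The function name `fetch_objects` is hypothetical.

```python
def fetch_objects(client, bucket, keys):
    """Download each listed key and return a mapping of key to raw bytes.

    This loosely mirrors the list-then-fetch flow: one fetch per listed object.
    """
    contents = {}
    for key in keys:
        response = client.get_object(Bucket=bucket, Key=key)
        # "Body" is a streaming object; read() pulls the whole object into memory,
        # which is fine for small CSV files but not for very large objects.
        contents[key] = response["Body"].read()
    return contents
```

Comparing the byte counts in the returned mapping with the FlowFile sizes shown in NiFi is a quick way to confirm that the credentials, the endpoint, and the access policy all behave as expected.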