I'm just starting to learn NiFi. I need to read Parquet data from an S3 bucket, and I don't understand how to set up the ListS3 and FetchS3Object processors to read it. The full path looks like this: s3://inbox/prod/export/date=2022-01-07/user=100/cro.parquet
I will then write the data to a SQL database. I don't expect problems with that part, but I'm not sure.
I tried to configure the ListS3 processor myself, and I don't think the result is very good:
bucket: inbox
aws_access_key_id
aws_secret_access_key
region: US EAST
endpoint override URL: http://s3.wi-fi.ru:8080
To read Parquet data from an S3 bucket, the flow would look like this: ListS3 -> FetchS3Object -> ConvertRecord with a Parquet reader.
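For the path in the question, the bucket name and a listing prefix can be worked out as in the rough Python sketch below. This is only an illustration of how the path breaks down, not NiFi code: the helper names `split_s3_url` and `listing_prefix` are made up for this example, and treating everything before the first `name=value` folder as the prefix is an assumption about how you want to list the files. The results would go into the Bucket and Prefix properties of ListS3.

```python
from urllib.parse import urlsplit


def split_s3_url(url):
    """Split an s3:// URL into (bucket, object key). Hypothetical helper."""
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError("expected an s3:// URL, got: " + url)
    # The bucket is the host part; the key is the path without the leading slash.
    return parts.netloc, parts.path.lstrip("/")


def listing_prefix(key):
    """Return the folders that come before the first name=value folder.

    Assumption: files are listed from the fixed part of the path, and the
    date=... and user=... folders vary from file to file.
    """
    kept = []
    for segment in key.split("/")[:-1]:  # skip the file name itself
        if "=" in segment:
            break
        kept.append(segment)
    return "/".join(kept) + "/" if kept else ""


bucket, key = split_s3_url(
    "s3://inbox/prod/export/date=2022-01-07/user=100/cro.parquet"
)
print(bucket)               # inbox
print(listing_prefix(key))  # prod/export/
```

With these values, ListS3 would use Bucket = inbox and Prefix = prod/export/. By default FetchS3Object reads the bucket and object key from the `s3.bucket` and `filename` attributes that ListS3 writes, so it usually needs only the same credentials, region, and endpoint settings.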
So are you facing an issue with ListS3? Can you please provide more details? Have you filled in all the required properties, and does the processor throw an error when you start it? Or does the processor show as invalid and ask for additional information? A screenshot or an error stack trace would help.
Hey! I built the flow for getting data from S3, but now I have another problem: during conversion, part of the data is lost and is not written to the SQL database, and I can't work out where my mistake is. Here are my settings. It would be great if you could help.
I use a split because without it there is too much data and NiFi goes down. I use UpdateAttribute because some of the values are taken from the path where the files are stored on S3.
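To make clear which values come from the path, here is a small Python sketch of the extraction. It is not the UpdateAttribute configuration itself, just the same logic written out so the expected values can be checked by hand; the function name `partition_values` is made up for this example, and it assumes every folder of interest has the form `name=value`, as in the path from the first post.

```python
def partition_values(key):
    """Collect name=value folders from an object key into a dict.

    Hypothetical helper: assumes partition folders look like date=2022-01-07.
    """
    values = {}
    for segment in key.split("/")[:-1]:  # skip the file name itself
        name, sep, value = segment.partition("=")
        if sep and name:  # keep only folders that contain name=value
            values[name] = value
    return values


key = "prod/export/date=2022-01-07/user=100/cro.parquet"
print(partition_values(key))  # {'date': '2022-01-07', 'user': '100'}
```

If the attributes set by UpdateAttribute do not match these values for a given file, that would be one place to look for the missing rows.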