Created on 02-28-2018 02:00 PM - edited 09-16-2022 05:55 AM
Dear all,
We will soon migrate to a big data platform with Hortonworks. Yay!
Currently, we have large projects based on ETL/ELT processes built with ODI, and my company wants to migrate them, or at least build the new projects, with NiFi. So here is my question:
Knowing that NiFi is more of a data-flow tool, designed for example for loading massive amounts of data into a data lake, which NiFi features would allow me to replace an ETL/ELT process, specifically the data transformation/checking part? I know about SplitText, ExtractText and so on, but I didn't see the specific filters...
For example, how can I check each record of a CSV/XLSX file, e.g., that it has the correct length, that a field is really a number and not a varchar, or that if column X has a value then column Z also has a value? The rows whose records are good are inserted into a repository table, which we then join with a materialized view in order to produce a CSV file with specific content.
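To make it concrete, this is roughly the kind of check I mean, sketched in plain Python (the column names and rules are only examples, not our real ones). I am looking for the NiFi way to express checks like these.

```python
import csv

def is_valid(row):
    # length check: the code field must be exactly 10 characters (example rule)
    if len(row["account_code"]) != 10:
        return False
    # type check: the amount must really be a number, not free text
    try:
        float(row["amount"])
    except ValueError:
        return False
    # dependency check: if column X is filled, column Z must be filled too
    if row["col_x"] and not row["col_z"]:
        return False
    return True

with open("input.csv", newline="") as f:
    good_rows = [row for row in csv.DictReader(f) if is_valid(row)]
# good_rows would then be loaded into the repository table
```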
This brings me to my second question: how can we build such a join in NiFi, and how do we interact with the tables of a schema? If you do not have time to answer, a link showing how to solve this would also be great.
I think these questions are on the minds of all developers who want to migrate from ETL/ELT to NiFi.
Thank you for your time and help!
Stefan
Created 02-28-2018 02:03 PM
For some things Sqoop is a good choice. For some things Spark works as well.
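As an illustration of the Spark route, the validate-and-join step described above could look roughly like this in PySpark (paths, table, and column names are placeholders):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("validate_and_join").getOrCreate()

# read the incoming CSV from the landing zone
raw = spark.read.option("header", "true").csv("/landing/input.csv")

# keep only the rows that pass the checks (length, numeric, column dependency)
valid = raw.filter(
    (F.length("account_code") == 10)
    & F.col("amount").cast("double").isNotNull()
    & (F.col("col_x").isNull() | F.col("col_z").isNotNull())
)

# join with a reference table (e.g. a Hive table fed from the materialized view)
ref = spark.table("refdb.reference_view")
result = valid.join(ref, on="account_code", how="inner")

# write the result back out as CSV
result.write.mode("overwrite").option("header", "true").csv("/export/output")
```

A common split is to let NiFi handle the ingestion into the landing zone and hand the set-based transformation and join to Spark, or push it down to the database.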
Created 02-28-2018 02:14 PM