
Mark synced data in Apache Hadoop


I use Apache Hadoop as a data lake. It stores raw data from transactional databases.

I use Apache NiFi to fetch data from Hadoop, transform it, and load it into several data marts.

I need something like a cursor that marks which raw data has already been synced, with separate sync state for each mart, so that when I start the flow in Apache NiFi it resumes the transformation from where the previous run stopped instead of re-fetching raw data that was already processed.
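
To make the idea concrete, here is a minimal Python sketch of the kind of per-mart cursor I have in mind. It is only an illustration, not something NiFi or Hadoop provides: `SyncCursor`, the JSON state file, and the `ts` field on each record are hypothetical names, and it assumes every raw record carries a timestamp that only increases.

```python
import json
from pathlib import Path


class SyncCursor:
    """Per-mart high-water mark kept in a small JSON file (hypothetical helper)."""

    def __init__(self, state_path):
        self.state_path = Path(state_path)
        # Load previously saved cursors, or start empty on the first run.
        if self.state_path.exists():
            self.state = json.loads(self.state_path.read_text())
        else:
            self.state = {}

    def pending(self, mart, records):
        """Return the records newer than this mart's cursor, oldest first."""
        last = self.state.get(mart, 0)
        newer = (r for r in records if r["ts"] > last)
        return sorted(newer, key=lambda r: r["ts"])

    def commit(self, mart, ts):
        """Advance this mart's cursor after a successful load and persist it."""
        self.state[mart] = max(ts, self.state.get(mart, 0))
        self.state_path.write_text(json.dumps(self.state))
```

Each mart would call `pending("sales", raw_records)` to get only the unsynced records, load them, and then call `commit("sales", ts)` with the timestamp of the last record it loaded, so a restarted flow picks up after that point. Other marts keep their own independent position in the same state file.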

 

 

What is the best solution?

Should I tag the original data, or is there a better approach?

Is there a NiFi processor designed for this?