
I need to ingest over 100 Teradata tables into Hive tables, and would like incremental and streaming features.

Solved


Contributor

Q1. Is setting up triggers on each table the only way NiFi can capture data changes?

Q2. What if I have over 100 tables to ingest into Hadoop, and each table needs three triggers (on insert, delete, and update)? Do I need to set up all of them?

Q3. How exactly does NiFi read log files from an RDBMS?

Q4. Do you think Sqoop is a better tool in my case?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: I need to ingest over 100 Teradata tables into Hive tables, and would like incremental and streaming features.

Q1. Change data capture (CDC) in NiFi is the easiest way to capture incremental records; there are workarounds as well, depending on the use case.

Q2. I believe yes. But if your target is Hive, it is better not to go with all three triggers. Capture just the incremental records into HDFS, do the comparison within HDFS, and update the target.
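To make the "compare within HDFS and update the target" step concrete, here is a minimal sketch of the merge logic, assuming the delta feed carries a primary key and an op flag ('I'/'U'/'D'); the field names and sample rows are hypothetical, and in practice this would run as a Hive/Spark job over the staged files rather than in plain Python:

```python
# Hedged sketch: merge a batch of incremental records (delta) into the
# current snapshot (base). Keys, field names, and rows are hypothetical.

def apply_delta(base, delta, key="id"):
    """Return a new snapshot after applying inserts, updates, and deletes.

    base:  list of dicts representing the current target rows
    delta: list of dicts, each with an extra 'op' field: 'I', 'U', or 'D'
    """
    rows = {r[key]: r for r in base}
    for change in delta:
        change = dict(change)        # avoid mutating the caller's rows
        op = change.pop("op")
        if op == "D":
            rows.pop(change[key], None)   # delete the matching row
        else:
            rows[change[key]] = change    # insert or update = upsert
    return sorted(rows.values(), key=lambda r: r[key])

base = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
delta = [
    {"id": 2, "name": "b2", "op": "U"},   # update
    {"id": 3, "name": "c", "op": "I"},    # insert
    {"id": 1, "name": "a", "op": "D"},    # delete
]
print(apply_delta(base, delta))
# → [{'id': 2, 'name': 'b2'}, {'id': 3, 'name': 'c'}]
```

The same upsert-plus-delete semantics is what a Hive `MERGE` on an ACID table (or a full-outer-join rewrite of the target) expresses at scale.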

Q4. It depends. If you are looking for real-time processing, don't choose Sqoop: Sqoop is specifically designed for large batch data processing. If real-time processing is needed, go with Kafka/NiFi to ingest data into Hadoop; Kafka/NiFi handle incremental volume well.
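If batch latency is acceptable, Sqoop's built-in incremental import covers the insert-only case. Below is a sketch of assembling such a command from Python; the flags (`--incremental`, `--check-column`, `--last-value`, `--hive-import`, `--hive-table`) are standard Sqoop 1 options, while the JDBC URL, username, and table/column names are hypothetical placeholders:

```python
# Hedged sketch: build a Sqoop 1 incremental-append import command.
# Connection details and names are hypothetical; the flags are real options.

def sqoop_incremental_cmd(table, check_column, last_value):
    return [
        "sqoop", "import",
        "--connect", "jdbc:teradata://td-host/DATABASE=mydb",  # hypothetical URL
        "--username", "etl_user",                               # hypothetical user
        "--table", table,
        "--incremental", "append",        # only fetch rows with check-column > last-value
        "--check-column", check_column,   # monotonically increasing column (id/timestamp)
        "--last-value", str(last_value),  # high-water mark saved from the previous run
        "--hive-import",                  # load the fetched rows into Hive
        "--hive-table", f"staging.{table}",
    ]

print(" ".join(sqoop_incremental_cmd("orders", "order_id", 100000)))
```

Note that append mode only captures new rows; updates and deletes still require the HDFS-side comparison described under Q2, which is one more reason CDC via NiFi/Kafka is the better fit when those must be streamed.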

