How to ingest a CSV file using NiFi when its schema changes over time (new columns added)

Explorer

As part of my requirement, I would like to ingest a CSV file into HDFS. The schema of the CSV file keeps changing (mainly, new columns are added now and then). Is it possible to accomplish this using NiFi?

Re: How to ingest a CSV file using NiFi when its schema changes over time (new columns added)

Super Guru
@Sudheer K

Does your CSV file have a header? If yes, you can use the CSVReader controller service with the configuration below:

[Screenshot: 80414-csvreader.png]

With this configuration, the reader's schema changes dynamically based on the header of each file.
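To illustrate the idea outside NiFi, here is a minimal Python sketch of a header-driven reader. It is not NiFi code and not the CSVReader implementation; it only assumes that the field names are taken from the first line of each file. The function name and the sample data are made up for the example.

```python
import csv
import io


def read_with_header_schema(csv_text):
    """Return (field_names, records), taking the field names from the header line."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = list(reader)  # every value is read as a string
    return list(reader.fieldnames or []), records


# Hypothetical files: the second one arrives with an extra column.
old_file = "id,name\n1,alice\n"
new_file = "id,name,city\n2,bob,Pune\n"

old_fields, _ = read_with_header_schema(old_file)
new_fields, new_records = read_with_header_schema(new_file)

print(old_fields)
print(new_fields)
```

Because the field list is rebuilt from each file's header, a new column shows up without any change to the reader itself.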

Re: How to ingest a CSV file using NiFi when its schema changes over time (new columns added)

Super Guru
@Sudheer K

If you are converting the CSV data to ORC format, the ConvertAvroToORC processor adds a hive.ddl attribute to the flowfile. Using that attribute, you can recreate the external table every time you ingest data.

(or)

The ExtractAvroMetadata processor extracts the Avro schema from the Avro data file and adds it to the flowfile attributes, so you can use the Avro schema to drop and recreate the table every time.

(or)

When configuring the RecordSetWriter controller service, set the following property:

Schema Write Strategy: Set 'avro.schema' Attribute

Each flowfile will then have an avro.schema attribute. Use this attribute to prepare a CREATE TABLE statement that includes all the columns.
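As a rough sketch of that last step, the Python below turns the JSON text of an avro.schema attribute into a CREATE EXTERNAL TABLE statement. The type mapping, function names, table name, and location are assumptions for illustration only; complex Avro types simply fall back to STRING here, and this is not what NiFi or Hive generates internally.

```python
import json

# Assumed mapping from Avro primitive types to Hive column types.
AVRO_TO_HIVE = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "bytes": "BINARY",
}


def hive_type(avro_type):
    """Map an Avro field type to a Hive type (primitives and nullable unions only)."""
    if isinstance(avro_type, list):
        # Nullable union such as ["null", "string"]: use the first non-null branch.
        non_null = [t for t in avro_type if t != "null"]
        avro_type = non_null[0] if non_null else "string"
    if not isinstance(avro_type, str):
        return "STRING"  # complex types are not handled in this sketch
    return AVRO_TO_HIVE.get(avro_type, "STRING")


def create_table_ddl(avro_schema_json, table, location):
    """Build a CREATE EXTERNAL TABLE statement from an Avro record schema."""
    schema = json.loads(avro_schema_json)
    columns = ",\n  ".join(
        "`{}` {}".format(field["name"], hive_type(field["type"]))
        for field in schema["fields"]
    )
    return (
        "CREATE EXTERNAL TABLE `{}` (\n  {}\n)\n"
        "STORED AS AVRO\n"
        "LOCATION '{}'"
    ).format(table, columns, location)


# Hypothetical value of the avro.schema attribute.
example_schema = json.dumps({
    "type": "record",
    "name": "events",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "city", "type": ["null", "string"]},
    ],
})

print(create_table_ddl(example_schema, "events", "/data/events"))
```

The generated statement would be run after dropping the old table, for example from a processor that executes Hive statements.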

Create external tables so that no data is lost when you drop the table. This approach assumes that all new fields are appended to the existing data model. If some fields are removed or change position in the file, the table you create from the new schema will not line up with the older data, which results in data issues!
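One way to guard against that case is to compare the old and new field lists before recreating the table. The check below is only a sketch under the assumption stated above (new fields are appended at the end); the function name and sample field lists are hypothetical.

```python
def is_append_only(old_fields, new_fields):
    """Return True when new_fields keeps every old field, in the same order, at the front."""
    return list(new_fields[:len(old_fields)]) == list(old_fields)


old_fields = ["id", "name"]

print(is_append_only(old_fields, ["id", "name", "city"]))  # column appended
print(is_append_only(old_fields, ["name", "id", "city"]))  # columns reordered
print(is_append_only(old_fields, ["id"]))                  # column removed
```

If the check fails, it is safer to route the flowfile aside for manual review than to recreate the table on top of the existing data.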

Re: How to ingest a CSV file using NiFi when its schema changes over time (new columns added)

Explorer

Thank you. Is there a way in NiFi to update the existing table as the schema changes?

Thanks in advance.
