Gobblin job stores data in HDFS datasets instead of Quarantine even though the producer and Gobblin schemas don't match exactly

The Gobblin schema defines four fields.

The producer that writes data to the Kafka topic emits records with five fields.

When the Gobblin job runs, it ignores the fifth field and stores the data from the first four fields in HDFS.

I expected the Gobblin job to put this data into Quarantine instead of HDFS, because my understanding is that the job writes to the HDFS datasets only when the producer schema and the Gobblin schema match.

Does Gobblin ignore extra fields as long as the fields defined in its own schema are present, and then store the data in the HDFS datasets?

Or should the data be pushed to Quarantine because the schemas don't match exactly?
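
To make the question concrete, here is a minimal Python sketch of the two behaviors I am asking about. This is not Gobblin code and does not use any Gobblin API; the field names and the functions `route_lenient` and `route_strict` are hypothetical and exist only to illustrate the scenario.

```python
# Hypothetical illustration only; this is not Gobblin code or its API.
# The four field names stand in for the fields defined in the Gobblin schema.
EXPECTED_FIELDS = ["id", "name", "email", "created_at"]


def route_lenient(record):
    """Behavior I am observing: extra fields are dropped, the rest is stored."""
    missing = [f for f in EXPECTED_FIELDS if f not in record]
    if missing:
        return "quarantine", record
    projected = {f: record[f] for f in EXPECTED_FIELDS}
    return "dataset", projected


def route_strict(record):
    """Behavior I was expecting: any difference in fields goes to Quarantine."""
    if set(record) != set(EXPECTED_FIELDS):
        return "quarantine", record
    return "dataset", record


# A producer record with five fields; "country" is the extra fifth field.
producer_record = {
    "id": 1,
    "name": "alice",
    "email": "alice@example.com",
    "created_at": "2020-01-01",
    "country": "IN",
}

print(route_lenient(producer_record)[0])
print(route_strict(producer_record)[0])
```

With the five-field record, the lenient function routes a four-field copy to the dataset, while the strict function routes the whole record to Quarantine. Which of these two matches what Gobblin is designed to do?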