Error while ingesting Plain CSV to SAM via NIFI

I'm trying to upgrade an existing visualization solution (Kafka > Flink > Druid > Superset) to work with HWX SAM and Registry.

Currently NiFi works as an HTTP proxy that collects events and pushes them to Kafka. I'm trying to convert the events (CSV) to Avro at this stage before pushing them to Kafka, so that SAM can consume them.

The output of SplitContent looks similar to "abc,def,ghi,jkl,,"

I'm getting this error in the Storm UI:

com.hortonworks.registries.schemaregistry.serde.SerDesException: Unknown protocol id [49] received while deserializing the payload at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapsho

Is there something I should pay closer attention to when processing CSV? Any troubleshooting recommendations?

[Attached screenshot: 40414-screen-shot-2017-09-19-at-114110-am.png]

Accepted Solutions
Re: Error while ingesting Plain CSV to SAM via NIFI

The reader on the SAM side is trying to read the encoded schema reference, but it is likely not there. The AvroRecordSetWriter being used by PublishKafkaRecord_0_10 must be configured with a "Schema Write Strategy" of "Hortonworks Content Encoded Schema Reference".
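
To make the diagnosis concrete, here is a minimal Python sketch of the header check. It assumes the layout described in the NiFi documentation for this strategy: one byte for the protocol version, eight bytes for the schema identifier, and four bytes for the schema version. The big-endian byte order and the names `HEADER` and `parse_schema_reference` are assumptions made for illustration, not part of SAM or NiFi. Note that 49 is the ASCII code for the character "1", so a payload that starts with plain text (a timestamp, for instance) would plausibly produce the "Unknown protocol id [49]" message.

```python
import struct

# Assumed header layout: 1-byte protocol version, 8-byte schema id,
# 4-byte schema version, big-endian (13 bytes in total).
HEADER = struct.Struct(">Bqi")


def parse_schema_reference(payload):
    """Return (protocol, schema_id, schema_version) from the start of payload."""
    if len(payload) < HEADER.size:
        raise ValueError("payload is shorter than the 13-byte schema reference")
    return HEADER.unpack_from(payload)


# A payload that carries the encoded reference (hypothetical id 42, version 3).
with_reference = HEADER.pack(1, 42, 3) + b"avro-body"

# A payload that starts with plain text: its first byte is ASCII "1".
plain_text = b"1505812870,abc,def"

first_byte_of_plain_text = plain_text[0]  # 49, read as the "protocol id"
```

Inspecting the first few bytes of a message consumed from the Kafka topic in this way should show whether the writer actually prepended the reference.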

Re: Error while ingesting Plain CSV to SAM via NIFI

@Roshan Dissanayake

Can you please share the configuration of the reader and writer controller services used by PublishKafkaRecord?

This looks like a problem with how the flowfile attributes are set when the schema is retrieved from the registry.

Re: Error while ingesting Plain CSV to SAM via NIFI

@mkalyanpur

The configurations of CSVReader 1.2.0.3.0.1.1-5 and AvroRecordSetWriter 1.2.0.3.0.1.1-5 are shown in the attached screenshots.

My Avro schema in the registry is similar to this, with a bunch more string fields.

{
  "type": "record",
  "name": "tracking_sdk_event",
  "fields": [
    {
      "name": "timeStamp",
      "type": "long",
      "default": null
    },
    {
      "name": "isoTime",
      "type": "string",
      "default": null
    }
  ] 
}

@Bryan Bende

After changing the "Schema Write Strategy" to "Hortonworks Content Encoded Schema Reference" I'm getting an error with the timeStamp field. I have attached an image of it.

Attachments: avrorecordsetwriter-1203011-5.png, screen-shot-2017-09-20-at-112540-am.png, csvreader-1203011-5.png

Re: Error while ingesting Plain CSV to SAM via NIFI

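If you want a default value of null, the type of your field needs to be a union of "null" and the real type. Avro expects the default to match the first branch of the union, so list "null" first.

For example, for timeStamp you would need: "type": ["null", "long"]. The same applies to isoTime: "type": ["null", "string"]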
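
As a quick way to find every field affected by this, the following Python sketch scans a record schema for fields that declare a null default. It is only an illustration using the standard `json` module, not an Avro validator. The function name `check_null_defaults` is hypothetical, and the "null comes first" check assumes the Avro specification's rule that a union-typed field's default corresponds to the first branch of the union.

```python
import json


def check_null_defaults(schema_json):
    """Map field name -> problem for fields whose default is null."""
    problems = {}
    for field in json.loads(schema_json).get("fields", []):
        # Only fields that explicitly declare "default": null are of interest.
        if "default" not in field or field["default"] is not None:
            continue
        field_type = field["type"]
        if not isinstance(field_type, list) or "null" not in field_type:
            problems[field["name"]] = "type is not a union containing null"
        elif field_type[0] != "null":
            # Assumption: the default must match the first union branch.
            problems[field["name"]] = "null is not the first branch"
    return problems


# The schema from the question, reduced to its two posted fields.
posted_schema = """
{
  "type": "record",
  "name": "tracking_sdk_event",
  "fields": [
    {"name": "timeStamp", "type": "long", "default": null},
    {"name": "isoTime", "type": "string", "default": null}
  ]
}
"""

problems = check_null_defaults(posted_schema)
```

For the schema posted above, `problems` contains both `timeStamp` and `isoTime`, since neither field declares a union type.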
