Error while ingesting Plain CSV to SAM via NIFI

Explorer

I'm trying to upgrade an existing visualization solution (Kafka > Flink > Druid > Superset) to work with HWX SAM and Registry.

Currently NiFi works as an HTTP proxy that collects events and pushes them to Kafka. I'm trying to convert the events (CSV) to Avro at this stage and push them to Kafka so that SAM can consume them.

The output of SplitContent looks something like "abc,def,ghi,jkl,,"

I'm getting this error in the Storm UI:

com.hortonworks.registries.schemaregistry.serde.SerDesException: Unknown protocol id [49] received while deserializing the payload at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapsho

Is there something I should pay closer attention to when processing CSV? Any troubleshooting recommendations?

40414-screen-shot-2017-09-19-at-114110-am.png

1 ACCEPTED SOLUTION

Master Guru

This problem has been solved!
4 REPLIES

Contributor

@Roshan Dissanayake

Can you please show the configuration of the PublishKafka reader and writer controller services (CS)?

This looks like an issue with how the flowfile attributes are set when the flowfile is sent to retrieve the schema from the registry.

Explorer

@mkalyanpur

The CSVReader 1.2.0.3.0.1.1-5 and AvroRecordSetWriter 1.2.0.3.0.1.1-5 configurations are shown in the attached screenshots.

My Avro schema in the registry is similar to this, with a bunch more string fields:

{
  "type": "record",
  "name": "tracking_sdk_event",
  "fields": [
    {
      "name": "timeStamp",
      "type": "long",
      "default": null
    },
    {
      "name": "isoTime",
      "type": "string",
      "default": null
    }
  ] 
}

@Bryan Bende

After changing the "Schema Write Strategy" to "Hortonworks Content Encoded Schema Reference" I'm getting an error with the timeStamp field. I have attached an image of it.


avrorecordsetwriter-1203011-5.png
screen-shot-2017-09-20-at-112540-am.png
csvreader-1203011-5.png
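
For context on the "Hortonworks Content Encoded Schema Reference" strategy and the earlier "Unknown protocol id [49]" error: the NiFi documentation describes this strategy as prefixing the content with one byte for the protocol version, eight bytes for the schema identifier, and four bytes for the schema version. The Python sketch below parses such a header. The big-endian byte order, the sample ids, and the sample payloads are assumptions for illustration only, not values taken from this flow. It also shows that a message that begins with the text character "1" would be read as protocol id 49, which would be consistent with the deserializer receiving content that does not start with this header.

```python
import struct

# Assumed layout (big-endian): 1-byte protocol id, 8-byte schema id, 4-byte schema version.
HEADER = struct.Struct(">bqi")


def parse_header(payload: bytes) -> dict:
    """Split the assumed 13-byte schema reference header off the front of a message."""
    if len(payload) < HEADER.size:
        raise ValueError("payload is shorter than the 13-byte header")
    protocol_id, schema_id, schema_version = HEADER.unpack_from(payload)
    return {
        "protocol_id": protocol_id,
        "schema_id": schema_id,
        "schema_version": schema_version,
        "body": payload[HEADER.size:],
    }


# Hypothetical message: protocol id 1, schema id 42, schema version 3, then the Avro bytes.
encoded = HEADER.pack(1, 42, 3) + b"<avro-bytes>"
print(parse_header(encoded)["protocol_id"])  # 1

# Hypothetical plain CSV event that was never given a header: its first byte is the character "1".
plain_csv = b"1505813470,abc,def,ghi,jkl,,"
print(parse_header(plain_csv)["protocol_id"])  # 49
```

Dumping the first few bytes of a message from the Kafka topic and comparing them with this layout is one way to check which write strategy actually produced it.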

Master Guru

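If you want a default value of null, the type of your field needs to be a union of null and the real type.

For example, for timeStamp you would need "type": ["null", "long"]. Avro has traditionally required a field's default to match the first type in the union, so "null" is listed first.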
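
A quick way to catch this before registering a schema is to scan it for fields that declare a null default without a nullable type. The sketch below is a minimal check in plain Python, not part of NiFi or the Schema Registry; the function name is hypothetical. It treats a null default as valid only when the type is a union whose first branch is "null", since Avro has traditionally required the default to match the first branch of a union.

```python
import json


def fields_with_bad_null_default(schema_json: str) -> list:
    """Return names of fields that default to null but whose type is not a null-first union."""
    schema = json.loads(schema_json)
    bad = []
    for field in schema.get("fields", []):
        # JSON null is loaded as Python None.
        if "default" in field and field["default"] is None:
            ftype = field["type"]
            # A null default needs a union (a JSON list) with "null" as its first branch.
            if not (isinstance(ftype, list) and ftype and ftype[0] == "null"):
                bad.append(field["name"])
    return bad


# The schema as posted in the question.
original = """
{"type": "record", "name": "tracking_sdk_event", "fields": [
  {"name": "timeStamp", "type": "long", "default": null},
  {"name": "isoTime", "type": "string", "default": null}
]}
"""

# The same schema with each field's type turned into a null-first union.
fixed = """
{"type": "record", "name": "tracking_sdk_event", "fields": [
  {"name": "timeStamp", "type": ["null", "long"], "default": null},
  {"name": "isoTime", "type": ["null", "string"], "default": null}
]}
"""

print(fields_with_bad_null_default(original))  # ['timeStamp', 'isoTime']
print(fields_with_bad_null_default(fixed))     # []
```

The same change applies to every other string field in the schema that declares a null default.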