Representing NULL in Avro

Hi all, I am trying to store CSV data as Avro. I have an issue with columns that have no value (NULL): instead of getting nulls in the Avro file, I get empty strings. What do I need to do to make sure that empty columns in the CSV are represented as nulls in my Avro file? My sample CSV looks like this:

,12,"street 1"
and my Avro schema for it is:
{
    "type": "record",
    "name": "record",
    "doc": "Schema generated by Kite",
    "fields": [
        {
            "name": "name",
            "type": [
                "null",
                "string"
            ],
            "default": null
        },
        {
            "name": "age",
            "type": [
                "null",
                "long"
            ],
            "default": null
        },
        {
            "name": "address",
            "type": [
                "null",
                "string"
            ],
            "default": null
        }
    ]
}
To verify the conversion, I converted the Avro file to JSON and got:
{
  "name" : "",
  "age" : 12,
  "address" : "street 1"
}
But I expected:
{
  "name" : null,
  "age" : 12,
  "address" : "street 1"
}
1 REPLY

In the schema definition for name and address, set "default": "null" instead of "default": null.
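
A further option worth checking, offered as a sketch rather than a confirmed fix: as far as I understand the Avro specification, a field's default is only consulted when reading data that lacks the field, so it may not be what decides how an empty CSV cell gets written. If the empty string comes from the CSV parsing step, one approach is to map empty cells to null before the records reach the Avro writer. The snippet below uses only the Python standard library; the helper names (empty_to_none, row_to_record) and the column order in FIELDS are assumptions made for illustration, not part of Kite or any Avro tool.

```python
import csv
import io
import json

# Column order is assumed to match the schema in the question.
FIELDS = ["name", "age", "address"]


def empty_to_none(value):
    """Map an empty CSV cell to None so it can land in the "null" branch of the union."""
    return None if value == "" else value


def row_to_record(row):
    """Turn one parsed CSV row into a dict shaped like the Avro record."""
    record = {field: empty_to_none(cell) for field, cell in zip(FIELDS, row)}
    # "age" is declared as ["null", "long"], so cast non-empty values to int.
    if record["age"] is not None:
        record["age"] = int(record["age"])
    return record


# The sample line from the question.
sample = ',12,"street 1"\n'
records = [row_to_record(row) for row in csv.reader(io.StringIO(sample))]

# JSON view of the record that would be handed to an Avro writer.
print(json.dumps(records[0], indent=2))
```

Note that Python's csv module cannot tell an unquoted empty cell from a quoted one (""), so both become null here. If that distinction matters for your data, the parsing step would need to preserve it.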
