Use case Description:
We are receiving the xml from the source and the expected xmls per day is around 1lakh. we thought is merge all the xmls of a day and convert to avro.
sample data:
extended schema xml
xsd:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="employee" type="fullpersoninfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
<xs:complexContent>
<xs:extension base="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:schema>
Issue:
i have used databricks jar to convert xml to avro it worked for simple xml,
but it didnt work for schema extended xml.
Is there any workaround to convert this type of xml to avro