Member since 08-30-2018 | 4 Posts | 0 Kudos Received | 0 Solutions
09-06-2018
12:27 PM
Hi,

As per the schema registry documentation at https://github.com/hortonworks/registry/blob/v0.5.1/docs/schema-registry.rst :

------------------------------------------------------------
com.hortonworks.registries.schemaregistry.serdes.avro.kafka.KafkaAvroDeserializer

This deserializer tries to find schema.id in the message payload. If it finds schema.id, makes a call to schema registry to fetch the avro schema. If it doesn't find schema.id it falls back to using topic name to fetch a schema.
------------------------------------------------------------

I created a KafkaProducer that registered a new schema in schema registry SR-A and wrote to a Kafka topic KT. I wrote a KafkaConsumer that read from KT while using SR-A. This worked perfectly fine.

Then I manually exported the schema from SR-A to a new schema registry SR-B, keeping the same name. The schema ID was 27 in SR-A and 1 in SR-B. The same KafkaConsumer then tried to read from KT while using SR-B, and it failed. From the description of KafkaAvroDeserializer, I expected it to use the fallback mechanism and read the schema from SR-B, since a schema with the same name existed there.

To dig further, I checked the source code of KafkaAvroDeserializer and found that it calls this function to fetch the schema from the schema registry: https://github.com/hortonworks/registry/blob/v0.5.1/schema-registry/client/src/main/java/com/hortonworks/registries/schemaregistry/serde/AbstractSnapshotDeserializer.java#L153

Strangely, this function fetches the schema only by schema.id and does not implement any fallback that fetches the schema by topic name.

To summarise: the documentation says that if schema.id is not found, the fallback is to look for a schema with the same name as the topic. The actual behaviour in my test, and the code I checked, indicate that there is no such fallback mechanism. Am I missing something here?
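To make the mismatch concrete, here is a minimal sketch of the behaviour I observed versus the behaviour I expected. Everything in it is a stand-in: the registries are plain dictionaries, the 4-byte ID prefix is a toy wire format and not the registry's real one, and `resolve_schema` and its `fallback` flag are hypothetical names, not part of the Hortonworks client.

```python
import struct

# Hypothetical stand-ins for the two registries: schema name -> (id, schema text).
SR_A = {"KT": (27, '{"type": "string"}')}
SR_B = {"KT": (1, '{"type": "string"}')}


def encode(schema_id: int, body: bytes) -> bytes:
    """Prefix the body with a 4-byte big-endian schema id (toy wire format)."""
    return struct.pack(">i", schema_id) + body


def lookup_by_id(registry: dict, schema_id: int):
    """Return the schema text registered under schema_id, or None if unknown."""
    for sid, text in registry.values():
        if sid == schema_id:
            return text
    return None


def resolve_schema(registry: dict, payload: bytes, topic: str, fallback: bool):
    """Resolve by embedded id; optionally fall back to the topic name (assumed behaviour)."""
    (schema_id,) = struct.unpack(">i", payload[:4])
    text = lookup_by_id(registry, schema_id)
    if text is None and fallback and topic in registry:
        text = registry[topic][1]
    return text


message = encode(27, b"hello")  # produced against SR-A, so it carries id 27
print(resolve_schema(SR_A, message, "KT", fallback=False))  # found by id
print(resolve_schema(SR_B, message, "KT", fallback=False))  # id 27 unknown in SR-B
print(resolve_schema(SR_B, message, "KT", fallback=True))   # found by topic name
```

The second call mirrors what my consumer does against SR-B. Note that the fallback in this sketch fires when the ID is unknown to the registry, which is what I expected; the documentation could also be read as applying only when the payload carries no schema.id at all.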
Regards, Sanjay
09-06-2018
04:10 AM
Thanks for the quick response @Timothy Spann. I see that the one I requested has been approved. I just edited it to make an attribution clear, and it went to moderation again 😞 Can you please approve it one last time? I won't make any further changes. I am really stuck because of this, and it would help if the question appeared in the list so that it can be answered.
09-06-2018
02:13 AM
@Timothy Spann / @Dave Russell There was an issue with the ask-question page: it showed a blank page after I submitted my question. I therefore submitted the same question a second time, and then again from different browsers, thinking it was a browser issue. On my profile page I now see that I have ended up asking the same question multiple times, and all of them have been in moderation for 14+ hours. I do not see any way to contact an administrator. Can you please delete all the questions I posted yesterday except the first one (Schema retrieval by name not working), and approve that one? Thanks, Sanjay
08-30-2018
11:39 AM
I have my Kafka producer running on cluster A and my Kafka consumer running on cluster B. Each has its own Kafka cluster, and we use MirrorMaker to replicate the messages from A to B. Both clusters should be able to operate independently, hence I need a schema registry on each of them.

The problem is: how do I get a schema that is created on A synced to B? If I add the schema to cluster B using a REST API call, the schema ID on B will be different from the one on A. From the little documentation I could find on serialisation/deserialisation, I understand that the deserialiser on my consumer cluster will try to find the schema matching the ID embedded in the message in the Kafka topic. That ID will match the schema's ID on cluster A and will not equal the ID of the same schema that I manually added to cluster B.

How do I solve this scenario? Is there a way to tell the deserialiser to find the schema by name and not by ID? Also, is there a way to mirror or replicate schemas across both clusters?
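One workaround I can imagine, sketched here only to illustrate the idea: match schemas by name across the two registries, build a map from cluster A's IDs to cluster B's IDs, and rewrite the embedded ID somewhere in the replication path. The registry contents, the 4-byte ID prefix, and the function names below are all hypothetical; this is not the registry's real wire format or API, and I do not know whether MirrorMaker or the registry offers a supported way to do this.

```python
import struct

# Hypothetical registry contents: schema name -> schema id on each cluster.
CLUSTER_A_IDS = {"orders": 27, "payments": 31}
CLUSTER_B_IDS = {"orders": 1, "payments": 2}


def build_id_map(source: dict, target: dict) -> dict:
    """Map each source schema id to the target id registered under the same name."""
    return {sid: target[name] for name, sid in source.items() if name in target}


def rewrite_schema_id(payload: bytes, id_map: dict) -> bytes:
    """Replace the 4-byte id prefix (toy wire format) with the target cluster's id."""
    (source_id,) = struct.unpack(">i", payload[:4])
    if source_id not in id_map:
        raise KeyError(f"schema id {source_id} has no counterpart on the target")
    return struct.pack(">i", id_map[source_id]) + payload[4:]


id_map = build_id_map(CLUSTER_A_IDS, CLUSTER_B_IDS)
mirrored = rewrite_schema_id(struct.pack(">i", 27) + b"record-bytes", id_map)
print(struct.unpack(">i", mirrored[:4])[0], mirrored[4:])
```

This only works if the schema registered under a given name is identical on both clusters, and it adds a rewrite step to every replicated message, which is why I would prefer a lookup by name or a supported way to replicate schemas with their IDs.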