<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark not showing Kafka Data Properly in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335840#M232045</link>
    <description>&lt;P&gt;Looking at the serialized data, it appears to be in Java's binary serialization format.&amp;nbsp;It seems to me that the producer is simply writing the HashMap Java object directly to Kafka, rather than using a proper serializer (Avro, JSON, String, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You should look into modifying your producer to use such a serializer so that you can properly deserialize the data that you're reading from Kafka.&lt;/P&gt;</description>
    <pubDate>Tue, 08 Feb 2022 22:44:48 GMT</pubDate>
    <dc:creator>araujo</dc:creator>
    <dc:date>2022-02-08T22:44:48Z</dc:date>
    <item>
      <title>Spark not showing Kafka Data Properly</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/334992#M231915</link>
      <description>&lt;P&gt;I'm trying to consume Kafka data with PySpark, but I'm having difficulty because the data is a HashMap.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The question is: how can I convert this into a useful DataFrame that can be processed in PySpark?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is the output and my actual code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="This is the output" style="width: 999px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33317i28404ACB80145BBA/image-size/large?v=v2&amp;amp;px=999" role="button" title="2022-01-31 10_37_07-2022-01-28 16_30_18-Greenshot.png ‎- Fotos.png" alt="This is the output" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;This is the output&lt;/span&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="This is my actual code" style="width: 991px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33318iA92AF4CC7AD77F8D/image-size/large?v=v2&amp;amp;px=999" role="button" title="2022-01-31 10_36_05-● new.py - Visual Studio Code.png" alt="This is my actual code" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;This is my actual code&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any suggestions or steps?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jan 2022 13:43:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/334992#M231915</guid>
      <dc:creator>victorescosta</dc:creator>
      <dc:date>2022-01-31T13:43:06Z</dc:date>
    </item>
    <item>
      <title>Re: Spark not showing Kafka Data Properly</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335707#M232012</link>
      <description>&lt;P&gt;You need to find out which serializer is being used to write data to Kafka and use the matching deserializer to read those messages.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Feb 2022 05:11:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335707#M232012</guid>
      <dc:creator>araujo</dc:creator>
      <dc:date>2022-02-07T05:11:05Z</dc:date>
    </item>
    <item>
      <title>Re: Spark not showing Kafka Data Properly</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335787#M232031</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95425"&gt;@victorescosta&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You need to check the producer code to see in which format the Kafka message is produced and which Serializer class is used. You need to use the same format/serializer when deserializing the data. For example, if you used Avro when writing the data, then you need to use Avro when deserializing it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11191"&gt;@araujo&lt;/a&gt;&amp;nbsp;You are right. The customer needs to check their producer code and serializer class.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Feb 2022 12:07:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335787#M232031</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2022-02-08T12:07:46Z</dc:date>
    </item>
    <item>
      <title>Re: Spark not showing Kafka Data Properly</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335840#M232045</link>
      <description>&lt;P&gt;Looking at the serialized data, it appears to be in Java's binary serialization format.&amp;nbsp;It seems to me that the producer is simply writing the HashMap Java object directly to Kafka, rather than using a proper serializer (Avro, JSON, String, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You should look into modifying your producer to use such a serializer so that you can properly deserialize the data that you're reading from Kafka.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Feb 2022 22:44:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-not-showing-Kafka-Data-Properly/m-p/335840#M232045</guid>
      <dc:creator>araujo</dc:creator>
      <dc:date>2022-02-08T22:44:48Z</dc:date>
    </item>
  </channel>
</rss>

