<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Avro type information shown with indexed fields when trying out Cloudera Search using QuickStart VM. in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5791#M1078</link>
    <description>&lt;P&gt;I'm trying out the Cloudera QuickStart VM and found it pretty straightforward to try out the basic WordCount M/R example and Hive queries on sample CSVs. I was most eager to try out Cloudera Search.&lt;/P&gt;&lt;P&gt;I followed the steps from the blog &lt;A target="_self" href="https://community.cloudera.com/t5/forums/postpage/choose-node/true/interaction-style/forum/override-styles/"&gt;here&lt;/A&gt;. The ~/datasets/batch-tweets.sh script seemed to run fine: the MapReduceIndexer took 3 to 4 minutes and the jobs seemed to succeed. I could see what looks like a Lucene index in HDFS under /solr/batch_tweets/core_node1/data/index. So far so good. I fired up the Hue Solr Search tool and tried customizing how search results are formatted. This works partially, but each field in a set of results is preceded by what looks like Avro type information. For example, if the template looks like {{text}} {{user_name}}, the results preview shows the following:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;org.apache.avro.util.Utf8:&lt;/EM&gt;tweet text 10782 &lt;EM&gt;org.apache.avro.util.Utf8:&lt;/EM&gt;fake user10782&lt;/P&gt;&lt;P&gt;I also tried using avro-tools to read the sample data that the batch-tweets script pulls in for indexing:&lt;/P&gt;&lt;P&gt;java -jar ~/avro-tools-1.7.3.jar tojson /usr/share/doc/search-1.0.0/examples/test-documents/sample-statuses-20120906-141433-medium.avro | less&lt;/P&gt;&lt;P&gt;The Avro files seemed to read just fine.&lt;/P&gt;&lt;P&gt;Is it possible that there's been some change to the QuickStart VM since the blog was posted last summer? Any suggestions welcome.&lt;/P&gt;</description>
    <pubDate>Tue, 21 Apr 2026 14:02:11 GMT</pubDate>
    <dc:creator>ngk</dc:creator>
    <dc:date>2026-04-21T14:02:11Z</dc:date>
    <item>
      <title>Avro type information shown with indexed fields when trying out Cloudera Search using QuickStart VM.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5791#M1078</link>
      <description>&lt;P&gt;I'm trying out the Cloudera QuickStart VM and found it pretty straightforward to try out the basic WordCount M/R example and Hive queries on sample CSVs. I was most eager to try out Cloudera Search.&lt;/P&gt;&lt;P&gt;I followed the steps from the blog &lt;A target="_self" href="https://community.cloudera.com/t5/forums/postpage/choose-node/true/interaction-style/forum/override-styles/"&gt;here&lt;/A&gt;. The ~/datasets/batch-tweets.sh script seemed to run fine: the MapReduceIndexer took 3 to 4 minutes and the jobs seemed to succeed. I could see what looks like a Lucene index in HDFS under /solr/batch_tweets/core_node1/data/index. So far so good. I fired up the Hue Solr Search tool and tried customizing how search results are formatted. This works partially, but each field in a set of results is preceded by what looks like Avro type information. For example, if the template looks like {{text}} {{user_name}}, the results preview shows the following:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;org.apache.avro.util.Utf8:&lt;/EM&gt;tweet text 10782 &lt;EM&gt;org.apache.avro.util.Utf8:&lt;/EM&gt;fake user10782&lt;/P&gt;&lt;P&gt;I also tried using avro-tools to read the sample data that the batch-tweets script pulls in for indexing:&lt;/P&gt;&lt;P&gt;java -jar ~/avro-tools-1.7.3.jar tojson /usr/share/doc/search-1.0.0/examples/test-documents/sample-statuses-20120906-141433-medium.avro | less&lt;/P&gt;&lt;P&gt;The Avro files seemed to read just fine.&lt;/P&gt;&lt;P&gt;Is it possible that there's been some change to the QuickStart VM since the blog was posted last summer? Any suggestions welcome.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Apr 2026 14:02:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5791#M1078</guid>
      <dc:creator>ngk</dc:creator>
      <dc:date>2026-04-21T14:02:11Z</dc:date>
    </item>
    <item>
      <title>Re: Avro type information shown with indexed fields when trying out Cloudera Search using QuickStart</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5833#M1079</link>
      <description>&lt;P&gt;A small additional piece of information: by exploring the contents of the Solr index via the Solr Admin web UI, I can see that certain fields do indeed seem to be &lt;STRONG&gt;indexed&lt;/STRONG&gt; with an "org.apache.avro.util.Utf8:" prefix on the original strings. The fields in question are:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;user_screen_name&lt;/LI&gt;&lt;LI&gt;user_location&lt;/LI&gt;&lt;LI&gt;text&lt;/LI&gt;&lt;LI&gt;user_name&lt;/LI&gt;&lt;LI&gt;source&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;From the batch_tweets.sh script I can see how it invokes the MapReduceIndexerTool, pointing it at the batch_tweets_indir location in HDFS (which contains the input data in Avro format). From what I understand, the morphline may be key to processing the input data in HDFS and passing it on to the indexer. Does anybody know if that's a good place to dig further, or should I look into the source code for &lt;SPAN&gt;MapReduceIndexerTool?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2014 14:07:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5833#M1079</guid>
      <dc:creator>ngk</dc:creator>
      <dc:date>2014-02-07T14:07:59Z</dc:date>
    </item>
    <item>
      <title>Re: Avro type information shown with indexed fields when trying out Cloudera Search using QuickStart</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5839#M1080</link>
      <description>I think this has been fixed in more recent versions of Cloudera Search.</description>
      <pubDate>Fri, 07 Feb 2014 15:11:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5839#M1080</guid>
      <dc:creator>whosch</dc:creator>
      <dc:date>2014-02-07T15:11:19Z</dc:date>
    </item>
    <item>
      <title>Re: Avro type information shown with indexed fields when trying out Cloudera Search using QuickStart</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5843#M1081</link>
      <description>&lt;P&gt;That's good to know. I believe I have the most recent QuickStart VM (4.4.0-1). Are updated versions of the VMs made available regularly? Or do you know if this is something that can be "patched" within the VM? (I'd like to demo something based on the search functionality, with a view to requesting that our enterprise cluster (Cloudera, but not sure what CDH version yet) have Search enabled.)&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2014 15:30:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5843#M1081</guid>
      <dc:creator>ngk</dc:creator>
      <dc:date>2014-02-07T15:30:50Z</dc:date>
    </item>
    <item>
      <title>Re: Avro type information shown with indexed fields when trying out Cloudera Search using QuickStart</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5845#M1082</link>
      <description>Make sure to run search-1.1.0.</description>
      <pubDate>Fri, 07 Feb 2014 16:11:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Avro-type-information-shown-with-indexed-fields-when-trying/m-p/5845#M1082</guid>
      <dc:creator>whosch</dc:creator>
      <dc:date>2014-02-07T16:11:19Z</dc:date>
    </item>
  </channel>
</rss>

