<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Understanding the mahout SSVD output! in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31949#M7361</link>
    <description>The output is as you say -- these are the products of the SVD. You can do what you want with them, and it depends on what you're trying to achieve. You can look at the matrix V S to study term similarities, or U S to discover document similarities, for example.</description>
    <pubDate>Thu, 17 Sep 2015 08:30:03 GMT</pubDate>
    <dc:creator>srowen</dc:creator>
    <dc:date>2015-09-17T08:30:03Z</dc:date>
    <item>
      <title>Understanding the mahout SSVD output!</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31945#M7360</link>
      <description>&lt;P&gt;Dear Colleagues,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To run SSVD in Mahout, the documents were represented as a TF-IDF matrix using seq2sparse&lt;/P&gt;&lt;P&gt;(the row indices are the doc-ids and the column indices are the dict-ids (word-ids)).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The input for SSVD is this TF-IDF matrix.&lt;/P&gt;&lt;P&gt;The output of the SSVD job is the matrices U, S, and V (transpose).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How can I interpret this output with regard to the original TF-IDF matrix? Should I multiply the original one by U, S or V?&lt;/P&gt;&lt;P&gt;What is the conclusion?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance and best regards,&lt;/P&gt;&lt;P&gt;&amp;nbsp;butkiz&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 15:15:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31945#M7360</guid>
      <dc:creator>butkiz</dc:creator>
      <dc:date>2022-09-16T15:15:58Z</dc:date>
    </item>
    <item>
      <title>Re: Understanding the mahout SSVD output!</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31949#M7361</link>
      <description>The output is as you say -- these are the products of the SVD. You can do what you want with them, and it depends on what you're trying to achieve. You can look at the matrix V S to study term similarities, or U S to discover document similarities, for example.</description>
      <pubDate>Thu, 17 Sep 2015 08:30:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31949#M7361</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2015-09-17T08:30:03Z</dc:date>
    </item>
    <item>
      <title>Re: Understanding the mahout SSVD output!</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31950#M7362</link>
      <description>Thanks! I am trying to figure out which terms are related to one topic. Should I first multiply the V and S matrices and then compute the distances between the "new" vectors? What's your understanding?&lt;BR /&gt;&lt;BR /&gt;Thanks and regards,&lt;BR /&gt;butkiz</description>
      <pubDate>Thu, 17 Sep 2015 08:38:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31950#M7362</guid>
      <dc:creator>butkiz</dc:creator>
      <dc:date>2015-09-17T08:38:49Z</dc:date>
    </item>
    <item>
      <title>Re: Understanding the mahout SSVD output!</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31951#M7363</link>
      <description>I suppose you can cluster term vectors in V S for this purpose, to discover related terms and thus topics.&lt;BR /&gt;This is the type of problem where you might more usually use LDA.&lt;BR /&gt;&lt;BR /&gt;I know you're using Mahout, but if you ever consider using Spark, there's a chapter on exactly this in our book:&lt;BR /&gt;&lt;A href="http://shop.oreilly.com/product/0636920035091.do" target="_blank"&gt;http://shop.oreilly.com/product/0636920035091.do&lt;/A&gt;</description>
      <pubDate>Thu, 17 Sep 2015 09:09:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Understanding-the-mahout-SSVD-output/m-p/31951#M7363</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2015-09-17T09:09:03Z</dc:date>
    </item>
  </channel>
</rss>

