<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question What is recommended NLP solution on top of HDP stack for Text Analytics in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96447#M59767</link>
    <description>&lt;P&gt;What is the recommended NLP solution on top of the HDP stack for text analytics? I know Tika, Stanbol, etc. can be used for this, but are these the recommended technologies? Is there anything better, especially using Spark?&lt;/P&gt;&lt;P&gt;The use case at hand is to scan comments (free text) and generate insights in the form of recommendations.&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 09:47:44 GMT</pubDate>
    <dc:creator>nasghar</dc:creator>
    <dc:date>2022-09-16T09:47:44Z</dc:date>
    <item>
      <title>What is recommended NLP solution on top of HDP stack for Text Analytics</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96447#M59767</link>
      <description>&lt;P&gt;What is the recommended NLP solution on top of the HDP stack for text analytics? I know Tika, Stanbol, etc. can be used for this, but are these the recommended technologies? Is there anything better, especially using Spark?&lt;/P&gt;&lt;P&gt;The use case at hand is to scan comments (free text) and generate insights in the form of recommendations.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:47:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96447#M59767</guid>
      <dc:creator>nasghar</dc:creator>
      <dc:date>2022-09-16T09:47:44Z</dc:date>
    </item>
    <item>
      <title>Re: What is recommended NLP solution on top of HDP stack for Text Analytics</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96448#M59768</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/233/omendelevitch.html" nodeid="233"&gt;@Ofer Mendelevitch&lt;/A&gt; Please see this.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Nov 2015 08:39:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96448#M59768</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2015-11-04T08:39:36Z</dc:date>
    </item>
    <item>
      <title>Re: What is recommended NLP solution on top of HDP stack for Text Analytics</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96449#M59769</link>
      <description>&lt;P&gt;A range of common NLP systems works well on the platform. &lt;A href="https://opennlp.apache.org/"&gt;OpenNLP&lt;/A&gt; is a Java-native library that integrates well with, for example, MapReduce, and NLTK, being a Python library, works well with PySpark. There are also native Spark components for NLP tasks: Latent Dirichlet Allocation for topic detection is one example. The NLTK components also work well with Hive for tasks such as tokenisation and part-of-speech tagging.&lt;/P&gt;&lt;P&gt;Stanford &lt;A href="http://nlp.stanford.edu/software/corenlp.shtml"&gt;CoreNLP&lt;/A&gt; also provides a good toolkit of NLP functions. There is also a &lt;A href="https://github.com/databricks/spark-corenlp"&gt;Spark package&lt;/A&gt; to integrate it with Spark ML pipelines.&lt;/P&gt;&lt;P&gt;Solr also provides a number of tools that apply in the NLP space, such as stemming and synonym handling, as part of its indexing and querying, so it offers some building blocks for simple NLP analysis.&lt;/P&gt;&lt;P&gt;There are also a number of commercial and partner solutions that handle NLP tasks.&lt;/P&gt;&lt;P&gt;We are also looking to build tools for entity resolution on Spark, which will add to this.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Nov 2015 19:15:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96449#M59769</guid>
      <dc:creator>sball</dc:creator>
      <dc:date>2015-11-04T19:15:11Z</dc:date>
    </item>
    <item>
      <title>Re: What is recommended NLP solution on top of HDP stack for Text Analytics</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96450#M59770</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/104/sball.html" nodeid="104"&gt;@Simon Elliston Ball&lt;/A&gt; is right: there's a huge variety of options for NLP, as there are many niches within natural language processing. Keep in mind that NLP libraries rarely solve business problems directly. Rather, they give you the tools to build a solution. Often this means segmenting free text into chunks suitable for analysis (e.g. sentence boundary disambiguation), annotating free text (e.g. part-of-speech tagging), or converting free text to a more structured form (e.g. vectorization). All of these are useful tools for processing text, but they are insufficient by themselves. They help you convert free, unstructured text into a form suitable as input to a normal machine learning or analysis pipeline (e.g. classification). The one exception I can think of is sentiment analysis, which is a valuable analytic in and of itself.&lt;/P&gt;&lt;P&gt;Also, keep in mind that the licenses for some of these libraries are not as permissive as Apache's (e.g. CoreNLP is GPL, with the option to purchase a license for commercial use).&lt;/P&gt;</description>
      <pubDate>Fri, 06 Nov 2015 07:27:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-recommended-NLP-solution-on-top-of-HDP-stack-for/m-p/96450#M59770</guid>
      <dc:creator>cstella</dc:creator>
      <dc:date>2015-11-06T07:27:45Z</dc:date>
    </item>
  </channel>
</rss>

