Using Stanford CoreNLP in Your Big Data Pipelines

CoreNLP Overview

Stanford CoreNLP (the 2016-10-31 release is used here) includes a server that you can run yourself and call through a REST API. CoreNLP offers many annotators, but the one most interesting to me is sentiment analysis.

Installation and Setup (http://stanfordnlp.github.io/CoreNLP/corenlp-server.html)

Download a recent full distribution (http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip). The archive is large because it bundles the models, all the JARs, and the server code.

Run the Server

Giving the JVM 4 GB of heap (-mx4g) keeps the server running smoothly. Port 9000 works for me. The -timeout value is in milliseconds.

java  -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000

Server Log Output

stanford-corenlp-full-2016-10-31 git:(master) ✗ java  -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - setting default constituency parser
[main] INFO CoreNLP - warning: cannot find edu/stanford/nlp/models/srparser/englishSR.ser.gz
[main] INFO CoreNLP - using: edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz instead
[main] INFO CoreNLP - to use shift reduce parser download English models jar from:
[main] INFO CoreNLP - http://stanfordnlp.github.io/CoreNLP/download.html
[main] INFO CoreNLP -     Threads: 8
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000
[pool-1-thread-6] INFO CoreNLP - [/0:0:0:0:0:0:0:1:59705] API call w/annotators tokenize,ssplit,parse,pos,sentiment
The quick brown fox jumped over the lazy dog.
[pool-1-thread-6] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-6] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - No tokenizer type provided. Defaulting to PTBTokenizer.
[pool-1-thread-6] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-6] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[pool-1-thread-6] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.5 sec].
[pool-1-thread-6] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[pool-1-thread-6] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.6 sec].
[pool-1-thread-6] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator sentiment
[pool-1-thread-8] INFO CoreNLP - [/0:0:0:0:0:0:0:1:59706] API call w/annotators tokenize,ssplit,pos,parse,sentiment
The quick brown fox jumped over the lazy dog.
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator sentiment
[pool-1-thread-2] INFO CoreNLP - [/0:0:0:0:0:0:0:1:59709] API call w/annotators tokenize,ssplit,pos,parse,sentiment
This is the worst way to test sentiment ever.

Testing Your Installation

You can call the CoreNLP server with wget or curl. I like these annotators: tokenize, ssplit, parse, sentiment.

curl --data 'This is greatest test ever.' 'http://localhost:9000/?properties={%22annotators%22%3A%22sentiment%22%2C%22outputFormat%22%3A%22json%22}' -o -
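The properties query parameter in the call above is URL-encoded JSON. As a minimal sketch (the helper name build_corenlp_url is hypothetical, and the base URL assumes the local server on port 9000 used in this article), the same URL could be assembled in Python with only the standard library:

```python
import json
import urllib.parse


def build_corenlp_url(base_url, annotators, output_format="json"):
    """Return base_url with a URL-encoded `properties` query parameter."""
    properties = {
        "annotators": ",".join(annotators),
        "outputFormat": output_format,
    }
    # Compact separators keep the encoded JSON free of spaces.
    encoded = urllib.parse.urlencode(
        {"properties": json.dumps(properties, separators=(",", ":"))}
    )
    return f"{base_url.rstrip('/')}/?{encoded}"


# Prints the request URL with the JSON properties percent-encoded.
print(build_corenlp_url("http://localhost:9000", ["sentiment"]))
```

Unlike the hand-written curl URL, this version also encodes the braces (as %7B and %7D), which is how wget rewrites the URL in the output further down.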

I am running an instance of the server locally, but you could also run it on an edge node in your cluster.

wget --post-data 'This is the worst way to test sentiment ever.' 'localhost:9000/?properties={"annotators":"sentiment","outputFormat":"json"}' -O -

--2017-02-02 12:13:51--  http://localhost:9000/?properties=%7B%22annotators%22:%22sentiment%22,%22outputFormat%22:%22json%22%...
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:9000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4407 (4.3K) [application/json]
Saving to: 'STDOUT'
-                                                    0%[                                                                                                               ]       0  --.-KB/s               {"sentences":[{"index":0,"parse":"(ROOT\n  (S\n    (NP (DT This))\n    (VP (VBZ is)\n      (NP\n        (NP (DT the) (JJS worst) (NN way))\n        (PP (TO to)\n          (NP (NN test) (NN sentiment))))\n      (ADVP (RB ever)))\n    (. .)))","basicDependencies":[{"dep":"ROOT","governor":0,"governorGloss":"ROOT","dependent":5,"dependentGloss":"way"},{"dep":"nsubj","governor":5,"governorGloss":"way","dependent":1,"dependentGloss":"This"},{"dep":"cop","governor":5,"governorGloss":"way","dependent":2,"dependentGloss":"is"},{"dep":"det","governor":5,"governorGloss":"way","dependent":3,"dependentGloss":"the"},{"dep":"amod","governor":5,"governorGloss":"way","dependent":4,"dependentGloss":"worst"},{"dep":"case","governor":8,"governorGloss":"sentiment","dependent":6,"dependentGloss":"to"},{"dep":"compound","governor":8,"governorGloss":"sentiment","dependent":7,"dependentGloss":"test"},{"dep":"nmod","governor":5,"governorGloss":"way","dependent":8,"dependentGloss":"sentiment"},{"dep":"advmod","governor":5,"governorGloss":"way","dependent":9,"dependentGloss":"ever"},{"dep":"punct","governor":5,"governorGloss":"way","dependent":10,"dependentGloss":"."}],"enhancedDependencies":[{"dep":"ROOT","governor":0,"governorGloss":"ROOT","dependent":5,"dependentGloss":"way"},{"dep":"nsubj","governor":5,"governorGloss":"way","dependent":1,"dependentGloss":"This"},{"dep":"cop","governor":5,"governorGloss":"way","dependent":2,"dependentGloss":"is"},{"dep":"det","governor":5,"governorGloss":"way","dependent":3,"dependentGloss":"the"},{"dep":"amod","governor":5,"governorGloss":"way","dependent":4,"dependentGloss":"worst"},{"dep":"case","governor":8,"governorGloss":"sentiment","dependent":6,"dependentGloss":"to"},{"dep":"compound","governor":8,"governorGloss":"sentiment","dependent":7,"dependentGl
oss":"test"},{"dep":"nmod:to","governor":5,"governorGloss":"way","dependent":8,"dependentGloss":"sentiment"},{"dep":"advmod","governor":5,"governorGloss":"way","dependent":9,"dependentGloss":"ever"},{"dep":"punct","governor":5,"governorGloss":"way","dependent":10,"dependentGloss":"."}],"enhancedPlusPlusDependencies":[{"dep":"ROOT","governor":0,"governorGloss":"ROOT","dependent":5,"dependentGloss":"way"},{"dep":"nsubj","governor":5,"governorGloss":"way","dependent":1,"dependentGloss":"This"},{"dep":"cop","governor":5,"governorGloss":"way","dependent":2,"dependentGloss":"is"},{"dep":"det","governor":5,"governorGloss":"way","dependent":3,"dependentGloss":"the"},{"dep":"amod","governor":5,"governorGloss":"way","dependent":4,"dependentGloss":"worst"},{"dep":"case","governor":8,"governorGloss":"sentiment","dependent":6,"dependentGloss":"to"},{"dep":"compound","governor":8,"governorGloss":"sentiment","dependent":7,"dependentGloss":"test"},{"dep":"nmod:to","governor":5,"governorGloss":"way","dependent":8,"dependentGloss":"sentiment"},{"dep":"advmod","governor":5,"governorGloss":"way","dependent":9,"dependentGloss":"ever"},{"dep":"punct","governor":5,"governorGloss":"way","dependent":10,"dependentGloss":"."}],"sentimentValue":"0","sentiment":"Verynegative","tokens":[{"index":1,"word":"This","originalText":"This","characterOffsetBegin":0,"characterOffsetEnd":4,"pos":"DT","before":"","after":" "},{"index":2,"word":"is","originalText":"is","characterOffsetBegin":5,"characterOffsetEnd":7,"pos":"VBZ","before":" ","after":" "},{"index":3,"word":"the","originalText":"the","characterOffsetBegin":8,"characterOffsetEnd":11,"pos":"DT","before":" ","after":" "},{"index":4,"word":"worst","originalText":"worst","characterOffsetBegin":12,"characterOffsetEnd":17,"pos":"JJS","before":" ","after":" "},{"index":5,"word":"way","originalText":"way","characterOffsetBegin":18,"characterOffsetEnd":21,"pos":"NN","before":" ","after":" 
"},{"index":6,"word":"to","originalText":"to","characterOffsetBegin":22,"characterOffsetEnd":24,"pos":"TO","before":" ","after":" "},{"index":7,"word":"test","originalText":"test","characterOffsetBegin":25,"characterOffsetEnd":29,"pos":"NN","before":" ","after":" "},{"index":8,"word":"sentiment","originalText":"sentiment","characterOffsetBegin":30,"characterOffsetEnd":39,"pos":"NN","before":" ","after":" "},{"index":9,"word":"ever","originalText":"ever","characterOffsetBegin":40,"characterOffsetEnd":44,"-                                                  100%[==============================================================================================================>]   4.30K  --.-KB/s    in 0s

Along with the sentiment result, the server returns a lot of detail about how it ran its NLP analysis: the parse tree, three sets of dependencies, and per-token data. You can configure different properties for different kinds of language processing. This is well documented by Stanford.
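Most of that response can be ignored if you only want the sentiment. A small sketch of pulling it out in Python follows; the function name is made up for illustration, and the sample is a trimmed-down response that uses only fields visible in the output above:

```python
import json


def sentence_sentiments(response):
    """Return (index, label, score) for each sentence in a CoreNLP JSON response."""
    results = []
    for sentence in response.get("sentences", []):
        # The server returns sentimentValue as a string, e.g. "0".
        results.append(
            (sentence["index"], sentence["sentiment"], int(sentence["sentimentValue"]))
        )
    return results


# Trimmed-down response containing only the sentiment-related fields.
sample = json.loads(
    '{"sentences":[{"index":0,"sentimentValue":"0","sentiment":"Verynegative"}]}'
)
print(sentence_sentiments(sample))  # [(0, 'Verynegative', 0)]
```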

Stanford CoreNLP Server UI

You get not only a REST API but also a nice web front end.

12053-stanfordserversentiment.png

12054-stanfordcorenlpserverresults.png

Accessing From Apache NiFi

12051-stanfordserverflowoverview.png

Step 1: Get some data (GetTwitter works nicely)

Step 2: Build a FlowFile containing only the one field to send (I extract the tweet text and then convert it to a FlowFile holding plain text, with no JSON wrapper)

Step 3: Use InvokeHTTP to call the sentiment server
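As a rough illustration of what Step 2 produces, here is the same transformation outside NiFi. This is only a sketch: the function name is hypothetical, and it assumes the tweet JSON carries the message in a top-level "text" field.

```python
import json


def tweet_to_plain_text(raw_tweet):
    """Extract the message from tweet JSON and flatten it to one line of text."""
    tweet = json.loads(raw_tweet)
    # Assumption: the message lives in a top-level "text" field.
    message = tweet.get("text", "")
    # Collapse newlines and repeated spaces so the body is a single line.
    return " ".join(message.split())
```

The returned string is what would be sent as the request body in the next step.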

http://localhost:9000/?properties=%7B%22annotators%22%3A%22tokenize%2Cssplit%2Cparse%2Csentiment%22%...

12056-stanfordserverinvokehttp2.png

12057-stanfordserverinvokehttp.png

Make sure you set Content-Type to application/json, Send Message Body to true, Always Output Response to true, Follow Redirects to true, and HTTP Method to POST.
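For comparison, the InvokeHTTP settings above correspond roughly to the following standard-library Python request. Treat it as a sketch rather than what NiFi does internally; the function name is an assumption, and the URL is passed in because the one shown in this article is truncated.

```python
import urllib.request


def build_sentiment_request(url, text):
    """Build (but do not send) a POST request resembling the InvokeHTTP setup."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),                      # FlowFile content as the message body
        headers={"Content-Type": "application/json"},  # Content-Type property
        method="POST",                                  # HTTP Method property
    )


# To send it (urlopen follows redirects by default):
# with urllib.request.urlopen(request, timeout=15) as response:
#     body = response.read().decode("utf-8")
```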

Step 4: Use the JSON NLP Results

The server can also return plain text or XML, but JSON is easy to work with.

12055-stanfordserverresults.png

{
  "sentences" : [ {
    "index" : 0,
    "parse" : "(ROOT\n  (FRAG\n    (NP (NNP RT) (NNP @MikeTamir))\n    (: :)\n    (S\n      (NP (NNP Google))\n      (VP (VBG betting)\n        (ADJP (JJ big)\n          (PP (IN on)\n            (S\n              (VP (VBG #DeepLearning)\n                (NP\n                  (NP (JJ #AI) (VBG #MachineLearning) (NN #DataScience))\n                  (: :)\n                  (NP (NNP Sundar) (NNP Pichai) (NNPS https://t.co/r5X4AnhXUo) (NNP https://t.co/c…)))))))))))",
    "basicDependencies" : [ {
      "dep" : "ROOT",
      "governor" : 0,
      "governorGloss" : "ROOT",
      "dependent" : 2,
      "dependentGloss" : "@MikeTamir"
    }, {
      "dep" : "compound",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 1,
      "dependentGloss" : "RT"
    }, {
      "dep" : "punct",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 3,
      "dependentGloss" : ":"
    }, {
      "dep" : "nsubj",
      "governor" : 5,
      "governorGloss" : "betting",
      "dependent" : 4,
      "dependentGloss" : "Google"
    }, {
      "dep" : "parataxis",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 5,
      "dependentGloss" : "betting"
    }, {
      "dep" : "xcomp",
      "governor" : 5,
      "governorGloss" : "betting",
      "dependent" : 6,
      "dependentGloss" : "big"
    }, {
      "dep" : "mark",
      "governor" : 8,
      "governorGloss" : "#DeepLearning",
      "dependent" : 7,
      "dependentGloss" : "on"
    }, {
      "dep" : "advcl",
      "governor" : 6,
      "governorGloss" : "big",
      "dependent" : 8,
      "dependentGloss" : "#DeepLearning"
    }, {
      "dep" : "amod",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 9,
      "dependentGloss" : "#AI"
    }, {
      "dep" : "amod",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 10,
      "dependentGloss" : "#MachineLearning"
    }, {
      "dep" : "dobj",
      "governor" : 8,
      "governorGloss" : "#DeepLearning",
      "dependent" : 11,
      "dependentGloss" : "#DataScience"
    }, {
      "dep" : "punct",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 12,
      "dependentGloss" : ":"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 13,
      "dependentGloss" : "Sundar"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 14,
      "dependentGloss" : "Pichai"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 15,
      "dependentGloss" : "https://t.co/r5X4AnhXUo"
    }, {
      "dep" : "dep",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 16,
      "dependentGloss" : "https://t.co/c…"
    } ],
    "enhancedDependencies" : [ {
      "dep" : "ROOT",
      "governor" : 0,
      "governorGloss" : "ROOT",
      "dependent" : 2,
      "dependentGloss" : "@MikeTamir"
    }, {
      "dep" : "compound",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 1,
      "dependentGloss" : "RT"
    }, {
      "dep" : "punct",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 3,
      "dependentGloss" : ":"
    }, {
      "dep" : "nsubj",
      "governor" : 5,
      "governorGloss" : "betting",
      "dependent" : 4,
      "dependentGloss" : "Google"
    }, {
      "dep" : "parataxis",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 5,
      "dependentGloss" : "betting"
    }, {
      "dep" : "xcomp",
      "governor" : 5,
      "governorGloss" : "betting",
      "dependent" : 6,
      "dependentGloss" : "big"
    }, {
      "dep" : "mark",
      "governor" : 8,
      "governorGloss" : "#DeepLearning",
      "dependent" : 7,
      "dependentGloss" : "on"
    }, {
      "dep" : "advcl:on",
      "governor" : 6,
      "governorGloss" : "big",
      "dependent" : 8,
      "dependentGloss" : "#DeepLearning"
    }, {
      "dep" : "amod",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 9,
      "dependentGloss" : "#AI"
    }, {
      "dep" : "amod",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 10,
      "dependentGloss" : "#MachineLearning"
    }, {
      "dep" : "dobj",
      "governor" : 8,
      "governorGloss" : "#DeepLearning",
      "dependent" : 11,
      "dependentGloss" : "#DataScience"
    }, {
      "dep" : "punct",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 12,
      "dependentGloss" : ":"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 13,
      "dependentGloss" : "Sundar"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 14,
      "dependentGloss" : "Pichai"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 15,
      "dependentGloss" : "https://t.co/r5X4AnhXUo"
    }, {
      "dep" : "dep",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 16,
      "dependentGloss" : "https://t.co/c…"
    } ],
    "enhancedPlusPlusDependencies" : [ {
      "dep" : "ROOT",
      "governor" : 0,
      "governorGloss" : "ROOT",
      "dependent" : 2,
      "dependentGloss" : "@MikeTamir"
    }, {
      "dep" : "compound",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 1,
      "dependentGloss" : "RT"
    }, {
      "dep" : "punct",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 3,
      "dependentGloss" : ":"
    }, {
      "dep" : "nsubj",
      "governor" : 5,
      "governorGloss" : "betting",
      "dependent" : 4,
      "dependentGloss" : "Google"
    }, {
      "dep" : "parataxis",
      "governor" : 2,
      "governorGloss" : "@MikeTamir",
      "dependent" : 5,
      "dependentGloss" : "betting"
    }, {
      "dep" : "xcomp",
      "governor" : 5,
      "governorGloss" : "betting",
      "dependent" : 6,
      "dependentGloss" : "big"
    }, {
      "dep" : "mark",
      "governor" : 8,
      "governorGloss" : "#DeepLearning",
      "dependent" : 7,
      "dependentGloss" : "on"
    }, {
      "dep" : "advcl:on",
      "governor" : 6,
      "governorGloss" : "big",
      "dependent" : 8,
      "dependentGloss" : "#DeepLearning"
    }, {
      "dep" : "amod",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 9,
      "dependentGloss" : "#AI"
    }, {
      "dep" : "amod",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 10,
      "dependentGloss" : "#MachineLearning"
    }, {
      "dep" : "dobj",
      "governor" : 8,
      "governorGloss" : "#DeepLearning",
      "dependent" : 11,
      "dependentGloss" : "#DataScience"
    }, {
      "dep" : "punct",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 12,
      "dependentGloss" : ":"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 13,
      "dependentGloss" : "Sundar"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 14,
      "dependentGloss" : "Pichai"
    }, {
      "dep" : "compound",
      "governor" : 16,
      "governorGloss" : "https://t.co/c…",
      "dependent" : 15,
      "dependentGloss" : "https://t.co/r5X4AnhXUo"
    }, {
      "dep" : "dep",
      "governor" : 11,
      "governorGloss" : "#DataScience",
      "dependent" : 16,
      "dependentGloss" : "https://t.co/c…"
    } ],
    "sentimentValue" : "1",
    "sentiment" : "Negative",
    "tokens" : [ {
      "index" : 1,
      "word" : "RT",
      "originalText" : "RT",
      "characterOffsetBegin" : 0,
      "characterOffsetEnd" : 2,
      "pos" : "NN",
      "before" : "",
      "after" : " "
    }, {
      "index" : 2,
      "word" : "@MikeTamir",
      "originalText" : "@MikeTamir",
      "characterOffsetBegin" : 3,
      "characterOffsetEnd" : 13,
      "pos" : "NN",
      "before" : " ",
      "after" : ""
    }, {
      "index" : 3,
      "word" : ":",
      "originalText" : ":",
      "characterOffsetBegin" : 13,
      "characterOffsetEnd" : 14,
      "pos" : ":",
      "before" : "",
      "after" : " "
    }, {
      "index" : 4,
      "word" : "Google",
      "originalText" : "Google",
      "characterOffsetBegin" : 15,
      "characterOffsetEnd" : 21,
      "pos" : "NNP",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 5,
      "word" : "betting",
      "originalText" : "betting",
      "characterOffsetBegin" : 22,
      "characterOffsetEnd" : 29,
      "pos" : "VBG",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 6,
      "word" : "big",
      "originalText" : "big",
      "characterOffsetBegin" : 30,
      "characterOffsetEnd" : 33,
      "pos" : "JJ",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 7,
      "word" : "on",
      "originalText" : "on",
      "characterOffsetBegin" : 34,
      "characterOffsetEnd" : 36,
      "pos" : "IN",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 8,
      "word" : "#DeepLearning",
      "originalText" : "#DeepLearning",
      "characterOffsetBegin" : 37,
      "characterOffsetEnd" : 50,
      "pos" : "NN",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 9,
      "word" : "#AI",
      "originalText" : "#AI",
      "characterOffsetBegin" : 51,
      "characterOffsetEnd" : 54,
      "pos" : "NN",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 10,
      "word" : "#MachineLearning",
      "originalText" : "#MachineLearning",
      "characterOffsetBegin" : 55,
      "characterOffsetEnd" : 71,
      "pos" : "NN",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 11,
      "word" : "#DataScience",
      "originalText" : "#DataScience",
      "characterOffsetBegin" : 72,
      "characterOffsetEnd" : 84,
      "pos" : "NN",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 12,
      "word" : ":",
      "originalText" : ":",
      "characterOffsetBegin" : 85,
      "characterOffsetEnd" : 86,
      "pos" : ":",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 13,
      "word" : "Sundar",
      "originalText" : "Sundar",
      "characterOffsetBegin" : 87,
      "characterOffsetEnd" : 93,
      "pos" : "NNP",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 14,
      "word" : "Pichai",
      "originalText" : "Pichai",
      "characterOffsetBegin" : 94,
      "characterOffsetEnd" : 100,
      "pos" : "NNP",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 15,
      "word" : "https://t.co/r5X4AnhXUo",
      "originalText" : "https://t.co/r5X4AnhXUo",
      "characterOffsetBegin" : 101,
      "characterOffsetEnd" : 124,
      "pos" : "NN",
      "before" : " ",
      "after" : " "
    }, {
      "index" : 16,
      "word" : "https://t.co/c…",
      "originalText" : "https://t.co/c…",
      "characterOffsetBegin" : 125,
      "characterOffsetEnd" : 140,
      "pos" : "NN",
      "before" : " ",
      "after" : ""
    } ]
  } ]
}
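Beyond the sentiment label, the tokens array in a response like the one above is handy for downstream filtering. The sketch below uses hypothetical helper names and a sample built from a few of the tokens shown; it picks out words by part-of-speech tag and collects hashtags:

```python
def words_by_pos(response, pos_prefix):
    """Return words whose part-of-speech tag starts with pos_prefix."""
    return [
        token["word"]
        for sentence in response.get("sentences", [])
        for token in sentence.get("tokens", [])
        if token["pos"].startswith(pos_prefix)
    ]


def hashtags(response):
    """Return tokens that begin with '#'."""
    return [
        token["word"]
        for sentence in response.get("sentences", [])
        for token in sentence.get("tokens", [])
        if token["word"].startswith("#")
    ]


# A few tokens copied from the response above; all other fields are omitted.
sample = {
    "sentences": [
        {
            "tokens": [
                {"word": "RT", "pos": "NN"},
                {"word": "Google", "pos": "NNP"},
                {"word": "betting", "pos": "VBG"},
                {"word": "#DeepLearning", "pos": "NN"},
                {"word": "Sundar", "pos": "NNP"},
                {"word": "Pichai", "pos": "NNP"},
            ]
        }
    ]
}
print(words_by_pos(sample, "NNP"))  # ['Google', 'Sundar', 'Pichai']
print(hashtags(sample))             # ['#DeepLearning']
```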

References:

Another simple option for Sentiment Analysis and NLP integration is to use Apache NiFi's ExecuteScript to call various Python libraries. This is well documented here: https://community.hortonworks.com/articles/76935/using-sentiment-analysis-and-nlp-tools-with-hdp-25....

http://stanfordnlp.github.io/CoreNLP/

https://github.com/stanfordnlp/CoreNLP/

http://stanfordnlp.github.io/CoreNLP/download.html
