Explorer
Posts: 12
Registered: 03-18-2016

Spark NLP provision

Hello, does Spark provide NLP features such as part-of-speech tagging, tokenization, and entity coreference resolution, as OpenNLP does? If so, kindly send us a link or any workarounds. Thanks.
Cloudera Employee
Posts: 366
Registered: 07-29-2013

Re: Spark NLP provision

Spark has no particular support for NLP, no. You can use third-party
libraries for this.
Explorer
Posts: 12
Registered: 03-18-2016

Re: Spark NLP provision

So does this imply that "word2vec" in Spark MLlib is not related to NLP? Is there any mention somewhere of using Stanford NLP with Spark?

Cloudera Employee
Posts: 366
Registered: 07-29-2013

Re: Spark NLP provision

word2vec is just a means of translating bags of items to a vector
space representation. I myself don't call that NLP per se, but it is
used to make feature vectors from text. NLP to me is more like
stemming and sentiment analysis. For this you'd be calling
third-party libraries, like the Stanford NLP library, or building your
own NLP processes on top of generic implementations of, say, LDA in
Spark.
Explorer
Posts: 12
Registered: 03-18-2016

Re: Spark NLP provision

Thanks. Is there a Java example that uses "word2vec" to compute the cosine similarity between texts?
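
For what it's worth, one common way to get a text-level cosine similarity from word vectors is to average the vectors of the words in each text and compare the two averages. The sketch below is plain Java with no Spark dependency. The class name, the method names, and the toy two-dimensional vectors are all made up for illustration; in practice the word vectors would come from a fitted Word2VecModel, and averaging is only one of several possible ways to combine them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Spark.
final class TextCosineSketch {

    // Average the vectors of all tokens that have an entry in wordVectors.
    // Tokens without a vector are skipped; an all-zero vector is returned
    // if none of the tokens are known.
    static double[] averageVector(List<String> tokens,
                                  Map<String, double[]> wordVectors,
                                  int dim) {
        double[] sum = new double[dim];
        int found = 0;
        for (String token : tokens) {
            double[] v = wordVectors.get(token);
            if (v == null) {
                continue;
            }
            for (int i = 0; i < dim; i++) {
                sum[i] += v[i];
            }
            found++;
        }
        if (found > 0) {
            for (int i = 0; i < dim; i++) {
                sum[i] /= found;
            }
        }
        return sum;
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); defined as 0 for a zero vector.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors invented for the example; real ones come from a trained model.
        Map<String, double[]> wordVectors = new HashMap<>();
        wordVectors.put("spark", new double[] {1.0, 0.0});
        wordVectors.put("hadoop", new double[] {0.8, 0.2});
        wordVectors.put("banana", new double[] {0.0, 1.0});

        double[] textA = averageVector(Arrays.asList("spark", "hadoop"), wordVectors, 2);
        double[] textB = averageVector(Arrays.asList("banana"), wordVectors, 2);
        System.out.println(cosine(textA, textB));
    }
}
```

Skipping unknown words and returning 0 for an all-zero vector are choices made for this sketch; another application might prefer to throw an error instead.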

In the Javadoc, what does <S> indicate?

I am unable to use
fit(JavaRDD<S> dataset)

<S extends Iterable<String>>
Word2VecModel fit(JavaRDD<S> dataset)
Computes the vector representation of each word in vocabulary (Java version).
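
On the <S> question: in Java, `<S extends Iterable<String>>` in front of a method declares a generic type parameter with an upper bound. S stands for any type that can be iterated to yield Strings, such as List<String> or Set<String>, so a JavaRDD<List<String>> (one list of tokens per document) satisfies the signature. The sketch below shows the same bound in plain Java without Spark. `countWords` and the class name are hypothetical stand-ins that only mimic the shape of the signature, with a List taking the place of JavaRDD.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not Spark code.
final class BoundedTypeDemo {

    // Same bound as in the Javadoc: S may be any type that is an Iterable<String>.
    // A List stands in for JavaRDD here so the example runs without Spark.
    static <S extends Iterable<String>> int countWords(List<S> dataset) {
        int total = 0;
        for (S sentence : dataset) {
            for (String word : sentence) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // S is inferred as List<String>: each element is one tokenized sentence.
        List<List<String>> asLists = Arrays.asList(
                Arrays.asList("spark", "has", "word2vec"),
                Arrays.asList("opennlp", "has", "taggers"));
        System.out.println(countWords(asLists));

        // S is inferred as Set<String>: also accepted, since Set<String> is an Iterable<String>.
        Set<String> tokens = new LinkedHashSet<>(Arrays.asList("stanford", "nlp"));
        List<Set<String>> asSets = Arrays.asList(tokens);
        System.out.println(countWords(asSets));
    }
}
```

A collection of plain String elements would not compile against this bound, because String is not an Iterable<String>. Each document has to be split into tokens first.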