Classifying music lyrics

New Contributor

Hi, let me briefly describe my problem:

Our initial task was to try some NLP practices on problems close to the real world. We decided to start with a simple classification problem: predicting the genre of a song lyric. We had a strong requirement to use Java and Spark 2.0.0 with the ML library (not MLlib). The ML library has a limited number of algorithms, so we started with binary classification for 2 genres and a simple pipeline with Word2Vec and Logistic Regression. It showed acceptable results. Then we decided to add one more genre, so we had to use some other algorithm, because Logistic Regression in the Spark 2.0.0 ML library works only for binary problems. We've tried 3 approaches:

1. Bag of words + Naive Bayes - around 82% precision
2. Word2Vec + Logistic Regression + One vs Rest - around 65% precision
3. Word2Vec + MinMaxScaler + Naive Bayes - around 58% precision

The first approach showed good results, but our concern is that Bag of Words is a bit dated and doesn't handle similar words well. So we're still searching for a better solution based on Word2Vec. We thought about trying Decision Tree or Random Forest, but we're not sure how well they will perform with large vectors (100, 200, 300 dimensions) and large datasets. I have read that tree-based approaches work best when you have a small number of features.
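For readers unfamiliar with the One vs Rest step in the second approach, here is a minimal plain-Java sketch of the reduction: one binary logistic regression is trained per genre (that genre against all others), and prediction picks the genre whose model gives the highest score. This is not Spark's implementation and not the pipeline used in the post. The class name, the method names, the toy 2-dimensional vectors (standing in for Word2Vec document vectors), and the training settings are all assumptions made for illustration.

```java
final class OneVsRestSketch {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Linear score w.x + b; the bias b is stored as the last weight.
    static double score(double[] w, double[] x) {
        double s = w[w.length - 1];
        for (int j = 0; j < x.length; j++) {
            s += w[j] * x[j];
        }
        return s;
    }

    // Binary logistic regression fitted with plain stochastic gradient descent.
    static double[] trainBinary(double[][] x, int[] y, int epochs, double learningRate) {
        int d = x[0].length;
        double[] w = new double[d + 1];
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < x.length; i++) {
                double error = sigmoid(score(w, x[i])) - y[i];
                for (int j = 0; j < d; j++) {
                    w[j] -= learningRate * error * x[i][j];
                }
                w[d] -= learningRate * error;
            }
        }
        return w;
    }

    // One vs Rest: relabel the data as "class k" vs "everything else" for each k.
    static double[][] trainOneVsRest(double[][] x, int[] labels, int numClasses) {
        double[][] models = new double[numClasses][];
        for (int k = 0; k < numClasses; k++) {
            int[] binary = new int[labels.length];
            for (int i = 0; i < labels.length; i++) {
                binary[i] = (labels[i] == k) ? 1 : 0;
            }
            // Epoch count and learning rate are arbitrary choices for the toy data.
            models[k] = trainBinary(x, binary, 500, 0.1);
        }
        return models;
    }

    // Predict the class whose binary model returns the highest score.
    static int predict(double[][] models, double[] x) {
        int best = 0;
        for (int k = 1; k < models.length; k++) {
            if (score(models[k], x) > score(models[best], x)) {
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical document vectors for three genres (0, 1, 2).
        double[][] features = {
            {1.0, 0.0}, {0.9, 0.1},
            {0.0, 1.0}, {0.1, 0.9},
            {-1.0, -1.0}, {-0.9, -1.1}
        };
        int[] genres = {0, 0, 1, 1, 2, 2};
        double[][] models = trainOneVsRest(features, genres, 3);
        System.out.println("Predicted genre: " + predict(models, new double[] {0.8, 0.2}));
    }
}
```

Because each binary model only sees "this genre or not", the per-genre scores are not calibrated against each other, which is one possible reason a One vs Rest setup can trail a natively multiclass model such as Naive Bayes.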

Maybe you could recommend some other approach based on your experience? Any help is much appreciated. I should remind you that we're strongly tied to Spark 2.0.0 and its ML library because of our DevOps infrastructure.

1 ACCEPTED SOLUTION

Re: Classifying music lyrics

Super Guru

Have you tried DeepLearning4J with Spark, or H2O with Spark?

Both of those support very sophisticated NLP.

https://dzone.com/articles/in-progress-natural-language-processing
