Hi, let me briefly describe my problem:
Initial task was to try some NLP practices in close real-world problems. We decided to start with simple classification problem - predict a genre for a music lyric. We had a strong requirement of using Java, Spark 2.0.0 with ML library (not MLLib). ML library has a limited number of algorithms, so we started with binary classification for 2 genres and simple pipeline with Word2Vec and Logistic Regression. It showed acceptable results. Than we decided to add one genre. So we had to some other algorithm, because Logistic Regression works only for binary problems. So we've tried 3 approaches:
Maybe you could recommend some other approach based on your experience? Any help is very appreciated. I have to reming that we're strongly tied to Spark 2.0.0 + ML library due to DevOps infrastructure.