We achieved the best accuracy for German Sentiment Analysis as of 2018

Inspired by Jeremy Howard, we adopted his recent discovery for handling morphologically rich languages. Our tests showed that we can easily achieve better accuracy than state-of-the-art models on the GermEval 2017 data set.

Unfortunately, this does not mean that we have the best sentiment prediction model, because universal sentiment analysis does not exist. We can achieve high accuracy on a specific domain such as tweets, but such a model won’t handle other domains unless it is trained on them. To show how universal sentiment analysis fails, we ran Google Cloud Platform on GermEval 2017 and achieved very low accuracy.

Solution                          GermEval2017 task 1a    GermEval2017 task 1b
Google Cloud Platform Generic*    0.654                   0.671
Google Cloud Platform AutoML**    ?                       ?
Sayyed et al. (2017)              0.733                   0.750
Naderalvojoud et al. (2017)       0.749                   0.736
Ours                              0.765                   0.781

*) Google Cloud Platform is completely lost on the data provided by GermEval 2017.

**) We are in the process of training a custom model using Google AutoML on GermEval 2017 (GE17) to see how this approach compares.
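A generic sentiment service typically returns a continuous score rather than the positive, neutral, and negative labels used in GermEval 2017, so comparing it with a trained classifier needs a mapping step first. The sketch below is one possible way to do that and then compute accuracy. It is not the evaluation code behind the numbers above: the thresholds, the function names, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical cut-offs on a score in [-1, 1]. The post does not say how
# generic scores were mapped to labels, so these values are assumptions.
NEG_THRESHOLD = -0.25
POS_THRESHOLD = 0.25


def score_to_label(score: float) -> str:
    """Map a continuous sentiment score to one of three class labels."""
    if score <= NEG_THRESHOLD:
        return "negative"
    if score >= POS_THRESHOLD:
        return "positive"
    return "neutral"


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty data set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Toy data for illustration only; these are not GermEval 2017 documents.
toy_scores = [-0.8, 0.1, 0.6, -0.1]
toy_gold = ["negative", "neutral", "positive", "negative"]

toy_pred = [score_to_label(s) for s in toy_scores]
print(accuracy(toy_gold, toy_pred))  # 0.75
```

The choice of thresholds can shift the result noticeably, which is one reason a generic score is hard to compare fairly with a model trained on the target domain.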

Our model is open source, and we are working on a way to expose it so that other researchers can evaluate it. If you need a model trained on your own data to classify documents, feel free to contact us.