One Million Posts Corpus

The “One Million Posts” corpus is an annotated dataset of German-language user comments posted to an Austrian newspaper website. It comprises approximately one million posts, approximately 11,000 of which are manually annotated with the following categories: sentiment (negative/neutral/positive), off-topic (yes/no), inappropriate (yes/no), discriminating (yes/no), feedback to the article author (yes/no), personal stories by the user (yes/no), and arguments used (yes/no).
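
To make the annotation scheme concrete, the categories listed above can be sketched as a small Python record. This is only an illustration: the names `Sentiment`, `PostAnnotation`, `post_id`, and `labelled_categories` are hypothetical and do not reflect the corpus's actual file format or schema. The sketch also assumes that an annotated post need not carry a label for every category, so each field may be left unset.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Sentiment(Enum):
    """Three-way sentiment label, as described in the category list."""
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    POSITIVE = "positive"


@dataclass
class PostAnnotation:
    """Hypothetical container for the labels of one annotated post.

    None means "no label recorded for this category" (an assumption of
    this sketch, not a documented property of the corpus).
    """
    post_id: int
    sentiment: Optional[Sentiment] = None
    off_topic: Optional[bool] = None
    inappropriate: Optional[bool] = None
    discriminating: Optional[bool] = None
    feedback: Optional[bool] = None          # feedback to the article author
    personal_stories: Optional[bool] = None  # personal stories by the user
    arguments_used: Optional[bool] = None

    def labelled_categories(self) -> List[str]:
        """Return the names of the categories that carry a label."""
        return [
            name
            for name, value in vars(self).items()
            if name != "post_id" and value is not None
        ]


# Example: a post labelled only for sentiment and off-topic.
example = PostAnnotation(post_id=1, sentiment=Sentiment.NEGATIVE, off_topic=False)
print(example.labelled_categories())
```

Keeping the yes/no categories as optional booleans separates "labelled as no" from "not labelled", which matters when only a subset of posts is annotated.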

Publications

Authors

Licence

Sponsor

Key facts