ICWSM Datasets
- ICWSM 2011 Spinn3r Dataset
This dataset, provided by Spinn3r.com, is a continuation of the 2009 Spinn3r Dataset. It consists of over 386 million blog posts, news articles, classifieds, forum posts, and social media content posted between January 13th and February 14th, 2011.
- ICWSM 2009 Spinn3r Blog Dataset
The dataset, provided by Spinn3r.com, is a set of 44 million blog posts made between August 1st and October 1st, 2008.
- JDPA Sentiment Corpus
The JDPA Corpus consists of user-generated content (blog posts) containing opinions about automobiles and digital cameras. The posts have been manually annotated for named, nominal, and pronominal mentions of entities, and each entity is marked with the aggregate sentiment expressed toward it in the document.
Note that these datasets are free, but researchers must contact ICWSM and sign a usage agreement to be granted access.