Later, in 2004, the group collected a Blog Authorship Corpus (BAC; Schler et al. 2006), containing about 700,000 posts (in total about 140 million words) by almost 20,000 bloggers. Slightly more information seems to come from content (75.1% accuracy) than from style (72.0% accuracy). The women focus on personal matters, which leads to important content words like love and boyfriend, and important style words like I and other personal pronouns.
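The distinction between content and style features can be illustrated with a small sketch. It is not the method of the cited study: the function-word list `STYLE_WORDS`, the helper `split_features`, and the naive whitespace tokenizer are all hypothetical, chosen only to show how the tokens of one text might be divided into the two feature sets.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration; real studies use far larger, carefully chosen lists).
STYLE_WORDS = {"i", "me", "my", "you", "he", "she", "we", "they",
               "the", "a", "and", "of", "to"}


def split_features(text):
    """Return (style_counts, content_counts) for a single text."""
    # Naive tokenization: split on whitespace, strip surrounding punctuation.
    tokens = [t.strip(".,!?;:\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    # Tokens on the function-word list count as style features,
    # everything else as content features.
    style = Counter(t for t in tokens if t in STYLE_WORDS)
    content = Counter(t for t in tokens if t not in STYLE_WORDS)
    return style, content


style, content = split_features("I love my boyfriend and I told him.")
```

A classifier trained only on the `style` counts, or only on the `content` counts, would then allow the kind of comparison reported above.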
We then experimented with several author profiling techniques, namely Support Vector Regression (as provided by LIBSVM; Chang and Lin 2011), Linguistic Profiling (LP; van Halteren 2004), and TiMBL (Daelemans et al.
The resource would become even more useful if we could deduce complete and correct metadata from the various available information sources, such as the provided metadata, user relations, profile photos, and the text of the tweets.