At DALI, we achieve accurate, high-quality data annotation by crowdsourcing via Games With a Purpose. We aim to let our players contribute to science in a fun and intuitive way!
Phrase Detectives is an online crowdsourcing game with a validation stage, primarily designed to collect data about English (and subsequently Italian) anaphoric coreference. Anaphoric coreference is a type of linguistic reference in which one expression depends on another referential element. An example is the relation between the entity 'Jon' and the pronoun 'his' in the text 'Jon rode his bike to school.'
The game uses two styles of text annotation for players to complete a linguistic task. Initially, text is presented in Annotation Mode (called Name the Culprit in the game). This is a traditional annotation method in which the player makes an interpretation (annotation decision) about a highlighted markable (section of text). If different players enter different interpretations for a markable, each interpretation is presented to more players in Validation Mode (called Detectives Conference in the game). The players in Validation Mode have to agree or disagree with the interpretation. Players may also comment on the task and/or skip it if they do not want to provide an interpretation.
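The hand-off from Annotation Mode to Validation Mode described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name, the tuple layout and the use of `None` for a skipped task are hypothetical and are not taken from the game's actual implementation.

```python
from collections import defaultdict


def interpretations_needing_validation(annotations):
    """Return the markables on which players disagreed.

    `annotations` is an iterable of (markable_id, player_id, interpretation)
    tuples (a hypothetical layout). An interpretation of None stands for a
    skipped task and is ignored.
    """
    by_markable = defaultdict(set)
    for markable_id, _player_id, interpretation in annotations:
        if interpretation is None:  # player skipped the task
            continue
        by_markable[markable_id].add(interpretation)

    # Only markables with more than one distinct interpretation go on to
    # Validation Mode; each distinct interpretation is then shown to more players.
    return {
        markable_id: sorted(interpretations)
        for markable_id, interpretations in by_markable.items()
        if len(interpretations) > 1
    }


example = [
    ("m1", "p1", "refers to Jon"),
    ("m1", "p2", "new entity"),
    ("m2", "p1", "new entity"),
    ("m2", "p3", "new entity"),
    ("m2", "p4", None),
]
to_validate = interpretations_needing_validation(example)
```

In this example only `m1` would be sent to Validation Mode, because the players who answered `m2` all gave the same interpretation.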
Players could label markables as DN (discourse-new, where the markable refers to a newly introduced entity), DO (discourse-old, where the markable refers to an entity already mentioned in the text), NR (non-referring, where the markable does not refer to anything) or PR (where the markable represents a property of a previously mentioned entity).
The first publicly available dataset (Phrase Detectives Corpus 1.0) was used to assess the collective quality of the players, as well as the quality of individual decisions. Full details of the game and corpora, including the processing pipeline, descriptive statistics and gold standard creation, have been published (see publications section).
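As a minimal sketch, the four labels could be represented as an enumeration, with an annotation record that optionally points back to an earlier markable. The class names, field names and the short description given for PR are assumptions made for illustration, not the format of the released corpus.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MarkableLabel(Enum):
    DN = "discourse-new"   # refers to a newly introduced entity
    DO = "discourse-old"   # refers to an entity already mentioned in the text
    NR = "non-referring"   # does not refer to anything
    PR = "property"        # a property of a previously mentioned entity (name assumed)


@dataclass
class Annotation:
    markable: str
    label: MarkableLabel
    antecedent: Optional[str] = None  # earlier markable, where the label needs one


# 'Jon rode his bike to school.'
jon = Annotation(markable="Jon", label=MarkableLabel.DN)
his = Annotation(markable="his", label=MarkableLabel.DO, antecedent="Jon")
```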
TileAttack is a two-player game whose aim is to identify the parts of a text that refer to objects, e.g., people, places, locations. (These parts of a sentence are generally known as noun phrases; we call them mentions.) In the game you are randomly matched with a second player and score points for agreeing on a mention, but even more points for finding that mention first. There are many different types of mentions, both simple ones (e.g., He, Bob, the CEO, London) and more complex ones (the classical music fan, the capital of the UK). Your aim in the game is to identify as many of them as you can. Keep in mind that markables can be nested: e.g., the capital of the UK is a markable, but the UK is a markable as well!
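One way to picture agreement and nesting is to treat each mention as a span of token positions. The sketch below assumes half-open `(start, end)` token offsets and hypothetical helper names; it says nothing about how TileAttack stores mentions or awards points internally.

```python
def agreed_mentions(player_a, player_b):
    """Spans that both players marked (the basis for scoring agreement)."""
    return set(player_a) & set(player_b)


def is_nested(inner, outer):
    """True if `inner` lies inside `outer` and the two spans are not identical."""
    return inner != outer and outer[0] <= inner[0] and inner[1] <= outer[1]


tokens = "the capital of the UK".split()
outer = (0, 5)  # "the capital of the UK"
inner = (3, 5)  # "the UK"

both = agreed_mentions([outer, inner], [inner])
```

Here both spans are valid mentions, `inner` is nested inside `outer`, and the two players agree only on "the UK".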