A semantic role in language is the relationship that a syntactic constituent has with a predicate. Typical semantic arguments include Agent, Patient, and Instrument, as well as adjunct arguments indicating Locative, Temporal, Manner, or Cause aspects. Recognizing and labeling semantic arguments is a key step toward answering "Who", "When", "What", "Where", and "Why" questions in Information Extraction, Question Answering, Summarization, and, in general, in any NLP task that requires some kind of semantic interpretation.
The following sentence, taken from the PropBank corpus, exemplifies the annotation of semantic roles:
[A0 He ] [AM-MOD would ] [AM-NEG n't ] [V accept ] [A1 anything of value ] from [A2 those he was writing about ] .
Here, the roles for the predicate accept (that is, the roleset of the predicate) are defined in the PropBank Frames scheme as:
V: verb
A0: acceptor
A1: thing accepted
A2: accepted-from
A3: attribute
AM-MOD: modal
AM-NEG: negation
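To make the bracketed notation concrete, the following is a minimal sketch (not the official CoNLL data reader; the function name and regular expression are illustrative assumptions) that extracts the (role, phrase) pairs from a PropBank-style annotated sentence like the one above:

```python
import re

def parse_roles(annotated):
    # Sketch only, not the official CoNLL reader: each "[ROLE tokens ]"
    # span yields one (role, phrase) pair; tokens outside any bracket
    # (e.g. "from" in the example) carry no role.
    return [(role, phrase.strip())
            for role, phrase in re.findall(r"\[(\S+) ([^\]]+)\]", annotated)]

sentence = ("[A0 He ] [AM-MOD would ] [AM-NEG n't ] [V accept ] "
            "[A1 anything of value ] from [A2 those he was writing about ] .")
print(parse_roles(sentence))
# [('A0', 'He'), ('AM-MOD', 'would'), ('AM-NEG', "n't"), ('V', 'accept'),
#  ('A1', 'anything of value'), ('A2', 'those he was writing about')]
```

Note that one such list is produced per target verb: a sentence with several predicates is annotated with one labeling per predicate, each with its own roleset.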
The Shared Tasks of CoNLL-2004 and CoNLL-2005 concerned the recognition of semantic roles for the English language, based on PropBank predicate-argument structures. Given a sentence, the task consists of analyzing the propositions expressed by some target verbs of the sentence. In particular, for each target verb all the constituents in the sentence which fill a semantic role of the verb have to be recognized. We will refer to this problem as Semantic Role Labeling (SRL).
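The shared tasks score systems at the argument level with precision, recall, and F1. The sketch below (an assumption about the bookkeeping, not the official evaluation script) illustrates the idea: a predicted argument counts as correct only if both its role label and its span exactly match a gold argument.

```python
def prf(gold, pred):
    # Arguments are (role, span) pairs; exact match on both is required.
    gold, pred = set(gold), set(pred)
    correct = len(gold & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Hypothetical gold vs. predicted arguments, spans as token offsets:
# the A1 span is off by one token, so it counts as an error.
gold = {("A0", (0, 0)), ("V", (3, 3)), ("A1", (4, 6))}
pred = {("A0", (0, 0)), ("V", (3, 3)), ("A1", (4, 5))}
print(prf(gold, pred))
```

With 2 of 3 predictions correct, precision, recall, and F1 all come out to 2/3 here.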
As in all previous CoNLL shared tasks, the general goal is to develop machine learning strategies that address the proposed NLP problem, SRL in the present case. In CoNLL-2004, the goal was to develop SRL systems based on partial parsing information (see the main conclusions, system descriptions, and evaluations of that edition). In CoNLL-2005, the main focus of interest was to increase the amount of syntactic and semantic input information, aiming to boost the performance of machine learning systems on the SRL task. As in earlier editions of the shared task, the input contained several levels of annotation apart from the role labeling information: words, PoS tags, chunks, clauses, named entities, and parse trees. Participants were encouraged to propose novel learning architectures that better exploit the data structures, relations, and constraints of the problem.
Compared to the shared task of CoNLL-2004, the novelties introduced in the 2005 edition were: