CoNLL-2004 Shared Task:
Semantic Role Labeling: Data, Systems, and Results
The CoNLL-2004 Shared Task on Semantic Role Labeling took place
from December 2003 to March 2004. Participants' systems and
results were presented at the CoNLL-2004 conference in Boston in
May 2004.
Ten systems participated in the closed challenge. No system was
submitted to the open challenge. A description of the task, with a
discussion of systems and results, can be found below in the
introduction paper [CM04], together with the papers, outputs and
presentation slides of all systems.
The following tables contain the overall evaluation figures of all
systems on the development and test sets. The significance
intervals for the F_1 scores were obtained with bootstrap
resampling [Nor89]. An F_1 score that falls outside the interval of
another system is assumed to be significantly different from that
system's F_1 score (p < 0.05).
+----------------+-----------+-----------+---------------+
| Development    | Precision | Recall    | F_1           |
+----------------+-----------+-----------+---------------+
| HPWMJ04        | 74.18%    | 69.43%    | 71.72 ±0.8    |
| PRYZT04        | 71.96%    | 64.93%    | 68.26 ±0.9    |
| CMC04          | 73.40%    | 63.70%    | 68.21 ±1.0    |
| LHPR04         | 69.78%    | 62.57%    | 65.97 ±0.9    |
| PHR04          | 67.27%    | 64.36%    | 65.78 ±0.8    |
| Hi04           | 65.59%    | 60.16%    | 62.76 ±1.0    |
| VCDHT04        | 69.06%    | 57.84%    | 62.95 ±0.9    |
| Ko04           | 44.93%    | 63.12%    | 52.50 ±0.9    |
| BEPP04         | 64.90%    | 41.61%    | 50.71 ±0.9    |
| WDM04          | 53.37%    | 32.43%    | 40.35 ±0.9    |
+----------------+-----------+-----------+---------------+
| baseline       | 50.63%    | 30.30%    | 37.91 ±1.0    |
+----------------+-----------+-----------+---------------+
+----------------+-----------+-----------+---------------+
| Test           | Precision | Recall    | F_1           |
+----------------+-----------+-----------+---------------+
| HPWMJ04        | 72.43%    | 66.77%    | 69.49 ±1.0    |
| PRYZT04        | 70.07%    | 63.07%    | 66.39 ±1.0    |
| CMC04          | 71.81%    | 61.11%    | 66.03 ±1.1    |
| LHPR04         | 68.42%    | 61.47%    | 64.76 ±1.1    |
| PHR04          | 65.63%    | 62.43%    | 63.99 ±1.0    |
| Hi04           | 64.17%    | 57.52%    | 60.66 ±1.1    |
| VCDHT04        | 67.12%    | 54.46%    | 60.13 ±1.0    |
| Ko04           | 56.86%    | 49.95%    | 53.18 ±0.9    |
| BEPP04         | 65.73%    | 42.60%    | 51.70 ±1.0    |
| WDM04          | 58.08%    | 34.75%    | 43.48 ±1.1    |
+----------------+-----------+-----------+---------------+
| baseline       | 54.60%    | 31.39%    | 39.87 ±1.0    |
+----------------+-----------+-----------+---------------+
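The ± intervals in the F_1 column come from bootstrap resampling. The sketch below shows one way such an interval could be computed. It is not the official evaluation software: the per-sentence count tuples, the function names, the number of resamples, and the percentile method are all assumptions made for illustration.

```python
import random


def f1_score(correct, predicted, gold):
    """Return F_1 in percent from argument counts.

    correct:   predicted arguments that match a gold argument
    predicted: all arguments output by the system
    gold:      all arguments in the reference annotation
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


def bootstrap_f1_interval(sentence_counts, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F_1.

    sentence_counts is a list of (correct, predicted, gold) tuples, one per
    sentence (an assumed input format). Sentences are resampled with
    replacement and F_1 is recomputed on each resampled set.
    """
    rng = random.Random(seed)
    n = len(sentence_counts)
    scores = []
    for _ in range(n_resamples):
        sample = [sentence_counts[rng.randrange(n)] for _ in range(n)]
        correct = sum(c for c, _, _ in sample)
        predicted = sum(p for _, p, _ in sample)
        gold = sum(g for _, _, g in sample)
        scores.append(f1_score(correct, predicted, gold))
    scores.sort()
    low = scores[int((alpha / 2) * n_resamples)]
    high = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return low, high


def significantly_different(other_f1, interval):
    """Apply the rule stated above: a score outside the interval differs."""
    low, high = interval
    return other_f1 < low or other_f1 > high
```

Given per-sentence counts for one system, `bootstrap_f1_interval(counts)` returns the lower and upper bounds of its interval, and `significantly_different(other_f1, interval)` checks another system's F_1 against it.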
Post-conference contributions: If you have a system developed in the
setting of the CoNLL-2004 Shared Task, and a description of it in an
official document (technical report or published paper), please contact
us by sending an email to srlconll <at> lsi.upc.edu.
We'll be glad to add an entry for your system in the results table.
Task Description Paper
- [CM04]
Xavier Carreras and Lluís Màrquez, Introduction to the CoNLL-2004
Shared Task: Semantic Role Labeling.
[ps] [ps.gz] [pdf] [slides (pdf)], including a detailed comparative
evaluation of systems on different aspects of the task.
System Description Papers, Outputs and Slides
- [BEPP04]
Ulrike Baldewein, Katrin Erk, Sebastian Padó and
Detlef Prescher, Semantic Role
Labeling With Chunk Sequences.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [CMC04]
Xavier Carreras, Lluís Màrquez and Grzegorz Chrupała, Hierarchical Recognition of Propositional
Arguments with Perceptrons.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [HPWMJ04]
Kadri Hacioglu, Sameer Pradhan, Wayne Ward, James H. Martin and Daniel
Jurafsky, Semantic Role Labeling by
Tagging Syntactic Chunks.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [Hi04]
Derrick Higgins, A transformation-based approach to argument labeling.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [Ko04]
Beata Kouchnir, A Memory-Based
Approach for Semantic Role Labeling.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [LHPR04]
Joon-Ho Lim, Young-Sook Hwang, So-Young Park and Hae-Chang Rim, Semantic Role Labeling using Maximum
Entropy Model.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [PHR04]
Kyung-Mi Park, Young-Sook Hwang and Hae-Chang Rim, Two-Phase Semantic Role Labeling based on
Support Vector Machines.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [PRYZT04]
Vasin Punyakanok, Dan Roth, Wen-Tau Yih, Dav Zimak and Yuancheng Tu, Semantic Role Labeling Via Generalized
Inference Over Classifiers.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [VCDHT04]
Antal van den Bosch, Sander Canisius, Walter Daelemans, Iris
Hendrickx and Erik Tjong Kim Sang, Memory-based semantic role labeling:
Optimizing features, algorithm, and output.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
- [WDM04]
Ken Williams, Christopher Dozier and Andrew McCulloh, Learning Transformation Rules for Semantic
Role Labeling.
[pdf] [slides]
[output.dev.gz] [output.test.gz] [results.dev] [results.test]
Datasets, Software and Resources
You can download the distribution
package (released in March 2004) containing data, official
resources, evaluation software, and a baseline system. See the README
file for complete information on its contents.
Last Update: January 28, 2005. Xavier Carreras, Lluís Màrquez.