CoNLL-2004 Shared Task: 

Semantic Role Labeling: Data, Systems, and Results





The CoNLL-2004 Shared Task on Semantic Role Labeling took place from December 2003 to March 2004. Participants' systems and results were presented at the CoNLL-2004 conference in Boston in May 2004.

Ten systems participated in the closed challenge. No system was presented at the open challenge. A description of the task, with a discussion of systems and results, can be found below in the introduction paper [CM04], together with the papers, outputs and presentation slides of all systems.

The following tables contain the overall evaluation figures of all the systems on the development and test sets. The significance intervals for the F_1 rates were obtained with bootstrap resampling [Nor89]. F_1 rates outside these intervals are assumed to be significantly different from the related F_1 rate (p < 0.05).


+----------------+-----------+-----------+---------------+
| Development    | Precision | Recall    | F_1           |
+----------------+-----------+-----------+---------------+
| HPWMJ04        | 74.18%    | 69.43%    | 71.72 ±0.8    |
| PRYZT04        | 71.96%    | 64.93%    | 68.26 ±0.9    |
| CMC04          | 73.40%    | 63.70%    | 68.21 ±1.0    |
| LHPR04         | 69.78%    | 62.57%    | 65.97 ±0.9    |
| PHR04          | 67.27%    | 64.36%    | 65.78 ±0.8    |
| Hi04           | 65.59%    | 60.16%    | 62.76 ±1.0    |
| VCDHT04        | 69.06%    | 57.84%    | 62.95 ±0.9    |
| Ko04           | 44.93%    | 63.12%    | 52.50 ±0.9    |
| BEPP04         | 64.90%    | 41.61%    | 50.71 ±0.9    |
| WDM04          | 53.37%    | 32.43%    | 40.35 ±0.9    |
+----------------+-----------+-----------+---------------+
| baseline       | 50.63%    | 30.30%    | 37.91 ±1.0    |
+----------------+-----------+-----------+---------------+
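
As a reading aid for the table above, the F_1 column is the harmonic mean of precision and recall. The sketch below recomputes it from the published percentages; the function name f1_score is ours and is not part of the official evaluation software. Because the published precision and recall are already rounded, a recomputed value can differ from the table in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two rows of the development table: (precision, recall, published F_1).
rows = {
    "HPWMJ04": (74.18, 69.43, 71.72),
    "baseline": (50.63, 30.30, 37.91),
}

for name, (p, r, published) in rows.items():
    recomputed = f1_score(p, r)
    # Differences in the second decimal come from rounding of the inputs.
    print(f"{name}: recomputed {recomputed:.2f}, published {published:.2f}")
```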


+----------------+-----------+-----------+---------------+
| Test           | Precision | Recall    | F_1           |
+----------------+-----------+-----------+---------------+
| HPWMJ04        | 72.43%    | 66.77%    | 69.49 ±1.0    |
| PRYZT04        | 70.07%    | 63.07%    | 66.39 ±1.0    |
| CMC04          | 71.81%    | 61.11%    | 66.03 ±1.1    |
| LHPR04         | 68.42%    | 61.47%    | 64.76 ±1.1    |
| PHR04          | 65.63%    | 62.43%    | 63.99 ±1.0    |
| Hi04           | 64.17%    | 57.52%    | 60.66 ±1.1    |
| VCDHT04        | 67.12%    | 54.46%    | 60.13 ±1.0    |
| Ko04           | 56.86%    | 49.95%    | 53.18 ±0.9    |
| BEPP04         | 65.73%    | 42.60%    | 51.70 ±1.0    |
| WDM04          | 58.08%    | 34.75%    | 43.48 ±1.1    |
+----------------+-----------+-----------+---------------+
| baseline       | 54.60%    | 31.39%    | 39.87 ±1.0    |
+----------------+-----------+-----------+---------------+
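
The ± intervals in the tables above come from bootstrap resampling. This page does not spell out the exact procedure, so the following is only a generic percentile-bootstrap sketch under our own assumptions: sentences are the resampling unit, each sentence contributes a (correct, predicted, gold) triple of argument counts, and the interval is read from the 2.5th and 97.5th percentiles. All names and the toy counts are hypothetical and do not come from the official scorer or data.

```python
import random


def f1_from_counts(correct: int, predicted: int, gold: int) -> float:
    """F_1 in percent from aggregate argument counts."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 100 * 2 * precision * recall / (precision + recall)


def bootstrap_f1_interval(sentence_counts, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F_1, resampling whole sentences."""
    rng = random.Random(seed)
    n = len(sentence_counts)
    scores = []
    for _ in range(n_resamples):
        # Draw n sentences with replacement and rescore the resampled set.
        sample = [sentence_counts[rng.randrange(n)] for _ in range(n)]
        correct = sum(c for c, _, _ in sample)
        predicted = sum(p for _, p, _ in sample)
        gold = sum(g for _, _, g in sample)
        scores.append(f1_from_counts(correct, predicted, gold))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical per-sentence counts: (correct, predicted, gold).
toy_counts = [(2, 3, 3), (1, 2, 4), (3, 3, 3), (0, 1, 2), (4, 5, 5)] * 40
point = f1_from_counts(
    sum(c for c, _, _ in toy_counts),
    sum(p for _, p, _ in toy_counts),
    sum(g for _, _, g in toy_counts),
)
low, high = bootstrap_f1_interval(toy_counts)
print(f"F_1 = {point:.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

Under this reading, two systems would be treated as significantly different when one system's F_1 falls outside the other's interval, which matches the interpretation given before the tables.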


Post-conference contributions: If you have a system developed in the setting of the CoNLL-2004 Shared Task, and a description of it in an official document (technical report or published paper), please contact us by sending an email to srlconll <at> lsi.upc.edu. We'll be glad to add an entry for your system in the results table.



Task Description Paper



System Description Papers, Outputs and Slides


Datasets, Software and Resources

You can download the distribution package (released in March 2004) containing data, official resources, evaluation software, and a baseline system. See the README file for complete information on its contents.


Last Update: January 28, 2005. Xavier Carreras, Lluís Màrquez.