Structure Extraction competition @ ICDAR 2009
Alternative Evaluation Measure proposed by XRCE
Description
The measure was initially described in the following paper:
Hervé Déjean and Jean-Luc Meunier, "XRCE Participation to the Book Structure Task", in Advances in Focused Retrieval: 7th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2008, Dagstuhl Castle, Germany, December 15-18, p.124-131, 2008.
This measure has the following characteristics:
- Results available at the book level.
- A detailed display of the errors is possible.
- The quality of the links is evaluated independently of the title and levels.
- Title quality is also evaluated per book, using a similarity score based on the ICDAR-INEX weighted Levenshtein distance, averaged over the lowercased titles of the book.
- An "INEX-like" link measure is also produced (rather than matching the title first and then the page, it matches the page first and then the title).
This measure makes it possible to examine the results in detail independently of the titles, which can sometimes be quite debatable and often hide useful information.
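As a rough illustration of the per-book title-quality component, the sketch below scores lowercased title pairs with a plain (unweighted) Levenshtein distance and averages over the book. The official measure uses the ICDAR-INEX *weighted* Levenshtein distance, whose cost table is not given here, so all function names and the normalization choice are assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance via dynamic programming (unweighted costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def title_similarity(t1: str, t2: str) -> float:
    """Similarity in [0, 1]: 1 minus normalized edit distance on lowercased titles."""
    t1, t2 = t1.lower(), t2.lower()
    if not t1 and not t2:
        return 1.0
    return 1.0 - levenshtein(t1, t2) / max(len(t1), len(t2))

def book_title_quality(pairs):
    """Average title similarity over a book's matched (run, ground-truth) title pairs."""
    return sum(title_similarity(r, g) for r, g in pairs) / len(pairs)
```

Averaging per book, then reporting per-book values, is what allows the book-level display of results described above.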
Script
The corresponding Python implementation was also provided by XRCE; you may download it here as a zip archive.
Results
Here are the corresponding results for the 2009 submissions to the book structure extraction competition.
Please note that the F1 value is the average of the per-document F1 over all documents in the ground truth.
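The averaging noted above is a macro average: F1 is computed per document and the per-document values are then averaged, rather than deriving F1 from pooled precision and recall. A minimal sketch, using made-up counts rather than competition data:

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """Per-document F1 from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

# One (tp, fp, fn) triple per document in the ground truth (illustrative values).
docs = [(8, 2, 0), (3, 3, 3), (0, 1, 5)]

# Macro average: mean of the per-document F1 scores.
macro_f1 = sum(f1(*d) for d in docs) / len(docs)
```

This is why a single run can have a reported F1 that differs from the F1 one would compute directly from its overall precision and recall columns.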
XRCE Link-based Measure (all values in %; Title and Level accuracy are computed on valid links only)

| Run       | Links Prec. | Links Rec. | Links F1 | Title Acc. | Level Acc. |
| MDCS      | 65.9        | 70.3       | 66.4     | 86.7       | 75.2       |
| XRCE-run3 | 69.7        | 65.7       | 64.6     | 74.4       | 68.8       |
| XRCE-run2 | 69.2        | 64.8       | 63.8     | 74.4       | 69.1       |
| XRCE-run1 | 67.1        | 63.0       | 62.0     | 74.6       | 68.9       |
| Noopsis   | 46.4        | 38.0       | 39.9     | 71.9       | 68.5       |
| GREYC-1   | 59.7        | 34.2       | 38.0     | 42.1       | 73.2       |
| GREYC     | 6.7         | 0.7        | 1.2      | 13.9       | 31.4       |
(For the GREYC result, an unfortunate shift of +1 was observed on all page numbers; the corrected run is reported under the name "GREYC-1".)