PÁGICO: Evaluating Wikipedia-based information retrieval in Portuguese
SIGA: a System to Manage Information Retrieval Evaluations
LUÍS COSTA, CLÁUDIA FREITAS, CRISTINA MOTA, DIANA SANTOS AND ALBERTO SIMÕES
http://www.linguateca.pt
ACKNOWLEDGMENTS
Linguateca has throughout the years been jointly funded by the Portuguese Government, the European Union (FEDER and FSE), UMIC, FCCN and FCT. Págico was also supported by the Universities of Oslo, PUC-Rio, Coimbra and FCT grant SFRH/BPD/73011/2010.
REFERENCES
Diana Santos, Nuno Cardoso, Paula Carvalho, Iustin Dornescu, Sven Hartrumpf, Johannes Leveling, and Yvonne Skalban. 2009. GikiP at GeoCLEF 2008: Joining GIR and QA forces for querying Wikipedia. In C. Peters, T. Deselaers, N. Ferro, J. Gonzalo, G. J. F. Jones, M. Kurimo, T. Mandl, A. Peñas, and V. Petras, eds., Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers, Springer, pp. 894-905.
Diana Santos and Luís Miguel Cabral. 2010. GikiCLEF: Expectations and lessons learned. In Carol Peters, Giorgio Di Nunzio, Mikko Kurimo, Thomas Mandl, Djamel Mostefa, Anselmo Peñas, and Giovanna Roda, eds., Multilingual Information Access Evaluation, Vol. I, Springer, pp. 212-222.
Diana Santos, Cristina Mota, Cláudia Freitas, and Luís Costa. 2012. Linguamática 4, number 1, special volume about Págico.
MOTIVATION AND TASK
Is it possible to develop better systems to answer realistic user needs, searching for answers to a particular topic in Wikipedia? Is Wikipedia in Portuguese good enough to provide information on lusophone topics? Can we learn from watching people trying to answer them? Is competition or cooperation between human and automatic participants worth indulging in?
PT WIKIPEDIA IN 150 TOPICS
Information needs related to Portuguese-speaking countries and their history, with enough coverage in Wikipedia, and not easily browsable through simple categories or infoboxes, spanning areas from History (50) through Geography (26) and Music (19) to Mathematics (1) and Geology (2). Assessing the answers and their justifications was often quite difficult.
PÁGICO COLLECTION
- based on the 25 April 2011 Wikipedia snapshot;
- converted to XHTML using:
  - mwlib for the markup conversion;
  - MediaWiki::DumpFile to control the snapshot parsing;
  - in-house tools to manage macro expansion.
Collection constitution:

Page type              Total docs
Template pages             32 900
Disambiguation pages        5 006
Redirection pages         574 077
Multimedia pages            9 678
Article pages             856 005
SIGA
- Topic creation
- System run submission and testing
- Human participation interface
- Assessment interface
- Conflict resolution
- Pool browsing
- Scoring
HUMANS VS. SYSTEMS
[Figure: scores per run, with pseudo-recall and precision on a 0.0 to 1.0 scale. Runs: ludIT_1, GLNISTT_1, João Miranda_1, Ângela Mota_1, RAPPORTAGICO_3, RAPPORTAGICO_2, RAPPORTAGICO_1, Bruno Nascimento_1, RENOIR_1, RENOIR_3, RENOIR_2, Average. Answer categories: Correct & Justified, Unjustified, Invalid document, Other. Groups: Both, Systems, Human participants.]
[Figure: Automatic Evaluation, on a 0.0 to 1.0 scale.]
[Figure: Final score per subject, Humans vs. Systems. Subjects: Letras (Letters), Artes (Arts), Geografia (Geography), Cultura (Culture), Política (Politics), Desporto (Sports), Ciência (Science), Economia (Economy).]
[Figure: Precision per subject, Humans vs. Systems, over the same eight subjects.]
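The most correct topics

 ID  Topic                                                                                                        Total  Hum  Sys  H & S
H
135  Aves de Angola (Birds of Angola)                                                                                54   10   44      0
 19  Tribos indígenas que vivem na Amazônia. (Indigenous tribes living in the Amazon.)                              115   56   35     24
 90  Filmes brasileiros premiados na categoria Montagem. (Brazilian films awarded in the Editing category.)          34    8   19      7
 13  Dinossauros carnívoros que habitaram o Brasil. (Carnivorous dinosaurs that inhabited Brazil.)                   23    6   12      5
S
 19  Tribos indígenas que vivem na Amazônia. (Indigenous tribes living in the Amazon.)                              115   56   35     24
 62  Praias de Portugal boas para a prática de surf (Beaches in Portugal good for surfing)                           30    5    6     19
  7  Guitarristas portugueses que também foram compositores. (Portuguese guitarists who were also composers.)        34   17    0     17
 11  Filmes sobre o cangaço. (Films about the cangaço.)                                                              41   20    4     17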
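EVAL MEASURES
Precision: $P_{p,c} = \frac{|C_{p,c}|}{|R_{p,c}|}$
Pseudo-recall: $\alpha_{p,c} = \frac{|C_{p,c}|}{|C_{Pagico}| + |C_{aval}|}$
Pseudo-F-measure: $\varphi_{p,c} = 2 \times \frac{P_{p,c} \times \alpha_{p,c}}{P_{p,c} + \alpha_{p,c}}$
Originality: $O_{p,c} = \sum_{i}^{T} \sum_{j}^{R_{p,c,i}} o(r_{p,c,i,j})$
Creativity: $K_{p,c} = \sum_{i}^{T} \sum_{j}^{R_{p,c,i}} k(r_{p,c,i,j})$
Final score: $M_{p,c} = |C_{p,c}| \times P_{p,c}$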
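As an illustration of how precision, pseudo-recall, pseudo-F-measure and final score relate, here is a minimal Python sketch. It is not SIGA's actual scoring code: the `RunCounts` class, the function names and the example counts are all hypothetical. The two terms of the pseudo-recall denominator are simply passed in as numbers.

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    """Hypothetical container for one run's answer counts."""
    correct: int   # |C_{p,c}|: correct answers in the run
    returned: int  # |R_{p,c}|: all answers returned by the run


def precision(run: RunCounts) -> float:
    # Correct answers divided by returned answers; 0.0 for an empty run.
    return run.correct / run.returned if run.returned else 0.0


def pseudo_recall(run: RunCounts, c_pagico: int, c_aval: int) -> float:
    # Correct answers divided by the sum of the two denominator terms.
    total = c_pagico + c_aval
    return run.correct / total if total else 0.0


def pseudo_f(p: float, alpha: float) -> float:
    # Harmonic mean of precision and pseudo-recall.
    return 2 * p * alpha / (p + alpha) if (p + alpha) else 0.0


def final_score(run: RunCounts) -> float:
    # Number of correct answers multiplied by precision.
    return run.correct * precision(run)


# Made-up counts, for illustration only.
run = RunCounts(correct=20, returned=40)
p = precision(run)
alpha = pseudo_recall(run, c_pagico=30, c_aval=10)
print(p, alpha, pseudo_f(p, alpha), final_score(run))
```

Because the final score multiplies the number of correct answers by precision, a run that returns many wrong answers is penalised even when its count of correct answers is high.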
In addition to the measures used in GikiP and GikiCLEF, we chose to investigate originality and creativity, by weighting answers differently according to the number of participants who found them.
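The idea of weighting an answer by how many participants found it can be sketched as follows. The poster does not define the weighting functions, so the weight used here (1/n, where n is the number of runs that found the answer) is purely an assumption for illustration. The data layout and function names are hypothetical, and only correct answers are counted.

```python
from collections import Counter
from typing import Callable, Dict, Set

# Hypothetical layout: run name -> topic id -> set of correct answers.
Runs = Dict[str, Dict[int, Set[str]]]


def finder_counts(runs: Runs) -> Counter:
    """Count, for each (topic, answer) pair, how many runs found it."""
    counts: Counter = Counter()
    for topics in runs.values():
        for topic, answers in topics.items():
            for answer in answers:
                counts[(topic, answer)] += 1
    return counts


def weighted_score(
    run_name: str,
    runs: Runs,
    weight: Callable[[int], float] = lambda n: 1.0 / n,  # assumed weight, not from the poster
) -> float:
    """Sum the per-answer weights over all topics and answers of one run."""
    counts = finder_counts(runs)
    return sum(
        weight(counts[(topic, answer)])
        for topic, answers in runs[run_name].items()
        for answer in answers
    )


# Made-up data: run "A" found two answers, one of them shared with run "B".
runs = {"A": {1: {"x", "y"}}, "B": {1: {"x"}}}
print(weighted_score("A", runs), weighted_score("B", runs))
```

Under this assumed weight, an answer found by a single run contributes 1, and an answer found by many runs contributes proportionally less.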
USER BROWSER BEHAVIOUR
[Figure: twelve panels plotting time spent on topic (y-axis) against browsing order (x-axis), one per participant: ludit, angelamota, miranda, Px120, GLNISTT1, GLNISTT2, GLNISTT3, GLNISTT4, GLNISTT5, GLNISTT6, GLNISTT7, GLNISTT8.]
CARTOLA http://www.linguateca.pt/Cartola
Págico answers pool
[Figure: number of answers and justification documents; percentage of answers and justifications only in the PT Wikipedia.]