Overview of the Second NTCIR Workshop
National Institute of Informatics (NII), Japan
1.1 Purpose
1.2 Brief History
1.3 Focus of the NTCIR Workshop
Traditional IR Testing
Challenging Issues
1.4 Evaluation Workshops
Benefits of Evaluation Workshops
pooling
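In pooling, the top-ranked documents from every submitted run are merged into one deduplicated set, and human assessors judge only that pool rather than the whole collection. A minimal sketch of the idea; the function name, run format, and pool depth are illustrative, not taken from the workshop guidelines:

```python
def build_pool(runs, depth=100):
    """Merge the top-`depth` documents of every submitted run
    into a single deduplicated pool for human assessment."""
    pool = set()
    for ranked_docs in runs:  # each run: list of doc IDs, best first
        pool.update(ranked_docs[:depth])
    return sorted(pool)

# Two toy runs; the pool is the union of their top-2 documents.
runs = [["d1", "d2", "d3"], ["d2", "d4", "d1"]]
print(build_pool(runs, depth=2))  # ['d1', 'd2', 'd4']
```

The pool depth trades assessment cost against completeness of the relevance judgments, which is why pooling quality is a recurring topic in evaluation workshops.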
2.1 Tasks
Chinese Text Retrieval Tasks (CHTR):
Japanese-English IR Tasks (JEIR):
Text Summarization Challenge (TSC):
2.2 Participants
Participants of the Second NTCIR Workshop
Yokohama National Univ. (Japan), Waseda Univ. (Japan)
Table 1. Number of Participating Groups
subtask
Table 2. Attributes of Participating Groups
Table 3. Distribution of Participating Groups
             CHTR        JEIR        TSC
             enrl  sub   enrl  sub   enrl  sub
Canada         1    0      1    0      0    0
China          2    2      0    0      0    0
Hong Kong      2    1      0    0      0    0
Japan          3    3     21   18     12    9
Korea          0    0      1    1      0    0
Taiwan         2    2      2    2      1    0
UK             1    0      2    1      0    0
USA            5    3      4    3      2    0
total         16   11     31   25     15    9
(enrl = enrolled; sub = submitted results)
Comparison with the First NTCIR Workshop
Fig. 1 Number of Participants of Each Task

Task                ntcir-ws1   ntcir-ws2
AdHoc-JEIR/mono         18          17
CLIR-JEIR/CLIR          10          14
CHTR                     0          11
TermExtraction           9           0
TSC                      0           9
2.3 Procedures and Evaluation
1 June 2000:
10 August 2000:
30 August 2000:
8 September 2000:
18 September 2000:
20 October 2000:
27 November - 1 December 2000:
28 December 2000:
10 January 2001:
7-9 March 2001:
CHTR and JEIR:
TSC:
3. Test Collections
CIRB010;
NTCIR-1;
NTCIR-2;
NTCIR-2 Summ
<REC>
<ACCN>gakkai-0000011144</ACCN>
<TITL TYPE="kanji">[Japanese text]</TITL>
<TITE TYPE="alpha">Electronic manuscripts, electronic publishing, and electronic library</TITE>
<AUPK TYPE="kanji">[Japanese text]</AUPK>
<AUPE TYPE="alpha">Negishi, Masamitsu</AUPE>
<CONF TYPE="kanji">[Japanese text]</CONF>
<CNFE TYPE="alpha">The Special Interest Group Notes of IPSJ</CNFE>
<CNFD>1991. 11. 19</CNFD>
<ABST TYPE="kanji"><ABST.P>[Japanese text]</ABST.P></ABST>
<ABSE TYPE="alpha"><ABSE.P>Current situation on electronic processing in preparation, editing, printing, and distribution of documents is summarized and its future trend is discussed, with focus on the concept "Electronic publishing". Movements in the country concerning an international standard for electronic publishing: Standard Generalized Markup Language (SGML) is assumed to be important, and the results from an experiment at NACSIS to publish an "SGML Experimental Journal" and to make its full-text CD-ROM version are reported. Various forms of "Electronic Library" are also investigated. The author puts emphasis on standardization, as technological problems for those social systems based on the cultural settings of publication of the country are the problems of acceptance and penetration of the technology in the society.</ABSE.P></ABSE>
<KYWD TYPE="kanji">[Japanese text]</KYWD>
<KYWE TYPE="alpha">Electronic publishing // Electronic library // Electronic manuscripts // SGML // NACSIS // Full text databases</KYWE>
<SOCN TYPE="kanji">[Japanese text]</SOCN>
<SOCE TYPE="alpha">Information Processing Society of Japan</SOCE>
</REC>
Fig. 2 A Sample of a Document Record
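Records like the one in Fig. 2 are tagged in an SGML-like format, so individual fields can be pulled out with a small amount of standard-library code. This is a hedged sketch, not an official NTCIR tool; it assumes the four-letter field tags shown in Fig. 2 and an abbreviated record string:

```python
import re

def parse_record(text):
    """Extract tag -> content pairs from one <REC> entry.
    Matches four-letter field tags (ACCN, TITE, AUPE, ...);
    the outer three-letter <REC> wrapper is skipped."""
    fields = {}
    for tag, body in re.findall(r"<([A-Z]{4})[^>]*>(.*?)</\1>", text, re.S):
        fields[tag] = body.strip()
    return fields

# Abbreviated record based on Fig. 2.
rec = ('<REC><ACCN>gakkai-0000011144</ACCN>'
       '<AUPE TYPE="alpha">Negishi, Masamitsu</AUPE></REC>')
print(parse_record(rec)["AUPE"])  # Negishi, Masamitsu
```

A production parser would also handle nested tags such as ABST.P inside ABST; the regex above simply keeps the nested markup inside the parent field's body.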
3.1 Documents
3.2 Topics
<TOPIC q=0005>
<TITLE>[Japanese text]</TITLE>
<DESCRIPTION>[Japanese text]</DESCRIPTION>
<NARRATIVE>[Japanese text]</NARRATIVE>
<CONCEPT>[Japanese text]</CONCEPT>
<FIELD>[Japanese text]</FIELD>
</TOPIC>
Fig. 3 A Sample Topic
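The topic fields (TITLE, DESCRIPTION, NARRATIVE, CONCEPT) are combined into the different "query types" such as T, D, or TD. A minimal sketch of such automatic query construction; the concatenation strategy and the sample field values are illustrative assumptions, not the systems' actual preprocessing:

```python
def build_query(topic, fields):
    """Concatenate the requested topic fields into one query string.
    T=TITLE, D=DESCRIPTION, N=NARRATIVE, C=CONCEPT."""
    names = {"T": "TITLE", "D": "DESCRIPTION",
             "N": "NARRATIVE", "C": "CONCEPT"}
    return " ".join(topic[names[f]] for f in fields)

# Hypothetical topic values for illustration only.
topic = {"TITLE": "electronic publishing",
         "DESCRIPTION": "trends in SGML",
         "NARRATIVE": "relevant documents discuss SGML standards",
         "CONCEPT": "SGML; CD-ROM"}
print(build_query(topic, "TD"))  # electronic publishing trends in SGML
```

Real runs would additionally segment, stem, or weight the fields; the point here is only how the query-type labels map to topic fields.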
3.3 Relevance Judgments (Right Answers)
Multi-grade Judgments
Level 1 / Level 2
Judgments by Different Users
rigid relevance (Level 1); relaxed relevance (Level 2)
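With multi-grade judgments, the same run can be scored at either threshold. A sketch assuming a numeric grade scale (2 = relevant, 1 = partially relevant, 0 = irrelevant); the scale and cut-off values are illustrative, not the collection's official coding:

```python
def precision_at_k(ranking, grades, k, threshold):
    """Precision@k, counting a document as relevant when its
    graded judgment meets or exceeds the threshold."""
    hits = sum(1 for doc in ranking[:k] if grades.get(doc, 0) >= threshold)
    return hits / k

grades = {"d1": 2, "d2": 1, "d3": 0}       # toy judgments
ranking = ["d1", "d2", "d3", "d4"]
print(precision_at_k(ranking, grades, 4, threshold=2))  # rigid: 0.25
print(precision_at_k(ranking, grades, 4, threshold=1))  # relaxed: 0.5
```

Rigid scoring accepts only the top grade, relaxed scoring also counts partially relevant documents, so relaxed precision is always at least as high as rigid precision for the same run.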
Additional Information
Rank-Degree Sensitive Evaluation Metric on Multi-grade Relevance Judgments
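One family of metrics sensitive to both rank and degree of relevance is discounted cumulative gain, in the spirit of [23]. The gain values and logarithmic discount below are the common textbook choice, given here only as a sketch and not necessarily the exact metric adopted for NTCIR:

```python
import math

def dcg(ranking, grades, k):
    """Discounted cumulative gain over the top-k results:
    higher-graded documents contribute more gain, and
    later ranks are discounted by log2(rank + 1)."""
    return sum(grades.get(doc, 0) / math.log2(i + 2)
               for i, doc in enumerate(ranking[:k]))

grades = {"d1": 2, "d2": 1}  # toy multi-grade judgments
print(dcg(["d1", "d2", "d3"], grades, 3))
```

Swapping the two documents in the ranking lowers the score, which is exactly the rank-degree sensitivity a binary precision measure cannot express.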
3.4 Linguistic Analysis
3.5 Robustness of the System Evaluation using the Test Collections
3.6 Differences between CHTR and JEIR
Table 4. "Query Types" in CHTR and JEIR
workshops               NTCIR WS1              NTCIR WS2
tasks                   Ad Hoc & CLIR (JE)     Chinese Text        Japanese &
                                               Retrieval           English IR
topic field(s) used*:
Very Short              T                      TI (Title),         T
                                               VS (Very Short)
Short without Concept   D, TD                  SO (Short)          D only, T+D
Short with Concept      C or DC or TDC, TC     -                   C without N
Long without Concept    N or DN or TN or TDN   -                   N without C
Long with Concept       NC or DNC or TNC       LO (Long)           N+C
                        or TDNC

*: mandatory for automatic query construction, where T=TITLE, D=DESCRIPTION (QUESTION in CHTR), N=NARRATIVE, C=CONCEPT

4.1 Round Tables at the Workshop Meeting

4.2 Future Directions
ACKNOWLEDGMENTS
REFERENCES
[1] NTCIR Project: http://research.nii.ac.jp/ntcir/
[2] NTCIR Workshop 1: Proceedings of the First NTCIR
Workshop on Research in Japanese Text Retrieval
and Term Recognition, 30 Aug.-1 Sept. 1999, Tokyo,
ISBN 4-924600-77-6.
(http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings/)
[3] IREX URL: http://cs.nyu.edu/cs/projects/proteus/irex/
[4] Sung, H.M. "HANTEC Collection". Presented at the
panel on IR Evaluation in the 4th IRAL, Hong Kong,
30 Sept.-3 Oct. 2000.
[5] Kando, N.: Cross-Linguistic Scholarly Information
Transfer and Database Services in Japan. Annual
Meeting of the ASIS, Washington DC. Nov. 1, 1997
[6] TREC URL: http://trec.nist.gov/
[7] Smeaton, A., Harman, D. K. "The TREC experiments
and their impact on Europe", Journal of Information
Science, Vol.23, No.2, pp.169-174, 1997.
[8] Sparck Jones, K., Rijsbergen, C.J. "Information
retrieval test collections", Journal of Documentation,
Vol.32, No.1, pp.59-72, 1975.
[9] Panel of IR Evaluation of the World. RIAO 2000,
Paris, France, April 2000.
[10] TDT URL: http://www.nist.gov/speech (select
"benchmark tests", then "TDT").
[11] CLEF URL: http://www.iei.pi.cnr.it/DELOS/CLEF/
[12] http://www.itl.nist.gov/iaui/894.02/related_projects/
tipster_summac/
[13] Spink, A., Bateman, J. From highly relevant to not
relevant: Examining different regions of relevance.
Information Processing and Management, Vol.34,
No.5, pp.599-622, 1998
[14] Dunlop, M.D. Reflections on Mira. Journal of the
American Society for Information Science, Vol.51,
No.14, pp.1269-1274, 2000.
[15] Spink, A., Greisdorf, H. Regions and levels:
Measuring and mapping users' relevance judgments.
Journal of the American Society for Information
Science, Vol.52, No.2, pp.161-173, 2001.
[16] Campbell, I., Interactive evaluation of the ostensive
model using a new test collection of images with
multiple relevance assessments, Information
Retrieval, Vol.2, No.1, pp.87-114, 2000.
[17] Reid, J. A task-oriented non-interactive evaluation
methodology for information retrieval systems,
Information Retrieval, Vol.2, No.1, pp.115-129, 2000.
[18] Chiang, Yu-ting: A Study on Design and
Implementation for Chinese Information Retrieval
Benchmark. Master Thesis, National Taiwan
University, 1999, 184 p.
[19] Shaw, W.M., Jr, et al.: The cystic fibrosis database:
Content and research opportunities. Library and
Information Science Research, Vol.13, pp.347-366,
1991.
[20] Hersh, W., Buckley, C., Leone, T.J., Hickam, D.H.:
OHSUMED: an Interactive Retrieval Evaluation and
New Large Test Collection for Research. In
Proceedings of 17th Annual International ACM-
SIGIR Conference on Research and Development in
Information Retrieval. p.192-201, Dublin, Ireland,
1994.
[21] Losee, R.M.: Text retrieval and filtering: analytic
models of performance. Kluwer, 1998
[22] Borlund, P., Ingwersen, P.: Measures of relative
relevance and ranked half-life: Performance
indicators for interactive IR. In Proceedings of 21st
Annual International ACM-SIGIR Conference on
Research and Development in Information Retrieval.
p.324-331, Melbourne, Australia, August. 1998.
[23] Järvelin, K., Kekäläinen, J.: IR evaluation methods
for retrieving highly relevant documents. In
Proceedings of 23rd Annual International ACM-
SIGIR Conference on Research and Development in
Information Retrieval. p. 41-48, Philadelphia, PA,
USA, July 2000.
[24] Yoshioka, M., Kuriyama, K., Kando, N.: Analysis
on the Usage of Japanese Segmented Texts in the
NTCIR Workshop 2. In NTCIR Workshop 2 :
Proceedings of the Second NTCIR Workshop on
Research in Chinese & Japanese Text Retrieval and
Text Summarization, Tokyo, June 2000 - March 2001
(ISBN 4-924600-96-2) (to appear)
[25] Kando, N., Kuriyama, K., Yoshioka, M. Evaluation
based on multi-grade relevance judgements. IPSJ SIG
Notes, July 2001 (to appear)
[26] Kando, N, Nozue, T., Kuriyama, K., Oyama, K.:
NTCIR-1: Its Policy and Practice, IPSJ SIG Notes,
Vol.99, No.20, pp. 33-40, 1999 [in Japanese].
[27] Kuriyama, K., Nozue, T., Kando, N., Oyama, K.:
Pooling for a Large Scale Test Collection: Analysis
of the Search Results for the Pre-test of the NTCIR-1
Workshop, IPSJ SIG Notes, Vol.99-FI-54, pp.25-32
May, 1999 [in Japanese].
[28] Kuriyama, K., Kando, N.: Construction of a Large
Scale Test Collection: Analysis of the Training
Topics of the NTCIR-1, IPSJ SIG Notes, Vol.99-FI-
55, pp.41-48, July 1999 [in Japanese].
[29] Kando, N., Eguchi, K., Kuriyama, K.: Construction
of a Large Scale Test Collection: Analysis of the Test
Topics of the NTCIR-1, In Proceedings of IPSJ
Annual Meeting [in Japanese]. pp.3-107 -- 3-108, 30
Sept -3 Oct. 1999.
[30] Kuriyama, K., Yoshioka, M., Kando, N.: Effect of
Cross-Lingual Pooling. In NTCIR Workshop 2 :
Proceedings of the Second NTCIR Workshop on
Research in Chinese & Japanese Text Retrieval and
Text Summarization, Tokyo, June 2000 - March 2001
(ISBN 4-924600-96-2) (to appear)
[31] Voorhees, E.M.: Variations in Relevance Judgments
and the Measurement of Retrieval Effectiveness, In
Proceedings of 21st Annual International ACM-
SIGIR Conference on Research and Development in
Information Retrieval. pp. 315-323, Melbourne,
Australia, August. 1998