Benchmarking in KW. Sep 10th, 2004 © R. García-Castro, A. Gómez-Pérez
Benchmarking in Knowledge Web

Raúl García-Castro, Asunción Gómez-Pérez <rgarcia,[email protected]>
Jérôme Euzenat <[email protected]>

September 10th, 2004
Industrial Benchmarking ≠ Research Benchmarking
WP 1.2 (from T.A. page 26) / WP 2.1 (from T.A. page 41)

Point of view:
• Industrial: tool recommendation
• Research: research progress

Criteria:
• Industrial: utility
• Research: scalability, robustness, interoperability

Tools:
• Industrial: ontology development tools; annotation tools; querying and reasoning services of ontology development tools; merging and alignment tools
• Research: ontology development tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Service technology
Index
• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web
Benchmarking activities in KW

Overview of the benchmarking activities:
• Progress
• What to expect from them
• What are their relationships/dependencies
• What could be shared/reused between them
Benchmarking timeline (months 0–48)

Progress: finished / started / not started

WP 1.2 (Roberta Cuel):
• D1.2.1: Utility of ontology development tools
• Utility of merging, alignment, annotation
• Performance of querying, reasoning

WP 1.3 (Luigi Lancieri):
• D1.3.1: Best practices and guidelines for industry
• Best practices and guidelines for business cases

WP 2.1 (Raúl García):
• D2.1.1: Benchmarking SoA
• D2.1.4: Benchmarking methodology, criteria, test suites
• D2.1.6: Benchmarking building tools
• Benchmarking querying, reasoning, annotation
• Benchmarking web service technology

WP 2.2 (Jérôme Euzenat):
• D2.2.2: Benchmarking methodology for alignment
• D2.2.4: Benchmarking alignment results
Benchmarking relationships (months 6–24)

• T 2.1.1 SoA on the technology of the scalability WP → benchmarking overview; SoA of ontology technology evaluation
• T 2.1.4 Definition of a methodology, general criteria for benchmarking → benchmarking methodology; benchmark suites
• T 2.1.6 Benchmarking of ontology building tools
• T 2.2.2 Design of a benchmark suite for alignment → benchmarking methodology for alignment; benchmark suite for alignment
• T 2.2.4 Research on alignment techniques and implementations
• T 1.2.1 Utility of ontology-based tools
• T 1.3.1 Best Practices and Guidelines → best practices
Index
• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web
Benchmarking in WP 2.1 (months 0–48)

T 2.1.1 State of the Art:
• Overview of benchmarking, experimentation, and measurement
• SoA of ontology technology evaluation

T 2.1.4 Definition of a methodology, general criteria for ontology tools benchmarking:
• Benchmarking methodology
• Types of tools to be benchmarked: ontology building tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Services technology
• General evaluation criteria: interoperability, scalability, robustness
• Test suites for each type of tool
• Benchmarking supporting tools

T 2.1.6 Benchmarking of ontology building tools:
• Specific evaluation criteria: interoperability, scalability, robustness
• Test suites for ontology building tools
• Benchmarking supporting tools

T 2.1.x Benchmarking of querying, reasoning, annotation, web service technology
T 2.1.1: Benchmarking Ontology Technology, in D 2.1.1 Survey of Scalability Techniques for Reasoning with Ontologies

• Overview of benchmarking, experimentation, and measurement
• State of the Art of Ontology-based Technology Evaluation

[Figure: ontology technology/methods are subject to evaluation, measurement, and experimentation; benchmarking builds on these and yields desired attributes, weaknesses, comparative analyses, recommendations, best practices, and continuous improvement.]
T 2.1.4: Benchmarking methodology, criteria, and test suites

Methodology:
Plan:
 1. Goals identification
 2. Subject identification
 3. Management involvement
 4. Participant identification
 5. Planning and resource allocation
 6. Partner selection
Experiment:
 7. Experiment definition
 8. Experiment execution
 9. Experiment results analysis
Improve:
 10. Report writing
 11. Findings communication
 12. Findings implementation
 13. Recalibration

General evaluation criteria:
• Interoperability
• Scalability
• Robustness

Benchmark suites for:
• Ontology building tools
• Annotation tools
• Querying and reasoning services
• Semantic Web Services technology

Benchmarking supporting tools:
• Workload generators
• Test generators
• Statistical packages
• ...
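Among the supporting tools, a workload or test generator for the scalability criterion can be as simple as emitting synthetic ontologies of growing size. A hedged sketch, not a Knowledge Web deliverable; the chain-shaped class hierarchy and the triple representation are illustrative assumptions:

```python
def make_class_hierarchy(n_classes):
    """Generate a synthetic ontology as (subject, predicate, object)
    triples: a chain of n_classes classes, each a subclass of the
    previous one. Illustrative workload for import/reasoning benchmarks;
    the "ex:" prefix is a made-up namespace."""
    triples = []
    for i in range(n_classes):
        triples.append((f"ex:C{i}", "rdf:type", "rdfs:Class"))
        if i > 0:
            triples.append((f"ex:C{i}", "rdfs:subClassOf", f"ex:C{i-1}"))
    return triples

# A scalability suite would sweep the workload size: 10, 100, 1000 classes.
workloads = {n: make_class_hierarchy(n) for n in (10, 100, 1000)}
```

Real generators would vary the shape (depth vs. breadth, properties, instances) as well as the size, since tools often degrade on one dimension but not another.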
T 2.1.6: Benchmarking of ontology building tools

Partners/Tools: UPM, ...

Benchmark suites:
• Interoperability (x tests)
• Scalability (y tests)
• Robustness (z tests)

Interoperability benchmark suites:
• RDF(S) import capability
• OWL import capability
• RDF(S) export capability
• OWL export capability

Experiments:
• Import/export RDF(S) ontologies
• Import/export OWL ontologies
• Check for knowledge loss
• ...

Experiment results: per-test outcomes (test 1, test 2, test 3, ...: OK / not OK)

Benchmarking results:
• Comparative
• Weaknesses
• (Best) practices
• Recommendations

Interoperability questions:
• Do the tools import/export from/to RDF(S)/OWL?
• Are the imported/exported ontologies the same?
• Is there any knowledge loss during import/export?
• ...
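The interoperability questions boil down to a round-trip test: export an ontology, re-import it, and compare the result with the original. A minimal, tool-independent sketch; the triple-set representation and the toy N-Triples-like serialization are assumptions for illustration, not any actual tool's API:

```python
# Hedged sketch: an ontology modeled as a set of (subject, predicate, object)
# triples; export/import stand in for a tool's RDF(S)/OWL serializers.
def export_ntriples(triples):
    """Serialize triples to a simple N-Triples-like text (toy format)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))

def import_ntriples(text):
    """Parse the toy format back into a set of triples."""
    triples = set()
    for line in text.splitlines():
        parts = line.strip().rstrip(" .").split("> <")
        s, p, o = (part.strip("<>") for part in parts)
        triples.add((s, p, o))
    return triples

def knowledge_loss(original, reimported):
    """Triples present before export but missing after re-import."""
    return original - reimported

ontology = {("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
            ("ex:Animal", "rdf:type", "rdfs:Class")}
round_tripped = import_ntriples(export_ntriples(ontology))
assert knowledge_loss(ontology, round_tripped) == set()  # no loss here
```

Against a real tool the comparison is harder than set difference, since blank-node renaming and equivalent syntactic forms must be tolerated; the sketch only shows the shape of the check.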
Index
• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web
T 2.2.2 Design of a benchmark suite for alignment

Why evaluate?
• Comparing the possible solutions;
• Detecting the best methods;
• Finding out our weak points.

Goals:
• For the developer: improving the solutions;
• For the user: choosing the best tools;
• For both: testing compliance with a norm.

How to evaluate?
• Take a real-life case and set the deadline;
• Take several cases, normalizing them;
• Take simple cases, identifying what they highlight (benchmark suite);
• Build a challenge (MUC, TREC).

Results:
• Benchmarking methodology for alignment techniques;
• Benchmark suite for alignment;
• First evaluation campaign;
• Greater benchmarking effort.
T 2.2.2 What has been done?

Information Interpretation and Integration Conference (I3CON), held at the NIST Performance Metrics for Intelligent Systems (PerMIS) Workshop: focuses on "real-life" test cases and compares the algorithms' global performance.

Facts:
• 7 ontology pairs;
• 5 participants;
• Undisclosed target alignments (independently made);
• Alignments requested in a normalized format;
• Evaluation on the F-measure.

Results:
• Difficult to find pairs in the wild (they had to be created);
• No dominating algorithm, no single most difficult case for all;
• 5 participants was the targeted number; we must have more next time!
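The evaluation on the F-measure is standard precision/recall arithmetic over sets of correspondences; a sketch (the pair representation and the example entity names are illustrative assumptions, not the contest's actual format):

```python
def precision_recall_f(found, reference):
    """Precision, recall, and F-measure (their harmonic mean) of a
    proposed alignment against a reference alignment, both given as
    sets of (entity1, entity2) correspondences."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Toy example with made-up entity names:
reference = {("o1#Person", "o2#Human"), ("o1#writes", "o2#authors"),
             ("o1#Paper", "o2#Article")}
found = {("o1#Person", "o2#Human"), ("o1#Paper", "o2#Document")}
p, r, f = precision_recall_f(found, reference)
# precision 1/2, recall 1/3, F-measure 2/5
```

Because the I3CON target alignments were undisclosed, participants submitted alignments in a normalized format and the organizers computed these scores against the hidden reference.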
The Ontology Alignment Contest at the 3rd Evaluation of Ontology-based Tools (EON) Workshop, to be held at the International Semantic Web Conference (ISWC): aims at defining a proper set of benchmark tests for assessing feature-related behavior.

Facts:
• 1 ontology and 20 variations (15 hand-crafted on some particular aspects);
• Target alignment (made on purpose) published;
• Participants asked for a paper, with comments on the tests and on the achieved results (as well as the results in normalized format).

Results: we are currently benchmarking the tools!

See you at the EON Workshop, ISWC 2004, Hiroshima, JP, November …
T 2.2.2 What’s next?
• More consensus on what’s to be done?
• Learn more
• Take advantage of the remarks
• Make a more complete setup: real-world cases + benchmark suite + challenge?
• Provide automated procedures
Index
• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web
Benchmarking information repository

Web pages inside the Knowledge Web portal with:
• General benchmarking information (methodology, criteria, test suites, references, ...)
• Information about the different benchmarking activities in Knowledge Web
• Benchmarking results and lessons learned
• ...

Objectives:
• Inform
• Coordinate
• Share/reuse
• ...

Proposal for a benchmarking working group in the SDK cluster.
Index
• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web
What is benchmarking in Knowledge Web?

In Knowledge Web:
• Benchmarking is performed over products/methods (not processes)
• Benchmarking is not a continuous process: it ends with findings communication; there is no findings implementation or recalibration
• Benchmarking technology involves evaluating technology
• Benchmarking technology is NOT just evaluating technology: we must extract practices and best practices
• Benchmarking results: comparative analyses, weaknesses, (best) practices, recommendations, (continuous) improvement
• Benchmarking results are needed, both in industry and in research!
• ...
How much do we share?

Benchmarking methodology, criteria, and test suites:
• Is the view of benchmarking from industry “similar” to the view from research?
• Is it viable to have a common methodology? Will anyone use it?
• Can the test suites be reused between industry and research?
• Would a common way of presenting test suites be useful?
• ...

Benchmarking results:
• Can research benchmarking results be (re)used by industry, and vice versa?
• Would a common way of presenting results be useful?
• ...
Next steps

Provide the benchmarking methodology to industry:
• First draft after the Manchester Research meeting: 1st October.
• Feedback from WP 1.2: end of October.
• (Almost) final version by mid-November.

Set up web pages with benchmarking information in the portal:
• Benchmarking activities
• Methodology
• Criteria
• Test suites

Discuss in a mailing list and agree on a definition of “best practice”.

Next meeting? To be decided (around November) (with O2I).
Benchmarking in Knowledge Web

Raúl García-Castro, Asunción Gómez-Pérez <rgarcia,[email protected]>
Jérôme Euzenat <[email protected]>

September 10th, 2004