“All databases are equal... …but some are more equal than others.” Stephen Adams, Magister Ltd., GB

© Magister Ltd 2004, 2005

Topics

• Where database creation goes wrong…
• Why bother to evaluate?
  – A word about ‘quality’
• Quality content
  – missing documents, document kinds and fields
• Quality context
  – search engines
• Conclusion

The basics of information retrieval

[Diagram: Documents → Document representation; Query → Query representation; the two representations meet at Matching, which yields Hits]

Adapted from Crestani, J.Inf.Sci. 29(2), 87-96 (2003)

Reference interview

[Information retrieval diagram, repeated]

“I’m sorry - I don’t understand the question…”

“Are you also interested in…?”

“How much do you already know about this?”

QUALITY RESULTS START WITH US.

Strategy development

[Information retrieval diagram, repeated]

“Where has that manual got to…!”

“When did they start using that field?”

“Is that field available for all records?”

Document quality - at source

[Information retrieval diagram, repeated]

“I leave the form-filling to the paralegals…”

“I’m sure my secretary never transposes application numbers - she can read my handwriting…”

“Our patent office uses that INID code differently…”

Full text, abstract, indexing...

[Information retrieval diagram, repeated]

“I get so much rubbish with full-text…”

“I don’t trust abstracts - especially for a freedom-to-operate search…”

“Their timeliness has improved - but indexing quality is down…”

“800,000 corrections per year”

Hitting the keyboard

[Information retrieval diagram, repeated]

“Where on earth did that false drop come from…?”

“We always use the free services - the results are OK so far”

“Why does this host always crash on a Friday?”

Major topics for today

[Information retrieval diagram, repeated]

Database content

Database context

Content and context

• The effectiveness of “a database” as a search tool is a function of (at least) two variables:
  – the data content
  – the search engine / command language.
• The ideal answer may be a compromise:
  – (‘average’ database & ‘good’ command language) or (‘good’ database & ‘poor’ search engine).

Why evaluate database content?

• Database evaluation is a basic part of information literacy:
  – “a set of abilities requiring individuals to recognise when information is needed and have the ability to locate, evaluate and use effectively the needed information.”
  – American Library Association 1989, Final Report of the ALA Presidential Committee on Information Literacy
• If we do not evaluate our sources, we cannot serve our customers fully.

The biggest database of all?

Isn’t that enough for anyone?

So why evaluate?

A simple evaluation parameter: language

[Chart: CyberAtlas distribution 2000, by language: English, Japanese, German, Chinese, French, Spanish, Russian, Other]

Source: CyberAtlas, www.clickz.com/stats/big_picture/demographics/article.php/5901_408521

OCLC figures for 2004 are comparable: 30-35% of the Internet is not in English.

Implication:

• The effectiveness of ‘the Internet’ as a retrieval tool will be skewed according to the nature of our search:
  – “Hermann Hesse” “Das Glasperlenspiel” = 13,600
    • of which “& domain=de” comprises 13,400
  – “Hermann Hesse” “The Glass Bead Game” = 12,500
    • of which “& domain=de” comprises 128
  – “Hermann Hesse” “Magister Ludi” = 5,100
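The skew can be put as a percentage. The short sketch below uses only the hit counts quoted on the slide; the function name `de_share` is an illustrative choice, not part of any search tool.

```python
# Hit counts quoted on the slide: (total hits, hits restricted to domain=de).
hits = {
    "Das Glasperlenspiel": (13_600, 13_400),
    "The Glass Bead Game": (12_500, 128),
}


def de_share(total: int, de_hits: int) -> float:
    """Return the percentage of hits that come from the .de domain."""
    return 100.0 * de_hits / total


for title, (total, de_hits) in hits.items():
    print(f"{title}: {de_share(total, de_hits):.1f}% from .de")
```

The German title draws almost all of its hits from the .de domain, the English title about one in a hundred: the same author and work, but a very different slice of ‘the Internet’.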

The third leg

• Good database evaluation should include not only the two factors identified above:
  – Database content, i.e. how well it is put together
  – Database context, i.e. the command language and search engine
• but also a third factor:
  – How well does this database fit my specific enquiry? (one-off need or recurring usage)
  – Note: if the evaluation process includes this factor, it follows that there is no such thing as the ‘ideal’ database for all enquiries

What is quality?

• “Fitness for purpose”
  – content
  – completeness
  – timeliness etc.
• It is difficult to be absolute; easier to evaluate as a relative quantity
  – benchmarking two sources against one another gives a better practical feel for ‘quality’ than attempts to measure against a mythical standard

Simple example of quality

• We wish to conduct a freedom-to-operate search in respect of Germany
  – one file contains DE-C2, DE-B4 documents
  – a second file contains DE-C2, DE-B4, DE-C1, DE-B3, DE-T2 and DE-U documents
• Which one would you choose?
  – Whichever your answer, it does not imply that the other is ‘poor quality’.
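One way to make the choice explicit is to compare the kind codes each file holds against the kinds a given enquiry needs. This is a minimal sketch: the two file contents come from the slide, while the `required` set is a hypothetical enquiry invented for illustration.

```python
# Kind codes held by each file, as listed on the slide.
file_one = {"DE-C2", "DE-B4"}
file_two = {"DE-C2", "DE-B4", "DE-C1", "DE-B3", "DE-T2", "DE-U"}


def missing_kinds(required: set[str], held: set[str]) -> set[str]:
    """Kind codes the enquiry needs that the file does not hold."""
    return required - held


# Hypothetical requirement for one particular enquiry (an assumption, not from the slide).
required = {"DE-C1", "DE-C2", "DE-B3", "DE-B4", "DE-U"}

print(sorted(missing_kinds(required, file_one)))
print(sorted(missing_kinds(required, file_two)))
```

A different `required` set could make the smaller file entirely adequate, which is the point of the slide: fitness depends on the enquiry.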

Measuring quality

• We can measure good content
  – essentially quantitative, binary
• We can measure good database structure/context
  – essentially qualitative, relative, subjective
    • e.g. are there explicit links between individual records (e.g. common indexing scheme)?
    • e.g. do the command language features or field standardisation facilitate virtual links?
    • e.g. what proportion of the time is the system up?

The coelacanth

Location: Greenland
Zone: polar
Habitat: fresh water
Size: 30 cm
Era: 200 m. years ago
Extinct for 50 m. years

Location: South Africa
Zone: sub-tropical
Habitat: salt water
Size: 1.75 metres
Era: 1938
Alive and breeding

Databases or datadumps?

• Science is not ephemeral - it is cumulative
  – Unless adequate consideration is given to the issue of retrieval at a distance of 10, 20 or 50 years after publication, then the resulting file is not a database at all - it is a datadump
• Much emphasis has been given in recent years to timeliness, i.e. adding new records
  – add in haste, repent at leisure?

Robert Maxwell:

Chairman of Pergamon Press

Owner of Pergamon Orbit-InfoLine

Owner of Mirror Group Newspapers

“All the science that’s fit to print”

• Publication or ‘laid open to public inspection’ without consideration of retrieval afterwards means that each record is left isolated from the context of the corpus of science
  – and will be missed in a proportion of the searches to which it is a relevant answer
  – or possibly never found again

Three ‘layers of incompleteness’

• Missing documents
• Missing kinds-of-documents
• Missing fields

Missing documents

• The classical measure of database quality:
  – Is
    • every document of the same kind,
    • published in that period
    • by that publishing authority
  – present in the file?
• Examples:
  – Latipat, USPTO.gov, Patent Abstracts of Japan

Missing documents

• Latipat
  – Newly launched esp@cenet portal, http://lp.espacenet.com
• USPTO.gov
  – Full-text of granted patents
• Patent Abstracts of Japan
  – JAPIO file

Latipat

[Chart: axis scale from 0 to 5000 in steps of 500]

Both Latipat and PlusPat (below) suffer from the same problem - missing records; lots of them!

USPTO.gov

Partial listing of missing patents:

4097518 - 4097928 (411)

4526401 - 4527286 (886)…

= 6,092 missing between 4,000,000 and 4,999,999 (0.6%)

STILL 224 missing between 6,000,000 and 6,101,209 (0.2%)
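Because patent numbers are issued sequentially, gaps of this kind can be found mechanically. The sketch below is one possible approach, not the method used to produce the figures above; `find_gaps` and the toy `present` list are illustrative, built so that the first gap quoted on the slide reappears.

```python
from typing import Iterable


def find_gaps(present: Iterable[int], first: int, last: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) runs of numbers in [first, last] absent from `present`."""
    gaps = []
    expected = first
    for number in sorted({n for n in present if first <= n <= last}):
        if number > expected:
            gaps.append((expected, number - 1))
        expected = number + 1
    if expected <= last:
        gaps.append((expected, last))
    return gaps


# Toy data: a short run of numbers with 4097518 - 4097928 deliberately left out.
present = list(range(4_097_500, 4_097_518)) + list(range(4_097_929, 4_097_950))

for start, end in find_gaps(present, 4_097_500, 4_097_949):
    print(f"{start} - {end} ({end - start + 1})")

# The slide's percentage for the 4,000,000 series: 6,092 missing out of 1,000,000 numbers.
print(f"{100 * 6_092 / 1_000_000:.1f}%")
```

In practice `present` would be the list of publication numbers actually retrievable from the file under test.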

PAJ

PAJ fact sheet from Questel-Orbit

What the publicity implies

[Diagram: coverage plotted by APPLICANTS, DATE and TECHNOLOGY; date marker 1976]

First limitation - by applicant

[Diagram: coverage plotted by APPLICANTS, DATE and TECHNOLOGY; date markers 1976 and 1989]

Backfile to 1989 now available - but has every host loaded it?

Prior to 1998, cases not claiming JP priority were not automatically included in PAJ

Second limitation - by technology

[Diagram: coverage plotted by APPLICANTS, DATE and TECHNOLOGY; date markers 1976 and 1989]

Prior to 1989, only 48 out of 118 IPC classes were covered completely (40%)

Complete IPC coverage from 1989 - but no plans to create back-file?

The (messy) truth

[Diagram: coverage plotted by APPLICANTS, DATE and TECHNOLOGY; date markers 1976 and 1989]

How to evaluate?

• “Missing documents” is one of the few parameters which can be measured independently of the database
  – Annual Reports of the office concerned
  – WIPO Industrial Property statistics
• Caution:
  – these may not refer to the appropriate document kinds; check before use.

Caution

• Determining database ‘completeness’ is only meaningful when measured against a quantitative parameter
  – e.g. publication number.
• It has little or no meaning when measured using more qualitative parameters
  – e.g. no. of hits found using the same strategy across several databases
    • the strategy will be sub-optimal for some databases and not for others

Simple source-by-source comparison

BIOSIS -v- Medline

BIOSIS Evolutions vol.9 no.6 © BIOSIS

Science Direct: Comprehensive - provided it’s from Elsevier...

Web of Science: Comprehensive - provided it’s got a high impact factor from ISI...

MDL: PCT and EP from 1976 ?

Take-home message

• There is nothing wrong with publicity
  – provided it is not confused with user documentation.
• Database producers still have a long way to go in informing users of the gaps in their databases
  – it should be much easier to locate this data than it is at present.

Missing Kinds-of-Documents

• Second measure of database quality
  – Is
    • every document of every appropriate kind,
    • published in that period
    • by that publishing authority
  – present in the file?
• Examples:
  – Overlapping year / country coverage
  – EP-A1, -A2, -A3, -A8, -A9
  – US-B1, -B2, -E, -C1, -C2

But they all cover Australia...

• Even given overlapping country and year coverage, different sources can cover different publication stages
• e.g. Australia
  – WPI: AU-A from 1963, AU-B from 1993
  – INPD: AU-A from 1973, AU-B from 1978
  – CAS: AU-B from 1927
• AU-A is included in CAPlus family, even though it will never be selected as CAS basic - see http://www.cas.org/EO/patkind.html
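The Australian figures lend themselves to a small lookup. This sketch records only the start years quoted on the slide and assumes coverage is continuous from each start year onward; it ignores the CAPlus family note, and the function name `sources_covering` is illustrative.

```python
# Coverage start years quoted on the slide: source -> {kind code: first year}.
coverage = {
    "WPI": {"AU-A": 1963, "AU-B": 1993},
    "INPD": {"AU-A": 1973, "AU-B": 1978},
    "CAS": {"AU-B": 1927},
}


def sources_covering(kind: str, year: int) -> list[str]:
    """Sources whose stated coverage of `kind` had started by `year`."""
    return sorted(
        source
        for source, kinds in coverage.items()
        if kind in kinds and kinds[kind] <= year
    )


print(sources_covering("AU-B", 1985))
print(sources_covering("AU-A", 1970))
```

All three sources "cover Australia", yet for an AU-B published in 1985 or an AU-A published in 1970 the list of usable sources differs.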

European correction documents

• ST.50 implemented from 1997
  – how many database producers take the data?
  – how many tell their users whether they take the data?
• Examples:
  – Questel-Orbit EPPATENT file
  – STN Europatfull file

Coverage of correction documents

1/1 EPPATENT - (C) Questel.Orbit- image
CPIM
PN - EP954211 A2 19991103 [EP-954211]
BPN - 1999-44
ET - Supporting apparatus
BRR - 2000-29 (Updated 2000-29)
DREX- 2001-01-18 Request for examination (Updated 2001-13)
DNEX- 2001-08-06 First examination report (Updated 2001-38)
DGR - 2003-07-23 Grant (Updated 2003-30)
BGR - 2003-30 (Updated 2003-30)
NGR - B1 (Updated 2003-30)

EPPATENT MAX format (edited) : all Bulletin announcements

Coverage of correction documents

L1 ANSWER 1 OF 1 EUROPATFULL COPYRIGHT 2004 WILA on STN

PATENT APPLICATION - PATENTANMELDUNG - DEMANDE DE BREVET
AN 954211 EUROPATFULL ED 19991114 EW 199944 FS OS
TIEN Supporting apparatus.
PIT EPA2 EUROPAEISCHE PATENTANMELDUNG

GRANTED PATENT - ERTEILTES PATENT - BREVET DELIVRE
AN 954211 EUROPATFULL UP 20030729 EW 200330 FS PS
TIEN Supporting apparatus.
PIT EPB1 EUROPAEISCHE PATENTSCHRIFT

Europatfull default format (edited) : no record of anything after EP-B1

INID (15) shows that this B8 is a correction to the B1 (grant).

How effective is this?

• The experienced information specialist is tempted to infer legal status information from the presence/absence of a particular publication stage (risky!)
  – e.g. EP-B = assumption of entry into force
• The inexperienced information specialist is not always given the correct links to lead to the right conclusion
  – e.g. US parent, re-issue, re-examination cases

Re-examination mentioned in facsimile version - but not in ASCII text:
Parent case - claims 1-10
Re-exam 1 - new claims 11-112
Re-exam 2 - new claims 113-126

IFI record consolidates all changes into a single record - the novice has a better chance of getting a more accurate answer to a legal status search.

US coverage

Columns: Kind Code | Definition | Earliest date of use | Dialog / 652-654 | IFI Claims | INPADOC (incl. Delphion) | Questel / USAPPS | Questel / USPAT | STN / USPAT2 | STN / USPATFULL | WPI

(Empty cells were lost when the table was flattened; the values in each row are given in their original left-to-right order and cannot all be assigned to a column.)

US-A, old Act grant: 1836 1971 1950 1968 1971 1971 1963
US-A, new Act published application: 2001 2001 2001 2001 2001 2001
US-B, old Act re-examination: Y 1981
US-B, new Act grant: 2001 2001 Y
US-C, new Act re-examination: 2001 Y
US-E, re-issue: 1838 Y 1963 1968 Y 1970
US-H, defensive publication: 1969 Y 1963 1977 1976
US-H1, Statutory Invention Registration: 1985 Y 1963 1985 1985 Y 1968
US-A1, Trial Voluntary Protest Program: 1975 1975
US-S, Design Patent: 1843 Y 1976 2001 1976 Y
US-P, old Act Plant Patent: 1931 Y 1976 1994 1976 Y
US-P1, Plant Patent published application: 2001 2001
US-P2, Plant Patent grant: 2001
US-A0, NTIS invention applications: 1974 1983
US-A9, Correction of new Act published application: 2002 2002 2002
None, Office of the Alien Property Custodian (APC): 1917

• Example analysis of KD coverage
  – e.g. IFI would appear to cover SIRs from 1963, some 22 years before they started (?)
  – e.g. split between USPATFULL/USPAT2 difficult to discern
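The SIR anomaly is the kind of thing a simple consistency check can flag: a claimed coverage start that predates the first use of the kind code. In the sketch below only the two years for US-H1 (first use 1985, IFI start 1963) come from the slide; `premature_coverage` and the "OtherHost" entry are hypothetical, added to show a consistent value passing the check.

```python
def premature_coverage(first_use: int, claimed_start: dict[str, int]) -> dict[str, int]:
    """Map each source to the number of years its claimed start precedes first use."""
    return {
        source: first_use - start
        for source, start in claimed_start.items()
        if start < first_use
    }


# US-H1 (Statutory Invention Registration): first used in 1985.
# "OtherHost" is a made-up source whose start year is consistent with first use.
print(premature_coverage(1985, {"IFI": 1963, "OtherHost": 1985}))
```

A flagged entry is a prompt to ask the producer what the start year actually refers to, not proof of an error.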

Missing fields

• Third measure of database quality
  – every document of every appropriate kind
  – published in that period
  – by that publishing authority
  – present in the file to the same level of detail
• Evaluation must compare like-with-like
  – Variations in completeness of coverage and/or field population will affect the apparent effectiveness
• Examples:
  – EP and PCT full-text files, Derwent WPI coverage

Non-systematic or missing fields

• New field during database life
  – imposes an implicit time range on your search
    • e.g. IPC editions, WPI coding changes
• Systematic omission of a field
  – biases results against records which do not contain that field
    • e.g. US-A assignees
    • e.g. JP, CN inventors in WPI

European Patents Fulltext covers all European patent applications and granted European patents published since the opening of the European Patent Office (EPO) in 1978…

But…
EP-A specifications from 1986 in only one language
EP-B specifications from 1991 in three languages

PCT full text

• Many files claim to cover ‘full text’ PCTs
  – Few handle the cases published in Japanese, Chinese or Russian
    • but these still have an English abstract
• Abstract searching gives equal weight to all documents
• Full text searching skews results in favour of records containing full text

Derwent WPI countries

• Most countries in WPI are coded using the Manual Code system
  – but not all countries had Manual Codes added from the start of their coverage
• A strategy incorporating Manual Codes imposes an implicit time ranging on some countries, and can distort retrieval
  – MC retrieval of KR-B started 1990, biblio available from 1986
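The implicit time range can be stated as the span of years in which records exist but carry no Manual Codes. A minimal sketch, assuming whole calendar years and using the KR-B dates from the slide; `invisible_years` is an illustrative name.

```python
def invisible_years(biblio_from: int, field_from: int) -> range:
    """Years whose records exist in the file but carry no value in the later-added field."""
    return range(biblio_from, field_from)


# KR-B in WPI, per the slide: bibliographic records from 1986, Manual Codes from 1990.
hidden = invisible_years(biblio_from=1986, field_from=1990)
print(list(hidden))
```

Any strategy that requires a Manual Code silently excludes KR-B records from those years, even though they are in the file.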

Quality and the search platform

• A poor search platform / command language can ruin a good quality database, by effectively concealing or distorting the information which is present.

• Typical questions:
  – does the default print format contain the most useful information for my search?
  – do I obtain the same answer irrespective of the route to it?

Default print formats

PlusPat BIB format: only shows first publication stage.

1/1 PLUSPAT - (C) QUESTEL-ORBIT- image
CPIM (C) Questel-Orbit
PN - EP0954211 A2 19991103 [EP-954211]
TI - (A2) Supporting apparatus
STG - (A2) Pub. Of applic. Without search report

PlusPat MAX format

1/1 PLUSPAT - (C) QUESTEL-ORBIT- image
CPIM (C) Questel-Orbit
PN - EP0954211 A2 19991103 [EP-954211]
PN2 - EP0954211 A3 20000719 [EP-954211]
PN3 - EP0954211 B1 20030723 [EP-954211]
PN4 - EP0954211 B8 20040414 [EP-954211]
TI - (A2) Supporting apparatus
STG - (A2) Pub. Of applic. Without search report
STG2- (A3) Publi. Of search report
STG3- (B1) Patent
STG4- (B8) Modified first page

Variation due to search route

• US patent term extension under 35 USC §156 (Hatch/Waxman)
  – issued in the form of a Certificate of Correction
• At least two equivalent routes to view:
  – locate the original document and check for a ‘Correction’ segment in full text view OR
  – go directly to list of term extensions and link to Certificate
    • http://www.uspto.gov/web/offices/pac/dapp/opla/term/156.html

Test question

• Is there an extension in force for
  – US 4540568 ?
  – US 4572909 ?
  – N.B. - This question avoids the use of PAIR (inoperative on day of test) and assumes that the enquirer has already established that US 4540568 has been replaced by US Re 32969
    • but why was PAIR not working?
    • and why should I have to make that assumption?

35 USC 156 listing

Results

• US Re 32,969 (replaced US 4540568)
  – via 156 listing, obtains a PDF of Cert. of Correction; shows term extended for 931 days
    • actual extension in listing is recalculated as 897 under 35 USC 156(c)(3)
  – via full text view, there is no record of the Certificate of Correction at all; nor any link from US 4540568
• US 4572909
  – via 156 listing, obtains a PDF showing extension for 1252 days
  – via full text view, an additional ‘Correction’ segment available

Additional document segment is present for US 4572909, but missing for others….

Summary answer

Source               US 4540568 / US Re32969   US 4572909
35 USC 156 listing   Yes                       Yes
Full text            No                        Yes
PAIR                 ?                         ?
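The summary can be read as a route-consistency test: for each patent, do all routes that returned an answer agree? The sketch below encodes the table above, with `None` standing for the "?" entries; the function name `routes_agree` is illustrative.

```python
from typing import Optional

# Answers per search route, as recorded in the summary.
# None stands for "?" (PAIR was inoperative on the day of the test).
answers: dict[str, dict[str, Optional[bool]]] = {
    "US 4540568 / US Re32969": {"35 USC 156 listing": True, "Full text": False, "PAIR": None},
    "US 4572909": {"35 USC 156 listing": True, "Full text": True, "PAIR": None},
}


def routes_agree(by_route: dict[str, Optional[bool]]) -> bool:
    """True when every route that returned an answer returned the same one."""
    known = {answer for answer in by_route.values() if answer is not None}
    return len(known) <= 1


for patent, by_route in answers.items():
    print(patent, "consistent" if routes_agree(by_route) else "INCONSISTENT")
```

Only the second patent passes; for the first, the answer depends on the route taken.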

Conclusion

• There is no such thing as “the database for all seasons”

• Evaluation is ongoing, even for established products

• There are many ways in which databases can be ‘incomplete’

• A poor search environment can ruin a good database

• Communication between legal, information and database specialists is the key quality factor

Coming up in part 2...

• Two case studies
  – PCT publication rates
  – Searching for gold

Enjoy your break!