31
For peer review only Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal Cancers in Three Italian Administrative Healthcare Databases: A Diagnostic Accuracy Study Protocol Journal: BMJ Open Manuscript ID bmjopen-2015-010547 Article Type: Protocol Date Submitted by the Author: 16-Nov-2015 Complete List of Authors: Abraha, Iosief; Regional Health Authority of Umbria, Health Planning Service Serraino, Diego; IRCCS Centro di Riferimento Oncologico Aviano (IT), Epidemiology and Biostatistic Unit, Giovannini, Gianni; Regional Health Authority of Umbria, Health Planning Office Stracci, Fabrizio; University of Perugia, Public Health Department Casucci, Paola; Regional Health Authority of Umbria, Health ICT Service Alessandrini, Giuliana; Regional Health Authority of Umbria, Health ICT Service Bidoli, Ettore; IRCCS Centro di Riferimento Oncologico, Epidemiology and Biostatistic Unit Chiari, Rita; Azienda Ospedaliera Perugia, Oncology De Giorgi, Marcello; Regional Health Authority of Umbria, , Health ICT Service Franchini, David; Regional Health Authority of Umbria, Health ICT Service Vitale, Maria Francesca; Registro Tumori Regione Campania , ASL NA3 Sud Fusco, Mario; ASL NA3 Sud, Registro Tumori Regione Campania Montedori, Alessandro; Regional Health Authority of Umbria, Health Planning Service <b>Primary Subject Heading</b>: Health services research Secondary Subject Heading: Oncology, Public health Keywords: administrative database, validating ICD-9 codes, breast, lung and colorectal cancers For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml BMJ Open on June 19, 2020 by guest. Protected by copyright. http://bmjopen.bmj.com/ BMJ Open: first published as 10.1136/bmjopen-2015-010547 on 25 March 2016. Downloaded from

BMJ Open · of Diseases, 9th Revision (ICD-9) or 10th Revision (ICD-10) edition. The ICD is designed to map health conditions to corresponding generic categories together with specific

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

For peer review only

Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal Cancers in Three Italian Administrative Healthcare

Databases: A Diagnostic Accuracy Study Protocol

Journal: BMJ Open

Manuscript ID bmjopen-2015-010547

Article Type: Protocol

Date Submitted by the Author: 16-Nov-2015

Complete List of Authors: Abraha, Iosief; Regional Health Authority of Umbria, Health Planning Service Serraino, Diego; IRCCS Centro di Riferimento Oncologico Aviano (IT),

Epidemiology and Biostatistic Unit, Giovannini, Gianni; Regional Health Authority of Umbria, Health Planning Office Stracci, Fabrizio; University of Perugia, Public Health Department Casucci, Paola; Regional Health Authority of Umbria, Health ICT Service Alessandrini, Giuliana; Regional Health Authority of Umbria, Health ICT Service Bidoli, Ettore; IRCCS Centro di Riferimento Oncologico, Epidemiology and Biostatistic Unit Chiari, Rita; Azienda Ospedaliera Perugia, Oncology De Giorgi, Marcello; Regional Health Authority of Umbria, , Health ICT Service

Franchini, David; Regional Health Authority of Umbria, Health ICT Service Vitale, Maria Francesca; Registro Tumori Regione Campania , ASL NA3 Sud Fusco, Mario; ASL NA3 Sud, Registro Tumori Regione Campania Montedori, Alessandro; Regional Health Authority of Umbria, Health Planning Service

<b>Primary Subject Heading</b>:

Health services research

Secondary Subject Heading: Oncology, Public health

Keywords: administrative database, validating ICD-9 codes, breast, lung and colorectal cancers

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open on June 19, 2020 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2015-010547 on 25 M

arch 2016. Dow

nloaded from

For peer review only

1

Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal

Cancers in Three Italian Administrative Healthcare

Databases: A Diagnostic Accuracy Study Protocol

Iosief Abraha, Diego Serraino, Gianni Giovannini, Fabrizio Stracci, Paola Casucci, Giuliana

Alessandrini, Ettore Bidoli, Rita Chiari, Marcello De Giorgi, David Franchini, Maria Francesca

Vitale, Mario Fusco, Alessandro Montedori

Author affiliations:

Health Planning Service, Regional Health Authority of Umbria, Perugia, Italy

Iosief Abraha

Gianni Giovannini

Alessandro Montedori

Regional Health Authority of Umbria, Health ICT Service, Perugia, Italy

Paola Casucci

Giuliana Alessandrini

Marcello De Giorgi

David Franchini

Centro di Riferimento Oncologico Aviano, Epidemiology and Biostatistic Unit, Aviano, Italy

Diego Serraino

Ettore Bidoli

Registro Tumori Regione Campania, ASL NA3 Sud, Brusciano (Na), Italy

Mario Fusco

Maria Francesca Vitale

Public Health Department University of Perugia, Perugia, Italy

Fabrizio Stracci

Dipartimento di Oncologia, Azienda Ospedaliera Perugia, Perugia, Italy

Rita Chiari

Corresponding author:

Iosief Abraha, [email protected]; [email protected]

Health Planning Service

Regional Health Authority of Umbria

Via Mario Angeloni 61

06124 Perugia, Italia

Tel. +39 075 504 5251

Fax. +39 075 504 5569

Page 1 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

2

Abstract

Introduction Administrative healthcare databases are useful tools to study healthcare outcomes and to monitor

the health status of a population. Patients with cancer can be identified through disease-specific

codes, prescriptions and physician claims, but prior validation is required to achieve an accurate

case definition. The objective of this protocol is to assess the accuracy of International

Classification of Diseases 9th Revision – Clinical Modification (ICD-9-CM) codes for breast, lung,

and colorectal cancers in identifying patients diagnosed with the relative disease in three Italian

administrative databases.

Methods and analysis Data from the administrative databases of Umbria (910,000 residents), Napoli (1,170,000 residents),

Friuli-Venezia Giulia (1,227,000 residents) will be considered. The following ICD-9-CM codes will

function as index tests: 233.0 and 174.0 -174.9 for breast cancer, 162.0 - 162.9 for lung cancer,

153.0 - 153.9 and 154.8 for colorectal cancer. Randomly selected clinical charts will function as

reference standards. Data abstractors will be trained appropriately and agreement between pairs of

abstractors will be assessed. By and large, we will consider two types of algorithms: (a) a cancer

specific algorithm by combining ICD-9-CM diagnosis codes (e.g., malignant neoplasm 174.0 ) with

procedure codes (i.e. mastectomy: 85.41 - 85.48); and (b) generic codes including chemotherapy or

radiotherapy codes, and the position of the diagnoses (principal or secondary) as well as variation in

the time window between two diagnoses. For each algorithm, sensitivity, specificity, and predictive

values will be calculated.

Dissemination Study results will be disseminated widely through peer-reviewed publications and presentations at

national and international conferences.

Page 2 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

3

Article focus

Validation of administrative database diagnosis codes is necessary for health outcome research.

The aim of this protocol is to evaluate the validity of the International Classification of Diseases-

9th Revision – Clinical Modification (ICD-9-CM) codes for breast, lung and colorectal cancers in

three administrative databases.

Strengths and limitations of this study

This study will be the first to validate ICD-9-CM codes of three cancers in three large

administrative databases in Italy.

This study will evaluate the best combination of algorithms through which to identify the disease

of interest.

Validation studies of administrative data codes is setting-specific and not generalizable to other

settings.

Page 3 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

4

Introduction

As computer technology continues to advance, administrative databases are increasingly growing in

numerous healthcare settings worldwide. These databases anonymously store data about patients

regarding healthcare assistance they received, including birth, death, or disease treatment. Usually,

the diagnosis of the disease is associated with a specific code from the International Classification

of Diseases, 9th Revision (ICD-9) or 10

th Revision (ICD-10) edition. The ICD is designed to map

health conditions to corresponding generic categories together with specific variations1. The

merging of individual patient data from administrative databases with other sources (e.g.,

prescription and laboratory data) allows investigating a wide range of relevant and often unique

public health questions 2, monitoring population health status over time and performing population-

based pharmacoepidemiological research 2.

To constitute a reliable source for research studies, adequate validation of administrative healthcare

databases is mandatory. While non-clinical information in healthcare databases, such as

demographic and prescription data, are highly accurate 3 4, the validity of registered diagnoses and

procedures is variable4 5. Determining the accuracy of the latter two categories of clinical

information is vitally important to all potential users and involves confirming the consistency of

information within the databases with the corresponding clinical records of patients 3.

In Italy, all the regional health authorities maintain large healthcare information systems containing

patient data from all hospital and territorial sources. These databases have the potential to address

important issues in post-marketing surveillance6 7, epidemiology

8, quality performance and health

services research9. However, there is a concern that their considerable potential as a source of

reliable healthcare information has not been realized since they have not been widely validated. A

systematic review of ICD-9 code validation in Italian administrative databases10 reported that only a

few regional databases have been validated for a limited number of ICD-9 codes of diseases

including stroke 11 12, gastrointestinal bleeding

13, thrombocytopenia

14, epilepsy

15, infection

16,

Page 4 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

5

chronic obstructive pulmonary disease 17 18, Guillain-Barre syndrome

19, and cancer

20 21. In

addition, the use of these databases was scarce, as only six administrative databases served as

sources for published research articles based on the validated ICD-9 codes. Hence, it is imperative

that regional health authorities systematically validate their databases for critical diseases to

productively use the information they contain.

Breast, colorectal and lung cancers are the most commonly diagnosed neoplasms worldwide, as

well as in Italy22. Consequently, they generate interest in the scientific community and industry as

targets for the development of new drugs, and for governments, given that they are an important

cause of public health and economic burden. For example, variation in the epidemiology of breast23,

colorectal24 25 and lung

26 cancers, treatment (pharmacological or surgical) administered to patients

suffering from these cancers and potential clinical and economic outcomes27-29, can all be evaluated

using validated administrative databases.

The objective of the present protocol is to evaluate the accuracy of the ICD-9-CM codes related

breast, lung and colorectal cancers, in correctly identifying the respective diseases using three large

Italian administrative healthcare databases.

Page 5 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

6

Methods

Setting and data source

Administrative databases

The target administrative databases were from the Umbria Region (910,000 residents), Local Health

Unit 3 of Napoli (1,170,000 residents) and the Friuli-Venezia Giulia Region (1,227,000 residents).

Starting from the early 90s, all of these administrative databases collected information from all

patient medical records from public and private hospitals including demographics, hospital

admission and discharge dates, vital statistics, the admitting hospital department, the principal

diagnosis and a maximum of 5 secondary discharge diagnoses and the principal, and 5 secondary,

surgical and diagnostic procedures. In addition, these databases contain the records of all drug

prescriptions listed in the National Drug Formulary and the basic characteristics of patients’

physicians. Each resident has a unique national identification code with which it is possible to link

the various types of information, corresponding to each person, within the database. In Italy,

healthcare assistance is covered almost entirely by the Italian National Health System (NHS),

therefore most residents’ significant healthcare information can be found within the healthcare

databases.

Study design

The design to be used is that of a diagnostic accuracy study and will be applied to the three

administrative databases. Each team will validate its own database using a common methodological

approach. For each ICD-9 code (considered to be an index test), a reference standard for the

definition of a disease will be identified using available systematic reviews of studies that have

validated the specific code[6]. When a systematic review is not available, in addition to seeking the

opinions of clinicians, we will systematically search relevant primary studies, that validated medical

diagnoses, in electronic databases of the published medical literature (see companion paper). After

constructing a standardized form for each diagnostic code, a random sample of patients with

Page 6 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

7

specific ICD-9 codes will be obtained from each database and the corresponding charts will be

acquired. Benchmarking between the three administrative databases (rates, inconsistencies in

coding errors and potential differences that need to be resolved before initiating the validation

process) will be performed.

Any reporting or publication of the results from the present study will follow recommended

guidelines to ensure quality of reporting 30 31.

Methodological and statistical analysis Each team will validate the listed ICD-9 codes within its own database, using the same

methodological approach.

Index test

The following ICD-9-CM codes will function as index tests: 233.0 and 174.0 to 174.9 for breast

cancer; 162.0 to 162.9 for lung cancer; and 153.0 to 153.9 and 154.8 for colorectal cancer. To

enhance the accuracy of the codes to classify a disease, algorithms will be developed (see algorithm

development).

Reference standards

Medical charts will function as reference standards (gold standard) to validate the listed codes.

For each cancer, clinical and histological parameters will be retrieved, as well as information related

to tumor size (pT), diagnostic results used to determine the stage of the tumor (TNM), and type of

surgery. In addition, the use of any therapy, including chemotherapy or radiotherapy, either as

standard or adjuvant therapy, will be recorded.

Page 7 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

8

Algorithm development

Algorithms are a set of rules for identifying disease cases from administrative data32. Elements for

the development of algorithms will be retrieved from the medical literature and by consulting

experienced oncologists. By and large, we will consider two types of algorithms: (a) a cancer

specific algorithm by combining ICD-9-CM diagnosis codes (e.g., Malignant neoplasm 174.0 ) with

procedure codes (i.e. Mastectomy: 85.41 - 85.48); and (b) generic codes including chemotherapy or

radiotherapy codes, and the position of the diagnoses (principal or secondary) as well as variation in

the time window between two diagnoses. Table 1 displays a combination of ICD-9-CM diagnosis

codes and potential surgical, radiotherapy or chemotherapy procedure codes to identify breast,

colorectal, and lung cancer cases in medical charts.

Page 8 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

9

Table 1. ICD-9-CM diagnosis and procedure codes for identifying breast, colorectal, and lung

cancer cases in medical charts

Site of

cancer

ICD-9-CM principal diagnosis

code

ICD-9-CM secondary diagnosis

code

ICD-9-CM procedure code

Breast Carcinoma in situ: 233.0 Any Incisional breast biopsy:

85.12

Malignant neoplasm: 174.0 -174.9 Any Excision of breast tissue:

85.20 - 85.25

Mastectomy: 85.41 - 85.48

Any Carcinoma in situ: 233.0 Any procedure

Any Malignant neoplasm: 174.0 -174.9 Any procedure

Chemotherapy: V58.1, V67.2 Malignant neoplasm: 174.0 -174.9 Any procedure

Radiotherapy: V58.0, V67.0 Malignant neoplasm: 174.0 -174.9 Any procedure

Lung Malignant neoplasm of trachea,

bronchus, and lung: 162.0e162.9 Any procedure

Chemotherapy: V58.1, V67.2 Malignant neoplasm of trachea,

bronchus, and lung: 162.0-162.9

Any procedure

Radiotherapy: V58.0, V67.0 Malignant neoplasm of trachea,

bronchus, and lung: 162.0-162.9

Any procedure

Any Malignant neoplasm of trachea,

bronchus, and lung: 162.0-162.9

Administration of any

antineoplastic agent: 99.25,

99.28

Colorectal Malignant neoplasm of colon:

153.0 - 153.9

Any Any procedure

Malignant neoplasm of rectum and

rectosigmoid junction: 154.0 -

154.1,154.8

Any Any procedure

Any Malignant neoplasm of colon:

153.0 - 153.9

Any procedure

Any Malignant neoplasm of rectum and

rectosigmoid junction:154.0 -

154.1, 154.8

Any procedure

Chemotherapy: V58.1, V67.2 Malignant neoplasm of colon:

153.0 - 153.9

Any procedure

Radiotherapy: V58.0, V67.0 Malignant neoplasm of rectum and

rectosigmoid junction:154.0 -

154.1, 154.8

Any procedure

Table adapted from Baldi 200820

Page 9 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

10

Case selection and sampling method

In general, for each medical diagnosis, we will identify patients discharged across a period of time

(3 years). From these records, we will use a random number generator to randomly select a subset

of patients for our reference standard abstraction.

Case ascertainment

A standardized form, for each disease, will be produced to record demographic, clinical, and

laboratory patient data. Clinically trained chart reviewers will examine the medical charts and fill in

the specific form. Before proceeding to full medical chart abstracting, the level of agreement

between pairs of reviewers will be assessed for a limited number of charts.

Validation methods and statistical analysis

We will determine inter-rater agreement for chart abstraction by calculating the kappa statistic for

the sample of charts abstracted and reviewed by two investigators. Sensitivity and specificity will

be analyzed separately for each ICD-9 code by constructing 2 x 2 tables. Sensitivity measures the

extent to which an index test (e.g. ICD-9 code: 233) correctly identifies individuals who, according

to a reference standard, possess the characteristics of interest (i.e. carcinoma in situ) in the medical

chart. Specificity measures the degree to which an index test correctly identifies individuals who,

according to a reference standard, lack the characteristics of interest in the medical chart4. For both

sensitivity and specificity, 95% confidence intervals will be calculated.

In addition, multiple versions of the diagnostic algorithm will be developed using different logistic

regression models and all combinations of the various variables, including the position of the

diagnosis (primary or secondary), chemotherapy use and surgical procedures. Finally, the accuracy

of the algorithm stratified by the candidate ICD-9-CM code diagnosis position, the hospital and the

district, will be compared using the chi-squared test and with p < 0.05.

Page 10 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

11

Discussion

In this protocol, we present the approach we will use to analyze the validity of ICD-9-CM codes for

breast, lung and colorectal cancers, in order to ascertain their incidence and prevalence in

administrative databases representing northern, central and southern Italy.

Administrative databases constitute a valid alternative to situations in which randomized trials are

not able to provide the required evidence, for practical or economic reasons. Epidemiological

studies are frequently based on administrative claims databases to identify cases of specific

diseases, such as cancer, and often contain ICD-9 codes. The latter codes have the advantage of

being widely available and require lower effort and cost than consulting medical charts4.

Accurate identification of cancer cases using the ICD-9 codes may contribute to monitoring cancer

trends and to proposing interventions to ameliorate cancer care. In 2008, an Italian study developed

and validated an algorithm using a regional administrative database to determine incident cases of

breast, lung, and colorectal cancers and found a sensitivity of, respectively, 76.7%, 80.8%, and

72.4%, for the three cancers20. The present study will add value to the knowledge of the three

cancer diseases given that it covers different areas of Italy.

Ethics and dissemination Ethical approval has been obtained from the Regional Committee of Umbria. Study results will be

disseminated widely through peer-reviewed publications and presentations at national and

international conferences.

Page 11 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

12

Footnotes

Contributors AM, IA, MF, and DS conceived the study and all authors were responsible for

designing the protocol. IA and AM drafted the protocol manuscript. All authors critically revised

the successive versions of the manuscript and approved the final version.

Funding This study protocol is supported by funding from the National Centre for Disease

Prevention and Control (CCM 2014), Ministry of Health, Italy. The study funder was not involved

in the study design or the writing of the protocol.

Competing interests None.

Page 12 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

13

References

1. World Health Organization. International statistical classification of diseases and health related

problems, 10th revision. Geneva: WHO 1992.

2. Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Annu

Rev Public Health 2011;32:91-108.

3. Rawson NSB, Shatin D. Assessing the validity of diagnostic data in large administrative healthcare

utilization databases. . In: Hartzema A, Tilson H, Chan K, eds. Pharmacoepidemiology and

Therapeutic Risk Management: Harvey Whitney Books, 2008.

4. West SL, Ritchey ME, Poole C. Validity of Pharmacoepidemiologic Drug and Diagnosis Data.

Pharmacoepidemiology: Wiley-Blackwell, 2012:757-94.

5. Campbell SE, Campbell MK, Grimshaw JM, et al. A systematic review of discharge coding accuracy. J

Public Health Med 2001;23(3):205-11.

6. Traversa G, Bianchi C, Da Cas R, et al. Cohort study of hepatotoxicity associated with nimesulide and

other non-steroidal anti-inflammatory drugs. BMJ (Clinical research ed) 2003;327(7405):18-22.

7. Trifiro G, Patadia V, Schuemie MJ, et al. EU-ADR healthcare database network vs. spontaneous reporting

system database: preliminary comparison of signal detection. Studies in health technology and

informatics 2011;166:25-30.

8. Gini R, Francesconi P, Mazzaglia G, et al. Chronic disease prevalence from Italian administrative

databases in the VALORE project: a validation through comparison of population estimates with

general practice databases and national survey. BMC public health 2013;13:15.

9. Colais P, Pinnarelli L, Fusco D, et al. The impact of a pay-for-performance system on timing to hip fracture

surgery: experience from the Lazio Region (Italy). BMC health services research 2013;13(1):393.

10. Abraha I, Montedori A, Eusebi P, et al. The Current State of Validation of Administrative Healthcare

Databases in Italy: A Systematic Review. Pharmacoepidemiol Drug Saf 2012;21:400-00.

11. Leone MA, Capponi A, Varrasi C, et al. Accuracy of the ICD-9 codes for identifying TIA and stroke in an

Italian automated database. Neurological Sciences 2004;25(5):281-8.

12. Rinaldi R, Vignatelli L, Galeotti M, et al. Accuracy of ICD-9 codes in identifying ischemic stroke in the

General Hospital of Lugo di Romagna (Italy). Neurological Sciences 2003;24(2):65-69.

13. Cattaruzzi C, Troncon MG, Agostinis L, et al. Positive predictive value of ICD-9th codes for upper

gastrointestinal bleeding and perforation in the Sistema Informativo Sanitario Regionale database.

Journal of clinical epidemiology 1999;52(6):499-502.

14. Galdarossa M, Vianello F, Tezza F, et al. Epidemiology of primary and secondary thrombocytopenia: first

analysis of an administrative database in a major Italian institution. Blood Coagul

Fibrinolysis;23(4):271-7.

15. Franchi C, Giussani G, Messina P, et al. Validation of healthcare administrative data for the diagnosis of

epilepsy. Journal of epidemiology and community health 2013;67(12):1019-24.

16. Tinelli M, Mannino S, Lucchi S, et al. Healthcare-acquired infections in rehabilitation units of the

Lombardy Region, Italy. Infection 2002;39(4):353-8.

17. Fano V, D'Ovidio M, del Zio K, et al. [The role of the quality of hospital discharge records on the

comparative evaluation of outcomes: the example of chronic obstructive pulmonary disease

(COPD)]. Epidemiologia e prevenzione 2011;36(3-4):172-79.

18. Faustini A, Canova C, Cascini S, et al. The reliability of hospital and pharmaceutical data to assess

prevalent cases of chronic obstructive pulmonary disease. Copd;9(2):184-96.

19. Bogliun G, Beghi E. Validity of hospital discharge diagnoses for public health surveillance of the Guillain-

Barré syndrome. Neurological Sciences 2002;23(3):113-17.

20. Baldi I, Vicari P, Di Cuonzo D, et al. A high positive predictive value algorithm using hospital

administrative data identified incident cancer cases. Journal of clinical epidemiology

2008;61(4):373-9.

21. Schifano P, Papini P, Agabiti N, et al. Indicators of breast cancer severity and appropriateness of surgery

based on hospital administrative data in the Lazio Region, Italy. BMC public health 2006;6:25.

Page 13 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

14

22. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods

and major patterns in GLOBOCAN 2012. International journal of cancer Journal international du

cancer 2015;136(5):E359-86.

23. Yuan Y, Li M, Yang J, et al. Using administrative data to estimate time to breast cancer diagnosis and

percent of screen-detected breast cancers - a validation study in Alberta, Canada. European journal

of cancer care 2015;24(3):367-75.

24. Deshpande AD, Schootman M, Mayer A. Development of a claims-based algorithm to identify colorectal

cancer recurrence. Annals of epidemiology 2015;25(4):297-300.

25. Quantin C, Benzenine E, Hagi M, et al. Estimation of national colorectal-cancer incidence using claims

databases. Journal of cancer epidemiology 2012;2012:298369.

26. Abdulmalak C, Cottenet J, Beltramo G, et al. Haemoptysis in adults: a 5-year study using the French

nationwide hospital administrative database. The European respiratory journal 2015;46(2):503-11.

27. Dehal A, Abbas A, Johna S. Comorbidity and outcomes after surgery among women with breast cancer:

analysis of nationwide in-patient sample database. Breast cancer research and treatment

2013;139(2):469-76.

28. Konski A. Clinical and economic outcomes analyses of women developing breast cancer in a managed

care organization. American journal of clinical oncology 2005;28(1):51-7.

29. Mittmann N, Liu N, Porter J, et al. Utilization and costs of home care for patients with colorectal cancer:

a population-based study. CMAJ open 2014;2(1):E11-7.

30. Benchimol EI, Manuel DG, To T, et al. Development and use of reporting guidelines for assessing the

quality of validation studies of health administrative data. Journal of clinical epidemiology

2011;64(8):821-9.

31. De Coster C, Quan H, Finlayson A, et al. Identifying priorities in methodological research using ICD-9-CM

and ICD-10 administrative data: report from an international consortium. BMC health services

research 2006;6:77.

32. Lix LM, De Coster C, Currie R. Defining and validating chronic diseases: an administrative data approach:

Manitoba Centre for Health Policy Winnipeg, 2006.

Page 14 of 14

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal Cancers in Three Italian Administrative Healthcare

Databases: A Diagnostic Accuracy Study Protocol

Journal: BMJ Open

Manuscript ID bmjopen-2015-010547.R1

Article Type: Protocol

Date Submitted by the Author: 31-Jan-2016

Complete List of Authors: Abraha, Iosief; Regional Health Authority of Umbria, Health Planning Service Serraino, Diego; IRCCS Centro di Riferimento Oncologico Aviano (IT),

Epidemiology and Biostatistic Unit, Giovannini, Gianni; Regional Health Authority of Umbria, Health Planning Office Stracci, Fabrizio; University of Perugia, Public Health Department Casucci, Paola; Regional Health Authority of Umbria, Health ICT Service Alessandrini, Giuliana; Regional Health Authority of Umbria, Health ICT Service Bidoli, Ettore; IRCCS Centro di Riferimento Oncologico, Epidemiology and Biostatistic Unit Chiari, Rita; Azienda Ospedaliera Perugia, Oncology Cirocchi, Roberto; University of Perugia, Surgery De Giorgi, Marcello; Regional Health Authority of Umbria, , Health ICT

Service Franchini, David; Regional Health Authority of Umbria, Health ICT Service Vitale, Maria Francesca; Registro Tumori Regione Campania , ASL NA3 Sud Fusco, Mario; ASL NA3 Sud, Registro Tumori Regione Campania Montedori, Alessandro; Regional Health Authority of Umbria, Health Planning Service

<b>Primary Subject Heading</b>:

Health services research

Secondary Subject Heading: Oncology, Public health

Keywords: administrative database, validating ICD-9 codes, breast, lung and colorectal cancers

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open on June 19, 2020 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2015-010547 on 25 M

arch 2016. Dow

nloaded from

For peer review only

1

Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal

Cancers in Three Italian Administrative Healthcare

Databases: A Diagnostic Accuracy Study Protocol

Iosief Abraha, Diego Serraino, Gianni Giovannini, Fabrizio Stracci, Paola Casucci, Giuliana

Alessandrini, Ettore Bidoli, Rita Chiari, Roberto Cirocchi, Marcello De Giorgi, David Franchini,

Maria Francesca Vitale, Mario Fusco, Alessandro Montedori

Author affiliations:

Health Planning Service, Regional Health Authority of Umbria, Perugia, Italy

Iosief Abraha

Gianni Giovannini

Alessandro Montedori

Regional Health Authority of Umbria, Health ICT Service, Perugia, Italy

Paola Casucci

Giuliana Alessandrini

Marcello De Giorgi

David Franchini

Centro di Riferimento Oncologico Aviano, Epidemiology and Biostatistic Unit, Aviano(PN), Italy

Diego Serraino

Ettore Bidoli

Registro Tumori Regione Campania, ASL NA3 Sud, Brusciano (Na), Italy

Mario Fusco

Maria Francesca Vitale

Public Health Department University of Perugia, Perugia, Italy

Fabrizio Stracci

Dipartimento di Oncologia, Azienda Ospedaliera Perugia, Perugia, Italy

Rita Chiari

Department of Digestive Surgery and Liver Unit, University of Perugia, Perugia, Italy

Roberto Cirocchi

Corresponding author:

Iosief Abraha, [email protected]; [email protected]

Health Planning Service

Regional Health Authority of Umbria

Via Mario Angeloni 61

06124 Perugia, Italia

Tel. +39 075 504 5251

Fax. +39 075 504 5569

Page 1 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

2

Abstract

Introduction Administrative healthcare databases are useful tools to study healthcare outcomes and to monitor

the health status of a population. Patients with cancer can be identified through disease-specific

codes, prescriptions and physician claims, but prior validation is required to achieve an accurate

case definition. The objective of this protocol is to assess the accuracy of International

Classification of Diseases 9th Revision – Clinical Modification (ICD-9-CM) codes for breast, lung,

and colorectal cancers in identifying patients diagnosed with the relative disease in three Italian

administrative databases.

Methods and analysis Data from the administrative databases of Umbria Region (910,000 residents), Local Health Unit 3

of Napoli (1,170,000 residents), Friuli-Venezia Giulia Region (1,227,000 residents) will be

considered. In each administrative database, patients with the first occurrence of diagnosis of breast,

lung or colorectal cancer between 2012 and 2014 will be identified using the following groups of

ICD-9-CM codes in primary position: (a) 233.0 and (b) 174.x for breast cancer; (c) 162.x for lung

cancer; (d) 153.x for colon cancer, and (e) 154.0 - 154.1 and 154.8 for rectal cancer. Only incident

cases will be considered, that is, excluding cases that have the same diagnosis in the five years

(2007-2011) before the period of interest. A random sample of cases and non-cases will be selected

from each administrative database and the corresponding medical charts will be assessed for

validation by pairs of trained, independent reviewers. Case ascertainment within the medical charts

will be based on (a) the presence of a primary nodular lesion in the breast, lung or colon-rectum,

documented with imaging or endoscopy and (b) a cytological or histological documentation of

cancer from a primary or metastatic site. Sensitivity and specificity with 95% confidence intervals

will be calculated.

Page 2 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

3

Dissemination Study results will be disseminated widely through peer-reviewed publications and presentations at

national and international conferences.

Strengths and limitations of this study

The study will evaluate the validity of the International Classification of Diseases-9th Revision –

Clinical Modification (ICD-9-CM) codes for breast, lung and colorectal cancers in three large

Italian administrative databases.

The strength of this study is that it will use a medical chart review to ascertain case of cancer

diseases.

Once these administrative databases will be validate for breast, lung and colorectal cancer diseases

they can be used for Outcome Research including pharmacoepidemiology, health service research

and quality of care research.

This study will be the first to validate ICD-9-CM codes of three cancers in three large

administrative databases in Italy.

Validation studies of administrative data are related to that context and are not generalizable to

other settings.

Page 3 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

4

Introduction

As computer technology continues to advance, administrative databases are increasingly growing in

numerous healthcare settings worldwide. These databases anonymously store data about patients

regarding healthcare assistance they received, including birth, death, or disease treatment. Usually,

the diagnosis of the disease is associated with a specific code from the International Classification

of Diseases, 9th Revision (ICD-9) or 10

th Revision (ICD-10) edition. The ICD is designed to map

health conditions to corresponding generic categories together with specific variations1. The

merging of individual patient data from administrative databases with other sources (e.g.,

prescription and laboratory data) allows investigating a wide range of relevant and often unique

public health questions 2, monitoring population health status over time and performing population-

based pharmacoepidemiological research 2-4

.

To constitute a reliable source for research studies, adequate validation of administrative healthcare

databases is mandatory. While non-clinical information in healthcare databases, such as

demographic and prescription data, are highly accurate 5 6

, the validity of registered diagnoses and

procedures is variable6 7

. Determining the accuracy of the latter two categories of clinical

information is vitally important to all potential users and involves confirming the consistency of

information within the databases with the corresponding clinical records of patients 5.

In Italy, all the Regional Health Authorities maintain large healthcare information systems

containing patient data from all hospital and territorial sources. These databases have the potential

to address important issues in post-marketing surveillance8 9

, epidemiology10

, quality performance

and health services research11

. However, there is a concern that their considerable potential as a

source of reliable healthcare information has not been realized since they have not been widely

validated. A systematic review of ICD-9 code validation in Italian administrative databases12

reported that only a few regional databases have been validated for a limited number of ICD-9

codes of diseases including stroke13 14

, gastrointestinal bleeding15

, thrombocytopenia16

, epilepsy17

,

Page 4 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

5

infections18

, chronic obstructive pulmonary disease19 20

, Guillain-Barré syndrome21

, and cancers22 23

.

In addition, the use of these databases was scarce, as only six administrative databases served as

sources for published research articles based on the validated ICD-9 codes. Hence, it is imperative

that Regional Health Authorities systematically validate their databases for critical diseases to

productively use the information they contain.

Breast, colorectal and lung cancers are the most commonly diagnosed neoplasms worldwide, as

well as in Italy24

. Consequently, they generate interest in the scientific community and industry as

targets for the development of new drugs, and for governments, given that they are an important

cause of public health and economic burden. For example, variation in the epidemiology of breast25

,

colorectal26 27

and lung28

cancers, treatment (pharmacological or surgical) administered to patients

suffering from these cancers and potential clinical and economic outcomes29-31

, can all be evaluated

using validated administrative databases.

The objective of the present protocol is to evaluate the accuracy of the ICD-9-CM codes related to

breast, lung and colorectal cancers, in correctly identifying the respective diseases using three large

Italian administrative healthcare databases.

Page 5 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

6

Methods

Setting and data source

Administrative databases

Starting from the early 90s, local and regional Italian healthcare administrative databases have

collected information from all patient medical records from public and private hospitals including

demographics, hospital admission and discharge dates, vital statistics, the admitting hospital

department, the principal diagnosis and a maximum of 5 secondary discharge diagnoses and the

principal, and 5 secondary, surgical and diagnostic procedures. In addition, these databases contain

the records of all drug prescriptions listed in the National Drug Formulary and the basic

characteristics of patients’ physicians. Each resident has a unique national identification code with

which it is possible to link the various types of information, corresponding to each person, within

the database. In Italy, healthcare assistance is covered almost entirely by the Italian National Health

System (NHS), therefore most residents’ significant healthcare information can be found within the

healthcare databases.

The target administrative databases for the present study will be from the Umbria Region (910,000

residents), Local Health Unit 3 of Napoli (1,170,000 residents) and the Friuli-Venezia Giulia

Region (1,227,000 residents). For each database the corresponding Unit (Regional Health Authority

of Umbria for Umbria Region, Registro Tumori Regione Campania for Local Health Unit 3 of

Napoli, and Centro di Riferimento Oncologico Aviano for Friuli-Venezia Giulia Region) will

conduct the same validation process.

Source population

The source population will be represented by permanent residents aged 18 or above of Umbria

Region, Local Health Unit 3 of Napoli and the Friuli-Venezia Giulia Region. Any resident that has

been discharged from hospital with a diagnosis of breast, lung or colorectal cancer will be

Page 6 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

7

considered. Residents that have been hospitalised outside the regional territory of competence will

be excluded from analysis due to difficulty in obtaining the medical charts.

Case selection and sampling method

In each administrative database, patients with the first occurrence of diagnosis of breast, lung or

colorectal cancer between 2012 and 2014 will be identified using the following groups of ICD-9-

CM codes located in primary position: (a) 233.0 and (b) 174.x for breast cancer; (c) 162.x for lung

cancer; (d) 153.x for colon cancer, and (e) 154.0 - 154.1 and 154.8 for rectal cancer. Only incident

cases will be considered, that is, excluding cases with the same diagnosis (ICD-9-CM codes in any

position) in the five years (2007-2011) before the period of interest. Subsequently, for each of the

above reported groups of ICD-9-CM codes, a random sample of cases will be selected from each

administrative database. Table 1 displays the description of the ICD-9-CM codes for each of the

cancer diseases of interest.

Chart abstraction and case ascertainment

The corresponding medical charts of the randomly selected sample cases will be obtained from

hospitals for validation purposes. From each medical chart the following information will be

retrieved: initials of the patient, date of birth, sex, dates of hospital admission and discharge, any

diagnostic procedure that contributed to the diagnosis of the cancer, any pharmacological or

surgical intervention that were provided for the treatment of the cancer.

Within each unit, two reviewers will receive training on data abstraction. An initial consensus chart

review will be performed with each reviewer independently examining the same number of medical

chart (n=20). The inter-rater agreement regarding the presence or absence of breast, lung or

colorectal cancer among the pairs of reviewers within each unit will be calculated using the κ

statistics. This process will be repeated until the strength of agreement among the pairs of reviewers

will be near perfect (κ statistics between 0.81 and 1.00). Any discrepancies will be discussed and

resolved through a third party involvement (Rita Chiari).

Page 7 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

8

Case ascertainment of cancer within medical chart will be based on (a) the presence of a primary

nodular lesion in the breast, lung or colon-rectum, documented with imaging or endoscopy and (b)

the cytological or histological documentation of cancer from a primary or metastatic site.

Following the consensus review, data abstraction will be completed independently. To ensure

consistency among all the reviewers, cases with uncertainty will be discussed and resolved through

a third party involvement (Rita Chiari).

Validation criteria

For non-invasive breast cancer, we will consider the ICD-9-CM code 233.0 valid, when there is

evidence of a breast nodule documented with imaging (e.g., mammography) and a histological

diagnosis of ductal or lobular breast carcinoma in situ (pTis).

For invasive breast cancer, we will consider the ICD-9-CM codes 174.x valid, when there is

evidence of a breast nodule documented with imaging (e.g., mammography) and a cytological or

histological diagnosis from a primary or metastatic site positive for ductal or lobular

adenocarcinoma.

For lung cancer, we will consider the ICD-9-CM codes 162.x valid, when there is evidence of a

pulmonary nodule documented with imaging (e.g., CT scan) and a cytological or histological

diagnosis from a primary or metastatic site positive for either small cell lung cancer (Microcitoma)

or non-small cell lung cancer (NSCLC).

For colon cancer, we will consider the ICD-9-CM codes 153.x valid, when there is evidence of a

neoplastic lesion within the colon documented with endoscopy (e.g., coloscopy) or imaging (e.g.,

barium enema), and a histological diagnosis from a primary or metastatic site positive for

adenocarcinoma, squamous cell carcinoma or neuroendocrine carcinoma.

For rectal cancer, we will consider the ICD-9-CM codes 154.0-154.1 and 154.8 valid, when there is

evidence of a neoplastic lesion, in the rectosigmoid junction or the rectum, documented with

Page 8 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

9

endoscopy (e.g., coloscopy) or imaging, (e.g., barium enema) and a histological diagnosis from a

primary or metastatic site positive for adenocarcinoma or squamous cell carcinoma.

Table 1. Description of the ICD-9-CM codes related to breast, lung and colorectal cancers.

Condition ICD-9-CM diagnosis code

Breast Carcinoma in situ: 233.0

Malignant neoplasm

of female breast

174.0 nipple and areola

174.1 central portion

174.2 upper-inner quadrant

174.3 lower-inner quadrant

174.4 upper-outer quadrant

174.5 lower-outer quadrant

174.6 axillary tail

174.8 other specified sites of female breast

174.9 breast female, unspecified

Lung Malignant neoplasm of trachea, bronchus, and lung

162.0 Trachea

162.2 main bronchus

162.3 upper lobe, bronchus or lung

162.4 middle lobe, bronchus or lung

162.5 lower lobe, bronchus or lung

162.8 other parts of bronchus or lung

162.9 bronchus and lung, unspecified

Colorectal 153 Malignant neoplasm of colon

153.0 hepatic flexure

153.1 transverse colon

153.2 descending colon

153.3 sigmoid colon

153.4 cecum

153.5 appendix

153.6 ascending colon

153.7 splenic flexure

153.8 other specified sites of large intestine

153.9 colon, unspecified

154 Malignant neoplasm of rectum and rectosigmoid junction

154.0 rectosigmoid junction

154.1 rectum

Page 9 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

10

154.8 other

Page 10 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

11

Statistical analysis

We calculated a sample of 130 charts of cases will be necessary to obtain an expected sensitivity of

80% with a precision of 10% and a power of 80%. For specificity calculation, we will randomly

select non-cases, that is, records without the ICD-9-codes of interest from administrative database.

The corresponding medical charts will be retrieved and evaluated. We calculated a sample of 94

charts of non-cases will be retrieved to obtain an expected specificity of 90% with a precision of

10% and a power of 80%. Overall, each unit will evaluate 1,120 charts.

Sensitivity and specificity will be analysed separately for each ICD-9-CM code by constructing 2 x

2 tables. Sensitivity expresses the proportion of ‘true positives’(TP) (i.e., cancer cases classified as

positive by both the administrative database and medical record review) and all cases deemed

positive by medical chart review. Specificity expresses the proportion of ‘true negatives’ (i.e., cases

without cancer identified by both the administrative database and medical record review), and with

all cases deemed negative by medical chart review. For both sensitivity and specificity, 95%

confidence intervals will be calculated.

Reporting

Complete, transparent and accurate reporting is essential in diagnostic accuracy studies because it

allows readers to assess internal validity as well as to evaluate the generalisability and applicability

of results32

. To ensure quality of reporting any reporting or publication of the results from the

present study will follow recommended guidelines based on the criteria published by the Standards

for Reporting of Diagnostic accuracy (STARD) initiative for the accurate reporting of investigations

of diagnostic studies 32-34

.

Page 11 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

12

Discussion

In this protocol, we present the approach we will use to analyse the validity of ICD-9-CM codes for

breast, lung and colorectal cancers, in administrative databases representing northern, central and

southern Italy.

Administrative databases constitute a valid alternative to situations in which randomised trials are

not able to provide the required evidence, for practical or economic reasons. In addition, despite

epidemiological studies on cancer are frequently based on cancer registries35-37

, administrative

databases can add a further value especially on pharmacoepidemiology3 12 38

and health services

research39 40

.

Accurate identification of cancer cases using the ICD-9-CM codes may contribute to monitoring

cancer trends and to proposing interventions to ameliorate cancer care. In 2008, an Italian study

developed and validated an algorithm using a regional administrative database to determine incident

cases of breast, lung, and colorectal cancers and found a sensitivity of, respectively, 76.7%, 80.8%,

and 72.4%, for the three cancers22

. The present study will add value to the knowledge of the three

cancer diseases given that it covers different areas of Italy.

Ethics and dissemination Ethical approval has been obtained from the Regional Committee of Umbria. Study results will be

disseminated widely through peer-reviewed publications and presentations at national and

international conferences.

Page 12 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

13

Footnotes

Contributors AM, IA, MF, and DS conceived the study and all authors were responsible for

designing the protocol. IA and AM drafted the protocol manuscript. IA, DS, GG, FS, PC, GA, EB,

RC, RC, MD, DF, MFV, MF and AM critically revised the successive versions of the manuscript

and approved the final version.

Funding:

This study protocol is supported by funding from the National Centre for Disease Prevention and

Control (CCM 2014), Ministry of Health, Italy. The study funder was not involved in the study

design or the writing of the protocol.

Funding This systematic review is developed within the D.I.V.O. project (Realizzazione di un

Database Interregionale Validato per l’Oncologia quale strumento di valutazione di impatto e di

appropriatezza delle attività di prevenzione primaria e secondaria in ambito oncologico) supported

by funding from the National Centre for Disease Prevention and Control (CCM 2014), Ministry of

Health, Italy. The study funder was not involved in the study design or the writing of the protocol.

Competing interests None.

Page 13 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

14

References

1. World Health Organization. International statistical classification of diseases and health related

problems, 10th revision. Geneva: WHO 1992.

2. Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Annual

review of public health 2011;32:91-108.

3. Jick SS. Fresh evidence confirms links between newer contraceptive pills and higher risk of venous

thromboembolism. Bmj 2015;350:h2422.

4. García Rodríguez LA, Pérez-Gutthann S, Jick S. The UK General Practice Research Database.

Pharmacoepidemiology: John Wiley & Sons, Ltd, 2002:375-85.

5. Rawson NSB, Shatin D. Assessing the validity of diagnostic data in large administrative healthcare

utilization databases. . In: Hartzema A, Tilson H, Chan K, eds. Pharmacoepidemiology and

Therapeutic Risk Management: Harvey Whitney Books, 2008.

6. West SL, Ritchey ME, Poole C. Validity of Pharmacoepidemiologic Drug and Diagnosis Data.

Pharmacoepidemiology: Wiley-Blackwell, 2012:757-94.

7. Campbell SE, Campbell MK, Grimshaw JM, et al. A systematic review of discharge coding accuracy. J

Public Health Med 2001;23(3):205-11.

8. Traversa G, Bianchi C, Da Cas R, et al. Cohort study of hepatotoxicity associated with nimesulide and

other non-steroidal anti-inflammatory drugs. Bmj 2003;327(7405):18-22.

9. Trifiro G, Patadia V, Schuemie MJ, et al. EU-ADR healthcare database network vs. spontaneous reporting

system database: preliminary comparison of signal detection. Studies in health technology and

informatics 2011;166:25-30.

10. Gini R, Francesconi P, Mazzaglia G, et al. Chronic disease prevalence from Italian administrative

databases in the VALORE project: a validation through comparison of population estimates with

general practice databases and national survey. BMC public health 2013;13:15.

11. Colais P, Pinnarelli L, Fusco D, et al. The impact of a pay-for-performance system on timing to hip

fracture surgery: experience from the Lazio Region (Italy). BMC health services research

2013;13(1):393.

12. Abraha I, Montedori A, Eusebi P, et al. The Current State of Validation of Administrative Healthcare

Databases in Italy: A Systematic Review. Pharmacoepidemiol Drug Saf 2012;21:400-00.

13. Leone MA, Capponi A, Varrasi C, et al. Accuracy of the ICD-9 codes for identifying TIA and stroke in an

Italian automated database. Neurological Sciences 2004;25(5):281-8.

14. Rinaldi R, Vignatelli L, Galeotti M, et al. Accuracy of ICD-9 codes in identifying ischemic stroke in the

General Hospital of Lugo di Romagna (Italy). Neurological Sciences 2003;24(2):65-69.

15. Cattaruzzi C, Troncon MG, Agostinis L, et al. Positive predictive value of ICD-9th codes for upper

gastrointestinal bleeding and perforation in the Sistema Informativo Sanitario Regionale database.

Journal of clinical epidemiology 1999;52(6):499-502.

16. Galdarossa M, Vianello F, Tezza F, et al. Epidemiology of primary and secondary thrombocytopenia: first

analysis of an administrative database in a major Italian institution. Blood Coagul

Fibrinolysis;23(4):271-7.

17. Franchi C, Giussani G, Messina P, et al. Validation of healthcare administrative data for the diagnosis of

epilepsy. Journal of epidemiology and community health 2013;67(12):1019-24.

18. Tinelli M, Mannino S, Lucchi S, et al. Healthcare-acquired infections in rehabilitation units of the

Lombardy Region, Italy. Infection 2002;39(4):353-8.

19. Fano V, D'Ovidio M, del Zio K, et al. [The role of the quality of hospital discharge records on the

comparative evaluation of outcomes: the example of chronic obstructive pulmonary disease

(COPD)]. Epidemiologia e prevenzione 2011;36(3-4):172-79.

20. Faustini A, Canova C, Cascini S, et al. The reliability of hospital and pharmaceutical data to assess

prevalent cases of chronic obstructive pulmonary disease. Copd;9(2):184-96.

21. Bogliun G, Beghi E. Validity of hospital discharge diagnoses for public health surveillance of the Guillain-

Barré syndrome. Neurological Sciences 2002;23(3):113-17.

Page 14 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from

For peer review only

15

22. Baldi I, Vicari P, Di Cuonzo D, et al. A high positive predictive value algorithm using hospital

administrative data identified incident cancer cases. Journal of clinical epidemiology

2008;61(4):373-9.

23. Schifano P, Papini P, Agabiti N, et al. Indicators of breast cancer severity and appropriateness of surgery

based on hospital administrative data in the Lazio Region, Italy. BMC public health 2006;6:25.

24. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods

and major patterns in GLOBOCAN 2012. International journal of cancer Journal international du

cancer 2015;136(5):E359-86.

25. Yuan Y, Li M, Yang J, et al. Using administrative data to estimate time to breast cancer diagnosis and

percent of screen-detected breast cancers - a validation study in Alberta, Canada. European journal

of cancer care 2015;24(3):367-75.

26. Deshpande AD, Schootman M, Mayer A. Development of a claims-based algorithm to identify colorectal

cancer recurrence. Annals of epidemiology 2015;25(4):297-300.

27. Quantin C, Benzenine E, Hagi M, et al. Estimation of national colorectal-cancer incidence using claims

databases. Journal of cancer epidemiology 2012;2012:298369.

28. Abdulmalak C, Cottenet J, Beltramo G, et al. Haemoptysis in adults: a 5-year study using the French

nationwide hospital administrative database. The European respiratory journal 2015;46(2):503-11.

29. Dehal A, Abbas A, Johna S. Comorbidity and outcomes after surgery among women with breast cancer:

analysis of nationwide in-patient sample database. Breast cancer research and treatment

2013;139(2):469-76.

30. Konski A. Clinical and economic outcomes analyses of women developing breast cancer in a managed

care organization. American journal of clinical oncology 2005;28(1):51-7.

31. Mittmann N, Liu N, Porter J, et al. Utilization and costs of home care for patients with colorectal cancer:

a population-based study. CMAJ open 2014;2(1):E11-7.

32. Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and accurate reporting of studies of

diagnostic accuracy: the STARD initiative. Bmj 2003;326(7379):41-44.

33. Benchimol EI, Manuel DG, To T, et al. Development and use of reporting guidelines for assessing the

quality of validation studies of health administrative data. Journal of clinical epidemiology

2011;64(8):821-9.

34. De Coster C, Quan H, Finlayson A, et al. Identifying priorities in methodological research using ICD-9-CM

and ICD-10 administrative data: report from an international consortium. BMC health services

research 2006;6:77.

35. Guzzinati S, Buzzoni C, De Angelis R, et al. Cancer prevalence in Italy: an analysis of geographic

variability. Cancer causes & control : CCC 2012;23(9):1497-510.

36. Nicita C, Buzzoni C, Chellini E, et al. [A comparative analysis between regional mesothelioma registries

and cancer registries: results of the ReNaM-AIRTUM project]. Epidemiol Prev 2014;38(3-4):191-9.

37. Zorzi M, Mangone L, Sassatelli R, et al. Screening for colorectal cancer in Italy: 2011-2012 survey.

Epidemiol Prev 2015;39(3 Suppl 1):115-25.

38. Nordstrom BL, Whyte JL, Stolar M, et al. Identification of metastatic cancer in claims data.

Pharmacoepidemiol Drug Saf 2012;21 Suppl 2:21-8.

39. Franchi C, Lucca U, Tettamanti M, et al. Cholinesterase inhibitor use in Alzheimer's disease: the

EPIFARM-Elderly Project. Pharmacoepidemiol Drug Saf 2011;20(5):497-505.

40. Rosato R, Sacerdote C, Pagano E, et al. Appropriateness of early breast cancer management in relation

to patient and hospital characteristics: a population based study in Northern Italy. Breast cancer

research and treatment 2009;117(2):349-56.

Page 15 of 15

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on June 19, 2020 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2015-010547 on 25 March 2016. D

ownloaded from