of 12 /12
doi:10.1136/jech.32.4.303 1978;32;303-313 J Epidemiol Community Health R Doll and R Peto and lifelong non-smokers. relationships among regular smokers carcinoma: dose and time Cigarette smoking and bronchial http://jech.bmj.com/cgi/content/abstract/32/4/303 Updated information and services can be found at: These include: References http://jech.bmj.com/cgi/content/abstract/32/4/303#otherarticles 55 online articles that cite this article can be accessed at: Rapid responses http://jech.bmj.com/cgi/eletter-submit/32/4/303 You can respond to this article at: service Email alerting sign up in the box at the top right corner of the article Receive free email alerts when new articles cite this article - Notes http://journals.bmj.com/cgi/reprintform To order reprints of this article go to: http://journals.bmj.com/subscriptions/ go to: Journal of Epidemiology and Community Health To subscribe to on 7 April 2009 jech.bmj.com Downloaded from

Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

  • Upload

  • View

  • Download

Embed Size (px)

Text of Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and...

Page 1: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

doi:10.1136/jech.32.4.303 1978;32;303-313 J Epidemiol Community Health

  R Doll and R Peto  

and lifelong non-smokers.relationships among regular smokerscarcinoma: dose and time Cigarette smoking and bronchial

http://jech.bmj.com/cgi/content/abstract/32/4/303Updated information and services can be found at:

These include:


http://jech.bmj.com/cgi/content/abstract/32/4/303#otherarticles55 online articles that cite this article can be accessed at:  

Rapid responses http://jech.bmj.com/cgi/eletter-submit/32/4/303

You can respond to this article at:

serviceEmail alerting

sign up in the box at the top right corner of the article Receive free email alerts when new articles cite this article -


http://journals.bmj.com/cgi/reprintformTo order reprints of this article go to:

http://journals.bmj.com/subscriptions/ go to: Journal of Epidemiology and Community HealthTo subscribe to

on 7 April 2009 jech.bmj.comDownloaded from

Page 2: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Journal of Epidemiology and Community Health, 1978, 32, 303-313

Cigarette smoking and bronchial carcinoma: dose andtime relationships among regular smokers and lifelongnon-smokersRICHARD DOLL AND RICHARD PETOFrom the Radcliffe Infirmary, University of Oxford

SUMMARY In a 20-year prospective study on British doctors, smoking habits were ascertained byquestionnaire and lung cancer incidence was monitored. Among cigarette smokers who startedsmoking at ages 16-25 and who smoked 40 or less per day, the annual lung cancer incidence inthe age range 40-79 was:

0-273 x 1-12. (cigarettes/day+6)2. (age-22.5)4 5.The form of the dependence on dose in this relationship is subject not only to random error but alsoto serious systematic biases, which are discussed. However, there was certainly some statisticallysignificant (P<0.01) upward curvature of the dose-response relationship in the range 0-40 cigar-ettes/day, which is what might be expected if more than one of the 'stages' (in the multistagegenesis of bronchial carcinoma) was strongly affected by smoking. If a higher than linear dose-response relationship exists between dose per bronchial cell and age-specific risk per bronchial cell,this may help explain why bronchial carcinomas chiefly arise in the upper bronchi, for dilutioneffects might then protect the larger areas lower in the bronchial tree.

In essence, the 'multistage model' hypothesis aboutcarcinoma induction is that a few changes, eachheritable when somatic cells divide in the tissues,are needed to alter an ordinary epithelial cell intothe progenitor of a carcinoma. As well as these'stages', other processes may of course also berelevant-for example, partially transformed epi-thelial cells may already have some selective ad-vantage over their unaltered neighbours, thuseventually increasing the number of such cells atrisk of further change, while the host may havesome defences against partially or even fully alteredcells, thus reducing the number of such cells. Thereis not as yet sufficient knowledge of these otherprocesses to allow their proper incorporation intothe mathematical formulation of multistage models,but, as the various qualitatively different stages andprocesses involved in the production of one singlecarcinoma come to be understood separately, theneed will arise for some synthesis of them into anoverall sequence of events. Multistage models seemat present to offer the most promising frameworkfor such an eventual synthesis (Peto, 1977), even ifcurrent knowledge is too sparse for such models tobe tested critically.

Successive stages in the transformation of one

cell may be separated from each other by severalyears, and the biologic nature of the early stagesmay be completely different from that of the laterstages. If so, different stages may have differentcauses. Epidemiological data on smoking andbronchial carcinoma are more extensive than forany other cause of human carcinomas, so it may beprofitable to ask which stage or process smokingaffects most strongly. This has already been done(Doll, 1971), but the epidemiological evidence wasdifficult to fit together plausibly (Armitage, 1971).Smoking in early adult life seemed to have a sub-stantial effect on the risk of cancer in old age,suggesting that smoking affected at least one earlystage. Giving up smoking in later adult life seemed tohave a substantial effect on the risk five or 10 yearslater, suggesting that smoking also affected at leastone late stage or process. The simplest assumptionwould be that if, comparing two people with differentsmoking habits, the dose-rate of smoke to thetarget cells in one person was double that in theother person, then the rate of occurrence of boththe early stage and the late stage or process wouldbe approximately doubled, thereby multiplying thefinal age-specific incidence rate of lung cancer byabout four. (This is not a firm prediction, of course,


on 7 April 2009 jech.bmj.comDownloaded from

Page 3: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Richard Doll and Richard Peto

for the two separate occurrence rates may not beeven approximately proportional to the dose-rateof smoke). Armitage (1971), however, pointed outthat most epidemiological data suggested that theage-specific risk was approximately proportional tothe reported daily consumption rather than to thesquare of the reported consumption, and he foundit surprising that the observed relationship seemedlinear rather than quadratic, if two different aspectsof carcinoma induction were indeed affected.The reported cigarette consumption is, however,

certainly an inaccurate measure of the current (letalone the past) extent of exposure of the bronchialepithelial stem cells to the carcinogenic agents incigarette smoke, because of differences betweensmokers in the condition of the bronchial tree(deciliation, airflow obstruction, phlegm production,etc.); type of cigarette; number and size of puffs;butt length and depth of inhalation*.The aims of this paper are (1) to present in

detail the age-specific lung cancer incidence datafrom the 20-year follow-up in the prospectivestudy of male British doctors undertaken in 1951(Doll and Hill, 1964) in as accurate a form aspossible, and (2) to point out that even if theremaining inaccuracies are fairly random, they willtend to conspire to make the exponent of dose inthe observed relationship between lung cancerincidence and daily cigarette consumption lowerthan that in the true relationship between lungcancer incidence and extent of exposure of thebronchial stem cells to the relevant components ofsmoke. Thus, for example, even if the true biologicaldose-response relationship is quadratic, the observedepidemiological dose-response relationship mightbe roughly linear.Taken together, (1) and (2) may allow circum-

vention of the difficulty discussed by Armitage.


The main features of the study we shall use havebeen described in detail elsewhere (Doll and Peto,1976). The study period runs for 20 years, from1 November 1951 to 31 October 1971, and weshall subdivide this period into 20 'study years'(study year 1, study year 2, . . . study year 20),recording only the study year in which deaths andlung cancer onsets happen, rather than the exactdates of these events. On 1 November 1951, postalinquiries (questionnaire Ql) were made into the

*The effects of these differences in smoking style are important, butdifficult to quantify, as is demonstrated by the curious observationthat among heavy smokers the inhalers seem able to get some smokesafely past the main danger area in the upper bronchi, and so actuallyhave a somewhat lower risk of bronchial carcinoma than do non-inhalers with the same cigarette consumption (Doll and Peto, 1976).

current and past smoking habits of all men on the1951 British medical register who were thought tobe resident in Britain in October of that year.34 440 men (69% of those then alive) replied,almost all very promptly. This paper chiefly concernsthose who reported in 1951 that they were lifelongnon-smokers, or who reported in 1951 that theyhad smoked cigarettes regularly since early adultlife (which we defined as ages 16-25) and who didnot report that they had ever given up smoking orsmoked anything other than cigarettes. During theseventh and fifteenth years of the study, furtherquestionnaires (Q2 and Q3) were sent to the doctors,reminding them of their previously stated smokinghabit and asking whether this had continued. If theydid not reply they were reminded at least twice.Replies were received before the ends of study years7 and 15 from 98-4% and 96-4% of the survivors.

Ideally, we would like to examine the incidenceof lung cancer among lifelong non-smokers andamong men who have been exposed to a constantdaily dose of cigarette smoke since a commonstarting age. If, therefore, in response to question-naire Q2 or Q3, any non-smoker reported current orprevious smoking, or any smoker reported givingup smoking temporarily or permanently, or reportedcurrent or previous use of cigars or pipes, or re-ported a change in consumption of more thanfive cigarettes/day, then that person has beenexcluded from our present analysis as from the endof year 7 (for Q2) or year 15 (for Q3). The fewmen who did not reply to either or both of Q2 andQ3 were presumed not to have altered their habits.We have called those smokers who satisfied ourcriteria for inclusion in part or all of the study'regular' smokers, and we have studied them andthe non-smokers up to the time when they died, orwere excluded from the study, or were diagnosed ashaving lung cancer.

In the present analysis, we have related lungcancer onsets between the start of study year 1 andthe end of study year 7 to the smoking habitsdescribed in Qi, irrespective of information aboutthe dates of changes in habits gleaned from Q2.Likewise, we have related onsets between the startof study year 8 and the end of study year 15 to Q2,irrespective of Q3, while we have related onsetsbetween the start of year 16 and the end of year 20to Q3, irrespective of any change which in factoccurred during years 16-20.ttThis may seem perverse, since we know that some of our 'smokers'had in fact given up a few years previously, but it is necessary inorder to avoid bias. If, among men who gave up smoking soonafter one questionnaire, those who then died before the next question-naire were classified as smokers while those who survived were not,the death rates of smokers would be overestimated and those ofex-smokers would be underestimated. No information was soughtafter men had died about their smoking habits between the lastquestionnaire they answered and their death.


on 7 April 2009 jech.bmj.comDownloaded from

Page 4: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Cigarette smoking and bronchial carcinoma


Of the 34 440 respondents in 1951 to Q1, 10 072(29 2 %) were known to have died before 31 October1971, 24 265 were traced after 31 October 1971 andwere known to have been alive on 1 November1971, and 103 (0 3 %) could not be traced and havebeen arbitrarily taken to be alive on 1 November1971. Most of these 0 3% were known to be aliveat the time of the questionnaire Q3 in study year 15.Of those alive on 1 November 1971, we haveexamined almost all the death certificates of thosewho were known to have died within the next twoand a half years for mention of lung cancer, in caseany such lung cancers may have had onset during themain study period.


We shall be concerned with the date of clinicalonset of lung cancer (which we shall use to calculateincidence rates during the 20-year study period).We have therefore excluded all men with onsetbefore the initial questionnaire Ql, and we do notcount any onsets which occurred after the end ofthe main 20-year study period at 31 October 1971.The principal means of ascertainment has been bydeath certificates, 556 of which mentioned cancer ofthe trachea, bronchus, or pleura as an underlying or

associated cause of death. In addition, we learnedof 15 other cases by informal means, chiefly fromthe replies to questionnaires Ql, Q2, or Q3, or to a

questionnaire which was distributed after the endof study year 20 for other purposes. We wished toexclude any of these 571 putative lung cancers

which were not really lung cancer, or which hadonset outside the main study period. To help usdecide which to accept, we obtained informationabout the basis for the diagnosis of 567 of these571 possible lung cancers from the doctor who hadsigned the death certificate or from the consultantto whom the patient had been referred. Thirty-two

of the 567 (5 6%) were, on review, consideredunlikely to have been cancers of the lung or tracheaand have been discounted; these are listed in Dolland Peto (1976). The remainder, together with thefour for which we could not obtain informationabout the basis for the diagnosis, were accepted.Where there was doubt whether or not to acceptparticular cases, we sought the advice of Dr. J. R.Bignall (consultant physician at the BromptonHospital for Diseases of the Chest), who was notinformed of the subject's smoking history.For those 539 in whom the diagnosis was accepted,

we also sought the date when the diagnosis wasfirst made and regarded this as the date of onset ofthe disease. The original plan of the study had notrequired information about the date of diagnosis,and later inquiries failed to elicit the date in 60cases, most of which were doctors who died duringthe first few years of the study. For these, we assumedthat the diagnosis was made three months beforethe date of death, which is the median duration ofsurvival from clinical onset in the cases for whichwe had complete information. Four hundred andeighty-three of the 539 accepted cases of bronchialcarcinoma had estimated dates of onset within the20-year period from 1 November 1951 to 31 October1971.Not all patients were equally thoroughly investi-

gated. They were, therefore, classified according tothe strength of the evidence. 'Category 1' caseswere those diagnosed at necropsy or on (1) micro-scopic evidence of cancer compatible with a

bronchial origin plus (2) macroscopic evidence of theprimary site of origin at operation, bronchoscopy,or radiological examination. 'Category 2' caseswere those diagnosed without one or other of theseand without necropsy, while 'Category 3' caseswere diagnosed only on history and clinical ex-

amination. Four cases were known to us on thebasis of the death certificate alone.

Table 1 Reported cases of lung cancer by date of onset, diagnostic category, and relation to death

Estimated date of clinical onset Diagnostic category Underlying cause Contributory cause Unrelated to Totalof death of death death reported

During main study period1 Nov. 1951 to 31 Oct. 1971 1 259 13 5 277

2 172 7 8 1873 15 0 0 15

Death certificate only 4 0 0 4

Total accepted 450 20 13 483

Before I Nov. 1951 Total accepted 6 0 2 8

After 31 Oct. 1971 Total accepted 44 4 0 48

All periods Total accepted 500 24 15 539

All periods Not accepted* 29 3 NA 32

All periods Total reported 529 27 15 571

*Bronchial carcinoma mentioned on death certificate, but not accepted on investigation; excluded from analysis.


on 7 April 2009 jech.bmj.comDownloaded from

Page 5: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,


The distribution of the reported cases by date ofonset, diagnostic category, and relation to death isshown in Table 1. Of the 483 accepted lung cancerswith estimated date of onset within the mainstudy period, only 215 affected the non-smokersand regular cigarette smokers who are the subjectof the present paper.


The basic data are presented in Tables 2 and 3.Because of digit preference in reporting cigarette

Richard Doll and Richard Peto

consumption, the mean consumption (Table 2) isoften not in the middle of the range, as many

doctors stated that they smoked exactly 20, 30, or

40 cigarettes, while few reported adjacent numbers.For example, among men smoking 35 or more

cigarettes/day, 34% reported smoking 35-39, 46%reported smoking exactly 40, and 20% reportedsmoking more than 40. Because of this, the top twodose-groups which we have used are '35-40' and"more than 40', rather than '35-39' and '40 or more'.This has given us a reasonable amount of obser-vational material throughout the dose-range 0-40,

Table 2 Distribution ofman-years by current age and by amount smoked*CIGARETTES/DAY (RANGE AND MEAN)

Age group Never 1-4 5-9 10-14 15-19 20-24 25-29 30-34 35-40 More than All amounts(years) smoked (2 7) (6 6) (11 3) (16 0) (20 4) (25 4) (30-2) (38 0) 40 (50 -9) (11 0)

20-24 378 194 38 914 91 57 74 2 24 0 68725-29 5 0994 400 701 1 529 1 427 1 424 3044 153 46 104 11 09530-34 10 838 914 1 7624 3 270 3 343 3 966 1 0424 5824 2244 324 25 97635-39 15105 11564 2 178 3 8194 4 6494 6 0034 1 9914 1 1084 5451 1104 36 66840-44 17 846 1216 2 0414 3 7954 4 824 7 046 2 523 1 7154 8924 234 42 134445-49 15 832 10004 1 745 3 205 3 995 6 4604 2 5654 2 123 1150 3054 38 382450-54 12 226 8534 1 5624 2 727 3 2784 5 583 2 620 2 2264 1281 335i 32 693455-59 8 9054 625 1 355 2 288 2 4664 4 3574 2 1084 1 923 1063 284 25 37660-64 6 248 5094 1 068 1 714 1 8294 2 8634 1 5084 1 362 826 1834 18 112465-69 4 351 3924 8434 1 214 1 237 1 930 974 7634 515 120 12 34170-74 2 7234 242 6964 862 6834 1 055 527 3174 233 52 7 39275-79 1 772 2084 5174 547 3704 512 2094 130 884 184 437480-84 1 1854 173 281 314 1804 188 81 37 36 24 2478485+ 87Oj 774 149 1234 614 674 28 1 4 0 1 379All ages 103 381 7788 14 9394 25 500 28 437 41 514 16 4914 12 445 6904 1689 259 0894*On average, a group of men of a particular stated age at the start of the study wiUl have a mean age exactly halfway through that year of age.Likewise, on average, men who suffer lung cancer onset or death from any cause in a particular study year will do so exactly halfway throughthat study year. These 'average' assumptions underlie this Table.

Table 3 Numbers oflung cancer onsets during 20-year study period by current age and by amount smokedCIGARETTES/DAY

Per cent inAge group Never diagnostic(years) smoked 1-4 5-9 10-14 15-19 20-24 25-29 30-34 35-40 More than 40 Al amounts category

20-24 - - - - - - - - - -

25-29 - - - - - - - - - -

30-34 - - z_ _ _ _35-39 1/1 - _ _ _ _ _ _ _ -1/

0 0

40 44 - - - 1/1 - 0/1 - 1/1 - - 2/3 85%u 0 o,u

45-49 - - - 0/1 1/1 0/1 2/2 2/2 - - 5/7a o, u 2s o,a,2s, u

50-54 1/1 - - 2/2 3/4 5/6 3/3 3/3 3/3 - 20/22 js a, s s u 2o, 3s 3s a, 2s o, 8 u 3o, 2a, 12s, 2u J

55-59 2/2 1/1 - 1/1 - 6/8 4/5 5/6 3/4 1/1 23/28 82%O, u 8 0 3a,3s 3s, u a, 2s, 2u o s, u a 3o, Ss, 10s, Su

60-64 - 1/1 1/1 1/1 2/2 7/13 3/4 5111 6/7 1/1 27/41 66%a s s a a, 6s 3s 3s, 2u o, a, 2s, u o 2o, 4a, 16s, 3u

65-69 - - 1/1 1/2 0/2 9/12 3/5 5/9 3/9 0/1 22/41 54%s o 2a, Ss, u o, a, s 3o,2s 3s So, 3a, 12s, u

70-74 0/1 1/1 1/2 2/4 2/4 3/10 4/7 0/2 2/5 1/2 16/38 42%s s s, u s, u 3s a 2s, u s, u o o, a, 10s, 4u

75-79 0/2 - - 1/4 3/5 3/7 2/4 2/2 0/2 - 11/26u a, s, u s, u s, u 2s a, Ss, 4u

80-84 - - - 0/1 1/1 1/1 - 1/2 0/1 - 3/6 k 41%a u a,u

85+ - - - - 0/1 0/1 - - - - 0/2 JAll ages 4/7 3/3 3/4 9/17 12/20 34/60 21/30 24/38 17/31 3/5 130/215 60%

2o a 2o, a 3a 2o,7a 2o, 2a 4o, 2a 3o,a 2o,a 17o,18aa, u 2s 3s 3s, 3u 3s, 3u 21s, 2u 13s, 4u 13s, Su 8s, 4u 67s, 22u

Per cent in '< - "diagnosticcategory 1 57% 62% 57% 62% 60%

Key: x/y, where x=number in diagnostic category 1 and y=total number of confirmed onsets.The cell type was determined histologically for 124 ofthe 130 cancers ofdiagnostic category 1, and for these the numbers ofoat cell (o), squamnouscell (s), undifferentiated (u), and adenocarcinomas (a), are indicated below x/y.

on 7 April 2009 jech.bmj.comDownloaded from

Page 6: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Cigarette smoking and bronchial carcinoma

while the topmost group, 'more than 40', withwhich we shall deal separately by discussion ratherthan by model-fitting, contains under 1% of thewhole study population. The percentages of casesthat were diagnostic Category 1 are also cited inTable 3. It appears that over 80% of lung cancerpatients aged under 60 underwent a thoroughdiagnostic investigation, while only about 40% oflung cancer patients aged over 70 did so. There doesnot appear to be any marked tendency for thethoroughness of diagnostic investigation to berelated to smoking habits, but although this isreassuring it does not guarantee that diagnosticerrors will be unrelated to dose. However, weconsider it unlikely that any material diagnosticbiases exist which are dose related.


The numbers in the four separate histologicalcategories (oat, adeno, squamous, and undifferen-tiated) are too small to allow us to estimate theshapes of the four separate dose-response relation-ships; moreover, the histologic criteria used musthave varied from hospital to hospital. However, inspite of, (or, perhaps, partly because of!) thesedifficulties, we did observe statistically significant(1-tailed P<0 001) trends with respect to the tendose-groups in Tables 2 and 3 in the age-standardisedincidence of each separate histologic category(oat, adeno, squamous and undifferentiated), con-trary to the report of Doll and Hill (1964), based onthe 10-year follow-up of this study, that there wasno material dependence on smoking for adeno-carcinomas.


In what follows we shall restrict our attentionwholly to Tables 2 and 3 (lifelong non-smokers, ormen who started smoking between the ages of 16and 25 and who never reported stopping, changing

by more than 5/day, or smoking any form oftobacco other than cigarettes). We shall concernourselves only with those lung cancers where thediagnosis was accepted and where the estimateddate of onset lay within the main study period, andwe shall analyse these irrespective of histologicaltype or diagnostic category. Furthermore, we shallexclude from our analysis those men who reportedsmoking over 40 cigarettes/day, studying only thedose range 0-40 cigarettes/day, over the whole ofwhich we have appreciable amounts of data. (Be-cause the numbers of lung cancers arising amongmen smoking 1-4 and 5-9/day were both small wehave, when plotting our data, merged these into onesingle group, with mean consumption 5-3/day.)Finally, because there were no lung cancers amongsmokers aged under 40, and because there was verylittle data on smokers aged over 80, we shall restrictour analysis to lung cancer onsets in the age range40-79. This avoids those extremes of old age inwhich diagnostic errors are likely to be most severe.The range of doses and ages that will henceforthconcern us are delimited by the lines in Tables 2 and3.


The proportion of heavy smokers in old age differsfrom that in middle age, and this must be allowedfor when studying the dependence of incidencerates on dose. For each 5-year age group we havetherefore computed the numbers of lung cancersthat would have been expected in each dosagecategory if the total number of onsets in that agegroup had been distributed (in proportion to theman-years in that age group) irrespective of dose.Summation of these expected numbers for all eightage groups within one dosage category then yieldsan 'indirectly age-standardised' overall expectednumber for that dosage category. These overallnumbers are given in Table 4, together with the

Table 4 Dose-response standardisedfor age among men aged 40-79 smoking 0-40/dayO E

Cigarettes/day Observed Indirectly Relative risk Age-standardisedRange Mean number of age-standardised estimated by onset rate/l05

onsets expected (a) (b) ML man-years*OIE Maximum

likelihood0 0 0 6 74-20 0*081 0-080 9

(1-4) (2 7) (3) (6 38) (0 470) (0 458) (50)1-9 5 3 7 21-02 0 333 0-321 36

(5-9) (6 *6) (4) (14-64) (0 273) (0 263) (29)10-14 11*3 16 20 *47 0*782 0 *769 8515-19 16-0 18 19*66 0-916 0-972 10220-24 20-4 58 31 11 1-864 1-891 20925-29 25 4 30 15 09 1.988 2-032 22430-34 30*2 36 11 97 3 008 3*125 34435-40 38 0 30 7 49 4*005 4 134 456

Total 0-40 10-7 201 201 00 1.000 1.000 112

*Maximum likelihood (ML) relative risk x 112, where 112=crude onset rate/10' man-years in all men aged 40-79 smoking 0-40/day=201 x10'/179273.


on 7 April 2009 jech.bmj.comDownloaded from

Page 7: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Richard Doll and Richard Peto

ratio of observed to expected. This ratio gives anindication of the relative risks associated withdifferent habits. Estimation of these relative risksby an iterative maximum likelihood procedure*is in principle slightly preferable to estimation ofthem by O/E, but as may be seen by comparingcolumns (a) and (b) in Table 4, no material differ-ences exist between the results of these two methodsof estimation. These statistical methods would beideal if the age-specific incidence rates at differentdose levels were simple multiples of each other.Although this might be approximately true amongthe different groups of smokers, there is goodevidence (Doll, 1971) that the incidence amongnon-smokers is not a multiple of that amongsmokers. However, since we have only six lungcancers among non-smokers aged 40-79, thesestatistical methods will be sufficiently accurate forthese data, although the formulae we shall derivefrom these relative risk estimates will only bewholly satisfactory for predicting lung cancer riskswhich are dominated by the effects of smoking.We have found it easier to grasp the medical

significance of these relative risks if we multiplyeach by the overall lung cancer incidence rate in thewhole population (for men aged 40-79 smoking0-40/day this is 112/105), to obtain age-standardisedincidence rates/105. These are therefore displayedin the last column of Table 4 and are plotted inFig. 1.

In Fig. 1 we also plot the best-fitting straight line(incidence=9 (dose+ 1)/105) and the best-fittingsecond order polynomial (incidence=0-26 (dose+6)2/105). Although the indicated 90% confidenceintervals for the upper points are large, the curvedline does appear to fit considerably better than thestraight line, and this visual impression may beconfirmed by a statistical test (x2t=7*5 on onedegree of freedom, P<0 01 for curvature).RESULTS AMONG MEN WHO REPORT SMOKINGMORE THAN 40/DAYIt should be noted that if we had not excluded the 100 orso men who reported smoking more than 40/day, theresults of this whole analysis would have been materiallydifferent. Although their average reported consumption*In this procedure, joint effects of dose and age were fitted simul-taneously by one independent parameter per age group and one perdose group.tBased on double the improvement in log likelihood as we go fromthe best linear fit to the plotted curve.The fact that (dose+6)' fits the data significantly better than does

a straight line is, however, not proof that exactly 2 stages or processesare strongly affected by smoking, nor even proof that the true relation-ship is quadratic. It is merely an indication that, as has been hypo-thesised on other grounds, the dose-response relationship doesexhibit some upward curvature in the range 0-40 cigarettes/day.Even if we ignore the very substantial effects of various biases, thealgebraic form of the dose-response relationship cannot be uniquelyinferred from these data. For example, (dose+37)' fits just as well as(dose+6)', and indeed any exponent of 2 or more apparently allowsadequate fit if a suitable 'background' is first added to the dose.

was 50 cigarettes/day, their observed lung cancer risk issimilar to that among men reporting smoking only 30cigarettes/day. (In Fig. 1, the point for men smokingmore than 40/day is plotted but has not influenced thefitted lines). Only five of these self-reported heavysmokers developed lung cancer, which is about two-thirds of the number predicted by the straight line andbarely one-third of the number predicted by the curvein Fig. 1. Chance may play some part in this shortfall,but several other explanations may be considered,although we cannot test them directly.

Curved line0-26 (dose+ 6)2

^ 500lan

Ia-E4Q2E 400-u)0c:s>300



2 200 -


.V 100'


/IStraight "line /

9 (dose+1) /I


Over 40 perday(ignored whenf itting lines)

0 10 20Dose (cigarettes/ day)

30 40

Fig. 1 Dose-response relationship, standardisedfor age.The numbers of onsets in each group are given, and 90%confidence intervals are plotted.

Firstly, it may be that the only men who can reallystand smoking 50 or 60 cigarettes/day are those who,because of some aspect of their constitution or theiraverage dose per cigarette, are less affected than theaverage smoker by the noxious components of smoke.

Secondly, the amount stated to be smoked may forvarious reasons be an especially misleading measure ofthe amount actually smoked among those reporting themost extreme habits. For example, one man claimed tohave smoked 120 cigarettes/day since the age of three.This was obviously bogus, but a few doctors mightenjoy a certain bravado in reporting substantial but notobviously bogus habits to a study of smoking andmortality, and lesser degrees of exaggeration would beundetectable. (Alternatively, a reported extreme con-sumption at the initial survey might have been a tempor-ary aberration for a few days, weeks, or months only).Occasional misreporting would have little effect on theobserved outcomes in the common smoking categories



on 7 April 2009 jech.bmj.comDownloaded from

Page 8: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Cigarette smoking and bronchial carcinoma

Table 5 Age-specific incidence standardisedfor dose

O E Relative risk estimated by Dose-standardisedAge group (years) Observed number Indirectly dose- (a) OIE (b) ML onset rate/lO0

of onsets standardised expected man-years (as inTable 4)

40-44 3 41-24 0*073 0-072 845-49 7 41-06 0 170 0 170 1950-54 22 38*85 0-566 0 566 6355-59 27 31-46 0-858 0-859 9660-64 40 22 50 1-778 1*787 19965-69 40 14-48 2-762 2-780 30970-74 36 7-67 4-694 4-753 52975-79 26 3-74 6-952 7 097 790

Total 40-79 201 201-00 1-000 1-000 112

but the lung cancer rates among the small proportion ofmen who really smoke more than 40/day could beappreciably biased by just a few dozen exaggeratedclaims, or by a few dozen men who do actually lightmore than 40 cigarettes a day, but who then take ratherfew puffs per cigarette, either leaving them smoulderingfor long periods or stubbing them out while still quitelong. If the actual intake of smoke by the men whoclaimed to smoke more than 40/day is really rathersimilar to that of men who only reported 30/day then noanomaly remains, and it is possible that some suchbiases do operate. After all, it must be physically quitean achievement to smoke each of 60 cigarettes thoroughlyin one single day. Sixty cigarettes/day is about one every15 minutes of waking life; apart from any other effects,the carboxyhaemoglobin accumulation alone understandard smoking conditions would be nearly sufficientto poison the subject.

Thirdly, there are paradoxical effects of inhalation.Among men smoking 25 to 40/day, those who say theyinhale actually get less lung cancer than those who saythey do not (Doll and Peto, 1976). The reasons for thisare not known, but most lung cancers do arise in theupper bronchi and it might be that deep inhalationactually carries most smoke particles well past this dangerarea. Whatever the reason may be, the observed pro-tective effect does exist, and although the proportions ofself-reported inhalers were the same (70 3 % and 70 5 %)in those smoking more than 40 and in those smoking1-40, it might still be that some paradoxical aspect ofsmoking style reduces the risk for the 0 7 % of men whosmoke more than 40/day to the risk for men smoking 30to 40/day.

It is unlikely that the true relationship between cancerrisk and exposure of the target cells to smoke flattens offand decreases with increasing dose. Although there is ananimal model for such a turnover in leukaemia inducedby acute doses of radiation (Major and Mole, 1978), weknow of no model for it in animal carcinoma inductionby chronic exposure to carcinogens.


An analysis in which the incidence rates at differentages are indirectly standardised for dose in ninegroups (0, 1-4, 5-9, etc., to 35-40) may be per-formed, yielding Table 5. Again, the relative risk

estimates derived by simple indirect standardisationare very similar to those derived more correctly byiterative maximum likelihood, and again we havepreferred to multiply these relative risks by 112 toestimate dose-standardised incidence rates in eachage group.

Fig. 2 gives three alternative plots of these dataagainst time on a log/log scale, plotting the stand-ardised rates against age, against (age-22-5), oragainst (age-34). Although the fit of the observeddata to a straight line is marginally better if incidenceis plotted against (age-22-5) than if it is plottedagainst (age-34) or just against (age)*, it is clearthat, as in animal experiments (Peto and Lee, 1973),a reasonably straight line would result no matterwhat number of years (between 0 and 34 in thesedata) we subtract from the age before plotting suchgraphs. Since for inclusion in the present analysisall the smokers had to start smoking when they werebetween 16 and 25 years old (mean=19-2 yearsold), and since once it starts growing a lung cancerprobably usually takes only a few years to becomeclinically evident, (age-22-5) approximately rep-resents the duration of smoking when the lungcancer finally emerged as a truly neoplastic focusand is therefore a physiologically reasonablemeasure of time.

If the cancers being studied nearly all arose fromcells which were completely normal until acted onby cigarette smoke (rather than from cells whichsuffered early preneoplastic changes spontaneouslyand which then suffered further changes due tosmoking), simple multistage models (Armitage andDoll, 1961; Doll, 1971; Peto, 1977) predict that in-cidence rates should rise approximately as a power ofduration of smoking. If we estimate the duration of*Relative to the log-likelihood value for the best-fitting straight line,incidence rates proportional to age7 ', (age-22 5)' ' and (age-34)'imply log-likelihood values of -0.5, 00 and -0-4 respectively.For any w in the range 0<w<34, an assumption of incidence ratesproportional to (age-w) (7-2-o-.w) would imply a log-likelihoodbetween 0 0 and J-05 for these data, so statistical considerationsalone cannot indicate definitely which range ofw is acceptable. Givenw, the coefficient of variation of the ML estimate of the exponent of(age-w) is approximately 7% in these data.


on 7 April 2009 jech.bmj.comDownloaded from

Page 9: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Richard Doll and Richard Peto

smoking at the time of emergence of a truly neo-plastic focus as (age-22*5) then the central straightline in Fig. 2 (which has slope 4-49 with standarderror 0-31) clearly fits the data excellently. Whetheror not the men in this study really started smokingexactly when they claimed to have done, it is clearthat the only whole-number exponents of smokingduration which can possibly fit these data are 4 or 5.

Lo 1000a

Ico 300


to 100c LO

. tola 30-£c E'-0W L.0-C- 10

c 3

' 3

In each of these six cases, the constants of pro-portionality have been estimated from the 195lung cancers among men aged 40-79 smoking1-40/day, so the estimates have coefficients ofvariation of ±7 %. (The non-smokers are known toobey a different age distribution (Doll, 1971).)The age-specific incidence rate and the dose-

response relationship have both been obtained on

(age - 34 ) (age -22 5)45 (age)724.26.



Years (log scale)

Fig. 2 Age-specific incidence rates, standardisedfor dose. The numbers of onsetsin each group are given, and 90% confidence intervals are given as vertical lines.



If we take (age-22*5) as a measure of duration ofsmoking, we have shown that incidence may beproportional to (age-22*5)4 5 but cannot beproportional to any substantially different power ofduration of smoking, and we have compared thelinear relationship in which incidence is proportionalto (dose+l) (where dose=cigarettes/day) with thecurved relationship in which incidence is pro-portional to (dose+6)2, and found the latterpreferable. Combining our dose and time analyses,we have preferred the relationship:

incidence proportional to (dose+ 6)2. (Age-22 5)4, 4'5, or s,and, depending on whether the exponent is4, 4 5, or 5, the coefficient of proportionality will beestimated from our data as 1-74 x 10-12,0.2730 x10-12, or 0-0423 x 10-12.

Ifwe had fitted a linear dose-response relationship,then the fitted linear relationship would have been:

incidence proportional to (dose+ 1). (Age-22. 5)4, 4s5, or s,

and, depending on whether the exponent is4, 4-5, or 5, the coefficient of proportionality will beestimated as 604 x 10-12, 9-46 x 10-12, or 1-46 x10-12.

the assumption that the age-specific incidencerates for men smoking different amounts are

parallel. To test this assumption, we have lookedfor an 'interaction' between (dose+6)2 and(age-22-5)4'5 among the smokers of 1 40/day aged40-79. This yielded a chi-square on one degree offreedom of 0-14, indicating that the model fitsexcellently and suggesting that the age-specificincidence rates for light and heavy smokers areindeed adequately described as being parallel.


As was already known (for example, Doll, 1971),incidence rates in the age range 40-79 are pro-portional to a power of duration of smoking, inexcellent conformity with the predictions of simplemultistage model theory. To estimate this power,we must calculate smoking duration by subtractinga chosen quantity (we have chosen 22-5 years)from the age. Having done this, we find that thebest-fitting exponent is 4-5 ±0t3. The adequate fit ofthis model is not a critical test of simple multistagemodel theory, of course, but it is nevertheless

15 2b 30 40 6 80


on 7 April 2009 jech.bmj.comDownloaded from

Page 10: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Cigarette smoking and bronchial carcinoma

gratifying. The three features of note in this analysisare, firstly, that (as with the animal data) the datathemselves cannot define the quantity to subtractfrom age; secondly, that the only two whole numberexponents of duration that fit the data naturally are4 and 5; and thirdly, that if the whole analysis isrepeated, including the men aged over 80, then thesmooth increase in age-specific rates which we haveseen between the ages of 40 and 79 suddenly reversesat the age of 80; even after standardisation for dose,men aged 80-84 or 85 + both have lung cancerincidence rates which are only about half the ratesfor men aged 75-79. Whether this shortfall is dueentirely to under-diagnosis, to selective survival*and to unreported cohort differences in smokinghabits during early life, or whether in additionsome aspect of the biology of extreme old agedoes indeed reduce the risk of carcinoma induc-tion, is unclear. (Although under each of our fittedformulae we would expect about two lung cancersamong smokers aged 20-39, we actually observednone. We attribute this shortfall entirely to chance,as national lung cancer death rates exhibit nosudden changes at the age of 40.)The upward curvature of the dose-response

relationship is intriguing, and runs rather counter tothe linear dose-response relationship estimated byWhittemore and Altshuler (1976) when studyingsome less carefully restricted data from this study.The minimal claim that we make for it is that itindicates a need to re-examine data from otherstudies to see if they, too, exhibit upward curvaturein the range 0-40 cigarettes/day when attention isrestricted to the age range 40-79 among men whohave since the age of about 20 smoked onlycigarettes. It also indicates a need to discoversome way of estimating cigarette smoke dosage tothe stem cells of the upper bronchi more accuratelythan is possible by simple smoking questionnaires.Dosage to the alveoli can of course be estimated bythe uptake of CO or nicotine into the blood, eitherby measuring CO or nicotine directly, or by measur-ing nicotine breakdown products in the blood; andit would probably be helpful to be able to relate thelung cancer risk to such objective measurements ofdose to the periphery ofthe lung. It is, however, moredifficult to imagine reliable methods of assessingthe dose deposited on the upper bronchi. Phlegmproduction in response to cigarette smoke chiefly

*Smokers reporting a given daily cigarette consumption differ in theeffective amounts they really smoke and perhaps they differ in othercorrelates of lung cancer in their life-styles or genotypes. Conse-quently, some are at less risk of lung cancer than the average for suchsmokers, and those at lower risk are presumably more likely tosurvive beyond the age of 80. Despite the lack of evidence among menaged 75 to 79 of any such effects, at least some small part of theshortfall in lung cancer beyond the age of 80 may thus be due toselective survival.


originates from the upper bronchi, and so mightperhaps help us to assess bronchial dose. It hasalready been reported (Rimington, 1971) that amongmen who say they smoke similar amounts oftobacco, those with chronic phlegm productionfrom their upper bronchi are at considerablygreater age-standardised risk of lung cancer, butit is not clear how much this is due to differences inthe effective bronchial dose per cigarette and howmuch it is due to effects of nature or nurture whichpredispose to both mucus hypersecretion andcancer.


The number of cigarettes/day that we record maybe a very inaccurate measure indeed of the extent towhich bronchial epithelial stem cells are affected,which we shall refer to as the 'true insult'. It istherefore possible that the shape of our graph of

_s 500-a

*-'a.u 400


0a 300->1








"true:' butunobse rvable,



0 10 20 30Dose (cigarettes / day)

Fig. 3 Hypothetical dose-response data, with shapedistorted.After men have been divided on the basis of their currentsmoking habits into non-smokers, light smokers (mean10/day), medium smokers (mean 20/day), and heavysmokers (mean 30/day), suppose the mean 'true insult'rate (in arbitrary units) in the four groups is 0, 1 5x,2x, and 2'5x, and suppose the relation between meancancer incidence and mean true insult (open circles) is asgiven by the curved line. The relation between cancerincidence and reported cigarette consumption (plussigns) will then be misleadingly observed as a straightline.

on 7 April 2009 jech.bmj.comDownloaded from

Page 11: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Richard Doll and Richard Peto

incidence against cigarettes/day might be materiallydifferent from the shape of a graph of incidenceagainst 'true insult'. A particular example ofdifferent shapes is illustrated in Fig. 3. At first sight,such systematic discrepancies between reportedconsumption and 'true insult' seem extremelyimplausible, but in fact they can be shown to arisefrom quite general and plausible assumptions.



For example, the mean 'true insult' per cigarette iscertainly very different for different individuals, andif people who usually get less nicotine per cigarettetend to smoke a few more cigarettes to make up forthis, then some of them will be removed from thelight into the medium group, or from the mediuminto the heavy group, causing exactly the effectspostulated in Fig. 3. Such effects can likewise beexpected if smokers who usually smoke a certainnumber of cigarettes by a certain time of day are atall influenced in the number they usually smokethereafter by their usual blood nicotine at thattime. (Indeed, the biases indicated in Fig. 3 willonly be avoided if, implausibly, pharmacologicfactors are entirely irrelevant to cigarette con-




If men were divided into light, medium, and heavysmokers at the age of 25, then the division wouldobviously not classify everyone exactly as theywould be classified at the age of 65; some of thosewho are light smokers at 65 were once heavy or

medium smokers, while some of those who are

heavy smokers at 65 were once light or mediumsmokers. Thus, the ratio of the mean lifelongsmoking of those who are light smokers at 65 tothat of those who are heavy smokers at 65 is likelyto be less extreme than the ratio of the mean current(at 65) consumption of these two groups of men.

Since the 'observed consumption' for those who dieof lung cancer in an epidemiological study isusually recorded in middle or old age, the biasesinvoked in Fig. 3 again follow.



If, in the low dose groups, most of the lung cancersarise among people with some special constitutionalor occupational synergy with smoking (or withsome especially harmful manner of inhaling), andif most such people in the high dose groups die ofearly lung cancer, then in old age the survivors in thehigh dose group will be more 'cancer-proof' than

average, reducing the upward curvature of thedose-response relationship. (However, such effectsmight also cause downward curvature in the re-lationship of incidence with age, and this was notobserved in the age range 40-79).

ConclusionThe general effect of random errors in dosimetry islikely to make the relationship between risk andmeasured dose less extreme*-that is, to bias ahigher powered relationship with dose (for example,incidence proportional to the square of 'trueinsult') into a lower powered one (for example,incidence directly proportional to recorded con-sumption). This is a true bias, in that it is not inexpectation reduced by doing larger and largerstudies. The fact that we have observed statisticallysignificant upward curvature in spite of this biassuggests that with perfect dosimetry we would haveobtained even greater upward curvature.

If correct, our suggestion of a quadratic (or higherpowered) dose-response relationship in the range1-40 cigarettes/day has two mechanistic implications.Firstly, the shape of the dose-response relationshipceases to be evidence against the suggestion thattwo (or more) stages in the production of lungcancer may be strongly affected by smoking.Secondly, it may help explain the well-known factthat lung carcinomas arise chiefly in the upperbronchi. The surface area of the lower bronchi sogreatly exceeds that of the upper bronchi that if thetotal dose deposited from the smoke droplets ontothe bronchi between (say) bronchial divisions 1 and8 were of the same order as that deposited between8 and 16, then the dose per unit area in the lowerbronchi would be much less. If the risk per cell weresimply proportional to dose, this would not affectthe total risk, for although there would be a farlower dose per cell in the lower bronchi, there wouldbe correspondingly far more cells at risk. If, however,the risk per cell were proportional to dose2, thenthis dilution of the total dose deposited over alarger area would greatly reduce the total risk, andmight partly or wholly account for the fact thatbronchial carcinomas chiefly arise in the upperbronchi.

Reprints from Sir Richard Doll, Regius Professor ofMedicine, Radcliffe Infirmary, University of Oxford.

The study was started in collaboration with SirAustin Bradford Hill. Barbara Hafner collected

*There is an analogy here with the fact that in the standard leastsquares regression of y on x, random errors in x will bias the regressioncoefficient towards zero, if we imagine that we are estimating theexponent of dose to which incidence is proportional by examiningthe regression coefficient of y=log (incidence) on x=log (recordedconsumption) = log (true insult) +error.


on 7 April 2009 jech.bmj.comDownloaded from

Page 12: Cigarette smoking and bronchial carcinoma: dose and time ... · Epidemiological data on smoking and bronchial carcinoma are more extensive than for anyother causeofhumancarcinomas,

Cigarette smoking and bronchial carcinoma

and maintained the records, assisted by Mrs.Sutherland, Mrs. Norton, and Mrs. Thompson.Richard Gray assisted with computing, and RuthRohrbasser typed this report.


Armitage, P. (1971). Discussion of 'The Age Distributionof Cancer'. Journal of the Royal Statistical Society,series A, 134, 155-156.

Armitage, P., and Doll, R. (1961). Stochastic models forcarcinogenesis. Proceedings Fourth Berkeley Symposiumon Mathematical Statistics and Probability, 4, 19-38.University of California Press: Berkeley.

Doll, R. (1971). The Age Distribution of Cancer:implications for models of carcinogenesis (withdiscussion). Journal of the Royal Statistical Society,series A, 134, 133-166.

Doll, R., and Hill, A. B. (1964). Mortality in relation tosmoking: ten years observations of British doctors.British Medical Journal, 1, 1399-1410 and 1460-1467.

Doll, R., and Peto, R. (1976). Mortality in relation tosmoking: 20 years observations on male Britishdoctors. British Medical Journal, 2, 1525-1536.

Major, I. R., and Mole, R. H. (1978). Myeloid leukaemiain X-ray irradiated mice. Nature, 272, 455-456.

Peto, R. (1977). Epidemiology, multistage models andshort-term mutagenicity tests. In: Origins of HumanCancer, pp. 1403-1428. Cold Spring Harbor Publi-cations: New York.

Peto, R., and Lee, P. N. (1973). Weibull distributions forcontinuous-carcinogenesis experiments. Biometrics,29, 457470.

Rimington, J. (1971). Smoking, chronic bronchitis andlung cancer. British Medical Journal, 2, 373-375.

Whittemore, A., and Altshuler, B. (1976). Lung cancerincidence in cigarette smokers: further analysis ofDoll and Hill's data for British physicians. Biometrics,32, 805-816.


on 7 April 2009 jech.bmj.comDownloaded from