CHAPTER 2

Spatial sampling and censoring

Adrian Baddeley
University of Western Australia

When a spatial pattern is observed through a bounded window, inference about the pattern is hampered by sampling effects known as "edge effects". This chapter identifies two main types of edge effects: size-dependent sampling bias and censoring effects. Sampling bias can be eliminated by changing the sampling technique, or 'corrected' by weighting the observations. Censoring effects can be tackled using the methods of survival analysis.

Introduction

We shall consider spatial patterns like those sketched in Figure 2.1. The pattern may consist of distinct features or objects, such as points (which might represent the locations of trees, meteorite impacts or bird nests), line segments (geological faults, microscopic fibres or cracks) or other shapes (biological cells or mineral grains). Alternatively the pattern may be simply a two-colour image, for example representing the presence or absence of vegetation in a region.

Figure 2.1 Four spatial patterns, each observed within a rectangular sampling window. Left to right: (a) points; (b) line segments; (c) irregular shapes; (d) mosaic.

The exploratory data analysis of such patterns begins by computing summary statistics analogous to the sample moments of numerical data. Examples are the average number of objects per unit area, and the empirical distributions of object sizes, of object orientations, and of distances between pairs of objects.

Each pattern in Figure 2.1 has been observed within a rectangular sampling 'window' W while the pattern itself extends beyond the window, with potentially infinite extent. Normally we assume the spatial pattern is a realisation of a stationary random spatial process X. Since information is available only within a window, sampling effects known as 'edge effects' arise, which affect statistical inference.

Suppose for example that we wish to estimate the average number of fibres per unit area in paper, from a microscope image like Figure 2.1(b). In counting fibres, should we include those which cross the boundary of the window and apparently extend outside? Ignoring such fibres would clearly underestimate the fibre density, while including them would yield an overestimate.

In general, when observation of a spatial pattern is restricted to a bounded window, two types of edge effects arise:

sampling bias is present when the probability of observing a geometrical object depends on its size or shape;

censoring effects occur when we are prevented from observing the full extent of a geometrical object that lies partially within the window.

Spatial sampling bias was encountered in the 1940's by statisticians working on textile applications [13, 14, 18] but the first general treatment was presented by Miles [54]. General techniques for eliminating sampling bias were developed by Miles [55], Lantuéjoul [50, 51], Gundersen [33, 34] and Jensen & Sundberg [47]. For the special case of point processes, edge effects have been extensively discussed, and corrections introduced by Ripley, Lantuéjoul, Hanisch, Stoyan, Ohser and others; see [21, 40, 56, 58, 60], [64, p. 246]. For surveys see [61, chap. 3], [69, pp. 122–131], [15, chap. 8], [2].

Much more recently, it was noticed that there is an analogy between edge effects for spatial processes and the random censoring of lifetimes in survival analysis. Laslett [52, 53] drew attention to this analogy for spatial patterns of line segments. The observed lengths of line segments, after they have been 'clipped' within a sampling window, can be compared to censored survival times.

Wijers [72] found the optimal estimator of the segment length distribution. Baddeley, Gill and Hansen [3, 4, 42, 41] noted a similar analogy for distance distributions in point patterns and random sets. Zimmerman [73] used artificial censoring to control edge effects.

Edge effects are severe in dimensions d > 2 and when the window is small or complex in shape. The most effective strategies which have been adopted to deal with them are as follows.

unbiased sampling rules: Edge effects depend on the sampling rule which we use to decide which objects should be counted or sampled as lying 'in' the window of observation. It may be possible to use an alternative sampling rule which has no sampling bias.

additive statistics: Certain summary statistics are not susceptible to edge effects, namely those which are additive functionals of the pattern. It may be possible to modify the statistic of interest to have this property.

data-dependent weighting: Sampling bias may be corrected by weighting the contribution from each sampled object by the reciprocal of its sampling probability. This is closely related to the Horvitz-Thompson device in survey sampling.

survival analysis: Censoring effects may be countered using the methods of survival analysis.

This chapter describes the four strategies in detail. Section 2.1 discusses the general issue of sampling bias. Sections 2.2, 2.3 and 2.4 outline the three strategies of unbiased sampling rules, additive statistics, and data-dependent weighting. In section 2.5 we apply these techniques to the special case of point patterns.

Sections 2.6 and 2.8 develop the analogy between edge effects and random censoring. In section 2.6 we recall some general concepts of random censoring, study censoring effects for point processes, and construct Kaplan-Meier style estimators of the standard point process functions F, G and K. Section 2.8 treats censoring effects for general random sets.

2.1 Spatial sampling bias

In this section, we discuss spatial sampling bias in the general case where the spatial pattern X consists of distinct objects $X_i$ in $\mathbb{R}^d$ (which might be points, lines or other compact sets).

The observation window W is a fixed, known compact set in $\mathbb{R}^d$. Visualise the sampling situation as in Figure 2.2.

Figure 2.2 A spatial pattern of distinct objects $X_i$ observed through a rectangular sampling window W.

It is useful to distinguish between a clipping window and a sampling frame. A clipping window supplies information within W only; the data consist of the 'clipped' pattern $X \cap W$. For example, the images produced by cameras and satellite sensors are clipped within a rectangular boundary. On the other hand, a sampling frame is a region W of known size and shape, outside which we may still be able to observe the pattern. In forestry and ecology, one can sample a field of vegetation by throwing a rectangular wooden frame at random into the field. In optical microscopy, it is common to delineate a rectangular sampling frame of known size within the field of view. The (vaguely defined) visible region outside W is sometimes called the "guard area".

The pioneering paper of Miles [54] discussed spatial sampling bias in general and treated two basic sampling operations:

plus-sampling, where we sample any object $X_i$ that intersects the frame, $X_i \cap W \neq \emptyset$;

minus-sampling, where we sample only those objects that lie within the frame, $X_i \subseteq W$.

These are illustrated in Figure 2.3. Plus-sampling uses information from outside the frame W, while minus-sampling requires only a clipping window.

It is intuitively clear that plus-sampling introduces a sampling bias in favour of larger objects $X_i$, while minus-sampling favours smaller objects. For example, under minus-sampling it is impossible to sample an object larger than the window.
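
As a concrete illustration (not part of the original treatment), the two rules are easy to state in code. The following minimal Python sketch assumes each object is represented by a dense array of points covering it and that the frame W is an axis-aligned rectangle; the function and variable names are illustrative only.

```python
import numpy as np

def plus_sample(obj, frame):
    """Plus-sampling: keep the object if it meets the frame W.

    obj   : (n, 2) array of points densely covering the object X_i
    frame : (xmin, xmax, ymin, ymax) of the rectangular frame W
    """
    xmin, xmax, ymin, ymax = frame
    inside = ((obj[:, 0] >= xmin) & (obj[:, 0] <= xmax) &
              (obj[:, 1] >= ymin) & (obj[:, 1] <= ymax))
    return bool(inside.any())      # some point of the object lies in W

def minus_sample(obj, frame):
    """Minus-sampling: keep the object only if it lies entirely inside W."""
    xmin, xmax, ymin, ymax = frame
    inside = ((obj[:, 0] >= xmin) & (obj[:, 0] <= xmax) &
              (obj[:, 1] >= ymin) & (obj[:, 1] <= ymax))
    return bool(inside.all())      # every point of the object lies in W
```

Note that plus-sampling needs information from outside the frame (the part of an object beyond W must be available), whereas minus-sampling can be applied to clipped data alone, mirroring the distinction drawn above.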

Figure 2.3 Plus-sampling (left) and minus-sampling (right) in a rectangular sampling frame or window W. The shaded objects are sampled.

These and other sampling rules can be analysed using the methods of marked point processes described in Section 1.7.1 of Chapter 1. Regard the objects $X_i$ as arising from a germ-grain process, that is, represent each $X_i$ as the translation $X_i = K_i + x_i$ of a compact set $K_i$ to a location $x_i$, where $\{(x_i, K_i)\}$ is a stationary marked point process in $\mathbb{R}^d$ with marks in the space $\mathcal{K}$ of compact sets in $\mathbb{R}^d$. The point process of germs $\{x_i\}$ has intensity $\lambda$ and the grains $K_i$ have common distribution Q. Note that the representation $X_i = K_i + x_i$ is a mathematical convenience; it is always possible to represent a random process of compact sets in this way [71], but the germs $x_i$ are not necessarily observable in practice.

For plus-sampling, the Campbell-Mecke formula (equation (1.26) in Theorem 9 of Chapter 1, section 1.8.3) yields, for the expected number of objects sampled,
\[
E \#\{i : X_i \cap W \neq \emptyset\} = E \sum_i 1\{X_i \cap W \neq \emptyset\}
= \lambda\, E^0\!\left[\int_{\mathbb{R}^d} 1\{(x + K_0) \cap W \neq \emptyset\}\, dx\right]
= \lambda\, E^0\!\left[\left|W \oplus \check{K}_0\right|\right].
\]
Here $K_0$ is "the typical grain", a random compact set with distribution Q, and $E^0$ may be interpreted in this context as the expectation with respect to Q. The symbol $|\cdot|$ denotes Lebesgue area, and $W \oplus \check{K}$ is the dilation of W by K [64],
\[
W \oplus \check{K} = \{x \in \mathbb{R}^d : (x + K) \cap W \neq \emptyset\}.
\]
The operation of dilation is illustrated in Figure 2.4.

Figure 2.4 Top row: two plane sets A and B. Bottom left: dilation $A \oplus \check{B}$. Bottom right: erosion $A \ominus \check{B}$.

More generally, for some functional $f : \mathcal{K} \to \mathbb{R}$, consider the expectation of the sample total of the values of $f(X_i)$ for all plus-sampled objects $X_i$. The Campbell-Mecke formula gives
\[
E \sum_{X_i \cap W \neq \emptyset} f(X_i) = E \sum_i f(X_i)\, 1\{X_i \cap W \neq \emptyset\}
= \lambda\, E^0\!\left[\int_{\mathbb{R}^d} f(K_0 + x)\, 1\{(K_0 + x) \cap W \neq \emptyset\}\, dx\right].
\]
Assume f is translation-invariant,
\[
f(K + x) = f(K) \quad \text{for all } x \in \mathbb{R}^d,\ K \in \mathcal{K},
\]
so that $f(X_i) = f(K_i)$ and we get
\[
E \sum_{X_i \cap W \neq \emptyset} f(X_i)
= \lambda\, E^0\!\left[f(K_0) \int_{\mathbb{R}^d} 1\{(K_0 + x) \cap W \neq \emptyset\}\, dx\right]
= \lambda\, E^0\!\left[f(K_0)\, \left|W \oplus \check{K}_0\right|\right].
\]

Hence
\[
\frac{E \sum_{X_i \cap W \neq \emptyset} f(X_i)}{E \#\{i : X_i \cap W \neq \emptyset\}}
= \frac{E^0\!\left[f(K_0)\,\left|W \oplus \check{K}_0\right|\right]}{E^0\!\left[\left|W \oplus \check{K}_0\right|\right]}. \tag{2.1}
\]
The right-hand side of (2.1) is the expectation of f under the size-biased distribution $Q^*$ which has density
\[
\frac{dQ^*}{dQ}(K) \propto \left|W \oplus \check{K}\right|
\]
with respect to Q. That is, plus-sampling introduces a sampling bias proportional to the area of the dilation $W \oplus \check{X}_i$ of each object $X_i$. This is a bias in favour of larger objects.

Clearly a similar analysis can be applied to minus-sampling, and indeed to any sampling rule which decides whether to include or exclude each object $X_i$ based only on information about $X_i$.

Theorem 1 Let $\{X_i\}$ be a stationary germ-grain model in $\mathbb{R}^d$ with germ intensity $\lambda$ and compact grains with distribution Q. Consider any sampling rule such that $X_i$ is included in the sample if and only if $I(X_i) = 1$, where $I : \mathcal{K} \to \{0, 1\}$ is measurable. Then for any translation-invariant, measurable $f : \mathcal{K} \to \mathbb{R}$,
\[
E \sum_{\text{sample}} f(X_i) = E \sum_i I(X_i)\, f(X_i) = \lambda\, E^0\left[f(K_0)\, \gamma(K_0)\right] \tag{2.2}
\]
where, for $K \in \mathcal{K}$,
\[
\gamma(K) = \int_{\mathbb{R}^d} I(K + x)\, dx \tag{2.3}
\]
is the volume of the set of all translation vectors x such that $K + x$ would be included in the sample. The objects sampled by this rule are $\gamma$-weighted in the sense that
\[
\frac{E \sum_{\text{sample}} f(X_i)}{E(\text{number in sample})}
= \frac{E \sum_i I(X_i)\, f(X_i)}{E \sum_i I(X_i)}
= \frac{E^0\left[f(K_0)\, \gamma(K_0)\right]}{E^0\left[\gamma(K_0)\right]} \tag{2.4}
\]
is the expectation of f under the distribution which is $\gamma$-weighted with respect to Q.
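
The bias factor (2.3) can be checked numerically for simple shapes. The sketch below is only an illustration under stated assumptions (K a disc of radius $\rho$, W an $a \times b$ rectangle, all numbers arbitrary): it estimates the integral in (2.3) for plus-sampling by Monte Carlo and compares it with the exact area of the dilation $W \oplus \check{K}$.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, rho = 4.0, 3.0, 0.5        # W = [0,a] x [0,b], K = disc of radius rho

# For plus-sampling, I(K + x) = 1 iff the translated disc meets W,
# i.e. iff the distance from x to the rectangle W is at most rho.
n = 200_000
x = rng.uniform(-rho, a + rho, n)     # bounding box of the support of I(K + .)
y = rng.uniform(-rho, b + rho, n)
dx = np.maximum(np.maximum(-x, x - a), 0.0)
dy = np.maximum(np.maximum(-y, y - b), 0.0)
hit = np.hypot(dx, dy) <= rho

gamma_mc = (a + 2 * rho) * (b + 2 * rho) * hit.mean()     # Monte Carlo value of (2.3)
gamma_exact = a * b + 2 * rho * (a + b) + np.pi * rho**2  # area of W dilated by the disc
print(gamma_mc, gamma_exact)   # the two agree up to Monte Carlo error
```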

The theorem applies to minus-sampling, which turns out to have sampling bias factor $\gamma(K) = \left|W \ominus \check{K}\right|$, where $W \ominus \check{K}$ is the erosion
\[
W \ominus \check{K} = \{x \in W : K + x \subseteq W\}.
\]
The operation of erosion is also illustrated in Figure 2.4.

2.2 Unbiased sampling rules

For applications in optical microscopy, forestry, ecology and other fields which favour manual counting methods, the most appropriate solution to the problem of spatial sampling bias is to avoid it altogether by adopting an alternative sampling rule. An unbiased sampling rule is one for which the sampling bias factor $\gamma(K)$ in (2.3) is constant. This produces 'unbiased samples' in the sense that the right side of (2.4) is then simply the expectation of f under Q, the unweighted distribution of the typical grain.

This situation is typical of estimators in spatial statistics, most of which are not unbiased but instead are ratios of two unbiased consistent estimators
\[
\hat\theta = \frac{Y}{X} \quad \text{where} \quad \theta = \frac{EY}{EX}
\]
with $X, Y \ge 0$, $P\{X > 0\} > 0$ and $X = 0 \Rightarrow Y = 0$, typically arising as the mean of a weighted empirical distribution where the weights are random variables [6, 61]. We call such estimators "ratio-unbiased" [2] and accept this property as a substitute for the generally unobtainable unbiasedness.

The associated point rule introduced by Miles [55] is an unbiased sampling rule. Associate with each object $K \in \mathcal{K}$ a unique point $c(K) \in \mathbb{R}^d$, such as the centroid, lowest point, or circumcentre of K. The choice of this 'associated point' is arbitrary provided it is equivariant under translations, $c(K + x) = c(K) + x$ for all $K \in \mathcal{K}$ and $x \in \mathbb{R}^d$. Then we sample an object $X_i$ if and only if its associated point $c(X_i)$ falls in the window W. See Figure 2.5.

The associated point $c(X_i)$ should not be confused with the germ point $x_i$ featuring in the germ-grain process construction. The germ point is a mathematical convenience which need not be observable in practice. The associated point $c(X_i)$ is observable since it can be determined from $X_i$ alone.

For any $K \in \mathcal{K}$ we have that $K + x$ is sampled if and only if $c(K + x) \in W$, or equivalently if $c(K) + x \in W$, that is, if $x \in (W - c(K))$. The latter set is congruent to W.

Figure 2.5 Miles' associated point rule. An object $X_i$ is sampled if and only if its associated point $c(X_i)$ (marked +) falls in the sampling frame W. The shaded objects are sampled.

Thus the bias factor (2.3) is
\[
\gamma(K) = \int_{\mathbb{R}^d} 1\{c(K + x) \in W\}\, dx = |W - c(K)| = |W|.
\]
Hence the associated point rule is unbiased.

Additionally, since the value of $\gamma(K)$ is known, the 'sample total' formula (2.2) can be used in practice:
\[
E \sum_{\text{sample}} f(X_i) = \lambda\, |W|\, E^0[f(K_0)] \tag{2.5}
\]
so that
\[
\hat\lambda = \frac{\text{number in sample}}{|W|} \tag{2.6}
\]
is an unbiased estimator of $\lambda$.

An alternative method, Gundersen's tiling rule [33, 34], has become very popular in microscopy [36, 35]. Its practical implementation is sketched in Figure 2.6. Any object $X_i$ intersecting the rectangular sampling frame W will be sampled, provided it does not intersect any of the 'forbidden' lines marked in bold (namely one side of the rectangle and two half-infinite lines extending from it).

An equivalent description of the tiling rule is as follows. Tessellate $\mathbb{R}^2$ with copies of W, say $W_{m,n} = W + (ma, nb)$ where a, b are the side lengths of W. Order the tiles $W_{m,n}$ according to an arbitrary total order.

Figure 2.6 Gundersen's tiling rule in a rectangular sampling window W. The shaded objects are sampled.

Then we sample $X_i$ in tile W if it intersects W and does not intersect any tile which is 'earlier' in this ordering. Here we have used the ordering in which the tiles to the left of W or below W are 'earlier' than W,
\[
W_{m,n} \prec W_{m',n'} \quad \text{iff} \quad m < m' \text{ or } (m = m' \text{ and } n < n').
\]
It is easy to check that this ordering implies the 'forbidden line' rule enunciated above.

The key fact is that any object $X_i$ will be sampled by exactly one of the tiles $W_{m,n}$ (namely the 'earliest' tile which intersects $X_i$). Fix $K \in \mathcal{K}$ and let $S(K; m, n)$ be the set of all translations $x \in \mathbb{R}^2$ such that $K + x$ will be sampled in $W_{m,n}$. The sets $S(K; m, n)$, $m, n \in \mathbb{Z}$, are congruent, disjoint and cover the whole of $\mathbb{R}^2$. Hence $|S(K; m, n)| = |W|$ for all m, n and
\[
\gamma(K) = |S(K; 0, 0)| = |W|.
\]
The tiling rule is therefore unbiased, and again we have (2.5)–(2.6).

Both the associated point rule and the tiling rule generally require a guard area around the sampling frame, in order to determine which objects $X_i$ are sampled. This can be dispensed with in some cases, for example, when it is known beforehand that all $X_i$ are convex. The tiling rule is easier to perform manually, but applies only when the window shape tessellates the plane. The associated point rule is easy for computers to carry out and applies to arbitrary window shapes.
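
A minimal computational sketch of the associated point rule, assuming the centroid is used as associated point, the frame is an axis-aligned rectangle, and each object is supplied as an array of points covering it (so that its centroid can be computed, which in practice presumes a guard area). It returns the sampled objects and the unbiased intensity estimate (2.6). Names and data formats are illustrative, not prescriptive.

```python
import numpy as np

def associated_point_sample(objects, frame):
    """Miles' associated point rule with the centroid as associated point.

    objects : list of (n_i, 2) arrays of points covering each object X_i
    frame   : (xmin, xmax, ymin, ymax) of the sampling frame W
    Returns the sampled objects and the estimate (2.6) of the intensity
    lambda (mean number of objects per unit area).
    """
    xmin, xmax, ymin, ymax = frame
    sampled = []
    for obj in objects:
        cx, cy = obj.mean(axis=0)      # centroid; equivariant under translation
        if xmin <= cx <= xmax and ymin <= cy <= ymax:
            sampled.append(obj)
    area_W = (xmax - xmin) * (ymax - ymin)
    return sampled, len(sampled) / area_W
```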

2.3 Additive functionals

An extension of the foregoing strategy is to allow objects $X_i$ to be counted potentially more than once, but to compensate by assigning each object a fractional weight [19, 38, 47, 55].

For example, in a spatial pattern of line segments, Hall [38], [39, pp. 216–217] observed that since every line segment has two endpoints, we could estimate the expected number of segments per unit area by counting the number of segment endpoints in W and dividing by $2|W|$. The sample of line segments intersecting W (i.e. selected by plus-sampling) can be converted into an unbiased sample by assigning a weight of $\frac{1}{2}$ to segments with only one endpoint in W, and a weight of 1 to segments with both endpoints in W. This is a variant of Miles' associated point method in which each segment has two associated points, namely its endpoints. (A small computational sketch of this rule is given at the end of this section.) Independently, Jensen and Sundberg [47] proposed the use of $m > 1$ associated points for each object.

In general, let $u(X_i, W)$ be the weight which we attach to an object $X_i$ when it is sampled in window W. If we can arrange that the identity
\[
\int_{\mathbb{R}^d} I(K + x)\, u(K + x, W)\, dx = 1 \tag{2.7}
\]
holds for all fixed K and W, where I is the sampling rule as in Theorem 1, then the bias is corrected, in the sense that (2.2) is replaced by
\[
E \sum_{\text{sample}} u(X_i, W)\, f(X_i) = \lambda\, E^0[f(K_0)]. \tag{2.8}
\]
Following are some mechanisms which guarantee (2.7).

• Equip each object $X_i$ with $m > 1$ different associated points $c_1(X_i), \ldots, c_m(X_i)$, and weight $X_i$ by the proportion of associated points of $X_i$ which fall in the sampling window [47]. Thus
\[
u(K, W) = \frac{1}{m} \sum_{j=1}^m 1\{c_j(K) \in W\};
\]
it is easy to verify (2.7).

• Weight each object K proportionally to its area of intersection with the window,
\[
u(K, W) = \frac{|K \cap W|}{|K|};
\]
standard integral geometric results give (2.7).

• Tessellate the plane with copies of the sampling window, and weight $X_i$ by the reciprocal of the number of tiles which intersect $X_i$. We can verify (2.7) using arguments like those for Gundersen's tiling rule.

• Other weight functions in $\mathbb{R}^2$ include the integral of curvature of $W \cap \partial X_i$, if the boundary $\partial X_i$ of each $X_i$ is a simple closed curve; and one minus a quarter of the number of boundary intersections, $1 - \#(\partial W \cap \partial X_i)/4$, if both $X_i$ and W are convex.
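
As promised above, here is a small sketch of Hall's endpoint rule for line segment patterns: every endpoint observed inside W contributes 1/2 to the count, which is equivalent to the weights 1/2 and 1 described in the text. The data format (one row per segment, with endpoints that may lie outside the frame) is an assumption made only for this illustration.

```python
import numpy as np

def segment_intensity(segments, frame):
    """Hall's endpoint rule: estimate the mean number of segments per unit
    area by counting segment endpoints inside W and dividing by 2|W|.

    segments : (n, 4) array, each row (x0, y0, x1, y1); only the endpoints
               that fall inside W contribute to the count.
    """
    xmin, xmax, ymin, ymax = frame
    ends = segments.reshape(-1, 2)                 # all endpoints, shape (2n, 2)
    inside = ((ends[:, 0] >= xmin) & (ends[:, 0] <= xmax) &
              (ends[:, 1] >= ymin) & (ends[:, 1] <= ymax))
    area_W = (xmax - xmin) * (ymax - ymin)
    return inside.sum() / (2.0 * area_W)
```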

2.4 Horvitz-Thompson estimators

Miles [54] showed that spatial sampling bias can be corrected by weighting each sampled object $X_i$ by a quantity analogous to the reciprocal of sampling probability. Lantuéjoul [50, 51] further explored and extended the results.

Weighting by the reciprocal of sampling probability is a well-known technique which goes back to the Horvitz-Thompson estimator of survey sampling theory.

2.4.1 Horvitz-Thompson estimator in a finite population

We shall first review this estimator [46], [12, pp. 259–261], [49, pp. 291, 313, 428], [10]. Suppose we observe a (non-independent, non-uniform) random sample from a finite population. The sample size may be random; the only condition is that $\pi_i = P\{i \in \text{sample}\}$ be known and positive for all i. Then we can estimate any population total by weighting each element i in the sample by $1/\pi_i$. Let
\[
Y = \sum_{i \in \text{population}} y_i
\]
be the population total of some variable y. The Horvitz-Thompson estimator
\[
\hat{Y}_{HT} = \sum_{i \in \text{sample}} \frac{y_i}{\pi_i} \tag{2.9}
\]
is unbiased for Y since
\[
E \hat{Y}_{HT} = E\left[\sum_{i \in \text{population}} 1\{i \in \text{sample}\}\, \frac{y_i}{\pi_i}\right]
= \sum_{i \in \text{population}} \pi_i\, \frac{y_i}{\pi_i} = Y. \tag{2.10}
\]

2.4.2 Spatial Horvitz-Thompson estimators

Returning to the context of Theorem 1, consider a stationary germ-grain model with germ intensity $\lambda$ and compact grains $K_i$ with common distribution Q. The sample consists of all objects $X_i$ such that $I(X_i) = 1$. Adapting the Horvitz-Thompson approach to this spatial random sample is more complicated than (2.10) in that we require averages over an infinite population, and we cannot simply exchange the expectation and summation. The solution is again to use the Campbell-Mecke formula (Chapter 1, section 1.8.3). Suppose we weight each sampled object $X_i$ by $1/\gamma(X_i)$, the reciprocal of the sampling bias factor defined at (2.3). This requires that $\gamma(X_i)$ be known and almost surely positive for the typical grain. Then
\[
E\left[\sum_{i \in \text{sample}} \frac{1}{\gamma(X_i)}\right] = \lambda,
\qquad
E\left[\sum_{i \in \text{sample}} \frac{f(X_i)}{\gamma(X_i)}\right] = \lambda\, E^0 f(K_0).
\]
Hence
\[
\frac{E\left[\sum_{i \in \text{sample}} f(X_i)/\gamma(X_i)\right]}{E\left[\sum_{i \in \text{sample}} 1/\gamma(X_i)\right]} = E^0 f(K_0) \tag{2.11}
\]
so that the Horvitz-Thompson style estimator
\[
\frac{\sum_{i \in \text{sample}} f(X_i)/\gamma(X_i)}{\sum_{i \in \text{sample}} 1/\gamma(X_i)} \tag{2.12}
\]
is ratio-unbiased, and (under suitable regularity conditions) consistent and approximately unbiased for $E^0 f(K_0)$.

For example, the bias introduced by plus-sampling can be corrected by weighting each object $X_i$ in the sample by $1/\left|W \oplus \check{X}_i\right|$. Similarly the bias of minus-sampling can be corrected using the weights $1/\left|W \ominus \check{X}_i\right|$, but here we require that $\left|W \ominus \check{K}_0\right| > 0$ almost surely, i.e. every grain must be small enough to fit in the sampling window.

Lantuéjoul [50, 51] noted that the minus-sampling correction is easy to compute when W is an $a \times b$ rectangle, for in that case $\left|W \ominus \check{X}_i\right| = (a - h)(b - v)$ where h and v are the dimensions of the smallest rectangle (aligned with W) containing $X_i$. See Figure 2.7.

Figure 2.7 Calculation of the minus-sampling correction $\left|W \ominus \check{K}\right|$ in a rectangular sampling window W.

Unfortunately the analogy with classical Horvitz-Thompson estimators is not strong enough to enable the estimation of variances in a simple way. The Sen-Yates-Grundy and similar variance estimators [49] depend on specific model assumptions in the finite population case, and in our context the required information about joint sampling probabilities would also have to be estimated from the data.
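
A hedged sketch of the spatial Horvitz-Thompson ratio estimator (2.12) under minus-sampling in a rectangular window, using Lantuéjoul's observation that $\left|W \ominus \check{X}_i\right| = (a-h)(b-v)$. Representing objects as point arrays, and taking h and v from the bounding box, are assumptions made for the illustration; the functional f is whatever translation-invariant quantity one wishes to average.

```python
import numpy as np

def minus_sampling_ht(objects, frame, f):
    """Horvitz-Thompson ratio estimator (2.12) under minus-sampling in a
    rectangular window, with Lantuejoul's weight |W erosion K| = (a-h)(b-v).

    objects : list of (n_i, 2) point arrays, the objects found entirely inside W
    f       : translation-invariant functional of an object, e.g.
              lambda K: len(K)  (number of covering points, a crude size proxy)
    """
    xmin, xmax, ymin, ymax = frame
    a, b = xmax - xmin, ymax - ymin
    num = den = 0.0
    for K in objects:
        h = K[:, 0].max() - K[:, 0].min()   # width of the bounding box of K
        v = K[:, 1].max() - K[:, 1].min()   # height of the bounding box of K
        w = (a - h) * (b - v)               # |W erosion K|; must be positive
        num += f(K) / w
        den += 1.0 / w
    return num / den
```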

2.5 Sampling bias for point processes

We now focus on the special case of spatial point processes. Sampling bias effects for point processes have been extensively discussed, and corrections introduced by Ripley, Lantuéjoul, Hanisch, Stoyan, Ohser and others; see [21, 40, 56, 58, 60], [64, p. 246]. For surveys see [61, chap. 3], [69, pp. 122–131], [15, chap. 8], [2].

In the exploratory data analysis of a point pattern, one starts typically by estimating certain distance distributions: F(t), the distribution of the distance from an arbitrary point in space to the nearest point of the process; G(t), the distribution of the distance from a typical point of the process to the nearest other point of the process; and K(t), the expected number of other points within distance t of a typical point of the process, divided by the intensity. For a homogeneous Poisson process F, G and K take known functional forms, and deviations of estimates of F, G, K from these forms are taken as indications of 'clustered' or 'inhibited' alternatives [21, 60, 61]. We will first define these functions and then consider the sampling bias effects.

2.5.1 Definitions

For $x \in \mathbb{R}^d$ and any subset $A \subseteq \mathbb{R}^d$ let
\[
\rho(x, A) = \inf\{\|x - a\| : a \in A\} \tag{2.13}
\]
be the shortest Euclidean distance from x to A, and
\[
A_{(+r)} = \{x \in \mathbb{R}^d : \rho(x, A) \le r\}, \qquad
A_{(-r)} = \{x \in A : \rho(x, A^c) > r\},
\]
where $^c$ denotes complement. For closed sets A, these are respectively the dilation and erosion of A by a ball of radius r. Write $b(x, r)$ for the closed ball of radius r and centre x in $\mathbb{R}^d$.

Let $\Phi$ be a simple point process in $\mathbb{R}^d$ which is a.s. stationary under translations, with finite positive intensity $\lambda$. For $r \ge 0$ define
\[
F(r) = P\{\rho(0, \Phi) \le r\}. \tag{2.14}
\]
By stationarity the point 0 here may be replaced by any arbitrary point x. Thus F is the cumulative distribution function of the random distance $\rho(0, \Phi)$ from an arbitrary point 0 to the nearest random point. Also define
\[
G(r) = P_0\{\rho(0, \Phi \setminus \{0\}) \le r\} \tag{2.15}
\]
where $P_0$ denotes the Palm distribution of $\Phi$ at 0. Thus G is the c.d.f. of the distance from a typical random point of the process to the nearest other random point. Alternatively, treating $\Phi$ as a random measure, we could write
\[
F(r) = P\{\Phi(b(0, r)) > 0\} \tag{2.16}
\]
\[
G(r) = P_0\{\Phi(b(0, r) \setminus \{0\}) > 0\} \tag{2.17}
\]
where $\Phi(A)$ denotes the random number of points of $\Phi$ falling in $A \subseteq \mathbb{R}^d$.

Further define Ripley's K-function
\[
K(r) = \lambda^{-1} E_0\left[\Phi(b(0, r) \setminus \{0\})\right] \tag{2.18}
\]
where $E_0$ denotes the expectation with respect to the Palm distribution $P_0$. Thus $\lambda K(r)$ is the expected number of further points within a distance r of a typical random point of the process.

It turns out (see Theorem 2 of Section 2.6.5) that F is always differentiable, while G and K need not have any special continuity properties. In fact G may be degenerate and K purely discrete, as in the case of a randomly translated lattice.

The following results are useful for estimation of F, G and K. Trivially
\[
F(r) = P\{\rho(0, \Phi) \le r\} = P\left\{0 \in \Phi_{(+r)}\right\}
\]
so that by Robbins' Theorem (Theorem 1 of Chapter 1, section 1.2)
\[
F(r) = \frac{E\left|\Phi_{(+r)} \cap A\right|_d}{|A|_d} \tag{2.19}
\]
for arbitrary Borel sets A with $0 < |A|_d < \infty$, where $|\cdot|_d$ denotes Lebesgue volume in $\mathbb{R}^d$. Using the Campbell-Mecke formula (equation (1.26) in Theorem 9 of Chapter 1, section 1.8.3)
\[
G(r) = \frac{E \sum_{x \in \Phi \cap A} 1\{\rho(x, \Phi \setminus \{x\}) \le r\}}{E \Phi(A)} \tag{2.20}
\]
and
\[
\lambda K(r) = \frac{E \sum_{x \in \Phi \cap A} \Phi(b(x, r) \setminus \{x\})}{E \Phi(A)}. \tag{2.21}
\]
The latter applies to any second-order stationary process [69].

2.5.2 Estimation problem

The point process $\Phi$ is observed through a window $W \subset \mathbb{R}^d$. We assume W is compact and inner regular (it is the closure of its interior), and denote its boundary by $\partial W$.

Estimators of F(r), G(r) and K(r) can be formed by taking the appropriate sample averages. From (2.19)–(2.21) we have that for each fixed r and for any $A \subseteq \mathbb{R}^d$ with $0 < |A|_d < \infty$,
\[
\hat F(r) = \frac{\left|\Phi_{(+r)} \cap A\right|_d}{|A|_d} \tag{2.22}
\]

is an unbiased estimator of F(r), while
\[
\hat G(r) = \frac{\sum_{x \in \Phi \cap A} 1\{\rho(x, \Phi \setminus \{x\}) \le r\}}{\Phi(A)} \tag{2.23}
\]
is ratio-unbiased for G(r) and
\[
\widehat{\lambda K}(r) = \frac{\sum_{x \in \Phi \cap A} \Phi(b(x, r) \setminus \{x\})}{\Phi(A)} \tag{2.24}
\]
is ratio-unbiased for $\lambda K(r)$.

However, these estimators are not feasible in general, since they require information from outside the window W. The essential problem is an edge effect: $\rho(x, \Phi)$ is not determined from information inside W alone, since for a given point $x \in W$ the closest point of $\Phi$ may lie outside W. The available data $\Phi \cap W$ yield $\rho(x, \Phi \cap W)$ rather than $\rho(x, \Phi)$ for all $x \in W$. See Figure 2.8.

Figure 2.8 Edge effect for distances in point patterns. Filled dots: point pattern $\Phi$. Open circle: reference point x, either a fixed point in W or a point of the pattern $\Phi$. The edge effect arises because the nearest point of $\Phi$ to x could lie outside the window of observation W.

2.5.3 Border method

Edge-corrected estimators for F, G and K based on observation of $\Phi$ in W are reviewed in [61, chap. 3], [69, pp. 122–131], [15, chap. 8]. See [6, 7, 23, 24, 25, 26, 27, 28, 29, 65].

The simplest approach is the "border method" [20, 60, 61] in which we restrict attention (when estimating F, G or K at distance r) to those reference points lying more than r units away from the boundary of W. These are the points x for which distances up to r are observed correctly:
\[
\rho(x, \partial W) > r \;\Rightarrow\; \left(\rho(x, \Phi \cap W) \le r \Leftrightarrow \rho(x, \Phi) \le r\right). \tag{2.25}
\]
Equivalently, $\Phi_{(+r)}$ coincides with $(\Phi \cap W)_{(+r)}$ within the mask $W_{(-r)}$:
\[
\Phi_{(+r)} \cap W_{(-r)} = (\Phi \cap W)_{(+r)} \cap W_{(-r)}. \tag{2.26}
\]
This is an instance of the "local knowledge principle" of mathematical morphology [64, pp. 49, 233], [5].

The border method estimators of F, G and K are the sample averages within the eroded window:
\[
\hat F_b(r) = \frac{\left|W_{(-r)} \cap \Phi_{(+r)}\right|_d}{\left|W_{(-r)}\right|_d} \tag{2.27}
\]
\[
\hat G_b(r) = \frac{\sum_{x \in \Phi \cap W_{(-r)}} 1\{\rho(x, \Phi \setminus \{x\}) \le r\}}{\Phi(W_{(-r)})} \tag{2.28}
\]
\[
\hat K_b(r) = \frac{\sum_{x \in \Phi \cap W_{(-r)}} \Phi(b(x, r) \setminus \{x\})}{\hat\lambda\, \Phi(W_{(-r)})} \tag{2.29}
\]
where $\hat\lambda = \Phi(W)/|W|_d$. These are the estimators (2.22)–(2.24) using $A = W_{(-r)}$. By (2.26) the numerator in each case is observable in the sense that it is determined by the data $\Phi \cap W$. This approach was introduced by Diggle [20] and dubbed the 'border method' by Ripley [61].

The border method estimator of F is pointwise unbiased, $E \hat F_b(r) = F(r)$ for all r satisfying $\left|W_{(-r)}\right|_d > 0$. The estimator of G is ratio-unbiased, and $\hat\lambda \hat K_b(r)$ is ratio-unbiased for $\lambda K(r)$. However, $\hat F_b$ and $\hat G_b$ may not be distribution functions. The three estimators may fail to be monotone functions of r, and $\hat F_b$ and $\hat G_b$ may have maximum values either greater or less than unity. The border method also discards much of the data; in three dimensions [6] it seems to be unacceptably wasteful, especially when estimating G. The variances of all three estimators increase with r.
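
For concreteness, here is a minimal implementation of the border-method estimator (2.28) of G for a rectangular window. The point pattern is assumed to be an (n, 2) coordinate array; the code returns NaN when the eroded window contains no points. This is a sketch, not an optimised implementation.

```python
import numpy as np

def G_border(points, frame, r):
    """Border-method (reduced-sample) estimate (2.28) of G(r).

    points : (n, 2) array, the observed pattern Phi intersected with W
    frame  : (xmin, xmax, ymin, ymax) of the rectangular window W
    """
    xmin, xmax, ymin, ymax = frame
    # nearest-neighbour distance s_i within the window ...
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)
    s = d.min(axis=1)
    # ... and distance c_i from each point to the boundary of W
    c = np.minimum.reduce([points[:, 0] - xmin, xmax - points[:, 0],
                           points[:, 1] - ymin, ymax - points[:, 1]])
    at_risk = c >= r            # points of (the closure of) the eroded window
    if at_risk.sum() == 0:
        return np.nan
    return np.mean(s[at_risk] <= r)
```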

Figure 2.9 Geometry of the border method estimators. Spatial process $\Phi$ indicated by filled dots. The eroded window $W_{(-r)}$ is the dashed rectangle. Left: The estimator of F is the fraction of area shaded in the eroded window. Right: The estimator of G or K is a sum of contributions from points $x_i$ in the eroded window.

One possibility for reducing variance in the estimators of G and K is to replace the denominator $\Phi(W_{(-r)})$ by a less variable estimate of its expectation, $\hat\lambda \left|W_{(-r)}\right|_d$. For example
\[
\hat G_2(r) = \frac{|W|_d}{\Phi(W)} \cdot \frac{\sum_{x \in \Phi \cap W_{(-r)}} 1\{\rho(x, \Phi \setminus \{x\}) \le r\}}{\left|W_{(-r)}\right|_d}; \tag{2.30}
\]
see [69, p. 178]. However, this may not reduce the variance, since it is plausible that the numerators and denominators of (2.28) and (2.29) may be positively correlated. Such estimators also fail to be monotone.

2.5.4 Edge corrections

"Edge-corrected" estimators are generally an improvement on the border method for G and K; these are weighted empirical distributions of the distances between points. The weight $c(x, y)$ attached to the observed distance $\|x - y\|$ between two points x, y is the reciprocal of the 'probability' of observing this distance, under invariance assumptions (stationarity under translation and/or rotation). In other words, these are Horvitz-Thompson style estimators. Corrections of this type were first suggested by Miles [54] and developed by Ripley, Lantuéjoul, Hanisch, Stoyan, Ohser and others [21, 40, 56, 58, 60], [64, p. 246]. For surveys see [61, chap. 3], [69, pp. 122–131], [15, chap. 8], [2].

The edge corrections can be derived from the Campbell-Mecke formula for a stationary point process (Theorem 9 of Chapter 1, section 1.8.3).

A simple way to appreciate the various edge corrections is to interpret the definitions of G(r) and K(r) as requiring us to count certain geometrical objects associated with the point process $\Phi$. For K(r) we should count all line segments joining ordered pairs of distinct points x, y in $\Phi$ such that $\|x - y\| \le r$. For G(r) we count all discs $b(x_i, r)$ centred on points of $\Phi$ which contain no other points of $\Phi$. Then the edge corrections are instances of the corrections already described for spatial processes of geometrical objects.

First consider the estimation of G. The notation for estimators will be simplified if we write, for each point $x_i$ of $\Phi \cap W$,
\[
s_i = \rho(x_i, (\Phi \setminus \{x_i\}) \cap W) \tag{2.31}
\]
\[
c_i = \rho(x_i, \partial W) \tag{2.32}
\]
for the distances from $x_i$ to its nearest neighbour and to the edge of the window, respectively.

Suppose we construct from $\Phi$ the random process of all discs or spheres $b(x_i, r)$ of radius r centred on points $x_i$ of $\Phi$ which do not contain any further points of $\Phi$. Minus-sampling for these discs is equivalent to sampling the points $x_i \in W_{(-r)}$. The minus-sampling bias factor for a disc $b(x, r)$ is
\[
\gamma(b(x, r)) = \left|W_{(-r)}\right|_d
\]
so that the Horvitz-Thompson style estimator for G based on minus-sampling is essentially the border method estimator $\hat G_2$ of (2.30),
\[
\hat\lambda\, \hat G_2(r) = \frac{\sum_i 1\{s_i \le r\}\, 1\{c_i > r\}}{\left|W_{(-r)}\right|_d}.
\]
Alternatively, consider the process of all discs or spheres $b(x_i, s_i)$ where $s_i$ is the nearest neighbour distance as above. Minus-sampling for these discs is equivalent to sampling those $x_i \in \Phi \cap W$ for which $s_i \le c_i$, or equivalently $x_i \in W_{(-s_i)}$. The corresponding Horvitz-Thompson style estimator is
\[
\hat\lambda\, \hat G_4(r)
= \sum_{x \in \Phi \cap W} \frac{1\{\rho(x, \Phi \setminus \{x\}) \le \rho(x, \partial W)\}\, 1\{\rho(x, \Phi \setminus \{x\}) \le r\}}{\left|W_{(-\rho(x, \Phi \setminus \{x\}))}\right|_d}
= \sum_i \frac{1\{s_i \le c_i\}\, 1\{s_i \le r\}}{\left|W_{(-s_i)}\right|_d}. \tag{2.33}
\]
This estimator was introduced ad hoc by Hanisch [40], along with the corresponding ratio-of-counts estimator $\hat G_3(r)$ in which the denominator of (2.33) is replaced by $\sum_i 1\{s_i \le c_i\}$. Other estimators of G are described in [69, p. 128], [15, pp. 614, 637–638], [23, 28] and [30], and in sections 2.5.5, 2.6.3 and 2.6.7 below.
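
The Hanisch-weighted estimator (2.33) is easy to compute for a rectangular window, where $\left|W_{(-s)}\right|_d = (a - 2s)(b - 2s)$. The sketch below divides the sum (2.33) by the simple intensity estimate $\hat\lambda = n/|W|$ to obtain an estimate of G(r); this normalisation is one of several possibilities and is chosen here only for illustration.

```python
import numpy as np

def G_hanisch(points, frame, r):
    """Hanisch-weighted estimate of G(r): the sum (2.33) divided by the
    intensity estimate lambda_hat = n / |W|.  Rectangular window only."""
    xmin, xmax, ymin, ymax = frame
    a, b = xmax - xmin, ymax - ymin
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)
    s = d.min(axis=1)                                  # nearest-neighbour distances
    c = np.minimum.reduce([points[:, 0] - xmin, xmax - points[:, 0],
                           points[:, 1] - ymin, ymax - points[:, 1]])
    use = s <= c                                       # uncensored distances only
    w = (a - 2 * s[use]) * (b - 2 * s[use])            # |W_(-s_i)| for a rectangle (> 0 a.s.)
    lam_hat = len(points) / (a * b)
    return np.sum((s[use] <= r) / w) / lam_hat
```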

Next consider the estimation of K. Construct the process of all line segments joining distinct points x and y of $\Phi$ with length $\|x - y\| \le r$. Each segment is counted once for each endpoint. Minus-sampling of these line segments is equivalent to simply observing all ordered pairs of distinct points in the window which are at most r units apart. The minus-sampling bias factor for a line segment joining the points x and y is
\[
\gamma(\{x, y\}) = \left|W \ominus \check{\{x, y\}}\right|_d = |(W + x) \cap (W + y)|_d = |W \cap (W + y - x)|_d
\]
so that
\[
\hat\lambda^2\, \hat K_2(r) = \sum_{x, y \in \Phi \cap W} \frac{1\{0 < \|x - y\| \le r\}}{|(W + x) \cap (W + y)|_d} \tag{2.34}
\]
is unbiased for $\lambda^2 K(r)$, provided $|W \cap (W + z)|_d > 0$ for all $z \in b(0, r)$. This is dubbed the translation correction. Recall that the sampling bias factor is easy to compute for rectangular windows, see Figure 2.7.
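
A sketch of the translation-corrected estimator (2.34) for a rectangular window, where the pair weight reduces to $|W \cap (W + y - x)|_d = (a - |x_1 - y_1|)(b - |x_2 - y_2|)$. Dividing by $\hat\lambda^2$ to obtain an estimate of K(r) is a simple plug-in choice made for the illustration; other normalisations are used in practice.

```python
import numpy as np

def K_translation(points, frame, r):
    """Translation-corrected estimate of K(r), based on (2.34), for a
    rectangular window: the weight |(W+x) cap (W+y)| is (a - |dx|)(b - |dy|)."""
    xmin, xmax, ymin, ymax = frame
    a, b = xmax - xmin, ymax - ymin
    dx = points[:, None, 0] - points[None, :, 0]
    dy = points[:, None, 1] - points[None, :, 1]
    d = np.hypot(dx, dy)
    close = (d > 0) & (d <= r)                 # ordered pairs of distinct points
    w = (a - np.abs(dx)) * (b - np.abs(dy))    # |W cap (W + y - x)| for each pair
    lam2_K = np.sum(close / w)                 # unbiased for lambda^2 K(r)
    lam_hat = len(points) / (a * b)
    return lam2_K / lam_hat ** 2
```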

If the point process $\Phi$ is also isotropic (invariant under rotations) we may invoke a variant of Theorem 1 in which the bias factor $\gamma(K)$ is replaced by a rotational average. This leads to Ripley's [57, 58] estimator (slightly corrected by Ohser [56]) in two dimensions,
\[
\hat\lambda^2\, \hat K_3(r) = \sum_{x, y \in \Phi \cap W} \frac{1\{0 < \|x - y\| \le r\}}{w(x, \|x - y\|)\, v(\|x - y\|)} \tag{2.35}
\]
which again is unbiased for $\lambda^2 K(r)$, where
\[
w(x, r) = \frac{1}{2\pi r}\, \mathrm{length}(W \cap \partial b(x, r)),
\]
i.e. $w(x, \|x - y\|)$ is the fraction of the circumference of the circle centred at x and passing through y which lies in W, and
\[
v(r) = \left|\{x \in W : W \cap \partial b(x, r) \neq \emptyset\}\right|.
\]
See also [21, 56, 58, 60], [61, chap. 3], [69, pp. 122–131], [15, pp. 616–619, 639–644] and recent investigations in [24, 27, 65].

Small-sample variances of these estimators are intractable. They have been studied by Ripley [59], [61, p. 40] and Ohser [56] for the Poisson and binomial processes. Asymptotic limiting distributions have been obtained by Heinrich [44, 43] for Poisson cluster processes; see also [45]. Various limiting regimes have been studied by Stein [65, 66, 67] and in [4].

2.5.5 Stein's variance reduction techniques

A general problem with the Horvitz-Thompson style estimators is that objects $X_i$ with very small values of $\gamma(X_i)$ will be given very large weights if they are encountered in the sample. This is a substantial source of variability. In the edge corrections described above, contributions from points close to the edge of the window will be given large weights, inflating the variance of the estimator, despite the paucity of data near the edge.

One strategy for variance reduction is to downweight the more variable contributions. M. Stein [65] pointed out that any modification of Ripley's estimator (2.35) of the form
\[
\hat\lambda^2\, \hat K_4(r) = \sum_{x, y \in \Phi \cap W} \frac{u(x, r)\, 1\{0 < \|x - y\| \le r\}}{w(x, \|x - y\|)\, v(\|x - y\|)} \tag{2.36}
\]
where $u(\cdot, \cdot)$ is a weight function, is still unbiased for $\lambda^2 K(r)$ provided
\[
\int_{W[r]} u(x, r)\, dx = 1
\]
where $W[r] = \{x \in W : \partial b(x, r) \cap W \neq \emptyset\}$. This estimator may have smaller variance than (2.35) if the values of u decrease near the boundary of W. There is one natural choice of u which has superior asymptotic properties when the underlying process is Poisson.

A second strategy for variance reduction is projection [66].

Take a statistic of the form
\[
T = \sum_{x_i, x_j \in \Phi \cap W} \phi(x_i, x_j)
\]
such as (2.34)–(2.35). Consider the modified statistic
\[
T' = \sum_{x_i, x_j \in \Phi \cap W} \phi(x_i, x_j)
- \sum_{x_i \in \Phi \cap W} \left(g(x_i) - \frac{1}{|W|_d} \int_W g(x)\, dx\right) \tag{2.37}
\]
where $g : W \to \mathbb{R}$ is any function. If T is unbiased then so is T', because the expectation of the second sum on the right of (2.37) is zero, by the Campbell-Mecke formula. The Hájek Projection Lemma [37, Lemma 4.1] identifies the choice of g which minimises the conditional variance of (2.37) when the process is Poisson. Stein [66] applies this modification to the rigid motion correction estimator for $\lambda^2 K(r)$ and demonstrates that T' has substantially lower variance than the Ripley isotropic correction.

These variance reduction techniques clearly should apply to any of the Horvitz-Thompson style estimators. A practical problem with projection is that the optimal g is often difficult to compute.

2.6 Censoring effects for point processes

Edge effects can also be interpreted as a type of censoring. In this section we deal with censoring effects for point processes.

The estimation problem for F, G and K from a point pattern in a bounded window W has a clear analogy to the estimation of a survival function based on a sample of randomly censored survival times. Essentially the distance from a given reference point x to $\Phi$ is right-censored by its distance to the boundary of W. We shall first recall some basic theory of random censoring.

2.6.1 Survival data

Following is a brief account of random censoring. Suppose $T_1, \ldots, T_n$ are i.i.d. positive r.v.'s with distribution function F and survival function $S = 1 - F$. Let $C_1, \ldots, C_n$ be independent of the $T_i$'s and i.i.d. with d.f. H. Let $\tilde T_i = T_i \wedge C_i$ and $D_i = 1\{T_i \le C_i\}$, where $a \wedge b$ denotes $\min\{a, b\}$. Then $(\tilde T_1, D_1), \ldots, (\tilde T_n, D_n)$ is a sample of censored survival times $\tilde T_i$ with censoring indicators $D_i$ (really, non-censoring indicators).

The reduced-sample estimator of F is
\[
\hat F^{rs}(t) = \frac{\#\{i : \tilde T_i \le t \le C_i\}}{\#\{i : C_i \ge t\}}. \tag{2.38}
\]
This requires that we can observe the censoring times $C_i$ themselves, or at least the event $\{C_i \ge t\}$ for all t for which F(t) must be estimated. This estimator is clearly pointwise unbiased for F and has values in [0, 1] but may not be a monotone function of t.

The optimal estimator of F is the Kaplan-Meier estimator [48],
\[
\hat F(t) = 1 - \prod_{s \le t} \left(1 - \frac{\#\{i : \tilde T_i = s,\ D_i = 1\}}{\#\{i : \tilde T_i \ge s\}}\right). \tag{2.39}
\]
Note that the product in (2.39) is effectively only a finite product in which s ranges over the observed failure times $\tilde T_i$. Thus $\hat F(t)$ jumps only at these values of t. The validity of the Kaplan-Meier estimator can be understood intuitively by considering $\#\{i : \tilde T_i \in ds,\ D_i = 1\}/\#\{i : \tilde T_i \ge s\}$, for a small interval $ds = [s, s + ds)$, as an estimator of $P\{T_i \in ds \mid T_i \ge s\}$. The complement of this probability is therefore $P\{T_i \ge s + ds \mid T_i \ge s\}$. Multiplying over small intervals $[s, s + ds)$ partitioning $[0, t + dt)$ produces $P\{T_i > t\} = 1 - F(t)$.

More formally, introduce
\[
N_n(t) = \frac{1}{n} \#\{i : \tilde T_i \le t,\ D_i = 1\} \tag{2.40}
\]
\[
Y_n(t) = \frac{1}{n} \#\{i : \tilde T_i \ge t\} \tag{2.41}
\]
\[
\hat\Lambda_n(t) = \int_0^t \frac{dN_n(s)}{Y_n(s)} \tag{2.42}
\]
\[
\Lambda(t) = \int_0^t \frac{dF(s)}{1 - F(s-)}. \tag{2.43}
\]
Then $\Lambda$ is the cumulative hazard associated with F, and $\hat\Lambda_n$ is the Nelson-Aalen estimator of $\Lambda$. One can write
\[
1 - F(t) = \prod_0^t \left(1 - d\Lambda(s)\right), \qquad
1 - \hat F_n(t) = \prod_0^t \left(1 - d\hat\Lambda_n(s)\right) \tag{2.44}
\]

where $\prod$ denotes product integration:
\[
\prod_0^t (1 + dA(s)) = \lim_{\max|t_i - t_{i-1}| \to 0}\ \prod_{i=1}^m \left(1 + A(t_i) - A(t_{i-1})\right),
\]
the limit of the product over increasingly fine partitions $0 = t_0 < \ldots < t_m = t$ of the interval (0, t]. See [31, 32] for further information on the product integral.

If F is absolutely continuous with density f then, defining the hazard rate
\[
\alpha(t) = f(t)/(1 - F(t)),
\]
one has $\Lambda(t) = \int_0^t \alpha(s)\, ds$ and
\[
1 - F(t) = \prod_0^t (1 - d\Lambda(s)) = \exp(-\Lambda(t)).
\]
However if F has a discrete component the relation $\Lambda = -\log(1 - F)$ no longer holds.

Under random censorship the empirical processes $N_n$, $Y_n$ satisfy a mean value relation
\[
E N_n(t) = \int_0^t E Y_n(s)\, d\Lambda(s) \tag{2.45}
\]
which may be interpreted loosely as saying that $dN_n(t)/Y_n(t)$ is ratio-unbiased for $d\Lambda(t)$.
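
Since the spatial estimators below are built on the product-limit formula (2.39), it may help to have it in executable form. The following sketch is a straightforward (and deliberately unoptimised) implementation for a finite sample of censored observations; names are illustrative.

```python
import numpy as np

def kaplan_meier(t_tilde, delta, t):
    """Kaplan-Meier estimate (2.39) of F(t) from censored observations.

    t_tilde : observed times  T_i ^ C_i
    delta   : 1 if the failure was observed (T_i <= C_i), 0 if censored
    t       : time at which F is evaluated
    """
    t_tilde = np.asarray(t_tilde, float)
    delta = np.asarray(delta, bool)
    surv = 1.0
    for s in np.unique(t_tilde[delta & (t_tilde <= t)]):   # observed failure times <= t
        n_at_risk = np.sum(t_tilde >= s)
        n_failed = np.sum((t_tilde == s) & delta)
        surv *= 1.0 - n_failed / n_at_risk
    return 1.0 - surv

# With no censoring (all delta equal to 1) this reduces to the empirical
# distribution function, which is a convenient sanity check.
```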

2.6.2 Analogy between censoring and edge effects

Returning to the spatial pattern context, let $\Phi$ be an a.s. stationary point process in $\mathbb{R}^d$, and W a fixed compact window with nonempty interior. We observe $\Phi$ only within W.

Consider the estimation of the empty space function F. Every point x in the window W contributes one possibly censored observation of the distance from an arbitrary point in space to the nearest point of $\Phi$. The analogy with survival times is to regard the distance
\[
T = \rho(x, \Phi)
\]
as the 'distance (time) to failure', and
\[
C = \rho(x, \partial W)
\]
as the censoring distance. The observation is censored if $\rho(x, \partial W) < \rho(x, \Phi)$. See Figure 2.10.

Similar remarks apply to the nearest neighbour distance distribution G. Let $x_i$ be a point of the pattern $\Phi \cap W$. In this case we regard the nearest neighbour distance
\[
T = \rho(x_i, \Phi \setminus \{x_i\})
\]
as the distance (time) to failure, and the observation is censored if $\rho(x_i, \partial W) < \rho(x_i, \Phi \setminus \{x_i\})$.

Figure 2.10 Censoring occurs when the reference point x (open circle) is closer to the boundary of the observation window W than to the nearest point of the spatial pattern (filled dots).

Under this analogy, the reduced sample estimator (2.38) for G is precisely the border method estimator (2.28). Both are obtained using only those observations for which the censoring time or distance is at least r when estimating the probability of survival to time or distance r. The border method estimator (2.27) of F is also a reduced sample estimator, as we explain in section 2.6.5. Since the reduced sample estimator is known to be inefficient in the case of i.i.d. random censoring, it is of interest to explore analogues of the Kaplan-Meier estimator for point processes.

First we note that Kaplan-Meier style estimators are feasible. For the empty space function, from the data $\Phi \cap W$ we can compute $T^* = \rho(x, \Phi \cap W)$ and $C = \rho(x, \partial W)$ for each x. Note that
\[
\rho(x, \Phi) \wedge \rho(x, \partial W) = \rho(x, \Phi \cap W) \wedge \rho(x, \partial W) \tag{2.46}
\]
(another application of the local knowledge principle [64, pp. 49, 233]), that is, $T \wedge C = T^* \wedge C$.

Thus we can indeed observe $\tilde T = T \wedge C$ and $D = 1\{T \le C\}$ for each x, as required for the Kaplan-Meier estimator.

Similarly for the nearest neighbour distance, replacing $\Phi$ by $\Phi \setminus \{x_i\}$, the analogue of (2.46) holds and the other statements above continue to hold.

In the next three sections we discuss Kaplan-Meier style estimators of G, K and F respectively, developed in [4].

2.6.3 Kaplan-Meier estimator of G for point patterns

It is simplest to begin with the estimation of the nearest neighbour distance distribution function G defined in (2.15), (2.20).

Let $\Phi \cap W = \{x_1, \ldots, x_m\}$ be the observed point pattern and $s_i = \rho(x_i, (\Phi \setminus \{x_i\}) \cap W)$, $c_i = \rho(x_i, \partial W)$. By the analogy sketched above, the set
\[
\{x_i : s_i \wedge c_i \ge r\}
\]
can be thought of as the set of points 'at risk of failure at distance r', and
\[
\{x_i : s_i = r,\ s_i \le c_i\}
\]
are the 'observed failures at distance r'. These two sets are analogous to the points counted in the empirical functions $Y_n(s)$, $dN_n(s)$ respectively in the definition of the Kaplan-Meier estimator in section 2.6.1. Counting them as for censored data, let
\[
Y^G(r) = \#\{x \in \Phi \cap W : r \le \rho(x, \Phi \setminus \{x\}) \wedge \rho(x, \partial W)\} = \#\{i : s_i \wedge c_i \ge r\}
\]
and
\[
N^G(r) = \#\{x \in \Phi \cap W : \rho(x, \Phi \setminus \{x\}) \le \rho(x, \partial W) \wedge r\} = \#\{i : s_i \le c_i \wedge r\}.
\]
Continuing the analogy, define the Nelson-Aalen estimator
\[
\hat\Lambda^G(r) = \int_0^r \frac{dN^G(s)}{Y^G(s)} \tag{2.47}
\]
and the Kaplan-Meier style estimator of G
\[
\hat G(r) = 1 - \prod_0^r \left(1 - d\hat\Lambda^G(s)\right)
= 1 - \prod_{s \le r} \left(1 - \frac{\#\{i : s_i = s,\ s_i \le c_i\}}{\#\{i : s_i \ge s,\ c_i \ge s\}}\right) \tag{2.48}
\]

where s in the product ranges over the finite set $\{s_i\}$.

It follows from the Campbell-Mecke formula (see (2.20)) that the numerator and denominator of (2.47) satisfy the same mean-value relation (2.45) as for i.i.d. randomly censored data,
\[
E N^G(r) = \int_0^r E Y^G(s)\, d\Lambda^G(s), \tag{2.49}
\]
where $\Lambda^G$ is the cumulative hazard associated with G as in (2.43), $d\Lambda^G(s) = dG(s)/(1 - G(s-))$.

Compare this to the reduced-sample (border method) estimator
\[
\hat G_b(r) = \frac{\#\{i : s_i \le r,\ c_i \ge r\}}{\#\{i : c_i \ge r\}}. \tag{2.50}
\]
Since the observed, censored distances are highly interdependent, classical theory from survival analysis has little to say about statistical properties of the Kaplan-Meier estimator here. Modest simulation experiments [4] show that the Kaplan-Meier estimator $\hat G$ is generally more efficient than the reduced sample estimator $\hat G_b$, as expected. However, $\hat G$ is not uniformly better than $\hat G_b$, and appears to be less efficient than some of the edge corrected estimators of section 2.5.4. It seems that the spatial dependence has destroyed the uniform optimality enjoyed by the Kaplan-Meier estimator under i.i.d. random censoring. Further investigation is needed.
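
The estimator (2.48) depends on the data only through the pairs $(s_i, c_i)$, so it can be computed in a few lines. A minimal sketch (ties handled exactly as in (2.48); no attempt at numerical efficiency):

```python
import numpy as np

def G_kaplan_meier(s, c, r):
    """Kaplan-Meier estimate (2.48) of G(r) from nearest-neighbour
    distances s_i (computed within W) and boundary distances c_i."""
    s = np.asarray(s, float)
    c = np.asarray(c, float)
    observed = s <= c                      # uncensored nearest-neighbour distances
    surv = 1.0
    for u in np.unique(s[observed & (s <= r)]):
        n_at_risk = np.sum((s >= u) & (c >= u))
        n_failed = np.sum((s == u) & observed)
        surv *= 1.0 - n_failed / n_at_risk
    return 1.0 - surv
```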

2.6.4 Kaplan-Meier estimator of K

The function K(r) was defined in (2.18). One may also write
\[
\lambda K(r) = \sum_{n=1}^{\infty} G_n(r) \tag{2.51}
\]
where $G_n(r) = P_0\{\Phi(b(0, r)) > n\}$ is the distribution function of the distance from a typical point of $\Phi$ to the nth nearest point. For each of the distance distributions $G_n$ one can form a Kaplan-Meier estimator $\hat G_n$, since the distance from a point $x \in \Phi$ to its nth nearest neighbour is also censored just as before by its distance to the boundary. The sequence of Kaplan-Meier estimators always satisfies the natural stochastic ordering of the distance distributions.

The pointwise sum of these estimators $\hat G_n$ yields an estimator $\hat K$ of K which is always a nondecreasing, right-continuous function, with jumps at the observed interpoint distances (between all pairs of points of the pattern). This estimator was studied briefly in [4] but needs further investigation.

2.6.5 Kaplan-Meier estimator of F

The estimation of F poses a new problem, since one has a continuum of observations: for each point in the sampling window, a censored distance to the nearest point of the process.

The set
\[
\{x \in W : \rho(x, \Phi) \wedge \rho(x, \partial W) \ge r\}
\]
can be thought of as the set of points 'at risk of failure at distance r', and
\[
\{x \in W : \rho(x, \Phi) = r,\ \rho(x, \Phi) \le \rho(x, \partial W)\}
\]
are the 'observed failures at distance r'. Geometrically these sets are the closures of $W_{(-r)} \setminus \Phi_{(+r)}$ and $\partial\left(\Phi_{(+r)}\right) \cap W_{(-r)}$ respectively. See Figure 2.11.

Figure 2.11 Geometry of the Kaplan-Meier estimator of F. Spatial process $\Phi$ indicated by filled dots. Points x at risk are shaded, and observed failures constitute the curved boundary of the shaded region.

Define the Kaplan-Meier style estimator $\hat F$ of F, based on data $\Phi \cap W$, to be
\[
\hat F(r) = 1 - \exp\left\{- \int_0^r \frac{\left|\partial\left(\Phi_{(+s)}\right) \cap W_{(-s)}\right|_{d-1}}{\left|W_{(-s)} \setminus \Phi_{(+s)}\right|_d}\, ds\right\} \tag{2.52}
\]
where $|\cdot|_{d-1}$ denotes $(d-1)$-dimensional Hausdorff measure ('surface area' or 'length'). Note that the estimator is a proper distribution function and is even absolutely continuous, with hazard rate
\[
\hat\alpha(r) = \frac{\left|\partial\left(\Phi_{(+r)}\right) \cap W_{(-r)}\right|_{d-1}}{\left|W_{(-r)} \setminus \Phi_{(+r)}\right|_d} \tag{2.53}
\]
for almost all r. Here $\hat F$ is the Kaplan-Meier estimator based on the continuum of observations generated by all $x \in W$.

An alternative representation, showing the contribution from each point x, is
\[
\hat F(r) = 1 - \exp\left\{- \int_W \frac{1\{t(x) \le c(x)\}\, 1\{t(x) \le r\}}{\left|W_{(-t(x))} \setminus \Phi_{(+t(x))}\right|_d}\, dx\right\} \tag{2.54}
\]
where $t(x) = \rho(x, \Phi)$ and $c(x) = \rho(x, \partial W)$.

In practice, one would compute the estimator by discretizing W, superimposing a regular lattice L of points, calculating for each $x_i \in W \cap L$ the censored distance $\rho(x_i, \Phi) \wedge \rho(x_i, \partial W)$ and the indicator $1\{\rho(x_i, \Phi) \le \rho(x_i, \partial W)\}$. Then one would calculate the ordinary Kaplan-Meier estimator (2.39) based on this finite dataset. As the lattice becomes finer, the discrete Kaplan-Meier estimates converge to the continuous estimator $\hat F$, uniformly on any compact interval in [0, R). See [4].

The censoring approach leads to the following insights about F which are of general interest.

Theorem 2 ([4]) Let $\Phi$ be any stationary point process with intensity $0 < \lambda < \infty$. Then

(a) the empty space function F is absolutely continuous;

(b) the hazard rate of F equals
\[
\alpha(r) = \frac{E\left|W \cap \partial\left(\Phi_{(+r)}\right)\right|_{d-1}}{E\left|W \setminus \Phi_{(+r)}\right|_d}
\]
for almost all r, for any compact window W such that the denominator is positive.

It follows that the Kaplan-Meier estimator (2.53) of $\alpha(r)$ is ratio-unbiased for almost all r. The estimator $\hat F(r)$ respects the smoothness of the true empty space function F. The border method estimator (2.27) is not even necessarily monotone.

The key to Theorem 2 and the representations (2.53)–(2.54) is the identity
\[
\left|Z \cap A_{(+r)}\right|_d = |Z \cap A|_d + \int_0^r \left|Z \cap \partial\left(A_{(+s)}\right)\right|_{d-1}\, ds
\]
holding for compact $Z, A \subseteq \mathbb{R}^d$ where A is sufficiently regular. This is related to Crofton's perturbation method ([1, 16], see section 1.8.2 of Chapter 1). Geometrical techniques are also enlisted [4, 42, 41] to show that $\left|Z \cap \partial\left(\Phi_{(+r)}\right)\right|_{d-1}$ is uniformly bounded over possible realisations of $\Phi$, so that dominated convergence justifies interchanges of expectation and integration or differentiation.

The numerator and denominator of (2.53) satisfy the same mean-value relation as for ordinary randomly censored data,
\[
E N(r) = \int_0^r E Y(s)\, d\Lambda(s).
\]
It does not seem to be widely known in spatial statistics (cf. [15, p. 764], [22, 25]) that computation of the distances $\rho(x, \Phi \cap W)$, $\rho(x, \partial W)$ for all points x in a fine rectangular lattice can be performed very efficiently using the distance transform algorithm of image processing [8, 9, 62, 63]. Thus the reduced-sample and Kaplan-Meier estimators are equivalent in computational cost when a fine grid is used.

Again, classical statistical theory for the Kaplan-Meier estimator is not applicable here because of the strong dependence between observations at different points. Simulations [4] suggest that $\hat F$ is substantially more efficient than $\hat F_b$ in most situations.
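
As an illustration of the computational remark above, the following sketch computes the discretised Kaplan-Meier estimate of F for a rectangular window, using the Euclidean distance transform (here scipy.ndimage.distance_transform_edt) to obtain the censored distances on a pixel grid. The pixel size, the binning of distances to that size, and the analytic treatment of the rectangle boundary are choices made only for the illustration.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def F_kaplan_meier(points, frame, r, pix=0.01):
    """Discretised Kaplan-Meier estimate of the empty space function F(r).

    Censored distances t(x) ^ c(x) are computed on a pixel grid with the
    Euclidean distance transform, then the product-limit formula (2.39)
    is applied to the resulting finite data set, as described in the text.
    """
    xmin, xmax, ymin, ymax = frame
    nx, ny = int(round((xmax - xmin) / pix)), int(round((ymax - ymin) / pix))
    xc = xmin + (np.arange(nx) + 0.5) * pix             # pixel-centre coordinates
    yc = ymin + (np.arange(ny) + 0.5) * pix
    mask = np.ones((nx, ny), bool)                      # True = no data point here
    ix = np.clip(((points[:, 0] - xmin) / pix).astype(int), 0, nx - 1)
    iy = np.clip(((points[:, 1] - ymin) / pix).astype(int), 0, ny - 1)
    mask[ix, iy] = False
    t = distance_transform_edt(mask, sampling=pix)      # t(x): distance to nearest point
    c = np.minimum(np.minimum(xc - xmin, xmax - xc)[:, None],
                   np.minimum(yc - ymin, ymax - yc)[None, :])   # c(x): distance to boundary
    t_obs = np.minimum(t, c).ravel()
    delta = (t <= c).ravel()
    t_obs = pix * np.round(t_obs / pix)                 # bin distances to the pixel size
    surv = 1.0
    for u in np.unique(t_obs[delta & (t_obs <= r)]):    # product-limit formula (2.39)
        surv *= 1.0 - np.sum((t_obs == u) & delta) / np.sum(t_obs >= u)
    return 1.0 - surv
```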

2.6.6 Hanisch-type estimator

The empty space function F could also be estimated by a continuous analogue of Hanisch's [40] estimator (2.33) for G:
\[
\hat F_{cs}(r) = \int_W \frac{1\{t(x) \le c(x)\}\, 1\{t(x) \le r\}}{\left|W_{(-t(x))}\right|_d}\, dx \tag{2.55}
\]
where $t(x) = \rho(x, \Phi)$ and $c(x) = \rho(x, \partial W)$. Chiu and Stoyan [11] attribute this estimator to earlier work of Hanisch (see also [70, pp. 138, 215]) although it clearly originates in [11].

This estimator can be rewritten in a form comparable to (2.52):
\[
\hat F_{cs}(r) = \int_0^r \frac{\left|\partial\left(\Phi_{(+s)}\right) \cap W_{(-s)}\right|_{d-1}}{\left|W_{(-s)}\right|_d}\, ds. \tag{2.56}
\]
It follows from Theorem 2 above, and associated continuity results in [4], that $\hat F_{cs}(r)$ is unbiased for F(r).

2.6.7 Imputation estimators

The foregoing estimators do not use all 'information' available from a point pattern, in the following sense. Write C(x) for the censoring distance $\rho(x, \partial W)$ at a point x, and T(x) for the true failure distance $\rho(x, \Phi)$ or $\rho(x, \Phi \setminus \{x\})$ as appropriate. Also write $T^*(x)$ for the observed failure distance $\rho(x, \Phi \cap W)$ or $\rho(x, (\Phi \cap W) \setminus \{x\})$. Then the border method estimate at distance r depends only on those points x where $C(x) \ge r$. The Kaplan-Meier and Hanisch/Chiu-Stoyan estimates use these points, but also use cases where $T^*(x) \le C(x) < r$. However, neither estimator makes use of censored cases where $C(x) < T^*(x)$ and it seems plausible that these may still contain usable information. Indeed the weighted edge-correction estimators for G and K use information from cases where $C(x) < T^*(x) \le r$.

Doguwa [25] argued that information should be used from all six possible orderings of C(x), $T^*(x)$, r. For five of these orderings, it is known with certainty whether the true failure distance satisfies $T(x) \le r$. The sixth ordering, $C(x) < r < T^*(x)$, is the "maybe" case where the ball of radius r centred on x does not include any observed points of the pattern but also extends outside W.

Consider the estimation of the nearest neighbour distance distribution function G. Let
\[
H(r, x) = P_x\{T(x) \le r \mid T^*(x) > r\}
= P_x\{\rho(x, \Phi) \le r \mid \rho(x, \Phi \cap W) > r\}
= P_x\{\Phi(b(x, r)) \ge 2 \mid \Phi(b(x, r) \cap W) = 1\}
\]
be the conditional probability that there is another random point within a distance r of x but outside W, given that there is no such point inside W (and given also that x is a point of the process).

For the uniform Poisson process of intensity $\lambda$, $H(r, x) = 1 - \exp\{-\lambda\, |b(x, r) \setminus W|_d\}$. Doguwa [25] suggested estimating G by
\[
\hat G_6(r) = \frac{1}{n} \sum_{i=1}^n \left[1\{s_i \le r\} + 1\{s_i > r\}\, 1\{c_i < r\}\, \hat H(r, x_i)\right] \tag{2.57}
\]
where $s_i = \rho(x_i, (\Phi \setminus \{x_i\}) \cap W)$, $c_i = \rho(x_i, \partial W)$ and $\hat H(r, x_i) = 1 - \exp\{-\hat\lambda\, |b(x_i, r) \setminus W|_d\}$. This would effectively impute a fraction of the 'maybe' cases to the favourable cases. Doguwa also proposed analogous kernel estimators.

Floresroux & Stein [30] noted that $\hat G_6$ could have substantial bias if the process $\Phi$ is not Poisson. They proposed instead that $H(r, x_i)$ be estimated nonparametrically from the data, by averaging over spatial configurations in the dataset that are analogous to the neighbourhood of $x_i$. That is, since $H(r, x_i)$ is itself a Palm probability, it should first be estimated from the data using the Campbell-Mecke formula. The improved Floresroux-Stein estimator performs well on a range of simulated patterns.

2.7 Line segment processes

Laslett [52, 53] first noted the analogy between edge effects and censoring for the case of line segments. Figure 2.1(b) shows a line segment process observed within a rectangular window. The segments may be uncensored (both endpoints visible within the window), censored at one end (one endpoint visible) or doubly censored (neither endpoint visible). Laslett [53] proposed estimating the line segment length distribution essentially using the Kaplan-Meier estimator based upon the uncensored and singly-censored lengths. However Wijers [72] showed that the optimal (nonparametric maximum likelihood) estimator of the length distribution, in a Poisson line segment process, is another, complicated estimator determined implicitly as the solution of an integral equation involving data from all segments.

2.8 Censoring of random sets

Finally we switch attention to the case where the spatial pattern X is a random closed set in $\mathbb{R}^d$, assumed to be stationary. A typical application is to the study of vegetation patterns where X could represent that part of the surveyed region which is covered by a particular vegetation type.

Summary statistics for random closed sets are described in [64], [69, §6.2–6.3]. For several such statistics, notably the spatial covariance function, edge effects can be handled in a straightforward fashion using the local knowledge principle [64, pp. 49, 233], [17, p. 374]. In this section we discuss the more complex question of censoring effects for random sets.

2.8.1 Empty space function F

Of particular interest here is the empty space function, defined analogously to (2.14) as the distribution function of the distance from an arbitrary point in space to the nearest point of X:
\[
F(r) = P\{\rho(x, X) \le r\}, \qquad r \ge 0,
\]
where the point $x \in \mathbb{R}^d$ is arbitrary and may be taken to be the origin 0. The empty space function is a useful summary of the 'size' of voids between the components of X. For stationary Poisson processes of points, lines or other figures, F takes known functional forms, and departures of the empirical F from these benchmarks are taken as indications of 'clustered' or 'ordered' patterns. See [64, chap. XIII], [69, p. 178].

The estimation of F for a random closed set X observed in a window W is an almost trivial extension of the point process case. The border method estimator
\[
\hat F_b(r) = \frac{\left|W_{(-r)} \cap X_{(+r)}\right|_d}{\left|W_{(-r)}\right|_d} \tag{2.58}
\]
is pointwise unbiased for F(r). Theorem 2 extends to the statement that for any stationary random closed set X, the empty space function F(r) is absolutely continuous for r > 0, with an atom at r = 0 of mass $E|X \cap W|_d / |W|_d$, and has hazard rate
\[
\alpha(r) = \frac{E\left|W \cap \partial\left(X_{(+r)}\right)\right|_{d-1}}{E\left|W \setminus X_{(+r)}\right|_d}
\]
for almost all r, for any compact window W such that the denominator is positive. The Kaplan-Meier style estimator is
\[
\hat F(r) = 1 - \exp\left\{- \int_0^r \frac{\left|\partial\left(X_{(+s)}\right) \cap W_{(-s)}\right|_{d-1}}{\left|W_{(-s)} \setminus X_{(+s)}\right|_d}\, ds\right\} \tag{2.59}
\]

where $|\cdot|_{d-1}$ denotes $(d-1)$-dimensional Hausdorff measure. It can also be represented as a window integral
\[
\hat F(r) = 1 - \exp\left\{- \int_W \frac{1\{t(x) \le c(x)\}\, 1\{t(x) \le r\}}{\left|W_{(-t(x))} \setminus X_{(+t(x))}\right|_d}\, dx\right\} \tag{2.60}
\]
where $t(x) = \rho(x, X)$ and $c(x) = \rho(x, \partial W)$. The hazard rate of $\hat F$ is pointwise ratio-unbiased for the hazard rate of F.

Figure 2.12 Geological faults (straight lines) observable on exposed granite (region enclosed by curved contours). Square side is 160 metres.

Figure 2.12 represents a pattern of geological fractures in a granitic pluton near Lac du Bonnet, Manitoba, Canada, from [68]. The faults were mapped only when visible on the surface, the other two-thirds of the granite being covered by soil. In Figure 2.12 the faults are represented as line segments and the boundary of the observable region is indicated as a curved contour.

Figure 2.13 shows the Kaplan-Meier and reduced sample estimates of the empty space function of the fault pattern, the pointwise Kaplan-Meier estimate of the hazard rate $\alpha(r)$ of F, and a kernel smoothed estimate of $\alpha(r)$.

Figure 2.13 Estimated empty space distribution function F and hazard function $\alpha$ for geological faults, plotted against r (metres). Left: empty space function F (solid line: Kaplan-Meier estimate; dotted line: reduced sample estimate). Right: hazard function $\alpha$ of F (dot-dash line: pointwise estimate; solid line: kernel smoothed estimate).

2.8.2 First contact distribution $H_B$

More generally, consider the analogue of F
\[
F_B(r) = P\{\rho_B(x, X) \le r\}, \qquad r \ge 0,
\]
where the Euclidean distance $\rho(x, X)$ is replaced by
\[
\rho_B(x, X) = \inf\{t \ge 0 : (tB + x) \cap X \neq \emptyset\}
\]
where $B \subset \mathbb{R}^d$ is some fixed "test set". Thus $\rho_B(x, X)$ is the earliest time t at which a balloon with shape B will touch X if it begins inflating with zero size at time t = 0. The conditional distribution
\[
H_B(r) = P\{\rho_B(x, X) \le r \mid \rho_B(x, X) > 0\} = \frac{F_B(r) - F_B(0)}{1 - F_B(0)}
\]
is called the first contact distribution with test set B. See [64, 69].

When B is the unit sphere, $F_B$ reduces to the usual empty space function F, and $H_B$ is called the spherical contact distribution function. When B is a line segment, $H_B$ is called the linear contact distribution function. When B is convex and contains a neighbourhood of the origin, $F_B$ is a generalised empty space function in which Euclidean distance is replaced by a metric $\rho_B$ on $\mathbb{R}^d$ with unit ball B.

There are various reasons for studying $F_B$ for non-spherical B.


There are various reasons for studying F_B for non-spherical B. Anisotropy (preferential orientation) in a spatial pattern cannot be assessed using the spherical empty space function F, and the usual approach is to use the linear contact distributions for various segments B pointing in different directions [69]. The linear contact function is not applicable to point patterns; instead one can use F_B for elliptical shapes B at various orientations. When the distance transform algorithm [8, 9] is used to compute approximate distances in a digital image, Euclidean distance has effectively been replaced by a metric with an octagonal unit ball.

Again we have an edge effect, in that ρ_B(x, X) is not determined from information inside W alone. The border method estimator

\hat{F}^{\,b}_B(r) = \frac{|(W \ominus rB) \cap (X \oplus r\check{B})|_d}{|W \ominus rB|_d}    (2.61)

is a pointwise unbiased estimator of F_B(r).

The Kaplan-Meier estimator of F_B is derived exactly as for point processes. Hansen et al. [41, 42] treat the cases where B is convex with nonempty interior, and where B is a line segment, respectively.

The 'window integral' representation (2.60) continues to hold for \hat{F}_B, with the modification that t(x) = ρ_B(x, X) and c(x) = ρ_B(x, ∂W). Hence in practice \hat{F}_B can be obtained by the same technique of discretising the window and computing the discrete Kaplan-Meier estimator.
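For a digitised image this discretisation can be carried out with standard morphological operations, since x lies within contact distance r of X exactly when x lies in the dilation X \oplus r\check{B}, and rB + x fits inside W exactly when x survives the erosion of W by rB (cf. (2.61)). A minimal sketch follows, assuming a symmetric test set B (so B = \check{B}) and taking the whole image as the window; disc_footprint and contact_and_censoring_distances are illustrative names, and the grid of r values controls the accuracy.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def disc_footprint(r_px):
    """Boolean pixel footprint of the scaled test set rB (here B is a disc)."""
    n = int(np.ceil(r_px))
    y, x = np.mgrid[-n:n + 1, -n:n + 1]
    return x ** 2 + y ** 2 <= float(r_px) ** 2

def contact_and_censoring_distances(X_mask, r_values, footprint=disc_footprint):
    """Grid approximations of t_B(x) = rho_B(x, X) and of the censoring
    distance c_B(x), the largest r in r_values with rB + x inside W."""
    t = np.full(X_mask.shape, np.inf)
    c = np.zeros(X_mask.shape)
    W_mask = np.ones(X_mask.shape, dtype=bool)

    # t_B: x is within contact distance r of X iff x lies in X dilated by rB;
    # scanning r downwards leaves the smallest qualifying r in each pixel.
    for r in sorted(r_values, reverse=True):
        t[binary_dilation(X_mask, structure=footprint(r))] = r
    t[X_mask] = 0.0

    # c_B: rB centred at x fits inside W iff x survives erosion of W by rB;
    # scanning r upwards leaves the largest qualifying r in each pixel.
    for r in sorted(r_values):
        c[binary_erosion(W_mask, structure=footprint(r))] = r

    return t, c
```

The pair (t, c) can be passed to the discrete Kaplan-Meier routine sketched in section 2.8.1, or thresholded directly to reproduce the border estimator (2.61). Replacing the disc footprint by an elliptical or digitised-segment footprint gives the corresponding anisotropic contact distributions, at the price of stronger discretisation effects for thin test sets.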


However, the representation of \hat{F}_B as an integral over r now involves a Jacobian:

\hat{F}_B(r) = 1 - \frac{|W \setminus X|_d}{|W|_d}\, \exp\left\{ -\int_0^r \frac{1}{|W_{(-sB)} \setminus X_{(+sB)}|_d} \int_{W_{(-sB)} \cap\, \partial(X_{(+sB)})} \frac{1}{J_1\rho_B(x, X)}\, dH^{d-1}x \; ds \right\}    (2.62)

where J_1ρ_B(x, X) is the 1-dimensional approximate Jacobian of the Lipschitz function ρ_B(·, X). If B = b(0, 1) then J_1ρ_B(x, X) ≡ 1 and we recover (2.59).

Let B be compact, convex and contain a neighbourhood of the origin. Then [41] showed that, for any stationary random closed set X, both F_B and \hat{F}_B are continuous monotone functions, and absolutely continuous for r > 0 with an atom at r = 0. The hazard rate of F_B equals

\lambda_B(r) = \frac{1}{E\,|W \setminus X_{(+rB)}|_d}\; E \int_{W \cap\, \partial(X_{(+rB)})} \frac{1}{J_1\rho_B(x, X)}\, dH^{d-1}x

for almost all r > 0, for any compact window W such that the denominator is positive. Thus the Kaplan-Meier estimator of λ_B(r),

\hat{\lambda}_B(r) = \frac{1}{|W_{(-rB)} \setminus X_{(+rB)}|_d}\, \int_{W_{(-rB)} \cap\, \partial(X_{(+rB)})} \frac{1}{J_1\rho_B(x, X)}\, dH^{d-1}x,

is ratio-unbiased for almost all r > 0. Examples of applications of the Kaplan-Meier estimator can be seen in [41].

A Hanisch-type estimator of F_B for convex B was proposed by Chiu and Stoyan [11], [70, pp. 138, 215]. It is defined by the direct analogue of (2.55), and can be expressed in a form similar to (2.56) with the introduction of the relevant Jacobian. Detailed assessments of its performance have not yet been made.

For the linear contact distribution, where B is a line segment, a Kaplan-Meier type estimator was constructed in [42]. It has a form similar to (2.62), but the statements about the regularity of F_B no longer hold. Examples of applications can be seen in [42].

Conclusion

Edge effects for spatial processes exhibit some features analogous to sampling bias and other features analogous to random censoring. These analogies suggest estimators for properties of the process. However, there does not seem to be an adequate optimality theory which identifies the best estimator in this general context.

Acknowledgements

I thank many colleagues for their comments, advice and encouragement, especially Drs Katja Schladitz and Aila Särkkä.


References

[1] A.J. Baddeley. Integrals on a moving manifold and geometrical probability. Advances in Applied Probability, 9:588–603, 1977.
[2] A.J. Baddeley. Stereology and survey sampling theory. Bulletin of the International Statistical Institute, 50, book 2:435–449, 1993.
[3] A.J. Baddeley and R.D. Gill. Kaplan-Meier estimators for interpoint distance distributions of spatial point processes. Research Report BS-R9315, Centrum voor Wiskunde en Informatica, July 1993.
[4] A.J. Baddeley and R.D. Gill. Kaplan-Meier estimators for interpoint distance distributions of spatial point processes. Annals of Statistics, 25:263–292, 1997.
[5] A.J. Baddeley and H.J.A.M. Heijmans. Incidence and lattice calculus with applications to stochastic geometry and image analysis. Applicable Algebra in Engineering, Communication, and Computing, 6(3):129–146, 1995.
[6] A.J. Baddeley, R.A. Moyeed, C.V. Howard, and A. Boyde. Analysis of a three-dimensional point pattern with replication. Applied Statistics, 42(4):641–668, 1993.
[7] L.G. Barendregt and M.J. Rottschäfer. A statistical analysis of spatial point patterns. A case study. Statistica Neerlandica, 45:345–363, 1991.
[8] G. Borgefors. Distance transformations in arbitrary dimensions. Computer Vision, Graphics and Image Processing, 27:321–345, 1984.
[9] G. Borgefors. Distance transformations in digital images. Computer Vision, Graphics and Image Processing, 34:344–371, 1986.
[10] K.R. Brewer and M. Hanif. Sampling with unequal probabilities. Number 15 in Lecture Notes in Statistics. Springer Verlag, New York, 1983.
[11] S.N. Chiu and D. Stoyan. Estimation of distance distributions for spatial patterns. Unpublished manuscript, 1995.
[12] W.G. Cochran. Sampling Techniques. John Wiley and Sons, 3rd edition, 1977.
[13] D.R. Cox. Appendix to 'The dye sampling method of measuring fibre length distribution' by D.R. Palmer. Journal of the Textile Institute, 39:T8–T22, 1949.
[14] D.R. Cox. Some sampling problems in technology. In N.L. Johnson and H. Smith, editors, New developments in survey sampling, pages 506–527. John Wiley and Sons, 1969.


[15] N.A.C. Cressie. Statistics for spatial data. John Wiley and Sons, New York, 1991.
[16] M.W. Crofton. Sur quelques théorèmes du calcul intégral. Comptes Rendus de l'Académie des Sciences de Paris, 68:1469–1470, 1869.
[17] D.J. Daley and D. Vere-Jones. An introduction to the theory of point processes. Springer Verlag, New York, 1988.
[18] H.E. Daniels. A new technique for the analysis of fibre length distribution in wool. Journal of the Textile Institute, 33:T137–T150, 1942.
[19] R.T. DeHoff. The geometric meaning of the integral mean curvature. In Microstructural Science, volume 5, pages 331–348. Elsevier, Amsterdam, 1977.
[20] P.J. Diggle. On parameter estimation and goodness-of-fit testing for spatial point patterns. Biometrika, 35:87–101, 1979.
[21] P.J. Diggle. Statistical analysis of spatial point patterns. Academic Press, London, 1983.
[22] P.J. Diggle and B. Matérn. On sampling designs for the estimation of point-event nearest neighbour distributions. Scandinavian Journal of Statistics, 7:80–84, 1981.
[23] S.I. Doguwa. A comparative study of the edge-corrected kernel-based nearest neighbour density estimators for point processes. J. Statist. Comp. Simul., 33:83–100, 1989.
[24] S.I. Doguwa. On edge-corrected kernel-based pair correlation function estimators for point processes. Biometrical Journal, 32:95–106, 1990.
[25] S.I. Doguwa. On the estimation of the point-object nearest neighbour distribution F(y) for point processes. J. Statist. Comp. Simul., 41:95–107, 1992.
[26] S.I. Doguwa and D.N. Choji. On edge-corrected probability density function estimators for point processes. Biometrical Journal, 33:623–637, 1991.
[27] S.I. Doguwa and G.J.G. Upton. Edge-corrected estimators for the reduced second moment measure of point processes. Biometrical Journal, 31:563–675, 1989.
[28] S.I. Doguwa and G.J.G. Upton. On the estimation of the nearest neighbour distribution, G(t), for point processes. Biometrical Journal, 32:863–876, 1990.
[29] T. Fiksel. Edge-corrected density estimators for point processes. Statistics, 19:67–75, 1988.
[30] E.M. Floresroux and M.L. Stein. A new method of edge correction for estimating the nearest neighbor distribution. J. Statist. Planning and Inference, 50:353–371, 1996.
[31] R.D. Gill. Lectures on survival analysis. In P. Bernard, editor, 22e École d'Été de Probabilités de Saint-Flour 1992, number 1581 in Lecture Notes in Mathematics. Springer, 1994.


[32] R.D. Gill and S. Johansen. A survey of product-integration with a view toward application in survival analysis. Annals of Statistics, 18:1501–1555, 1990.
[33] H.J.G. Gundersen. Notes on the estimation of the numerical density of arbitrary profiles: the edge effect. Journal of Microscopy, 111:219–223, 1977.
[34] H.J.G. Gundersen. Estimators of the number of objects per area unbiased by edge effects. Microscopica Acta, 81:107–117, 1978.
[35] H.J.G. Gundersen et al. The new stereological tools: disector, fractionator, nucleator and point sampled intercepts and their use in pathological research and diagnosis. Acta Pathologica Microbiologica et Immunologica Scandinavica, 96:857–881, 1988.
[36] H.J.G. Gundersen et al. Some new, simple and efficient stereological methods and their use in pathological research and diagnosis. Acta Pathologica Microbiologica et Immunologica Scandinavica, 96:379–394, 1988.
[37] J. Hájek. Asymptotic normality of simple linear rank statistics under alternatives. Annals of Mathematical Statistics, 39:325–346, 1968.
[38] P. Hall. Correcting segment counts for edge effects when estimating intensity. Biometrika, 72:459–463, 1985.
[39] P. Hall. An introduction to the theory of coverage processes. John Wiley and Sons, New York, 1988.
[40] K.-H. Hanisch. Some remarks on estimators of the distribution function of nearest neighbour distance in stationary spatial point patterns. Mathematische Operationsforschung und Statistik, series Statistics, 15:409–412, 1984.
[41] M.B. Hansen, A.J. Baddeley, and R.D. Gill. First contact distributions for spatial patterns: regularity and estimation. Provisionally accepted for publication.
[42] M.B. Hansen, R.D. Gill, and A.J. Baddeley. Kaplan-Meier type estimators for linear contact distributions. Scandinavian Journal of Statistics, 23:129–155, 1996.
[43] L. Heinrich. Asymptotic behaviour of an empirical nearest-neighbour distance function for stationary Poisson cluster processes. Mathematische Nachrichten, 136:131–148, 1988.
[44] L. Heinrich. Asymptotic Gaussianity of some estimators for reduced factorial moment measures and product densities of stationary Poisson cluster processes. Statistics, 19:87–106, 1988.
[45] L. Heinrich. Goodness-of-fit tests for the second moment function of a stationary multidimensional Poisson process. Statistics, 22:245–268, 1991.
[46] D.G. Horvitz and D.J. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47:663–685, 1952.


[47] E.B. Jensen and R. Sundberg. Generalized associated point methods for sampling planar objects. Journal of Microscopy, 144:55–70, 1986.
[48] E.L. Kaplan and P. Meier. Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53:457–481, 1958.
[49] P.R. Krishnaiah and C.R. Rao, editors. Sampling. Number 6 in Handbook of Statistics. North-Holland, Amsterdam, 1988.
[50] Ch. Lantuéjoul. Computation of the histograms of the number of edges and neighbours of cells in a tessellation. In R.E. Miles and J. Serra, editors, Geometrical Probability and Biological Structures: Buffon's 200th Anniversary, Lecture Notes in Biomathematics, No 23, pages 323–329. Springer Verlag, Berlin-Heidelberg-New York, 1978.
[51] Ch. Lantuéjoul. La squelettisation et son application aux mesures topologiques des mosaïques polycristallines. Thesis, docteur-ingénieur en Sciences et Techniques Minières, École Nationale Supérieure des Mines de Paris, Fontainebleau, 1978.
[52] G.M. Laslett. Censoring and edge effects in areal and line transect sampling of rock joint traces. Mathematical Geology, 14:125–140, 1982.
[53] G.M. Laslett. The survival curve under monotone density constraints with applications to two-dimensional line segment processes. Biometrika, 69:153–160, 1982.
[54] R.E. Miles. On the elimination of edge-effects in planar sampling. In E.F. Harding and D.G. Kendall, editors, Stochastic geometry: a tribute to the memory of Rollo Davidson, pages 228–247. John Wiley and Sons, London-New York-Sydney-Toronto, 1974.
[55] R.E. Miles. The sampling, by quadrats, of planar aggregates. Journal of Microscopy, 113:257–267, 1978.
[56] J. Ohser. On estimators for the reduced second moment measure of point processes. Mathematische Operationsforschung und Statistik, series Statistics, 14:63–71, 1983.
[57] B.D. Ripley. The second-order analysis of stationary point processes. Journal of Applied Probability, 13:255–266, 1976.
[58] B.D. Ripley. Modelling spatial patterns (with discussion). Journal of the Royal Statistical Society, Series B, 39:172–212, 1977.
[59] B.D. Ripley. On tests of randomness for spatial point patterns. Journal of the Royal Statistical Society, Series B, 41:368–374, 1979.
[60] B.D. Ripley. Spatial statistics. John Wiley and Sons, New York, 1981.
[61] B.D. Ripley. Statistical inference for spatial processes. Cambridge University Press, 1988.
[62] A. Rosenfeld and J.L. Pfalz. Sequential operations in digital picture processing. Journal of the Association for Computing Machinery, 13:471, 1966.
[63] A. Rosenfeld and J.L. Pfalz. Distance functions on digital pictures. Pattern Recognition, 1:33–61, 1968.


[64] J. Serra. Image analysis and mathematical morphology. Academic Press, London, 1982.
[65] M.L. Stein. A new class of estimators for the reduced second moment measure of point processes. Biometrika, 78:281–286, 1991.
[66] M.L. Stein. Asymptotically optimal estimation for the reduced second moment measure of point processes. Biometrika, 80:443–449, 1993.
[67] M.L. Stein. An approach to asymptotic inference for spatial point processes. Statistica Sinica, 5:221–234, 1995.
[68] D. Stone, D.C. Kamineni, and A. Brown. Geology and fracture characteristics of the Underground Research Laboratory lease near Lac du Bonnet, Manitoba. Technical Report 243, Atomic Energy of Canada Ltd. Research Co., 1984.
[69] D. Stoyan, W.S. Kendall, and J. Mecke. Stochastic Geometry and its Applications. John Wiley and Sons, Chichester, 1987.
[70] D. Stoyan, W.S. Kendall, and J. Mecke. Stochastic Geometry and its Applications. John Wiley and Sons, Chichester, second edition, 1995.
[71] W. Weil and J.A. Wieacker. A representation theorem for random sets. Probability and Mathematical Statistics, 9:147–151, 1987.
[72] B.J. Wijers. Nonparametric estimation for a windowed line segment process. PhD thesis, University of Leiden, Leiden, The Netherlands, January 1995.
[73] D. Zimmerman. Censored distance-based intensity estimation of spatial point processes. Biometrika, 78:287–294, 1991.