RANKED SET SAMPLING FOR ENVIRONMENTAL STUDIES

JAYANT V. DESHPANDE
Indian Institute of Science Education and Research, Pune - 411021, India

Talk delivered at the INDO-US Workshop on Environmental Statistics held at Duke University, NC, U.S.A. from March 4 to March 6, 2013


ABSTRACT

Here the fact that Ranked Set Sampling (RSS) procedures were initially developed for environmental studies is emphasized. These procedures envisage ranking a large number of observations (in small groups) and actually measuring only a few of them. The basic procedure to estimate the mean and its higher efficiency over the SRS procedure are demonstrated. Then we discuss the modification to be carried out for developing nonparametric confidence intervals for quantiles. Lastly we discuss some tests for perfect judgment ranking of observations which should be used as preliminary procedures before the RSS methodology is adopted.


Outline

Examples from Environmental Studies

Introduction to Ranked Set Sampling

Estimation of Mean

Confidence Intervals for Quantiles

Tests for Perfect Ranking

Conclusions.


1. Introduction and Summary

This paper reviews the use of ranked set sampling methodology for statistical problems arising in environmental studies. These studies typically require high cost observations. Hence it is important that the inferences be based on the smallest possible number of observations. Ranked set sampling is a methodology which can improve the efficiency of techniques such as estimation and confidence intervals without increasing the number of measured observations, provided that additional information on their relative ranking is available (or can be obtained without significant expense).

In the second section we discuss the modes of obtaining environmental data and the reasons why this can be very expensive, bringing out the need for exercising parsimony in the number of observations.

We then introduce ranked set sampling (RSS) and explain why it leads to greater efficiency of the statistical procedures compared to those based on the usual simple random sampling (SRS). We exhibit the gain in efficiency of the RSS mean over the SRS mean. We also explicitly bring out the increase in coverage probabilities of distribution-free confidence intervals based on order statistics in similar circumstances.

In the operation of RSS procedures one needs to rank a larger number of observations than those measured for the characteristic of interest. This ordering is often done on the basis of some surrogate variable or covariate whose values are available practically free of cost. The optimal procedures assume perfect rankings, but whether the rankings are in fact perfect usually remains unanswered. We therefore discuss tests available for testing perfect ranking against the extreme alternative of random rankings. We see that reasonable test procedures allow us to decide either in favour of perfect ranking or against it.

In the last section we discuss some aspects of the large body of work on RSS methodology, giving references to a recent book and some review papers.

It may be observed that after introducing the well known properties of the RSS estimation of the mean, we discuss some newer work not yet reported in, say, the book of Chen, Bai and Sinha (2003).


2. Problems Arising in Environmental Studies where Ranked Set Sampling is Useful

Observations in environmental studies are notoriously difficult and/or expensive to obtain. Study after study has confirmed this. Hence the RSS methodology, which has higher efficiency for the same number of measured (as opposed to ranked) observations, is the preferred one. In fact its origins and much of its development lie in the domain of environmental studies.

McIntyre (1952), who initially proposed this methodology, did so in the context of estimating mean pasture and forage yields. In order to obtain observations one must harvest the forage. This involves mowing and clipping the browse and then weighing it after drying it. All this is a laborious, expensive and time consuming process, so all avenues towards reducing the number of observations are explored. Now one can rank the observational units (called quadrats) more or less by just inspecting them visually. Thus ranking can be accomplished in a large number of quadrats without much expense and included in the RSS procedures.

Another example often quoted is that of assessing the status of hazardous waste sites. The radiochemical analysis of the soil samples is expensive as well as hazardous. But the samples can be ranked according to the value of a surrogate variable, viz., the Field Instrument for the Determination of Low Energy Radiation (FIDLER) counts per minute taken at the location from where the soil samples for the radiochemical analysis would be obtained. These counts can be obtained at relatively low cost and then used for ranking a large number of soil samples. The actual measurements, which are expensive, are of plutonium concentration per square metre of surface soil (Yu and Lam (1997)).


Another experiment is described in detail in Murray, Ridout and Cross (2000). Leaves on standing trees were sprayed with a fluorescent, water soluble tracer at 2% concentration in water. Then a large number of leaves were harvested and ranked according to the surface area covered with the chemical. Once ranking was complete, the deposits on a smaller number of selected ranked leaves were washed off and the flow directed into a test tube. The relevant concentration of the tracer was then measured using a spectrometer. Thus the data were collected. It may be noted that the ranking stage is much less laborious, time consuming and expensive than the measuring stage.

Another example concerns estimation of mercury contamination in fish as discussed in Murff and Sager (2006). Ten appropriate live or freshly dead fish were selected from a catch. Their lengths were measured for ranking. Then in the laboratory a fillet was removed from each fish, homogenized with a blender and analyzed for mercury concentration (in mg/kg) with a gas chromatograph. This is an expensive process, besides resulting in destruction of the sample. RSS methodology was suggested for use in such an experiment. However, in this case it was found that although RSS was more efficient, the gain was not very large for the ordinary least squares regression technique.

Hence it is of relevance to see how much gain in efficiency is actually made by the RSS methodology over the SRS methodology.


3. Ranked Set Sampling Methodology and Estimation of the Mean

The methodology of Ranked Set Sampling was introduced in 1952 by McIntyre. For a while it did not attract the attention it deserved, but there has been a surge of interest in it over the last twenty years or so. Patil, Sinha and Taillie (1994), Barnett (1999), Chen, Bai and Sinha (2003) and Wolfe (2004) may be seen as some of the landmark contributions. Let us first describe the basic framework of this methodology. Suppose we are interested in a characteristic represented by the values of a random variable $X$ taken by each of the units in the population. Let $X$ have a continuous probability distribution with c.d.f. $F$ and p.d.f. $f$. Let $\mu$ be the expectation of $X$ and $\sigma^2$ its variance. The methodology consists of the following stages.

(i) First choose $k$ units from the population as a simple random sample. By some (hopefully inexpensive) procedure select the unit with the smallest value of $X$. Let it be $X_{[1]}$.

(ii) A second independent SRS of size $k$ is chosen and the unit with the second smallest value $X_{[2]}$ of the character is selected. Similarly, continue this process until one has $X_{[1]}, X_{[2]}, \cdots, X_{[k]}$, a collection of independent order statistics from $k$ disjoint simple random samples of size $k$ each.

This constitutes the basic balanced Ranked Set Sample. These are independently distributed with respective p.d.f.'s

$$ f_{(i)}(x_{[i]}) = \frac{k!}{(i-1)!\,(k-i)!} \, (F(x_{[i]}))^{i-1} \, (1 - F(x_{[i]}))^{k-i} \, f(x_{[i]}), \quad -\infty < x_{[i]} < \infty. $$


Note that we drew $k^2$ units from the population (as $k$ simple random samples of size $k$ each), but retained only one unit, the one with the appropriate rank, from each group of $k$. So the total effort has been to rank $k$ random samples of size $k$ each and to retain only $k$ units, the $i$-th ($i = 1, 2, \cdots, k$) order statistic from the $i$-th group, which are actually measured.

It is expected that ranking is cheap and measuring is expensive. Ranking $k^2$ observations and using $k$ of them after measurement is thus a sampling method that boosts the efficiency of the statistical procedures relative to those based merely on $k$ SRS measurements.
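The two stages of the basic balanced procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `balanced_rss` and `draw` are hypothetical names, the population is taken to be exponential purely for the demonstration, and ranking is done on the true values, which corresponds to perfect ranking.

```python
import random

def balanced_rss(draw, k):
    """One balanced ranked set sample of size k under perfect ranking.

    `draw` is a hypothetical zero-argument callable returning one value of X.
    """
    measured = []
    for i in range(k):
        group = [draw() for _ in range(k)]   # fresh SRS of k units for stage i + 1
        measured.append(sorted(group)[i])    # keep only the (i + 1)-th smallest
    return measured

rng = random.Random(0)
sample = balanced_rss(lambda: rng.expovariate(1.0), k=4)
print(sample)  # 4 measured values obtained from 16 ranked units
```

In practice the sort would be replaced by an inexpensive judgment or covariate-based ranking, and only the selected unit in each group would be measured.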


In order to retain the reliability of the rankings, it is usually suggested that ranking be carried out in small groups, say up to 4, 5 or 6 in size. However, this would limit both the versatility and efficiency of the procedures. Hence recourse is taken to replicating the whole process $m$ times. Thus $mk^2$ units are examined, each group of $k$ is ranked within itself and the $i$-th order statistic is obtained from $m$ groups, leading to the data $X_{[i]j}$, $i = 1, \cdots, k$, $j = 1, \cdots, m$. This constitutes balanced RSS sampling. In unbalanced sampling, instead of $m$ replications for each order statistic, one has $m_i$ replications of the $i$-th order statistic.


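4. Estimation of Mean

Let us now consider the estimation of $\mu$, the mean of $F$. If we use $k$ SRS observations then $\bar{X}$, their sample mean, is an unbiased estimator with variance $\sigma^2/k$. The mean

$$ \bar{X}^* = \frac{1}{k} \sum_{i=1}^{k} X_{[i]} $$

of the $k$ RSS observations is also unbiased for $\mu$, as seen below.

$$
\begin{aligned}
E(\bar{X}^*) &= \frac{1}{k} \sum_{i=1}^{k} E(X_{[i]}) \\
&= \frac{1}{k} \sum_{i=1}^{k} \int_{-\infty}^{\infty} x \, \frac{k!}{(i-1)!\,(k-i)!} \, (F(x))^{i-1} (1 - F(x))^{k-i} f(x)\, dx \\
&= \int_{-\infty}^{\infty} x \left[ \sum_{i=1}^{k} \binom{k-1}{i-1} (F(x))^{i-1} (1 - F(x))^{k-i} \right] f(x)\, dx \\
&= \int_{-\infty}^{\infty} x f(x)\, dx = \mu.
\end{aligned}
$$

The bracketed sum equals 1 by the binomial theorem.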


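Further we see that

$$
\begin{aligned}
V(\bar{X}^*) &= \frac{1}{k^2} \sum_{i=1}^{k} V(X_{[i]}) \\
&= \frac{1}{k^2} \sum_{i=1}^{k} E\big(X_{[i]} - E(X_{[i]})\big)^2 \\
&= \frac{1}{k^2} \Big[ k\sigma^2 - \sum_{i=1}^{k} (\mu^*_{(i)} - \mu)^2 \Big] \\
&= \frac{\sigma^2}{k} - \frac{1}{k^2} \sum_{i=1}^{k} (\mu^*_{(i)} - \mu)^2 \\
&= V(\bar{X}) - \frac{1}{k^2} \sum_{i=1}^{k} (\mu^*_{(i)} - \mu)^2 \\
&\le V(\bar{X}),
\end{aligned}
$$

where $\mu^*_{(i)} = E(X_{[i]})$. The third line uses $E(X_{[i]} - \mu)^2 = V(X_{[i]}) + (\mu^*_{(i)} - \mu)^2$ together with $\sum_{i=1}^{k} E(X_{[i]} - \mu)^2 = k\sigma^2$, which holds because the average of the $k$ order statistic densities is $f$.

Hence, if the rankings are perfect, the RSS estimator of $\mu$ has a variance no larger than that of the SRS estimator. This increase in efficiency comes at the cost (if any) of ranking the $k^2$ observations, in addition to measuring the $k$ selected order statistics.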
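As a rough numerical check of this inequality, the following sketch compares the simulated variances of the SRS mean and the RSS mean. The choices made here are assumptions for illustration only: a standard normal population, $k = 4$, perfect ranking, and the helper names `rss_mean` and `srs_mean`.

```python
import random
import statistics

def rss_mean(draw, k):
    """Mean of one balanced ranked set sample of size k (perfect ranking)."""
    total = 0.0
    for i in range(k):
        group = sorted(draw() for _ in range(k))  # rank a fresh set of k units
        total += group[i]                         # measure only the (i + 1)-th smallest
    return total / k

def srs_mean(draw, k):
    """Mean of k simple random sample measurements."""
    return sum(draw() for _ in range(k)) / k

rng = random.Random(2013)
draw = lambda: rng.gauss(0.0, 1.0)   # assumed population: standard normal
k, reps = 4, 20000

var_rss = statistics.pvariance([rss_mean(draw, k) for _ in range(reps)])
var_srs = statistics.pvariance([srs_mean(draw, k) for _ in range(reps)])
print(f"V(SRS mean) ~ {var_srs:.4f}, V(RSS mean) ~ {var_rss:.4f}")
```

Both estimators use the same number of measured units; the simulated SRS variance should come out close to $\sigma^2/k = 0.25$, with the RSS variance below it.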


5. Confidence Intervals for Quantiles

If $F$ is a continuous c.d.f. with the quantile of order $p$, $q_p = \inf\{x : F(x) \ge p\}$, then a standard nonparametric confidence interval for $q_p$ is provided by the order statistics. Let $X_1, \cdots, X_n$ be a random sample (SRS) of size $n$ and $X_{1:n} \le \cdots \le X_{n:n}$ the order statistics from it. Then if we choose $r < s$ such that

$$ P[X_{r:n} \le q_p \le X_{s:n}] = \sum_{j=r}^{s-1} \binom{n}{j} p^j (1-p)^{n-j} = 1 - \alpha, $$

$[X_{r:n}, X_{s:n}]$ provides a $(1 - \alpha)100\%$ confidence interval for $q_p$.
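The binomial sum above is easy to evaluate directly. The sketch below does so with a hypothetical helper `srs_coverage`; the values $n = 10$, $r = 2$, $s = 9$, $p = 0.5$ are illustrative and not taken from the talk.

```python
from math import comb

def srs_coverage(n, r, s, p):
    """P[X_{r:n} <= q_p <= X_{s:n}] for continuous F, via the binomial sum."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, s))

# Median (p = 0.5), n = 10, interval [X_{2:10}, X_{9:10}].
print(srs_coverage(n=10, r=2, s=9, p=0.5))  # 1002/1024 = 0.978515625
```

Because $n$ is finite, only a discrete set of confidence coefficients is attainable; in practice one picks $r$ and $s$ giving coverage at least $1 - \alpha$.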


Let us now adopt the RSS methodology. We select $n^2$ independent observations. These are ranked in groups of $n$ each. Let $X_{[1]}, X_{[2]}, \cdots, X_{[n]}$ be the 1-st, 2-nd, $\cdots$, $n$-th order statistics, one from each of these disjoint groups. Hence, while the marginal distribution of $X_{[i]}$ is the same as that of $X_{i:n}$, they are in addition independent. One then interprets the interval $[X_{[r]}, X_{[s]}]$, $r < s$, as a confidence interval for $q_p$. Due to the independence of $X_{[r]}$ and $X_{[s]}$, the confidence coefficient is

$$
\begin{aligned}
P[X_{[r]} \le q_p \le X_{[s]}] &= P[X_{[r]} \le q_p]\,[1 - P(X_{[s]} \le q_p)] \\
&= \left\{ \sum_{j=r}^{n} \binom{n}{j} p^j (1-p)^{n-j} \right\} \left\{ 1 - \sum_{j=s}^{n} \binom{n}{j} p^j (1-p)^{n-j} \right\}.
\end{aligned}
$$


One notes two properties here.

(i) $E(X_{s:n} - X_{r:n}) = E(X_{[s]} - X_{[r]})$. Hence both the SRS based and the RSS based confidence intervals have the same expected length.

(ii) If $r$ and $s$ are so chosen that the confidence coefficient for the traditional SRS confidence interval is $1 - \alpha$ with probability $\alpha/2$ in each tail, then the confidence coefficient of the RSS based confidence interval is

$$ \left(1 - \frac{\alpha}{2}\right)^2 = 1 - \alpha + \frac{\alpha^2}{4}, $$

giving an increase of $\alpha^2/4$ in coverage probability.
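Property (ii) can be checked numerically. The sketch below evaluates the SRS coverage and the RSS product formula for one equal-tailed case; the helper names and the values $n = 10$, $r = 2$, $s = 9$, $p = 0.5$ are assumptions made for illustration.

```python
from math import comb

def upper_tail(n, r, p):
    # P[X_{r:n} <= q_p] = P[at least r of the n observations are <= q_p]
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

def srs_coverage(n, r, s, p):
    return upper_tail(n, r, p) - upper_tail(n, s, p)

def rss_coverage(n, r, s, p):
    # Independence of X_[r] and X_[s] turns the joint probability into a product.
    return upper_tail(n, r, p) * (1 - upper_tail(n, s, p))

n, r, s, p = 10, 2, 9, 0.5           # equal tails by symmetry about the median
alpha = 1 - srs_coverage(n, r, s, p)
gain = rss_coverage(n, r, s, p) - srs_coverage(n, r, s, p)
print(gain, alpha**2 / 4)            # the two numbers agree
```

When the two tail probabilities are unequal the gain is the product of the two tail probabilities rather than $\alpha^2/4$, which the same functions can be used to verify.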


It is recognized that $X_{[r]}$ and $X_{[s]}$, being order statistics from independent simple random samples, do not necessarily obey $X_{[r]} < X_{[s]}$, so in some cases we may not have a proper interval at all. It is therefore suggested that we order the two statistics $X_{[r]}$ and $X_{[s]}$ as $X_{(rs1)} \le X_{(rs2)}$ and use $[X_{(rs1)}, X_{(rs2)}]$ as the confidence interval. The confidence coefficient of this modified interval is

$$
\begin{aligned}
P[X_{(rs1)} \le q_p \le X_{(rs2)}] &= P[\{X_{[r]} \le q_p \le X_{[s]}\} \text{ or } \{X_{[s]} \le q_p \le X_{[r]}\}] \\
&= P[X_{[r]} \le q_p \le X_{[s]}] + P[X_{[s]} \le q_p \le X_{[r]}] \\
&= P(X_{[r]} \le q_p)\, P(q_p \le X_{[s]}) + P(X_{[s]} \le q_p)\, P(q_p \le X_{[r]}).
\end{aligned}
$$


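Again, if $r$ and $s$ are so chosen that the SRS confidence interval leaves probability $\alpha/2$ in each tail, then the probability of coverage of $[X_{(rs1)}, X_{(rs2)}]$ is

$$ \left(1 - \frac{\alpha}{2}\right)^2 + \left(\frac{\alpha}{2}\right)^2 = 1 - \alpha + \frac{\alpha^2}{2}. $$

This C.I. thus adds a further $\alpha^2/4$ to the confidence coefficient, for a total gain of $\alpha^2/2$ over the SRS C.I. However, this comes at the cost of some increase in the expected length of the new C.I. It can be seen that

$$ E(X_{(rs2)} - X_{(rs1)}) = E_{F_{[s]}}\{2XF_{[r]}(X) - X\} + E_{F_{[r]}}\{2XF_{[s]}(X) - X\}, $$

where $F_{[i]}$ denotes the c.d.f. of $X_{[i]}$ and $E_{F_{[i]}}$ the expectation when $X$ has c.d.f. $F_{[i]}$; this can be calculated for specific distributions.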
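As a quick worked instance of the three coverage formulas, take the illustrative value $\alpha = 0.05$ (chosen here for the arithmetic only, not taken from the talk) with probability $\alpha/2 = 0.025$ in each tail:

```latex
\begin{aligned}
\text{SRS interval } [X_{r:n}, X_{s:n}]: \quad & 1 - \alpha = 0.95, \\
\text{RSS interval } [X_{[r]}, X_{[s]}]: \quad & (1 - \alpha/2)^2 = 0.975^2 = 0.950625, \\
\text{ordered RSS interval } [X_{(rs1)}, X_{(rs2)}]: \quad & 0.975^2 + 0.025^2 = 0.950625 + 0.000625 = 0.95125.
\end{aligned}
```

The gains in coverage are therefore small at conventional levels, $\alpha^2/4 = 0.000625$ and $\alpha^2/2 = 0.00125$ respectively.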


To increase the flexibility of the confidence intervals, it is suggested that $m$ independent groups of $n^2$ observations be obtained for ranking purposes. From these, $m$ replicates

$$ X_{[i]j}, \quad j = 1, 2, \cdots, m, \quad i = 1, \cdots, n, $$

of the $i$-th order statistic are obtained. These $N = mn$ independent order statistics are then ordered from lowest to highest as

$$ X_{1:N} \le \cdots \le X_{N:N}. $$

Then $1 \le r < s \le N$ can be appropriately chosen to obtain nonparametric confidence intervals with confidence coefficient $(1 - \alpha)100\%$. For details see Ozturk and Deshpande (2006). Further, similar RSS based confidence intervals for quantiles of finite populations have been discussed in Deshpande, Frey and Ozturk (2006).


6. Judgment Rankings and Tests for Perfect Judgment

The ranked set sampling protocol depends heavily on the ability to rank observations which are not measured. As the groups in which ranking is to be made become large, such rankings become more difficult and thus more unreliable. So it is always preferred to rank observations only in small groups (usually restricted to 4, 5 or 6 units). There are other practical considerations as well. If two quadrats are adjacent then it is easy to compare them visually for their prospective yields. But if two or more quadrats are far flung then making comparative judgments is far more difficult.

The extensive ranked set sampling literature includes optimal procedures which rely on the assumption of perfect rankings and also those which are more robust with respect to the violation of this assumption. An applied statistician who assumes that the rankings are perfect and chooses the appropriately optimal procedure faces the possibility that the procedure is not optimal, and perhaps not even valid, if the rankings are imperfect. So as a preliminary procedure we suggest that a test for the perfectness of rankings be used.


In Frey, Ozturk and Deshpande (2007) the following approach has been suggested. Let $X_{[i]j}$, $i = 1, \cdots, m$, $j = 1, \cdots, n_i$, be a ranked set sample. Here $m$ observations are ranked and $X_{[i]j}$ is the $i$-th order statistic to be measured. This is done $n_i$ times, $j = 1, 2, \cdots, n_i$, giving us a full set of $N = n_1 + n_2 + \cdots + n_m$ observations. Let $R_{[i]j}$ be the rank of $X_{[i]j}$ among these $N$ observations. In case of perfect rankings one can find the probabilities of each of the $N!$ possible rank vectors which are the permutations of $\{1, 2, \cdots, N\}$. The null hypothesis $H_0$ of perfect rankings says that the ranks are distributed as the ranks of independent order statistics.

This distribution, although theoretically known, is not the usual equal probability for all possible rankings. In a very small example with $m = 3$, $n_1 = n_2 = n_3 = 1$ one can obtain the probabilities, under the null hypothesis, of the vector $R = (R_{[1]1}, R_{[2]1}, R_{[3]1})$ as follows.

    R          Prob.
    (1 2 3)    64/105
    (1 3 2)    143/840
    (2 1 3)    143/840
    (2 3 1)    17/840
    (3 1 2)    17/840
    (3 2 1)    1/105

A possible way to test $H_0$ is to form the critical region as the union of the least likely (under $H_0$) outcomes. For example, the union {(2 3 1), (3 1 2), (3 2 1)} provides a critical region whose probability of type I error is $17/840 + 17/840 + 1/105 = 0.05$. This is in consonance with Fisher's approach of constructing tests of significance.
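The construction of this critical region can be reproduced with exact rational arithmetic. The sketch below copies the null probabilities from the table above; the function name `least_likely_region` and the greedy rule (add outcomes in increasing order of null probability while the size stays within $\alpha$) are one possible reading of the approach, stated here as an assumption.

```python
from fractions import Fraction

# Null probabilities of R = (R[1]1, R[2]1, R[3]1), copied from the table above.
null_prob = {
    (1, 2, 3): Fraction(64, 105),
    (1, 3, 2): Fraction(143, 840),
    (2, 1, 3): Fraction(143, 840),
    (2, 3, 1): Fraction(17, 840),
    (3, 1, 2): Fraction(17, 840),
    (3, 2, 1): Fraction(1, 105),
}

def least_likely_region(prob, alpha):
    """Add outcomes in increasing order of null probability while size <= alpha."""
    region, size = [], Fraction(0)
    for outcome, p in sorted(prob.items(), key=lambda item: item[1]):
        if size + p > alpha:
            break
        region.append(outcome)
        size += p
    return region, size

region, size = least_likely_region(null_prob, Fraction(1, 20))
print(sum(null_prob.values()))   # 1
print(sorted(region), size)      # [(2, 3, 1), (3, 1, 2), (3, 2, 1)] 1/20
```

The six tabulated probabilities sum to one, and the three least likely rank vectors together have null probability exactly 1/20.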


Another approach, due to Neyman and Pearson, requires first of all an alternative hypothesis. Heuristically, one may say that the alternative to perfect rankings is totally random ranking, i.e., one assigning equal probability to each rank vector. This would indicate that the rankings are arbitrary and devoid of any information regarding the sizes of the observations. The Neyman-Pearson approach rejects $H_0$ when the ratio of the probabilities of $R$ under $H_1$ and under $H_0$ exceeds a threshold value so chosen that the probability of type I error is the specified $\alpha$. Since the probability under $H_1$ is the same for every rank vector, this ratio is large exactly when the null probability is small, so this approach leads to exactly the same critical region as the Fisherian approach described above. Also see Frey and Wang (2013).
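For the small $m = 3$ example, the likelihood ratio of random ranking to perfect ranking can be listed explicitly. This is a sketch only: the null probabilities are those tabulated earlier in this section, and the threshold of 1 is an illustrative choice that happens to give size 0.05 here.

```python
from fractions import Fraction
from math import factorial

# Null probabilities under perfect ranking (m = 3, one replicate per rank).
null_prob = {
    (1, 2, 3): Fraction(64, 105),
    (1, 3, 2): Fraction(143, 840),
    (2, 1, 3): Fraction(143, 840),
    (2, 3, 1): Fraction(17, 840),
    (3, 1, 2): Fraction(17, 840),
    (3, 2, 1): Fraction(1, 105),
}
alt_prob = Fraction(1, factorial(3))   # H1: all 3! rank vectors equally likely

ratio = {r: alt_prob / p for r, p in null_prob.items()}
for r in sorted(ratio, key=ratio.get, reverse=True):
    print(r, ratio[r])

# Reject H0 when the ratio exceeds the (illustrative) threshold 1.
np_region = sorted(r for r, lr in ratio.items() if lr > 1)
np_size = sum(null_prob[r] for r in np_region)
print(np_region, np_size)
```

The rank vectors with ratio above 1 are (2 3 1), (3 1 2) and (3 2 1), the same region obtained from the least likely outcomes under $H_0$.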


However, the distribution theory of the ranks becomes too complicated for even moderate sized $m$ and $n_i$. Frey (2007) has provided a recursion formula for this purpose.

But, as is usual in nonparametric tests, one investigates functions (linear or otherwise) of the ranks whose exact distributions may be tabulated for small samples and whose asymptotic distributions may be obtained by appropriate versions of the central limit theorem.

Two such statistics have been proposed by Frey, Ozturk and Deshpande (2007). Let

$$ R_{[i]\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} R_{[i]j}, \qquad T_i = \frac{1}{\sqrt{N}} \big( R_{[i]\cdot} - E(R_{[i]\cdot}) \big), $$

and

$$ K = T' Q T, \quad \text{where } T' = (T_1, \cdots, T_m), $$

and $Q$ is the Moore-Penrose inverse of the asymptotic covariance matrix of $T$. The expectations and the covariances are under $H_0$. Then $K$ has asymptotically, under $H_0$, the chi-squared distribution with $m - 1$ degrees of freedom. Large values, indicating large departures of the observed $R$ from its null expectation, provide evidence against $H_0$ and in favour of some alternative hypothesis.


Another statistic proposed in the same paper is

$$ W^* = \sum_{i=1}^{m} \sum_{j=1}^{n} i \, R_{[i]j}, $$

written here for a balanced sample with $n_i = n$ for all $i$, the test rejecting for small values of the statistic. After standardization, its asymptotic distribution is the standard normal.

We assess the performance of these tests through simulated power for alternatives which are convex combinations of the probabilities under $H_0$ and under the extreme random rankings. We find the test based on $W^*$ consistently out-performing the one based on $K$.

Similar work was undertaken by Vock and Balakrishnan (2011, 2013). In the first paper they propose a Jonckheere-Terpstra type test for perfect rankings and in the second paper they find that it is essentially the Frey, Ozturk and Deshpande (2007) test, the test statistics being linear functions of each other.


Example: One of the examples introduced earlier was about ranking and measuring the percent cover of leaves under various sprayer settings. The experiment had $m = n = 5$, giving a total of 25 observations. The ranks observed in the 5 groups of order statistics were

(1, 5, 4, 6, 3),
(2, 11, 10, 8, 15),
(22, 12, 14, 18, 13),
(7, 9, 20, 19, 23), and
(16, 25, 24, 17, 21).

We find $W^* = 1175$, which has (simulated) p-value > 0.10. Hence it may be concluded that there is insufficient evidence to reject the null hypothesis of perfect orderings.
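The value of $W^*$ for these ranks, and a simulated p-value, can be obtained along the following lines. The simulation scheme is an assumption made for illustration (uniform observations, which is harmless because the ranks are distribution-free for continuous $F$, and 5000 replications); it is not necessarily the procedure used for the figure quoted in the talk, and the function names are hypothetical.

```python
import random

ranks = [
    (1, 5, 4, 6, 3),
    (2, 11, 10, 8, 15),
    (22, 12, 14, 18, 13),
    (7, 9, 20, 19, 23),
    (16, 25, 24, 17, 21),
]

def w_star(rank_groups):
    # Sum of i * R[i]j, with i = 1 for the group of smallest order statistics.
    return sum(i * sum(group) for i, group in enumerate(rank_groups, start=1))

def simulate_w_star(m, n, rng):
    """One W* value under perfect ranking, from uniform observations."""
    values = []
    for i in range(m):
        for _ in range(n):
            group = sorted(rng.random() for _ in range(m))
            values.append((group[i], i))      # i-th order statistic of a fresh set of m
    values.sort()
    sums = [0] * m
    for rank, (_, i) in enumerate(values, start=1):
        sums[i] += rank                       # accumulate overall ranks per judgment class
    return sum((i + 1) * s for i, s in enumerate(sums))

observed = w_star(ranks)                      # 1175 for the ranks above
rng = random.Random(1)
null = [simulate_w_star(5, 5, rng) for _ in range(5000)]
p_value = sum(w <= observed for w in null) / len(null)   # small W* counts against H0
print(observed, p_value)
```

The largest possible value of $W^*$ with $m = n = 5$ is 1225, attained when every rank in group $i$ is below every rank in group $i + 1$, so an observed value of 1175 lies close to the upper end of the range.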


7. Conclusions

In this review we have only provided an introduction to the ranked set sampling methodology as applicable to environmental studies. Since its introduction in 1952 it has taken great strides in the analysis of parametric and nonparametric models as introduced here. Further, problems like optimal estimation in the context of lognormal, extreme value and other distributions are discussed by Barnett (1999). An easy to read introduction is available in Patil (2002). Chen, Bai and Sinha (2003) have an entire book on ranked set sampling. Ozturk and Deshpande (2004) have proposed a new test for the nonparametric two sample scale problem. Newer contributions include more detailed power studies. Murff and Sager (2006) have shown that the use of this methodology in ordinary least squares regression does improve the efficiency, but only marginally.

However, the basic result stands: if inexpensive (or cost free) ranking is incorporated along with measurements which are expensive, then RSS does provide some increase in the efficiency of the procedure over its SRS version.


References

[1] Barnett V., (1999), Ranked set sample design for environmental investigations, Environ. Ecol. Statist., 6, 59-74.

[2] Chen Z., Bai Z., Sinha B. K., (2003), Ranked Set Sampling, Springer.

[3] Deshpande J. V., Frey J., Ozturk O., (2006), Nonparametric ranked set sampling confidence intervals for quantiles of a finite population, Environ. Ecol. Statist., 13, 25-40.

[4] Frey J., Ozturk O., Deshpande J. V., (2007), Nonparametric tests for perfect judgment rankings, J. Am. Statist. Assoc., 102, 708-717.

[5] Frey J., (2007), A note on probability involving independent order statistics, J. Statist. Comp. Sim., 77, 969-975.

[6] Frey J., Wang L., (2013), Most powerful rank tests for perfect rankings, Comp. Statist. Data An., 60, 157-168.

[7] McIntyre G. A., (1952), A method for unbiased selective sampling, using ranked sets, Aust. J. Agr. Res., 3, 385-390.

[8] Murff E., Sager T., (2006), The relative efficiency of ranked set sampling in ordinary least squares regression, Environ. Ecol. Statist., 13, 41-52.

[9] Murray J. A., Ridout M. S., Cross J. V., (2000), The use of ranked set sampling in spray deposit assessment, Aspects of App. Bio., 57, 141-146.

[10] Ozturk O., Deshpande J. V., (2004), A new nonparametric test using ranked set data for a two sample scale problem, Sankhya, 66, 513-527.

[11] Ozturk O., Deshpande J. V., (2006), Ranked set sample nonparametric quantile confidence intervals, J. Statist. Planning Inf., 136, 570-577.

[12] Patil G. P., Sinha A. K., Taillie C., (1994), Ranked set sampling, in Handbook of Statistics, 12, Ed. G. P. Patil and C. R. Rao, 167-200.

[13] Patil G. P., (2002), Ranked set sampling, Ency. Environmetrics, Ed. A. H. El-Shaarawi, W. W. Piegorsch, 3, 1684-1690.

[14] Vock M., Balakrishnan N., (2011), A Jonckheere-Terpstra type test for perfect ranking in balanced ranked set sampling, J. Statist. Planning Inf., 141, 624-630.

[15] Vock M., Balakrishnan N., (2013), A connection between two nonparametric tests for perfect ranking in balanced ranked set sampling, Comm. Statist. (Th. Methods), 42, 191-193.

[16] Wolfe D. A., (2004), Ranked set sampling: An approach to more efficient data collection, Statist. Sci., 19, 636-643.

[17] Yu P. L. H., Lam K., (1997), Regression estimator in ranked set sampling, Biometrics, 53, 1070-1080.