Testing Inverse Problems: a direct or an indirect problem?

Béatrice Laurent, Jean-Michel Loubes and Clément Marteau
Abstract
In this paper, we consider ill-posed inverse problem models Y = Tf + σΞ, where T denotes a compact operator, σ a noise level, Ξ a Gaussian white noise and f the function of interest. Recently, minimax rates of testing in such models have been obtained in various situations, both from asymptotic and non-asymptotic points of view. Nevertheless, it seems necessary to propose testing strategies attaining these rates, easy to implement and robust with respect to the characteristics of the operator. In particular, we prove that the inversion of the operator is not always necessary. This result provides interesting perspectives, for instance in the specific cases where the operator is unknown or difficult to handle.
Keywords: Test, Inverse Problems
Subject Class. MSC-2000: 62G05, 62K20
1 Introduction
Inverse problems have been extensively studied over the past decades. They provide a non-trivial generalization of classical statistical models and are at the core of several application problems. In this paper, we consider the model
Y = Tf + σΞ,

where T : X → Y denotes a compact operator, X and Y Hilbert spaces, σ a noise level and Ξ a Gaussian white noise.
In this framework, the most common issues mainly concern the estimation of the parameter of interest f, with either parametric or non-parametric techniques. Optimality with respect to a particular loss function has been achieved, and some authors have built adaptive methods leading to general oracle inequalities. Many methods have been considered, representing the main trends in statistical estimation (kernel methods, model selection, projection onto specific bases).
Actually, one of the main differences between direct and indirect problems comes from the fact that two spaces are at hand: the space of the observations Y and the space where the function will be estimated, namely X, the operator T : X → Y mapping one space into the other. Hence, to build a statistical procedure, a choice must be made which will determine the whole methodology. This question is at the core of the inverse problem structure and is encountered in many cases. When trying to build bases well adapted to the operator, two strategies can be chosen: either expanding the function onto a wavelet basis of the space X and taking the image of the basis by the operator as stated in [6], or expanding the image of the function onto a wavelet basis of Y and looking at the image of the basis by the inverse of the operator, studied in [1]. For the estimation problem with model selection theory, estimators can be obtained either by considering sieves (Y_m)_m ⊂ Y with their counterparts X_m := T*Y_m ⊂ X, or sieves (X_m)_m ⊂ X and their images Y_m := TX_m ⊂ Y (see for instance [13, 12]).
Signal detection for inverse problems has received little attention. While some particular cases such as convolution problems have been widely investigated (see [4] or [7] for general references), the general case of tests for inverse problems has only been tackled very recently. We refer to [8], [10] for a complete asymptotic and non-asymptotic theory. The previous dilemma concerning the choice of the space where to perform the study is also crucial here. Indeed, two methods are at hand: the first one consists in performing tests on the functional space X, which implies inverting the operator, while the second is to build a test directly on the observations in the space Y, which involves considering a hypothesis on the image of the unknown function. Such issues have been tackled for different alternatives in [3] or [7].
It is obvious that from a practical point of view, for known operators, testing the data directly has the advantage of being very easy to use, requiring few computations. Indeed, since Tf follows a direct regression model, all well-known testing procedures may apply. In this paper, we show that in many cases, considering the problem as a direct problem often leads to very interesting testing performances. Hence, depending on the difficulty of the inverse problem and on the set of assumptions on the function to be detected (sparsity conditions or smoothness conditions), we prove that the specific treatment devoted to inverse problems, which includes an underlying inversion of the operator, may worsen the detection strategy. For each situation, we also highlight the cases where the direct strategy fails and where a specific test for inverse problems should be preferred. Deviations from an assumption on the function may be more natural to consider than assumptions on its image Tf by the operator, but since we consider signal detection, i.e. tests on the nullity of the function, both assumptions can be investigated.
The paper falls into the following parts. Section 2 defines precisely the criterion chosen to describe the optimality of the tests. Section 3 is devoted to testing issues where the signal is characterized by smoothness constraints. In Section 4, we consider finite dimensional models: we present and develop some tools that will be used to prove the results of Section 3. Then, some simulation results are gathered in Section 5. A general discussion is displayed in Section 6 while the proofs are postponed to Section 7.
2 Signal detection for inverse problems: two possible frameworks
We consider in this paper signal detection of a function f observed in the following framework. Let T be a linear operator on a Hilbert space X with inner product (., .), and consider an unknown function f observed from indirect observations in a Gaussian white noise model

Y(g) = (Tf, g) + σ ε(g), g ∈ H, (1)

where ε(g) is a centered Gaussian variable with variance ‖g‖² := (g, g). The operator T is supposed to be compact. Then it admits a singular value decomposition (SVD) (b_j, ψ_j, φ_j)_{j≥1} in the sense that

Tψ_j = b_j φ_j, T*φ_j = b_j ψ_j, ∀j ∈ N*, (2)

where T* denotes the adjoint operator of T. Considering the observations (Y(φ_j))_{j∈N*}, Model (1) becomes

Y_j = b_j θ_j + σ ε_j = ν_j + σ ε_j, ∀j ≥ 1, (3)

with Y_j = Y(φ_j), ε_j = ε(φ_j), (Tf, φ_j) = b_j θ_j = ν_j and θ_j = (f, ψ_j). Hence, inference on the sequence θ = (θ_j)_{j∈N*} provides the same results as inference on the function f.
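As a concrete illustration, the sequence-space model (3) is straightforward to simulate. The sketch below is not from the paper; all names are illustrative. It draws the observations Y_j = b_j θ_j + σ ε_j for a mildly ill-posed operator b_j = j^{-1} under the null hypothesis f = 0.

```python
import numpy as np

def simulate_sequence_model(theta, b, sigma, rng):
    # Y_j = b_j * theta_j + sigma * eps_j, with eps_j i.i.d. standard Gaussian
    eps = rng.standard_normal(len(theta))
    return b * theta + sigma * eps

rng = np.random.default_rng(0)
j = np.arange(1, 101)
b = 1.0 / j                  # mildly ill-posed operator: b_j = j^{-1}
theta = np.zeros(100)        # under H0: f = 0, hence theta_j = 0 for all j
Y = simulate_sequence_model(theta, b, sigma=0.1, rng=rng)
```

Under H0 the observations are pure noise of variance σ², so every test below reduces to deciding whether a quadratic functional of Y exceeds a null quantile.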
In order to measure the testing difficulty of a given model, we will consider the minimax point of view developed in the series of papers due to Ingster [9]. Let G be some subset of a Hilbert space and g ∈ G a function of interest. We consider the minimal radius ρ for which the problem of testing "g = 0" against the alternative "g ∈ G and ‖g‖ ≥ ρ" with prescribed probabilities of errors is possible. Of course, the smaller the radius ρ, the better the test.
More precisely, a test is a decision rule Φ with values in {0, 1}. By convention, we accept the null hypothesis H_0 when Φ = 0 and reject it otherwise. Let α ∈ (0, 1) be fixed; a decision rule Φ is a level-α test if P_{H_0}(Φ = 1) ≤ α. In this case, we often write Φ_α instead of Φ in order to highlight the dependence on α. The quality of the test Φ_α then relies on the quantity ρ(Φ_α, β, G) defined as

ρ(Φ_α, β, G) = inf{ ρ > 0 : sup_{g ∈ G, ‖g‖ ≥ ρ} P_g(Φ_α = 0) ≤ β },
for some fixed β. The minimax rate of testing on G is defined as the smallest possible rate on G, namely

ρ(G, α, β) := inf_{Φ_α} ρ(Φ_α, β, G),

where the infimum is taken over all possible level-α testing procedures. A testing procedure Φ_α is said to be minimax for H_0 : g = 0 on G if there exists a constant C ≥ 1 such that

sup_{g ∈ G, ‖g‖ ≥ C ρ(G, α, β)} P_g(Φ_α = 0) ≤ β.
Let H_0^IP be the null hypothesis corresponding to inference on the function f or its coefficients, namely

H_0^IP : f = 0,

associated with the alternative H_1^IP : ‖f‖ ≥ ρ, f ∈ F, for some F ⊂ X. The corresponding rates of testing have been computed very recently in [8] or [10], and different testing procedures have been proposed. We may alternatively consider the hypothesis image of the assumption H_0^IP by the operator T,

H_0^DP : Tf = 0,

with the alternative H_1^DP : ‖Tf‖ ≥ ρ, Tf ∈ TF, where TF denotes the image of F by the operator T. Since the operator is known, note that the assumption H_0^DP is completely specified. In this case, the model (3) may be viewed as a direct observation model. Previous rates have been computed for the different sets of assumptions that we will consider: conditions on the number of non-zero coefficients, or regularity assumptions. For more details, we refer to [9], [11], or [2] for a non-asymptotic point of view.
It seems clear that the assumptions Tf = 0 and f = 0 are equivalent since the operator T is injective. So H_0^IP and H_0^DP are two ways of rephrasing the same question. Nevertheless, remark that the associated alternatives and rates of convergence strongly differ. In some sense, the inversion of the operator may introduce an additional difficulty. Hence, a test minimax for H_0^DP on TF is not necessarily minimax for H_0^IP on F, and the same remark holds for the reverse. Nevertheless, we can conjecture that in some specific cases these hypotheses may be in some sense exchangeable, or at least that there exists some kind of hierarchy. The aim of this paper is to present these situations. More precisely, we will highlight situations where testing procedures may be used to consider both H_0^DP and H_0^IP in an optimal way. For different classes of functions F, we will point out the cases where

• every procedure Φ_α minimax for H_0^IP on F is also necessarily minimax for H_0^DP on TF,

• every procedure Φ_α minimax for H_0^DP on TF is also necessarily minimax for H_0^IP on F.

We will see that the answer is not so simple and depends on the space F and on the ill-posedness of the problem.
We will consider different spaces F in order to define the alternatives to H_0^DP and H_0^IP. Hereafter, we will consider inference on the function f = ∑_j θ_j ψ_j or on its image by the operator T, Tf = ∑_j b_j θ_j φ_j = ∑_j ν_j φ_j, expressing the assumptions directly on the functions or on their coefficients in the appropriate basis. Concerning the operator, conditions will be set on the sequence (b_k)_{k∈N*}: the decay of the eigenvalues indeed describes the difficulty of the inverse problem. We will consider two main cases:

Mildly ill-posed: there exist two constants c, C and an index s > 0 such that

∀j ≥ 1, c j^{-s} ≤ b_j ≤ C j^{-s}.

Severely ill-posed: there exist two constants c, C and an index γ > 0 such that

∀j ≥ 1, c exp(−γj) ≤ b_j ≤ C exp(−γj).

The parameters s and γ are called indices of ill-posedness of the corresponding inverse problem.
3 Test strategies under smoothness constraints
In the following, the functions of interest are assumed to belong to smoothness classes determined by the decay of their coefficients in the bases (ψ_k)_{k≥1} or (φ_k)_{k≥1}. Several explicit rates of testing have been established according to the considered space (X or Y), the rate of decay of the coefficients, and the behavior of the sequence (b_k)_{k≥1}. For more details, we refer for instance to [2] in the direct case (i.e. when T denotes the identity) or to [10] in a heteroscedastic setting, which is equivalent to the inverse problem modeled in (3).
More precisely, let a = (a_k)_{k∈N*} be a monotone non-decreasing sequence and R > 0. In the following, we assume that the function f belongs to an ellipsoid E^X_{a,2}(R) of the form

E^X_{a,2}(R) = { g ∈ X : ∑_{j=1}^{+∞} a_j² ⟨g, ψ_j⟩² ≤ R² }. (4)
The sequence a characterizes the shape of the ellipsoid. Functional sets of the form (4) are often used to model smoothness classes of functions: the choice a_j = j^s ∀j corresponds to a Sobolev ball with regularity s, namely H^s(R), while a_j = exp(sj) ∀j entails that the function belongs to an analytic class of functions with parameter s. We can also define the analogue of E^X_{a,2}(R) on Y, denoted in the following by E^Y_{c,2}(R), for a given sequence c.
Imposing an ellipsoid-type constraint on the function f also entails an ellipsoid constraint on Tf. Indeed, using Model (3), we get that Tf ∈ E^Y_{c,2}(R) as soon as f ∈ E^X_{a,2}(R), for c = (c_k)_{k∈N*} = (a_k b_k^{-1})_{k∈N*}. We point out that the operator regularizes the function: under such a smoothness assumption, the function Tf is smoother than f.
On the one hand, the rates for the inverse hypothesis H_0^IP highly depend on the ill-posedness of the inverse problem and on the different types of ellipsoids. They are given in [10] and recalled in Table 1 below. In the following, we write µ_k ≍ ν_k if there exist two constants c1 and c2 such that for all k ≥ 1, c1 ≤ µ_k/ν_k ≤ c2. Table 1 presents the minimax rates of testing over the ellipsoids E^X_{a,2}(R) with respect to the l² norm. We consider various behaviours for the sequences (a_k)_{k∈N*} and (b_k)_{k∈N*}. For each case, we give f(σ) such that for all 0 < σ < 1, C1(α, β) f(σ) ≤ ρ²(E^X_{a,2}(R), α, β) ≤ C2(α, β) f(σ), where C1(α, β) and C2(α, β) denote positive constants independent of σ.
                          Mildly ill-posed                 Severely ill-posed
                          b_k ≍ k^{-t}                     b_k ≍ exp(−γk)

  a_k ≍ k^s               σ^{4s/(2s+2t+1/2)}               (log(σ^{-2}))^{-2s}

  a_k ≍ exp(νk^s)         σ² (log(σ^{-2}))^{(2t+1/2)/s}    e^{−2νD^s}  (s < 1)

Table 1: Minimax rates of testing in the indirect case.

Here D denotes the integer part of the solution of 2νD^s + 2γD = log(σ^{-2}), and t, s, ν, γ are positive constants.
On the other hand, in [2], the rates for the direct hypothesis H_0^DP are obtained under the ellipsoid condition Tf ∈ E^Y_{c,2}(R) and are presented in Table 2 below.

  c_k ≍ k^u               σ^{4u/(2u+1/2)}

  c_k ≍ exp(αk^u)         σ² (log(σ^{-2}))^{1/(2u)}

Table 2: Minimax rates of testing in the direct case.

The terms u, α are positive constants. Test strategies in this framework are based on the following theorem.
Theorem 3.1 Let (Y_j)_{j≥1} be a sequence obeying Model (3). Let α, β ∈ (0, 1) be fixed, and let E^X_{a,2}(R) be the ellipsoid defined in (4). We assume that 0 < σ < 1. Then, in the four cases displayed in Table 1, we have:

• every level-α test minimax for H_0^DP on E^Y_{c,2}(R) is also minimax for H_0^IP on E^X_{a,2}(R);

• there exist level-α tests minimax for H_0^IP on E^X_{a,2}(R) but not for H_0^DP on E^Y_{c,2}(R);

where for all k ≥ 1, c_k = a_k b_k^{-1}.
Note that under the ellipsoid constraint, the previous results hold both for mildly and severely ill-posed problems. Hence, the conclusion of this theorem is that testing in the space of the observations should be preferred to building specific tests designed for the inverse problem, which would not improve the rates and would introduce additional difficulties. Remark that we do not claim that every procedure minimax for H_0^IP will necessarily fail for testing H_0^DP, but rather that an additional study seems necessary.
A part of the proof of Theorem 3.1 is based on Lemma 3.2 below and is postponed to Section 7. This lemma provides, under the ellipsoid constraint, an embedding of the deviation balls for Tf provided that f is bounded away from zero.
Lemma 3.2 Let (γ_σ) be a positive sequence such that γ_σ → 0 as σ → 0. The following embedding holds:

{ f ∈ E^X_{a,2}(R), ‖f‖² ≥ γ_σ } ⊂ { f ∈ E^X_{a,2}(R), ‖Tf‖² ≥ (1−c) µ_σ },

where c ∈ ]0, 1[, µ_σ = b²_{m(σ)} γ_σ, and m(σ) is such that R² a^{-2}_{m(σ)} ≤ c γ_σ.
Using Lemma 3.2, we can control the norm of Tf given ‖f‖² for all f belonging to E^X_{a,2}(R). Remark that the reverse is not true: indeed, we can always construct signals such that ‖Tf‖² and ‖f‖² are of the same order.
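The embedding of Lemma 3.2 can be checked numerically. The sketch below is illustrative and not from the paper: it takes a_j = j, b_j = j^{-1}, R = 1, a fixed level γ and c = 1/2, computes the smallest m with R² a_m^{-2} ≤ cγ, and verifies the resulting lower bound (1−c) b_m² γ on ‖Tf‖² over random points of the ellipsoid boundary with ‖f‖² ≥ γ.

```python
import numpy as np

rng = np.random.default_rng(1)
J = 50
j = np.arange(1, J + 1)
a = j.astype(float)        # a_j = j (Sobolev-type smoothness, s = 1)
b = 1.0 / j                # b_j = j^{-1} (mildly ill-posed)
R, gamma, c = 1.0, 0.1, 0.5

# smallest m with R^2 * a_m^{-2} <= c * gamma, i.e. m >= sqrt(R^2 / (c * gamma))
m = int(np.ceil(np.sqrt(R**2 / (c * gamma))))
bound = (1 - c) * b[m - 1] ** 2 * gamma    # (1 - c) * b_m^2 * gamma

n_checked = 0
for _ in range(1000):
    theta = rng.standard_normal(J) / j**2
    theta *= R / np.sqrt(np.sum(a**2 * theta**2))  # scale onto the ellipsoid boundary
    if np.sum(theta**2) >= gamma:                  # keep only ||f||^2 >= gamma
        n_checked += 1
        # Lemma 3.2: the image norm cannot fall below (1 - c) * b_m^2 * gamma
        assert np.sum(b**2 * theta**2) >= bound
```

The assertion never fires, because the tail mass ∑_{k>m} θ_k² is at most R² a_m^{-2} ≤ cγ, exactly the argument used in the proof of Section 7.3.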
4 Rate optimal strategies for finite dimensional models
The aim of this section is to present and develop some tools useful for the proofs of the results presented in the previous section. In particular, our aim is to exhibit tests that fail to be powerful simultaneously in the inverse and the direct setting (i.e. for the hypotheses H_0^DP and H_0^IP). Most of the tests designed for alternatives of the form "f ∈ E^X_{a,2}(R), ‖f‖ ≥ ρ" involve only a finite number of coefficients. These tests are in some sense biased. The key issue is thus to control this bias with respect to the rates of testing. As in classical nonparametric estimation problems, finding the best trade-off between these two quantities is at the heart of the testing approach. A good understanding of such testing procedures is thus necessary.
Hence, in this section, we deal with functions having zero coefficients after a certain level. For a given D ≥ 1, let

H_D = span{ψ_j, 1 ≤ j ≤ D},  K_D = span{φ_j, 1 ≤ j ≤ D},

where span(A) denotes the linear space generated by A ⊂ l²(N). From (2), the assertion f ∈ H_D is equivalent to Tf ∈ K_D.
Assume that f ∈ H_D, or equivalently that Tf ∈ K_D, which entails that the signal f is concentrated on the first D coefficients of the basis (ψ_k)_{k∈N*}. As said before in Section 2, choosing the best framework for building signal detection tests involves knowing the minimax rates of testing in both settings: the direct and the indirect version of the observation model. On the one hand, rates of testing for the inverse problem (3) have been obtained in [10] under such assumptions. For C_L(α, β), C_U(α, β) two positive constants, we have

C_L(α, β) σ² ( ∑_{j=1}^D b_j^{-4} )^{1/2} ≤ ρ₂²(H_D) ≤ C_U(α, β) σ² ( ∑_{j=1}^D b_j^{-4} )^{1/2}.
On the other hand, for the direct observation model Y_j = ν_j + σ ε_j, j ∈ N*, rates are given in [2]. For c_L(α, β), c_U(α, β) two positive constants, we get

c_L(α, β) σ² √D ≤ ρ₂²(K_D) ≤ c_U(α, β) σ² √D.
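To see how differently the two separation radii scale with D, the sketch below (illustrative, not from the paper) compares √D with (∑_{j≤D} b_j^{-4})^{1/2} for a mildly and a severely ill-posed operator. Since b_j ≤ 1 in both examples, the indirect quantity dominates the direct one for every D, polynomially in the mild case and exponentially in the severe case.

```python
import numpy as np

j = np.arange(1, 51).astype(float)

def indirect_rate(b):
    # (sum_{j<=D} b_j^{-4})^{1/2} for every D = 1..len(b)
    return np.sqrt(np.cumsum(b ** -4.0))

direct = np.sqrt(j)                       # sqrt(D): separation rate for H0^DP
mild = indirect_rate(1.0 / j)             # b_j = j^{-1}: grows like D^{2s+1/2}, s = 1
severe = indirect_rate(np.exp(-0.5 * j))  # b_j = exp(-j/2): grows like exp(2*gamma*D)
```

At D = 50 the mild indirect rate is already about three orders of magnitude above √D, which is the gap exploited by the counter-examples in the proof of Theorem 4.2.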
For fixed D, we are here in a parametric setting: both rates of testing are of order σ². In order to compare the hypotheses H_0^DP and H_0^IP in the spirit of Section 3, we have to take into account the dependence of these rates on the parameter D. In this sense, we introduce a definition of rate-optimality which will provide a framework for potential comparisons between H_0^DP and H_0^IP.
Definition 4.1 Assume that Y = (Y_j)_{j≥1} obeys Model (3). Let α, β ∈ ]0, 1[. For all D ∈ N*, we consider a level-α test Φ_{α,D} : P_{f=0}(Φ_{α,D} = 1) ≤ α. The collection of tests (Φ_{α,D}, D ≥ 1) is rate-optimal for H_0^IP over the sets (H_D, D ≥ 1) if the following property holds:

∃C_{α,β}, ∀D ≥ 1, ∀f ∈ H_D, ‖f‖² ≥ σ² C_{α,β} ( ∑_{j=1}^D b_j^{-4} )^{1/2} ⟹ P_f(Φ_{α,D} = 1) ≥ 1−β. (5)

The collection of tests (Φ_{α,D}, D ≥ 1) is rate-optimal for H_0^DP over the sets (K_D, D ≥ 1) if the following property holds:

∃C'_{α,β}, ∀D ≥ 1, ∀f ∈ H_D, ‖Tf‖² ≥ σ² C'_{α,β} √D ⟹ P_f(Φ_{α,D} = 1) ≥ 1−β. (6)
Following Definition 4.1, a family is rate-optimal if, in some sense, it possesses a similar behaviour for all values of D.
The following theorem provides a comparison for the two signal detection proceduresat hand.
Theorem 4.2 Assume that Y = (Y_j)_{j≥1} obeys Model (3). Let 0 < α < β < 1/2.

Mildly ill-posed inverse problem. Let (Φ_{α,D})_{D≥1} be a collection of level-α tests which is rate-optimal for H_0^DP over the sets (K_D, D ≥ 1); then (Φ_{α,D})_{D≥1} is also rate-optimal for H_0^IP over the sets (H_D, D ≥ 1).

Mildly or severely ill-posed inverse problem. There exists a collection of tests (Φ_{α,D})_{D≥1} which is rate-optimal for H_0^IP over the sets (H_D, D ≥ 1) but not for H_0^DP over the sets (K_D, D ≥ 1).

Severely ill-posed inverse problem. There exists a collection of tests (Φ_{α,D})_{D≥1} which is rate-optimal for H_0^DP over the sets (K_D, D ≥ 1) but not for H_0^IP over the sets (H_D, D ≥ 1).
The previous theorem proves that the situation where the function is defined by a fixed number of coefficients differs from the case where a regularity constraint is added on its coefficients. On the one hand, when considering mildly ill-posed inverse problems, it seems clear that the two problems of testing H_0^IP and H_0^DP are not equivalent when the function f belongs to H_D. Every testing procedure that works in the direct case can be used without additional assumptions in the inverse case; the reverse is not true. On the other hand, the severely ill-posed case is not a straightforward generalization of the results obtained for polynomially decreasing eigenvalues. Indeed, in this particular setting, testing H_0^IP and H_0^DP seem to be two different tasks: a good test for H_0^IP may fail for H_0^DP, and conversely. Hence, without additional assumptions on the function f, different strategies should be used according to the considered assumption.
5 Simulations
Our goal is to illustrate the difference between direct and indirect testing strategies on some simulations. In Figure 1, we focus on mildly ill-posed problems for testing the nullity of two coefficients. We compare and plot the power of an indirect test Φ^(2)_{D,α} and of a direct test Φ^(1)_{D,α} with the same level 5%.
The test Φ^(1)_{D,α} rejects the null hypothesis H_0 : f = 0 if

∑_{j=1}^D Y_j² > v_{D,α},

where v_{D,α} denotes the 1−α quantile of ∑_{j=1}^D Y_j² under H_0. The test Φ^(2)_{D,α} rejects the null hypothesis H_0 : f = 0 if

∑_{j=1}^D b_j^{-2} Y_j² > u_{D,α},

where u_{D,α} denotes the 1−α quantile of ∑_{j=1}^D b_j^{-2} Y_j² under H_0.
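Both tests are easy to implement once the null quantiles v_{D,α} and u_{D,α} are available. The sketch below is a minimal illustration (not the authors' code; function names are ours), with the quantiles obtained by Monte Carlo under H_0:

```python
import numpy as np

def h0_quantiles(b, sigma, alpha, n_mc=20000, seed=0):
    # (1 - alpha)-quantiles of the direct and indirect statistics under H0: f = 0
    eps = np.random.default_rng(seed).standard_normal((n_mc, len(b)))
    direct_stat = np.sum((sigma * eps) ** 2, axis=1)
    indirect_stat = np.sum(b ** -2.0 * (sigma * eps) ** 2, axis=1)
    return np.quantile(direct_stat, 1 - alpha), np.quantile(indirect_stat, 1 - alpha)

def phi_direct(Y, v_alpha):
    return int(np.sum(Y ** 2) > v_alpha)               # Phi^(1): reject if sum Y_j^2 > v

def phi_indirect(Y, b, u_alpha):
    return int(np.sum(b ** -2.0 * Y ** 2) > u_alpha)   # Phi^(2): reject if sum b_j^{-2} Y_j^2 > u

b = np.array([1.0, 0.5])    # b_j = j^{-1}, D = 2
v, u = h0_quantiles(b, sigma=1.0, alpha=0.05)
# one draw from a strong alternative, theta = (10, 10)
Y = b * np.array([10.0, 10.0]) + np.random.default_rng(1).standard_normal(2)
```

With σ = 1, the direct statistic is a χ²_D variable under H_0, so v_{D,α} is close to the χ²_D quantile; the indirect statistic is a weighted χ², which is why a Monte Carlo (or numerical) quantile is convenient.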
We conduct two different simulations. First, for an alternative f ∈ H_2, we estimate the power of the tests Φ^(1)_{2,α} and Φ^(2)_{2,α}, for the Y_j's obeying Model (3), with on the one hand b_j = j^{-1}, and on the other hand (θ_1, θ_2) ∈ (−5, 5)² (and θ_j = 0 ∀j > 2). The simulations are obtained using 10000 replications. In Figure 1, we present the estimated powers of the two tests. As expected, the direct test is powerful outside an ellipsoid while the indirect test behaves well outside a ball. Then, in Figure 2, we provide the difference between the two test powers. The comparison of the two tests shows that specific tests for inverse problems are outperformed by a procedure that directly tests the observations.
In a second simulation, we focus on two generic examples of goodness-of-fit. We consider two different alternatives f_{1,D} and f_{2,D} defined respectively as

⟨f_{1,D}, ψ_j⟩ = C1 D^{1/4} if j = 1, and 0 ∀j > 1;   ⟨f_{2,D}, ψ_j⟩ = C1 D^{1+1/4} if j = D, and 0 ∀j ≠ D, (7)

where C1 = √6. We perform these simulations for a mildly ill-posed problem: we set b_j = j^{-1} for all j ≥ 1.
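The f_{1,D} experiment can be reproduced with a short Monte Carlo sketch (illustrative names and sample sizes, not the authors' code). It estimates the second kind error of both tests of the previous display at the alternative f_{1,D}, whose only non-zero coefficient is θ_1 = C1 D^{1/4}:

```python
import numpy as np

rng = np.random.default_rng(2)

def second_kind_errors(theta, b, alpha=0.05, n_mc=20000, n_rep=2000):
    # Monte Carlo thresholds under H0, then second kind error at the alternative theta
    D = len(b)
    eps0 = rng.standard_normal((n_mc, D))
    v = np.quantile(np.sum(eps0 ** 2, axis=1), 1 - alpha)               # direct threshold
    u = np.quantile(np.sum(b ** -2.0 * eps0 ** 2, axis=1), 1 - alpha)   # indirect threshold
    Y = b * theta + rng.standard_normal((n_rep, D))
    beta_direct = np.mean(np.sum(Y ** 2, axis=1) <= v)
    beta_indirect = np.mean(np.sum(b ** -2.0 * Y ** 2, axis=1) <= u)
    return beta_direct, beta_indirect

D = 30
j = np.arange(1, D + 1).astype(float)
b = 1.0 / j                                  # mildly ill-posed, b_j = j^{-1}
theta1 = np.zeros(D)
theta1[0] = np.sqrt(6.0) * D ** 0.25         # f_{1,D}: single large first coefficient
beta_dir, beta_ind = second_kind_errors(theta1, b)
```

With this configuration the direct test detects f_{1,D} most of the time, while the indirect test is nearly blind to it, in line with the left panel of Figure 3: the signal energy C1²√D is of the right order for the direct rate √D but negligible compared to the indirect rate (∑ b_j^{-4})^{1/2}.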
Remark that for a given D, both functions f_{1,D} and f_{2,D} belong to H_D. We plot, for different values of D = 1, . . . , 50, the second kind error of the direct test Φ^(1)_{α,D} in dashed line and of the indirect test Φ^(2)_{α,D} in solid line, with α = 5%. Results are displayed in Figure 3. For each run, the test is conducted over 10000 replications.

Figure 1: Power of direct and indirect tests

For the function f_{1,D}, the indirect test fails in detecting the non-zero coefficient while the direct test succeeds. We point out that this function corresponds to the counter-example exhibited in the proof of Theorem 4.2.
Remark also that the function f_{2,D} corresponds to the theoretical case for which the direct test does not perform better than the indirect test. This statement also appears clearly on the curves of Figure 4, since the simulated second kind errors are of the same order whatever D is.
6 Concluding Remarks
Our aim was to decide whether specific tests for inverse problems should be performed, or whether direct inference on the observations could be sufficient or even lead to better results. Note that we have only considered testing problems for which the null hypothesis is f = 0. Actually, this is not restrictive, and the same results still hold for any goodness-of-fit problem of the type f = f_0. As shown in this paper, we promote the following testing strategies according to the nature of the inverse problem.
• Mildly ill-posed problems: in this situation, in most cases, tests built directly on the raw observed data, i.e. in the space of the observations, outperform tests that invert the operator. Note that we have not investigated the special situation of ℓ^p bodies tackled in [2] or in [10]. Indeed, in this case, minimax rates of testing are only achieved up to a logarithmic term, and specific tests should be designed to avoid flawed rates of testing.
Figure 2: Comparison of direct and indirect tests
• Severely ill-posed problems: for this kind of inverse problem, tests are more difficult and the situation is reversed. Indeed, in most cases, the operator changes the difficulty of the testing issues, and thus a specific treatment must be applied to the data. Only the specific frame of signal testing where the function has an ellipsoid-type regularity enables one to get rid of the inverse problem setting and to use direct testing procedures on the raw data.
Hence, a large number of goodness-of-fit issues for inverse problems can be more efficiently solved by using tests on the direct observations, with the great advantage that a large variety of tests is available. Moreover, since only the observations are needed, as soon as the kind of inverse problem is known, tests can be designed without knowing the operator. Hence, our results enable one to build tests for inverse problems with unknown operators. Such situations are encountered for instance in [5], where the operators are unknown or partially known. Nevertheless, our testing procedures in this direct case only consider alternatives of the form ‖Tf‖ ≥ ρ. In some cases, deviations from the assumption H_0^IP may be more natural to consider rather than applying the operator T.

To conclude this discussion, we point out that when it is difficult to assess which strategy should be used, we can always combine two level-α/2 tests, respectively minimax for H_0^IP and H_0^DP, to obtain a global minimax test.
Figure 3: Second kind error for direct and indirect approaches. The dashed lines represent the second kind error for the test Φ^(1)_{D,α} for different values of D, while the solid lines are associated to the test Φ^(2)_{D,α}. Results for the alternatives f_{1,D} and f_{2,D} introduced in (7) are respectively displayed on the left-hand and right-hand sides.
7 Proofs
7.1 Technical tools
Let us recall the following lemma which is proved in [10] (see Lemma 2).
Lemma 7.1 Let D ∈ N*, and let

Z_j = λ_j + ω_j ε_j, 1 ≤ j ≤ D,

where ε_1, . . . , ε_D are i.i.d. Gaussian variables with mean 0 and variance 1. We define T = ∑_{j=1}^D Z_j² and

Σ = ∑_{j=1}^D ω_j⁴ + 2 ∑_{j=1}^D ω_j² λ_j².

The following inequalities hold for all x ≥ 0:

P( T − E(T) ≥ 2√(Σx) + 2 sup_{1≤j≤D}(ω_j²) x ) ≤ exp(−x), (8)

P( T − E(T) ≤ −2√(Σx) ) ≤ exp(−x). (9)
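These deviation bounds are easy to sanity-check by simulation. The sketch below (illustrative configuration, not from the paper) estimates both tail probabilities for a small choice of (λ_j, ω_j) and compares them with e^{-x}:

```python
import numpy as np

rng = np.random.default_rng(3)
D, n_mc, x = 5, 100000, 2.0
lam = np.linspace(0.5, 1.5, D)       # lambda_j
omega = np.linspace(0.2, 1.0, D)     # omega_j

Z = lam + omega * rng.standard_normal((n_mc, D))
T = np.sum(Z ** 2, axis=1)
ET = np.sum(lam ** 2 + omega ** 2)   # E(T) = sum lambda_j^2 + omega_j^2
Sigma = np.sum(omega ** 4) + 2 * np.sum(omega ** 2 * lam ** 2)

p_upper = np.mean(T - ET >= 2 * np.sqrt(Sigma * x) + 2 * omega.max() ** 2 * x)
p_lower = np.mean(T - ET <= -2 * np.sqrt(Sigma * x))
# both empirical tail probabilities stay below exp(-x) ~ 0.135
```

In practice the bounds are quite conservative: the empirical probabilities here are an order of magnitude below e^{-x}.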
We can easily deduce from this lemma the following corollary.
Corollary 7.2 Let

Y_j = b_j θ_j + σ ε_j, j ∈ N*.

Let D ≥ 1, β ∈ ]0, 1[, and denote by q_{1−β}(D) the (1−β) quantile of ∑_{j=1}^D b_j^{-2} Y_j², and by q'_{1−β}(D) the (1−β) quantile of ∑_{j=1}^D Y_j². Then, setting x_β = ln(1/β),

q_{1−β}(D) ≤ ∑_{j=1}^D θ_j² + σ² ∑_{j=1}^D b_j^{-2} + 2√(Σ x_β) + 2σ² sup_{j=1...D} b_j^{-2} x_β, (10)

q'_{1−β}(D) ≤ ∑_{j=1}^D b_j² θ_j² + σ² D + 2√(Σ' x_β) + 2σ² x_β, (11)

where

Σ = σ⁴ ∑_{j=1}^D b_j^{-4} + 2σ² ∑_{j=1}^D θ_j²/b_j², and Σ' = σ⁴ D + 2σ² ∑_{j=1}^D b_j² θ_j².
Lemma 7.3 Let (Y_j)_{j≥1} be a sequence obeying Model (3). Let α, β ∈ (0, 1) be fixed, and let E^Y_{c,2}(R) be the ellipsoid introduced in Section 3. Assume that

c1 e^{α1 k^u} ≤ c_k ≤ c2 e^{α2 k^u}, ∀k ∈ N*,

where c1, c2, α1 and α2 denote positive constants independent of k. Then, there exist c3, c4 such that

c3 σ² (log(σ^{-2}))^{1/(2u)} ≤ ρ²(E^Y_{c,2}(R), α, β) ≤ c4 σ² (log(σ^{-2}))^{1/(2u)}.
PROOF. The proof follows the same lines as in [10] (see Corollary 3). Recall from [2] or [10] that the minimax rate on E^Y_{c,2}(R) verifies

sup_{D∈J} ( ρ²_D ∧ R² C c_D^{-2} ) ≤ ρ²(E^Y_{c,2}(R), α, β) ≤ inf_{D∈J} ( C ρ²_D + R² C c_D^{-2} ),

where C is a positive constant depending only on α, β, and ρ²_D = O(σ² √D) as D → +∞.
First, we set

D_0 = ⌈ ( (1/(2α1)) log(σ^{-2}) )^{1/u} ⌉.

Then

ρ₂²(E^Y_{c,2}(R), α, β) ≤ C ρ²_{D_0} + C R² c_{D_0}^{-2}
                        ≤ C σ² (log(σ^{-2}))^{1/(2u)} + C σ²
                        ≤ C σ² (log(σ^{-2}))^{1/(2u)},
where C denotes a constant independent of σ. Concerning the lower bound, we set

D_1 = ⌊ ( (1/(4α2)) log(σ^{-2}) )^{1/u} ⌋.

Then

ρ₂²(E^Y_{c,2}(R), α, β) ≥ ρ²_{D_1} ∧ R² C c_{D_1}^{-2}
                        ≥ C σ² (log(σ^{-2}))^{1/(2u)} ∧ σ = C σ² (log(σ^{-2}))^{1/(2u)},

for some C > 0. This concludes the proof.
7.2 Proof of Theorem 4.2
For the sake of convenience, we set σ = 1 in the sequel, without loss of generality. Let us prove the first point of Theorem 4.2. We consider mildly ill-posed inverse problems. Let (Φ_{α,D}, D ≥ 1) be a collection of level-α tests which is rate-optimal for H_0^DP on K_D. Then, for all D ≥ 1,

P_0(Φ_{α,D} = 1) ≤ α, and inf_{f ∈ H_D : ‖Tf‖² ≥ C'_{α,β} √D} P_f(Φ_{α,D} = 1) ≥ 1−β,

for some constant C'_{α,β} > 0. Remark that (Φ_{α,D}, D ≥ 1) is also a collection of level-α tests for H_0^IP. For mildly ill-posed inverse problems (c j^{-s} ≤ b_j ≤ C j^{-s} for all j ≥ 1), there exists a constant C(s) such that

inf_{j=1..D} b_j² ( ∑_{j=1}^D b_j^{-4} )^{1/2} ≥ C(s) √D.
Hence, for all f ∈ H_D such that ‖f‖² = ∑_{j=1}^D θ_j² ≥ C_{α,β} ( ∑_{j=1}^D b_j^{-4} )^{1/2},

‖Tf‖² = ∑_{j=1}^D b_j² θ_j² ≥ inf_{j=1..D} b_j² ∑_{j=1}^D θ_j² ≥ C(s) × C_{α,β} √D.

Since (6) holds for every constant C_{α,β} such that C(s) × C_{α,β} ≥ C'_{α,β}, (5) holds, which concludes the first part of the proof.
Let us now prove the second point of Theorem 4.2. We consider mildly or severely ill-posed inverse problems, and the following testing procedures: for all D ≥ 1, let

Φ^IP_{α,D} = 1{ ∑_{j=1}^D b_j^{-2} Y_j² ≥ s_{D,α} }, (12)

where

s_{D,α} = ∑_{j=1}^D b_j^{-2} + 2√( ∑_{j=1}^D b_j^{-4} x_α ) + 2 sup_{1≤j≤D} b_j^{-2} x_α, (13)

with x_α = ln(1/α). It is proved in [10] that for all D ≥ 1, Φ^IP_{α,D} is a level-α test which is rate-optimal for H_0^IP on H_D. We want to show that the collection of tests (Φ^IP_{α,D}, D ≥ 1) is not rate-optimal for H_0^DP over the sets (K_D, D ≥ 1), i.e. that for all C > 0, we can find D ≥ 1 and f ∈ H_D such that ‖Tf‖² ≥ C √D σ² and

P_f(Φ^IP_{α,D} = 1) = P_f( ∑_{j=1}^D b_j^{-2} Y_j² ≥ s_{D,α} ) ≤ β. (14)
Let q_{1−β}(D) denote the (1−β) quantile of ∑_{j=1}^D b_j^{-2} Y_j². Inequality (14) holds if s_{D,α} ≥ q_{1−β}(D). Using (10) with σ = 1, the condition s_{D,α} ≥ q_{1−β}(D) is satisfied if

∑_{j=1}^D θ_j² + 2√(Σ x_β) ≤ 2√( ∑_{j=1}^D b_j^{-4} x_α ) + 2 sup_{1≤j≤D} b_j^{-2} (x_α − x_β).

Using the inequality √(a+b) ≤ √a + √b for a, b > 0, the above inequality holds if

∑_{j=1}^D θ_j² + 2√( 2 x_β ∑_{j=1}^D θ_j²/b_j² ) ≤ 2√( ∑_{j=1}^D b_j^{-4} ) (√x_α − √x_β) + 2 sup_{1≤j≤D} b_j^{-2} (x_α − x_β). (15)

One can check that if

∑_{j=1}^D θ_j² ≤ √( ∑_{j=1}^D b_j^{-4} ) (√x_α − √x_β) + sup_{1≤j≤D} b_j^{-2} (x_α − x_β) − x_β, (16)

then (15) holds, and thus P_f(Φ^IP_{α,D} = 1) ≤ β.
For the sake of simplicity, we assume that b_j = j^{-s} for mildly ill-posed inverse problems and b_j = exp(−γj) for severely ill-posed inverse problems. We consider the function f ∈ H_D defined by θ_1² = C√D with C > 0, and θ_j = 0 for all j > 1. Then

‖Tf‖² = ∑_{j=1}^D b_j² θ_j² = ∑_{j=1}^D θ_j² = C√D (up to the constant b_1²).

Moreover, the right-hand term in (16) is bounded from below respectively by

C(α, β, s) D^{1/2+2s} − x_β

in the polynomial case, and by

C(α, β, γ) exp(2Dγ) − x_β

when b_j = exp(−γj). Hence, for D large enough, (16) holds, and (Φ^IP_{α,D}, D ≥ 1) is not rate-optimal for H_0^DP over (K_D, D ≥ 1).
Let us now prove the third point of Theorem 4.2. We consider severely ill-posed inverse problems. We introduce the following testing procedure: for all D ≥ 1, let

Φ^DP_{α,D} = 1{ ∑_{j=1}^D Y_j² ≥ s_{D,α} } with s_{D,α} = D + 2√(D x_α) + 2 x_α.

It is proved in [2] that the collection of level-α tests (Φ^DP_{α,D}, D ≥ 1) is rate-optimal for H_0^DP on (K_D, D ≥ 1). For all C > 0, we want to find D ≥ 1 and Tf ∈ K_D such that ‖f‖² ≥ C ( ∑_{j=1}^D b_j^{-4} )^{1/2} and

P_f(Φ^DP_{α,D} = 1) ≤ β.
Let q'_{1−β}(D) denote the (1−β) quantile of ∑_{j=1}^D Y_j². If s_{D,α} ≥ q'_{1−β}(D), we have P_f(Φ^DP_{α,D} = 1) ≤ β. Using (11) with σ = 1, the condition s_{D,α} ≥ q'_{1−β}(D) is satisfied if

∑_{j=1}^D b_j² θ_j² + 2√(Σ' x_β) ≤ 2√(D x_α) + 2(x_α − x_β). (17)

By similar computations as for the second point, this condition holds if

∑_{j=1}^D b_j² θ_j² + 2√( 2 x_β ∑_{j=1}^D b_j² θ_j² ) ≤ 2√D (√x_α − √x_β) + 2(x_α − x_β).

We consider the function f ∈ H_D defined by θ_D² = C b_D^{-2} with C > 0, and θ_j = 0 for j ≠ D. Then

∑_{j=1}^D b_j² θ_j² = C.

Hence (17) holds for D large enough.
7.3 Proof of Lemma 3.2
Assume that

f ∈ { ν ∈ E^X_{a,2}(R), ‖ν‖² ≥ γ_σ }.

Then, for all m ≥ 1,

‖Tf‖² = ∑_{k≥1} b_k² θ_k² ≥ ∑_{k=1}^m b_k² θ_k² ≥ b_m² ∑_{k=1}^m θ_k² = b_m² ( ‖f‖² − ∑_{k>m} θ_k² ).

Since f ∈ E^X_{a,2}(R),

∑_{k>m} θ_k² ≤ a_m^{-2} ∑_{k>m} a_k² θ_k² ≤ R² a_m^{-2}.

Hence

‖Tf‖² ≥ b_m² ( γ_σ − R² a_m^{-2} ).

We conclude the proof by choosing m = m(σ) such that R² a^{-2}_{m(σ)} ≤ c γ_σ, for some 0 < c < 1 independent of σ.
7.4 Proof of Theorem 3.1
Let $\alpha$ and $\beta$ be fixed. Denote by $\rho^2(E^X_{a,2}(R),\alpha,\beta)$ and $\rho^2(E^Y_{c,2}(R),\alpha,\beta)$ the minimax rates of testing over $E^X_{a,2}(R)$ and $E^Y_{c,2}(R)$ respectively. Let $\Phi_\alpha$ be a level-$\alpha$ test which is minimax for $H_0^{DP}$ on $E^Y_{c,2}(R)$. Then there exists a positive constant $C_1$ such that
$$\sup_{Tf \in E^Y_{c,2}(R),\ \|Tf\|^2 \geq C_1 \rho^2(E^Y_{c,2}(R),\alpha,\beta)} \mathbb{P}_f(\Phi_\alpha = 0) \leq \beta.$$
Let $C_2 > 0$. Thanks to Lemma 3.2 applied with $\gamma_\tau = C_2 \rho^2(E^X_{a,2}(R),\alpha,\beta)$,
$$\big\{f \in E^X_{a,2}(R),\ \|f\|^2 \geq C_2 \rho^2(E^X_{a,2}(R),\alpha,\beta)\big\} \subset \big\{f,\ Tf \in E^Y_{c,2}(R),\ \|Tf\|^2 \geq (1-c)\, C_2\, b^2_{m(\tau)}\, \rho^2(E^X_{a,2}(R),\alpha,\beta)\big\}.$$
Hence, $\Phi_\alpha$ is a level-$\alpha$ test which is minimax for $H_0^{IP}$ as soon as
$$R^2 a_{m(\tau)}^{-2} \leq c\, C_2\, \rho^2(E^X_{a,2}(R),\alpha,\beta),$$
for some $0 < c < 1$, and
$$C_1\, \rho^2(E^Y_{c,2}(R),\alpha,\beta) \leq (1-c)\, C_2\, b^2_{m(\tau)}\, \rho^2(E^X_{a,2}(R),\alpha,\beta).$$
We verify that both inequalities hold in the four different cases given by the two degrees of ill-posedness of the operator and the two different sets of smoothness conditions displayed in Table 1. To this end, we use the minimax rates of testing established in [2] and [10].

In the following, we write $\mu(\tau) \simeq \nu(\tau)$ if there exist two constants $c_1$ and $c_2$ such that for all $\tau > 0$, $c_1 \leq \mu(\tau)/\nu(\tau) \leq c_2$. For all $x \geq 0$, we denote by $\lfloor x \rfloor$ the greatest integer smaller than $x$ and by $\lceil x \rceil$ the smallest integer greater than $x$. Without loss of generality, we assume that $0 < \tau < \tau_1$ for some $0 < \tau_1 < 1$.
1st case: Assume that $(a_k)_{k\geq 1} \sim (k^s)_{k\geq 1}$ for some $s > 0$ and that the problem is mildly ill-posed: $(b_k)_{k\geq 1} \sim (k^{-t})_{k\geq 1}$. In this setting, recall that
$$\rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq \tau^{\frac{4s}{2s+2t+1/2}} \quad \text{and} \quad \rho^2(E^Y_{c,2}(R),\alpha,\beta) \simeq \tau^{\frac{4s+4t}{2s+2t+1/2}}.$$
We define
$$m(\tau) = \Big\lfloor \tau^{-\frac{2}{2s+2t+1/2}} \Big\rfloor.$$
Then
$$R^2 a_{m(\tau)}^{-2} \simeq m(\tau)^{-2s} \simeq \tau^{\frac{4s}{2s+2t+1/2}} \leq c\, C_2\, \rho^2(E^X_{a,2}(R),\alpha,\beta),$$
where $C_2 > 0$ and $0 < c < 1$. Moreover,
$$b^2_{m(\tau)}\, \rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq m(\tau)^{-2t}\, \rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq \tau^{\frac{4t+4s}{2s+2t+1/2}} \simeq \rho^2(E^Y_{c,2}(R),\alpha,\beta).$$
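The exponent bookkeeping in this first case can be checked with exact rational arithmetic; the values of $s$ and $t$ below are arbitrary illustrative choices:

```python
from fractions import Fraction as F

s, t = F(2), F(1)          # illustrative smoothness / ill-posedness degrees
denom = 2 * s + 2 * t + F(1, 2)

# m(tau) ~ tau^{-2/denom}: exponent of tau in each quantity of the 1st case.
exp_m = F(-2) / denom                      # m(tau) ~ tau^{exp_m}
exp_rate_X = F(4) * s / denom              # rho^2(E^X) ~ tau^{4s/denom}
exp_rate_Y = (F(4) * s + 4 * t) / denom    # rho^2(E^Y) ~ tau^{(4s+4t)/denom}

# R^2 a_{m(tau)}^{-2} ~ m(tau)^{-2s} ~ tau^{-2s*exp_m} matches rho^2(E^X):
assert -2 * s * exp_m == exp_rate_X
# b_{m(tau)}^2 rho^2(E^X) ~ m(tau)^{-2t} rho^2(E^X) matches rho^2(E^Y):
assert -2 * t * exp_m + exp_rate_X == exp_rate_Y
print(exp_rate_X, exp_rate_Y)
```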
2nd case: Assume that $(a_k)_{k\geq 1} \sim (\exp(\nu k^s))_{k\geq 1}$ for some $\nu, s > 0$ and that the problem is mildly ill-posed: $(b_k)_{k\geq 1} \sim (k^{-t})_{k\geq 1}$. In this setting, remark that the sequence $(a_k b_k^{-1})_{k\geq 1}$ satisfies the inequality $c_1 e^{\nu k^s} \leq a_k b_k^{-1} \leq c_2 e^{2\nu k^s}$. Hence, using Lemma 7.3,
$$\rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq \tau^2 \big(\log(\tau^{-2})\big)^{\frac{2t+1/2}{s}} \quad \text{and} \quad \rho^2(E^Y_{c,2}(R),\alpha,\beta) \simeq \tau^2 \big(\log(\tau^{-2})\big)^{\frac{1}{2s}}.$$
We define
$$m(\tau) = \Big\lfloor \Big(\frac{1}{2\nu} \ln(\tau^{-2})\Big)^{1/s} \Big\rfloor.$$
Then, for $\tau$ small enough, $m(\tau) \geq 1$ and
$$R^2 a_{m(\tau)}^{-2} \simeq R^2 \tau^2 \leq R^2 \tau^2 \big(\ln(\tau^{-2})\big)^{(2t+1/2)/s} \leq c\, C_2\, \rho^2(E^X_{a,2}(R),\alpha,\beta),$$
where $C_2 > 0$ and $0 < c < 1$. Moreover,
$$b^2_{m(\tau)}\, \rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq \tau^2 \big(\ln \tau^{-2}\big)^{\frac{1}{2s}} \simeq \rho^2(E^Y_{c,2}(R),\alpha,\beta).$$
3rd case: Assume that $(a_k)_{k\geq 1} \sim (k^s)_{k\geq 1}$ for some $s > 0$ and that the problem is severely ill-posed: $(b_k)_{k\geq 1} \sim (e^{-\gamma k})_{k\geq 1}$ for some $\gamma > 0$. In this setting, remark that the sequence $c = (c_k)_{k\geq 1} = (a_k b_k^{-1})_{k\geq 1}$ satisfies the inequality $c_1 e^{\gamma k} \leq c_k \leq c_2 e^{2\gamma k}$. Hence, using Lemma 7.3,
$$\rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq \big(\log(\tau^{-2})\big)^{-2s} \quad \text{and} \quad \rho^2(E^Y_{c,2}(R),\alpha,\beta) \simeq \tau^2 \big(\log(\tau^{-2})\big)^{1/2}.$$
We set
$$m(\tau) = \Big\lfloor \frac{1}{2\gamma} \log\big(\tau^{-2} \log(\tau^{-2})^{-1/2}\big) + \frac{1}{2\gamma} \log\big(\log(\tau^{-2})^{-2s}\big) \Big\rfloor.$$
Then
$$R^2 a_{m(\tau)}^{-2} \leq C R^2 m(\tau)^{-2s} \leq C \big(\log(\tau^{-2})\big)^{-2s} \leq c\, C_2\, \rho^2(E^X_{a,2}(R),\alpha,\beta),$$
where $C_2 > 0$, $0 < c < 1$, and
$$b^2_{m(\tau)}\, \rho^2(E^X_{a,2}(R),\alpha,\beta) \geq C e^{-2\gamma m(\tau)} \big(\log(\tau^{-2})\big)^{-2s} \geq C \tau^2 \big(\log(\tau^{-2})\big)^{1/2} \simeq \rho^2(E^Y_{c,2}(R),\alpha,\beta).$$
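The floor in the definition of $m(\tau)$ only decreases $m(\tau)$, hence increases $e^{-2\gamma m(\tau)}$, which is exactly what makes the last lower bound work. A numerical sketch (illustrative $\gamma$ and $s$; valid once $\tau$ is small enough for the argument of the logarithm to exceed 1):

```python
import math

gamma, s = 1.0, 2.0  # illustrative values
for tau in (1e-4, 1e-6, 1e-9):
    L = math.log(tau ** -2)
    # m(tau) = floor( (1/2g) log(tau^-2 L^{-1/2}) + (1/2g) log(L^{-2s}) )
    m = math.floor((math.log(tau ** -2 * L ** -0.5) + math.log(L ** (-2.0 * s))) / (2.0 * gamma))
    lhs = math.exp(-2.0 * gamma * m) * L ** (-2.0 * s)  # ~ b_m^2 * rho^2(E^X)
    rhs = tau ** 2 * L ** 0.5                           # ~ rho^2(E^Y)
    assert lhs >= rhs                                   # guaranteed by the floor
print("lower bound verified")
```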
4th case: Assume that $(a_k)_{k\geq 1} \sim (\exp(\nu k^s))_{k\geq 1}$ for some $\nu, s > 0$ and that the problem is severely ill-posed: $(b_k)_{k\geq 1} \sim (e^{-\gamma k})_{k\geq 1}$ for some $\gamma > 0$. In this setting, remark that the sequence $c = (c_k)_{k\geq 1} = (a_k b_k^{-1})_{k\geq 1}$ satisfies the inequality $c_1 e^{\gamma k} \leq c_k \leq c_2 e^{2\gamma k}$. Hence, using Lemma 7.3,
$$\rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq e^{-2\nu D^s} \quad \text{and} \quad \rho^2(E^Y_{c,2}(R),\alpha,\beta) \simeq \tau^2 \big(\log(\tau^{-2})\big)^{1/2},$$
where $D$ denotes the integer part of the solution of $2\nu D^s + 2\gamma D = \log(\tau^{-2})$. Then, we set
$$m(\tau) = \Big\lfloor \frac{1}{2\gamma} \log\big(\tau^{-2} \log(\tau^{-2})^{-1/2}\big) - \frac{\nu}{\gamma} D^s \Big\rfloor.$$
Remark that $m(\tau) \leq D$ and that, since $s < 1$, $D^s - m(\tau)^s$ remains bounded as $\tau \to 0$. Therefore
$$R^2 a_{m(\tau)}^{-2} = R^2 e^{-2\nu m(\tau)^s} \leq c\, C_2\, e^{-2\nu D^s},$$
where $C_2 > 0$ and $0 < c < 1$. Then
$$b^2_{m(\tau)}\, \rho^2(E^X_{a,2}(R),\alpha,\beta) \simeq e^{-2\gamma m(\tau) - 2\nu D^s} \geq C \tau^2 \big(\log(\tau^{-2})\big)^{1/2} \simeq \rho^2(E^Y_{c,2}(R),\alpha,\beta).$$
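The quantity $D$ of this fourth case can be computed numerically by bisection, since the left-hand side of $2\nu D^s + 2\gamma D = \log(\tau^{-2})$ is increasing in $D$; the parameter values below are illustrative:

```python
import math

def D_of_tau(tau, nu, s, gamma):
    """Integer part of the solution D of 2*nu*D^s + 2*gamma*D = log(tau^{-2})."""
    target = math.log(tau ** -2)
    g = lambda d: 2.0 * nu * d ** s + 2.0 * gamma * d
    lo, hi = 0.0, target / (2.0 * gamma)  # g(0) = 0 and g(hi) >= target
    for _ in range(100):                  # bisection on the increasing map g
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.floor(lo)

# Illustrative parameters; tau = 1e-4 so that log(tau^{-2}) = log(1e8).
D = D_of_tau(1e-4, nu=1.0, s=0.5, gamma=1.0)
print(D)
```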
This concludes the first part of Theorem 3.1. For the second part, we have to find a level-$\alpha$ test which is minimax for $H_0^{IP}$ on $E^X_{a,2}(R)$ but not for $H_0^{DP}$ on $E^Y_{c,2}(R)$. For the sake of convenience, we assume that the problem is mildly ill-posed; the proof follows essentially the same lines when $(b_k)_{k\geq 1}$ is an exponentially decreasing sequence. Recall that
$$\rho^2(E^X_{a,2}(R),\alpha,\beta) = \sup_{D \in \mathbb{N}^*} \big(\rho^2_D \wedge R^2 a_D^{-2}\big), \quad \text{with } \rho^2_D \simeq \tau^2 D^{2t+1/2},$$
and
$$\rho^2(E^Y_{c,2}(R),\alpha,\beta) = \sup_{D \in \mathbb{N}^*} \big(\tilde\rho^2_D \wedge R^2 b_D^2 a_D^{-2}\big), \quad \text{with } \tilde\rho^2_D \simeq \tau^2 \sqrt{D}.$$
Hence, we are looking for a function $f$ and a level-$\alpha$ test $\Phi_\alpha$, minimax for $H_0^{IP}$ on $E^X_{a,2}(R)$, verifying
$$Tf \in E^Y_{c,2}(R), \quad \|Tf\|^2 \geq C_{\alpha,\beta} \sup_{D \in \mathbb{N}^*} \big(\tilde\rho^2_D \wedge R^2 b_D^2 a_D^{-2}\big) \quad \text{and} \quad \mathbb{P}_f(\Phi_\alpha = 0) > \beta.$$
To this end, introduce
$$D^\star = \inf \big\{D :\ \tilde\rho^2_D \geq R^2 b_D^2 a_D^{-2}\big\}.$$
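In the mildly ill-posed setting ($a_k = k^s$, $b_k = k^{-t}$), $D^\star$ is simply the first index at which the increasing variance term $\tau^2 \sqrt{D}$ overtakes the decreasing term $R^2 b_D^2 a_D^{-2}$; a sketch with illustrative constants:

```python
# Sketch with a_k = k^s, b_k = k^{-t} (mildly ill-posed), illustrative values.
tau, R, s, t = 1e-3, 1.0, 2.0, 1.0

rho2 = lambda D: tau**2 * D**0.5                            # variance term ~ tau^2 sqrt(D)
bias2 = lambda D: R**2 * D ** (-2.0 * t) * D ** (-2.0 * s)  # R^2 b_D^2 a_D^{-2}

D_star = 1
while rho2(D_star) < bias2(D_star):
    D_star += 1
print(D_star)  # first D where the variance term dominates
```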
Clearly,
$$\|Tf\|^2 \geq C_{\alpha,\beta}\, \tilde\rho^2_{D^\star} \ \Rightarrow\ \|Tf\|^2 \geq C_{\alpha,\beta} \sup_{D \in \mathbb{N}^*} \big(\tilde\rho^2_D \wedge R^2 b_D^2 a_D^{-2}\big).$$
Let $f$ be the function defined by
$$\theta_1^2 = C_{\alpha,\beta}\, \tau^2 \sqrt{D^\star} \quad \text{and} \quad \theta_j = 0 \ \ \forall j > 1. \qquad (18)$$
The function $f$ belongs to the space $H_{D^\star}$ and $\|Tf\|^2 \geq C_{\alpha,\beta}\, \tau^2 \sqrt{D^\star}$. Moreover, $Tf \in E^Y_{c,2}(R)$ since
$$\sum_{k=1}^{+\infty} a_k^2 b_k^{-2} \langle Tf, \psi_k\rangle^2 = a_1^2 \theta_1^2 \simeq C_{\alpha,\beta}\, \tau^2 \sqrt{D^\star} \leq R^2,$$
at least for $\tau$ small enough. Moreover, from Theorem 4.2, we deduce that the global test $\Phi_{D^\star,\alpha}$ defined in (12)-(13) is not powerful for $H_0^{DP}$ when the alternative is defined by (18).
References

[1] F. Abramovich and B. W. Silverman. Wavelet decomposition approaches to statistical inverse problems. Biometrika, 85(1):115-129, 1998.

[2] Y. Baraud. Non-asymptotic minimax rates of testing in signal detection. Bernoulli, 8:577-606, 2002.

[3] N. Bissantz, G. Claeskens, H. Holzmann, and A. Munk. Testing for lack of fit in inverse regression, with applications to biophotonic imaging. Preprint, 2008.

[4] C. Butucea, C. Matias, and C. Pouet. Adaptive goodness-of-fit testing from indirect observations. Ann. Inst. Henri Poincaré Probab. Stat., 45(2):352-372, 2009.

[5] L. Cavalier and N. W. Hengartner. Adaptive estimation for inverse problems with noisy operators. Inverse Problems, 21:1345-1361, 2005.

[6] D. L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Appl. Comput. Harmon. Anal., 2(2):101-126, 1995.

[7] H. Holzmann, N. Bissantz, and A. Munk. Density testing in a contaminated sample. J. Multivariate Anal., 98(1):57-75, 2007.

[8] Yu. I. Ingster, T. Sapatinas, and I. A. Suslina. Minimax signal detection in ill-posed inverse problems. Working paper, 2010.

[9] Yu. I. Ingster. Asymptotically minimax hypothesis testing for nonparametric alternatives. I-II-III. Math. Methods Statist., 2:85-114, 171-189, 249-268, 1993.

[10] B. Laurent, J.-M. Loubes, and C. Marteau. Non-asymptotic minimax rates of testing in signal detection with heterogeneous variances. arXiv:0912.2423.

[11] O. V. Lepski and V. G. Spokoiny. Minimax nonparametric hypothesis testing: the case of an inhomogeneous alternative. Bernoulli, 5(2):333-358, 1999.

[12] J.-M. Loubes and C. Ludeña. Model selection for non linear inverse problems. ESAIM PS.

[13] J.-M. Loubes and C. Ludeña. Adaptive complexity regularization for linear inverse problems. Electron. J. Stat., 2:661-677, 2008.