A New Class of Shape Constraints for Probability Density Estimates · contact: [email protected] ICSA 2016 made with ffslides 13/14 A shape constraint based on Q000 •Estimation

A New Class of Shape Constraints forA New Class of Shape Constraints for

Probability Density EstimatesProbability Density Estimates

Mark Wolters

Shanghai Center forMathematical Sciences

Fudan University

ICSA 2016, ShanghaiDecember 20, 2016

1

Motivation (1/3)Motivation (1/3)

contact: [email protected] ICSA 2016 made with ffslides 2/14

The parametric sniff test

Here are 6 density estimates. Which are nonparametric?

−0.5 0.0 0.5 1.0

0.0

0.5

1.0

1.5

2.0

x

estim

ate

−0.5 0.0 0.5 1.00.

01.

02.

0

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

0.2

0.4

0.6

0.8

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

1.0

2.0

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

0.5

1.0

1.5

2.0

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

1.0

2.0

x

estim

ate

2



The parametric sniff test

Here are 6 density estimates. Which are nonparametric?

−0.5 0.0 0.5 1.0

0.0

0.5

1.0

1.5

2.0

x

estim

ate

−0.5 0.0 0.5 1.00.

01.

02.

0

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

0.2

0.4

0.6

0.8

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

1.0

2.0

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

0.5

1.0

1.5

2.0

x

estim

ate

−0.5 0.0 0.5 1.0

0.0

1.0

2.0

x

estim

ate

normal gamma lognormal

KDElog-concave NPMLE

(Cule et al. 2010)

unimodal KDE

(Wolters 2012)

3



A typical example: axon diameter distribution

• Small to moderate n

• Smooth and regular his-tograms

• They tried 16 paramet-ric forms and comparedGOF.

4



5



Claim: There is a market among data analysts for density esti-mators that have data-driven shape, but “look like” parametricdensities.

• This is one motivation for shape restricted estimation

• Methods so far use constraints that are simple (e.g. unimodal-ity) or convenient (e.g. log-concavity).

fullynonparametric

fullyparametric

ECDF,

log-concaveNPMLE

KDE UnimodalKDE

Ourproposal?

6

ProposalProposal


Simple idea: restrict the number ofinflection points

• “Waves” and “kinks” in the PDF:concavity changes

• For increased smoothness, restrictinflections of derivatives of f , too

– equivalently, control number ofzeros of f ′, f ′′, f ′′′, . . .

– After f ′′′, restrictions becomeharder to notice.

• For a two-tailed density, maximalsmoothness achieved when f (r) hasr zero crossings.

(equivalently, when f (r) has r+2 in-flection points)

−4 0 2 4 6 8

0.0

0.2

0.4

x

f(x)

Density Function

−4 0 2 4 6 8

−0.

40.

00.

20.

4

x

f(3) (x

)

Third Derivative

−4 0 2 4 6 8

0.00

0.10

0.20

0.30

x

f(x)

Density Function

−4 0 2 4 6 8

−0.

40.

00.

20.

4

x

f(3) (x

)

Third Derivative

−4 0 2 4 6 8

0.00

0.10

0.20

x

f(x)

Density Function

−4 0 2 4 6 8

−0.

040.

000.

04

x

f(3) (x

)

Third Derivative

7

k-monotonicityk-monotonicity


For decreasing densities, the k-monotonicityconcept has already covered this idea∗.

• Definition:– 1-monotone: f nonnegative,

nonincreasing– 2-monotone: f nonnegative,

nonincreasing, convex

– k-monotone: (−1)jf (j) nonnegative,nonincreasing, convex,j = 0, 1, . . . , k − 2.

• But, this only applies to convex, decreas-ing PDFs

• What about two-tailed densities?

∗e.g., Balabdaoui & Wellner, An. Stat 2007; Chee & Wang CSDA 2014

f

f ′

f ′′

f ′′′

f (4)

8

Extending k-monotone idea to two tailsExtending k-monotone idea to two tails


Integral of a...• ...negative function is decreasing• ...positive function is increasing• ...increasing function is convex• ...decreasing function is concave

9

Toward an Estimator (1/4)Toward an Estimator (1/4)


Method of Hall & Huang (Stat Sinica, 2002)

• Given locations of zeros, can solve by QP• Need to search for zero locations• Inherits KDE boundary issues.

Change KDE weights.

Developed for unimodal-ity.

Can be used for otherderivative constraints.

Can be generalized towork with other estima-tor constructions.

10



11



Quantile function approach

Let any point on the CDF be (x, p),where p = F (x) and x = Q(p).

Claim: constraint Q′′′ > 0 sufficientto ensure a nice looking density.

• Q′′′ > 0 guarantees unimodality(at most one zero of f ′).

• Observation: parametric PDFswith bounded f and convex tailshave this property.

• Constructed examples nothaving this property havewaves/kinks/knees

• Still working on formal results onzeros of f ′′, f ′′′.

F

f

f ′

f ′′

f ′′′

Q

Q′

Q′′

Q′′′

Q(4)

12



A shape constraint based on Q′′′

• Estimation of f via Q introduces complexities.

• But derivatives of F and Q have relationships(note p = F (Q(p)), then take derivatives)

• We findQ′′′ > 0 ⇐⇒ 3 (f ′)2 − ff ′′ > 0

– Constraint applies uniformly (no searching for zerolocations)

– If f is a spline, the constraint is quadratic in parameters∗ Max likelihood is convex programming problem∗ Min discrepancy to ECDF is a QCQP problem

• Same constraint applies to unimodal, increasing, or decreasingdensities

• Implementation of this approach is ongoing.

13

ConclusionsConclusions


Summary

• Goal is nonparametric estimators that pass the parametricsniff test.

• Our ideal for qualitative smoothness: f(r) has r + 2 inflectionpoints, r = 0, 1, 2, 3, . . . (two-tailed case)

• Practical options to get close to this ideal:– A weighted-KDE method– Q′′′-derived constraint looks promising

• Still many challenges:

– Hard to avoid a smoothing parameter– Optimal construction, estimation– Theory about estimator quality– Confidence bounds– Testing validity of the constraint

14

Documents

A New Class of Shape Constraints for Probability Density Estimates · contact: [email protected] ICSA 2016 made with ffslides 13/14 A shape constraint based on Q000 •Estimation