ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION IRENE MUFLIKH NADHIROH DEPARTMENT OF STATISTICS FACULTY OF MATHEMATICS AND NATURAL SCIENCES BOGOR AGRICULTURAL UNIVERSITY 2009


ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION

IRENE MUFLIKH NADHIROH

DEPARTMENT OF STATISTICS
FACULTY OF MATHEMATICS AND NATURAL SCIENCES
BOGOR AGRICULTURAL UNIVERSITY
2009


ABSTRACT

IRENE MUFLIKH NADHIROH. Zero-Inflated Negative Binomial Models in Small Area Estimation. Under the advisory of KHAIRIL ANWAR NOTODIPUTRO and INDAHWATI.

The problem of over-dispersion in Poisson data is usually solved by introducing prior distributions, which lead to negative binomial models. Poisson data sometimes also suffer from excess zeros, a condition in which the data contain more zeros than the distribution predicts. The Zero-Inflated Negative Binomial (ZINB) method can be used to solve such problems. This paper demonstrates the adaptation of ZINB methods in Small Area Estimation with excess-zero data. It is shown that the excess-zero problem substantially influences the Empirical Bayes (EB) estimates and that the adaptation of ZINB methods improves the precision and reliability of the estimates.

Key Words: Small Area Estimation, Zero-Inflation, Poisson-Gamma, Negative Binomial Regression, Empirical Bayes


ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION

BY
IRENE MUFLIKH NADHIROH
G14104031

Final Project Report
As a partial fulfillment of the requirements for a Bachelor Degree in Science
at Department of Statistics
Faculty of Mathematics and Natural Sciences

DEPARTMENT OF STATISTICS
FACULTY OF MATHEMATICS AND NATURAL SCIENCES
BOGOR AGRICULTURAL UNIVERSITY
2009


Title: ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION
Name: Irene Muflikh Nadhiroh
ID No: G14104031

Approved by

Advisor I: Prof. Dr. Ir. Khairil Anwar Notodiputro, NIP 130891386
Advisor II: Ir. Indahwati, M.Si, NIP 131909223

Acknowledged by
Dean of Faculty of Mathematics and Natural Sciences
Bogor Agricultural University

Dr. drh. Hasim, DEA
NIP 131578806

Passed examination date:


BIOGRAPHY

Irene Muflikh Nadhiroh was born in Padang on October 3rd, 1986, as the first daughter of Ir. Irianto Oetomo and Fine Analisa Maharani. She has two siblings.

In 1998 she graduated from SD Dukuh 09 East Jakarta. She then continued her studies at SLTP Negeri 1 Bogor and graduated in 2001. She finished her studies at SMU Labschool Rawamangun Jakarta in 2004 and enrolled in Bogor Agricultural University through USMI. In 2004 she joined the Department of Statistics, Faculty of Mathematics and Natural Sciences.

During her studies she served as a lecturer assistant for the Basic Statistics class in 2006 and the Experimental Design class in 2007. She was also a member of Gamma Sigma Beta (Statistics Students Association, GSB) and headed the science division of GSB in 2006-2007. In February-March 2008 she completed her field practice at PT Field Dimension Indonesia.


ACKNOWLEDGEMENTS

First of all, the author modestly admits that the completion of this paper would not have been possible without invaluable help from many generous and extraordinary people. The author is deeply indebted to them for their help, ideas, criticism and advice during the writing process. However, they should not be held responsible for any mistakes and deficiencies in this paper, which are purely the author's. I would hereby like to express my gratitude to:

1. Allah SWT, to whom all praise and gratitude are due. Alhamdulillah hirabbil alamin. With His blessing I was able to finish this paper. Thanks to Allah for giving me a wonderful life with extraordinary people around me.
2. Prof. Dr. Khairil Anwar Notodiputro and Ir. Indahwati, M.Si, for the early motivation, discussion, advice, support and their great enthusiasm.
3. Mr. Bambang Sumantri, M.Si, as examiner, for the spirit, advice and criticism.
4. My beloved family, for their unlimited love.
5. Mr. Alfian Futuhul Hadi, M.Si, for enlightening discussions when I was in trouble.
6. Mr. Bagus Sartono, M.Si, for running my data at his lab on his wonderful computer. Sorry if it disturbed you.
7. Mr. Anang, M.Si, and Mr. Rahman, M.Si, for sharing their knowledge and technical support.
8. Dr. Ir. Hari Wijayanto, MS, and all lecturers and staff at the Statistics Department, for the knowledge of statistics and the knowledge of life that you shared. It means a lot to me.
9. Rahmatullah Sigit Dodiet Sasongko, S.Si, for the spirit, love, care, time and patience. Keep it real. Still love me forever and ever.
10. Mr. Dionisius Laksmana Bisara Putra, S.Si, for editing my paper, for his criticism, and for useful discussions.
11. Maulana Chistanto, S.Si, and Yhanuar Ismail, S.Si, for being my best brothers.
12. Nikhen Sevrien and the late Dini, for lighting up my days.
13. Rere Yusri Agus Ika Cinong Toki Cheri Fisca Wiwik Neng Mala Lilis Dika Rangga Lele Dodi Kus Inal Bebek Koler and all of Statistics '41.
14. Everyone who helped me in this study and cannot be named personally.

This thesis is not perfect, so I welcome criticism, advice and recommendations from the people who read it. Thank you. God bless you all.

Bogor January 2009

Irene Muflikh Nadhiroh


TABLE OF CONTENTS

INTRODUCTION
  Background
  Objectives

LITERATURE REVIEW
  Direct Estimation
  Small Area Estimation
  Small Area Models
  Empirical Bayes Methods
  Poisson-Gamma Models
  Negative Binomial Regression
  Over-dispersion in Count Data
  Zero-Inflated Models
  Zero-Inflated Negative Binomial
  Jackknife Method of Estimating MSE($\hat\theta_i^{EB}$)

METHODOLOGY
  Data
  Methods

RESULT AND DISCUSSION
  Estimation of Prior Parameter Based on EB Method with Negative Binomial Regression
  Estimation of Prior Parameter Based on EB Method with Zero-Inflated Negative Binomial Regression
  Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generate data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB


INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but such estimators are sometimes difficult to use in a smaller region, especially when the sample size is too small. In this case indirect estimation, which adds covariates to estimate the parameter, is usually used. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research on Small Area Estimation based on Poisson-Gamma models. Maximum Likelihood Estimation was used with Negative Binomial Regression techniques to estimate the respective prior parameters. Negative Binomial Regression was also used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess-zero is a condition in which the data contain more zeros than the assumed distribution predicts. For example, among 100 observations from a Poisson model with a response mean of 4 we would expect about 2 zeros. If the data have 30 zeros, the distributional assumptions have clearly been violated, and the estimated parameters and standard errors will be biased (Hardin & Hilbe 2007). In this paper Zero-Inflated models were adapted to solve this type of problem.
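The figure of about 2 expected zeros follows from the Poisson probability of a zero, $P(Y = 0) = e^{-\mu}$. A minimal Python sketch of that arithmetic (the function name is illustrative only and is not part of the thesis):

```python
import math

def expected_poisson_zeros(n_obs, mean):
    """Expected number of zeros among n_obs Poisson draws with the given mean."""
    # P(Y = 0) = exp(-mean) for a Poisson distribution
    return n_obs * math.exp(-mean)

# 100 observations with response mean 4, as in the example above
expected = expected_poisson_zeros(100, 4.0)
print(round(expected, 2))
```

Observing 30 zeros where fewer than 2 are expected is the kind of discrepancy that signals excess zeros.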

Objectives

The research objectives are:

1. To investigate the performance of Negative Binomial Regression in Small Area Estimation in the case of excess zeros.
2. To apply Zero-Inflated Count Models in Small Area Estimation in the case of excess zeros.
3. To evaluate the performance of Zero-Inflated Count Models in estimating prior parameters for Small Area Estimation.

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design, with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have been used to obtain reliable estimates of parameters. Ramsini et al. (2001) mentioned that direct estimates for small areas are unbiased, although they have large variance because of the small sample size.

Small Area Estimation

The term small area can refer to anything, depending on the object of interest. It can be a city, age group, sex group, region or rural district. In general, small area denotes any domain for which direct estimates of adequate precision cannot be produced (Rao 2003). This happens because the sample size in the small area is too small for direct estimation based on the sampling design to reach adequate precision. Small area estimation was therefore developed as a statistical technique for estimating the parameters of small areas with an adequate level of precision. It works as an indirect estimation that borrows strength from the values of the variable of interest in related areas, through the use of supplementary information related to that variable, such as recent census counts and current administrative records (Rao 2003). Indirect estimation is a process of estimating a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator works by including data from other domains (Kurnia & Notodiputro 2006).

Small Area Models

There are two types of link models in indirect estimation. The first is the traditional approach based on implicit models that provide a link to related small areas through supplementary data. The second consists of explicit small area models that make specific allowance for between-area variation (Rao 2003). This research used the second type, which can be classified into two broad types of basic model.

1. Basic area level (type A) model

Basic area level models, or aggregate models, include all models that relate small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to area-specific auxiliary data (covariates) $x_i = (x_{1i}, \ldots, x_{pi})^T$ by a linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is a $p \times 1$ vector of regression coefficients, and the $b_i$ are known positive constants. To make inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors are $e_i \sim N(0, \sigma_{ei}^2)$ and the $\sigma_{ei}^2$ are known. Combining both models gives

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2. Basic unit level (type B) model

Unit level models include all models that relate the unit values of the study variable to unit-specific auxiliary variables. With unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$, a nested regression model is assumed:

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$.
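To make the basic area level model concrete, the following Python sketch draws $\theta_i$ and $y_i$ from the combined model $y_i = x_i^T \beta + b_i v_i + e_i$. All numeric values (covariates, coefficients, variances, seed) are hypothetical choices for illustration and do not come from the thesis.

```python
import random

def simulate_area_level(x, beta, b, sigma_v, sigma_e, seed=0):
    """Draw (theta_i, y_i) for each area from the basic area level model."""
    rng = random.Random(seed)
    theta, y = [], []
    for x_i, b_i, s_e in zip(x, b, sigma_e):
        linear = sum(xk * bk for xk, bk in zip(x_i, beta))  # x_i^T beta
        v_i = rng.gauss(0.0, sigma_v)   # area-specific random effect
        e_i = rng.gauss(0.0, s_e)       # sampling error with known std. dev.
        theta_i = linear + b_i * v_i    # linking model
        theta.append(theta_i)
        y.append(theta_i + e_i)         # sampling model
    return theta, y

# Hypothetical setup: 5 areas, an intercept and one covariate
x = [[1.0, 0.1 * i] for i in range(5)]
theta, y = simulate_area_level(x, beta=[2.0, 0.5], b=[1.0] * 5,
                               sigma_v=0.3, sigma_e=[0.2] * 5)
print(list(zip(theta, y)))
```

Setting both standard deviations to zero reduces $y_i$ to the regression line $x_i^T \beta$, which is a quick way to check the two model layers.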

Empirical Bayes Methods

The Bayesian approach is based on Bayes' law, formulated by Thomas Bayes. The law was presented by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781 Laplace gave the details and relevance for modern Bayesian statistics (Gill 2002, in Kismiantini 2007).

Novick, in Good (1980), mentioned that the Bayes method is difficult to adopt and is sometimes very sensitive because it requires prior probability information, which is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate its parameters from the sample. Rao (2003) stated that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:

1. Obtain the posterior probability density function of the small area parameter.
2. Estimate the model parameters from the marginal density function.
3. Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma Models

The Poisson model is the standard model for count data. Count data often suffer from over-dispersion, so extensions of the Poisson model have been developed to accommodate the extra variance in the sample data. Two-stage models for count data, known as Poisson-Gamma mixed models, have been introduced. Wakefield (2006) introduced a Poisson-Gamma model that is easy to use with the SMR (Standardized Mortality Ratio) as a direct estimator. This study used the Wakefield model with an alteration in the direct estimator.

Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij}$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

First stage: $y_i \mid \theta_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at area level, $x_i$ is a vector of covariates and $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, 1/\alpha)$ (shape $\alpha$, scale $1/\alpha$) is assumed as the prior distribution, with mean 1 and variance $1/\alpha$. The marginal distribution of $y_i \mid \beta, \alpha$ is then negative binomial.

Moreover, Wakefield (2006) used Bayes' Theorem and obtained the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}\left(y_i + \alpha, \frac{1}{\mu_i + \alpha}\right)$$

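and the EB estimator as

$$\hat\theta_i^{EB} = \hat\theta_i^{B}(\hat\beta, \hat\alpha) = \hat\gamma_i \hat\theta_i + (1 - \hat\gamma_i)$$

with $\hat\gamma_i = \hat\mu_i / (\hat\alpha + \hat\mu_i)$, where $\hat\theta_i = y_i / \hat\mu_i$ is the direct estimate of $\theta_i$ and $y_i$ is the number of observations.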
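The EB estimator is a shrinkage estimator: it pulls the direct estimate toward the prior mean of 1, with weight $\hat\gamma_i$ on the data. The Python sketch below assumes the gamma prior with mean 1 and variance $1/\alpha$ described above, so that the posterior mean is $(y_i + \alpha)/(\mu_i + \alpha)$. The function names and input values are hypothetical and only illustrate the algebra.

```python
def eb_estimate(y, mu_hat, alpha_hat):
    """Shrinkage form: gamma_i * direct estimate + (1 - gamma_i) * prior mean (1)."""
    gamma_hat = mu_hat / (alpha_hat + mu_hat)   # weight on the direct estimate
    direct = y / mu_hat                         # direct estimate of theta_i
    return gamma_hat * direct + (1.0 - gamma_hat)

def posterior_mean(y, mu_hat, alpha_hat):
    """Mean of the gamma posterior with shape y + alpha and rate mu + alpha."""
    return (y + alpha_hat) / (mu_hat + alpha_hat)

# Hypothetical inputs: observed count 3, fitted mean 2, prior parameter 1
print(eb_estimate(3, 2.0, 1.0), posterior_mean(3, 2.0, 1.0))
```

Both functions return the same value, which shows that the shrinkage form is simply the posterior mean rewritten. A large $\hat\alpha$ (small prior variance) moves the estimate toward 1, while a large $\hat\mu_i$ moves it toward the direct estimate.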

Negative Binomial Regression

The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance and moments. Later, Breslow (1990) cited Lawless' work, and from its inception to the late 1980s the negative binomial regression model has been construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial distribution function is written as

$$g(y \mid x) = \frac{\Gamma(y + 1/k)}{\Gamma(y + 1)\,\Gamma(1/k)} \cdot \frac{(k\mu)^{y}}{(1 + k\mu)^{y + 1/k}}$$

where $y = 0, 1, 2, \ldots$; $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\mathrm{var}(y) = \mu + k\mu^2$. $k$ is called the dispersion parameter, and $k > 0$ indicates that the data are over-dispersed.
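The mean and variance stated for the negative binomial can be checked numerically. The sketch below evaluates the probability function on the log scale with `math.lgamma` and sums over a long range of counts. The parameter values $\mu = 2$ and $k = 0.5$ are arbitrary illustrations.

```python
import math

def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k (variance mu + k*mu^2)."""
    log_p = (math.lgamma(y + 1.0 / k) - math.lgamma(y + 1.0) - math.lgamma(1.0 / k)
             + y * math.log(k * mu) - (y + 1.0 / k) * math.log(1.0 + k * mu))
    return math.exp(log_p)

mu, k = 2.0, 0.5
probs = [nb_pmf(y, mu, k) for y in range(400)]   # the tail beyond 400 is negligible here
total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
print(total, mean, var)
```

With these values the variance $\mu + k\mu^2 = 4$ is twice the mean, which is the over-dispersion that a Poisson model with the same mean cannot represent.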

Over-dispersion in Count Data

Count data modeled by Poisson regression are over-dispersed if the variance is larger than the mean, that is, if the observed variance is larger than the variance expected under the model. This phenomenon is written as

$$\mathrm{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).

Zero-Inflated Models

Zero-Inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is given by (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + [1 - B(0)]\Pr(0)$$

The probability of a nonzero count $k$ is

$$\Pr(y = k;\ k > 0) = [1 - B(0)]\Pr(k)$$

This model produces two groups of parameters. The first is the zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The second is the negative binomial parameters, which model the response with the covariates.

Zero-Inflated Negative Binomial

There are many kinds of zero-inflated models. Each has advantages and disadvantages and suits a different type of data. The Zero-Inflated Negative Binomial is one of them. This model is used for data that are both over-dispersed and zero-inflated. As a result, the parameter estimates include a parameter $k$ that indicates over-dispersion in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is as follows:

$$P(Y_i = y_i \mid x_i, z_i) = \begin{cases} \varphi_i + (1 - \varphi_i)\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi_i)\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi_i = \varphi(z_i^T \gamma)$ is a function of $z_i^T \gamma$, $z_i$ is a vector of zero-inflation covariates and $\gamma$ is a vector of zero-inflation coefficients to be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + 1/k)}{\Gamma(y_i + 1)\,\Gamma(1/k)} \cdot \frac{(k\mu_i)^{y_i}}{(1 + k\mu_i)^{y_i + 1/k}}$$

The mean and variance of ZINB are

$$E(y_i \mid x_i, z_i) = \mu_i (1 - \varphi_i)$$

$$V(y_i \mid x_i, z_i) = \mu_i (1 - \varphi_i)\left(1 + \mu_i(\varphi_i + k)\right)$$
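The ZINB mean and variance formulas can be verified numerically from the probability distribution. The Python sketch below uses a fixed zero-inflation probability `phi` instead of a model in the covariates $z_i$, which is a simplifying assumption for illustration. The values $\mu = 2$, $k = 0.5$ and $\varphi = 0.3$ are arbitrary.

```python
import math

def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k."""
    log_p = (math.lgamma(y + 1.0 / k) - math.lgamma(y + 1.0) - math.lgamma(1.0 / k)
             + y * math.log(k * mu) - (y + 1.0 / k) * math.log(1.0 + k * mu))
    return math.exp(log_p)

def zinb_pmf(y, mu, k, phi):
    """ZINB probability: extra mass phi at zero, NB counts otherwise."""
    base = (1.0 - phi) * nb_pmf(y, mu, k)
    return phi + base if y == 0 else base

mu, k, phi = 2.0, 0.5, 0.3
probs = [zinb_pmf(y, mu, k, phi) for y in range(400)]
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))

# Closed forms from the text
mean_formula = mu * (1.0 - phi)
var_formula = mu * (1.0 - phi) * (1.0 + mu * (phi + k))
print(probs[0], mean, mean_formula, var, var_formula)
```

The probability of a zero rises from the negative binomial value $g(0) = 0.25$ to $0.3 + 0.7 \times 0.25 = 0.475$, while the numerical mean and variance agree with the closed forms.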

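Jackknife Method of Estimating MSE($\hat\theta_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method was described by Tukey (1958) and developed into a method capable of correcting the bias of an estimator by removing observation $l$, for $l = 1, \ldots, m$, and re-estimating the parameters.

According to Rao (2003), the jackknife steps to estimate MSE($\hat\theta_i^{EB}$) are:

1. Let $\hat\theta_i^{EB} = k_i(y_i, \hat\beta, \hat\alpha)$ and $\hat\theta_{i,-l}^{EB} = k_i(y_i, \hat\beta_{-l}, \hat\alpha_{-l})$, then calculate

$$\hat M_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat\theta_{i,-l}^{EB} - \hat\theta_i^{EB}\right)^2$$

2. Calculate the delete-$l$ estimators $\hat\beta_{-l}$ and $\hat\alpha_{-l}$, then calculate

$$\hat M_{1i} = g_{1i}(\hat\beta, \hat\alpha, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[g_{1i}(\hat\beta_{-l}, \hat\alpha_{-l}, y_i) - g_{1i}(\hat\beta, \hat\alpha, y_i)\right]$$

Here $g_{1i}(\hat\beta, \hat\alpha, y_i)$ is the variance estimator of the posterior distribution, which measures the variability associated with $\theta_i$. Using $g_{1i}(\hat\beta, \hat\alpha, y_i)$ alone leads to severe underestimation of $MSE_J(\hat\theta_i^{EB})$, because it ignores the estimation of the prior parameters. The estimator $\hat M_{1i}$ therefore corrects the bias of $g_{1i}(\hat\beta, \hat\alpha, y_i)$.

3. Calculate the jackknife estimator of MSE($\hat\theta_i^{EB}$) as

$$MSE_J(\hat\theta_i^{EB}) = \hat M_{1i} + \hat M_{2i}$$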
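The three jackknife steps can be sketched in Python as below. This is not the thesis's SAS implementation: to stay self-contained it assumes an intercept-only model ($\mu_i$ equal to the overall mean) and a method-of-moments estimate of $\alpha$ in place of negative binomial or ZINB regression, and it assumes a gamma posterior with shape $y_i + \alpha$ and rate $\mu_i + \alpha$. The function names and the example counts are hypothetical.

```python
from statistics import mean, pvariance

def fit_prior(ys):
    """Toy prior fit: mu = sample mean, alpha from var = mu + mu^2 / alpha."""
    mu = mean(ys)
    excess = pvariance(ys) - mu
    alpha = mu * mu / excess if excess > 1e-8 else 1e6  # no over-dispersion: tight prior
    return mu, alpha

def eb_estimate(y, mu, alpha):
    return (y + alpha) / (mu + alpha)           # posterior mean of theta_i

def posterior_variance(y, mu, alpha):
    return (y + alpha) / (mu + alpha) ** 2      # g_1i: posterior variance of theta_i

def jackknife_mse(ys):
    ys = [float(v) for v in ys]
    m = len(ys)
    mu, alpha = fit_prior(ys)
    # Delete-l fits: re-estimate the prior parameters without area l
    delete_fits = [fit_prior(ys[:l] + ys[l + 1:]) for l in range(m)]
    results = []
    for y in ys:
        theta_eb = eb_estimate(y, mu, alpha)
        g1 = posterior_variance(y, mu, alpha)
        m1 = g1 - (m - 1) / m * sum(
            posterior_variance(y, mu_l, a_l) - g1 for mu_l, a_l in delete_fits)
        m2 = (m - 1) / m * sum(
            (eb_estimate(y, mu_l, a_l) - theta_eb) ** 2 for mu_l, a_l in delete_fits)
        results.append((theta_eb, m1, m2, m1 + m2))
    return results

counts = [0, 2, 5, 1, 0, 7, 3, 4, 0, 9]        # hypothetical area counts
for theta_eb, m1, m2, mse in jackknife_mse(counts):
    print(round(theta_eb, 3), round(mse, 4))
```

$\hat M_{2i}$ is a sum of squares and cannot be negative, but $\hat M_{1i}$ subtracts a bias correction from $g_{1i}$ and is not guaranteed to stay positive, so the sum can in principle fall below zero when the delete-$l$ fits are unstable.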

METHODOLOGY

Data

This research assumed that the available auxiliary data are at area level, so the basic area level model was used. The data were simulated with 30 small areas and one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. This research assumed that the relation between the response and the covariate was linear.

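Methods

The following steps were used to generate the data with SAS 9.1:

1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $\text{Lambda}_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim \text{Gamma}(1, 1)$.
4. Calculate $\log(\text{Lambda}_i \theta_i)$.
5. Fit a linear regression of $\log(\text{Lambda}_i \theta_i)$ on $X_i$ to obtain $\beta_0$ and $\beta_1$.
6. Calculate $\mu_i = \exp(X_i^T \beta)$.
7. Calculate $\text{parmlambda}_i = \mu_i \theta_i$.
8. Generate $y_i \sim \text{Poisson}(\text{parmlambda}_i)$.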
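The eight generation steps can be sketched in Python using only the standard library. This is an illustrative translation, not the SAS 9.1 program used in the thesis. The covariate values $X_i = i/30$, the seed, and the use of a single probability of zero for all areas are assumptions made here, and the Poisson draw uses Knuth's multiplication method because the standard library has no Poisson generator.

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's method: count uniform multiplications until the product drops below exp(-lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def generate_batch(x, p_zero, seed=0):
    rng = random.Random(seed)
    m = len(x)
    lam = [-math.log(p_zero)] * m                            # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in range(m)]   # step 3
    eta = [math.log(l * t) for l, t in zip(lam, theta)]      # step 4
    # Step 5: ordinary least squares of eta on x
    x_bar, eta_bar = sum(x) / m, sum(eta) / m
    b1 = (sum((xi - x_bar) * (ei - eta_bar) for xi, ei in zip(x, eta))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = eta_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]                # step 6
    parm = [mi * ti for mi, ti in zip(mu, theta)]            # step 7
    y = [poisson_draw(rng, p) for p in parm]                 # step 8
    return y, mu, theta

x = [i / 30 for i in range(1, 31)]      # step 1: fixed covariate for 30 areas (hypothetical)
y, mu, theta = generate_batch(x, p_zero=0.6, seed=1)
print(sum(1 for v in y if v == 0), "zeros out of", len(y))
```

Because $\theta_i$ multiplies the fitted mean again in step 7, the realized share of zeros in a batch can differ from the nominal probability of zero.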

In analyzing the data, the following steps were applied:

1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the Zero-Inflated Negative Binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters, which are $\beta$ and $\alpha$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Root Relative Mean Square Error):

$$RRMSE(\hat\theta_i) = \frac{\sqrt{MSE(\hat\theta_i)}}{\hat\theta_i}$$
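A small Python sketch of the RRMSE calculation follows. The handling of a negative MSE estimate is an assumption made for illustration: by default the function returns NaN because the square root is undefined, and the optional `truncate_negative` flag replaces a negative MSE with zero, which corresponds to the "MSE (II)" convention used in the results.

```python
import math

def rrmse(mse, estimate, truncate_negative=False):
    """Root relative MSE: sqrt(MSE) divided by the estimate itself."""
    if mse < 0:
        if not truncate_negative:
            return float("nan")   # square root undefined for a negative MSE estimate
        mse = 0.0                 # "MSE (II)" convention: negative MSE set to zero
    return math.sqrt(mse) / estimate

print(rrmse(0.25, 2.0))                           # 0.5 / 2.0
print(rrmse(-1.0, 2.0, truncate_negative=True))   # negative MSE truncated to zero
print(rrmse(0.25, -2.0))                          # a negative estimate gives a negative RRMSE
```

Because the denominator is the estimate, a negative EB estimate produces a negative RRMSE even when the MSE is positive.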

RESULT AND DISCUSSION

Estimation of Prior Parameter Based on EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced small and consistent MSE. However, when the proportion of zeros is approximately 30% or more, with an expected probability of zero of 0.6, the estimates tend to be unreliable, and the EB estimation produced negative values.

The RRMSE of the estimator increases along with the number of zeros in the data. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -128.75       0.33            -0.38           0.81
0.7                   2536.71       0.40            -12.16          1.35
0.8                   -5844.95      0.30            309.46          2.11
0.9                   391356.06     0.16            1.16E+10        6.64

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   261.97        0.33            -0.35           0.75
0.7                   9500.07       0.40            -10.02          0.99
0.8                   14442.50      0.30            220.54          1.10
0.9                   415952.85     0.16            6.77E+09        0.56


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE obtained after the negative values of MSE have been changed to zero.

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the data have an expected probability of zero of 0.8 to 0.9. However, when the data have an expected probability of zero of 0.6 to 0.7, the mean of RRMSE (II) is negative because of the negative values of the EB estimates.

Estimation of Prior Parameter Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although ZINB is slightly outperformed by NBR when the data contain only a small number of zeros. In particular, as shown in Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a bigger MSE for the EB estimator than NBR does.

If the data have an expected probability of zero of 0.6 to 0.7, however, ZINB gives better estimates. The estimates are also unbiased, as they cover the parameter values adequately. ZINB begins to produce inconsistent estimates when the data have an expected probability of zero of 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is bigger than the mean of MSE with ZINB. This is because once the negative values of MSE are changed to zero, they no longer act as a reduction factor in the calculation of the mean.

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a bigger MSE than NBR does as long as the expected probability of zero in the data does not exceed the 0.6 threshold.

The ZINB method performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case the EB estimates given by the NBR method are unstable and inconsistent, because of the negative values of the estimates and a huge MSE that can be thousands of times larger than an acceptable value. On the other hand, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimates with NBR.

Both methods perform poorly if the data have an expected probability of zero of 0.8 or more. The EB estimators from both methods are inconsistent, as a result of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            7322807         0.66
0.6                   0.61          0.38            29817           1.03
0.7                   0.58          0.25            218119          1.94
0.8                   -1.28         -1.4E-07        162697          3.75
0.9                   2954790       -1E-06          3.5E+278        609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            6561235         0.58
0.6                   0.75          0.38            23406           0.70
0.7                   1.50          0.25            134655          0.69
0.8                   1.75          0               733506          0
0.9                   2954908       0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method for prior estimation, such as negative binomial regression, produced biased and unreliable EB estimators for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator even when the data had an expected probability of zero of 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. This means that the performance of ZINB declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research is based on many assumptions and is subject to several limitations. If the assumptions and limitations can be relaxed, better results can be expected. Some recommendations for further research are:

1. The generating process in this research does not reflect a real sampling process. A generating process closer to real sampling might give better results, because it would be closer to real applications.
2. It would be interesting to run experiments with a larger number of areas, since the number of areas influences the data modeling.
3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR, in order to solve the problem of negative MSE values.
4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, Paper 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0)  Actual  r  |  min  Q1  median  mean  Q3  max

10  10-33.33  100
EB estimator  0426011  1525665  3188832  4252666  5752756  205939
bias  0000446  05164  0878579  1315093  1721091  8704671
MSE  0040547  0109118  0159448  0333613  0335256  4167064
RRMSE  0041258  0100045  01356  018188  0220426  0576793

20  13.33-36.67  100
EB estimator  0342831  1013993  2218265  2984668  3953417  1815693
bias  0000587  0413611  079407  1100373  1454889  7906915
MSE  0055631  0131969  0196963  0353033  0386291  3778251
RRMSE  0070449  015421  0205182  0262006  0352726  0788718

30  20-53.33  100
EB estimator  0323311  0836545  1562163  2263684  2918741  1214482
bias  0000151  0372382  067041  0916482  122012  5950225
MSE  0074364  0163462  0231014  0400207  0432371  5250254
RRMSE  0102324  0214697  0299247  0361013  0474077  1192032

40  23.33-56.67  100
EB estimator  024882  064963  1219656  17107  2248716  930007
bias  0000564  0293602  0549809  0757937  1007851  486688
MSE  -100569  0194196  0271669  041875  045917  3239598
RRMSE  0123605  0300339  0422426  0503566  0642418  2202294

50  23.33-63.33  100
EB estimator  0122548  0570083  1028619  1291758  1728067  6750472
bias  000029  0250747  0453265  0622838  0803185  4009352
MSE  -237643  0235733  0306641  0452955  05091  3652167
RRMSE  0038956  0412708  0588924  0717336  0844735  3240156

60  30-70  100
EB estimator  -077338  044443  0699758  0944038  1131071  6323352
bias  0000452  020433  0398131  0534095  0679938  3848209
MSE  -749011  0254097  0330078  -12875  0539873  2354887
RRMSE  -663045  051763  0813734  -038057  1287528  1767434

70  43.33-73.33  100
EB estimator  -33274  0249515  0442513  0659375  0922519  9258959
bias  0000375  0155154  0316124  0476883  0588926  8475103
MSE  -7513075  0235378  0402092  2536714  0876569  6051162
RRMSE  -10741  0704796  1355566  -121606  3040291  3332419

80  53.33-90  100
EB estimator  -232889  017621  0305365  0569959  0576346  6303601
bias  0000395  0116669  0254473  1091172  0497898  6297454
MSE  -6E+09  -016583  0301527  -584495  5718409  185E+09
RRMSE  -212936  0927338  2115163  3094627  1359703  4151289

90  70-100  100
EB estimator  -108767  0111208  0230315  0212247  0353129  3625557
bias  000016  0086  0177169  0425532  0314714  1092655
MSE  -38E+09  -130817  0159682  39135606  3074073  12E+11
RRMSE  -909131  1647188  6639631  116E+10  1585472  706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual  r  |  min  Q1  median  mean  Q3  max

10  10-33.33  100
EB estimator  0267752  1500256  3195861  4280907  5833922  2220705
bias  0000603  0485515  0882468  1315721  1750173  8704672
MSE  -053954  010933  0168797  0449506  0369775  360843
RRMSE  0022947  0096443  0136424  0238099  0241955  5518468

20  13.33-36.67  100
EB estimator  105E-08  0914898  221594  3017228  401361  1815694
bias  0000368  0383426  0780984  1105029  1496623  7906918
MSE  -07309  0126202  0201463  0425844  0414597  1734815
RRMSE  0021807  0144983  0210692  0326097  0401786  3177943

30  20-53.33  100
EB estimator  0132041  0719086  1523909  2308745  3012309  1228058
bias  0000508  0332427  0680187  0928947  1254604  6314973
MSE  -229891  0156942  0277017  0707983  0590466  7469014
RRMSE  0023998  0210095  0317195  0519524  0618802  3500387

40  23.33-56.67  100
EB estimator  105E-08  0574265  1209034  1742928  2368713  104953
bias  0000564  0268248  0544049  0771741  1067061  4889872
MSE  -125713  0181557  0284338  0540615  0498521  423089
RRMSE  0054916  028362  0420396  0630776  0778033  5394515

50  23.33-63.33  100
EB estimator  105E-08  0426701  1033816  133848  1906961  8018962
bias  0000453  0224726  0454522  0661709  0900005  4414442
MSE  -181856  0194818  0334706  0859252  0711939  7997074
RRMSE  0026206  0387294  0662251  7322807  1312302  13388294

60  30-70  100
EB estimator  105E-08  030085  0645848  0985327  1154975  728326
bias  62E-05  0190886  0406245  0567657  074167  3923952
MSE  -34589  0078006  0376514  060793  0804116  3426488
RRMSE  000461  0502807  1033578  2981671  2012552  3308816

70  43.33-73.33  100
EB estimator  105E-08  105E-08  0341315  0677841  1  5005491
bias  979E-05  0128017  0358257  0487174  0654423  3733981
MSE  -142213  -001433  0255331  0584152  1132152  264456
RRMSE  0064209  0847956  1942286  2181192  4589042  7899681

80  53.33-90  100
EB estimator  105E-08  105E-08  0142906  0445315  0859305  5
bias  0000161  0083397  0272773  0392826  0557213  3532556
MSE  -10651  -56E-05  -14E-07  -127819  1452962  1132741
RRMSE  0063244  1475413  3754705  162697  9221163  3786684

90  70-100  100
EB estimator  1E-277  105E-08  105E-08  0225165  0135512  3
bias  0000495  0054221  0153374  027819  0350213  2736904
MSE  -175652  -33E-05  -1E-06  2954790  152E-06  613E+08
RRMSE  0040681  4059441  6095076  35E+278  5569021  16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0)  Actual  r  |  min  Q1  median  mean  Q3  max

10  10-33.33  100
EB estimator  0426011  1525665  3188832  4252666  5752756  205939
bias  0000446  05164  0878579  1315093  1721091  8704671
MSE  0040547  0109118  0159448  0333613  0335256  4167064
RRMSE  0041258  0100045  01356  018188  0220426  0576793

20  13.33-36.67  100
EB estimator  0342831  1013993  2218265  2984668  3953417  1815693
bias  0000587  0413611  079407  1100373  1454889  7906915
MSE  0055631  0131969  0196963  0353033  0386291  3778251
RRMSE  0070449  015421  0205182  0262006  0352726  0788718

30  20-53.33  100
EB estimator  0323311  0836545  1562163  2263684  2918741  1214482
bias  0000151  0372382  067041  0916482  122012  5950225
MSE  0074364  0163462  0231014  0400207  0432371  5250254
RRMSE  0102324  0214697  0299247  0361013  0474077  1192032

40  23.33-56.67  100
EB estimator  024882  064963  1219656  17107  2248716  930007
bias  0000564  0293602  0549809  0757937  1007851  486688
MSE  0  0194196  0271669  0419181  045917  3239598
RRMSE  0  0300116  0422209  0502895  0641904  2202294

50  23.33-63.33  100
EB estimator  0122548  0570083  1028619  1291758  1728067  6750472
bias  000029  0250747  0453265  0622838  0803185  4009352
MSE  0  0235733  0306641  0456258  05091  3652167
RRMSE  0  0410357  0585765  0712314  0841838  3240156

60  30-70  100
EB estimator  -077338  044443  0699758  0944038  1131071  6323352
bias  0000452  020433  0398131  0534095  0679938  3848209
MSE  0  0254097  0330078  2619677  0539873  2354887
RRMSE  -663045  0448118  0750369  -034911  1209918  1767434

70  43.33-73.33  100
EB estimator  -33274  0249515  0442513  0659375  0922519  9258959
bias  0000375  0155154  0316124  0476883  0588926  8475103
MSE  0  0235378  0402092  9500073  0876569  6051162
RRMSE  -10741  0288999  0995659  -100163  2527784  3332419

80  53.33-90  100
EB estimator  -232889  017621  0305365  0569959  0576346  6303601
bias  0000395  0116669  0254473  1091172  0497898  6297454
MSE  0  0  0301527  1444250  5718409  185E+09
RRMSE  -212936  0  1104113  2205437  5656681  4151289

90  70-100  100
EB estimator  -108767  0111208  0230315  0212247  0353129  3625557
bias  000016  0086  0177169  0425532  0314714  1092655
MSE  0  0  0159682  41595285  3074073  12E+11
RRMSE  -909131  0  0557622  677E+09  9311925  706E+11

10

Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0)  Actual       r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333      100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                          bias          0000603   0485515   0882468   1315721   1750173   8704672
                          MSE           0         010933    0168797   0450626   0369775   360843
                          RRMSE         0         0095932   0135647   023675    0239669   5518468

20      1333-3667    100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                          bias          0000368   0383426   0780984   1105029   1496623   7906918
                          MSE           0         0126202   0201463   0428006   0414597   1734815
                          RRMSE         0         0142648   020709    0320663   0395479   3177943

30      20-5333      100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                          bias          0000508   0332427   0680187   0928947   1254604   6314973
                          MSE           0         0156942   0277017   0716543   0590466   7469014
                          RRMSE         0         0203913   0311937   0506882   0615401   3500387

40      2333-5667    100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                          bias          0000564   0268248   0544049   0771741   1067061   4889872
                          MSE           0         0181557   0284338   0549835   0498521   423089
                          RRMSE         0         0270309   0405926   0606317   0766631   5394515

50      2333-6333    100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                          bias          0000453   0224726   0454522   0661709   0900005   4414442
                          MSE           0         0194818   0334706   094973    0711939   7997074
                          RRMSE         0         0316402   0576343   6561235   1240175   13388294

60      30-70        100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                          bias          62E-05    0190886   0406245   0567657   074167    3923952
                          MSE           0         0078006   0376514   0749436   0804116   3426488
                          RRMSE         0         0258286   0698814   2340612   1714808   3308816

70      4333-7333    100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                          bias          979E-05   0128017   0358257   0487174   0654423   3733981
                          MSE           0         0         0255331   1501268   1132152   264456
                          RRMSE         0         0         0688797   1346552   2500825   7899681

80      5333-90      100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                          bias          0000161   0083397   0272773   0392826   0557213   3532556
                          MSE           0         0         0         1755486   1452962   1132741
                          RRMSE         0         0         0         7335062   3311711   3786684

90      70-100       100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                          bias          0000495   0054221   0153374   027819    0350213   2736904
                          MSE           0         0         0         2954908   152E-06   613E+08
                          RRMSE         0         0         0         12E+278   416189    16E+281

Appendix 5 Syntax program for generate data

data b; /* generate x1 (covariate) and ei */
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

data e; /* generate poisson-gamma with excess zero */
  do kk=1 to 30;
    set b;
    tetha = rangam(1,1);
    lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
    starlambda = log(lambda/tetha);
    output;
  end;
run;

proc reg;
  model starlambda = x1;
  ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;

data _null_;
  set work.betha_lr_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
run;

data d;
  do kk=1 to 30;
    set e;
    mu = exp(&Intercept + &x1*x1);
    parmlambda = mu*tetha;
    ypoi = rand('poisson', parmlambda);
    output;
  end;
run;

ods trace on;
/* to take percent zero on data */
proc freq data=d;
  tables ypoi;
  ods output OneWayFreqs=work.zero;
run;
data zero;
  set zero;
  keep percent;
run;
proc transpose data=zero out=zero1;
run;
data _null_;
  set work.zero1;
  call symput('pctz', col1);
run;
data d;
  set d;
  pzero=&pctz;
  r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data;

Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
  set work.betha_nb_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
  bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
  set work.duga_nb1;
  if ^(u=&x) then delete;
run;
data work.jacknb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
  /* output p out=sas.yi_est; */
  model ypoi = x1 / dist = nb link=log;
  ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
run;
proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;
data _null_;
  set work.betha_est_nbt&h;
  call symput('Intercept_', col1);
  call symput('x1_', col2);
  call symput('Dispersion_', col3);
run;

data work.duganb&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
  w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
  teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
  beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
  merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5 work.duganb6
        work.duganb7 work.duganb8 work.duganb9 work.duganb10 work.duganb11 work.duganb12
        work.duganb13 work.duganb14 work.duganb15 work.duganb16 work.duganb17
        work.duganb18 work.duganb19 work.duganb20 work.duganb21 work.duganb22
        work.duganb23 work.duganb24 work.duganb25 work.duganb26 work.duganb27
        work.duganb28 work.duganb29 work.duganb30;
  by kk;
run;

data work.mse_nb_j;
  set work.mse_nb_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
  ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
  merge work.d work.mse_nb_j;
  keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
run;

%END;
%mend;

%sae_nb;

Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

proc countreg data=a;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
  set work.pe_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.duga;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
  bias_b=abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
  set work.duga1;
  if ^(u=&x) then delete;
run;
data work.jackzinb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
  set work.betha_est_ZINBt&h;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept + &x1*x1);
  /* mu_hat_b_&h= &b_o- + &b_1- x1 */
  w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
  teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  /* g1_&h =((mu_hat_b_&h 2 &alpha_)2)(&alpha_+y_i)((mu_hat_b_&h 2 &alpha_)+mu_hat_b_&h)2 */
  g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
  /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
  beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
  merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5 work.dugaZINB6
        work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10 work.dugaZINB11 work.dugaZINB12
        work.dugaZINB13 work.dugaZINB14 work.dugaZINB15 work.dugaZINB16 work.dugaZINB17
        work.dugaZINB18 work.dugaZINB19 work.dugaZINB20 work.dugaZINB21 work.dugaZINB22
        work.dugaZINB23 work.dugaZINB24 work.dugaZINB25 work.dugaZINB26 work.dugaZINB27
        work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
  by kk;
run;

data work.mse_ZINB_j;
  set work.mse_ZINB_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
  merge work.d work.mse_ZINB_j;
  keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;
ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB BASE=work.hasilZINB1;
run;

%END;
%mend;

%sae_zinb;


Title: ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION

Name: Irene Muflikh Nadhiroh
ID No.: G14104031

Approved by

Advisor I: Prof. Dr. Ir. Khairil Anwar Notodiputro, NIP 130891386
Advisor II: Ir. Indahwati, M.Si., NIP 131909223

Acknowledged by
Dean of Faculty of Mathematics and Natural Sciences
Bogor Agricultural University

Dr. Drh. Hasim, DEA
NIP 131578806

Passed examination date:

BIOGRAPHY

Irene Muflikh Nadhiroh was born in Padang on October 3rd, 1986, as the first daughter of Ir. Irianto Oetomo and Fine Analisa Maharani. She has two siblings.

In 1998 she graduated from SD Dukuh 09, East Jakarta, then continued her studies at SLTP Negeri 1 Bogor and graduated in 2001. She finished her studies at SMU Labschool Rawamangun, Jakarta, in 2004 and then enrolled in Bogor Agricultural University through USMI. In 2004 she joined the Department of Statistics, Faculty of Mathematics and Natural Sciences.

During her studies she served as a lecturer's assistant for the Basic Statistics class and the Experimental Design class in 2006 and 2007, respectively. She was also a member of Gamma Sigma Beta (Statistics Students Association, GSB) and served as head of the science division of GSB in 2006-2007. In February-March 2008 she completed her field practice at PT Field Dimension Indonesia.

ACKNOWLEDGEMENTS

First of all, the author modestly admits that the completion of this paper would not have been possible without invaluable help from many generous and extraordinary people. The author is deeply indebted to them for their help, ideas, criticism and advice during the writing process. However, they should not be held responsible for any mistakes and deficiencies in this paper, which are purely the author's. I would therefore like to express my gratitude to:

1. Allah SWT, to whom all praise and gratitude belong. Alhamdulillah hirabbil alamin. With His blessing I was able to finish this paper. Thanks to Allah for giving me a wonderful life with extraordinary people around me.
2. Prof. Dr. Khairil Anwar Notodiputro and Ir. Indahwati, M.Si., for the early motivation, discussion, advice, support and their great enthusiasm.
3. Mr. Bambang Sumantri, M.Si., as examiner: thanks for the spirit, advice and criticism.
4. My beloved family, for the unlimited love ever after.
5. Mr. Alfian Futuhul Hadi, M.Si., for enlightening discussion when I was in trouble.
6. Mr. Bagus Sartono, M.Si.: thank you very much for running my data at your lab on your wonderful computer. Sorry if it disturbed you.
7. Mr. Anang, M.Si., and Mr. Rahman, M.Si., for sharing their knowledge and technical support.
8. Dr. Ir. Hari Wijayanto, M.S., and all lecturers and staff of the Statistics Department: thanks for the knowledge of statistics and of life that you shared. It means a lot to me.
9. Rahmatullah Sigit Dodiet Sasongko, S.Si., for the spirit, love, care, time and patience. Keep it real. Still love me forever and ever.
10. Mr. Dionisius Laksmana Bisara Putra, S.Si., for editing my paper, for his criticism and for useful discussion.
11. Maulana Chistanto, S.Si., and Yhanuar Ismail, S.Si.: thanks for being my best brothers.
12. Nikhen Sevrien and the late Dini: thanks for lighting up my days.
13. Rere Yusri Agus Ika Cinong Toki Cheri Fisca Wiwik Neng Mala Lilis Dika Rangga Lele Dodi Kus Inal Bebek Koler and all of Statistics '41.
14. Everyone who helped me in this study and cannot be named personally.

This thesis is not perfect, so I welcome criticism, advice and recommendations from those who read it. Thank you. God bless you all.

Bogor, January 2009

Irene Muflikh Nadhiroh

TABLE OF CONTENTS

INTRODUCTION
  Background
  Objectives
LITERATURE REVIEW
  Direct Estimation
  Small Area Estimation
  Small Area Models
  Empirical Bayes Methods
  Poisson-Gamma Models
  Negative Binomial Regression
  Over-disperse at Count Data
  Zero-Inflated Models
  Zero-Inflated Negative Binomial
  Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)
METHODOLOGY
  Data
  Methods
RESULT AND DISCUSSION
  Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
  Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
  Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB
CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generate data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB

INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but such estimators are sometimes difficult to use in a smaller region, especially when the sample size is too small. In this case indirect estimation, which uses covariates to estimate the parameter, is usually applied. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research on Small Area Estimation based on Poisson-Gamma models. Maximum likelihood estimation with Negative Binomial Regression was used to estimate the prior parameters. Negative Binomial Regression was also used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zeros is a condition in which the data contain more zeros than the assumed distribution predicts. For example, among 100 observations from a Poisson model with a response mean of 4 we would expect about 2 zeros. If the data have 30 zeros, it should be obvious that the distributional assumptions have been violated, and the estimated parameters and standard errors will therefore be biased (Hardin & Hilbe 2007). In this paper Zero-Inflated models were adapted to solve this type of problem.
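The figure of about 2 zeros can be checked directly, since a Poisson count with mean 4 is zero with probability exp(-4). The following is a minimal Python sketch; the function name `expected_zeros` is illustrative and does not come from the thesis.

```python
import math


def expected_zeros(n_obs: int, mean: float) -> float:
    """Expected number of zeros among n_obs independent Poisson(mean) counts."""
    return n_obs * math.exp(-mean)


if __name__ == "__main__":
    # 100 observations with mean 4: exp(-4) is about 0.0183, so about 1.83 zeros.
    print(round(expected_zeros(100, 4.0), 2))
```

Observing 30 zeros where fewer than 2 are expected is the kind of gap that motivates the zero-inflated models discussed later.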

Objectives

The research objectives are:

1 To investigate the performance of Negative Binomial Regression on Small Area Estimation in case of excess-zero

2 To apply Zero-Inflated Count Models on Small Area Estimation in case of excess-zero

3 To evaluate the performance of Zero-Inflated Count Models in estimating prior parameter for Small Area Estimation

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have long been used to obtain reliable estimates of parameters. Ramsini et al (2001) mentioned that direct estimates for small areas are unbiased, although they have large variance because of the small sample size.

Small Area Estimation

The term small area can refer to anything, depending on the object of interest. It can be a city, an age group, a sex group, a region or a rural district. In general, small area denotes any domain for which direct estimates of adequate precision cannot be produced (Rao 2003). This happens because the sample size in the small area is too small, so direct estimation based on the sampling design cannot reach adequate precision. Small area estimation was therefore developed as a statistical technique for estimating the parameters of small areas with an adequate level of precision. It works as indirect estimation that borrows strength from the values of the variable of interest in related areas, through the use of supplementary information related to that variable, such as recent census counts and current administrative records (Rao 2003). Indirect estimation is the process of estimating a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator also uses the other domains' data (Kurnia & Notodiputro 2006).

Small Area Models

There are two types of link model in indirect estimation. The first is the traditional approach based on implicit models, which link related small areas through supplementary data. The second is explicit small area models, which make specific allowance for between-area variation (Rao 2003). This research used the second type, which can be classified into two broad types of basic model:

1. Basic area level (type A) model

Basic area level, or aggregate, models relate the small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter $\theta_i$ is assumed to be related to the area-specific auxiliary data (covariates) $x_i = (x_{i1}, \ldots, x_{ip})^T$ by a linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is the $p \times 1$ vector of regression coefficients and the $b_i$ are known positive constants. To make inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors are $e_i \sim N(0, \sigma_{ei}^2)$ and the $\sigma_{ei}^2$ are known. Combining the two models gives the new model

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2. Basic unit level (type B) model

Unit level models relate the unit values of the study variable to unit-specific auxiliary variables. Given unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$, a nested regression model is assumed:

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_{ei}^2)$.

Empirical Bayes Methods

The Bayesian approach is based on Bayes' law, which was discovered by Thomas Bayes. The law was presented by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781 Laplace gave the details and their relevance for modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick in Good (1980) mentioned that the Bayes method is difficult to adopt and is sometimes very sensitive, because it requires prior probability information that is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate its parameters from the sample. Rao (2003) said that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:
1. Obtain the posterior probability density function of the small area parameter.
2. Estimate the model parameters from the marginal density function.
3. Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma Models

The Poisson model is the standard model for count data. Count data commonly suffer from over-dispersion, so the Poisson formulation has been extended to accommodate the extra variance in sample data. A two-stage model for count data, known as the Poisson-Gamma mixed model, has been introduced for this purpose. Wakefield (2006) introduced a Poisson-Gamma model that is easier to use, with the SMR (Standardized Mortality Ratio) as the direct estimator. This study used the Wakefield model with a different direct estimator.

Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij},$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

In the first stage, $y_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes an area level regression model, $x_i$ is a vector of covariates and $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector of regression coefficients. In the second stage, $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, \alpha)$ (shape $\alpha$, rate $\alpha$) is assumed as the prior distribution, with mean 1 and variance $1/\alpha$. The marginal distribution of $y_i \mid \beta, \alpha$ is then negative binomial.

Wakefield (2006) used Bayes' theorem to obtain the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}(y_i + \alpha, \mu_i + \alpha)$$

and the EB estimator, which is the estimated posterior mean of the Poisson mean $\mu_i \theta_i$,

$$\hat{\theta}_i^{EB} = \hat{\theta}_i^{B}(\hat{\beta}, \hat{\alpha}) = \hat{\gamma}_i \hat{\theta}_i + (1 - \hat{\gamma}_i) \hat{\mu}_i,$$

with $\hat{\gamma}_i = \hat{\mu}_i / (\hat{\alpha} + \hat{\mu}_i)$, where $\hat{\theta}_i = y_i$ is the direct estimate for area $i$ and $y_i$ is the observed count.
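The EB estimator is a weighted average of the observed count and the regression prediction, with the weight on the observed count taken as mu_hat / (alpha_hat + mu_hat), as in the data step of Appendix 6. The sketch below restates that calculation in Python for a single area; the function name and the example numbers are hypothetical, and `g1` is the posterior variance term as computed in the appendix listing.

```python
def eb_poisson_gamma(y, mu_hat, alpha_hat):
    """Return (weight, EB estimate, posterior variance g1) for one small area.

    y         : observed count in the area
    mu_hat    : fitted regression mean for the area
    alpha_hat : estimated prior (gamma) parameter
    """
    weight = mu_hat / (alpha_hat + mu_hat)            # weight on the direct estimate
    theta_eb = weight * y + (1.0 - weight) * mu_hat   # shrinkage toward mu_hat
    g1 = (alpha_hat + y) / (mu_hat + alpha_hat) ** 2  # as in Appendix 6
    return weight, theta_eb, g1


if __name__ == "__main__":
    # Hypothetical area: observed count 3, fitted mean 2, prior parameter 2.
    print(eb_poisson_gamma(3, 2.0, 2.0))  # (0.5, 2.5, 0.3125)
```

A large `alpha_hat` (little between-area variation) pulls the estimate toward the regression prediction, while a small one leaves it close to the observed count.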

Negative Binomial Regression

The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance and moments. Later, Breslow (1990) cited Lawless's work, and from its inception until the late 1980s the negative binomial regression model was construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial probability function is written as

$$g(y \mid x) = \frac{\Gamma(y + k)}{\Gamma(k)\,\Gamma(y + 1)} \left(\frac{k}{k + \mu}\right)^{k} \left(\frac{\mu}{k + \mu}\right)^{y},$$

where $y = 0, 1, 2, \ldots$, and $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\mathrm{var}(y) = \mu + \mu^2 / k$. The parameter $k$ is referred to as the dispersion parameter and shows that the data are over-dispersed.
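As a numerical check of the mean and variance, the sketch below evaluates a negative binomial probability function in Python and sums it over a long range of counts. It assumes the parameterisation with mean mu and variance mu + mu^2 / k; the function names are illustrative and not part of the thesis.

```python
import math


def nb_pmf(y: int, mu: float, k: float) -> float:
    """Negative binomial probability of count y (mean mu, variance mu + mu**2 / k)."""
    log_p = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu))
        + y * math.log(mu / (k + mu))
    )
    return math.exp(log_p)


def nb_mean_var(mu: float, k: float, y_max: int = 500):
    """Approximate the mean and variance by summing the pmf up to y_max."""
    probs = [nb_pmf(y, mu, k) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    # With mu = 2 and k = 2 the variance should be 2 + 4 / 2 = 4.
    print(nb_mean_var(2.0, 2.0))
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions for large counts.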

Over-disperse at Count Data

Count data modelled by Poisson regression are over-dispersed if the variance is larger than the mean, that is, if the observed variance is larger than the Poisson model predicts. This phenomenon is written as

$$\mathrm{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).
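A quick informal diagnostic for this condition is the ratio of the sample variance to the sample mean, which should be close to 1 for Poisson data. The Python sketch below is only an illustration with made-up counts; it is not a formal test and is not part of the thesis analysis.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    return variance(counts) / mean(counts)


if __name__ == "__main__":
    # Made-up counts with many zeros and a few large values.
    counts = [0, 0, 0, 1, 0, 7, 0, 3, 0, 9]
    print(dispersion_ratio(counts))
```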

Zero-Inflated Models

Zero-inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + (1 - B(0)) \Pr(0).$$

The probability of a nonzero count is

$$\Pr(y = k;\ k > 0) = [1 - B(0)] \Pr(k).$$

This model produces two groups of parameters. The first is the zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The second is the negative binomial parameters, which model the response with the covariates.
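As an illustrative calculation (the numbers are assumed, not taken from the simulation), suppose the always-zero probability is $B(0) = 0.3$ and the counting process is Poisson with mean 4, so that $\Pr(0) = e^{-4} \approx 0.0183$. Then

```latex
\Pr(y = 0) = B(0) + (1 - B(0))\Pr(0) \approx 0.3 + 0.7 \times 0.0183 \approx 0.313
```

so roughly 31 zeros would be expected in 100 observations, compared with about 2 under the Poisson model alone.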

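Zero-Inflated Negative Binomial

There are many kinds of zero-inflated model. Each has advantages and disadvantages and suits a different type of data. The zero-inflated negative binomial model is one of them. It is used for data that are both over-dispersed and have excess zeros. As a result, the parameter estimates include a parameter $k$ which indicates that over-dispersion occurs in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is

$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi(x_i) + (1 - \varphi(x_i))\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi(x_i))\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi(x_i)$ is a function of $z_i = x_i^T \gamma$, $x_i$ is the vector of zero-inflation covariates and $\gamma$ is the vector of zero-inflation coefficients to be estimated. Meanwhile $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + k)}{\Gamma(k)\,\Gamma(y_i + 1)} \left(\frac{k}{k + \mu_i}\right)^{k} \left(\frac{\mu_i}{k + \mu_i}\right)^{y_i}.$$

With $\varphi_i = \varphi(x_i)$, the mean and variance of ZINB are

$$E(y_i \mid x_i) = \mu_i (1 - \varphi_i),$$

$$V(y_i \mid x_i) = \mu_i (1 - \varphi_i) \left(1 + \mu_i (\varphi_i + 1/k)\right).$$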
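The zero-inflated negative binomial mean and variance can be checked numerically from the mixture definition. The Python sketch below assumes a negative binomial component with mean mu and variance mu + mu^2 / k and a constant zero-inflation probability phi; all names and numbers are illustrative.

```python
import math


def nb_pmf(y: int, mu: float, k: float) -> float:
    """Negative binomial probability of count y (mean mu, variance mu + mu**2 / k)."""
    log_p = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu))
        + y * math.log(mu / (k + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y: int, mu: float, k: float, phi: float) -> float:
    """ZINB probability: extra mass phi at zero, the rest from the negative binomial."""
    nb_part = (1.0 - phi) * nb_pmf(y, mu, k)
    return phi + nb_part if y == 0 else nb_part


def zinb_mean_var(mu: float, k: float, phi: float, y_max: int = 500):
    """Approximate the ZINB mean and variance by summing the pmf up to y_max."""
    probs = [zinb_pmf(y, mu, k, phi) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    mu, k, phi = 2.0, 2.0, 0.3
    print(zinb_mean_var(mu, k, phi))
    # Closed forms for comparison: mu(1 - phi) and mu(1 - phi)(1 + mu(phi + 1/k)).
    print(mu * (1 - phi), mu * (1 - phi) * (1 + mu * (phi + 1 / k)))
```

In a fitted model phi would vary by area through the zero-inflation covariates; a constant is used here only to keep the check short.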

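Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method has been known since Tukey (1958) and has been developed into a method that corrects the bias of an estimator by removing observation $l$, for $l = 1, \ldots, m$, and re-estimating the parameters.

According to Rao (2003), the jackknife steps for estimating MSE($\hat{\theta}_i^{EB}$) are:

1. Let $\hat{\theta}_i^{EB} = k_i(y_i, \hat{\beta}, \hat{\alpha})$ and $\hat{\theta}_{i,-l}^{EB} = k_i(y_i, \hat{\beta}_{-l}, \hat{\alpha}_{-l})$, then calculate

$$\hat{M}_{2i} = \frac{m - 1}{m} \sum_{l=1}^{m} \left(\hat{\theta}_{i,-l}^{EB} - \hat{\theta}_i^{EB}\right)^2.$$

2. Calculate the delete-$l$ estimators $\hat{\beta}_{-l}$ and $\hat{\alpha}_{-l}$, then calculate

$$\hat{M}_{1i} = g_{1i}(\hat{\beta}, \hat{\alpha}, y_i) - \frac{m - 1}{m} \sum_{l=1}^{m} \left[ g_{1i}(\hat{\beta}_{-l}, \hat{\alpha}_{-l}, y_i) - g_{1i}(\hat{\beta}, \hat{\alpha}, y_i) \right].$$

Here $g_{1i}(\hat{\beta}, \hat{\alpha}, y_i)$ is the estimated variance of the posterior distribution, which measures the variability associated with $\hat{\theta}_i$. Using $g_{1i}$ alone leads to severe underestimation of $MSE_J(\hat{\theta}_i^{EB})$ because it ignores the estimation of the prior parameters. The estimator $\hat{M}_{1i}$ therefore corrects the bias of $g_{1i}$.

3. Calculate the jackknife estimator of MSE($\hat{\theta}_i^{EB}$) as

$$MSE_J(\hat{\theta}_i^{EB}) = \hat{M}_{1i} + \hat{M}_{2i}.$$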
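The three jackknife steps can be sketched in Python as follows. This is a simplified illustration, not the thesis implementation: the prior parameters are re-estimated with a toy moment estimator (`fit_moments`, a hypothetical stand-in for the negative binomial or ZINB regression fit), there are no covariates, and the EB estimate and `g1` follow the formulas of the Appendix 6 data step.

```python
def eb_estimate(y, params):
    """EB estimate for one area given params = (mu, alpha)."""
    mu, alpha = params
    w = mu / (alpha + mu)
    return w * y + (1.0 - w) * mu


def g1(y, params):
    """Posterior variance term for one area, as in the Appendix 6 data step."""
    mu, alpha = params
    return (alpha + y) / (mu + alpha) ** 2


def fit_moments(ys):
    """Toy stand-in for the regression fit: common mean and a moment-based alpha."""
    m = len(ys)
    mu = sum(ys) / m
    s2 = sum((y - mu) ** 2 for y in ys) / (m - 1)
    # Marginal variance is mu + mu**2 / alpha; use a large alpha if not over-dispersed.
    alpha = mu * mu / (s2 - mu) if s2 > mu else 1e6
    return mu, alpha


def jackknife_mse(ys, fit=fit_moments):
    """Jackknife MSE of the EB estimate for every area: M1 + M2."""
    m = len(ys)
    full = fit(ys)
    theta_full = [eb_estimate(y, full) for y in ys]
    g1_full = [g1(y, full) for y in ys]
    sq_sum = [0.0] * m  # sum over l of (theta_{i,-l} - theta_i)^2
    g_sum = [0.0] * m   # sum over l of g1_{i,-l} - g1_i
    for l in range(m):
        params_l = fit(ys[:l] + ys[l + 1:])  # delete area l and refit
        for i, y in enumerate(ys):
            sq_sum[i] += (eb_estimate(y, params_l) - theta_full[i]) ** 2
            g_sum[i] += g1(y, params_l) - g1_full[i]
    c = (m - 1) / m
    return [g1_full[i] - c * g_sum[i] + c * sq_sum[i] for i in range(m)]


if __name__ == "__main__":
    counts = [0, 0, 1, 0, 3, 7, 0, 2, 0, 5]  # made-up area counts
    print(jackknife_mse(counts))
```

Because the bias-corrected first term subtracts a sum of differences, nothing in the formula forces it to be positive, so the total can turn negative when the delete-one fits are unstable.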

METHODOLOGY

Data

This research assumed that the available auxiliary data are at area level, so the basic area level model was used. The data were simulated for 30 small areas with one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. The relationship between the response and the covariate was assumed to be linear.

Methods

The data were generated in SAS 9.1 with the following steps:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $Lambda_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim \text{Gamma}(1, 1)$.
4. Calculate $\lambda_i^* = \log(Lambda_i / \theta_i)$.
5. Fit a linear regression of $\lambda_i^*$ on $X_i$ to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.
6. Calculate $\mu_i = \exp(X_i^T \hat{\beta})$.
7. Calculate $parmlambda_i = \mu_i \theta_i$.
8. Generate $y_i \sim \text{Poisson}(parmlambda_i)$.
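The eight generation steps can be sketched in Python with the standard library only. This is an illustration rather than a port of the SAS program in Appendix 5: the covariate values are hypothetical (the original values are not recoverable from the listing), the Poisson draws use Knuth's multiplication method, and the function names are invented for this example.

```python
import math
import random


def ols(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def generate_batch(x, p_zero, rng):
    """One simulated batch: returns the counts y and the true Poisson means."""
    big_lambda = -math.log(p_zero)                     # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]    # step 3
    star = [math.log(big_lambda / t) for t in theta]   # step 4
    b0, b1 = ols(x, star)                              # step 5
    mu = [math.exp(b0 + b1 * xi) for xi in x]          # step 6
    parm = [m * t for m, t in zip(mu, theta)]          # step 7
    y = [poisson_draw(lam, rng) for lam in parm]       # step 8
    return y, parm


if __name__ == "__main__":
    rng = random.Random(2009)
    x = [0.5 + 0.1 * i for i in range(30)]  # hypothetical covariate for 30 areas
    y, parm = generate_batch(x, 0.6, rng)
    print(sum(1 for v in y if v == 0) / len(y))  # realised share of zeros
```

The realised share of zeros in a batch varies around the target probability, which is consistent with the ranges of actual zero percentages reported in the appendix tables.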

To analyse the data, the following steps were applied:
1. Fit the negative binomial regression with the genmod procedure in SAS 9.1 and the zero-inflated negative binomial regression with the countreg procedure in SAS 9.2.
2. Estimate the prior parameters, $\beta$ and $\alpha$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Root Relative Mean Square Error):

$$RRMSE(\hat{\theta}_i) = \frac{\sqrt{MSE(\hat{\theta}_i)}}{\hat{\theta}_i}$$

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced small and consistent MSE. However, when the share of zeros is approximately 30% or more, with an expected probability of zero of 0.6, the estimates tend to be unreliable, and EB estimation produced negative values.

The RRMSE of the estimator increases as the number of zeros in the data increases. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
01                    033           016             018             013
02                    035           020             026             020
03                    040           023             036             030
04                    042           027             050             042
05                    045           031             072             059
06                    -12875        033             -038            081
07                    253671        040             -1216           135
08                    -584495       030             30946           211
09                    39135606      016             116E+10         664

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
01                    033           016             018             013
02                    035           020             026             020
03                    040           023             036             030
04                    042           027             050             042
05                    046           031             071             058
06                    26197         033             -035            075
07                    950007        040             -1002           099
08                    1444250       030             22054           110
09                    41595285      016             677E+09         056

Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative values of MSE have been changed to zero.
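The replacement rule and the RRMSE calculation (the square root of the MSE divided by the estimate) can be written as a small helper. This Python sketch is illustrative only; the function name and the `truncate_negative` flag are assumptions, and returning NaN for an undefined square root is a choice made for this example.

```python
import math


def rrmse(mse, estimate, truncate_negative=False):
    """Square root of MSE divided by the estimate.

    With truncate_negative=True a negative MSE is first replaced by zero,
    which corresponds to the "(II)" columns; otherwise a negative MSE
    gives NaN because its square root is undefined.
    """
    if truncate_negative:
        mse = max(mse, 0.0)
    if mse < 0.0 or estimate == 0.0:
        return float("nan")
    return math.sqrt(mse) / estimate


if __name__ == "__main__":
    print(rrmse(0.25, 2.0))                           # 0.25
    print(rrmse(-0.1, 2.0, truncate_negative=True))   # 0.0
    print(rrmse(0.25, -2.0))                          # -0.25, negative estimate
```

The last call shows how a negative EB estimate produces a negative RRMSE even when the MSE itself is positive.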

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the data have an expected probability of zero of 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because some EB estimates are negative.

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although they are slightly outperformed by the NBR method when the data contain only a small number of zeros. In particular, as shown in Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a larger MSE for the EB estimator than NBR does.

If the data have an expected probability of zero of 0.6 to 0.7, however, ZINB gives better estimates. The estimates are also unbiased, as they cover the parameter values adequately. ZINB begins to produce inconsistent estimates if the data have an expected probability of zero of 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is bigger than the mean of MSE with ZINB. That is because the negative MSE values, once changed to zero, no longer reduce the mean.

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

EB estimates given by both the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a bigger MSE than NBR does as long as the expected probability of zero in the data remains below the 0.6 threshold.

The ZINB method performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case, the EB estimates given by the NBR method are unstable and inconsistent because of negative estimates and a huge MSE that can be thousands of times larger than the acceptable value. On the other hand, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimates with NBR.

Both methods perform poorly if the data have an expected probability of zero of 0.8 or more. The EB estimators from both methods are inconsistent as a result of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   045           017             024             014
0.2                   043           020             033             021
0.3                   071           028             052             032
0.4                   054           028             0632            042
0.5                   086           033             7322807         066
0.6                   061           038             29817           103
0.7                   058           025             218119          194
0.8                   -128          -14E-07         162697          375
0.9                   2954790       -1E-06          35E+278         609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   045           017             024             014
0.2                   0436          020             0324            021
0.3                   072           028             051             0311
0.4                   055           028             061             041
0.5                   095           033             6561235         058
0.6                   075           038             23406           070
0.7                   150           025             134655          069
0.8                   175           0               733506          0
0.9                   2954908       0               12E+278         0

CONCLUSION

Excess zeros in the data highly influence the result of EB estimation. A conventional method such as negative binomial regression for prior estimation produces an unstable and unreliable EB estimator for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produces a more reliable estimator even when the data have an expected probability of zero of 0.6 to 0.7.

ZINB also provides a reliable estimator for data with less than 53.33% zeros. This means that the performance of ZINB declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research rests on many assumptions and has several limitations. If the assumptions and restrictions can be relaxed, better results can be expected. Some recommendations for further research are:

1. The generating process in this research does not reflect the real sampling process. A generating process similar to the real sampling process might give better results because it would be closer to real applications.

2. It would be interesting to run an experiment with a larger number of areas, since the number of areas influences the data modeling.

3. Restricted Maximum Likelihood may be applied when estimating the prior parameter with ZINB and NBR in order to solve the problem of negative MSE values.

4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008: 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]

Appendix 1 Result of EB estimation with NBR

P(Y=0)  Actual       r    Statistic     min       Q1        median    mean      Q3        max
10      10-33.33     100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                          bias          0000446   05164     0878579   1315093   1721091   8704671
                          MSE           0040547   0109118   0159448   0333613   0335256   4167064
                          RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      13.33-36.67  100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                          bias          0000587   0413611   079407    1100373   1454889   7906915
                          MSE           0055631   0131969   0196963   0353033   0386291   3778251
                          RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-53.33     100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                          bias          0000151   0372382   067041    0916482   122012    5950225
                          MSE           0074364   0163462   0231014   0400207   0432371   5250254
                          RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      23.33-56.67  100  EB estimator  024882    064963    1219656   17107     2248716   930007
                          bias          0000564   0293602   0549809   0757937   1007851   486688
                          MSE           -100569   0194196   0271669   041875    045917    3239598
                          RRMSE         0123605   0300339   0422426   0503566   0642418   2202294
50      23.33-63.33  100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                          bias          000029    0250747   0453265   0622838   0803185   4009352
                          MSE           -237643   0235733   0306641   0452955   05091     3652167
                          RRMSE         0038956   0412708   0588924   0717336   0844735   3240156
60      30-70        100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                          bias          0000452   020433    0398131   0534095   0679938   3848209
                          MSE           -749011   0254097   0330078   -12875    0539873   2354887
                          RRMSE         -663045   051763    0813734   -038057   1287528   1767434
70      43.33-73.33  100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                          bias          0000375   0155154   0316124   0476883   0588926   8475103
                          MSE           -7513075  0235378   0402092   2536714   0876569   6051162
                          RRMSE         -10741    0704796   1355566   -121606   3040291   3332419
80      53.33-90     100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                          bias          0000395   0116669   0254473   1091172   0497898   6297454
                          MSE           -6E+09    -016583   0301527   -584495   5718409   185E+09
                          RRMSE         -212936   0927338   2115163   3094627   1359703   4151289
90      70-100       100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                          bias          000016    0086      0177169   0425532   0314714   1092655
                          MSE           -38E+09   -130817   0159682   39135606  3074073   12E+11
                          RRMSE         -909131   1647188   6639631   116E+10   1585472   706E+11

Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual       r    Statistic     min       Q1        median    mean      Q3        max
10      10-33.33     100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                          bias          0000603   0485515   0882468   1315721   1750173   8704672
                          MSE           -053954   010933    0168797   0449506   0369775   360843
                          RRMSE         0022947   0096443   0136424   0238099   0241955   5518468
20      13.33-36.67  100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                          bias          0000368   0383426   0780984   1105029   1496623   7906918
                          MSE           -07309    0126202   0201463   0425844   0414597   1734815
                          RRMSE         0021807   0144983   0210692   0326097   0401786   3177943
30      20-53.33     100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                          bias          0000508   0332427   0680187   0928947   1254604   6314973
                          MSE           -229891   0156942   0277017   0707983   0590466   7469014
                          RRMSE         0023998   0210095   0317195   0519524   0618802   3500387
40      23.33-56.67  100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                          bias          0000564   0268248   0544049   0771741   1067061   4889872
                          MSE           -125713   0181557   0284338   0540615   0498521   423089
                          RRMSE         0054916   028362    0420396   0630776   0778033   5394515
50      23.33-63.33  100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                          bias          0000453   0224726   0454522   0661709   0900005   4414442
                          MSE           -181856   0194818   0334706   0859252   0711939   7997074
                          RRMSE         0026206   0387294   0662251   7322807   1312302   13388294
60      30-70        100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                          bias          62E-05    0190886   0406245   0567657   074167    3923952
                          MSE           -34589    0078006   0376514   060793    0804116   3426488
                          RRMSE         000461    0502807   1033578   2981671   2012552   3308816
70      43.33-73.33  100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                          bias          979E-05   0128017   0358257   0487174   0654423   3733981
                          MSE           -142213   -001433   0255331   0584152   1132152   264456
                          RRMSE         0064209   0847956   1942286   2181192   4589042   7899681
80      53.33-90     100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                          bias          0000161   0083397   0272773   0392826   0557213   3532556
                          MSE           -10651    -56E-05   -14E-07   -127819   1452962   1132741
                          RRMSE         0063244   1475413   3754705   162697    9221163   3786684
90      70-100       100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                          bias          0000495   0054221   0153374   027819    0350213   2736904
                          MSE           -175652   -33E-05   -1E-06    2954790   152E-06   613E+08
                          RRMSE         0040681   4059441   6095076   35E+278   5569021   16E+281

Appendix 3 Result of EB estimation (II) with NBR

P(Y=0)  Actual       r    Statistic     min       Q1        median    mean      Q3        max
10      10-33.33     100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                          bias          0000446   05164     0878579   1315093   1721091   8704671
                          MSE           0040547   0109118   0159448   0333613   0335256   4167064
                          RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      13.33-36.67  100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                          bias          0000587   0413611   079407    1100373   1454889   7906915
                          MSE           0055631   0131969   0196963   0353033   0386291   3778251
                          RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-53.33     100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                          bias          0000151   0372382   067041    0916482   122012    5950225
                          MSE           0074364   0163462   0231014   0400207   0432371   5250254
                          RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      23.33-56.67  100  EB estimator  024882    064963    1219656   17107     2248716   930007
                          bias          0000564   0293602   0549809   0757937   1007851   486688
                          MSE           0         0194196   0271669   0419181   045917    3239598
                          RRMSE         0         0300116   0422209   0502895   0641904   2202294
50      23.33-63.33  100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                          bias          000029    0250747   0453265   0622838   0803185   4009352
                          MSE           0         0235733   0306641   0456258   05091     3652167
                          RRMSE         0         0410357   0585765   0712314   0841838   3240156
60      30-70        100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                          bias          0000452   020433    0398131   0534095   0679938   3848209
                          MSE           0         0254097   0330078   2619677   0539873   2354887
                          RRMSE         -663045   0448118   0750369   -034911   1209918   1767434
70      43.33-73.33  100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                          bias          0000375   0155154   0316124   0476883   0588926   8475103
                          MSE           0         0235378   0402092   9500073   0876569   6051162
                          RRMSE         -10741    0288999   0995659   -100163   2527784   3332419
80      53.33-90     100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                          bias          0000395   0116669   0254473   1091172   0497898   6297454
                          MSE           0         0         0301527   1444250   5718409   185E+09
                          RRMSE         -212936   0         1104113   2205437   5656681   4151289
90      70-100       100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                          bias          000016    0086      0177169   0425532   0314714   1092655
                          MSE           0         0         0159682   41595285  3074073   12E+11
                          RRMSE         -909131   0         0557622   677E+09   9311925   706E+11

Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0)  Actual       r    Statistic     min       Q1        median    mean      Q3        max
10      10-33.33     100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                          bias          0000603   0485515   0882468   1315721   1750173   8704672
                          MSE           0         010933    0168797   0450626   0369775   360843
                          RRMSE         0         0095932   0135647   023675    0239669   5518468
20      13.33-36.67  100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                          bias          0000368   0383426   0780984   1105029   1496623   7906918
                          MSE           0         0126202   0201463   0428006   0414597   1734815
                          RRMSE         0         0142648   020709    0320663   0395479   3177943
30      20-53.33     100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                          bias          0000508   0332427   0680187   0928947   1254604   6314973
                          MSE           0         0156942   0277017   0716543   0590466   7469014
                          RRMSE         0         0203913   0311937   0506882   0615401   3500387
40      23.33-56.67  100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                          bias          0000564   0268248   0544049   0771741   1067061   4889872
                          MSE           0         0181557   0284338   0549835   0498521   423089
                          RRMSE         0         0270309   0405926   0606317   0766631   5394515
50      23.33-63.33  100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                          bias          0000453   0224726   0454522   0661709   0900005   4414442
                          MSE           0         0194818   0334706   094973    0711939   7997074
                          RRMSE         0         0316402   0576343   6561235   1240175   13388294
60      30-70        100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                          bias          62E-05    0190886   0406245   0567657   074167    3923952
                          MSE           0         0078006   0376514   0749436   0804116   3426488
                          RRMSE         0         0258286   0698814   2340612   1714808   3308816
70      43.33-73.33  100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                          bias          979E-05   0128017   0358257   0487174   0654423   3733981
                          MSE           0         0         0255331   1501268   1132152   264456
                          RRMSE         0         0         0688797   1346552   2500825   7899681
80      53.33-90     100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                          bias          0000161   0083397   0272773   0392826   0557213   3532556
                          MSE           0         0         0         1755486   1452962   1132741
                          RRMSE         0         0         0         7335062   3311711   3786684
90      70-100       100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                          bias          0000495   0054221   0153374   027819    0350213   2736904
                          MSE           0         0         0         2954908   152E-06   613E+08
                          RRMSE         0         0         0         12E+278   416189    16E+281

Appendix 5 Syntax program for generate data

data b; /* generate x1 (covariate) and ei */
   input x1;
   cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

data e; /* generate poisson-gamma with excess zero */
   do kk=1 to 30;
      set b;
      tetha = rangam(1,1);
      lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
      starlambda = log(lambda*tetha);
      output;
   end;
run;

proc reg;
   model starlambda = x1;
   ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;

data _null_;
   set work.betha_lr_t;
   call symput('Intercept', col1);
   call symput('x1', col2);
run;

data d;
   do kk=1 to 30;
      set e;
      mu = exp(&Intercept + &x1*x1);
      parmlambda = mu*tetha;
      ypoi = rand('poisson', parmlambda);
      output;
   end;
run;

ods trace on;
/* to take percent zero on data */
proc freq data=d;
   tables ypoi;
   ods output OneWayFreqs=work.zero;
run;

data zero;
   set zero;
   keep percent;
run;

proc transpose data=zero out=zero1;
run;

data _null_;
   set work.zero1;
   call symput('pctz', col1);
run;

data d;
   set d;
   pzero=&pctz;
   r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data;
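The choice lambda = -log(p) in the generating program can be checked with a short Python sketch. It covers only the plain Poisson step, on the assumption that the aim is P(Y = 0) = exp(-lambda) = p; it does not reproduce the gamma mixing or the regression step of the SAS macro. The sampler, the function names, and the seed are hypothetical choices made for illustration.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method (small lam)."""
    limit = math.exp(-lam)
    count = 0
    prod = rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def zero_share(p_zero, n, seed=1):
    """Share of zeros among n Poisson draws when lam is set to -log(p_zero)."""
    rng = random.Random(seed)
    lam = -math.log(p_zero)
    draws = [poisson_draw(rng, lam) for _ in range(n)]
    return draws.count(0) / n

for p in (0.1, 0.5, 0.9):
    print(p, zero_share(p, 20000))
```

With enough draws the simulated share of zeros should sit close to the target probability, which is why the program can steer the expected share of zeros by changing the single constant inside the logarithm.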

Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
   set work.e;
   if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
   model ypoi = x1 / dist=nb link=log;
   ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
   set work.betha_nb_t;
   call symput('Intercept', col1);
   call symput('x1', col2);
   call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
   set a;
   mu_hat_b=exp(&Intercept + &x1*x1);
   w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
   teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
   g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
   bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
   set work.duga_nb1;
   if ^(u=&x) then delete;
run;

data work.jacknb&h;
   set work.d;
   if u=&x;
   if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
   output p out=sasyi_est;
   model ypoi = x1 / dist=nb link=log;
   ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
run;

proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;

data _null_;
   set work.betha_est_nbt&h;
   call symput('Intercept_', col1);
   call symput('x1_', col2);
   call symput('Dispersion_', col3);
run;

data work.duganb&h;
   set work.d;
   mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
   w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
   teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
   delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
   g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
   beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
   merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
         work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
         work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
         work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
         work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
         work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
   by kk;
run;

data work.mse_nb_j;
   set work.mse_nb_j;
   t_sum=0;
   g_sum=0;
   %do j=1 %to 30;
      g_sum=g_sum+beda_g_&j;
      t_sum=t_sum+delta_&j;
   %end;
   m2i=((30-1)/30)*t_sum;
   m1i=g1-((30-1)/30)*g_sum;
   mse_j_b=m1i+m2i;
   rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
   ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
   merge work.d work.mse_nb_j;
   keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
run;

%END;
%mend;

%sae_nb;
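The way the macro combines its jackknife pieces (m1i, m2i, mse_j_b) can be sketched in Python. This is a minimal sketch under assumptions: the leave-one-out fits are passed in as plain (mu_hat, alpha_hat) pairs rather than refitted, the operators are read from the SAS listing as reconstructed here, and all names and numbers are hypothetical.

```python
def eb_parts(y, mu_hat, alpha_hat):
    """EB estimate and posterior-variance term g1 for one area."""
    w = mu_hat / (mu_hat + alpha_hat)
    theta = w * y + (1.0 - w) * mu_hat
    g1 = (alpha_hat + y) / (mu_hat + alpha_hat) ** 2
    return theta, g1

def jackknife_mse(y, full_fit, loo_fits):
    """Combine full-sample and leave-one-out fits into the jackknife MSE.

    full_fit is (mu_hat, alpha_hat) from all areas; loo_fits holds one
    (mu_hat, alpha_hat) pair per deleted area.
    """
    m = len(loo_fits)
    theta, g1 = eb_parts(y, *full_fit)
    g_sum = 0.0
    t_sum = 0.0
    for mu_l, alpha_l in loo_fits:
        theta_l, g1_l = eb_parts(y, mu_l, alpha_l)
        g_sum += g1_l - g1                 # bias correction for g1
        t_sum += (theta_l - theta) ** 2    # variability of the EB estimate
    m1 = g1 - (m - 1) / m * g_sum
    m2 = (m - 1) / m * t_sum
    return m1 + m2

# Hypothetical fits: the leave-one-out g1 values exceed the full-sample g1.
print(jackknife_mse(0, (2.0, 1.0), [(0.5, 1.0)] * 3))
```

Because the first term subtracts a correction that can exceed g1 itself, the result is not guaranteed to be positive, which is consistent with the negative MSE values reported in Table 1.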

Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
   set work.e;
   if ^(u=&x) then delete;
run;

proc countreg data=a;
   model ypoi=x1 / dist=zinb method=qn;
   zeromodel ypoi ~ x1;
   ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
   set work.pe_t;
   call symput('Intercept', col1);
   call symput('x1', col2);
   call symput('Inf_Intercept', col3);
   call symput('Inf_x1', col4);
   call symput('_Alpha', col5);
run;

data work.duga;
   set a;
   mu_hat_b=exp(&Intercept + &x1*x1);
   w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
   teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
   g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
   bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
   set work.duga1;
   if ^(u=&x) then delete;
run;

data work.jackzinb&h;
   set work.d;
   if u=&x;
   if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
   model ypoi=x1 / dist=zinb method=qn;
   zeromodel ypoi ~ x1;
   ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
   set work.betha_est_ZINBt&h;
   call symput('Intercept', col1);
   call symput('x1', col2);
   call symput('Inf_Intercept', col3);
   call symput('Inf_x1', col4);
   call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
   set work.d;
   mu_hat_b_&h=exp(&Intercept + &x1*x1);
   /* mu_hat_b_&h= &b_o- + &b_1- x1 */
   w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
   teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
   delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
   /* g1_&h =((mu_hat_b_&h2&alpha_)2)(&alpha_+y_i)((mu_hat_b_&h2&alpha_)+mu_hat_b_&h)2 */
   g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
   /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
   beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
   merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
         work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
         work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
         work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
         work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
         work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
   by kk;
run;

data work.mse_ZINB_j;
   set work.mse_ZINB_j;
   t_sum=0;
   g_sum=0;
   %do j=1 %to 30;
      g_sum=g_sum+beda_g_&j;
      t_sum=t_sum+delta_&j;
   %end;
   m2i=((30-1)/30)*t_sum;
   m1i=g1-((30-1)/30)*g_sum;
   mse_j_b=m1i+m2i;
   rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
   merge work.d work.mse_ZINB_j;
   keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB BASE=work.hasilZINB1;
run;

%END;
%mend;

%sae_zinb;


Title: ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION
Name: Irene Muflikh Nadhiroh
ID No.: G14104031

Approved by

Advisor I: Prof. Dr. Ir. Khairil Anwar Notodiputro, NIP 130891386
Advisor II: Ir. Indahwati, M.Si., NIP 131909223

Acknowledged by
Dean of Faculty of Mathematics and Natural Sciences
Bogor Agricultural University

Dr. Drh. Hasim, DEA
NIP 131578806

Passed examination date:

BIOGRAPHY

Irene Muflikh Nadhiroh was born in Padang on October 3rd, 1986, as the first daughter of Ir. Irianto Oetomo and Fine Analisa Maharani. She has two siblings.

In 1998 she graduated from SD Dukuh 09 East Jakarta and then continued her studies at SLTP Negeri 1 Bogor, graduating in 2001. She finished her studies at SMU Labschool Rawamangun Jakarta in 2004 and then enrolled in Bogor Agricultural University through USMI. In 2004 she joined the Department of Statistics, Faculty of Mathematics and Natural Sciences.

During her studies she served as a lecturer's assistant for the Basic Statistics class and the Experimental Design class in 2006 and 2007 respectively. She was also a member of Gamma Sigma Beta (Statistics Students Association, GSB) and was head of the science division of GSB in 2006-2007. In February-March 2008 she completed her field practice at PT Field Dimension Indonesia.

ACKNOWLEDGEMENTS

First of all, the author humbly acknowledges that the completion of this paper would not have been possible without invaluable help from many generous and extraordinary people. The author is deeply indebted to them for their help, ideas, criticism, and suggestions for improvement during the writing process. However, they should not be held responsible for any mistakes and deficiencies in this paper, which are purely the author's. I would therefore like to express my gratitude to:

1. Allah SWT, to whom all praise and gratitude belong. Alhamdulillah hirabbil alamin. With His blessing I was able to finish this paper. Thanks to Allah for giving me a wonderful life with extraordinary people around me.
2. Prof. Dr. Khairil Anwar Notodiputro and Ir. Indahwati, M.Si., for the early motivation, discussion, advice, support, and their great enthusiasm.
3. Mr. Bambang Sumantri, M.Si., as examiner, for the spirit, advice, and criticism.
4. My beloved family, for the unlimited love ever after.
5. Mr. Alfian Futuhul Hadi, M.Si., for enlightening discussion when I was in trouble.
6. Mr. Bagus Sartono, M.Si., thank you very much for running my data at your lab with your wonderful computer. Sorry if it disturbed you.
7. Mr. Anang, M.Si., and Mr. Rahman, M.Si., for sharing their knowledge and technical support.
8. Dr. Ir. Hari Wijayanto, M.S., and all lecturers and staff at the Statistics Department. Thanks for the knowledge of statistics and the knowledge of life that you shared. It means a lot to me.
9. Rahmatullah Sigit Dodiet Sasongko, S.Si., for the spirit, love, care, time, and patience. Keep it real. Still love me forever and ever.
10. Mr. Dionisius Laksmana Bisara Putra, S.Si., for editing my paper, for the criticism, and for useful discussion.
11. Maulana Chistanto, S.Si., and Yhanuar Ismail, S.Si., thanks for being my best brothers.
12. Nikhen Sevrien and (the late Dini), thanks for lighting my day.
13. Rere Yusri Agus Ika Cinong Toki Cheri Fisca Wiwik Neng Mala Lilis Dika Rangga Lele Dodi Kus Inal Bebek Koler and all of Statistics' 41.
14. Everyone who helped me in this study and cannot be named personally.

This thesis is not perfect, so I welcome criticism, advice, and recommendations from those who read it. Thank you. God bless you all.

Bogor, January 2009

Irene Muflikh Nadhiroh

TABLE OF CONTENTS

INTRODUCTION
   Background
   Objectives
LITERATURE REVIEW
   Direct Estimation
   Small Area Estimation
   Small Area Models
   Empirical Bayes Methods
   Poisson-Gamma Models
   Negative Binomial Regression
   Over-disperse at Count Data
   Zero-Inflated Models
   Zero-Inflated Negative Binomial
   Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)
METHODOLOGY
   Data
   Methods
RESULT AND DISCUSSION
   Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
   Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
   Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB
CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generate data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB

INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but it is sometimes difficult to use such an estimator in a smaller region, especially when the sample size is too small. In this case, indirect estimation, which adds covariates to estimate the parameter, is usually used. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research on Small Area Estimation based on Poisson-Gamma models. Maximum Likelihood Estimation was used with Negative Binomial Regression techniques to estimate the respective prior parameters. Moreover, Negative Binomial Regression was used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zero is a condition in which the data contain too many zeros, more than the distribution leads us to expect. For example, in 100 observations from a Poisson model with a mean response of 4, we would expect about 2 zeros. If the data have 30 zeros, it should be obvious that the distributional assumptions have been violated. The estimated parameters and standard errors will therefore be biased (Hardin & Hilbe 2007). In this paper, Zero-Inflated models are adapted to solve this type of problem.
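The figure of about 2 zeros follows from the Poisson probability of a zero, P(Y = 0) = exp(-mean). A small Python check, with a hypothetical function name, makes the arithmetic explicit:

```python
import math

def expected_zeros(n_obs, mean):
    """Expected number of zeros among n_obs Poisson(mean) observations."""
    return n_obs * math.exp(-mean)

print(round(expected_zeros(100, 4), 2))  # prints 1.83
```

Against an expectation of roughly 1.83 zeros, observing 30 zeros in 100 observations is far outside what a Poisson model with mean 4 would produce.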

Objectives

The research objectives are:

1. To investigate the performance of Negative Binomial Regression in Small Area Estimation in the case of excess zeros.

2. To apply Zero-Inflated Count Models in Small Area Estimation in the case of excess zeros.

3. To evaluate the performance of Zero-Inflated Count Models in estimating the prior parameter for Small Area Estimation.

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design, with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have long been used to obtain reliable estimates of parameters. Ramsini et al (2001) mentioned that direct estimates for small areas are unbiased, although they have a large variance because of the small sample size.

Small Area Estimation

The term small area can mean anything, depending on the object of interest. It can be a city, an age group, a sex group, a region, or a rural district. In general, small area denotes any domain for which direct estimates of adequate precision cannot be produced (Rao 2003). This happens because the sample size in the small area is too small, so that direct estimation based on the sampling design cannot reach adequate precision. Small area estimation was therefore developed as a statistical technique for estimating small area parameters with an adequate level of precision. It works as an indirect estimation that borrows strength from the values of the variable of interest in related areas, through supplementary information related to the variable of interest such as recent census counts and current administrative records (Rao 2003). Indirect estimation is the process of estimating a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator works by including data from other domains (Kurnia & Notodiputro 2006).

Small Area Models

There are two kinds of link models in indirect estimation. The first is the traditional method based on implicit models that provide a link to related small areas through supplementary data. The second is explicit small area models that make specific allowance for between-area variation (Rao 2003). This research used the second kind, which can be classified into two broad types of basic model.

1. Basic area level (type A) model

Basic area level models, or aggregate models, include all models that relate small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to area-specific auxiliary data or covariates $x_i = (x_{1i}, \ldots, x_{pi})^T$ by a linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is a $p \times 1$ vector of regression coefficients, and $b_i$ are known positive constants. For making inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors $e_i \sim N(0, \sigma_{ei}^2)$ and the $\sigma_{ei}^2$ are known. Combining both models gives the new model

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2. Basic unit level (type B) model

Unit level models include all models that relate unit values of the study variable to unit-specific auxiliary variables. With unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$, the corresponding nested regression model is

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_{ei}^2)$.

Empirical Bayes Methods

The Bayesian approach is based on Bayes' Law, which was discovered by Thomas Bayes. The law was introduced by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781, Laplace gave the details and relevance for modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick in Good (1980) mentioned that the Bayes method is difficult to adopt and is sometimes very sensitive because it requires prior probability information, which is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate it from the sample. Rao (2003) said that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:

1. Obtain the posterior probability density function of the small area parameter.

2. Estimate the parameters from the marginal density function.

3. Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma ModelsPoisson model is a standard model in

dealing with count data Generally count data can be suffered by over-dispersion problem Therefore a Poisson formula had been developed to accommodate extra variance from sample data Two-stage models have been introduced for count data known as mixed model Poisson-Gamma Wakefield (2006) introduced Poisson-Gamma model which was easier to use with SMR (Standard Mortality Ratio) as a direct estimator This study used Wakefield model with alteration in direct estimator

Let iy be a number of specific individual

at small area-i which has specific characteristic of interest and written as follow

j

iji yy

ijy are the-jth object at the-ith small area where

j=1hellipn and i=1hellipm

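First stage: $y_i \mid \theta_i \overset{ind}{\sim} \mathrm{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at the area level, $x_i$ is a vector of covariates, and $\beta = (\beta_0, \dots, \beta_p)^{T}$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \mathrm{gamma}(\alpha, 1/\alpha)$ (shape, scale) is assumed as the prior distribution, with mean 1 and variance $1/\alpha$.

Then the marginal distribution $y_i \mid \beta, \alpha$ is negative binomial. Moreover, Wakefield (2006) used Bayes' theorem and obtained the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \mathrm{gamma}\left(y_i + \alpha,\ \frac{1}{\mu_i + \alpha}\right)$$

and the EB estimator

$$\hat{\theta}_i^{EB} = \hat{\theta}_i^{B}(\hat{\beta}, \hat{\alpha}) = \hat{\gamma}_i \hat{\theta}_i + (1 - \hat{\gamma}_i)\,\hat{\mu}_i$$

with $\hat{\gamma}_i = \hat{\mu}_i / (\hat{\alpha} + \hat{\mu}_i)$, where $\hat{\theta}_i$ is the direct estimate of $\theta_i$ and $y_i$ is the number of observations.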
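The shrinkage step can be illustrated with a minimal Python sketch (the thesis itself used SAS). It assumes the weight and posterior-variance expressions that appear in the thesis's SAS appendix, namely w = mu / (mu + alpha) and g1 = (alpha + y) / (mu + alpha)^2, and takes the observed count as the direct estimate. The function name and argument names are illustrative only.

```python
def eb_estimate(y, mu_hat, alpha_hat):
    """Return (weight, EB estimate, g1) for one small area.

    y         : observed count, used here as the direct estimate (assumption)
    mu_hat    : fitted regression mean for the area
    alpha_hat : estimated prior (gamma) parameter
    """
    # Shrinkage weight: close to 1 when the fitted mean dominates alpha.
    weight = mu_hat / (alpha_hat + mu_hat)
    # EB estimate: weighted average of the direct estimate and the fitted mean.
    theta_eb = weight * y + (1.0 - weight) * mu_hat
    # Posterior-variance term used later in the jackknife MSE.
    g1 = (alpha_hat + y) / (mu_hat + alpha_hat) ** 2
    return weight, theta_eb, g1
```

For example, with y = 4, a fitted mean of 2 and alpha = 2, the weight is 0.5, so the EB estimate lies halfway between the count and the fitted mean.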

Negative Binomial Regression

The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture-model parameterization of the negative binomial, providing formulas for its log-likelihood, mean, variance, and moments. Later, Breslow (1990) cited Lawless' work, and from its inception to the late 1980s the negative binomial regression model was construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial distribution function is written as

$$g(y \mid x) = \frac{\Gamma(y + 1/k)}{\Gamma(y + 1)\,\Gamma(1/k)} \cdot \frac{(k\mu)^{y}}{(1 + k\mu)^{y + 1/k}}$$

where $y = 0, 1, 2, \dots$; $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\mathrm{var}(y) = \mu + k\mu^{2}$. The parameter $k$ is called the dispersion parameter; $k > 0$ indicates that the data are over-dispersed.
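As a hedged illustration, the Python sketch below evaluates a negative binomial probability function in the mean-dispersion form, assuming the parameterization in which the variance is mu + k * mu^2 (the convention used by SAS PROC GENMOD). The function name is hypothetical, and the computation is done on the log scale to avoid overflow in the gamma functions.

```python
import math


def nb_pmf(y, mu, k):
    """Negative binomial probability of count y with mean mu and dispersion k > 0.

    Assumed parameterization: E(y) = mu, var(y) = mu + k * mu**2.
    """
    inv_k = 1.0 / k
    log_p = (
        math.lgamma(y + inv_k)          # log Gamma(y + 1/k)
        - math.lgamma(y + 1)            # log Gamma(y + 1) = log y!
        - math.lgamma(inv_k)            # log Gamma(1/k)
        + y * math.log(k * mu)          # (k mu)^y
        - (y + inv_k) * math.log1p(k * mu)  # (1 + k mu)^-(y + 1/k)
    )
    return math.exp(log_p)
```

Summing y * nb_pmf(y, mu, k) over a long enough range of y recovers mu, which is a quick way to check that the parameterization matches the stated mean and variance.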

Over-dispersion in Count Data

Count data modeled by Poisson regression are said to be over-dispersed if the variance is larger than the mean, that is, if the observed variance exceeds the value expected under the Poisson model. This phenomenon is written as

$$\mathrm{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).
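A quick empirical check of this condition can be sketched in Python by comparing the sample variance with the sample mean. This is a rough diagnostic under the assumption of a common mean, not a formal test, and the function name is illustrative.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    return variance(counts) / mean(counts)
```

For the counts 0, 0, 0, 1, 9 the mean is 2 and the sample variance is 15.5, so the ratio is 7.75, far above the value of 1 implied by a Poisson model.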

Zero-Inflated Models

Zero-inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group, plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is then given by (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + (1 - B(0))\,\Pr(0)$$

The probability of a nonzero count is

$$\Pr(y = k;\ k > 0) = [1 - B(0)]\,\Pr(k)$$

This model produces two groups of parameters. The first is the zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The second is the negative binomial parameters, which model the response as a function of the covariates.

Zero-Inflated Negative Binomial

There are many kinds of zero-inflated models; each has advantages and disadvantages and suits a different type of data. The zero-inflated negative binomial (ZINB) model is one of them. It is used for data that are both over-dispersed and contain excess zeros. As a result, its parameter estimates include a parameter $k$ that indicates over-dispersion in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is as follows:

$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi_i + (1 - \varphi_i)\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi_i)\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi_i$ is a function of $z_i^{T}\gamma$, $z_i$ is the vector of zero-inflation covariates, and $\gamma$ is the vector of zero-inflation coefficients to be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + 1/k)}{\Gamma(y_i + 1)\,\Gamma(1/k)} \cdot \frac{(k\mu_i)^{y_i}}{(1 + k\mu_i)^{y_i + 1/k}}$$

The mean and variance of the ZINB are

$$E(y_i \mid x_i) = (1 - \varphi_i)\,\mu_i$$

$$V(y_i \mid x_i) = (1 - \varphi_i)\,\mu_i\,\bigl(1 + \mu_i(\varphi_i + k)\bigr)$$
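The mixture structure can be sketched in Python as follows. This is a minimal illustration that assumes a negative binomial count component with mean mu and variance mu + k * mu^2, and treats the zero-inflation probability phi as a given number rather than a fitted function of covariates. All names are hypothetical.

```python
import math


def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k (var = mu + k*mu**2)."""
    inv_k = 1.0 / k
    log_p = (math.lgamma(y + inv_k) - math.lgamma(y + 1) - math.lgamma(inv_k)
             + y * math.log(k * mu) - (y + inv_k) * math.log1p(k * mu))
    return math.exp(log_p)


def zinb_pmf(y, mu, k, phi):
    """Zero-inflated negative binomial probability of count y.

    phi is the probability of belonging to the always-zero group.
    """
    if y == 0:
        # Zeros come from the always-zero group or from the count process.
        return phi + (1.0 - phi) * nb_pmf(0, mu, k)
    # Positive counts can only come from the count process.
    return (1.0 - phi) * nb_pmf(y, mu, k)
```

With mu = 2, k = 0.5 and phi = 0.3, the probability of a zero rises from 0.25 under the plain negative binomial to 0.475, while the mean drops to (1 - phi) * mu = 1.4.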

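Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method was introduced by Tukey (1958) and developed into a method that corrects the bias of an estimator by removing observation $l$, for $l = 1, \dots, m$, and re-estimating the parameters each time.

Following Rao (2003), the jackknife steps for estimating MSE($\hat{\theta}_i^{EB}$) are:

1. Assume that $\hat{\theta}_i^{EB} = k_i(y_i, \hat{\beta}, \hat{\alpha})$ and $\hat{\theta}_{i,-l}^{EB} = k_i(y_i, \hat{\beta}_{-l}, \hat{\alpha}_{-l})$, then calculate

$$\hat{M}_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat{\theta}_{i,-l}^{EB} - \hat{\theta}_i^{EB}\right)^{2}$$

2. Calculate the delete-$l$ estimators $\hat{\alpha}_{-l}$ and $\hat{\beta}_{-l}$, then calculate

$$\hat{M}_{1i} = g_{1i}(\hat{\alpha}, \hat{\beta}, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[ g_{1i}(\hat{\alpha}_{-l}, \hat{\beta}_{-l}, y_i) - g_{1i}(\hat{\alpha}, \hat{\beta}, y_i) \right]$$

Here $g_{1i}$ is the estimated variance of the posterior distribution, which measures the variability associated with $\hat{\theta}_i$. Using $g_{1i}$ alone leads to severe underestimation of $MSE_J(\hat{\theta}_i^{EB})$, because it ignores the estimation of the prior parameters. The estimator $\hat{M}_{1i}$ therefore corrects the bias of $g_{1i}$.

3. Calculate the jackknife estimator of MSE($\hat{\theta}_i^{EB}$) as

$$MSE_J(\hat{\theta}_i^{EB}) = \hat{M}_{1i} + \hat{M}_{2i}$$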
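The three jackknife steps can be sketched in Python as below. To keep the example self-contained, the regression fit is replaced by a toy moment-based estimator with a common mean (an assumption made only for illustration; the thesis fitted NBR and ZINB models in SAS). The shrinkage and g1 expressions follow those in the thesis's SAS appendix, and all function names are hypothetical.

```python
from statistics import mean, variance


def fit_moments(y):
    """Toy stand-in for the regression fit: common mean and moment-based alpha."""
    mu = mean(y)
    excess = variance(y) - mu           # variance beyond the Poisson level
    alpha = mu * mu / excess if excess > 0 else 1e6
    return mu, alpha


def eb(y_i, mu, alpha):
    """EB estimate: weighted average of the count and the fitted mean."""
    w = mu / (mu + alpha)
    return w * y_i + (1 - w) * mu


def g1(y_i, mu, alpha):
    """Posterior-variance term."""
    return (alpha + y_i) / (mu + alpha) ** 2


def jackknife_mse(y):
    """Jackknife MSE estimate M1 + M2 for every area in the list y."""
    m = len(y)
    mu, alpha = fit_moments(y)
    # Delete-l fits: re-estimate the parameters with area l removed.
    loo = [fit_moments(y[:l] + y[l + 1:]) for l in range(m)]
    result = []
    for y_i in y:
        full_eb, full_g1 = eb(y_i, mu, alpha), g1(y_i, mu, alpha)
        m2 = (m - 1) / m * sum((eb(y_i, a, b) - full_eb) ** 2 for a, b in loo)
        m1 = full_g1 - (m - 1) / m * sum(g1(y_i, a, b) - full_g1 for a, b in loo)
        result.append(m1 + m2)          # note: m1, and hence the sum, can be negative
    return result
```

Because the bias-corrected term M1 is a difference, nothing in the construction forces the final estimate to be non-negative.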

METHODOLOGY

Data

This research assumed that the available auxiliary data are at the area level, so the basic area-level model was used. The data were simulated for 30 small areas with one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. The relationship between the response and the covariate was assumed to be linear.

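Methods

The data were generated in SAS 9.1 using the following steps:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $\mathrm{Lambda}_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim \mathrm{Gamma}(1, 1)$.
4. Calculate $\lambda_i^{*} = \log(\mathrm{Lambda}_i / \theta_i)$.
5. Fit a linear regression of $\lambda_i^{*}$ on $X_i$ to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.
6. Calculate $\mu_i = \exp(X_i'\hat{\beta})$.
7. Calculate $\mathrm{parmlambda}_i = \mu_i \theta_i$.
8. Generate $y_i \sim \mathrm{Poisson}(\mathrm{parmlambda}_i)$.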
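The generation steps can be sketched in Python with the standard library only (the thesis used SAS 9.1). Two points are assumptions: the operator in step 4 was lost in the source, and division is used here because it makes the Poisson rate equal to Lambda before the regression smoothing; and the covariate values passed in are placeholders, since the original values could not be recovered. Function names are illustrative.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method (fine for small rates)."""
    limit, count, product = math.exp(-rate), 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def simulate_batch(x, p_zero, rng):
    """Generate one batch of counts for covariate values x and target P(Y = 0)."""
    lam = -math.log(p_zero)                               # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]       # step 3: Gamma(1, 1)
    lam_star = [math.log(lam / t) for t in theta]         # step 4 (division assumed)
    # Step 5: ordinary least squares of lam_star on x.
    n = len(x)
    x_bar, l_bar = sum(x) / n, sum(lam_star) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (li - l_bar) for xi, li in zip(x, lam_star)) / sxx
    b0 = l_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]             # step 6
    rate = [m * t for m, t in zip(mu, theta)]             # step 7
    y = [poisson_draw(r, rng) for r in rate]              # step 8
    return y, rate
```

A call such as simulate_batch(list_of_30_covariates, 0.6, random.Random(1)) returns the 30 simulated counts together with the area-level Poisson rates that serve as the true parameter values.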

The data were then analyzed using the following steps:
1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the zero-inflated negative binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters, $\alpha$ and $\beta$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Root Relative Mean Square Error):

$$RRMSE(\hat{\theta}_i) = \frac{\sqrt{MSE(\hat{\theta}_i)}}{\hat{\theta}_i}$$

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced small and consistent MSE values. However, when the data contain approximately 30% zeros or more, at an expected probability of zero of 0.6, the estimates tend to be unreliable, and the EB estimation produced negative values.

The RRMSE of the estimator increases as the number of zeros in the data increases. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1. MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -128.75       0.33            -0.38           0.81
0.7                   2536.71       0.40            -12.16          1.35
0.8                   -5844.95      0.30            309.46          2.11
0.9                   391356.06     0.16            1.16E+10        6.64

Table 2. MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   261.97        0.33            -0.35           0.75
0.7                   9500.07       0.40            -10.02          0.99
0.8                   14442.50      0.30            220.54          1.10
0.9                   415952.85     0.16            6.77E+09        0.56

Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative MSE values have been changed to zero.
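The truncation rule and the RRMSE calculation can be sketched in Python as follows. Returning NaN when the MSE is negative and truncation is switched off is an assumption made for this illustration (the thesis does not say how such cases were treated before truncation), and the function name is hypothetical.

```python
import math


def rrmse(mse, theta_hat, truncate=False):
    """Square root of the MSE divided by the EB estimate.

    With truncate=True a negative MSE is first set to zero, as for MSE (II).
    """
    if truncate and mse < 0:
        mse = 0.0
    if mse < 0:
        return float("nan")             # square root undefined (assumed handling)
    return math.sqrt(mse) / theta_hat
```

Note that the ratio is negative whenever the EB estimate itself is negative, even after the MSE has been truncated.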

When the expected probability of zero is 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the expected probability of zero is 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because some EB estimates are negative.

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although ZINB is slightly outperformed by NBR when the data contain only a small number of zeros. In particular, as shown in Table 3, when the expected probability of zero is 0.1 to 0.5, ZINB produces a larger MSE for the EB estimator than NBR does.

In contrast, when the expected probability of zero is 0.6 to 0.7, ZINB gives better estimates. The estimates were also unbiased, as they cover the parameter values adequately. However, ZINB begins to produce inconsistent estimates when the expected probability of zero is 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is larger than the mean of MSE with ZINB. This is because, once the negative MSE values are changed to zero, they no longer reduce the mean.

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a larger MSE than NBR does as long as the expected probability of zero in the data does not exceed the 0.6 threshold.

The ZINB method performs better when the expected probability of zero is 0.6 to 0.7. In this case, the EB estimates given by the NBR method are unstable and inconsistent, with negative estimates and a huge MSE that can be thousands of times larger than an acceptable value. In contrast, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimator with NBR.

Both methods performed poorly when the expected probability of zero was 0.8 or more. The EB estimators from both methods were inconsistent as a result of the very large MSE values they produced.

Table 3. MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            73228.07        0.66
0.6                   0.61          0.38            298.17          1.03
0.7                   0.58          0.25            2181.19         1.94
0.8                   -1.28         -1.4E-07        1626.97         3.75
0.9                   29547.90      -1E-06          3.5E+278        6095.08

Table 4. MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            65612.35        0.58
0.6                   0.75          0.38            234.06          0.70
0.7                   1.50          0.25            1346.55         0.69
0.8                   1.75          0               7335.06         0
0.9                   29549.08      0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method for prior estimation, such as negative binomial regression, produced biased and unreliable EB estimators for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator, even when the expected probability of zero was 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. Its performance declines when the expected probability of zero is 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research rests on many assumptions and has several limitations. If these assumptions and limitations can be relaxed, better results can be expected. Some recommendations for further research are:
1. The data-generating process in this research does not reflect a real sampling process. A generating process closer to real sampling might give better results, because it would be closer to real applications.
2. It would be interesting to run experiments with a larger number of areas, since the number of areas influences the data modeling.
3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to address the negative MSE values.
4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, Paper 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]

Appendix 1 Result of EB estimation with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE -100569 0194196 0271669 041875 045917 3239598RRMSE 0123605 0300339 0422426 0503566 0642418 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE -237643 0235733 0306641 0452955 05091 3652167RRMSE 0038956 0412708 0588924 0717336 0844735 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE -749011 0254097 0330078 -12875 0539873 2354887RRMSE -663045 051763 0813734 -038057 1287528 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE -7513075 0235378 0402092 2536714 0876569 6051162RRMSE -10741 0704796 1355566 -121606 3040291 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601

bias 0000395 0116669 0254473 1091172 0497898 6297454MSE -6E+09 -016583 0301527 -584495 5718409 185E+09RRMSE -212936 0927338 2115163 3094627 1359703 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655MSE -38E+09 -130817 0159682 39135606 3074073 12E+11

RRMSE -909131 1647188 6639631 116E+10 1585472 706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE -053954 010933 0168797 0449506 0369775 360843RRMSE 0022947 0096443 0136424 0238099 0241955 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE -07309 0126202 0201463 0425844 0414597 1734815RRMSE 0021807 0144983 0210692 0326097 0401786 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE -229891 0156942 0277017 0707983 0590466 7469014RRMSE 0023998 0210095 0317195 0519524 0618802 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE -125713 0181557 0284338 0540615 0498521 423089RRMSE 0054916 028362 0420396 0630776 0778033 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442

MSE -181856 0194818 0334706 0859252 0711939 7997074RRMSE 0026206 0387294 0662251 7322807 1312302 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE -34589 0078006 0376514 060793 0804116 3426488RRMSE 000461 0502807 1033578 2981671 2012552 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE -142213 -001433 0255331 0584152 1132152 264456RRMSE 0064209 0847956 1942286 2181192 4589042 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE -10651 -56E-05 -14E-07 -127819 1452962 1132741RRMSE 0063244 1475413 3754705 162697 9221163 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE -175652 -33E-05 -1E-06 2954790 152E-06 613E+08

RRMSE 0040681 4059441 6095076 35E+278 5569021 16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE 0 0194196 0271669 0419181 045917 3239598RRMSE 0 0300116 0422209 0502895 0641904 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE 0 0235733 0306641 0456258 05091 3652167RRMSE 0 0410357 0585765 0712314 0841838 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE 0 0254097 0330078 2619677 0539873 2354887RRMSE -663045 0448118 0750369 -034911 1209918 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE 0 0235378 0402092 9500073 0876569 6051162RRMSE -10741 0288999 0995659 -100163 2527784 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601bias 0000395 0116669 0254473 1091172 0497898 6297454MSE 0 0 0301527 1444250 5718409 185E+09RRMSE -212936 0 1104113 2205437 5656681 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655

MSE 0 0 0159682 41595285 3074073 12E+11

RRMSE -909131 0 0557622 677E+09 9311925 706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE 0 010933 0168797 0450626 0369775 360843RRMSE 0 0095932 0135647 023675 0239669 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE 0 0126202 0201463 0428006 0414597 1734815RRMSE 0 0142648 020709 0320663 0395479 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE 0 0156942 0277017 0716543 0590466 7469014RRMSE 0 0203913 0311937 0506882 0615401 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE 0 0181557 0284338 0549835 0498521 423089RRMSE 0 0270309 0405926 0606317 0766631 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442MSE 0 0194818 0334706 094973 0711939 7997074RRMSE 0 0316402 0576343 6561235 1240175 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE 0 0078006 0376514 0749436 0804116 3426488RRMSE 0 0258286 0698814 2340612 1714808 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE 0 0 0255331 1501268 1132152 264456RRMSE 0 0 0688797 1346552 2500825 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE 0 0 0 1755486 1452962 1132741RRMSE 0 0 0 7335062 3311711 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE 0 0 0 2954908 152E-06 613E+08

RRMSE 0 0 0 12E+278 416189 16E+281


Appendix 5 Syntax program for generate data

data b generate x1(covariate) and ei input x1cards0222831971100013131702314625252218171412202210run

macro bangkit_datado r=1 to 100

data egenerate poisson-gamma with excess zerodo kk=1 to 30set btetha = rangam(11)lambda = -log(01) peluang munculnya nilai nol yang diinginkan (01-09)starlambda = log(lambdatetha)output endrun

proc regmodel starlambda = x1 ods output ParameterEstimates=workbetha_lr (keep=Parameter Estimate)run

proc transpose data=workbetha_lr out=workbetha_lr_t


rundata _null_set workbetha_lr_tcall symput (Intercept col1)call symput (x1 col2)run

data ddo kk=1 to 30set emu = exp(ampIntercept + ampx1x1)parmlambda = mutethaypoi = rand(poissonparmlambda)output endrun

ods trace onto take percent zero on dataproc freq data=dtables ypoi ods output OneWayFreqs=workzerorundata zeroset zerokeep percentrunproc transpose data=zero out=zero1 rundata _null_set workzero1call symput (pctz col1)rundata dset dpzero=amppctzr=amprrun

proc append data=d base=d1run

endmend

bangkit_data


Appendix 6 Syntax program EB with NBR

macro sae_nbdo x=1 to 900

data workaset workeif ^(u=ampx) then deleterun

this genmod procedure estimates the response without zero-inflation proc genmod data=amodel ypoi = x1 dist=nb link=logods output ParameterEstimates=workbetha_nb (keep=Parameter Estimate)run

proc transpose data=workbetha_nb out=workbetha_nb_trun

data _null_set workbetha_nb_tcall symput (Intercept col1)call symput (x1 col2)call symput (Dispersion col3)run

EB with negbin-regdata workduga_nbset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + ampDispersion)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(ampDispersion+ypoi)((mu_hat_b+ampDispersion)2)bias_b=abs(teta_hat_bayes-parmlambda)run

proc append data=workduga_nb base=workduga_nb1run

jacknifedo h=1 to 30

data workdset workduga_nb1if ^(u=ampx) then deleterundata workjacknbamphset workdif u=ampxif kk=amph then deleterun

proc genmod data=workjacknbamph output p out=sasyi_estmodel ypoi = x1 dist = nb link=logods output parameterestimates=workbetha_est_nbamph (keep=parameter Estimate)


runproc transpose data=workbetha_est_nbamph out=workbetha_est_nbtamphrundata _null_set workbetha_est_nbtamphcall symput (Intercept_ col1)call symput (x1_ col2)call symput (Dispersion_ col3)run

data workduganbamphset workdmu_hat_b_amph=exp(ampIntercept_ + ampx1_x1)w_b_amph=mu_hat_b_amph (mu_hat_b_amph + ampDispersion_)teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2g1_amph=(ampDispersion_+ypoi)((mu_hat_b_amph+ampDispersion_)2)beda_g_amph=g1_amph-g1run

data workmse_nb_jmerge workduganb1 workduganb2 workduganb3 workduganb4 workduganb5 workduganb6 workduganb7 workduganb8 workduganb9 workduganb10 workduganb11 workduganb12workduganb13 workduganb14 workduganb15 workduganb16 workduganb17workduganb18 workduganb19 workduganb20 workduganb21 workduganb22workduganb23 workduganb24 workduganb25 workduganb26 workduganb27workduganb28 workduganb29 workduganb30by kkrun

data workmse_nb_jset workmse_nb_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampjendm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesul = ampxrun

proc append data=workmse_nb_j base=workmse_nb_j1run

data workhasilnbmerge workd workmse_nb_j keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b


run

ods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilnb BASE=workhasilnb1 appendver=v6run

ENDmend

sae_nb


Appendix 7 Syntax program EB with ZINB

macro sae_zinb

do x=1 to 900

data workaset work eif ^(u=ampx) then deleterun

proc countreg data=amodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workpe(keep=Parameter Estimate)run

proc transpose data=workpe out=workpe_trun

data _null_set workpe_tcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph


(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data workmse_ZINB_jmerge workdugaZINB1 workdugaZINB2 workdugaZINB3 workdugaZINB4 workdugaZINB5 workdugaZINB6 workdugaZINB7 workdugaZINB8 workdugaZINB9 workdugaZINB10 workdugaZINB11 workdugaZINB12workdugaZINB13 workdugaZINB14 workdugaZINB15 workdugaZINB16 workdugaZINB17workdugaZINB18 workdugaZINB19 workdugaZINB20 workdugaZINB21 workdugaZINB22workdugaZINB23 workdugaZINB24 workdugaZINB25 workdugaZINB26 workdugaZINB27workdugaZINB28 workdugaZINB29 workdugaZINB30by kkrun

data workmse_ZINB_jset workmse_ZINB_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampj


endm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesrun

data workhasilZINBmerge workd workmse_ZINB_j keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_brunods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilZINB BASE=workhasilZINB1run

ENDmend

sae_zinb


Title: ZERO-INFLATED NEGATIVE BINOMIAL MODELS IN SMALL AREA ESTIMATION

Name: Irene Muflikh Nadhiroh
ID No.: G14104031

Approved by

Advisor I: Prof. Dr. Ir. Khairil Anwar Notodiputro, NIP 130891386
Advisor II: Ir. Indahwati, M.Si., NIP 131909223

Acknowledged by
Dean of Faculty of Mathematics and Natural Sciences
Bogor Agricultural University

Dr. Drh. Hasim, DEA
NIP 131578806

Passed examination date:

BIOGRAPHY

Irene Muflikh Nadhiroh was born in Padang on October 3rd, 1986, as the first daughter of Ir. Irianto Oetomo and Fine Analisa Maharani. She has two siblings.

In 1998, she graduated from SD Dukuh 09 East Jakarta, then continued her studies at SLTP Negeri 1 Bogor and graduated in 2001. She finished her studies at SMU Labschool Rawamangun Jakarta in 2004 and then enrolled in Bogor Agricultural University through USMI. In 2004, she joined the Department of Statistics, Faculty of Mathematics and Natural Sciences.

During her studies, she served as a teaching assistant for the Basic Statistics class and the Experimental Design class in 2006 and 2007, respectively. She was also a member of Gamma Sigma Beta (Statistics Students Association, GSB) and was head of the science division of GSB in 2006-2007. In February-March 2008, she completed her field practice at PT Field Dimension Indonesia.

ACKNOWLEDGEMENTS

First of all, the author acknowledges that the completion of this paper would not have been possible without invaluable help from many generous and extraordinary people. The author is deeply indebted to them for their help, ideas, criticism, and advice during the writing process. However, they should not be held responsible for any mistakes and deficiencies in this paper, which are purely the author's. I would hereby like to express my gratitude to:

1. Allah SWT, to whom all praise and gratitude are due. Alhamdulillahirabbil 'alamin. With His blessing I was able to finish this paper. Thanks to Allah for giving me a wonderful life with extraordinary people around me.
2. Prof. Dr. Khairil Anwar Notodiputro and Ir. Indahwati, M.Si., for the early motivation, discussion, advice, support, and their great enthusiasm.
3. Mr. Bambang Sumantri, M.Si., as examiner, for the encouragement, advice, and criticism.
4. My beloved family, for their unlimited love.
5. Mr. Alfian Futuhul Hadi, M.Si., for the enlightening discussions when I was in trouble.
6. Mr. Bagus Sartono, M.Si., for running my data in your lab on your wonderful computer. Sorry if it disturbed you.
7. Mr. Anang, M.Si., and Mr. Rahman, M.Si., for sharing their knowledge and technical support.
8. Dr. Ir. Hari Wijayanto, M.S., and all lecturers and staff of the Department of Statistics, for the knowledge of statistics and of life that you shared. It means a lot to me.
9. Rahmatullah Sigit Dodiet Sasongko, S.Si., for the spirit, love, care, time, and patience. Keep it real. Still love me forever and ever.
10. Mr. Dionisius Laksmana Bisara Putra, S.Si., for editing my paper, for his criticism, and for the useful discussions.
11. Maulana Chistanto, S.Si., and Yhanuar Ismail, S.Si., for being my best brothers.
12. Nikhen Sevrien and (the late) Dini, for lighting up my days.
13. Rere Yusri Agus Ika Cinong Toki Cheri Fisca Wiwik Neng Mala Lilis Dika Rangga Lele Dodi Kus Inal Bebek Koler and all of Statistics '41.
14. Everyone who helped me in this study and cannot be named individually.

This thesis is not perfect, so I welcome criticism, advice, and recommendations from its readers. Thank you. God bless you all.

Bogor, January 2009

Irene Muflikh Nadhiroh


TABLE OF CONTENTS

INTRODUCTION
  Background
  Objectives

LITERATURE REVIEW
  Direct Estimation
  Small Area Estimation
  Small Area Models
  Empirical Bayes Methods
  Poisson-Gamma Models
  Negative Binomial Regression
  Over-dispersion in Count Data
  Zero-Inflated Models
  Zero-Inflated Negative Binomial
  Jackknife Method of Estimating MSE($\hat\theta_i^{EB}$)

METHODOLOGY
  Data
  Methods

RESULT AND DISCUSSION
  Estimation of Prior Parameter Based on EB Method with Negative Binomial Regression
  Estimation of Prior Parameter Based on EB Method with Zero-Inflated Negative Binomial Regression
  Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generating data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB


INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but such estimators are sometimes difficult to use in a smaller region, especially when the sample size is too small. In this case, indirect estimation, which uses covariates to estimate the parameter, is usually preferred. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) studied Small Area Estimation based on Poisson-Gamma models. Maximum likelihood estimation with Negative Binomial Regression was used to estimate the prior parameters, and Negative Binomial Regression also handled the over-dispersion in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zeros occur when the data contain more zeros than the assumed distribution predicts. For example, among 100 observations from a Poisson model with a response mean of 4 we would expect about 2 zeros, since $100\,e^{-4} \approx 1.8$. If the data have 30 zeros, the distributional assumptions have clearly been violated, and the estimated parameters and standard errors will be biased (Hardin & Hilbe 2007). In this paper, Zero-Inflated models were adapted to solve this type of problem.
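The arithmetic behind the excess-zero example can be checked with a short sketch. The function names below are illustrative only and do not come from the thesis; the sketch simply compares an observed zero count with the number expected under a Poisson model.

```python
import math


def expected_poisson_zeros(mean: float, n: int) -> float:
    """Expected number of zeros among n independent Poisson(mean) counts."""
    return n * math.exp(-mean)


def excess_zeros(observed_zeros: int, mean: float, n: int) -> float:
    """Observed zeros minus the zeros a Poisson(mean) model would predict."""
    return observed_zeros - expected_poisson_zeros(mean, n)


if __name__ == "__main__":
    # 100 observations with mean 4: about 1.83 zeros are expected.
    print(round(expected_poisson_zeros(4.0, 100), 2))
    # With 30 observed zeros, roughly 28 zeros are unexplained by the model.
    print(round(excess_zeros(30, 4.0, 100), 2))
```

A large positive difference, as in the second call, is the informal signal of excess zeros described above.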

Objectives

The research objectives are:
1 To investigate the performance of Negative Binomial Regression in Small Area Estimation in the case of excess zeros.
2 To apply Zero-Inflated Count Models in Small Area Estimation in the case of excess zeros.
3 To evaluate the performance of Zero-Inflated Count Models in estimating the prior parameters for Small Area Estimation.

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design, with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have long been used to obtain reliable estimates of parameters. Ramsini et al (2001) mentioned that direct estimates for small areas are unbiased, although they have large variance because of the small sample size.

Small Area Estimation

The term small area can refer to anything, depending on the object of interest: a city, an age group, a sex group, a region, or a rural district. In general, small area denotes any domain for which direct estimates of adequate precision cannot be produced (Rao 2003), because the sample size in the area is too small. Small area estimation has therefore been developed as a statistical technique for estimating small area parameters with an adequate level of precision. It works as indirect estimation that borrows strength from the values of the variable of interest in related areas, through supplementary information related to that variable such as recent census counts and current administrative records (Rao 2003). Indirect estimation is the process of estimating a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator includes data from other domains (Kurnia & Notodiputro 2006).

Small Area Models

There are two types of link model in indirect estimation. The first is the traditional approach based on implicit models, which link related small areas through supplementary data. The second uses explicit small area models that make specific allowance for between-area variation (Rao 2003). This research used the second approach, which can be classified into two broad types of basic model.

1 Basic area level (type A) model

Basic area level models, or aggregate models, relate the small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to the area-specific auxiliary data (covariates) $x_i = (x_{1i}, \ldots, x_{pi})^T$ by the linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is the $p \times 1$ vector of regression coefficients, and $b_i$ are known positive constants. To make inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors $e_i \sim N(0, \sigma_{ei}^2)$ and the variances $\sigma_{ei}^2$ are known. Combining both models gives

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2 Basic unit level (type B) model

Unit level models relate the unit values of the study variable to unit-specific auxiliary variables. Given unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$, a nested regression model is assumed:

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_{ei}^2)$.

Empirical Bayes Methods

The Bayesian approach is based on Bayes' law, formulated by Thomas Bayes. The law was presented by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781 Laplace gave the details and their relevance for modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick in Good (1980) mentioned that the Bayes method is difficult to adopt and is sometimes very sensitive, because it requires prior probability information that is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate its parameters from the sample. Rao (2003) stated that the EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:
1 Obtain the posterior probability density function of the small area parameter.
2 Estimate the model parameters from the marginal density function.
3 Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma Models

The Poisson model is the standard model for count data. In general, count data can suffer from over-dispersion, so the Poisson formulation has been extended to accommodate the extra variance in sample data. Two-stage models have been introduced for count data, known as mixed Poisson-Gamma models. Wakefield (2006) introduced a Poisson-Gamma model that is easier to use, with the SMR (Standardized Mortality Ratio) as a direct estimator. This study used the Wakefield model with a different direct estimator.

Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij},$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

First stage: $y_i \mid \theta_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at the area level, $x_i$ is a vector of covariates and $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, 1/\alpha)$ is assumed as the prior distribution, with mean 1 and variance $1/\alpha$. The marginal distribution of $y_i \mid \beta, \alpha$ is then negative binomial.

Using Bayes' theorem, Wakefield (2006) obtained the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}\left(y_i + \alpha, \frac{1}{\mu_i + \alpha}\right)$$

and the EB estimator

$$\hat\theta_i^{EB} = \hat\theta_i^{B}(\hat\beta, \hat\alpha) = \hat\gamma_i \hat\theta_i + (1 - \hat\gamma_i)\hat\mu_i,$$

with $\hat\gamma_i = \hat\mu_i / (\hat\mu_i + \hat\alpha)$, where $\hat\theta_i = y_i$ is the direct estimate of $\theta_i$ and $y_i$ is the observed count.
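The shrinkage form of the EB estimator can be illustrated with a minimal sketch. It assumes that the prior parameters have already been estimated, so `mu_hat` and `alpha_hat` are plain inputs here, and the function names are hypothetical. The weight and the posterior variance mirror the quantities `w_bayes` and `g1` computed in the Appendix 6 program.

```python
def eb_weight(mu_hat: float, alpha_hat: float) -> float:
    """Shrinkage weight gamma_i = mu_i / (mu_i + alpha)."""
    return mu_hat / (mu_hat + alpha_hat)


def eb_estimate(y: float, mu_hat: float, alpha_hat: float) -> float:
    """EB estimate: weighted average of the direct estimate y and the
    regression-synthetic value mu_hat."""
    g = eb_weight(mu_hat, alpha_hat)
    return g * y + (1.0 - g) * mu_hat


def posterior_variance(y: float, mu_hat: float, alpha_hat: float) -> float:
    """Variance of a gamma posterior with shape y + alpha and rate mu + alpha."""
    return (y + alpha_hat) / (mu_hat + alpha_hat) ** 2


if __name__ == "__main__":
    # Illustrative values only: an observed zero is pulled toward mu_hat.
    print(eb_estimate(0, 2.0, 1.0))
    print(eb_estimate(5, 2.0, 1.0))
```

The weighted average equals `mu_hat * (y + alpha_hat) / (mu_hat + alpha_hat)`, the posterior mean of the Poisson rate $\mu_i \theta_i$ under the two-stage model, which is why an area with a zero count still receives a positive estimate.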

Negative Binomial Regression

The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance, and moments. Later, Breslow (1990) cited Lawless' work, and from its inception to the late 1980s the negative binomial regression model has been construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial probability function is written as

$$g(y \mid x) = \frac{\Gamma(y + 1/k)}{\Gamma(y + 1)\,\Gamma(1/k)} \left(\frac{k\mu}{1 + k\mu}\right)^{y} \left(\frac{1}{1 + k\mu}\right)^{1/k},$$

where $y = 0, 1, 2, \ldots$, and $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\text{var}(y) = \mu + k\mu^2$. Here $k$ is the dispersion parameter; a value of $k > 0$ indicates that the data are over-dispersed.

Over-dispersion in Count Data

Count data modeled by Poisson regression are said to be over-dispersed if the variance is larger than the mean, that is, if the observed variance exceeds the variance expected under the Poisson model. This phenomenon is written as

$$\text{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).

Zero-Inflated Models

Zero-inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + (1 - B(0))\Pr(0).$$

The probability of a nonzero count is

$$\Pr(y = k;\ k > 0) = [1 - B(0)]\Pr(k).$$

This model produces two groups of parameters. The first are the zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The second are the negative binomial parameters, which model the response as a function of the covariates.

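Zero-Inflated Negative Binomial

There are many kinds of zero-inflated model; each has advantages and disadvantages and suits a different type of data. The Zero-Inflated Negative Binomial (ZINB) model is one of them. It is used for data that are both over-dispersed and contain excess zeros. As a result, the parameter estimates include a parameter $k$ that indicates over-dispersion in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is

$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi(x_i) + (1 - \varphi(x_i))\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi(x_i))\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi$ is a function of $z_i^T \delta$, $z_i$ is the vector of zero-inflation covariates (here $z_i = x_i$) and $\delta$ is the vector of zero-inflation coefficients to be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + 1/k)}{\Gamma(y_i + 1)\,\Gamma(1/k)} \left(\frac{k\mu_i}{1 + k\mu_i}\right)^{y_i} \left(\frac{1}{1 + k\mu_i}\right)^{1/k}.$$

The mean and variance of the ZINB model are

$$E(y_i \mid x_i) = \mu_i (1 - \varphi_i),$$

$$V(y_i \mid x_i) = \mu_i (1 - \varphi_i)\,(1 + \mu_i(\varphi_i + k)),$$

with $\varphi_i = \varphi(x_i)$.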
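As a numerical check on the ZINB mean and variance, the following sketch evaluates the mixture probabilities directly and compares their moments with the closed forms. It assumes the NB2 parameterization with mean `mu` and dispersion `k`, and a zero-inflation probability `phi` supplied as a number rather than modelled from covariates; the function names are illustrative.

```python
import math


def nb_pmf(y: int, mu: float, k: float) -> float:
    """Negative binomial probability with mean mu and variance mu + k*mu**2."""
    r = 1.0 / k
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(y: int, mu: float, k: float, phi: float) -> float:
    """Zero-inflated NB: extra mass phi at zero, NB scaled by (1 - phi)."""
    base = (1.0 - phi) * nb_pmf(y, mu, k)
    return phi + base if y == 0 else base


def zinb_moments(mu: float, k: float, phi: float) -> tuple:
    """Closed-form mean and variance of the ZINB distribution."""
    mean = mu * (1.0 - phi)
    var = mu * (1.0 - phi) * (1.0 + mu * (phi + k))
    return mean, var


if __name__ == "__main__":
    mu, k, phi = 2.0, 0.5, 0.3  # illustrative values only
    probs = [zinb_pmf(y, mu, k, phi) for y in range(400)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    print(mean, var, zinb_moments(mu, k, phi))
```

Summing over the first 400 counts is enough here because the tail probabilities decay geometrically for these parameter values.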

Jackknife Method of Estimating MSE($\hat\theta_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method has been known since Tukey (1958) and has developed into a method that can correct the bias of an estimator by deleting observation $l$, for $l = 1, \ldots, m$, and re-estimating the parameters.

Following Rao (2003), the jackknife steps for estimating MSE($\hat\theta_i^{EB}$) are:

1 Let $\hat\theta_i^{EB} = k_i(y_i, \hat\beta, \hat\alpha)$ and $\hat\theta_{i,-l}^{EB} = k_i(y_i, \hat\beta_{-l}, \hat\alpha_{-l})$, then calculate

$$\hat M_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat\theta_{i,-l}^{EB} - \hat\theta_i^{EB}\right)^2.$$

2 Calculate the delete-$l$ estimators $\hat\beta_{-l}$ and $\hat\alpha_{-l}$, then calculate

$$\hat M_{1i} = g_{1i}(\hat\beta, \hat\alpha, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[g_{1i}(\hat\beta_{-l}, \hat\alpha_{-l}, y_i) - g_{1i}(\hat\beta, \hat\alpha, y_i)\right].$$

Here $g_{1i}$ is the estimated variance of the posterior distribution, which measures the variability associated with $\hat\theta_i$. Using $g_{1i}$ alone leads to severe underestimation of $MSE_J(\hat\theta_i^{EB})$, because it ignores the estimation of the prior parameters. The estimator $\hat M_{1i}$ therefore corrects the bias of $g_{1i}$.

3 Calculate the jackknife estimator of MSE($\hat\theta_i^{EB}$) as

$$MSE_J(\hat\theta_i^{EB}) = \hat M_{1i} + \hat M_{2i}.$$
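The three jackknife steps reduce to simple arithmetic once the delete-one refits are available. The sketch below is a hedged illustration for a single area: it assumes the caller supplies the full-sample EB estimate and posterior variance together with their delete-one counterparts, and it does not perform the model refitting itself. The names `jackknife_mse`, `truncate_mse` and `rrmse` are hypothetical; `truncate_mse` corresponds to setting negative MSE values to zero, as done for MSE (II) in the results.

```python
import math
from typing import Sequence


def jackknife_mse(theta_full: float, theta_deleted: Sequence[float],
                  g1_full: float, g1_deleted: Sequence[float]) -> float:
    """Jackknife MSE estimate M1 + M2 for one area.

    theta_deleted[l] and g1_deleted[l] are the EB estimate and posterior
    variance recomputed with prior parameters fitted without area l.
    """
    m = len(theta_deleted)
    if m < 2 or len(g1_deleted) != m:
        raise ValueError("need matching delete-one results for m >= 2 areas")
    scale = (m - 1) / m
    m2 = scale * sum((t - theta_full) ** 2 for t in theta_deleted)
    m1 = g1_full - scale * sum(g - g1_full for g in g1_deleted)
    return m1 + m2


def truncate_mse(mse: float) -> float:
    """Replace a negative MSE estimate by zero."""
    return max(mse, 0.0)


def rrmse(mse: float, theta_hat: float) -> float:
    """Square root of the MSE relative to the estimate; mse must be >= 0."""
    return math.sqrt(mse) / theta_hat


if __name__ == "__main__":
    # Illustrative numbers for three areas, not results from the thesis.
    mse = jackknife_mse(2.0, [2.1, 1.9, 2.0], 0.5, [0.6, 0.4, 0.5])
    print(mse, rrmse(truncate_mse(mse), 2.0))
```

Because the bias-correction term is subtracted from the posterior variance, the sum can become negative when the delete-one posterior variances are much larger than the full-sample value, which is consistent with the negative MSE values reported in the results.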

METHODOLOGY

Data

This research assumed that the available auxiliary data are at the area level, so the basic area level model was used. The data were simulated for 30 small areas with one covariate. Each batch was generated under a different excess-zero condition, with the probability of a zero in a small area ranging from 0.1 to 0.9. The relation between the response and the covariate was assumed to be linear.

Methods

The data were generated in SAS 9.1 with the following steps:
1 Fix the value of $X_i$ for the $i$-th area.
2 Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $\text{Lambda}_i = -\log(P(Y_i = 0))$.
3 Generate $\theta_i \sim \text{Gamma}(1, 1)$.
4 Calculate $\lambda_i^* = \log(\text{Lambda}_i / \theta_i)$.
5 Fit a linear regression of $\lambda_i^*$ on $X_i$ to obtain $\hat\beta_0$ and $\hat\beta_1$.
6 Calculate $\mu_i = \exp(X_i^T \hat\beta)$.
7 Calculate $\text{parmlambda}_i = \mu_i \theta_i$.
8 Generate $y_i \sim \text{Poisson}(\text{parmlambda}_i)$.
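The eight generating steps can be sketched outside SAS as follows. This is an illustration under stated assumptions rather than the program used in the thesis: the covariate values passed in are hypothetical (the original values are listed in Appendix 5), a single target probability of zero is used for all areas in a batch, the regression in step 5 is an ordinary least squares fit with one covariate, and the Poisson draws use a simple multiplication method.

```python
import math
import random
from typing import List, Sequence, Tuple


def poisson_draw(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate by multiplying uniforms (Knuth's method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_counts(x: Sequence[float], p_zero: float,
                    rng: random.Random) -> Tuple[List[int], List[float]]:
    """Generate one batch of area counts following steps 2 to 8."""
    big_lambda = -math.log(p_zero)                           # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]          # step 3
    lam_star = [math.log(big_lambda / t) for t in theta]     # step 4
    n = len(x)                                               # step 5: OLS fit
    x_bar = sum(x) / n
    l_bar = sum(lam_star) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (li - l_bar)
             for xi, li in zip(x, lam_star)) / sxx
    b0 = l_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]                # step 6
    rate = [m * t for m, t in zip(mu, theta)]                # step 7
    y = [poisson_draw(r, rng) for r in rate]                 # step 8
    return y, rate


if __name__ == "__main__":
    rng = random.Random(2009)
    x = [0.1 * i for i in range(1, 31)]  # hypothetical covariate values
    y, rate = simulate_counts(x, 0.6, rng)
    print(sum(1 for v in y if v == 0), "zeros out of", len(y))
```

Repeating the call for target probabilities 0.1 to 0.9 would give one batch per excess-zero condition, in the spirit of the design described in the Data subsection.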

The data were then analyzed with the following steps:
1 Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the Zero-Inflated Negative Binomial regression with the COUNTREG procedure in SAS 9.2.
2 Estimate the prior parameters, $\beta$ and $\alpha$.
3 Estimate $\theta_i$ using the EB method.
4 Calculate the MSE of the indirect estimates.
5 Calculate the RRMSE (Relative Root Mean Square Error):

$$RRMSE(\hat\theta_i) = \frac{\sqrt{MSE(\hat\theta_i)}}{\hat\theta_i}$$

RESULT AND DISCUSSION

Estimation of Prior Parameter Based on EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced small and consistent MSE values. However, when approximately 30% or more of the data are zeros, which corresponds to an expected probability of zero of 0.6, the estimates tend to be unreliable and the EB estimation produced negative values.

The RRMSE of the estimator increases as the number of zeros in the data increases. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -128.75       0.33            -0.38           0.81
0.7                   2536.71       0.40            -12.16          1.35
0.8                   -5844.95      0.30            309.46          2.11
0.9                   391356.06     0.16            1.16E+10        6.64

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   261.97        0.33            -0.35           0.75
0.7                   9500.07       0.40            -10.02          0.99
0.8                   14442.50      0.30            220.54          1.10
0.9                   415952.85     0.16            6.77E+09        0.56


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative values of MSE have been changed to zero.

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the data have an expected probability of zero of 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because of the negative values of the EB estimates.

Estimation of Prior Parameter Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although the NBR method slightly outperforms ZINB when the data contain only a small number of zeros. In particular, as shown by Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a larger MSE for the EB estimator than NBR does.

By contrast, if the data have an expected probability of zero of 0.6 to 0.7, ZINB gives better estimates. The estimates were also unbiased, as they cover the parameter values adequately. However, ZINB begins to produce inconsistent estimates when the expected probability of zero is 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is larger than the mean of MSE with ZINB. This is because once the negative values of MSE are changed to zero, they no longer reduce the mean.

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a larger MSE than NBR as long as the expected probability of zero in the data is below the 0.6 threshold.

The ZINB method performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case the EB estimates given by the NBR method are unstable and inconsistent, with negative estimates and huge MSE values that can be thousands of times larger than acceptable values. The EB estimator with ZINB, on the other hand, works well: it gives unbiased estimates and its MSE values are more stable than those of the EB estimator with NBR.

Both methods perform poorly if the data have an expected probability of zero of 0.8 or more. The EB estimators of both methods are inconsistent because of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            7322807         0.66
0.6                   0.61          0.38            298.17          1.03
0.7                   0.58          0.25            2181.19         1.94
0.8                   -1.28         -1.4E-07        1626.97         3.75
0.9                   29547.90      -1E-06          3.5E+278        6095.08

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            6561235         0.58
0.6                   0.75          0.38            234.06          0.70
0.7                   1.50          0.25            1346.55         0.69
0.8                   1.75          0               7335.06         0
0.9                   29549.08      0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method such as negative binomial regression for prior estimation produced biased and unreliable EB estimators for data with an expected probability of zero of 0.6. This is shown by the large MSE values and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator even when the data have an expected probability of zero of 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. Its performance declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research rests on many assumptions and has several limitations. If the assumptions and limitations can be relaxed, better results can be expected. Recommendations for further research are:
1 The data-generating process in this research does not reflect a real sampling process. A generating process closer to real sampling might give better results, because it would be closer to real applications.
2 It would be interesting to run experiments with a larger number of areas, since the number of areas influences the model fit.
3 Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to avoid negative MSE values.
4 Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE -100569 0194196 0271669 041875 045917 3239598RRMSE 0123605 0300339 0422426 0503566 0642418 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE -237643 0235733 0306641 0452955 05091 3652167RRMSE 0038956 0412708 0588924 0717336 0844735 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE -749011 0254097 0330078 -12875 0539873 2354887RRMSE -663045 051763 0813734 -038057 1287528 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE -7513075 0235378 0402092 2536714 0876569 6051162RRMSE -10741 0704796 1355566 -121606 3040291 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601

bias 0000395 0116669 0254473 1091172 0497898 6297454MSE -6E+09 -016583 0301527 -584495 5718409 185E+09RRMSE -212936 0927338 2115163 3094627 1359703 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655MSE -38E+09 -130817 0159682 39135606 3074073 12E+11

RRMSE -909131 1647188 6639631 116E+10 1585472 706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE -053954 010933 0168797 0449506 0369775 360843RRMSE 0022947 0096443 0136424 0238099 0241955 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE -07309 0126202 0201463 0425844 0414597 1734815RRMSE 0021807 0144983 0210692 0326097 0401786 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE -229891 0156942 0277017 0707983 0590466 7469014RRMSE 0023998 0210095 0317195 0519524 0618802 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE -125713 0181557 0284338 0540615 0498521 423089RRMSE 0054916 028362 0420396 0630776 0778033 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442

MSE -181856 0194818 0334706 0859252 0711939 7997074RRMSE 0026206 0387294 0662251 7322807 1312302 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE -34589 0078006 0376514 060793 0804116 3426488RRMSE 000461 0502807 1033578 2981671 2012552 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE -142213 -001433 0255331 0584152 1132152 264456RRMSE 0064209 0847956 1942286 2181192 4589042 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE -10651 -56E-05 -14E-07 -127819 1452962 1132741RRMSE 0063244 1475413 3754705 162697 9221163 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE -175652 -33E-05 -1E-06 2954790 152E-06 613E+08

RRMSE 0040681 4059441 6095076 35E+278 5569021 16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE 0 0194196 0271669 0419181 045917 3239598RRMSE 0 0300116 0422209 0502895 0641904 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE 0 0235733 0306641 0456258 05091 3652167RRMSE 0 0410357 0585765 0712314 0841838 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE 0 0254097 0330078 2619677 0539873 2354887RRMSE -663045 0448118 0750369 -034911 1209918 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE 0 0235378 0402092 9500073 0876569 6051162RRMSE -10741 0288999 0995659 -100163 2527784 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601bias 0000395 0116669 0254473 1091172 0497898 6297454MSE 0 0 0301527 1444250 5718409 185E+09RRMSE -212936 0 1104113 2205437 5656681 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655

MSE 0 0 0159682 41595285 3074073 12E+11

RRMSE -909131 0 0557622 677E+09 9311925 706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE 0 010933 0168797 0450626 0369775 360843RRMSE 0 0095932 0135647 023675 0239669 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE 0 0126202 0201463 0428006 0414597 1734815RRMSE 0 0142648 020709 0320663 0395479 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE 0 0156942 0277017 0716543 0590466 7469014RRMSE 0 0203913 0311937 0506882 0615401 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE 0 0181557 0284338 0549835 0498521 423089RRMSE 0 0270309 0405926 0606317 0766631 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442MSE 0 0194818 0334706 094973 0711939 7997074RRMSE 0 0316402 0576343 6561235 1240175 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE 0 0078006 0376514 0749436 0804116 3426488RRMSE 0 0258286 0698814 2340612 1714808 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE 0 0 0255331 1501268 1132152 264456RRMSE 0 0 0688797 1346552 2500825 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE 0 0 0 1755486 1452962 1132741RRMSE 0 0 0 7335062 3311711 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE 0 0 0 2954908 152E-06 613E+08

RRMSE 0 0 0 12E+278 416189 16E+281


Appendix 5 Syntax program for generating data

data b;
/* generate x1 (covariate) and ei */
input x1;
cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

data e;
/* generate poisson-gamma with excess zero */
do kk=1 to 30;
set b;
tetha = rangam(1,1);
lambda = -log(0.1); /* desired probability of a zero (0.1-0.9) */
starlambda = log(lambda/tetha);
output;
end;
run;

proc reg;
model starlambda = x1;
ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;
data _null_;
set work.betha_lr_t;
call symput('Intercept', col1);
call symput('x1', col2);
run;

data d;
do kk=1 to 30;
set e;
mu = exp(&Intercept + &x1*x1);
parmlambda = mu*tetha;
ypoi = rand('poisson', parmlambda);
output;
end;
run;

ods trace on;
/* to take percent zero on data */
proc freq data=d;
tables ypoi;
ods output OneWayFreqs=work.zero;
run;
data zero;
set zero;
keep percent;
run;
proc transpose data=zero out=zero1;
run;
data _null_;
set work.zero1;
call symput('pctz', col1);
run;
data d;
set d;
pzero=&pctz;
r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data;


Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
set work.e;
if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
model ypoi = x1 / dist=nb link=log;
ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
set work.betha_nb_t;
call symput('Intercept', col1);
call symput('x1', col2);
call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
set a;
mu_hat_b=exp(&Intercept + &x1*x1);
w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
set work.duga_nb1;
if ^(u=&x) then delete;
run;
data work.jacknb&h;
set work.d;
if u=&x;
if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
output p out=sasyi_est;
model ypoi = x1 / dist=nb link=log;
ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
run;
proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;
data _null_;
set work.betha_est_nbt&h;
call symput('Intercept_', col1);
call symput('x1_', col2);
call symput('Dispersion_', col3);
run;

data work.duganb&h;
set work.d;
mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5 work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10 work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15 work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20 work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25 work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
by kk;
run;

data work.mse_nb_j;
set work.mse_nb_j;
t_sum=0;
g_sum=0;
%do j=1 %to 30;
g_sum=g_sum+beda_g_&j;
t_sum=t_sum+delta_&j;
%end;
m2i=((30-1)/30)*t_sum;
m1i=g1-((30-1)/30)*g_sum;
mse_j_b=m1i+m2i;
rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
merge work.d work.mse_nb_j;
keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb base=work.hasilnb1 appendver=v6;
run;

%end;
%mend;

%sae_nb;


Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
set work.e;
if ^(u=&x) then delete;
run;

proc countreg data=a;
model ypoi=x1 / dist=zinb method=qn;
zeromodel ypoi ~ x1;
ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
set work.pe_t;
call symput ('Intercept', col1);
call symput ('x1', col2);
call symput ('Inf_Intercept', col3);
call symput ('Inf_x1', col4);
call symput ('_Alpha', col5);
run;

data work.duga;
set a;
mu_hat_b=exp(&Intercept + &x1*x1);
w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
bias_b=abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
set work.duga1;
if ^(u=&x) then delete;
run;
data work.jackzinb&h;
set work.d;
if u=&x;
if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
model ypoi=x1 / dist=zinb method=qn;
zeromodel ypoi ~ x1;
ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
set work.betha_est_ZINBt&h;
call symput ('Intercept', col1);
call symput ('x1', col2);
call symput ('Inf_Intercept', col3);
call symput ('Inf_x1', col4);
call symput ('_Alpha', col5);
run;

data work.dugaZINB&h;
set work.d;
mu_hat_b_&h=exp(&Intercept + &x1*x1);
/* mu_hat_b_&h= &b_o- + &b_1- x1 */
w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
/* g1_&h =((mu_hat_b_&h 2 &alpha_)2)(&alpha_+y_i)((mu_hat_b_&h 2 &alpha_)+mu_hat_b_&h)2 */
g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
/* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5 work.dugaZINB6
work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10 work.dugaZINB11 work.dugaZINB12
work.dugaZINB13 work.dugaZINB14 work.dugaZINB15 work.dugaZINB16 work.dugaZINB17 work.dugaZINB18
work.dugaZINB19 work.dugaZINB20 work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24
work.dugaZINB25 work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
by kk;
run;

data work.mse_ZINB_j;
set work.mse_ZINB_j;
t_sum=0;
g_sum=0;
%do j=1 %to 30;
g_sum=g_sum+beda_g_&j;
t_sum=t_sum+delta_&j;
%end;
m2i=((30-1)/30)*t_sum;
m1i=g1-((30-1)/30)*g_sum;
mse_j_b=m1i+m2i;
rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
merge work.d work.mse_ZINB_j;
keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;
ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB BASE=work.hasilZINB1;
run;

%END;
%mend;

%sae_zinb;


BIOGRAPHY

Irene Muflikh Nadhiroh was born in Padang on October 3, 1986, the first daughter of Ir. Irianto Oetomo and Fine Analisa Maharani. She has two siblings.

In 1998 she graduated from SD Dukuh 09, East Jakarta. She continued her studies at SLTP Negeri 1 Bogor, graduating in 2001, and finished her studies at SMU Labschool Rawamangun, Jakarta, in 2004. She then enrolled at Bogor Agricultural University through USMI and in 2004 joined the Department of Statistics, Faculty of Mathematics and Natural Sciences.

During her studies she served as a teaching assistant for the Basic Statistics class in 2006 and the Experimental Design class in 2007. She was also a member of Gamma Sigma Beta (GSB, the Statistics Students Association) and headed its science division in 2006-2007. In February-March 2008 she completed her field practice at PT Field Dimension Indonesia.


ACKNOWLEDGEMENTS

First of all, the author humbly acknowledges that the completion of this paper would not have been possible without the invaluable help of many generous and extraordinary people. The author is deeply indebted to them for their help, ideas, criticism, and advice during the writing process. However, they should not be held responsible for any mistakes and deficiencies in this paper, which are purely the author's. I would therefore like to express my gratitude to:

1. Allah SWT, to whom all praise and gratitude belong. Alhamdulillahirabbil'alamin. With His blessing I was able to finish this paper. Thank You, Allah, for giving me a wonderful life with extraordinary people around me.
2. Prof. Dr. Khairil Anwar Notodiputro and Ir. Indahwati, M.Si., for the early motivation, discussion, advice, support, and their great enthusiasm.
3. Mr. Bambang Sumantri, M.Si., as examiner, for the encouragement, advice, and criticism.
4. My beloved family, for their unlimited love.
5. Mr. Alfian Futuhul Hadi, M.Si., for the enlightening discussions when I was in trouble.
6. Mr. Bagus Sartono, M.Si., for letting me run my data in your lab on your wonderful computer. Sorry if it disturbed you.
7. Mr. Anang, M.Si., and Mr. Rahman, M.Si., for sharing their knowledge and technical support.
8. Dr. Ir. Hari Wijayanto, M.S., and all lecturers and staff at the Department of Statistics. Thank you for the knowledge of statistics and of life that you shared. It means a lot to me.
9. Rahmatullah Sigit Dodiet Sasongko, S.Si., for the spirit, love, care, time, and patience. Keep it real. Still love me forever and ever.
10. Mr. Dionisius Laksmana Bisara Putra, S.Si., for editing my paper, offering criticism, and providing useful discussion.
11. Maulana Chistanto, S.Si., and Yhanuar Ismail, S.Si., for being my best brothers.
12. Nikhen Sevrien and the late Dini, for lighting up my days.
13. Rere Yusri Agus Ika Cinong Toki Cheri Fisca Wiwik Neng Mala Lilis Dika Rangga Lele Dodi Kus Inal Bebek Koler and all of Statistics '41.
14. Everyone who helped me in this study and cannot be named individually.

This thesis is not perfect, so I welcome criticism, advice, and recommendations from those who read it. Thank you. God bless you all.

Bogor, January 2009

Irene Muflikh Nadhiroh

TABLE OF CONTENTS

INTRODUCTION
    Background
    Objectives
LITERATURE REVIEW
    Direct Estimation
    Small Area Estimation
    Small Area Models
    Empirical Bayes Methods
    Poisson-Gamma Models
    Negative Binomial Regression
    Over-dispersion in Count Data
    Zero-Inflated Models
    Zero-Inflated Negative Binomial
    Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)
METHODOLOGY
    Data
    Methods
RESULT AND DISCUSSION
    Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
    Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
    Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB
CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generate data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB


INTRODUCTION

Background
Direct estimation is usually applied in large-scale surveys, but such estimators are difficult to use in smaller regions, especially when the sample size is too small. In this case indirect estimation, which brings in covariates to estimate the parameter, is usually used. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research on Small Area Estimation based on Poisson-Gamma models. Maximum Likelihood Estimation was used with Negative Binomial Regression to estimate the respective prior parameters. Negative Binomial Regression was also used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zeros occur when the data contain more zeros than the assumed distribution predicts. For example, among 100 observations from a Poisson model with a mean response of 4, we would expect about 2 zeros. If the data have 30 zeros, the distributional assumptions have clearly been violated, and the estimated parameters and standard errors will be biased (Hardin & Hilbe 2007). In this paper, Zero-Inflated models are adapted to solve this type of problem.
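The figure of about 2 expected zeros can be checked directly, since a Poisson variable with mean 4 equals zero with probability exp(-4). The short sketch below does this arithmetic; the function name is illustrative only and is not part of the original programs.

```python
import math

def expected_poisson_zeros(n_obs, mean):
    """Expected number of zeros among n_obs draws from Poisson(mean)."""
    # P(Y = 0) for a Poisson variable is exp(-mean).
    return n_obs * math.exp(-mean)

expected = expected_poisson_zeros(100, 4.0)
print(round(expected, 2))  # prints 1.83, i.e. roughly 2 zeros
```

Observing 30 zeros in such a sample is therefore far outside what the Poisson assumption allows.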

Objectives
The research objectives are:

1. To investigate the performance of Negative Binomial Regression in Small Area Estimation when the data contain excess zeros.

2. To apply Zero-Inflated Count Models to Small Area Estimation when the data contain excess zeros.

3. To evaluate the performance of Zero-Inflated Count Models in estimating the prior parameters for Small Area Estimation.

LITERATURE REVIEW

Direct Estimation
Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design, with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Sample survey data have long been used to produce reliable parameter estimates. Ramsini et al. (2001) noted that direct estimates for small areas are unbiased, although they have large variance because of the small sample size.

Small Area Estimation
The term small area can refer to almost anything, depending on the object of interest: a city, an age group, a sex group, a region, or a rural district. In general, small area denotes any domain for which direct estimates of adequate precision cannot be produced (Rao 2003), because the sample size in the area is too small. Small area estimation was therefore developed as a statistical technique for estimating the parameters of small areas with an adequate level of precision. It works as indirect estimation that borrows strength from the values of the variable of interest in related areas, using supplementary information related to that variable, such as recent census counts and current administrative records (Rao 2003). Indirect estimation estimates a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator draws on data from other domains (Kurnia & Notodiputro 2006).

Small Area Models
There are two kinds of link model in indirect estimation. The first is the traditional approach based on implicit models that link related small areas through supplementary data. The second uses explicit small area models that make specific allowance for between-area variation (Rao 2003). This research used the second kind, which can be classified into two broad types of basic model.

1. Basic area level (type A) model
Basic area level models, or aggregate models, relate small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to area-specific auxiliary data or covariates $x_i = (x_{1i}, \ldots, x_{pi})^T$ by a linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is a $p \times 1$ vector of regression coefficients, and the $b_i$ are known positive constants. To make inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors are $e_i \sim N(0, \sigma_{ei}^2)$ and the $\sigma_{ei}^2$ are known. Combining the two models gives

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2. Basic unit level (type B) model
Unit level models relate the unit values of the study variable to unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$ through a nested regression model

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \; j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_{ei}^2)$.

Empirical Bayes Methods
The Bayesian approach is based on Bayes' law, formulated by Thomas Bayes. The law was presented by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781 Laplace supplied the details that made it relevant to modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick in Good (1980) mentioned that Bayes methods are difficult to adopt and can be very sensitive because they require prior probability information, which is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate its parameters from the sample. Rao (2003) stated that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. The EB method was therefore used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:
1. Obtain the posterior probability density function of the small area parameter.
2. Estimate the model parameters from the marginal density function.
3. Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma Models
The Poisson model is the standard model for count data. Count data commonly suffer from over-dispersion, so extensions of the Poisson model have been developed to accommodate the extra variance in sample data. Two-stage models for count data, known as Poisson-Gamma mixture models, have been introduced for this purpose. Wakefield (2006) introduced a Poisson-Gamma model that is easy to use with the SMR (Standardized Mortality Ratio) as a direct estimator. This study used the Wakefield model with a different direct estimator.

Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij},$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

First stage: $y_i \mid \theta_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at area level, $x_i$ is a vector of covariates, and $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, 1/\alpha)$ is assumed as the prior distribution, with mean 1 and variance $1/\alpha$.

The marginal distribution of $y_i \mid \beta, \alpha$ is then negative binomial. Wakefield (2006) applied Bayes' theorem to obtain the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}\left(y_i + \alpha, \frac{1}{\mu_i + \alpha}\right)$$

and the EB estimator

$$\hat{\theta}_i^{EB} = \hat{\theta}_i^{B}(\hat{\beta}, \hat{\alpha}) = \hat{\gamma}_i \hat{\theta}_i + (1 - \hat{\gamma}_i)\hat{\mu}_i,$$

with $\hat{\gamma}_i = \hat{\mu}_i / (\hat{\alpha} + \hat{\mu}_i)$, where $\hat{\theta}_i$ is the direct estimate of $\theta_i$ and $y_i$ is the number of observations.
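The shrinkage behind the EB estimator can be sketched in a few lines. The version below follows the weighting used in the Appendix 6 program (w_bayes and teta_hat_bayes), taking the observed count as the direct estimate; the function name and the example values are hypothetical, and this is a sketch rather than the original implementation.

```python
def eb_estimate(y, mu_hat, alpha_hat):
    """Shrink the observed count y toward the regression mean mu_hat."""
    # Weight on the direct estimate; it grows as mu_hat dominates alpha_hat.
    w = mu_hat / (mu_hat + alpha_hat)
    return w * y + (1.0 - w) * mu_hat

# Algebraically the same value is mu_hat * (y + alpha_hat) / (mu_hat + alpha_hat).
print(eb_estimate(5, 2.0, 1.0))
print(eb_estimate(0, 2.0, 1.0))
```

An observed zero is pulled up toward the regression mean, and a large count is pulled down, which is why the quality of the estimated mean and dispersion matters so much when zeros are abundant.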

Negative Binomial Regression
The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance, and moments. Later, Breslow (1990) cited Lawless' work, and from its inception to the late 1980s the negative binomial regression model was construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial distribution function is written as

$$g(y \mid x) = \frac{\Gamma(y + k)}{\Gamma(k)\,\Gamma(y + 1)} \left(\frac{k}{k + \mu}\right)^{k} \left(\frac{\mu}{k + \mu}\right)^{y},$$

where $y = 0, 1, 2, \ldots$; $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\text{var}(y) = \mu + \mu^2/k$. The parameter $k$ is called the dispersion parameter and reflects the over-dispersion in the data.
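As a numerical check of the mean and variance, the sketch below evaluates a negative binomial probability function and sums it over a long range of counts. It assumes the parameterization in which var(y) = mu + mu^2/k; the function names and the truncation point are choices made for this illustration.

```python
import math

def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k."""
    log_p = (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
             + k * math.log(k / (k + mu))
             + y * math.log(mu / (k + mu)))
    return math.exp(log_p)

def nb_moments(mu, k, y_max=2000):
    """Mean and variance obtained by summing the pmf up to y_max."""
    probs = [nb_pmf(y, mu, k) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mean, var = nb_moments(mu=4.0, k=2.0)
print(mean, var)  # close to 4 and to 4 + 4**2 / 2 = 12
```

The variance exceeds the mean by mu^2/k, so smaller values of k correspond to stronger over-dispersion under this parameterization.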

Over-dispersion in Count Data
Count data modeled by Poisson regression are over-dispersed if the variance is larger than the mean, that is, if the observed variance exceeds the variance the model predicts. This phenomenon is written as

$$\text{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).

Zero-Inflated Models
Zero-inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + [1 - B(0)]\,\Pr(0).$$

The probability of a nonzero count is

$$\Pr(y = k \mid k > 0) = [1 - B(0)]\,\Pr(k).$$

This model produces two groups of parameters. One is the zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The other is the negative binomial parameters, which model the response as a function of the covariates.

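Zero-Inflated Negative Binomial
There are many kinds of zero-inflated model. Each has strengths and weaknesses and suits a different type of data. The zero-inflated negative binomial (ZINB) model is one of them, and it is used for data that are both over-dispersed and zero-inflated. Consequently, its parameter estimates include a parameter $k$ that indicates over-dispersion in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is

$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi(\gamma' x_i) + [1 - \varphi(\gamma' x_i)]\, g(0 \mid x_i), & y_i = 0 \\ [1 - \varphi(\gamma' x_i)]\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi$ is a function of $z_i = \gamma' x_i$, $x_i$ is the vector of zero-inflation covariates, and $\gamma$ is the vector of zero-inflation coefficients to be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + k)}{\Gamma(k)\,\Gamma(y_i + 1)} \left(\frac{k}{k + \mu_i}\right)^{k} \left(\frac{\mu_i}{k + \mu_i}\right)^{y_i}.$$

The mean and variance of ZINB, with $\varphi_i = \varphi(\gamma' x_i)$, are

$$E(y_i \mid x_i) = \mu_i (1 - \varphi_i),$$

$$V(y_i \mid x_i) = \mu_i (1 - \varphi_i)\left[1 + \mu_i(\varphi_i + 1/k)\right].$$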
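The ZINB mean and variance can be verified numerically by mixing a point mass at zero with a negative binomial component. The sketch below assumes the negative binomial component has variance mu + mu^2/k and treats the zero-inflation probability phi as a fixed number rather than a function of covariates; all names and values are illustrative.

```python
import math

def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k."""
    log_p = (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
             + k * math.log(k / (k + mu))
             + y * math.log(mu / (k + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, k, phi):
    """Zero-inflated negative binomial probability of the count y."""
    base = (1.0 - phi) * nb_pmf(y, mu, k)
    # Zeros come from the always-zero group or from the counting process.
    return phi + base if y == 0 else base

def zinb_moments(mu, k, phi, y_max=2000):
    """Mean and variance obtained by summing the pmf up to y_max."""
    probs = [zinb_pmf(y, mu, k, phi) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mu, k, phi = 3.0, 2.0, 0.3
print(zinb_moments(mu, k, phi))
print(mu * (1 - phi), mu * (1 - phi) * (1 + mu * (phi + 1 / k)))
```

Both printed pairs should agree closely, which confirms that zero inflation lowers the mean while adding a further source of variance on top of the negative binomial dispersion.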

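Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

The jackknife is one of the general-purpose methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method was described by Tukey (1958) and has developed into a method that can correct the bias of an estimator by deleting the $l$-th observation, for $l = 1, \ldots, m$, and re-estimating the parameters.

Following Rao (2003), the jackknife steps to estimate MSE($\hat{\theta}_i^{EB}$) are:

1. Write $\hat{\theta}_i^{EB} = k_i(y_i, \hat{\beta}, \hat{\alpha})$ and $\hat{\theta}_{i,-l}^{EB} = k_i(y_i, \hat{\beta}_{-l}, \hat{\alpha}_{-l})$, then calculate

$$\hat{M}_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat{\theta}_{i,-l}^{EB} - \hat{\theta}_i^{EB}\right)^2.$$

2. Calculate the delete-$l$ estimators $\hat{\beta}_{-l}$ and $\hat{\alpha}_{-l}$, then calculate

$$\hat{M}_{1i} = g_{1i}(\hat{\beta}, \hat{\alpha}, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[g_{1i}(\hat{\beta}_{-l}, \hat{\alpha}_{-l}, y_i) - g_{1i}(\hat{\beta}, \hat{\alpha}, y_i)\right],$$

where $g_{1i}$ is the estimated variance of the posterior distribution, which measures the variability associated with $\hat{\theta}_i$. Using $g_{1i}$ alone leads to severe underestimation of $\text{MSE}_J(\hat{\theta}_i^{EB})$ because it ignores the estimation of the prior parameters; the estimator $\hat{M}_{1i}$ corrects the bias of $g_{1i}$.

3. Calculate the jackknife estimator of MSE($\hat{\theta}_i^{EB}$) as

$$\text{MSE}_J(\hat{\theta}_i^{EB}) = \hat{M}_{1i} + \hat{M}_{2i}.$$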
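The three jackknife steps reduce to simple sums once the full-sample and delete-one quantities are available. The sketch below mirrors the m1i, m2i and mse_j_b calculations in the Appendix 6 program for a single area; the function name and the input numbers are made up for illustration, and fitting the regression for each deleted area is assumed to happen elsewhere.

```python
def jackknife_mse(g1_full, g1_deleted, theta_full, theta_deleted):
    """Jackknife MSE for one area from full and delete-one quantities."""
    m = len(theta_deleted)
    factor = (m - 1) / m
    # Bias-corrected posterior variance term.
    m1 = g1_full - factor * sum(g - g1_full for g in g1_deleted)
    # Variability caused by estimating the prior parameters.
    m2 = factor * sum((t - theta_full) ** 2 for t in theta_deleted)
    return m1 + m2

print(jackknife_mse(0.5, [0.5, 0.6], 2.0, [2.0, 2.2]))
# A delete-one fit with a much larger g1 can push the estimate below zero.
print(jackknife_mse(0.1, [5.0, 0.1], 2.0, [2.0, 2.0]))
```

The second call shows how the bias correction can produce a negative MSE estimate when one delete-one fit is unstable.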

METHODOLOGY

Data
This research assumed that the available auxiliary data are at area level, so the basic area level model was used. The data were simulated for 30 small areas with one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. The relationship between the response and the covariate was assumed to be linear.

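Methods
The following steps were used to generate the data with SAS 9.1:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $\text{Lambda}_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim \text{Gamma}(1, 1)$.
4. Calculate $\log(\text{Lambda}_i / \theta_i)$.
5. Fit a linear regression of $\log(\text{Lambda}_i / \theta_i)$ on $X_i$ to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.
6. Calculate $\mu_i = \exp(X_i' \hat{\beta})$.
7. Calculate $\text{parmlambda}_i = \mu_i \theta_i$.
8. Generate $y_i \sim \text{Poisson}(\text{parmlambda}_i)$.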
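The generation steps can be sketched outside SAS as follows. This is an illustration, not the original program: the covariate values, the seed, and all function names are assumptions made here, and step 4 is read as the log of Lambda divided by theta.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def fit_line(x, z):
    """Ordinary least squares intercept and slope of z on x."""
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return z_bar - slope * x_bar, slope

def generate_batch(x, p_zero, seed=0):
    """Generate one batch of area counts for a target zero probability."""
    rng = random.Random(seed)
    lam = -math.log(p_zero)                            # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]    # step 3
    z = [math.log(lam / t) for t in theta]             # step 4
    b0, b1 = fit_line(x, z)                            # step 5
    mu = [math.exp(b0 + b1 * xi) for xi in x]          # step 6
    parm = [m * t for m, t in zip(mu, theta)]          # step 7
    y = [poisson_draw(rng, p) for p in parm]           # step 8
    return y, parm

# Hypothetical covariate values for 30 areas.
x1 = [i / 10 for i in range(1, 31)]
counts, rates = generate_batch(x1, p_zero=0.6, seed=1)
print(sum(1 for c in counts if c == 0) / len(counts))
```

Raising p_zero toward 0.9 lowers the Poisson rates and increases the share of zeros in the generated counts.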

Moreover, the following steps were applied in analyzing the data:
1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the Zero-Inflated Negative Binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters, $\beta$ and $\alpha$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Relative Root Mean Square Error)

$$\text{RRMSE}(\hat{\theta}_i) = \frac{\sqrt{\text{MSE}(\hat{\theta}_i)}}{\hat{\theta}_i}.$$

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
When the data do not contain excess zeros, the estimator produced small and consistent MSE. When the proportion of zeros reached about 30% or more, at an expected probability of zero of 0.6, the estimates tended to be unreliable and the EB estimation produced negative values.

The RRMSE of the estimator increases as the number of zeros in the data increases. Once the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -128.75       0.33            -0.38           0.81
0.7                   2536.71       0.40            -12.16          1.35
0.8                   -5844.95      0.30            309.46          2.11
0.9                   391356.06     0.16            1.16E+10        6.64

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   261.97        0.33            -0.35           0.75
0.7                   9500.07       0.40            -10.02          0.99
0.8                   14442.50      0.30            220.54          1.10
0.9                   415952.85     0.16            6.77E+09        0.56


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to handle this problem is to set the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE obtained after the negative MSE values have been set to zero.
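The adjustment behind the (II) columns amounts to truncating each MSE at zero before taking the square root. The sketch below shows that rule together with the RRMSE ratio; the function names and the example numbers are illustrative only.

```python
import math

def truncate_mse(mse):
    """Replace a negative MSE estimate with zero."""
    return max(mse, 0.0)

def rrmse(mse, theta_hat):
    """Relative root MSE using the truncated MSE."""
    return math.sqrt(truncate_mse(mse)) / theta_hat

print(rrmse(0.25, 2.0))    # prints 0.25
print(rrmse(-2.35, 1.5))   # prints 0.0, the negative MSE is set to zero
print(rrmse(0.25, -2.0))   # prints -0.25, a negative EB estimate gives a negative ratio
```

Truncation removes negative MSE values but cannot remove negative ratios, because those come from negative EB estimates in the denominator.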

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the expected probability of zero is 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because some EB estimates are negative.

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
The EB estimates are similar to those produced by the NBR method, although ZINB is slightly outperformed by NBR when the data contain only a small number of zeros. In particular, as Table 3 shows, when the data have an expected probability of zero of 0.1 to 0.5, ZINB produces larger MSE for the EB estimator than NBR does.

When the data have an expected probability of zero of 0.6 to 0.7, ZINB gives better estimates. The estimates are also unbiased, as they cover the parameter values adequately. However, ZINB begins to produce inconsistent estimates when the expected probability of zero is 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is larger than the mean of MSE with ZINB. This is because negative MSE values, once set to zero, no longer pull the mean down.

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB
The EB estimates given by the NBR and ZINB methods are similar for data with small numbers of zeros. However, the ZINB method produces larger MSE than NBR as long as the expected probability of zero in the data stays below the 0.6 threshold.

The ZINB method performs better when the data have an expected probability of zero of 0.6 to 0.7. In this case the EB estimates given by the NBR method are unstable and inconsistent, with negative estimates and MSE values that can be thousands of times larger than acceptable. The EB estimator with ZINB, on the other hand, works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimator with NBR.

Both methods perform poorly when the data have an expected probability of zero of 0.8 or more. The EB estimators from both methods are inconsistent because of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            7322807         0.66
0.6                   0.61          0.38            29817           103
0.7                   0.58          0.25            218119          194
0.8                   -128          -1.4E-07        162697          375
0.9                   2954790       -1E-06          3.5E+278        609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            6561235         0.58
0.6                   0.75          0.38            23406           0.70
0.7                   150           0.25            134655          0.69
0.8                   175           0               733506          0
0.9                   2954908       0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method such as negative binomial regression for prior estimation produced biased and unreliable EB estimators for data with an expected probability of zero of 0.6, as shown by the large MSE and the negative values of the estimator.

EB estimation with the ZINB method, by contrast, produced more reliable estimators even when the data had an expected probability of zero of 0.6 to 0.7.

ZINB provided a reliable estimator for data with less than 53.33% zeros. Its performance declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research rests on many assumptions and has several limitations. If the assumptions and limits can be relaxed, better results can be expected. Some recommendations for further research are:

1. The data-generating process in this research does not reflect a real sampling process. A generating process closer to real sampling might give better results because it would be closer to real applications.

2. It would be interesting to run experiments with a larger number of areas, since the number of areas influences the model fit.

3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to avoid negative MSE values.

4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, Paper 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4: 117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0)  Actual  r  Statistic  min  Q1  median  mean  Q3  max

10  10-33.33  100  EB estimator  0.426011  1525665  3188832  4252666  5752756  205939
                   bias          0.000446  0.5164  0.878579  1315093  1721091  8704671
                   MSE           0.040547  0.109118  0.159448  0.333613  0.335256  4167064
                   RRMSE         0.041258  0.100045  0.1356  0.18188  0.220426  0.576793

20  13.33-36.67  100  EB estimator  0.342831  1013993  2218265  2984668  3953417  1815693
                   bias          0.000587  0.413611  0.79407  1100373  1454889  7906915
                   MSE           0.055631  0.131969  0.196963  0.353033  0.386291  3778251
                   RRMSE         0.070449  0.15421  0.205182  0.262006  0.352726  0.788718

30  20-53.33  100  EB estimator  0.323311  0.836545  1562163  2263684  2918741  1214482
                   bias          0.000151  0.372382  0.67041  0.916482  122012  5950225
                   MSE           0.074364  0.163462  0.231014  0.400207  0.432371  5250254
                   RRMSE         0.102324  0.214697  0.299247  0.361013  0.474077  1192032

40  23.33-56.67  100  EB estimator  0.24882  0.64963  1219656  17107  2248716  930007
                   bias          0.000564  0.293602  0.549809  0.757937  1007851  486688
                   MSE           -100569  0.194196  0.271669  0.41875  0.45917  3239598
                   RRMSE         0.123605  0.300339  0.422426  0.503566  0.642418  2202294

50  23.33-63.33  100  EB estimator  0.122548  0.570083  1028619  1291758  1728067  6750472
                   bias          0.00029  0.250747  0.453265  0.622838  0.803185  4009352
                   MSE           -237643  0.235733  0.306641  0.452955  0.5091  3652167
                   RRMSE         0.038956  0.412708  0.588924  0.717336  0.844735  3240156

60  30-70  100  EB estimator  -0.77338  0.44443  0.699758  0.944038  1131071  6323352
                   bias          0.000452  0.20433  0.398131  0.534095  0.679938  3848209
                   MSE           -749011  0.254097  0.330078  -12875  0.539873  2354887
                   RRMSE         -663045  0.51763  0.813734  -0.38057  1287528  1767434

70  43.33-73.33  100  EB estimator  -33274  0.249515  0.442513  0.659375  0.922519  9258959
                   bias          0.000375  0.155154  0.316124  0.476883  0.588926  8475103
                   MSE           -7513075  0.235378  0.402092  2536714  0.876569  6051162
                   RRMSE         -10741  0.704796  1355566  -121606  3040291  3332419

80  53.33-90  100  EB estimator  -232889  0.17621  0.305365  0.569959  0.576346  6303601
                   bias          0.000395  0.116669  0.254473  1091172  0.497898  6297454
                   MSE           -6E+09  -0.16583  0.301527  -584495  5718409  1.85E+09
                   RRMSE         -212936  0.927338  2115163  3094627  1359703  4151289

90  70-100  100  EB estimator  -108767  0.111208  0.230315  0.212247  0.353129  3625557
                   bias          0.00016  0.086  0.177169  0.425532  0.314714  1092655
                   MSE           -3.8E+09  -130817  0.159682  39135606  3074073  1.2E+11
                   RRMSE         -909131  1647188  6639631  1.16E+10  1585472  7.06E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual  r  Statistic  min  Q1  median  mean  Q3  max

10  10-33.33  100  EB estimator  0.267752  1500256  3195861  4280907  5833922  2220705
                   bias          0.000603  0.485515  0.882468  1315721  1750173  8704672
                   MSE           -0.53954  0.10933  0.168797  0.449506  0.369775  360843
                   RRMSE         0.022947  0.096443  0.136424  0.238099  0.241955  5518468

20  13.33-36.67  100  EB estimator  1.05E-08  0.914898  221594  3017228  401361  1815694
                   bias          0.000368  0.383426  0.780984  1105029  1496623  7906918
                   MSE           -0.7309  0.126202  0.201463  0.425844  0.414597  1734815
                   RRMSE         0.021807  0.144983  0.210692  0.326097  0.401786  3177943

30  20-53.33  100  EB estimator  0.132041  0.719086  1523909  2308745  3012309  1228058
                   bias          0.000508  0.332427  0.680187  0.928947  1254604  6314973
                   MSE           -229891  0.156942  0.277017  0.707983  0.590466  7469014
                   RRMSE         0.023998  0.210095  0.317195  0.519524  0.618802  3500387

40  23.33-56.67  100  EB estimator  1.05E-08  0.574265  1209034  1742928  2368713  104953
                   bias          0.000564  0.268248  0.544049  0.771741  1067061  4889872
                   MSE           -125713  0.181557  0.284338  0.540615  0.498521  423089
                   RRMSE         0.054916  0.28362  0.420396  0.630776  0.778033  5394515

50  23.33-63.33  100  EB estimator  1.05E-08  0.426701  1033816  133848  1906961  8018962
                   bias          0.000453  0.224726  0.454522  0.661709  0.900005  4414442
                   MSE           -181856  0.194818  0.334706  0.859252  0.711939  7997074
                   RRMSE         0.026206  0.387294  0.662251  7322807  1312302  13388294

60  30-70  100  EB estimator  1.05E-08  0.30085  0.645848  0.985327  1154975  728326
                   bias          6.2E-05  0.190886  0.406245  0.567657  0.74167  3923952
                   MSE           -34589  0.078006  0.376514  0.60793  0.804116  3426488
                   RRMSE         0.00461  0.502807  1033578  2981671  2012552  3308816

70  43.33-73.33  100  EB estimator  1.05E-08  1.05E-08  0.341315  0.677841  1  5005491
                   bias          9.79E-05  0.128017  0.358257  0.487174  0.654423  3733981
                   MSE           -142213  -0.01433  0.255331  0.584152  1132152  264456
                   RRMSE         0.064209  0.847956  1942286  2181192  4589042  7899681

80  53.33-90  100  EB estimator  1.05E-08  1.05E-08  0.142906  0.445315  0.859305  5
                   bias          0.000161  0.083397  0.272773  0.392826  0.557213  3532556
                   MSE           -10651  -5.6E-05  -1.4E-07  -127819  1452962  1132741
                   RRMSE         0.063244  1475413  3754705  162697  9221163  3786684

90  70-100  100  EB estimator  1E-277  1.05E-08  1.05E-08  0.225165  0.135512  3
                   bias          0.000495  0.054221  0.153374  0.27819  0.350213  2736904
                   MSE           -175652  -3.3E-05  -1E-06  2954790  1.52E-06  6.13E+08
                   RRMSE         0.040681  4059441  6095076  3.5E+278  5569021  1.6E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE 0 0194196 0271669 0419181 045917 3239598RRMSE 0 0300116 0422209 0502895 0641904 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE 0 0235733 0306641 0456258 05091 3652167RRMSE 0 0410357 0585765 0712314 0841838 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE 0 0254097 0330078 2619677 0539873 2354887RRMSE -663045 0448118 0750369 -034911 1209918 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE 0 0235378 0402092 9500073 0876569 6051162RRMSE -10741 0288999 0995659 -100163 2527784 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601bias 0000395 0116669 0254473 1091172 0497898 6297454MSE 0 0 0301527 1444250 5718409 185E+09RRMSE -212936 0 1104113 2205437 5656681 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655

MSE 0 0 0159682 41595285 3074073 12E+11

RRMSE -909131 0 0557622 677E+09 9311925 706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE 0 010933 0168797 0450626 0369775 360843RRMSE 0 0095932 0135647 023675 0239669 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE 0 0126202 0201463 0428006 0414597 1734815RRMSE 0 0142648 020709 0320663 0395479 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE 0 0156942 0277017 0716543 0590466 7469014RRMSE 0 0203913 0311937 0506882 0615401 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE 0 0181557 0284338 0549835 0498521 423089RRMSE 0 0270309 0405926 0606317 0766631 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442MSE 0 0194818 0334706 094973 0711939 7997074RRMSE 0 0316402 0576343 6561235 1240175 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE 0 0078006 0376514 0749436 0804116 3426488RRMSE 0 0258286 0698814 2340612 1714808 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE 0 0 0255331 1501268 1132152 264456RRMSE 0 0 0688797 1346552 2500825 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE 0 0 0 1755486 1452962 1132741RRMSE 0 0 0 7335062 3311711 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE 0 0 0 2954908 152E-06 613E+08

RRMSE 0 0 0 12E+278 416189 16E+281


Appendix 5 Syntax program for generate data

data b; /* generate x1 (covariate) and ei */
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

data e; /* generate poisson-gamma with excess zero */
  do kk=1 to 30;
    set b;
    tetha = rangam(1,1);
    lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
    starlambda = log(lambda/tetha);
    output;
  end;
run;

proc reg;
  model starlambda = x1;
  ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;

data _null_;
  set work.betha_lr_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
run;

data d;
  do kk=1 to 30;
    set e;
    mu = exp(&Intercept + &x1*x1);
    parmlambda = mu*tetha;
    ypoi = rand('poisson', parmlambda);
    output;
  end;
run;

ods trace on; /* to take percent zero on data */
proc freq data=d;
  tables ypoi;
  ods output OneWayFreqs=work.zero;
run;

data zero;
  set zero;
  keep percent;
run;

proc transpose data=zero out=zero1;
run;

data _null_;
  set work.zero1;
  call symput('pctz', col1);
run;

data d;
  set d;
  pzero=&pctz;
  r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data;


Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
  set work.betha_nb_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
  bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
  set work.duga_nb1;
  if ^(u=&x) then delete;
run;

data work.jacknb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
  /* output p out=sasyi_est */
  model ypoi = x1 / dist=nb link=log;
  ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
run;

proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;

data _null_;
  set work.betha_est_nbt&h;
  call symput('Intercept_', col1);
  call symput('x1_', col2);
  call symput('Dispersion_', col3);
run;

data work.duganb&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
  w_b_&h=mu_hat_b_&h / (mu_hat_b_&h + &Dispersion_);
  teta_hat_&h=w_b_&h * ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
  beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
  merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
        work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
        work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
        work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
        work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
        work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
  by kk;
run;

data work.mse_nb_j;
  set work.mse_nb_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
  ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
  merge work.d work.mse_nb_j;
  keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
run;

%END;
%mend;

%sae_nb;


Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

proc countreg data=a;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
  set work.pe_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.duga;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
  bias_b=abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
  set work.duga1;
  if ^(u=&x) then delete;
run;

data work.jackzinb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
  set work.betha_est_ZINBt&h;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept + &x1*x1);
  /* mu_hat_b_&h= &b_o- + &b_1- x1 */
  w_b_&h=mu_hat_b_&h / (mu_hat_b_&h + (&_Alpha));
  teta_hat_&h=w_b_&h * ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  /* g1_&h =((mu_hat_b_&h2&alpha_)2)(&alpha_+y_i)((mu_hat_b_&h2&alpha_)+mu_hat_b_&h)2 */
  g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
  /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
  beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
  merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
        work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
        work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
        work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
        work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
        work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
  by kk;
run;

data work.mse_ZINB_j;
  set work.mse_ZINB_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
  merge work.d work.mse_ZINB_j;
  keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB BASE=work.hasilZINB1;
run;

%END;
%mend;

%sae_zinb;


ACKNOWLEDGEMENTS

First of all, the author acknowledges that the completion of this paper would not have been possible without the invaluable help of many generous and extraordinary people. The author is deeply indebted to them for their help, ideas, criticism, and advice during the writing process. However, they should not be held responsible for any mistakes and deficiencies in this paper, which are purely the author's. I would therefore like to express my gratitude to:

1. Allah SWT, to whom all praise and gratitude belong. Alhamdulillahirabbil'alamin. With His blessing I was able to finish this paper. Thank You, Allah, for giving me a wonderful life with extraordinary people around me.

2. Prof. Dr. Khairil Anwar Notodiputro and Ir. Indahwati, M.Si., for the early motivation, discussions, advice, support, and their great enthusiasm.

3. Mr. Bambang Sumantri, M.Si., as examiner, for the encouragement, advice, and criticism.

4. My beloved family, for their unlimited love.

5. Mr. Alfian Futuhul Hadi, M.Si., for the enlightening discussions when I was in trouble.

6. Mr. Bagus Sartono, M.Si., for running my data in his lab on his wonderful computer. I am sorry if it disturbed you.

7. Mr. Anang, M.Si., and Mr. Rahman, M.Si., for sharing their knowledge and for their technical support.

8. Dr. Ir. Hari Wijayanto, M.S., and all lecturers and staff at the Statistics Department, for the knowledge of statistics and of life that you shared. It means a lot to me.

9. Rahmatullah Sigit Dodiet Sasongko, S.Si., for the encouragement, love, care, time, and patience. Keep it real. Still love me forever and ever.

10. Mr. Dionisius Laksmana Bisara Putra, S.Si., for editing my paper, for his criticism, and for the useful discussions.

11. Maulana Chistanto, S.Si., and Yhanuar Ismail, S.Si., for being my best brothers.

12. Nikhen Sevrien and the late Dini, for lighting up my days.

13. Rere Yusri Agus Ika Cinong Toki Cheri Fisca Wiwik Neng Mala Lilis Dika Rangga Lele Dodi Kus Inal Bebek Koler and all of Statistics '41.

14. Everyone who helped me in this study and cannot be named individually.

This thesis is not perfect, so I welcome criticism, advice, and recommendations from its readers. Thank you, and God bless you all.

Bogor January 2009

Irene Muflikh Nadhiroh


TABLE OF CONTENTS

INTRODUCTION
  Background
  Objectives

LITERATURE REVIEW
  Direct Estimation
  Small Area Estimation
  Small Area Models
  Empirical Bayes Methods
  Poisson-Gamma Models
  Negative Binomial Regression
  Over-dispersion in Count Data
  Zero-Inflated Models
  Zero-Inflated Negative Binomial
  Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

METHODOLOGY
  Data
  Methods

RESULT AND DISCUSSION
  Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
  Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
  Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generate data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB


INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but such estimators are sometimes difficult to use in a smaller region, especially when the sample size is too small. In this case, indirect estimation, which uses covariates to estimate the parameter, is usually employed. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research on Small Area Estimation based on Poisson-Gamma models. Maximum Likelihood Estimation was used with Negative Binomial Regression to estimate the respective prior parameters. Moreover, Negative Binomial Regression was used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zero is a condition in which the data contain more zeros than the assumed distribution expects. For example, among 100 observations from a Poisson model with a response mean of 4, we would expect about 2 zeros. If the data have 30 zeros, it should be obvious that the distributional assumptions have been violated, and the estimated parameters and standard errors will therefore be biased (Hardin & Hilbe 2007). In this paper, Zero-Inflated models were adapted to solve this type of problem.
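The figure of about 2 expected zeros can be checked directly from the Poisson probability of a zero, exp(-mean). The snippet below is only an illustrative sketch; the function name `expected_zeros` is hypothetical and not part of the original study.

```python
import math


def expected_zeros(n_obs: int, mean: float) -> float:
    """Expected number of zeros among n_obs independent Poisson(mean) draws."""
    # P(Y = 0) for a Poisson variable is exp(-mean).
    return n_obs * math.exp(-mean)


# 100 observations with mean 4, as in the example in the text.
print(round(expected_zeros(100, 4.0), 2))  # 1.83, i.e. about 2 zeros
```

Observing 30 zeros in such a sample would be far above this expectation, which is the situation the zero-inflated models are meant to handle.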

Objectives

The research objectives are:

1. To investigate the performance of Negative Binomial Regression in Small Area Estimation in the presence of excess zeros.

2. To apply Zero-Inflated Count Models in Small Area Estimation in the presence of excess zeros.

3. To evaluate the performance of Zero-Inflated Count Models in estimating the prior parameters for Small Area Estimation.

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have long been used to obtain reliable estimates of parameters. Ramsini et al (2001) mentioned that direct estimates for small areas are unbiased, although they have large variance because of the small sample size.

Small Area Estimation

The term small area can refer to anything, depending on the object of interest. It can be a city, an age group, a sex group, a region, or a rural district. In general, small area denotes any domain for which direct estimates with adequate precision cannot be produced (Rao 2003). This happens because the sample size in the small area is too small. Small area estimation was therefore developed as a statistical technique for estimating small area parameters with an adequate level of precision. It works as indirect estimation that borrows strength from the values of the variable of interest in related areas through the use of supplementary information related to that variable, such as recent census counts and current administrative records (Rao 2003). Indirect estimation is a process of estimating a domain's parameter by connecting the information in that domain with other domains using an appropriate model, so the estimator works by including other domains' data (Kurnia & Notodiputro 2006).

Small Area Models

There are two kinds of link models in indirect estimation. The first are traditional methods based on implicit models that provide a link to related small areas through supplementary data. The second are explicit small area models that make specific allowance for between-area variation (Rao 2003). This research used the second kind, which can be classified into two broad types of basic model:

1. Basic area level (type A) model

Basic area level models, or aggregate models, include all models that relate small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to area-specific auxiliary data (covariates) $x_i = (x_{1i}, \ldots, x_{pi})^T$ by a linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is a $p \times 1$ vector of regression coefficients, and the $b_i$ are known positive constants. For making inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors $e_i \sim N(0, \sigma_{ei}^2)$ and the $\sigma_{ei}^2$ are known. Combining both models gives the new model

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2. Basic unit level (type B) model

Unit level models include all models that relate unit values of the study variable to unit-specific auxiliary variables. Unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$ are assumed, with the corresponding nested regression model

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$.

Empirical Bayes Methods

The Bayesian approach is based on Bayes' law, which was discovered by Thomas Bayes. The law was presented by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781, Laplace gave the details and relevance for modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick in Good (1980) mentioned that Bayes methods are difficult to adopt and sometimes very sensitive because they require prior probability information, which is usually difficult to obtain. Robbin (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate its parameters from the sample. Rao (2003) said that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:

1. Obtain the posterior probability density function of the small area parameter.

2. Estimate the model parameters from the marginal density function.

3. Use the estimated posterior density for inferences regarding the parameters of interest.

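Poisson-Gamma Models

The Poisson model is the standard model for count data. Count data often suffer from over-dispersion, so extensions of the Poisson model have been developed to accommodate the extra variance in sample data. A two-stage model for count data, known as the Poisson-Gamma mixture model, has been introduced. Wakefield (2006) introduced a Poisson-Gamma model that is easy to use with the SMR (Standardized Mortality Ratio) as a direct estimator. This study used the Wakefield model with an alteration in the direct estimator.

Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij},$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

First stage: $y_i \mid \theta_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at the area level, $x_i$ is a vector of covariates, and $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, \alpha)$ is assumed as a prior distribution with mean 1 and variance $1/\alpha$.

Then the marginal distribution of $y_i \mid \beta, \alpha$ is negative binomial.

Moreover, Wakefield (2006) used Bayes' theorem and obtained the posterior distribution as

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}(y_i + \alpha,\ \mu_i + \alpha)$$

and the EB estimator as

$$\hat{\theta}_i^{EB} = \hat{\theta}_i^{B}(\hat{\beta}, \hat{\alpha}) = \hat{w}_i \hat{\theta}_i + (1 - \hat{w}_i)\hat{\mu}_i$$

with $\hat{w}_i = \hat{\mu}_i / (\hat{\alpha} + \hat{\mu}_i)$, where $\hat{\theta}_i = y_i$ is the direct estimate and $y_i$ is the number of observations. Under the posterior above, this weighted average is the posterior mean of the area mean $\mu_i \theta_i$, since $\mu_i (y_i + \alpha)/(\mu_i + \alpha) = w_i y_i + (1 - w_i)\mu_i$.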
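As a small illustration, the shrinkage form of the EB estimator can be sketched in a few lines. The sketch mirrors the quantities called `w_bayes` and `teta_hat_bayes` in the Appendix 6 program; the function name `eb_estimate` and the example values are hypothetical, and in practice `mu_hat` and `alpha_hat` would come from a fitted regression rather than being typed in.

```python
def eb_estimate(y: float, mu_hat: float, alpha_hat: float) -> float:
    """Shrink the observed count y towards the regression mean mu_hat."""
    # Weight on the direct estimate; a larger fitted mean gives more weight to y.
    w = mu_hat / (mu_hat + alpha_hat)
    return w * y + (1.0 - w) * mu_hat


# Illustrative values only (not taken from the simulation study).
print(eb_estimate(5, 2.0, 1.0))  # weight 2/3 on y = 5, 1/3 on mu_hat = 2
print(eb_estimate(0, 2.0, 1.0))  # an observed zero is pulled up towards mu_hat
```

The estimate always lies between the observed count and the regression mean, so an area with a zero count still receives a positive estimate as long as the fitted mean is positive.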

Negative Binomial Regression

The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance, and moments. Later, Breslow (1990) cited Lawless' work. From its inception to the late 1980s, the negative binomial regression model was construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial distribution function is written as

$$g(y \mid x) = \frac{\Gamma(y + k^{-1})}{\Gamma(k^{-1})\, y!} \left(\frac{1}{1 + k\mu}\right)^{k^{-1}} \left(\frac{k\mu}{1 + k\mu}\right)^{y}$$

where $y = 0, 1, 2, \ldots$, and $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\mathrm{var}(y) = \mu + k\mu^2$. The parameter $k$ is referred to as the dispersion parameter, and $k > 0$ indicates that the data are over-dispersed.

Over-dispersion in Count Data

Count data in Poisson regression are said to be over-dispersed if the variance is larger than the mean, that is, if the observed variance is larger than the Poisson model expects. This phenomenon is written as

$$\mathrm{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).
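A quick informal check of this condition is the ratio of the sample variance to the sample mean, which should be near 1 for Poisson data. The sketch below is only illustrative: the name `dispersion_index` and the example counts are hypothetical, and the ratio is a descriptive diagnostic rather than a formal test.

```python
import statistics


def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    return statistics.variance(counts) / statistics.mean(counts)


# Hypothetical counts with many zeros and a few large values.
counts = [0, 0, 0, 0, 1, 2, 3, 8, 10, 0]
print(dispersion_index(counts) > 1)  # True: the variance exceeds the mean
```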

Zero-Inflated Models

Zero-inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is those who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is given by (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + [1 - B(0)]\Pr(0).$$

The probability of a nonzero count $k$ is

$$\Pr(y = k;\ k > 0) = [1 - B(0)]\Pr(k).$$

This model produces two groups of parameters. One is the zero-inflation parameters, which show whether the covariates contribute significantly to having a zero outcome. The other is the negative binomial parameters, which model the response with the covariates.

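Zero-Inflated Negative Binomial

There are many kinds of zero-inflated models; each has advantages and disadvantages and suits a different type of data. The zero-inflated negative binomial (ZINB) model is one of them. It is used for over-dispersed data with excess zeros. As a result, among the parameter estimates there is a parameter $k$ which indicates that over-dispersion occurs in the data, just as the dispersion parameter does in negative binomial regression.

The probability distribution of this model is as follows:

$$P(Y_i = y_i \mid x_i, z_i) = \begin{cases} \varphi_i + (1 - \varphi_i)\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi_i)\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi_i = \varphi(\gamma^T z_i)$ is a function of $z_i$, the vector of zero-inflation covariates, and $\gamma$ is the vector of zero-inflation coefficients to be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + k^{-1})}{\Gamma(k^{-1})\, y_i!} \left(\frac{1}{1 + k\mu_i}\right)^{k^{-1}} \left(\frac{k\mu_i}{1 + k\mu_i}\right)^{y_i}.$$

The mean and variance of ZINB are

$$E(y_i \mid x_i, z_i) = \mu_i (1 - \varphi_i),$$

$$V(y_i \mid x_i, z_i) = \mu_i (1 - \varphi_i)\bigl(1 + \mu_i(\varphi_i + k)\bigr).$$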
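To make the mixture concrete, the following sketch evaluates a ZINB probability function and checks its mean and variance numerically. It assumes the common parameterization in which the negative binomial part has mean mu and variance mu + k*mu**2 and phi is the probability of the always-zero group; the function names and the example values are hypothetical.

```python
import math


def nb_pmf(y: int, mu: float, k: float) -> float:
    """Negative binomial probability with mean mu and variance mu + k * mu**2."""
    inv_k = 1.0 / k
    log_p = (math.lgamma(y + inv_k) - math.lgamma(inv_k) - math.lgamma(y + 1)
             + y * math.log(k * mu / (1.0 + k * mu))
             - inv_k * math.log(1.0 + k * mu))
    return math.exp(log_p)


def zinb_pmf(y: int, mu: float, k: float, phi: float) -> float:
    """Zero-inflated negative binomial: extra mass phi is placed on zero."""
    base = (1.0 - phi) * nb_pmf(y, mu, k)
    return phi + base if y == 0 else base


def zinb_mean(mu: float, phi: float) -> float:
    return mu * (1.0 - phi)


def zinb_var(mu: float, k: float, phi: float) -> float:
    return mu * (1.0 - phi) * (1.0 + mu * (phi + k))


# Illustrative parameter values only.
mu, k, phi = 2.0, 0.5, 0.3
probs = [zinb_pmf(y, mu, k, phi) for y in range(400)]
mean = sum(y * p for y, p in enumerate(probs))
var = sum(y * y * p for y, p in enumerate(probs)) - mean ** 2
print(abs(mean - zinb_mean(mu, phi)) < 1e-9, abs(var - zinb_var(mu, k, phi)) < 1e-9)
```

The numerical mean and variance agree with the closed forms, and the probability of zero is larger than under the plain negative binomial with the same mu and k.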

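Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method was introduced by Tukey (1958) and developed into a method capable of correcting the bias of an estimator by deleting observation $l$, for $l = 1, \ldots, m$, and re-estimating the parameters.

According to Rao (2003), the jackknife steps to estimate MSE($\hat{\theta}_i^{EB}$) are:

1. Let $\hat{\theta}_i^{EB} = k_i(y_i, \hat{\beta}, \hat{\alpha})$ and $\hat{\theta}_{i,-l}^{EB} = k_i(y_i, \hat{\beta}_{-l}, \hat{\alpha}_{-l})$, then calculate

$$\hat{M}_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat{\theta}_{i,-l}^{EB} - \hat{\theta}_i^{EB}\right)^2.$$

2. Calculate the delete-$l$ estimators $\hat{\beta}_{-l}$ and $\hat{\alpha}_{-l}$, then calculate

$$\hat{M}_{1i} = g_{1i}(\hat{\beta}, \hat{\alpha}, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[ g_{1i}(\hat{\beta}_{-l}, \hat{\alpha}_{-l}, y_i) - g_{1i}(\hat{\beta}, \hat{\alpha}, y_i) \right].$$

Here $g_{1i}(\hat{\beta}, \hat{\alpha}, y_i)$ is the estimated variance of the posterior distribution, which measures the variability associated with $\hat{\theta}_i$. Using $g_{1i}(\hat{\beta}, \hat{\alpha}, y_i)$ alone leads to severe underestimation of $\mathrm{MSE}_J(\hat{\theta}_i^{EB})$ because it ignores the estimation of the prior parameters. The estimator $\hat{M}_{1i}$ therefore corrects the bias of $g_{1i}(\hat{\beta}, \hat{\alpha}, y_i)$.

3. Calculate the jackknife estimator of MSE($\hat{\theta}_i^{EB}$) as

$$\mathrm{MSE}_J(\hat{\theta}_i^{EB}) = \hat{M}_{1i} + \hat{M}_{2i}.$$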
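The jackknife combination can be sketched as a small function that takes the full-sample quantities and their delete-one counterparts, following the same arithmetic as `m1i`, `m2i` and `mse_j_b` in the Appendix 6 program. The function name `jackknife_mse` and the numbers are hypothetical; refitting the model with each area deleted is assumed to have been done elsewhere.

```python
def jackknife_mse(theta_full, g1_full, theta_deleted, g1_deleted):
    """Jackknife MSE for one area from full-sample and delete-one quantities."""
    m = len(theta_deleted)
    factor = (m - 1) / m
    # M2: variability of the EB estimate across the delete-one fits.
    m2 = factor * sum((t - theta_full) ** 2 for t in theta_deleted)
    # M1: posterior variance with a jackknife bias correction.
    m1 = g1_full - factor * sum(g - g1_full for g in g1_deleted)
    return m1 + m2


# Illustrative values with three delete-one fits.
print(jackknife_mse(2.0, 0.5, [2.1, 1.9, 2.0], [0.6, 0.4, 0.5]))

# The bias correction is subtracted, so the estimate can turn negative when
# the delete-one posterior variances are much larger than the full-sample one.
print(jackknife_mse(2.0, 0.1, [2.0, 2.0, 2.0], [1.0, 1.0, 1.0]) < 0)  # True
```

The second call shows one way a negative MSE estimate can arise from this formula, which is relevant to the negative values reported in the results.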

METHODOLOGY

Data

This research assumed that the available auxiliary data are at the area level, so the basic area level model was used. The data were simulated with 30 small areas and one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. This research assumed that the relationship between the response and the covariate was linear.

Methods

The data were generated using SAS 9.1 in the following steps:

1. Fix the value of $X_i$ for the $i$-th area.

2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $Lambda_i = -\log(P(Y_i = 0))$.

3. Generate $\theta_i \sim \text{Gamma}(1, 1)$.

4. Calculate $\lambda_i^* = \log(Lambda_i / \theta_i)$.

5. Fit a linear regression of $\lambda_i^*$ on $X_i$ to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.

6. Calculate $\mu_i = \exp(X_i^T \hat{\beta})$.

7. Calculate $parmlambda_i = \mu_i \theta_i$.

8. Generate $y_i \sim \text{Poisson}(parmlambda_i)$.
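The generation steps can be sketched in Python for a single batch. This is only an approximation of the SAS program in Appendix 5 under stated assumptions: the covariate values are hypothetical (drawn uniformly), step 4 is assumed to take the log of the ratio of Lambda to theta, and the helper names `fit_line`, `poisson_draw` and `simulate_batch` are illustrative.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x with a single covariate."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_batch(x, p_zero, rng):
    big_lambda = -math.log(p_zero)                         # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]        # step 3
    lam_star = [math.log(big_lambda / t) for t in theta]   # step 4 (assumed ratio)
    b0, b1 = fit_line(x, lam_star)                         # step 5
    mu = [math.exp(b0 + b1 * xi) for xi in x]              # step 6
    parm = [m * t for m, t in zip(mu, theta)]              # step 7
    y = [poisson_draw(rng, lam) for lam in parm]           # step 8
    return y, parm


rng = random.Random(2009)
x = [rng.uniform(0.0, 3.0) for _ in range(30)]  # hypothetical covariate values
y, parm = simulate_batch(x, 0.6, rng)
print(sum(1 for v in y if v == 0), "zeros out of", len(y))
```

Repeating `simulate_batch` over replications and over target probabilities from 0.1 to 0.9 would give batches analogous to those described above, although the exact counts depend on the random number generator.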

Moreover, the data were analyzed in the following steps:

1. Fit the negative binomial regression with the genmod procedure in SAS 9.1 and the Zero-Inflated Negative Binomial regression with the countreg procedure in SAS 9.2.

2. Estimate the prior parameters, which are $\beta$ and $\alpha$.

3. Estimate $\theta_i$ using the EB method.

4. Calculate the MSE of the indirect estimates.

5. Calculate the RRMSE (Relative Root Mean Square Error):

$$RRMSE(\hat{\theta}_i) = \frac{\sqrt{MSE(\hat{\theta}_i)}}{\hat{\theta}_i}$$

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced small and consistent MSE. Meanwhile, if the number of excess zeros is approximately 30 or more, with an expected probability of zero of 0.6, the estimates tend to be unreliable, and EB estimation produced negative values.

The RRMSE of the estimator increases along with the number of zeros in the data. Furthermore, if the data contain at least 30 excess zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -12875        0.33            -0.38           0.81
0.7                   253671        0.40            -1216           1.35
0.8                   -584495       0.30            30946           2.11
0.9                   39135606      0.16            1.16E+10        6.64

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   26197         0.33            -0.35           0.75
0.7                   950007        0.40            -1002           0.99
0.8                   1444250       0.30            22054           1.10
0.9                   41595285      0.16            6.77E+09        0.56


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative values of MSE have been changed to zero.
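The truncation behind MSE (II) and RRMSE (II) can be sketched as follows. The function names and example values are hypothetical; the sketch only illustrates setting a negative MSE to zero before taking the square root and dividing by the EB estimate.

```python
import math


def truncated_mse(mse: float) -> float:
    """Replace a negative jackknife MSE by zero."""
    return max(mse, 0.0)


def rrmse(mse: float, theta_hat: float) -> float:
    """Square root of the truncated MSE divided by the EB estimate."""
    return math.sqrt(truncated_mse(mse)) / theta_hat


print(rrmse(0.25, 2.0))   # 0.25
print(rrmse(-1.7, 2.0))   # 0.0, the negative MSE has been truncated
print(rrmse(0.25, -2.0))  # -0.25, a negative EB estimate still gives a negative RRMSE
```

The last call shows why truncating the MSE alone does not remove negative RRMSE values: they also arise whenever the EB estimate itself is negative.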

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the expected probability of zero is 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because of the negative EB estimates.

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although they are slightly outperformed by the NBR method when the data contain only a small number of zeros. In particular, as shown by Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a bigger MSE for the EB estimator than NBR does.

If the data have an expected probability of zero of 0.6 to 0.7, however, ZINB gives better estimates. The estimates were also unbiased, as they cover the parameter values adequately. ZINB begins to produce inconsistent estimates if the expected probability of zero is 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is bigger than the mean of MSE with ZINB. This is because, once the negative values of MSE are changed to zero, they no longer act as a reduction factor in the calculation of the mean.
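The effect of the truncation can be illustrated with a small Python sketch. It assumes that RRMSE is computed as the square root of the MSE divided by the EB estimate, as in the appendix listings; the function names and the five MSE values are hypothetical and are not taken from the simulation.

```python
import math

def truncate_mse(mse_values):
    """Set negative jackknife MSE values to zero (the MSE (II) rule)."""
    return [max(m, 0.0) for m in mse_values]

def rrmse(mse, estimate):
    """Relative root MSE: sqrt(MSE) / estimate, after mapping negative MSE to zero."""
    return math.sqrt(max(mse, 0.0)) / estimate

# Hypothetical MSE values for five areas (illustration only).
mse = [0.25, -1.00, 0.16, -0.40, 0.09]
mse_ii = truncate_mse(mse)

mean_mse = sum(mse) / len(mse)            # negative values pull the mean down
mean_mse_ii = sum(mse_ii) / len(mse_ii)   # truncation removes that reduction

print(mean_mse, mean_mse_ii)
print(rrmse(0.25, 2.0))    # positive estimate, positive RRMSE
print(rrmse(0.25, -2.0))   # a negative EB estimate makes the RRMSE negative
```

The mean after truncation can only be equal to or larger than the mean before it, which is the pattern seen when Tables 3 and 4 are compared.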

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a bigger MSE than NBR does as long as the expected probability of zero in the data stays below the 0.6 threshold.

The ZINB method performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case the EB estimates given by the NBR method are unstable and inconsistent, because some estimates are negative and the MSE can be thousands of times larger than an acceptable value. The EB estimator with ZINB, on the other hand, works well: it gives unbiased estimates and its MSE values are more stable than those of the EB estimator with NBR.

Both methods perform poorly if the data have an expected probability of zero of 0.8 or more. The EB estimators from both methods are inconsistent as a result of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            73228.07        0.66
0.6                   0.61          0.38            298.17          1.03
0.7                   0.58          0.25            2181.19         1.94
0.8                   -1.28         -1.4E-07        1626.97         3.75
0.9                   29547.90      -1E-06          3.5E+278        6095.08

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            65612.35        0.58
0.6                   0.75          0.38            234.06          0.70
0.7                   1.50          0.25            1346.55         0.69
0.8                   1.75          0               7335.06         0
0.9                   29549.08      0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influence the result of EB estimation. A conventional method such as negative binomial regression in the prior estimation produced a biased and unreliable EB estimator for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator, even when the data have an expected probability of zero of 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. The performance of ZINB declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research is based on many assumptions and is subject to several limitations. If the assumptions and boundaries can be relaxed, better results can be expected. Some recommendations for further research are:

1. The generating process in this research does not reflect a real sampling process. A generating process similar to the real sampling process might give better results because it would be closer to the real application.

2. It would be more interesting to run an experiment with a larger number of areas, since the number of areas influences the data modeling.

3. Restricted Maximum Likelihood may be applied when estimating the prior parameter with ZINB and NBR in order to solve the problem of negative MSE values.

4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of the ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0)  Actual  r  Statistic  min  Q1  median  mean  Q3  max
10  10-3333  100  EB estimator  0426011  1525665  3188832  4252666  5752756  205939
                  bias  0000446  05164  0878579  1315093  1721091  8704671
                  MSE  0040547  0109118  0159448  0333613  0335256  4167064
                  RRMSE  0041258  0100045  01356  018188  0220426  0576793
20  1333-3667  100  EB estimator  0342831  1013993  2218265  2984668  3953417  1815693
                  bias  0000587  0413611  079407  1100373  1454889  7906915
                  MSE  0055631  0131969  0196963  0353033  0386291  3778251
                  RRMSE  0070449  015421  0205182  0262006  0352726  0788718
30  20-5333  100  EB estimator  0323311  0836545  1562163  2263684  2918741  1214482
                  bias  0000151  0372382  067041  0916482  122012  5950225
                  MSE  0074364  0163462  0231014  0400207  0432371  5250254
                  RRMSE  0102324  0214697  0299247  0361013  0474077  1192032
40  2333-5667  100  EB estimator  024882  064963  1219656  17107  2248716  930007
                  bias  0000564  0293602  0549809  0757937  1007851  486688
                  MSE  -100569  0194196  0271669  041875  045917  3239598
                  RRMSE  0123605  0300339  0422426  0503566  0642418  2202294
50  2333-6333  100  EB estimator  0122548  0570083  1028619  1291758  1728067  6750472
                  bias  000029  0250747  0453265  0622838  0803185  4009352
                  MSE  -237643  0235733  0306641  0452955  05091  3652167
                  RRMSE  0038956  0412708  0588924  0717336  0844735  3240156
60  30-70  100  EB estimator  -077338  044443  0699758  0944038  1131071  6323352
                  bias  0000452  020433  0398131  0534095  0679938  3848209
                  MSE  -749011  0254097  0330078  -12875  0539873  2354887
                  RRMSE  -663045  051763  0813734  -038057  1287528  1767434
70  4333-7333  100  EB estimator  -33274  0249515  0442513  0659375  0922519  9258959
                  bias  0000375  0155154  0316124  0476883  0588926  8475103
                  MSE  -7513075  0235378  0402092  2536714  0876569  6051162
                  RRMSE  -10741  0704796  1355566  -121606  3040291  3332419
80  5333-90  100  EB estimator  -232889  017621  0305365  0569959  0576346  6303601
                  bias  0000395  0116669  0254473  1091172  0497898  6297454
                  MSE  -6E+09  -016583  0301527  -584495  5718409  185E+09
                  RRMSE  -212936  0927338  2115163  3094627  1359703  4151289
90  70-100  100  EB estimator  -108767  0111208  0230315  0212247  0353129  3625557
                  bias  000016  0086  0177169  0425532  0314714  1092655
                  MSE  -38E+09  -130817  0159682  39135606  3074073  12E+11
                  RRMSE  -909131  1647188  6639631  116E+10  1585472  706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual  r  Statistic  min  Q1  median  mean  Q3  max
10  10-3333  100  EB estimator  0267752  1500256  3195861  4280907  5833922  2220705
                  bias  0000603  0485515  0882468  1315721  1750173  8704672
                  MSE  -053954  010933  0168797  0449506  0369775  360843
                  RRMSE  0022947  0096443  0136424  0238099  0241955  5518468
20  1333-3667  100  EB estimator  105E-08  0914898  221594  3017228  401361  1815694
                  bias  0000368  0383426  0780984  1105029  1496623  7906918
                  MSE  -07309  0126202  0201463  0425844  0414597  1734815
                  RRMSE  0021807  0144983  0210692  0326097  0401786  3177943
30  20-5333  100  EB estimator  0132041  0719086  1523909  2308745  3012309  1228058
                  bias  0000508  0332427  0680187  0928947  1254604  6314973
                  MSE  -229891  0156942  0277017  0707983  0590466  7469014
                  RRMSE  0023998  0210095  0317195  0519524  0618802  3500387
40  2333-5667  100  EB estimator  105E-08  0574265  1209034  1742928  2368713  104953
                  bias  0000564  0268248  0544049  0771741  1067061  4889872
                  MSE  -125713  0181557  0284338  0540615  0498521  423089
                  RRMSE  0054916  028362  0420396  0630776  0778033  5394515
50  2333-6333  100  EB estimator  105E-08  0426701  1033816  133848  1906961  8018962
                  bias  0000453  0224726  0454522  0661709  0900005  4414442
                  MSE  -181856  0194818  0334706  0859252  0711939  7997074
                  RRMSE  0026206  0387294  0662251  7322807  1312302  13388294
60  30-70  100  EB estimator  105E-08  030085  0645848  0985327  1154975  728326
                  bias  62E-05  0190886  0406245  0567657  074167  3923952
                  MSE  -34589  0078006  0376514  060793  0804116  3426488
                  RRMSE  000461  0502807  1033578  2981671  2012552  3308816
70  4333-7333  100  EB estimator  105E-08  105E-08  0341315  0677841  1  5005491
                  bias  979E-05  0128017  0358257  0487174  0654423  3733981
                  MSE  -142213  -001433  0255331  0584152  1132152  264456
                  RRMSE  0064209  0847956  1942286  2181192  4589042  7899681
80  5333-90  100  EB estimator  105E-08  105E-08  0142906  0445315  0859305  5
                  bias  0000161  0083397  0272773  0392826  0557213  3532556
                  MSE  -10651  -56E-05  -14E-07  -127819  1452962  1132741
                  RRMSE  0063244  1475413  3754705  162697  9221163  3786684
90  70-100  100  EB estimator  1E-277  105E-08  105E-08  0225165  0135512  3
                  bias  0000495  0054221  0153374  027819  0350213  2736904
                  MSE  -175652  -33E-05  -1E-06  2954790  152E-06  613E+08
                  RRMSE  0040681  4059441  6095076  35E+278  5569021  16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0)  Actual  r  Statistic  min  Q1  median  mean  Q3  max
10  10-3333  100  EB estimator  0426011  1525665  3188832  4252666  5752756  205939
                  bias  0000446  05164  0878579  1315093  1721091  8704671
                  MSE  0040547  0109118  0159448  0333613  0335256  4167064
                  RRMSE  0041258  0100045  01356  018188  0220426  0576793
20  1333-3667  100  EB estimator  0342831  1013993  2218265  2984668  3953417  1815693
                  bias  0000587  0413611  079407  1100373  1454889  7906915
                  MSE  0055631  0131969  0196963  0353033  0386291  3778251
                  RRMSE  0070449  015421  0205182  0262006  0352726  0788718
30  20-5333  100  EB estimator  0323311  0836545  1562163  2263684  2918741  1214482
                  bias  0000151  0372382  067041  0916482  122012  5950225
                  MSE  0074364  0163462  0231014  0400207  0432371  5250254
                  RRMSE  0102324  0214697  0299247  0361013  0474077  1192032
40  2333-5667  100  EB estimator  024882  064963  1219656  17107  2248716  930007
                  bias  0000564  0293602  0549809  0757937  1007851  486688
                  MSE  0  0194196  0271669  0419181  045917  3239598
                  RRMSE  0  0300116  0422209  0502895  0641904  2202294
50  2333-6333  100  EB estimator  0122548  0570083  1028619  1291758  1728067  6750472
                  bias  000029  0250747  0453265  0622838  0803185  4009352
                  MSE  0  0235733  0306641  0456258  05091  3652167
                  RRMSE  0  0410357  0585765  0712314  0841838  3240156
60  30-70  100  EB estimator  -077338  044443  0699758  0944038  1131071  6323352
                  bias  0000452  020433  0398131  0534095  0679938  3848209
                  MSE  0  0254097  0330078  2619677  0539873  2354887
                  RRMSE  -663045  0448118  0750369  -034911  1209918  1767434
70  4333-7333  100  EB estimator  -33274  0249515  0442513  0659375  0922519  9258959
                  bias  0000375  0155154  0316124  0476883  0588926  8475103
                  MSE  0  0235378  0402092  9500073  0876569  6051162
                  RRMSE  -10741  0288999  0995659  -100163  2527784  3332419
80  5333-90  100  EB estimator  -232889  017621  0305365  0569959  0576346  6303601
                  bias  0000395  0116669  0254473  1091172  0497898  6297454
                  MSE  0  0  0301527  1444250  5718409  185E+09
                  RRMSE  -212936  0  1104113  2205437  5656681  4151289
90  70-100  100  EB estimator  -108767  0111208  0230315  0212247  0353129  3625557
                  bias  000016  0086  0177169  0425532  0314714  1092655
                  MSE  0  0  0159682  41595285  3074073  12E+11
                  RRMSE  -909131  0  0557622  677E+09  9311925  706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0)  Actual  r  Statistic  min  Q1  median  mean  Q3  max
10  10-3333  100  EB estimator  0267752  1500256  3195861  4280907  5833922  2220705
                  bias  0000603  0485515  0882468  1315721  1750173  8704672
                  MSE  0  010933  0168797  0450626  0369775  360843
                  RRMSE  0  0095932  0135647  023675  0239669  5518468
20  1333-3667  100  EB estimator  105E-08  0914898  221594  3017228  401361  1815694
                  bias  0000368  0383426  0780984  1105029  1496623  7906918
                  MSE  0  0126202  0201463  0428006  0414597  1734815
                  RRMSE  0  0142648  020709  0320663  0395479  3177943
30  20-5333  100  EB estimator  0132041  0719086  1523909  2308745  3012309  1228058
                  bias  0000508  0332427  0680187  0928947  1254604  6314973
                  MSE  0  0156942  0277017  0716543  0590466  7469014
                  RRMSE  0  0203913  0311937  0506882  0615401  3500387
40  2333-5667  100  EB estimator  105E-08  0574265  1209034  1742928  2368713  104953
                  bias  0000564  0268248  0544049  0771741  1067061  4889872
                  MSE  0  0181557  0284338  0549835  0498521  423089
                  RRMSE  0  0270309  0405926  0606317  0766631  5394515
50  2333-6333  100  EB estimator  105E-08  0426701  1033816  133848  1906961  8018962
                  bias  0000453  0224726  0454522  0661709  0900005  4414442
                  MSE  0  0194818  0334706  094973  0711939  7997074
                  RRMSE  0  0316402  0576343  6561235  1240175  13388294
60  30-70  100  EB estimator  105E-08  030085  0645848  0985327  1154975  728326
                  bias  62E-05  0190886  0406245  0567657  074167  3923952
                  MSE  0  0078006  0376514  0749436  0804116  3426488
                  RRMSE  0  0258286  0698814  2340612  1714808  3308816
70  4333-7333  100  EB estimator  105E-08  105E-08  0341315  0677841  1  5005491
                  bias  979E-05  0128017  0358257  0487174  0654423  3733981
                  MSE  0  0  0255331  1501268  1132152  264456
                  RRMSE  0  0  0688797  1346552  2500825  7899681
80  5333-90  100  EB estimator  105E-08  105E-08  0142906  0445315  0859305  5
                  bias  0000161  0083397  0272773  0392826  0557213  3532556
                  MSE  0  0  0  1755486  1452962  1132741
                  RRMSE  0  0  0  7335062  3311711  3786684
90  70-100  100  EB estimator  1E-277  105E-08  105E-08  0225165  0135512  3
                  bias  0000495  0054221  0153374  027819  0350213  2736904
                  MSE  0  0  0  2954908  152E-06  613E+08
                  RRMSE  0  0  0  12E+278  416189  16E+281


Appendix 5 Syntax program for generate data

data b; /* generate x1 (covariate) and ei */
input x1;
cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

data e; /* generate poisson-gamma with excess zero */
do kk=1 to 30;
set b;
tetha = rangam(1,1);
lambda = -log(0.1); /* desired probability of zero (0.1-0.9) */
starlambda = log(lambda/tetha);
output;
end;
run;

proc reg;
model starlambda = x1;
ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;

data _null_;
set work.betha_lr_t;
call symput('Intercept', col1);
call symput('x1', col2);
run;

data d;
do kk=1 to 30;
set e;
mu = exp(&Intercept + &x1*x1);
parmlambda = mu*tetha;
ypoi = rand('poisson', parmlambda);
output;
end;
run;

ods trace on;
/* to take percent zero on data */
proc freq data=d;
tables ypoi;
ods output OneWayFreqs=work.zero;
run;

data zero;
set zero;
keep percent;
run;

proc transpose data=zero out=zero1;
run;

data _null_;
set work.zero1;
call symput('pctz', col1);
run;

data d;
set d;
pzero=&pctz;
r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data;


Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
set work.e;
if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
model ypoi = x1 / dist=nb link=log;
ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
set work.betha_nb_t;
call symput('Intercept', col1);
call symput('x1', col2);
call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
set a;
mu_hat_b=exp(&Intercept + &x1*x1);
w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
set work.duga_nb1;
if ^(u=&x) then delete;
run;

data work.jacknb&h;
set work.d;
if u=&x;
if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
/* output p out=sas.yi_est */
model ypoi = x1 / dist=nb link=log;
ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
run;

proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;

data _null_;
set work.betha_est_nbt&h;
call symput('Intercept_', col1);
call symput('x1_', col2);
call symput('Dispersion_', col3);
run;

data work.duganb&h;
set work.d;
mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5 work.duganb6
work.duganb7 work.duganb8 work.duganb9 work.duganb10 work.duganb11 work.duganb12
work.duganb13 work.duganb14 work.duganb15 work.duganb16 work.duganb17
work.duganb18 work.duganb19 work.duganb20 work.duganb21 work.duganb22
work.duganb23 work.duganb24 work.duganb25 work.duganb26 work.duganb27
work.duganb28 work.duganb29 work.duganb30;
by kk;
run;

data work.mse_nb_j;
set work.mse_nb_j;
t_sum=0;
g_sum=0;
%do j=1 %to 30;
g_sum=g_sum+beda_g_&j;
t_sum=t_sum+delta_&j;
%end;
m2i=((30-1)/30)*t_sum;
m1i=g1-((30-1)/30)*g_sum;
mse_j_b=m1i+m2i;
rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
merge work.d work.mse_nb_j;
keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
run;

%END;
%mend;

%sae_nb;


Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
set work.e;
if ^(u=&x) then delete;
run;

proc countreg data=a;
model ypoi=x1 / dist=zinb method=qn;
zeromodel ypoi ~ x1;
ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
set work.pe_t;
call symput('Intercept', col1);
call symput('x1', col2);
call symput('Inf_Intercept', col3);
call symput('Inf_x1', col4);
call symput('_Alpha', col5);
run;

data work.duga;
set a;
mu_hat_b=exp(&Intercept + &x1*x1);
w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
bias_b=abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
set work.duga1;
if ^(u=&x) then delete;
run;

data work.jackzinb&h;
set work.d;
if u=&x;
if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
model ypoi=x1 / dist=zinb method=qn;
zeromodel ypoi ~ x1;
ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
set work.betha_est_ZINBt&h;
call symput('Intercept', col1);
call symput('x1', col2);
call symput('Inf_Intercept', col3);
call symput('Inf_x1', col4);
call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
set work.d;
mu_hat_b_&h=exp(&Intercept + &x1*x1);
/* mu_hat_b_&h= &b_o- + &b_1- x1 */
w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
/* g1_&h =((mu_hat_b_&h 2 &alpha_)2)(&alpha_+y_i)((mu_hat_b_&h 2 &alpha_)+mu_hat_b_&h)2 */
g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
/* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5 work.dugaZINB6
work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10 work.dugaZINB11 work.dugaZINB12
work.dugaZINB13 work.dugaZINB14 work.dugaZINB15 work.dugaZINB16 work.dugaZINB17
work.dugaZINB18 work.dugaZINB19 work.dugaZINB20 work.dugaZINB21 work.dugaZINB22
work.dugaZINB23 work.dugaZINB24 work.dugaZINB25 work.dugaZINB26 work.dugaZINB27
work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
by kk;
run;

data work.mse_ZINB_j;
set work.mse_ZINB_j;
t_sum=0;
g_sum=0;
%do j=1 %to 30;
g_sum=g_sum+beda_g_&j;
t_sum=t_sum+delta_&j;
%end;
m2i=((30-1)/30)*t_sum;
m1i=g1-((30-1)/30)*g_sum;
mse_j_b=m1i+m2i;
rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
merge work.d work.mse_ZINB_j;
keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB BASE=work.hasilZINB1;
run;

%END;
%mend;

%sae_zinb;


TABLE OF CONTENTS

INTRODUCTION
  Background
  Objectives

LITERATURE REVIEW
  Direct Estimation
  Small Area Estimation
  Small Area Models
  Empirical Bayes Methods
  Poisson-Gamma Models
  Negative Binomial Regression
  Over-disperse at Count Data
  Zero-Inflated Models
  Zero-Inflated Negative Binomial
  Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

METHODOLOGY
  Data
  Methods

RESULT AND DISCUSSION
  Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
  Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
  Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

CONCLUSION
RECOMMENDATION
REFERENCES

LIST OF TABLES

Table 1 MSE and RRMSE of EB Estimator with NBR
Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR
Table 3 MSE and RRMSE of EB Estimator with ZINB
Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

LIST OF APPENDICES

Appendix 1 Result of EB estimation with NBR
Appendix 2 Result of EB estimation with ZINB
Appendix 3 Result of EB estimation (II) with NBR
Appendix 4 Result of EB estimation (II) with ZINB
Appendix 5 Syntax program for generate data
Appendix 6 Syntax program EB with NBR
Appendix 7 Syntax program EB with ZINB


INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but it is sometimes difficult to use such an estimator in a smaller region, especially when the sample size is too small. In this case, indirect estimation, which adds covariates to estimate the parameter, is usually used. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research in Small Area Estimation based on Poisson-Gamma models. Maximum Likelihood Estimation was used with Negative Binomial Regression techniques to estimate the respective prior parameter. Moreover, Negative Binomial Regression was used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zero is a condition in which the data contain too many zeros, more than the distribution is expected to produce. For example, in 100 observations from a Poisson model with a response mean of 4, we would expect about 2 zeros. If the data have 30 zeros, it should be obvious that the distributional assumptions have been violated, and the estimated parameters and standard errors will be biased (Hardin & Hilbe 2007). In this paper Zero-Inflated models were adapted to solve this type of problem.
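The figure of about 2 zeros follows from the Poisson probability of a zero, $P(Y = 0) = e^{-\mu}$. A minimal Python check, with a hypothetical helper name, is:

```python
import math

def expected_zeros(n, mean):
    """Expected number of zeros among n independent Poisson(mean) counts."""
    return n * math.exp(-mean)

print(round(expected_zeros(100, 4), 2))  # 1.83, i.e. about 2 zeros
```

Observing 30 zeros in such a sample is therefore far more than the Poisson model allows for.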

Objectives

The research objectives are:

1. To investigate the performance of Negative Binomial Regression in Small Area Estimation in the case of excess zeros.

2. To apply Zero-Inflated Count Models in Small Area Estimation in the case of excess zeros.

3. To evaluate the performance of Zero-Inflated Count Models in estimating the prior parameter for Small Area Estimation.

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have been used to provide reliable estimates of parameters. Ramsini et al (2001) mentioned that direct estimates for small areas are unbiased, although they would have a big variance because of the small sample size.

Small Area Estimation

The term small area can refer to anything, depending on the object of interest. It can be a city, an age group, a sex group, a region or a rural district. In general, a small area denotes any domain for which direct estimates with adequate precision cannot be produced (Rao 2003). This happens because the sample size in the small area is too small, so direct estimation based on the sampling design cannot reach adequate precision. Small area estimation was therefore developed as a statistical technique for estimating the parameters of small areas with an adequate level of precision. It works as indirect estimation that borrows strength from the values of the variable of interest in related areas, through the use of supplementary information related to that variable, such as recent census counts and current administrative records (Rao 2003). Indirect estimation is a process of estimating a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator works by including data from other domains (Kurnia & Notodiputro 2006).

Small Area Models

There are two kinds of link models in indirect estimation. The first is the traditional method based on implicit models that provide a link to related small areas through supplementary data. The second is explicit small area models that make specific allowance for between-area variation (Rao 2003). This research used the second kind, which can be classified into two broad types of basic model:

1. Basic area level (type A) model

Basic area level models, or aggregate models, include all models that relate small area parameters to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to the area-specific auxiliary data (covariates) $x_i = (x_{i1}, \ldots, x_{ip})^T$ by a linear model

$$\theta_i = x_i^T \beta + b_i v_i, \quad i = 1, \ldots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \ldots, \beta_p)^T$ is a $p \times 1$ vector of regression coefficients, and $b_i$ are known positive constants. For making inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m,$$

where the sampling errors $e_i \sim N(0, \sigma_{ei}^2)$ and the $\sigma_{ei}^2$ are known. Combining both models gives

$$y_i = x_i^T \beta + b_i v_i + e_i, \quad i = 1, \ldots, m$$

(Rao 2003).

2. Basic unit level (type B) model

Unit level models include all models that relate the unit values of the study variable to unit-specific auxiliary variables. With unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \ldots, x_{ijp})^T$, the nested regression model is

$$y_{ij} = x_{ij}^T \beta + v_i + e_{ij}, \quad i = 1, \ldots, m, \; j = 1, \ldots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$.

Empirical Bayes Methods

The Bayesian approach is based on Bayes' Law, which was found by Thomas Bayes. The law was introduced by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781 Laplace gave the details and relevance for modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick, in Good (1980), mentioned that the Bayes method is difficult to adopt and is sometimes very sensitive, because it requires prior probability information which is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate its parameters from the sample. Rao (2003) said that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:

1. Obtain the posterior probability density function of the small area parameter.
2. Estimate the model parameters from the marginal density function.
3. Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma Models

The Poisson model is the standard model for count data. Count data often suffer from over-dispersion, so the Poisson formulation has been extended to accommodate extra variance in the sample data. Two-stage models for count data, known as Poisson-Gamma mixed models, have been introduced for this purpose. Wakefield (2006) introduced a Poisson-Gamma model which is easier to use, with the SMR (Standardized Mortality Ratio) as the direct estimator. This study used the Wakefield model with an alteration in the direct estimator.

Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij},$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

First stage: $y_i \mid \theta_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at area level, $x_i$ is a vector of covariates and $\beta$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, 1/\alpha)$ is assumed as the prior distribution, with mean 1 and variance $1/\alpha$. The marginal distribution of $y_i \mid \beta, \alpha$ is then negative binomial.

Moreover, Wakefield (2006) used Bayes' Theorem to obtain the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}\left(y_i + \alpha, \frac{1}{\mu_i + \alpha}\right)$$

and the EB estimator

$$\hat{\theta}_i^{EB} = \hat{\theta}_i^{B}(\hat{\beta}, \hat{\alpha}) = \hat{\gamma}_i \hat{\theta}_i + (1 - \hat{\gamma}_i)\hat{\mu}_i,$$

with $\hat{\gamma}_i = \hat{\mu}_i / (\hat{\alpha} + \hat{\mu}_i)$, where $\hat{\theta}_i$ is the direct estimate of $\theta_i$ and $y_i$ is the number of observations.
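As a rough illustration of the EB estimator, the weighted average can be written in a few lines of Python. This is a sketch that mirrors the weight and the posterior variance term `g1` used in the SAS listings of Appendix 6; the function names and the numerical inputs are hypothetical.

```python
def eb_estimate(y, mu_hat, alpha_hat):
    """EB estimate: weighted average of the direct estimate y and the
    regression estimate mu_hat, with weight mu_hat / (mu_hat + alpha_hat)."""
    w = mu_hat / (mu_hat + alpha_hat)
    return w * y + (1.0 - w) * mu_hat

def g1(y, mu_hat, alpha_hat):
    """Posterior variance term (alpha + y) / (mu + alpha)^2 used in the jackknife."""
    return (alpha_hat + y) / (mu_hat + alpha_hat) ** 2

# Hypothetical inputs: an observed zero is pulled towards the regression estimate.
print(eb_estimate(0, 2.0, 2.0))   # 1.0
print(eb_estimate(6, 2.0, 2.0))   # 4.0
print(g1(0, 2.0, 2.0))            # 0.125
```

A small estimated $\hat{\alpha}$ moves the weight towards the direct estimate, while a large $\hat{\alpha}$ shrinks the estimate towards $\hat{\mu}_i$.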

Negative Binomial Regression

The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data. Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance and moments. Later, Breslow (1990) cited Lawless' work. From its inception until the late 1980s, the negative binomial regression model was construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial distribution function is written as

$$g(y \mid x) = \frac{\Gamma(y + k^{-1})}{\Gamma(k^{-1})\,\Gamma(y + 1)} \left(\frac{k\mu}{1 + k\mu}\right)^{y} \left(\frac{1}{1 + k\mu}\right)^{k^{-1}},$$

where $y = 0, 1, 2, \ldots$, and $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\text{var}(y) = \mu + k\mu^2$. Here $k$ is the dispersion parameter, which indicates over-dispersion in the data.

Over-disperse at Count Data

Count data modeled by Poisson regression are said to be over-dispersed if the variance is bigger than the mean, that is, if the observed variance is larger than the variance expected under the model. This phenomenon is written as

$$Var(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).

Zero-Inflated Models

Zero-Inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but result in a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is given by (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + (1 - B(0)) \Pr(0).$$

The probability of a nonzero count is

$$\Pr(y = k;\ k > 0) = [1 - B(0)] \Pr(k).$$

This model produces two groups of parameters. One is the zero-inflation parameters, which show whether the covariates contribute significantly to having a zero outcome. The other is the negative binomial parameters, which model the response with the covariates.

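Zero-Inflated Negative Binomial

There are many kinds of zero-inflated models. Each has advantages and drawbacks and suits a different type of data. The zero-inflated negative binomial (ZINB) model is one of them. It is used for data that are both over-dispersed and contain excess zeros. Consequently, its parameter estimators include a parameter $k$ that indicates over-dispersion in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is as follows:

$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi_i + (1 - \varphi_i)\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi_i)\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi_i = \varphi(z_i^T\gamma)$ is a function of $z_i^T\gamma$, $z_i$ is a vector of zero-inflation covariates, and $\gamma$ is a vector of zero-inflation coefficients to be estimated. Meanwhile $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + k)}{\Gamma(k)\,\Gamma(y_i + 1)} \left(\frac{k}{k + \mu_i}\right)^{k} \left(\frac{\mu_i}{k + \mu_i}\right)^{y_i}.$$

The mean and variance of ZINB are

$$E(y_i \mid x_i) = \mu_i (1 - \varphi_i)$$

$$V(y_i \mid x_i) = \mu_i (1 - \varphi_i)\left(1 + \mu_i\left(\varphi_i + \tfrac{1}{k}\right)\right)$$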
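To make the ZINB formulas concrete, the following Python sketch evaluates the zero-inflated probabilities and checks the mean and variance by direct summation. It is only an illustration: the function names and the values of `mu`, `k` and `phi` are assumptions, with `phi` standing for the probability of the always-zero group and the count part having mean `mu` and variance `mu + mu**2 / k`.

```python
import math

def nb_pmf(y, mu, k):
    """Negative binomial probability of count y with mean mu and dispersion k."""
    log_p = (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
             + k * math.log(k / (k + mu)) + y * math.log(mu / (k + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, k, phi):
    """ZINB probability; phi is the probability of the always-zero group."""
    count_part = (1.0 - phi) * nb_pmf(y, mu, k)
    return phi + count_part if y == 0 else count_part

def zinb_moments(mu, k, phi, y_max=500):
    """Mean and variance by summing over 0..y_max-1 (the tail is negligible here)."""
    probs = [zinb_pmf(y, mu, k, phi) for y in range(y_max)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Illustrative values only, not taken from the simulation study.
mu, k, phi = 3.0, 2.0, 0.3
mean, var = zinb_moments(mu, k, phi)
print(mean, mu * (1 - phi))                            # numerical vs closed form
print(var, mu * (1 - phi) * (1 + mu * (phi + 1 / k)))  # numerical vs closed form
```

The zero probability is inflated relative to the plain negative binomial (`zinb_pmf(0, ...)` exceeds `nb_pmf(0, ...)` whenever `phi > 0`), while every positive count is scaled down by `1 - phi`.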

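Jackknife Method of Estimating MSE($\hat{\theta}_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method was introduced by Tukey (1958) and was developed into a method capable of correcting the bias of an estimator by removing observation $l$, for $l = 1, \dots, m$, and re-estimating the parameters.

According to Rao (2003), the jackknife steps for estimating MSE($\hat{\theta}_i^{EB}$) are:

1. Assume that $\hat{\theta}_i^{EB} = k_i(y_i, \hat{\beta}, \hat{\alpha})$ and $\hat{\theta}_{i,-l}^{EB} = k_i(y_i, \hat{\beta}_{-l}, \hat{\alpha}_{-l})$, then calculate

$$\hat{M}_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat{\theta}_{i,-l}^{EB} - \hat{\theta}_i^{EB}\right)^2$$

2. Calculate the delete-$l$ estimators $\hat{\beta}_{-l}$ and $\hat{\alpha}_{-l}$, then calculate

$$\hat{M}_{1i} = g_{1i}(\hat{\beta}, \hat{\alpha}, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[g_{1i}(\hat{\beta}_{-l}, \hat{\alpha}_{-l}, y_i) - g_{1i}(\hat{\beta}, \hat{\alpha}, y_i)\right]$$

Here $g_{1i}$ is the variance estimator of the posterior distribution, which is used to measure the variability associated with $\hat{\theta}_i$. Using $g_{1i}$ alone leads to severe underestimation of $MSE_J(\hat{\theta}_i^{EB})$ because it ignores the estimation of the prior parameters. The estimator $\hat{M}_{1i}$ therefore corrects the bias of $g_{1i}$.

3. Calculate the jackknife estimator of MSE($\hat{\theta}_i^{EB}$) as

$$MSE_J(\hat{\theta}_i^{EB}) = \hat{M}_{1i} + \hat{M}_{2i}$$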
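The three jackknife steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: `fit` is a hypothetical stand-in for the regression fit (intercept-only moment estimates instead of NB or ZINB regression), while `eb` and `g1` use the weighted-average and posterior-variance expressions that appear in the appendix programs. The counts in `ys` are made up.

```python
from statistics import mean, variance

def fit(ys):
    """Hypothetical stand-in for the model fit: intercept-only moment estimates."""
    mu = mean(ys)
    excess = variance(ys) - mu
    alpha = mu * mu / excess if excess > 0 else 1e6  # near-Poisson fallback
    return mu, alpha

def eb(y, mu, alpha):
    """EB estimate as a weighted average of the count and the fitted mean."""
    w = mu / (mu + alpha)
    return w * y + (1 - w) * mu

def g1(y, mu, alpha):
    """Posterior-variance term, written as in the appendix programs."""
    return (alpha + y) / (mu + alpha) ** 2

def jackknife_mse(ys):
    m = len(ys)
    mu, alpha = fit(ys)
    full_eb = [eb(y, mu, alpha) for y in ys]
    full_g1 = [g1(y, mu, alpha) for y in ys]
    sum_sq = [0.0] * m   # accumulates squared changes in the EB estimate
    sum_dg = [0.0] * m   # accumulates changes in g1
    for l in range(m):
        mu_l, alpha_l = fit(ys[:l] + ys[l + 1:])   # refit without area l
        for i, y in enumerate(ys):
            sum_sq[i] += (eb(y, mu_l, alpha_l) - full_eb[i]) ** 2
            sum_dg[i] += g1(y, mu_l, alpha_l) - full_g1[i]
    c = (m - 1) / m
    m2 = [c * s for s in sum_sq]
    m1 = [full_g1[i] - c * sum_dg[i] for i in range(m)]
    mse = [a + b for a, b in zip(m1, m2)]
    return mse, m1, m2

ys = [0, 2, 1, 7, 0, 3, 12, 1, 0, 4]   # made-up counts for ten areas
mse, m1, m2 = jackknife_mse(ys)
print(mse)
```

Note that `m2` is a sum of squares and cannot be negative, whereas `m1` is a bias-corrected difference, so nothing in the formula forces the total to stay positive.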

METHODOLOGY

Data

This research assumed that the available auxiliary data are at the area level, so the basic area level model was used. The data were simulated for 30 small areas with one covariate. Each batch was generated under a different excess-zero condition, with the probability of a zero in a small area ranging from 0.1 to 0.9. The relationship between the response and the covariate was assumed to be linear.

Methods

The data were generated in SAS 9.1 with the following steps:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $Lambda_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim Gamma(1, 1)$.
4. Calculate $starlambda_i = \log(Lambda_i / \theta_i)$.
5. Fit a linear regression of $starlambda_i$ on $X_i$ to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.
6. Calculate $\mu_i = \exp(X_i^T\hat{\beta})$.
7. Calculate $parmlambda_i = \mu_i \theta_i$.
8. Generate $y_i \sim Poisson(parmlambda_i)$.
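The eight generation steps can be mirrored in a short Python sketch using only the standard library. The covariate values, the seed and the helper names below are assumptions made for illustration; the thesis itself used SAS, and its covariate values are listed in Appendix 5.

```python
import math
import random

def poisson_draw(lam, rng):
    """Count unit-rate exponential arrivals in [0, lam]; the count is Poisson(lam)."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_areas(x, p_zero, rng):
    lam0 = -math.log(p_zero)                           # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]    # step 3: Gamma(1, 1)
    star = [math.log(lam0 / t) for t in theta]         # step 4
    n = len(x)                                         # step 5: least squares fit
    x_bar = sum(x) / n
    s_bar = sum(star) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (si - s_bar) for xi, si in zip(x, star)) / sxx
    b0 = s_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]          # step 6
    lam = [m * t for m, t in zip(mu, theta)]           # step 7
    y = [poisson_draw(l, rng) for l in lam]            # step 8
    return y, lam

rng = random.Random(2009)                    # arbitrary seed
x = [0.5 + 0.1 * i for i in range(30)]       # hypothetical covariate, step 1
y, lam = simulate_areas(x, p_zero=0.6, rng=rng)
print(sum(1 for v in y if v == 0) / len(y))  # realized share of zeros
```

The realized share of zeros in a batch generally differs from the nominal `p_zero`, which is why the appendix tables report an actual range of zeros next to each nominal probability.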

Moreover, the following steps were applied in analyzing the data:
1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the zero-inflated negative binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters, $\alpha$ and $\beta$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Root Relative Mean Square Error):

$$RRMSE(\hat{\theta}_i) = \frac{\sqrt{MSE(\hat{\theta}_i)}}{\hat{\theta}_i}$$

RESULT AND DISCUSSION

Estimation of Prior Parameter Based on EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced small and consistent MSE. However, when the proportion of zeros is approximately 30% or more, which corresponds to an expected probability of zero of 0.6, the estimates tend to be unreliable and the EB estimation produced negative values.

The RRMSE of the estimator increases as the number of zeros in the data increases. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -12875        0.33            -0.38           0.81
0.7                   253671        0.40            -1216           1.35
0.8                   -584495       0.30            30946           2.11
0.9                   39135606      0.16            1.16E+10        6.64

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   26197         0.33            -0.35           0.75
0.7                   950007        0.40            -1002           0.99
0.8                   1444250       0.30            22054           1.10
0.9                   41595285      0.16            6.77E+09        0.56


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE obtained after the negative MSE values have been changed to zero.
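The truncation rule and its effect on the summary statistics can be illustrated with a few lines of Python. The numbers are made up for the illustration and the function names are assumptions.

```python
import math

def truncate_negative(mse_values):
    """MSE (II): replace every negative MSE estimate by zero."""
    return [max(0.0, v) for v in mse_values]

def rrmse(mse, estimate):
    """Square root of the MSE divided by the estimate itself."""
    return math.sqrt(mse) / estimate

mse = [0.2, -3.0, 0.4]               # made-up jackknife MSE values
mse_ii = truncate_negative(mse)

print(sum(mse) / len(mse))           # mean before truncation (negative here)
print(sum(mse_ii) / len(mse_ii))     # mean after truncation is larger
print(rrmse(mse_ii[2], 2.0))         # positive estimate gives a positive RRMSE
print(rrmse(mse_ii[2], -2.0))        # negative estimate gives a negative RRMSE
```

Truncation removes the negative terms that would otherwise pull the mean down, so the mean of MSE (II) can never be smaller than the mean of the untruncated MSE. The RRMSE can still be negative after truncation, because its sign follows the sign of the estimate in the denominator.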

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the expected probability of zero is 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because some EB estimates are negative.

Estimation of Prior Parameter Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to the estimates produced by the NBR method, although they are slightly outperformed by the NBR method when the data contain only a small number of zeros. In particular, as shown in Table 3, when the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a larger MSE for the EB estimator than NBR does.

When the data have an expected probability of zero of 0.6 to 0.7, however, ZINB gives better estimates. The estimates were also unbiased, as they cover the parameter values adequately. ZINB begins to produce inconsistent estimates when the expected probability of zero is 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is larger than the mean of MSE with ZINB. This is because, once the negative MSE values are changed to zero, they no longer reduce the mean.

Comparison of EB Estimator with Negative Binomial Regression and EB Estimator with ZINB

The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a larger MSE than NBR as long as the expected probability of zero in the data does not exceed the 0.6 threshold.

The ZINB method performs better when the data have an expected probability of zero of 0.6 to 0.7. In this case the EB estimates given by the NBR method are unstable and inconsistent because of negative estimates and a huge MSE that can be thousands of times larger than an acceptable value. The EB estimator with ZINB, on the other hand, works well: it gives unbiased estimates and its MSE values are more stable than those of the EB estimator with NBR.

Both methods perform poorly when the data have an expected probability of zero of 0.8 or more. The EB estimators of both methods were inconsistent because of the very large MSE values they produced.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            7322807         0.66
0.6                   0.61          0.38            29817           1.03
0.7                   0.58          0.25            218119          1.94
0.8                   -1.28         -1.4E-07        162697          3.75
0.9                   2954790       -1E-06          3.5E+278        6.09508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            6561235         0.58
0.6                   0.75          0.38            23406           0.70
0.7                   1.50          0.25            134655          0.69
0.8                   1.75          0               733506          0
0.9                   2954908       0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method such as negative binomial regression for estimating the prior parameters produced biased and unreliable EB estimators for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator even when the data had an expected probability of zero of 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. This means that the performance of ZINB declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research is based on many assumptions and has several limitations. If the assumptions and limitations can be relaxed, better results can be expected. Some recommendations for further research are:
1. The data-generating process in this research does not reflect a real sampling process. A generating process similar to real sampling might give better results because it would be closer to real applications.

2. It would be interesting to run experiments with a larger number of areas, since the number of areas influences the data modeling.

3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to resolve the negative values of MSE.

4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in an Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008: 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Tesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE -100569 0194196 0271669 041875 045917 3239598RRMSE 0123605 0300339 0422426 0503566 0642418 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE -237643 0235733 0306641 0452955 05091 3652167RRMSE 0038956 0412708 0588924 0717336 0844735 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE -749011 0254097 0330078 -12875 0539873 2354887RRMSE -663045 051763 0813734 -038057 1287528 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE -7513075 0235378 0402092 2536714 0876569 6051162RRMSE -10741 0704796 1355566 -121606 3040291 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601

bias 0000395 0116669 0254473 1091172 0497898 6297454MSE -6E+09 -016583 0301527 -584495 5718409 185E+09RRMSE -212936 0927338 2115163 3094627 1359703 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655MSE -38E+09 -130817 0159682 39135606 3074073 12E+11

RRMSE -909131 1647188 6639631 116E+10 1585472 706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE -053954 010933 0168797 0449506 0369775 360843RRMSE 0022947 0096443 0136424 0238099 0241955 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE -07309 0126202 0201463 0425844 0414597 1734815RRMSE 0021807 0144983 0210692 0326097 0401786 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE -229891 0156942 0277017 0707983 0590466 7469014RRMSE 0023998 0210095 0317195 0519524 0618802 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE -125713 0181557 0284338 0540615 0498521 423089RRMSE 0054916 028362 0420396 0630776 0778033 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442

MSE -181856 0194818 0334706 0859252 0711939 7997074RRMSE 0026206 0387294 0662251 7322807 1312302 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE -34589 0078006 0376514 060793 0804116 3426488RRMSE 000461 0502807 1033578 2981671 2012552 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE -142213 -001433 0255331 0584152 1132152 264456RRMSE 0064209 0847956 1942286 2181192 4589042 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE -10651 -56E-05 -14E-07 -127819 1452962 1132741RRMSE 0063244 1475413 3754705 162697 9221163 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE -175652 -33E-05 -1E-06 2954790 152E-06 613E+08

RRMSE 0040681 4059441 6095076 35E+278 5569021 16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE 0 0194196 0271669 0419181 045917 3239598RRMSE 0 0300116 0422209 0502895 0641904 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE 0 0235733 0306641 0456258 05091 3652167RRMSE 0 0410357 0585765 0712314 0841838 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE 0 0254097 0330078 2619677 0539873 2354887RRMSE -663045 0448118 0750369 -034911 1209918 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE 0 0235378 0402092 9500073 0876569 6051162RRMSE -10741 0288999 0995659 -100163 2527784 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601bias 0000395 0116669 0254473 1091172 0497898 6297454MSE 0 0 0301527 1444250 5718409 185E+09RRMSE -212936 0 1104113 2205437 5656681 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655

MSE 0 0 0159682 41595285 3074073 12E+11

RRMSE -909131 0 0557622 677E+09 9311925 706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE 0 010933 0168797 0450626 0369775 360843RRMSE 0 0095932 0135647 023675 0239669 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE 0 0126202 0201463 0428006 0414597 1734815RRMSE 0 0142648 020709 0320663 0395479 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE 0 0156942 0277017 0716543 0590466 7469014RRMSE 0 0203913 0311937 0506882 0615401 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE 0 0181557 0284338 0549835 0498521 423089RRMSE 0 0270309 0405926 0606317 0766631 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442MSE 0 0194818 0334706 094973 0711939 7997074RRMSE 0 0316402 0576343 6561235 1240175 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE 0 0078006 0376514 0749436 0804116 3426488RRMSE 0 0258286 0698814 2340612 1714808 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE 0 0 0255331 1501268 1132152 264456RRMSE 0 0 0688797 1346552 2500825 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE 0 0 0 1755486 1452962 1132741RRMSE 0 0 0 7335062 3311711 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE 0 0 0 2954908 152E-06 613E+08

RRMSE 0 0 0 12E+278 416189 16E+281


Appendix 5 Syntax program for generate data

data b; /* generate x1 (covariate) and ei */
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

data e; /* generate poisson-gamma with excess zero */
  do kk=1 to 30;
    set b;
    tetha = rangam(1,1);
    lambda = -log(0.1); /* desired probability of a zero outcome (0.1-0.9) */
    starlambda = log(lambda/tetha);
    output;
  end;
run;

proc reg;
  model starlambda = x1;
  ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;

data _null_;
  set work.betha_lr_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
run;

data d;
  do kk=1 to 30;
    set e;
    mu = exp(&Intercept + &x1*x1);
    parmlambda = mu*tetha;
    ypoi = rand('poisson', parmlambda);
    output;
  end;
run;

ods trace on; /* to take percent zero on data */
proc freq data=d;
  tables ypoi;
  ods output OneWayFreqs=work.zero;
run;

data zero;
  set zero;
  keep percent;
run;

proc transpose data=zero out=zero1;
run;

data _null_;
  set work.zero1;
  call symput('pctz', col1);
run;

data d;
  set d;
  pzero=&pctz;
  r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data


Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
  set work.betha_nb_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
  set a;
  mu_hat_b = exp(&Intercept + &x1*x1);
  w_bayes = mu_hat_b/(mu_hat_b + &Dispersion);
  teta_hat_bayes = w_bayes*ypoi + (1-w_bayes)*mu_hat_b;
  g1 = (&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
  bias_b = abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
  set work.duga_nb1;
  if ^(u=&x) then delete;
run;

data work.jacknb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
  /* output p out=sas.yi_est */
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_est_nb&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;

data _null_;
  set work.betha_est_nbt&h;
  call symput('Intercept_', col1);
  call symput('x1_', col2);
  call symput('Dispersion_', col3);
run;

data work.duganb&h;
  set work.d;
  mu_hat_b_&h = exp(&Intercept_ + &x1_*x1);
  w_b_&h = mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
  teta_hat_&h = w_b_&h*ypoi + (1-w_b_&h)*mu_hat_b_&h;
  delta_&h = (teta_hat_&h - teta_hat_bayes)**2;
  g1_&h = (&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
  beda_g_&h = g1_&h - g1;
run;

data work.mse_nb_j;
  /* merges work.duganb1 through work.duganb30 */
  merge %do j=1 %to 30; work.duganb&j %end; ;
  by kk;
run;

data work.mse_nb_j;
  set work.mse_nb_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum = g_sum + beda_g_&j;
    t_sum = t_sum + delta_&j;
  %end;
  m2i = ((30-1)/30)*t_sum;
  m1i = g1 - ((30-1)/30)*g_sum;
  mse_j_b = m1i + m2i;
  rrmse_j_b = sqrt(mse_j_b)/teta_hat_bayes;
  ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
  merge work.d work.mse_nb_j;
  keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
run;

%END;
%mend;

%sae_nb


Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

proc countreg data=a;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
  set work.pe_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.duga;
  set a;
  mu_hat_b = exp(&Intercept + &x1*x1);
  w_bayes = mu_hat_b/(mu_hat_b + &_Alpha);
  teta_hat_bayes = w_bayes*ypoi + (1-w_bayes)*mu_hat_b;
  g1 = (&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
  bias_b = abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
  set work.duga1;
  if ^(u=&x) then delete;
run;

data work.jackzinb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
  set work.betha_est_ZINBt&h;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
  set work.d;
  mu_hat_b_&h = exp(&Intercept + &x1*x1);
  /* mu_hat_b_&h = &b_o- + &b_1- x1 */
  w_b_&h = mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
  teta_hat_&h = w_b_&h*ypoi + (1-w_b_&h)*mu_hat_b_&h;
  delta_&h = (teta_hat_&h - teta_hat_bayes)**2;
  /* g1_&h =((mu_hat_b_&h2&alpha_)2)(&alpha_+y_i)((mu_hat_b_&h2&alpha_)+mu_hat_b_&h)2 */
  g1_&h = (&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
  /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
  beda_g_&h = g1_&h - g1;
run;

data work.mse_ZINB_j;
  /* merges work.dugaZINB1 through work.dugaZINB30 */
  merge %do j=1 %to 30; work.dugaZINB&j %end; ;
  by kk;
run;

data work.mse_ZINB_j;
  set work.mse_ZINB_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum = g_sum + beda_g_&j;
    t_sum = t_sum + delta_&j;
  %end;
  m2i = ((30-1)/30)*t_sum;
  m1i = g1 - ((30-1)/30)*g_sum;
  mse_j_b = m1i + m2i;
  rrmse_j_b = sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
  merge work.d work.mse_ZINB_j;
  keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB BASE=work.hasilZINB1;
run;

%END;
%mend;

%sae_zinb


INTRODUCTION

Background

Direct estimation is usually applied in large-scale surveys, but such estimators are sometimes difficult to use in a smaller region, especially when the sample size is too small. In this case indirect estimation, which adds covariates to estimate the parameter, is usually used. This type of estimation is broadly known as Small Area Estimation.

Kismiantini (2007) conducted research on Small Area Estimation based on Poisson-Gamma models. Maximum Likelihood Estimation with Negative Binomial Regression techniques was used to estimate the respective prior parameters. Negative Binomial Regression was also used to resolve the over-dispersion problem in the data.

In reality, count data are characterized not only by over-dispersion but sometimes also by excess zeros. Excess zero is a condition in which the data contain more zeros than the distribution predicts. For example, among 100 observations from a Poisson model with a response mean of 4, we would expect about 2 zeros. If the data have 30 zeros, it should be obvious that the distributional assumptions have been violated, so the estimated parameters and standard errors will be biased (Hardin & Hilbe 2007). In this paper, zero-inflated models were adapted to solve this type of problem.
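The figure of about 2 expected zeros follows from the Poisson probability of a zero, $e^{-\mu}$, multiplied by the number of observations. A small Python check (the function name is an assumption made for this illustration):

```python
import math

def expected_zeros(n, mean):
    """Expected number of zeros among n independent Poisson(mean) observations."""
    return n * math.exp(-mean)

print(round(expected_zeros(100, 4), 2))  # 1.83, that is, about 2 zeros
```

Observing 30 zeros where fewer than 2 are expected is the kind of discrepancy that signals excess zeros.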

Objectives

The research objectives are:

1 To investigate the performance of Negative Binomial Regression on Small Area Estimation in case of excess-zero

2 To apply Zero-Inflated Count Models on Small Area Estimation in case of excess-zero

3 To evaluate the performance of Zero-Inflated Count Models in estimating prior parameter for Small Area Estimation

LITERATURE REVIEW

Direct Estimation

Direct estimates are generally "design based" in the sense that they make use of "survey weights", and the associated inferences are based on the probability distribution induced by the sample design, with the population values held fixed (Rao 2003). In particular, direct estimates of a domain parameter are based only on the domain-specific sample data.

Data from sample surveys have long been used to obtain reliable parameter estimates. Ramsini et al. (2001) mentioned that direct estimates for small areas are unbiased, although they have large variance because of the small sample size.

Small Area Estimation

The term small area can refer to anything, depending on the object of interest. It can be a city, an age group, a sex group, a region, or a rural district. In general, a small area denotes any domain for which direct estimates with adequate precision cannot be produced (Rao 2003). This happens because the sample size in the small area is too small. Small area estimation was therefore developed as a statistical technique for estimating small area parameters with an adequate level of precision. It works as indirect estimation that borrows strength from the values of the variable of interest in related areas, through supplementary information related to that variable such as recent census counts and current administrative records (Rao 2003). Indirect estimation is the process of estimating a domain's parameter by connecting the information in that domain with other domains through an appropriate model, so the estimator also uses data from other domains (Kurnia & Notodiputro 2006).

Small Area Models

There are two kinds of link models in indirect estimation. The first is the traditional approach based on implicit models, which link related small areas through supplementary data. The second is explicit small area models, which make specific allowance for between-area variation (Rao 2003). This research used the second kind, which can be classified into two broad types of basic model:

1. Basic area level (type A) model

Basic area level models, or aggregate models, include all models that relate small area quantities to area-specific auxiliary variables. These models are essential if unit (element) level data are not available. The parameter of interest $\theta_i$ is assumed to be related to the area-specific auxiliary data or covariates $x_i = (x_{1i}, \dots, x_{pi})^T$ by a linear model

$$\theta_i = x_i^T\beta + b_i v_i, \quad i = 1, \dots, m,$$

where $v_i \sim N(0, \sigma_v^2)$ are area-specific random effects, $\beta = (\beta_1, \dots, \beta_p)^T$ is a $p \times 1$ vector of regression coefficients, and $b_i$ are known positive constants. For making inferences about $\theta_i$, direct estimators $y_i$ are assumed to be available, with

$$y_i = \theta_i + e_i, \quad i = 1, \dots, m,$$

where the sampling errors $e_i \sim N(0, \sigma_{ei}^2)$ and $\sigma_{ei}^2$ are known. Combining both models gives the new model

$$y_i = x_i^T\beta + b_i v_i + e_i, \quad i = 1, \dots, m$$

(Rao 2003).

2. Basic unit level (type B) model

Unit level models include all models that relate unit values of the study variable to unit-specific auxiliary variables. With unit-specific auxiliary variables $x_{ij} = (x_{ij1}, \dots, x_{ijp})^T$, a nested regression model is assumed:

$$y_{ij} = x_{ij}^T\beta + v_i + e_{ij}, \quad i = 1, \dots, m, \quad j = 1, \dots, n_i,$$

with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$.

Empirical Bayes Methods

The Bayesian approach is based on Bayes' Law, which was found by Thomas Bayes. The law was introduced by Richard Price in 1763, two years after Thomas Bayes passed away. In 1774 and 1781, Laplace gave the details and their relevance for modern Bayesian statistics (Gill 2002 in Kismiantini 2007).

Novick in Good (1980) mentioned that the Bayes method is difficult to adopt and is sometimes very sensitive because it requires prior probability information, which is usually difficult to obtain. Robbins (1955) introduced Empirical Bayes methods, which assume a particular prior distribution and estimate it from the sample. Rao (2003) said that EB (Empirical Bayes) and HB (Hierarchical Bayes) methods are suitable for binary and count data in Small Area Estimation. Therefore the EB method was used in this research.

Rao (2003) summarized EB methods in Small Area Estimation as follows:
1. Obtain the posterior probability density function of the small area parameters.
2. Estimate the model parameters from the marginal density function.
3. Use the estimated posterior density for inferences regarding the parameters of interest.

Poisson-Gamma ModelsPoisson model is a standard model in

dealing with count data Generally count data can be suffered by over-dispersion problem Therefore a Poisson formula had been developed to accommodate extra variance from sample data Two-stage models have been introduced for count data known as mixed model Poisson-Gamma Wakefield (2006) introduced Poisson-Gamma model which was easier to use with SMR (Standard Mortality Ratio) as a direct estimator This study used Wakefield model with alteration in direct estimator

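Let $y_i$ be the number of individuals in small area $i$ that have the characteristic of interest, written as

$$y_i = \sum_j y_{ij}$$

where $y_{ij}$ is the $j$-th object in the $i$-th small area, with $j = 1, \ldots, n$ and $i = 1, \ldots, m$.

First stage: $y_i \mid \theta_i \overset{ind}{\sim} \text{Poisson}(\mu_i \theta_i)$ is assumed, where $\mu_i = \mu(x_i, \beta)$ describes a regression model at the area level, $x_i$ is a vector of covariates and $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector of regression coefficients.

Second stage: $\theta_i \overset{iid}{\sim} \text{gamma}(\alpha, 1/\alpha)$ (shape $\alpha$, scale $1/\alpha$) is assumed as a prior distribution with mean 1 and variance $1/\alpha$.

Then the marginal distribution $y_i \mid \beta, \alpha$ is negative binomial.

Moreover, Wakefield (2006) used Bayes' Theorem and obtained the posterior distribution

$$\theta_i \mid y_i, \beta, \alpha \sim \text{gamma}\left(y_i + \alpha,\ \frac{1}{\mu_i + \alpha}\right)$$

and the EB estimator

$$\hat\theta_i^{EB} = \hat\theta_i^{B}(\hat\beta, \hat\alpha) = \hat\gamma_i \hat\theta_i + (1 - \hat\gamma_i)\,\hat\mu_i$$

with $\hat\gamma_i = \hat\mu_i / (\hat\alpha + \hat\mu_i)$, where $\hat\theta_i = y_i$ is the direct estimate of $\theta_i$ and $y_i$ is the number of observations.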
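The EB estimator above can be illustrated with a minimal Python sketch. It mirrors the weight, the shrinkage formula and the posterior-variance term as they are coded in Appendix 6; the function name eb_poisson_gamma and the example numbers are hypothetical, and the direct estimate is assumed to be the observed count itself.

```python
def eb_poisson_gamma(y, mu_hat, alpha_hat):
    """Return (theta_eb, g1, gamma) for one small area.

    y         : observed count, used as the direct estimate
    mu_hat    : fitted regression mean for the area
    alpha_hat : estimated gamma prior parameter
    """
    # Shrinkage weight given to the direct estimate
    gamma = mu_hat / (alpha_hat + mu_hat)
    # EB estimate: weighted average of direct estimate and regression mean
    theta_eb = gamma * y + (1.0 - gamma) * mu_hat
    # Posterior variance term, as coded for g1 in Appendix 6
    g1 = (alpha_hat + y) / (alpha_hat + mu_hat) ** 2
    return theta_eb, g1, gamma


# Illustrative values only
theta_eb, g1, gamma = eb_poisson_gamma(y=3, mu_hat=2.0, alpha_hat=1.0)
print(theta_eb, g1, gamma)
```

A large estimated alpha pulls the estimate towards the regression mean, while a small alpha leaves it close to the observed count.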

Negative Binomial Regression
The negative binomial regression model seems to have been first discussed by Anscombe (1972). Others have pointed out its success in dealing with over-dispersed count data.

Lawless (1987) elaborated the mixture model parameterization of the negative binomial, providing formulas for its log likelihood, mean, variance and moments. Later, Breslow (1990) cited Lawless' work, and from its inception to the late 1980s the negative binomial regression model has been construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin & Hilbe 2007). The negative binomial distribution function is written as

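$$g(y \mid x) = \frac{\Gamma(y + 1/k)}{\Gamma(y + 1)\,\Gamma(1/k)} \left(\frac{k\mu}{1 + k\mu}\right)^{y} \left(\frac{1}{1 + k\mu}\right)^{1/k}$$

where $y = 0, 1, 2, \ldots$; $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\text{var}(y) = \mu + k\mu^2$. $k$ is called the dispersion parameter; it shows that the data are over-dispersed.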

Over-dispersion in Count Data
Count data in Poisson regression are said to be over-dispersed if the variance is bigger than the mean, that is, if the observed variance is larger than the variance expected under the Poisson model. This phenomenon is written as

$$\text{Var}(y_i) > E(y_i)$$ (McCullagh & Nelder 1989)
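As a rough illustration of this condition, the sketch below compares the sample variance with the sample mean of a set of counts. It is only a crude check that ignores covariates; the function name dispersion_ratio and the counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1
    suggest over-dispersion relative to the Poisson model."""
    return variance(counts) / mean(counts)


# Illustrative counts with many zeros and one large value
counts = [0, 0, 0, 1, 2, 7, 0, 3]
print(dispersion_ratio(counts))
```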

Zero-Inflated Models
Zero-inflated models consider two distinct sources of zero outcomes. One source is generated from individuals who do not enter into the counting process; the other from those who do enter the counting process but result in a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is then given by (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + (1 - B(0)) \Pr(0)$$

The probability of a nonzero count is

$$\Pr(y = k;\ k > 0) = [1 - B(0)] \Pr(k)$$

This model produces two groups of parameters. One is the group of zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The other is the group of negative binomial parameters, which model the response with the covariates.

Zero-Inflated Negative Binomial
There are many kinds of zero-inflated models. Each has advantages and disadvantages and suits a different type of data. The Zero-Inflated Negative Binomial (ZINB) model is one of them; it is used for over-dispersed data with excess zeros. As a result, the parameter estimates include a parameter $k$ which indicates that over-dispersion occurs in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is as follows

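$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi_i + (1 - \varphi_i)\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi_i)\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi_i$ is a function of $z_i'\gamma$, $z_i$ is a vector of zero-inflation covariates and $\gamma$ is a vector of zero-inflation coefficients which will be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution with dispersion parameter $\alpha$, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + \alpha^{-1})}{\Gamma(y_i + 1)\,\Gamma(\alpha^{-1})} \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu_i}\right)^{\alpha^{-1}} \left(\frac{\mu_i}{\alpha^{-1} + \mu_i}\right)^{y_i}$$

The mean and variance of ZINB are

$$E(y_i \mid x_i) = \mu_i (1 - \varphi_i)$$

$$V(y_i \mid x_i) = \mu_i (1 - \varphi_i)\,\bigl(1 + \mu_i(\varphi_i + \alpha)\bigr)$$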
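The ZINB distribution and its moments can be checked numerically with a short Python sketch. It assumes the mean-dispersion form of the negative binomial with dispersion alpha and a fixed zero-inflation probability phi for a single area; the function names and parameter values are hypothetical.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2."""
    inv = 1.0 / alpha
    log_p = (math.lgamma(y + inv) - math.lgamma(inv) - math.lgamma(y + 1)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, phi):
    """Zero-inflated NB pmf: extra mass phi at zero, NB scaled by (1 - phi)."""
    base = (1.0 - phi) * nb_pmf(y, mu, alpha)
    return phi + base if y == 0 else base


# Illustrative parameters only
mu, alpha, phi = 2.0, 0.5, 0.3
support = range(0, 400)  # far enough into the tail for these values
total = sum(zinb_pmf(y, mu, alpha, phi) for y in support)
mean_ = sum(y * zinb_pmf(y, mu, alpha, phi) for y in support)
var_ = sum((y - mean_) ** 2 * zinb_pmf(y, mu, alpha, phi) for y in support)
print(total, mean_, var_)
```

For these values the numerical mean and variance can be compared with mu(1 - phi) and mu(1 - phi)(1 + mu(phi + alpha)).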

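Jackknife Method of Estimating MSE($\hat\theta_i^{EB}$)

The jackknife is one of the general methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method has been known since Tukey (1958) and has developed into a method capable of correcting the bias of an estimator by removing observation $l$, for $l = 1, \ldots, m$, and re-estimating the parameters each time.

According to Rao (2003), the jackknife steps to estimate MSE($\hat\theta_i^{EB}$) are:

1. Let $\hat\theta_i^{EB} = k_i(y_i, \hat\beta, \hat\alpha)$ and $\hat\theta_{i,-l}^{EB} = k_i(y_i, \hat\beta_{-l}, \hat\alpha_{-l})$, then calculate

$$\hat M_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat\theta_{i,-l}^{EB} - \hat\theta_i^{EB}\right)^2$$

2. Calculate the delete-$l$ estimators $\hat\beta_{-l}$ and $\hat\alpha_{-l}$, then calculate

$$\hat M_{1i} = g_{1i}(\hat\beta, \hat\alpha, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[ g_{1i}(\hat\beta_{-l}, \hat\alpha_{-l}, y_i) - g_{1i}(\hat\beta, \hat\alpha, y_i) \right]$$

Here $g_{1i}(\hat\beta, \hat\alpha, y_i)$ is the variance estimator of the posterior distribution, which is used to measure the variability associated with $\hat\theta_i$. Using $g_{1i}$ alone leads to severe underestimation of $MSE_J(\hat\theta_i^{EB})$ because it ignores the estimation of the prior parameters. Therefore, the estimator $\hat M_{1i}$ corrects the bias of $g_{1i}$.

3. Calculate the jackknife estimator of MSE($\hat\theta_i^{EB}$) as

$$MSE_J(\hat\theta_i^{EB}) = \hat M_{1i} + \hat M_{2i}$$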
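The three jackknife steps can be sketched in Python as follows. The regression fit is replaced by a hypothetical intercept-only method-of-moments fit (fit_moments), which is an assumption made only to keep the example self-contained; it is not the NBR or ZINB fit used in this research. The formulas for the EB estimate and for g1 follow Appendix 6.

```python
from statistics import mean, variance


def fit_moments(y):
    """Hypothetical stand-in for the regression fit: returns (mu, alpha)
    from an intercept-only method of moments, var = mu + mu**2 / alpha."""
    mu = mean(y)
    extra = variance(y) - mu                 # variance beyond the Poisson part
    alpha = mu * mu / extra if extra > 0 else 1e6  # large alpha: almost Poisson
    return mu, alpha


def eb(y_i, params):
    mu, alpha = params
    gamma = mu / (alpha + mu)
    return gamma * y_i + (1.0 - gamma) * mu


def g1(y_i, params):
    mu, alpha = params
    return (alpha + y_i) / (alpha + mu) ** 2


def jackknife_mse(y, fit=fit_moments):
    """Jackknife MSE of the EB estimate for every area."""
    m = len(y)
    full = fit(y)
    theta_full = [eb(v, full) for v in y]
    g1_full = [g1(v, full) for v in y]
    sq_diff = [0.0] * m   # accumulates (theta_{i,-l} - theta_i)^2 over l
    g_diff = [0.0] * m    # accumulates g1_{i,-l} - g1_i over l
    for l in range(m):
        params_l = fit(y[:l] + y[l + 1:])    # refit without area l
        for i in range(m):
            sq_diff[i] += (eb(y[i], params_l) - theta_full[i]) ** 2
            g_diff[i] += g1(y[i], params_l) - g1_full[i]
    f = (m - 1) / m
    m1 = [g1_full[i] - f * g_diff[i] for i in range(m)]
    m2 = [f * sq_diff[i] for i in range(m)]
    return [a + b for a, b in zip(m1, m2)]


# Illustrative counts only
mse = jackknife_mse([0, 1, 0, 3, 2, 5, 0, 1, 4, 2])
print(mse)
```

Because the bias-corrected first term is a difference, the result is not guaranteed to be positive.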

METHODOLOGY

Data
This research assumed that the available auxiliary data are at the area level, so the basic area level model was used. The data were simulated with 30 small areas and one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. This research assumed that the relationship between the response and the covariate was linear.

Methods
The following steps were used to generate the data with SAS 9.1:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $Lambda_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim \text{Gamma}(1, 1)$.
4. Calculate $\lambda_i^* = \log(Lambda_i / \theta_i)$.
5. Fit a linear regression between $\lambda_i^*$ and $X_i$ to obtain $\hat\beta_0$ and $\hat\beta_1$.
6. Calculate $\mu_i = \exp(X_i'\hat\beta)$.
7. Calculate $parmlambda_i = \mu_i \theta_i$.
8. Generate $y_i \sim \text{Poisson}(parmlambda_i)$.
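The generating steps can be sketched in Python using only the standard library. The covariate values, the seed and the helper names (poisson_draw, simulate_batch) are hypothetical; the original x1 values from Appendix 5 are not reproduced here, and Knuth's multiplication method is used for the Poisson draws as an assumption that suits the small rates in this design.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_batch(x, p_zero, seed=0):
    """Follow steps 1 to 8 for one batch; returns (y, parmlambda)."""
    rng = random.Random(seed)
    lam = -math.log(p_zero)                              # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]      # step 3
    star = [math.log(lam / t) for t in theta]            # step 4
    # Step 5: simple linear regression of star on x
    x_bar = sum(x) / len(x)
    s_bar = sum(star) / len(star)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (si - s_bar) for xi, si in zip(x, star)) / sxx
    b0 = s_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]            # step 6
    rate = [m * t for m, t in zip(mu, theta)]            # step 7
    y = [poisson_draw(r, rng) for r in rate]             # step 8
    return y, rate


# Hypothetical covariate for 30 areas and a target zero probability of 0.6
x = [0.1 * i for i in range(1, 31)]
y, rate = simulate_batch(x, p_zero=0.6, seed=1)
print(sum(1 for v in y if v == 0) / len(y))
```

The printed value is the realised proportion of zeros in the batch, which varies around the target from batch to batch.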

Moreover, the following steps were applied in analyzing the data:
1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the Zero-Inflated Negative Binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters, which are $\alpha$ and $\beta$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Relative Root Mean Square Error):

$$RRMSE(\hat\theta_i) = \frac{\sqrt{MSE(\hat\theta_i)}}{\hat\theta_i}$$

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression
For data without excess zeros, the estimator produced small and consistent MSE. Meanwhile, if the proportion of excess zeros is approximately 30% or more, with an expected probability of zero of 0.6, the estimates tend to be unreliable. As a result, EB estimation produced negative values.

The RRMSE of the estimator increases along with the number of zeros in the data. Furthermore, if the data contain at least 30% excess zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero | Mean of MSE | Median of MSE | Mean of RRMSE | Median of RRMSE
0.1 | 033 | 016 | 018 | 013
0.2 | 035 | 020 | 026 | 020
0.3 | 040 | 023 | 036 | 030
0.4 | 042 | 027 | 050 | 042
0.5 | 045 | 031 | 072 | 059
0.6 | -12875 | 033 | -038 | 081
0.7 | 253671 | 040 | -1216 | 135
0.8 | -584495 | 030 | 30946 | 211
0.9 | 39135606 | 016 | 116E+10 | 664

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero | Mean of MSE | Median of MSE | Mean of RRMSE | Median of RRMSE
0.1 | 033 | 016 | 018 | 013
0.2 | 035 | 020 | 026 | 020
0.3 | 040 | 023 | 036 | 030
0.4 | 042 | 027 | 050 | 042
0.5 | 046 | 031 | 071 | 058
0.6 | 26197 | 033 | -035 | 075
0.7 | 950007 | 040 | -1002 | 099
0.8 | 1444250 | 030 | 22054 | 110
0.9 | 41595285 | 016 | 677E+09 | 056


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative values of MSE have been changed to zero.
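The truncation of negative MSE values and the resulting RRMSE can be sketched as below. The function name rrmse and its truncate flag are hypothetical; returning not-a-number for an untruncated negative MSE is an assumption of this sketch.

```python
import math


def rrmse(mse, theta_hat, truncate=True):
    """sqrt(MSE) / estimate; a negative MSE is set to zero when truncate is True."""
    if mse < 0:
        if not truncate:
            return math.nan       # square root of a negative MSE is undefined
        mse = 0.0                 # the "(II)" variant: negative MSE becomes zero
    return math.sqrt(mse) / theta_hat


# Illustrative values only
print(rrmse(0.25, 2.0))           # ordinary case
print(rrmse(-1.0, 2.0))           # truncated case
```

The sign of the result still follows the sign of the estimate, so a negative EB estimate gives a negative RRMSE even after truncation.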

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the data have an expected probability of zero of 0.8 to 0.9. However, when the data have an expected probability of zero of 0.6 to 0.7, the mean of RRMSE (II) is negative because of the negative values of the EB estimates.

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression
The EB estimates are similar to those produced by the NBR method, although they are slightly outperformed by the NBR method when the data contain only a small number of zeros. In particular, as shown in Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a bigger MSE for the EB estimator than NBR does.

Whereas if the data have an expected probability of zero of 0.6 to 0.7, ZINB gives better estimates. The estimates were also unbiased, as they cover the parameter values adequately. However, ZINB begins to produce inconsistent estimates if the data have an expected probability of zero of 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is bigger than the mean of MSE with ZINB. That is because, once the negative values of MSE are changed to zero, they no longer reduce the mean.

Comparison of EB Estimator with Negative Binomial Regression and EB Estimator with ZINB
The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a bigger MSE than NBR does as long as the expected probability of zero in the data does not exceed the 0.6 threshold.

The ZINB method performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case, the EB estimates given by the NBR method are unstable and inconsistent because of their negative values and a huge MSE that can be thousands of times larger than an acceptable value. On the other hand, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimates with NBR.

Both methods performed poorly when the data had an expected probability of zero of 0.8 or more. The EB estimators of both methods were inconsistent as a result of the very large MSE values they produced.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero | Mean of MSE | Median of MSE | Mean of RRMSE | Median of RRMSE
0.1 | 045 | 017 | 024 | 014
0.2 | 043 | 020 | 033 | 021
0.3 | 071 | 028 | 052 | 032
0.4 | 054 | 028 | 0632 | 042
0.5 | 086 | 033 | 7322807 | 066
0.6 | 061 | 038 | 29817 | 103
0.7 | 058 | 025 | 218119 | 194
0.8 | -128 | -14E-07 | 162697 | 375
0.9 | 2954790 | -1E-06 | 35E+278 | 609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero | Mean of MSE | Median of MSE | Mean of RRMSE | Median of RRMSE
0.1 | 045 | 017 | 024 | 014
0.2 | 0436 | 020 | 0324 | 021
0.3 | 072 | 028 | 051 | 0311
0.4 | 055 | 028 | 061 | 041
0.5 | 095 | 033 | 6561235 | 058
0.6 | 075 | 038 | 23406 | 070
0.7 | 150 | 025 | 134655 | 069
0.8 | 175 | 0 | 733506 | 0
0.9 | 2954908 | 0 | 12E+278 | 0

CONCLUSION

Excess zeros in the data highly influenced the results of EB estimation. A conventional method such as negative binomial regression in prior estimation produced an unbiased but unreliable EB estimator for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator even when the data have an expected probability of zero of 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. This means that the performance of ZINB declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research is based on many assumptions and has several limitations. If the assumptions and limitations can be relaxed, better results can be expected. There are some recommendations for further research:
1. The generating process in this research does not reflect the real sampling process. If the generating process were similar to the real sampling process, it might give better results because it would be closer to the real application.
2. It would be more interesting to run an experiment with a larger number of areas, since the number of areas influences the data modeling.
3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to solve the negative values of MSE.
4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of the ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, Paper 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0) (%) | Actual (%) | r | Statistic | min | Q1 | median | mean | Q3 | max
10 | 10-33.33 | 100 | EB estimator | 0426011 | 1525665 | 3188832 | 4252666 | 5752756 | 205939
 | | | bias | 0000446 | 05164 | 0878579 | 1315093 | 1721091 | 8704671
 | | | MSE | 0040547 | 0109118 | 0159448 | 0333613 | 0335256 | 4167064
 | | | RRMSE | 0041258 | 0100045 | 01356 | 018188 | 0220426 | 0576793
20 | 13.33-36.67 | 100 | EB estimator | 0342831 | 1013993 | 2218265 | 2984668 | 3953417 | 1815693
 | | | bias | 0000587 | 0413611 | 079407 | 1100373 | 1454889 | 7906915
 | | | MSE | 0055631 | 0131969 | 0196963 | 0353033 | 0386291 | 3778251
 | | | RRMSE | 0070449 | 015421 | 0205182 | 0262006 | 0352726 | 0788718
30 | 20-53.33 | 100 | EB estimator | 0323311 | 0836545 | 1562163 | 2263684 | 2918741 | 1214482
 | | | bias | 0000151 | 0372382 | 067041 | 0916482 | 122012 | 5950225
 | | | MSE | 0074364 | 0163462 | 0231014 | 0400207 | 0432371 | 5250254
 | | | RRMSE | 0102324 | 0214697 | 0299247 | 0361013 | 0474077 | 1192032
40 | 23.33-56.67 | 100 | EB estimator | 024882 | 064963 | 1219656 | 17107 | 2248716 | 930007
 | | | bias | 0000564 | 0293602 | 0549809 | 0757937 | 1007851 | 486688
 | | | MSE | -100569 | 0194196 | 0271669 | 041875 | 045917 | 3239598
 | | | RRMSE | 0123605 | 0300339 | 0422426 | 0503566 | 0642418 | 2202294
50 | 23.33-63.33 | 100 | EB estimator | 0122548 | 0570083 | 1028619 | 1291758 | 1728067 | 6750472
 | | | bias | 000029 | 0250747 | 0453265 | 0622838 | 0803185 | 4009352
 | | | MSE | -237643 | 0235733 | 0306641 | 0452955 | 05091 | 3652167
 | | | RRMSE | 0038956 | 0412708 | 0588924 | 0717336 | 0844735 | 3240156
60 | 30-70 | 100 | EB estimator | -077338 | 044443 | 0699758 | 0944038 | 1131071 | 6323352
 | | | bias | 0000452 | 020433 | 0398131 | 0534095 | 0679938 | 3848209
 | | | MSE | -749011 | 0254097 | 0330078 | -12875 | 0539873 | 2354887
 | | | RRMSE | -663045 | 051763 | 0813734 | -038057 | 1287528 | 1767434
70 | 43.33-73.33 | 100 | EB estimator | -33274 | 0249515 | 0442513 | 0659375 | 0922519 | 9258959
 | | | bias | 0000375 | 0155154 | 0316124 | 0476883 | 0588926 | 8475103
 | | | MSE | -7513075 | 0235378 | 0402092 | 2536714 | 0876569 | 6051162
 | | | RRMSE | -10741 | 0704796 | 1355566 | -121606 | 3040291 | 3332419
80 | 53.33-90 | 100 | EB estimator | -232889 | 017621 | 0305365 | 0569959 | 0576346 | 6303601
 | | | bias | 0000395 | 0116669 | 0254473 | 1091172 | 0497898 | 6297454
 | | | MSE | -6E+09 | -016583 | 0301527 | -584495 | 5718409 | 185E+09
 | | | RRMSE | -212936 | 0927338 | 2115163 | 3094627 | 1359703 | 4151289
90 | 70-100 | 100 | EB estimator | -108767 | 0111208 | 0230315 | 0212247 | 0353129 | 3625557
 | | | bias | 000016 | 0086 | 0177169 | 0425532 | 0314714 | 1092655
 | | | MSE | -38E+09 | -130817 | 0159682 | 39135606 | 3074073 | 12E+11
 | | | RRMSE | -909131 | 1647188 | 6639631 | 116E+10 | 1585472 | 706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0) (%) | Actual (%) | r | Statistic | min | Q1 | median | mean | Q3 | max
10 | 10-33.33 | 100 | EB estimator | 0267752 | 1500256 | 3195861 | 4280907 | 5833922 | 2220705
 | | | bias | 0000603 | 0485515 | 0882468 | 1315721 | 1750173 | 8704672
 | | | MSE | -053954 | 010933 | 0168797 | 0449506 | 0369775 | 360843
 | | | RRMSE | 0022947 | 0096443 | 0136424 | 0238099 | 0241955 | 5518468
20 | 13.33-36.67 | 100 | EB estimator | 105E-08 | 0914898 | 221594 | 3017228 | 401361 | 1815694
 | | | bias | 0000368 | 0383426 | 0780984 | 1105029 | 1496623 | 7906918
 | | | MSE | -07309 | 0126202 | 0201463 | 0425844 | 0414597 | 1734815
 | | | RRMSE | 0021807 | 0144983 | 0210692 | 0326097 | 0401786 | 3177943
30 | 20-53.33 | 100 | EB estimator | 0132041 | 0719086 | 1523909 | 2308745 | 3012309 | 1228058
 | | | bias | 0000508 | 0332427 | 0680187 | 0928947 | 1254604 | 6314973
 | | | MSE | -229891 | 0156942 | 0277017 | 0707983 | 0590466 | 7469014
 | | | RRMSE | 0023998 | 0210095 | 0317195 | 0519524 | 0618802 | 3500387
40 | 23.33-56.67 | 100 | EB estimator | 105E-08 | 0574265 | 1209034 | 1742928 | 2368713 | 104953
 | | | bias | 0000564 | 0268248 | 0544049 | 0771741 | 1067061 | 4889872
 | | | MSE | -125713 | 0181557 | 0284338 | 0540615 | 0498521 | 423089
 | | | RRMSE | 0054916 | 028362 | 0420396 | 0630776 | 0778033 | 5394515
50 | 23.33-63.33 | 100 | EB estimator | 105E-08 | 0426701 | 1033816 | 133848 | 1906961 | 8018962
 | | | bias | 0000453 | 0224726 | 0454522 | 0661709 | 0900005 | 4414442
 | | | MSE | -181856 | 0194818 | 0334706 | 0859252 | 0711939 | 7997074
 | | | RRMSE | 0026206 | 0387294 | 0662251 | 7322807 | 1312302 | 13388294
60 | 30-70 | 100 | EB estimator | 105E-08 | 030085 | 0645848 | 0985327 | 1154975 | 728326
 | | | bias | 62E-05 | 0190886 | 0406245 | 0567657 | 074167 | 3923952
 | | | MSE | -34589 | 0078006 | 0376514 | 060793 | 0804116 | 3426488
 | | | RRMSE | 000461 | 0502807 | 1033578 | 2981671 | 2012552 | 3308816
70 | 43.33-73.33 | 100 | EB estimator | 105E-08 | 105E-08 | 0341315 | 0677841 | 1 | 5005491
 | | | bias | 979E-05 | 0128017 | 0358257 | 0487174 | 0654423 | 3733981
 | | | MSE | -142213 | -001433 | 0255331 | 0584152 | 1132152 | 264456
 | | | RRMSE | 0064209 | 0847956 | 1942286 | 2181192 | 4589042 | 7899681
80 | 53.33-90 | 100 | EB estimator | 105E-08 | 105E-08 | 0142906 | 0445315 | 0859305 | 5
 | | | bias | 0000161 | 0083397 | 0272773 | 0392826 | 0557213 | 3532556
 | | | MSE | -10651 | -56E-05 | -14E-07 | -127819 | 1452962 | 1132741
 | | | RRMSE | 0063244 | 1475413 | 3754705 | 162697 | 9221163 | 3786684
90 | 70-100 | 100 | EB estimator | 1E-277 | 105E-08 | 105E-08 | 0225165 | 0135512 | 3
 | | | bias | 0000495 | 0054221 | 0153374 | 027819 | 0350213 | 2736904
 | | | MSE | -175652 | -33E-05 | -1E-06 | 2954790 | 152E-06 | 613E+08
 | | | RRMSE | 0040681 | 4059441 | 6095076 | 35E+278 | 5569021 | 16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0) (%) | Actual (%) | r | Statistic | min | Q1 | median | mean | Q3 | max
10 | 10-33.33 | 100 | EB estimator | 0426011 | 1525665 | 3188832 | 4252666 | 5752756 | 205939
 | | | bias | 0000446 | 05164 | 0878579 | 1315093 | 1721091 | 8704671
 | | | MSE | 0040547 | 0109118 | 0159448 | 0333613 | 0335256 | 4167064
 | | | RRMSE | 0041258 | 0100045 | 01356 | 018188 | 0220426 | 0576793
20 | 13.33-36.67 | 100 | EB estimator | 0342831 | 1013993 | 2218265 | 2984668 | 3953417 | 1815693
 | | | bias | 0000587 | 0413611 | 079407 | 1100373 | 1454889 | 7906915
 | | | MSE | 0055631 | 0131969 | 0196963 | 0353033 | 0386291 | 3778251
 | | | RRMSE | 0070449 | 015421 | 0205182 | 0262006 | 0352726 | 0788718
30 | 20-53.33 | 100 | EB estimator | 0323311 | 0836545 | 1562163 | 2263684 | 2918741 | 1214482
 | | | bias | 0000151 | 0372382 | 067041 | 0916482 | 122012 | 5950225
 | | | MSE | 0074364 | 0163462 | 0231014 | 0400207 | 0432371 | 5250254
 | | | RRMSE | 0102324 | 0214697 | 0299247 | 0361013 | 0474077 | 1192032
40 | 23.33-56.67 | 100 | EB estimator | 024882 | 064963 | 1219656 | 17107 | 2248716 | 930007
 | | | bias | 0000564 | 0293602 | 0549809 | 0757937 | 1007851 | 486688
 | | | MSE | 0 | 0194196 | 0271669 | 0419181 | 045917 | 3239598
 | | | RRMSE | 0 | 0300116 | 0422209 | 0502895 | 0641904 | 2202294
50 | 23.33-63.33 | 100 | EB estimator | 0122548 | 0570083 | 1028619 | 1291758 | 1728067 | 6750472
 | | | bias | 000029 | 0250747 | 0453265 | 0622838 | 0803185 | 4009352
 | | | MSE | 0 | 0235733 | 0306641 | 0456258 | 05091 | 3652167
 | | | RRMSE | 0 | 0410357 | 0585765 | 0712314 | 0841838 | 3240156
60 | 30-70 | 100 | EB estimator | -077338 | 044443 | 0699758 | 0944038 | 1131071 | 6323352
 | | | bias | 0000452 | 020433 | 0398131 | 0534095 | 0679938 | 3848209
 | | | MSE | 0 | 0254097 | 0330078 | 2619677 | 0539873 | 2354887
 | | | RRMSE | -663045 | 0448118 | 0750369 | -034911 | 1209918 | 1767434
70 | 43.33-73.33 | 100 | EB estimator | -33274 | 0249515 | 0442513 | 0659375 | 0922519 | 9258959
 | | | bias | 0000375 | 0155154 | 0316124 | 0476883 | 0588926 | 8475103
 | | | MSE | 0 | 0235378 | 0402092 | 9500073 | 0876569 | 6051162
 | | | RRMSE | -10741 | 0288999 | 0995659 | -100163 | 2527784 | 3332419
80 | 53.33-90 | 100 | EB estimator | -232889 | 017621 | 0305365 | 0569959 | 0576346 | 6303601
 | | | bias | 0000395 | 0116669 | 0254473 | 1091172 | 0497898 | 6297454
 | | | MSE | 0 | 0 | 0301527 | 1444250 | 5718409 | 185E+09
 | | | RRMSE | -212936 | 0 | 1104113 | 2205437 | 5656681 | 4151289
90 | 70-100 | 100 | EB estimator | -108767 | 0111208 | 0230315 | 0212247 | 0353129 | 3625557
 | | | bias | 000016 | 0086 | 0177169 | 0425532 | 0314714 | 1092655
 | | | MSE | 0 | 0 | 0159682 | 41595285 | 3074073 | 12E+11
 | | | RRMSE | -909131 | 0 | 0557622 | 677E+09 | 9311925 | 706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) (%) | Actual (%) | r | Statistic | min | Q1 | median | mean | Q3 | max
10 | 10-33.33 | 100 | EB estimator | 0267752 | 1500256 | 3195861 | 4280907 | 5833922 | 2220705
 | | | bias | 0000603 | 0485515 | 0882468 | 1315721 | 1750173 | 8704672
 | | | MSE | 0 | 010933 | 0168797 | 0450626 | 0369775 | 360843
 | | | RRMSE | 0 | 0095932 | 0135647 | 023675 | 0239669 | 5518468
20 | 13.33-36.67 | 100 | EB estimator | 105E-08 | 0914898 | 221594 | 3017228 | 401361 | 1815694
 | | | bias | 0000368 | 0383426 | 0780984 | 1105029 | 1496623 | 7906918
 | | | MSE | 0 | 0126202 | 0201463 | 0428006 | 0414597 | 1734815
 | | | RRMSE | 0 | 0142648 | 020709 | 0320663 | 0395479 | 3177943
30 | 20-53.33 | 100 | EB estimator | 0132041 | 0719086 | 1523909 | 2308745 | 3012309 | 1228058
 | | | bias | 0000508 | 0332427 | 0680187 | 0928947 | 1254604 | 6314973
 | | | MSE | 0 | 0156942 | 0277017 | 0716543 | 0590466 | 7469014
 | | | RRMSE | 0 | 0203913 | 0311937 | 0506882 | 0615401 | 3500387
40 | 23.33-56.67 | 100 | EB estimator | 105E-08 | 0574265 | 1209034 | 1742928 | 2368713 | 104953
 | | | bias | 0000564 | 0268248 | 0544049 | 0771741 | 1067061 | 4889872
 | | | MSE | 0 | 0181557 | 0284338 | 0549835 | 0498521 | 423089
 | | | RRMSE | 0 | 0270309 | 0405926 | 0606317 | 0766631 | 5394515
50 | 23.33-63.33 | 100 | EB estimator | 105E-08 | 0426701 | 1033816 | 133848 | 1906961 | 8018962
 | | | bias | 0000453 | 0224726 | 0454522 | 0661709 | 0900005 | 4414442
 | | | MSE | 0 | 0194818 | 0334706 | 094973 | 0711939 | 7997074
 | | | RRMSE | 0 | 0316402 | 0576343 | 6561235 | 1240175 | 13388294
60 | 30-70 | 100 | EB estimator | 105E-08 | 030085 | 0645848 | 0985327 | 1154975 | 728326
 | | | bias | 62E-05 | 0190886 | 0406245 | 0567657 | 074167 | 3923952
 | | | MSE | 0 | 0078006 | 0376514 | 0749436 | 0804116 | 3426488
 | | | RRMSE | 0 | 0258286 | 0698814 | 2340612 | 1714808 | 3308816
70 | 43.33-73.33 | 100 | EB estimator | 105E-08 | 105E-08 | 0341315 | 0677841 | 1 | 5005491
 | | | bias | 979E-05 | 0128017 | 0358257 | 0487174 | 0654423 | 3733981
 | | | MSE | 0 | 0 | 0255331 | 1501268 | 1132152 | 264456
 | | | RRMSE | 0 | 0 | 0688797 | 1346552 | 2500825 | 7899681
80 | 53.33-90 | 100 | EB estimator | 105E-08 | 105E-08 | 0142906 | 0445315 | 0859305 | 5
 | | | bias | 0000161 | 0083397 | 0272773 | 0392826 | 0557213 | 3532556
 | | | MSE | 0 | 0 | 0 | 1755486 | 1452962 | 1132741
 | | | RRMSE | 0 | 0 | 0 | 7335062 | 3311711 | 3786684
90 | 70-100 | 100 | EB estimator | 1E-277 | 105E-08 | 105E-08 | 0225165 | 0135512 | 3
 | | | bias | 0000495 | 0054221 | 0153374 | 027819 | 0350213 | 2736904
 | | | MSE | 0 | 0 | 0 | 2954908 | 152E-06 | 613E+08
 | | | RRMSE | 0 | 0 | 0 | 12E+278 | 416189 | 16E+281


Appendix 5 Syntax program for generate data

data b; /* generate x1 (covariate) and ei */
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

  data e; /* generate poisson-gamma with excess zero */
    do kk=1 to 30;
      set b;
      tetha = rangam(1,1);
      lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
      starlambda = log(lambda/tetha);
      output;
    end;
  run;

  proc reg;
    model starlambda = x1;
    ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_lr out=work.betha_lr_t;
  run;
  data _null_;
    set work.betha_lr_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
  run;

  data d;
    do kk=1 to 30;
      set e;
      mu = exp(&Intercept + &x1*x1);
      parmlambda = mu*tetha;
      ypoi = rand('poisson', parmlambda);
      output;
    end;
  run;

  ods trace on; /* to take percent zero on data */
  proc freq data=d;
    tables ypoi;
    ods output OneWayFreqs=work.zero;
  run;
  data zero;
    set zero;
    keep percent;
  run;
  proc transpose data=zero out=zero1;
  run;
  data _null_;
    set work.zero1;
    call symput('pctz', col1);
  run;
  data d;
    set d;
    pzero=&pctz;
    r=&r;
  run;

  proc append data=d base=d1;
  run;

%end;
%mend;

%bangkit_data;


Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  /* this genmod procedure estimates the response without zero-inflation */
  proc genmod data=a;
    model ypoi = x1 / dist=nb link=log;
    ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_nb out=work.betha_nb_t;
  run;

  data _null_;
    set work.betha_nb_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Dispersion', col3);
  run;

  /* EB with negbin-reg */
  data work.duga_nb;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
    bias_b=abs(teta_hat_bayes-parmlambda);
  run;

  proc append data=work.duga_nb base=work.duga_nb1;
  run;

  /* jackknife */
  %do h=1 %to 30;

    data work.d;
      set work.duga_nb1;
      if ^(u=&x) then delete;
    run;
    data work.jacknb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc genmod data=work.jacknb&h;
      /* output p out=sas.yi_est; */
      model ypoi = x1 / dist=nb link=log;
      ods output ParameterEstimates=work.betha_est_nb&h (keep=Parameter Estimate);
    run;
    proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
    run;
    data _null_;
      set work.betha_est_nbt&h;
      call symput('Intercept_', col1);
      call symput('x1_', col2);
      call symput('Dispersion_', col3);
    run;

    data work.duganb&h;
      set work.d;
      mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
      w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
      teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
      delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
      g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
      beda_g_&h=g1_&h-g1;
    run;

    data work.mse_nb_j;
      merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
            work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
            work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
            work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
            work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
            work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
      by kk;
    run;

    data work.mse_nb_j;
      set work.mse_nb_j;
      t_sum=0;
      g_sum=0;
      %do j=1 %to 30;
        g_sum=g_sum+beda_g_&j;
        t_sum=t_sum+delta_&j;
      %end;
      m2i=((30-1)/30)*t_sum;
      m1i=g1-((30-1)/30)*g_sum;
      mse_j_b=m1i+m2i;
      rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
      ul = &x;
    run;

    proc append data=work.mse_nb_j base=work.mse_nb_j1;
    run;

    data work.hasilnb;
      merge work.d work.mse_nb_j;
      keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

  %end;

  proc append data=work.hasilnb base=work.hasilnb1 appendver=v6;
  run;

%end;
%mend;

%sae_nb;


Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  proc countreg data=a;
    model ypoi=x1 / dist=zinb method=qn;
    zeromodel ypoi ~ x1;
    ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
  run;

  proc transpose data=work.pe out=work.pe_t;
  run;

  data _null_;
    set work.pe_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Inf_Intercept', col3);
    call symput('Inf_x1', col4);
    call symput('_Alpha', col5);
  run;

  data work.duga;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
    bias_b=abs(teta_hat_bayes-parmlamdha);
  run;

  proc append data=work.duga base=work.duga1;
  run;

  %do h=1 %to 30;

    data work.d;
      set work.duga1;
      if ^(u=&x) then delete;
    run;
    data work.jackzinb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc countreg data=jackzinb&h;
      model ypoi=x1 / dist=zinb method=qn;
      zeromodel ypoi ~ x1;
      ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
    run;

    data _null_;
      set work.betha_est_ZINBt&h;
      call symput('Intercept', col1);
      call symput('x1', col2);
      call symput('Inf_Intercept', col3);
      call symput('Inf_x1', col4);
      call symput('_Alpha', col5);
    run;

    data work.dugaZINB&h;
      set work.d;
      mu_hat_b_&h=exp(&Intercept + &x1*x1);
      /* mu_hat_b_&h= &b_o- + &b_1- x1 */
      w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
      teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
      delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
      /* g1_&h =((mu_hat_b_&h2&alpha_)2)(&alpha_+y_i)((mu_hat_b_&h2&alpha_)+mu_hat_b_&h)2 */
      g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
      /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
      beda_g_&h=g1_&h-g1;
    run;

    data work.mse_ZINB_j;
      merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
            work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
            work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
            work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
            work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
            work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
      by kk;
    run;

    data work.mse_ZINB_j;
      set work.mse_ZINB_j;
      t_sum=0;
      g_sum=0;
      %do j=1 %to 30;
        g_sum=g_sum+beda_g_&j;
        t_sum=t_sum+delta_&j;
      %end;
      m2i=((30-1)/30)*t_sum;
      m1i=g1-((30-1)/30)*g_sum;
      mse_j_b=m1i+m2i;
      rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
    run;

    data work.hasilZINB;
      merge work.d work.mse_ZINB_j;
      keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;
    ods listing;
    option nodate ls=130 ps=130;
    ods html;

  %end;

  proc append data=work.hasilZINB base=work.hasilZINB1;
  run;

%end;
%mend;

%sae_zinb;


2

iiT

ii vbx with i=1hellipm

iv ~N(0 2v ) are area-specific random

effect and Tp )( 1 is 1p vector of

regression coefficients Therefore ib are

known as positive constants For making inferences about

i direct estimators iy

are assumed available Accordingly assuming

iii ey where i=1hellipm with

sampling error ie ~N(0 ei2 ) and ei

2are known At the end both models are combined and as a result is new model

iiiT

ii evbxy where i=1hellipm

(Rao 2003)2 Basic unit level (type B) model

Unit level model includes all models that relate unit values of the study variable to unit-specific auxiliary variables Assuming unit-specific auxiliary variables T

ijpijij xxx )( 1 and

correspondingly a nested regression model

ijiT

ijij evxy where

i=1hellipm and j=1hellip in with

iv ~N(0 2v ) and also ie ~N(0 ei

2 )

Empirical Bayes MethodsThe Bayesian approach is based on Bayes

Law which was found by Thomas Bayes This law was introduced by Richard Proce in 1763 two years after Thomas Bayes passed away In 1774 and 1781 Laplace gave the details and relevancies for modern Bayesian statistics (Gill 2002 in Kismiantini 2007)

Novick in Good (1980) mentioned that Bayes method is difficult to adopt and sometimes is very sensitive due to the requirement of prior probability informationwhich is usually difficult to obtain Robbin (1955) introduced Empirical Bayes methods by assuming a particular prior distribution estimating based on the sample Rao (2003) said that EB (Empirical Bayes) and HB (Hierarchical Bayes) are compatible for binary and count data in Small Area Estimation Therefore EB method was used in this research

Rao (2003) summarized EB methods in Small Area Estimation as follows 1 Obtain the posterior probability density

function of the small area parameter2 Estimate the parameters from the

marginal density function

3 Use the estimated posterior density forinferences regarding the parameters ofinterest

Poisson-Gamma ModelsPoisson model is a standard model in

dealing with count data Generally count data can be suffered by over-dispersion problem Therefore a Poisson formula had been developed to accommodate extra variance from sample data Two-stage models have been introduced for count data known as mixed model Poisson-Gamma Wakefield (2006) introduced Poisson-Gamma model which was easier to use with SMR (Standard Mortality Ratio) as a direct estimator This study used Wakefield model with alteration in direct estimator

Let iy be a number of specific individual

at small area-i which has specific characteristic of interest and written as follow

j

iji yy

ijy are the-jth object at the-ith small area where

j=1hellipn and i=1hellipm

First stage )(~ ii

ind

i Poissony is assumed

where )( ii x describes a regression

model in area level ix is a vector of

covariates and Tpii)( is a vector of

regression coefficientsSecond stage distribution

)1(~ gammaiid

iis assumed as a prior

distribution with mean 1 and variance 1

Then the marginal distribution |iy is

negative binomialMoreover Wakefield (2006) used Bayes

Theorem and acquired posterior distributionas

)1(~|i

iii ygammay

and EB estimator as

iiiiB

iEB

i )ˆ1(ˆˆ)ˆˆ(ˆˆ

with )ˆˆ(ˆˆ iii ii y are direct

estimation from i and iy are the number of

observation

Negative Binomial Regression The negative binomial regression model

seems have been first discussed by Anscombe (1972) Others have pointed out its success indealing with over-dispersed count data

3

Lawless (1987) elaborated the mixture model parameterization of the negative binomial providing formulas for its log likelihood mean variance and moments Later Breslow (1990) cited Lawlessrsquo work and since its inception to the late 1980rsquos the negative binomial regression model have been construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin amp Hilbe 2007) The negative binomial distribution function is written as

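$$g(y \mid x) = \frac{\Gamma(y + 1/k)}{\Gamma(y + 1)\,\Gamma(1/k)} \cdot \frac{(k\mu)^{y}}{(1 + k\mu)^{y + 1/k}}$$

where $y = 0, 1, 2, \ldots$, and $k$ and $\mu$ are the negative binomial parameters, with $E(y) = \mu$ and $\text{var}(y) = \mu + k\mu^2$. The parameter $k$ is called the dispersion parameter; it indicates that the data are over-dispersed.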
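The density and its two moments can be checked numerically. The sketch below assumes the mean-dispersion parameterization written above, with $E(y) = \mu$ and $\text{var}(y) = \mu + k\mu^2$; the function names and the truncation point of the sums are illustrative choices.

```python
import math


def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k."""
    inv_k = 1.0 / k
    log_p = (math.lgamma(y + inv_k) - math.lgamma(y + 1) - math.lgamma(inv_k)
             + y * math.log(k * mu) - (y + inv_k) * math.log1p(k * mu))
    return math.exp(log_p)


def nb_moments_numeric(mu, k, upper=400):
    """Mean and variance obtained by summing the pmf up to `upper`."""
    probs = [nb_pmf(y, mu, k) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    second = sum(y * y * p for y, p in enumerate(probs))
    return mean, second - mean ** 2


mean, var = nb_moments_numeric(mu=2.0, k=0.5)
print(round(mean, 6), round(var, 6))  # close to mu = 2 and mu + k*mu^2 = 4
```

Working on the log scale with `math.lgamma` avoids overflow of the gamma function for large counts.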

Over-dispersion in Count Data

Count data in Poisson regression are over-dispersed if the variance is larger than the mean, that is, if the observed variance is larger than the Poisson model expects. This phenomenon is written as

$$\text{Var}(y_i) > E(y_i)$$

(McCullagh & Nelder 1989).
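A quick empirical check of this condition is the ratio of the sample variance to the sample mean. The sketch below is only an informal diagnostic under that assumption, and the two count vectors are made-up examples.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; above 1 suggests over-dispersion."""
    return variance(counts) / mean(counts)


many_zeros = [0, 0, 0, 1, 2, 9, 0, 12]   # hypothetical counts with a long tail
even_counts = [2, 3, 2, 3]               # hypothetical counts with little spread

print(dispersion_ratio(many_zeros) > 1)   # True
print(dispersion_ratio(even_counts) > 1)  # False
```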

Zero-Inflated Models

Zero-inflated models consider two distinct sources of zero outcomes. One source is individuals who do not enter the counting process; the other is individuals who do enter the counting process but produce a zero outcome (Hardin & Hilbe 2007).

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing. It has since been used in many applications and is now discussed in nearly every book or article dealing with count response models.

For the zero-inflated model, the probability of observing a zero outcome equals the probability that an individual is in the always-zero group, plus the probability that the individual is not in that group times the probability that the counting process produces a zero. If $B(0)$ is the probability that the binary process results in a zero outcome and $\Pr(0)$ is the probability that the counting process produces a zero outcome, the probability of a zero outcome for the system is given by (Hardin & Hilbe 2007)

$$\Pr(y = 0) = B(0) + (1 - B(0))\Pr(0)$$

The probability of a nonzero count is

$$\Pr(y = k;\ k > 0) = [1 - B(0)]\Pr(k)$$

This model produces two groups of parameters. The first is the zero-inflation parameters, which show whether the covariates contribute significantly to producing a zero outcome. The second is the negative binomial parameters, which model the response with the covariates.
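The two sources of zeros can be made concrete with a small sketch. It assumes a Poisson counting process purely for illustration (the thesis itself uses the negative binomial), and `b0`, `zero_inflated_pmf` and `poisson_pmf` are hypothetical names.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of the count k with mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def zero_inflated_pmf(k, b0, count_pmf):
    """Mix an always-zero group (probability b0) with a counting process."""
    from_counts = (1.0 - b0) * count_pmf(k)
    return b0 + from_counts if k == 0 else from_counts


def counting(k):
    return poisson_pmf(k, 1.0)


# With b0 = 0.3, zeros come from the always-zero group and from the counts.
print(zero_inflated_pmf(0, 0.3, counting) > counting(0))  # True
```

Passing the counting distribution as a function keeps the mixing step independent of the distribution chosen for the counts.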

Zero-Inflated Negative Binomial

There are many kinds of zero-inflated models; each has advantages and disadvantages and suits a different type of data. The zero-inflated negative binomial (ZINB) model is one of them. It is used for data that are both over-dispersed and contain excess zeros. As a result, the parameter estimates include a parameter $k$ that indicates over-dispersion in the data, just like the dispersion parameter in negative binomial regression.

The probability distribution of this model is as follows:

$$P(Y_i = y_i \mid x_i) = \begin{cases} \varphi_i + (1 - \varphi_i)\, g(0 \mid x_i), & y_i = 0 \\ (1 - \varphi_i)\, g(y_i \mid x_i), & y_i > 0 \end{cases}$$

where $\varphi_i$ is a function of $z_i$, the vector of zero-inflation covariates, and of a vector of zero-inflation coefficients that will be estimated. Meanwhile, $g(y_i \mid x_i)$ is the negative binomial probability distribution, written as

$$g(y_i \mid x_i) = \frac{\Gamma(y_i + 1/k)}{\Gamma(y_i + 1)\,\Gamma(1/k)} \cdot \frac{(k\mu_i)^{y_i}}{(1 + k\mu_i)^{y_i + 1/k}}$$

The mean and variance of the ZINB model are

$$E(y_i \mid x_i) = \mu_i (1 - \varphi_i)$$

$$V(y_i \mid x_i) = \mu_i (1 - \varphi_i)\,(1 + \mu_i(\varphi_i + k))$$
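The mean and variance formulas can be verified against the probability function by direct summation. This is a sketch under the parameterization used in this section (mean $\mu$, dispersion $k$, zero-inflation probability $\varphi$); the function names and the test values are illustrative.

```python
import math


def nb_pmf(y, mu, k):
    """Negative binomial probability with mean mu and dispersion k."""
    inv_k = 1.0 / k
    return math.exp(math.lgamma(y + inv_k) - math.lgamma(y + 1)
                    - math.lgamma(inv_k) + y * math.log(k * mu)
                    - (y + inv_k) * math.log1p(k * mu))


def zinb_pmf(y, mu, k, phi):
    """ZINB probability: extra mass phi at zero, NB otherwise."""
    from_nb = (1.0 - phi) * nb_pmf(y, mu, k)
    return phi + from_nb if y == 0 else from_nb


def zinb_moments(mu, k, phi):
    """Closed-form ZINB mean and variance."""
    mean = mu * (1.0 - phi)
    return mean, mean * (1.0 + mu * (phi + k))


mu, k, phi = 2.0, 0.5, 0.3
probs = [zinb_pmf(y, mu, k, phi) for y in range(400)]
num_mean = sum(y * p for y, p in enumerate(probs))
num_var = sum(y * y * p for y, p in enumerate(probs)) - num_mean ** 2
print(zinb_moments(mu, k, phi), (round(num_mean, 6), round(num_var, 6)))
```

For these values the closed forms give a mean of 1.4 and a variance of 3.64, and the numerical sums agree with them up to rounding.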

Jackknife Method of Estimating MSE($\hat\theta_i^{EB}$)

The jackknife is one of the general-purpose methods used in surveys because of its simple concept (Jiang, Lahiri and Wan 2002). The method was introduced by Tukey (1958) and developed into a method that can correct the bias of an estimator by removing observation $l$, for $l = 1, \ldots, m$, and re-estimating the parameters.

Following Rao (2003), the jackknife steps to estimate MSE($\hat\theta_i^{EB}$) are:

1. Assume that $\hat\theta_i^{EB} = k_i(y_i, \hat\beta, \hat\alpha)$ and $\hat\theta_{i,-l}^{EB} = k_i(y_i, \hat\beta_{-l}, \hat\alpha_{-l})$, then calculate

$$\hat M_{2i} = \frac{m-1}{m} \sum_{l=1}^{m} \left(\hat\theta_{i,-l}^{EB} - \hat\theta_i^{EB}\right)^2$$

2. Calculate the delete-$l$ estimators $\hat\beta_{-l}$ and $\hat\alpha_{-l}$, then calculate

$$\hat M_{1i} = g_{1i}(\hat\beta, \hat\alpha, y_i) - \frac{m-1}{m} \sum_{l=1}^{m} \left[g_{1i}(\hat\beta_{-l}, \hat\alpha_{-l}, y_i) - g_{1i}(\hat\beta, \hat\alpha, y_i)\right]$$

Here $g_{1i}$ is the estimated variance of the posterior distribution, which measures the variability associated with $\theta_i$. Using $g_{1i}$ alone leads to severe underestimation of $MSE_J(\hat\theta_i^{EB})$ because the prior parameters are themselves estimated. The estimator $\hat M_{1i}$ therefore corrects the bias of $g_{1i}$.

3. Calculate the jackknife estimator of MSE($\hat\theta_i^{EB}$) as

$$MSE_J(\hat\theta_i^{EB}) = \hat M_{1i} + \hat M_{2i}$$
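The three steps can be sketched generically. In the example below the model is deliberately simplified: `toy_fit` is a hypothetical intercept-only fit with the prior parameter fixed at 1, standing in for the regression fits used in this study, while `toy_eb` and `toy_g1` follow the EB estimator and posterior variance forms used in the SAS programs of the appendices.

```python
def jackknife_mse(m, fit, eb, g1):
    """Jackknife MSE for each of m areas.

    fit(indices) -> parameters estimated from the listed areas
    eb(i, params) -> EB estimate for area i
    g1(i, params) -> posterior variance for area i
    """
    full = fit(list(range(m)))
    deleted = [fit([j for j in range(m) if j != l]) for l in range(m)]
    scale = (m - 1) / m
    result = []
    for i in range(m):
        m2 = scale * sum((eb(i, p) - eb(i, full)) ** 2 for p in deleted)
        m1 = g1(i, full) - scale * sum(g1(i, p) - g1(i, full) for p in deleted)
        result.append(m1 + m2)
    return result


y = [0, 2, 4]  # hypothetical counts for three areas


def toy_fit(indices):
    # Hypothetical model: common mean, prior parameter fixed at 1.
    return sum(y[j] for j in indices) / len(indices), 1.0


def toy_eb(i, params):
    mu, alpha = params
    gamma = mu / (mu + alpha)
    return gamma * y[i] + (1.0 - gamma) * mu


def toy_g1(i, params):
    mu, alpha = params
    return (alpha + y[i]) / (mu + alpha) ** 2


print(jackknife_mse(len(y), toy_fit, toy_eb, toy_g1))
```

Because $\hat M_{1i}$ subtracts a bias correction from $g_{1i}$, nothing in the formula forces the result to be positive, which is consistent with the negative MSE values reported later in this paper.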

METHODOLOGY

Data

This research assumed that the available auxiliary data are at the area level, so the basic area-level model was used. The data were simulated with 30 small areas and one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. The relationship between the response and the covariate was assumed to be linear.

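Methods

The following steps were used to generate the data in SAS 9.1:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $\text{Lambda}_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim \text{Gamma}(1, 1)$.
4. Calculate $\lambda_i^* = \log(\text{Lambda}_i / \theta_i)$.
5. Fit a linear regression of $\lambda_i^*$ on $X_i$ to obtain $\hat\beta_0$ and $\hat\beta_1$.
6. Calculate $\mu_i = \exp(X_i'\hat\beta)$.
7. Calculate $\text{parmlambda}_i = \mu_i \theta_i$.
8. Generate $y_i \sim \text{Poisson}(\text{parmlambda}_i)$.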
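The generation steps above can be sketched in Python for one batch. This is not the SAS program of Appendix 5: the covariate values, the seed and the simple Poisson sampler are assumptions made for the illustration. Step 2 relies on the fact that a Poisson variable with mean $\lambda$ is zero with probability $e^{-\lambda}$.

```python
import math
import random


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_batch(x, p_zero, rng):
    """Generate one batch of counts following steps 2 to 8."""
    lam = -math.log(p_zero)                             # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in x]     # step 3
    star = [math.log(lam / t) for t in theta]           # step 4
    n = len(x)                                          # step 5: least squares
    x_bar, s_bar = sum(x) / n, sum(star) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (si - s_bar) for xi, si in zip(x, star)) / sxx
    b0 = s_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]           # step 6
    parm = [m * t for m, t in zip(mu, theta)]           # step 7
    y = [poisson_draw(rng, p) for p in parm]            # step 8
    return y, parm


rng = random.Random(2009)                  # arbitrary seed for reproducibility
x = [float(i) for i in range(30)]          # hypothetical covariate, step 1
y, parm = simulate_batch(x, 0.6, rng)
print(sum(1 for v in y if v == 0) / len(y))  # observed share of zeros
```

The observed share of zeros varies around the target from batch to batch, which matches the ranges of actual zero percentages listed in the appendices.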

Moreover, the following steps were applied in analyzing the data:
1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the zero-inflated negative binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters, $\beta$ and $\alpha$.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimates.
5. Calculate the RRMSE (Relative Root Mean Square Error):

$$RRMSE(\hat\theta_i) = \frac{\sqrt{MSE(\hat\theta_i)}}{\hat\theta_i}$$
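The RRMSE formula is a one-line computation, sketched below with hypothetical inputs. Two properties of the formula matter for the results that follow: a negative estimate gives a negative RRMSE, and a negative MSE has no real square root.

```python
import math


def rrmse(mse, estimate):
    """Square root of the MSE relative to the estimate itself."""
    return math.sqrt(mse) / estimate


print(rrmse(0.25, 2.0))    # 0.25
print(rrmse(0.25, -2.0))   # -0.25, the sign follows the estimate
```

In Python a negative MSE raises `ValueError` inside `math.sqrt`, so negative MSE values have to be handled before this step.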

RESULTS AND DISCUSSION

Estimation of Prior Parameters Based on the EB Method with Negative Binomial Regression

For data without excess zeros, the estimator produced a small and consistent MSE. Meanwhile, if the proportion of zeros is approximately 30% or more, with an expected probability of zero of 0.6, the estimates tend to be unreliable. As a result, EB estimation produced negative values.

The RRMSE of the estimator increases along with the number of zeros in the data. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.45          0.31            0.72            0.59
0.6                   -12875        0.33            -0.38           0.81
0.7                   253671        0.40            -1216           135
0.8                   -584495       0.30            30946           211
0.9                   39135606      0.16            1.16E+10        664

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.33          0.16            0.18            0.13
0.2                   0.35          0.20            0.26            0.20
0.3                   0.40          0.23            0.36            0.30
0.4                   0.42          0.27            0.50            0.42
0.5                   0.46          0.31            0.71            0.58
0.6                   26197         0.33            -0.35           0.75
0.7                   950007        0.40            -1002           0.99
0.8                   1444250       0.30            22054           110
0.9                   41595285      0.16            6.77E+09        0.56


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative values of MSE have been changed to zero.
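The effect of this replacement on the summary statistics can be illustrated with a small sketch. The MSE values below are hypothetical and chosen only to show the direction of the change.

```python
from statistics import mean, median


def truncate_negative(values):
    """Replace negative MSE values by zero, as done for MSE (II)."""
    return [max(v, 0.0) for v in values]


mse = [-1.0, 0.25, 0.25, 0.5]       # hypothetical jackknife MSE values
mse_ii = truncate_negative(mse)

print(mean(mse), mean(mse_ii))      # 0.0 0.25
print(median(mse), median(mse_ii))  # 0.25 0.25
```

Removing the negative values can only raise the mean, while the median is unaffected as long as fewer than half of the values are negative.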

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the data have an expected probability of zero of 0.8 to 0.9. However, when the data have an expected probability of zero of 0.6 to 0.7, the mean of RRMSE (II) is negative because of the negative EB estimates.

Estimation of Prior Parameters Based on the EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although ZINB is slightly outperformed by NBR when the data contain only a small number of zeros. In particular, as shown in Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a larger MSE for the EB estimator than NBR does.

If the data have an expected probability of zero of 0.6 to 0.7, however, ZINB gives better estimates. The estimates are also unbiased, as they cover the parameter values adequately. ZINB begins to produce inconsistent estimates once the expected probability of zero reaches 0.8 or more, because of the enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is larger than the mean of MSE with ZINB. This is because, once the negative MSE values are changed to zero, they no longer reduce the mean.

Comparison of the EB Estimator with Negative Binomial Regression and the EB Estimator with ZINB

The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, ZINB produces a larger MSE than NBR as long as the expected probability of zero in the data stays below the 0.6 threshold.

ZINB performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case the EB estimates given by NBR are unstable and inconsistent because of negative estimates and a huge MSE that can be a thousand times larger than an acceptable value. On the other hand, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimates with NBR.

Both methods perform poorly if the data have an expected probability of zero of 0.8 or more. The EB estimators from both methods are inconsistent as a result of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.43          0.20            0.33            0.21
0.3                   0.71          0.28            0.52            0.32
0.4                   0.54          0.28            0.632           0.42
0.5                   0.86          0.33            7322807         0.66
0.6                   0.61          0.38            29817           103
0.7                   0.58          0.25            218119          194
0.8                   -128          -1.4E-07        162697          375
0.9                   2954790       -1E-06          3.5E+278        609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
0.1                   0.45          0.17            0.24            0.14
0.2                   0.436         0.20            0.324           0.21
0.3                   0.72          0.28            0.51            0.311
0.4                   0.55          0.28            0.61            0.41
0.5                   0.95          0.33            6561235         0.58
0.6                   0.75          0.38            23406           0.70
0.7                   150           0.25            134655          0.69
0.8                   175           0               733506          0
0.9                   2954908       0               1.2E+278        0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method such as negative binomial regression for estimating the prior parameters produced biased and unreliable EB estimators for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator even when the data have an expected probability of zero of 0.6 to 0.7.

The ZINB method also provided a reliable estimator for data with less than 53.33% zeros. Its performance declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research rests on many assumptions and suffers from several limitations. If the assumptions and limitations can be relaxed, better results can be expected. Some recommendations for further research are:
1. The data-generating process in this research does not reflect a real sampling process. A generating process closer to real sampling might give better results because it would be closer to real applications.
2. It would be more interesting to run an experiment with a larger number of areas, since the number of areas influences the modeling of the data.
3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to resolve the negative MSE values.
4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of the ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. httpwwwodhohiogovDataOFHSurvofhsrfq7pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. httpwwwbepresscomuwbiostatpaper286pdf [24 April 2008]


Appendix 1 Results of EB estimation with NBR

P(Y=0)  Actual       r    Statistic     min        Q1         median     mean       Q3         max
10      10-33.33     100  EB estimator  0426011    1525665    3188832    4252666    5752756    205939
                          bias          0000446    05164      0878579    1315093    1721091    8704671
                          MSE           0040547    0109118    0159448    0333613    0335256    4167064
                          RRMSE         0041258    0100045    01356      018188     0220426    0576793
20      13.33-36.67  100  EB estimator  0342831    1013993    2218265    2984668    3953417    1815693
                          bias          0000587    0413611    079407     1100373    1454889    7906915
                          MSE           0055631    0131969    0196963    0353033    0386291    3778251
                          RRMSE         0070449    015421     0205182    0262006    0352726    0788718
30      20-53.33     100  EB estimator  0323311    0836545    1562163    2263684    2918741    1214482
                          bias          0000151    0372382    067041     0916482    122012     5950225
                          MSE           0074364    0163462    0231014    0400207    0432371    5250254
                          RRMSE         0102324    0214697    0299247    0361013    0474077    1192032
40      23.33-56.67  100  EB estimator  024882     064963     1219656    17107      2248716    930007
                          bias          0000564    0293602    0549809    0757937    1007851    486688
                          MSE           -100569    0194196    0271669    041875     045917     3239598
                          RRMSE         0123605    0300339    0422426    0503566    0642418    2202294
50      23.33-63.33  100  EB estimator  0122548    0570083    1028619    1291758    1728067    6750472
                          bias          000029     0250747    0453265    0622838    0803185    4009352
                          MSE           -237643    0235733    0306641    0452955    05091      3652167
                          RRMSE         0038956    0412708    0588924    0717336    0844735    3240156
60      30-70        100  EB estimator  -077338    044443     0699758    0944038    1131071    6323352
                          bias          0000452    020433     0398131    0534095    0679938    3848209
                          MSE           -749011    0254097    0330078    -12875     0539873    2354887
                          RRMSE         -663045    051763     0813734    -038057    1287528    1767434
70      43.33-73.33  100  EB estimator  -33274     0249515    0442513    0659375    0922519    9258959
                          bias          0000375    0155154    0316124    0476883    0588926    8475103
                          MSE           -7513075   0235378    0402092    2536714    0876569    6051162
                          RRMSE         -10741     0704796    1355566    -121606    3040291    3332419
80      53.33-90     100  EB estimator  -232889    017621     0305365    0569959    0576346    6303601
                          bias          0000395    0116669    0254473    1091172    0497898    6297454
                          MSE           -6E+09     -016583    0301527    -584495    5718409    185E+09
                          RRMSE         -212936    0927338    2115163    3094627    1359703    4151289
90      70-100       100  EB estimator  -108767    0111208    0230315    0212247    0353129    3625557
                          bias          000016     0086       0177169    0425532    0314714    1092655
                          MSE           -38E+09    -130817    0159682    39135606   3074073    12E+11
                          RRMSE         -909131    1647188    6639631    116E+10    1585472    706E+11


Appendix 2 Results of EB estimation with ZINB

P(Y=0)  Actual       r    Statistic     min        Q1         median     mean       Q3         max
10      10-33.33     100  EB estimator  0267752    1500256    3195861    4280907    5833922    2220705
                          bias          0000603    0485515    0882468    1315721    1750173    8704672
                          MSE           -053954    010933     0168797    0449506    0369775    360843
                          RRMSE         0022947    0096443    0136424    0238099    0241955    5518468
20      13.33-36.67  100  EB estimator  105E-08    0914898    221594     3017228    401361     1815694
                          bias          0000368    0383426    0780984    1105029    1496623    7906918
                          MSE           -07309     0126202    0201463    0425844    0414597    1734815
                          RRMSE         0021807    0144983    0210692    0326097    0401786    3177943
30      20-53.33     100  EB estimator  0132041    0719086    1523909    2308745    3012309    1228058
                          bias          0000508    0332427    0680187    0928947    1254604    6314973
                          MSE           -229891    0156942    0277017    0707983    0590466    7469014
                          RRMSE         0023998    0210095    0317195    0519524    0618802    3500387
40      23.33-56.67  100  EB estimator  105E-08    0574265    1209034    1742928    2368713    104953
                          bias          0000564    0268248    0544049    0771741    1067061    4889872
                          MSE           -125713    0181557    0284338    0540615    0498521    423089
                          RRMSE         0054916    028362     0420396    0630776    0778033    5394515
50      23.33-63.33  100  EB estimator  105E-08    0426701    1033816    133848     1906961    8018962
                          bias          0000453    0224726    0454522    0661709    0900005    4414442
                          MSE           -181856    0194818    0334706    0859252    0711939    7997074
                          RRMSE         0026206    0387294    0662251    7322807    1312302    13388294
60      30-70        100  EB estimator  105E-08    030085     0645848    0985327    1154975    728326
                          bias          62E-05     0190886    0406245    0567657    074167     3923952
                          MSE           -34589     0078006    0376514    060793     0804116    3426488
                          RRMSE         000461     0502807    1033578    2981671    2012552    3308816
70      43.33-73.33  100  EB estimator  105E-08    105E-08    0341315    0677841    1          5005491
                          bias          979E-05    0128017    0358257    0487174    0654423    3733981
                          MSE           -142213    -001433    0255331    0584152    1132152    264456
                          RRMSE         0064209    0847956    1942286    2181192    4589042    7899681
80      53.33-90     100  EB estimator  105E-08    105E-08    0142906    0445315    0859305    5
                          bias          0000161    0083397    0272773    0392826    0557213    3532556
                          MSE           -10651     -56E-05    -14E-07    -127819    1452962    1132741
                          RRMSE         0063244    1475413    3754705    162697     9221163    3786684
90      70-100       100  EB estimator  1E-277     105E-08    105E-08    0225165    0135512    3
                          bias          0000495    0054221    0153374    027819     0350213    2736904
                          MSE           -175652    -33E-05    -1E-06     2954790    152E-06    613E+08
                          RRMSE         0040681    4059441    6095076    35E+278    5569021    16E+281


Appendix 3 Results of EB estimation (II) with NBR

P(Y=0)  Actual       r    Statistic     min        Q1         median     mean       Q3         max
10      10-33.33     100  EB estimator  0426011    1525665    3188832    4252666    5752756    205939
                          bias          0000446    05164      0878579    1315093    1721091    8704671
                          MSE           0040547    0109118    0159448    0333613    0335256    4167064
                          RRMSE         0041258    0100045    01356      018188     0220426    0576793
20      13.33-36.67  100  EB estimator  0342831    1013993    2218265    2984668    3953417    1815693
                          bias          0000587    0413611    079407     1100373    1454889    7906915
                          MSE           0055631    0131969    0196963    0353033    0386291    3778251
                          RRMSE         0070449    015421     0205182    0262006    0352726    0788718
30      20-53.33     100  EB estimator  0323311    0836545    1562163    2263684    2918741    1214482
                          bias          0000151    0372382    067041     0916482    122012     5950225
                          MSE           0074364    0163462    0231014    0400207    0432371    5250254
                          RRMSE         0102324    0214697    0299247    0361013    0474077    1192032
40      23.33-56.67  100  EB estimator  024882     064963     1219656    17107      2248716    930007
                          bias          0000564    0293602    0549809    0757937    1007851    486688
                          MSE           0          0194196    0271669    0419181    045917     3239598
                          RRMSE         0          0300116    0422209    0502895    0641904    2202294
50      23.33-63.33  100  EB estimator  0122548    0570083    1028619    1291758    1728067    6750472
                          bias          000029     0250747    0453265    0622838    0803185    4009352
                          MSE           0          0235733    0306641    0456258    05091      3652167
                          RRMSE         0          0410357    0585765    0712314    0841838    3240156
60      30-70        100  EB estimator  -077338    044443     0699758    0944038    1131071    6323352
                          bias          0000452    020433     0398131    0534095    0679938    3848209
                          MSE           0          0254097    0330078    2619677    0539873    2354887
                          RRMSE         -663045    0448118    0750369    -034911    1209918    1767434
70      43.33-73.33  100  EB estimator  -33274     0249515    0442513    0659375    0922519    9258959
                          bias          0000375    0155154    0316124    0476883    0588926    8475103
                          MSE           0          0235378    0402092    9500073    0876569    6051162
                          RRMSE         -10741     0288999    0995659    -100163    2527784    3332419
80      53.33-90     100  EB estimator  -232889    017621     0305365    0569959    0576346    6303601
                          bias          0000395    0116669    0254473    1091172    0497898    6297454
                          MSE           0          0          0301527    1444250    5718409    185E+09
                          RRMSE         -212936    0          1104113    2205437    5656681    4151289
90      70-100       100  EB estimator  -108767    0111208    0230315    0212247    0353129    3625557
                          bias          000016     0086       0177169    0425532    0314714    1092655
                          MSE           0          0          0159682    41595285   3074073    12E+11
                          RRMSE         -909131    0          0557622    677E+09    9311925    706E+11


Appendix 4 Results of EB estimation (II) with ZINB

P(Y=0)  Actual       r    Statistic     min        Q1         median     mean       Q3         max
10      10-33.33     100  EB estimator  0267752    1500256    3195861    4280907    5833922    2220705
                          bias          0000603    0485515    0882468    1315721    1750173    8704672
                          MSE           0          010933     0168797    0450626    0369775    360843
                          RRMSE         0          0095932    0135647    023675     0239669    5518468
20      13.33-36.67  100  EB estimator  105E-08    0914898    221594     3017228    401361     1815694
                          bias          0000368    0383426    0780984    1105029    1496623    7906918
                          MSE           0          0126202    0201463    0428006    0414597    1734815
                          RRMSE         0          0142648    020709     0320663    0395479    3177943
30      20-53.33     100  EB estimator  0132041    0719086    1523909    2308745    3012309    1228058
                          bias          0000508    0332427    0680187    0928947    1254604    6314973
                          MSE           0          0156942    0277017    0716543    0590466    7469014
                          RRMSE         0          0203913    0311937    0506882    0615401    3500387
40      23.33-56.67  100  EB estimator  105E-08    0574265    1209034    1742928    2368713    104953
                          bias          0000564    0268248    0544049    0771741    1067061    4889872
                          MSE           0          0181557    0284338    0549835    0498521    423089
                          RRMSE         0          0270309    0405926    0606317    0766631    5394515
50      23.33-63.33  100  EB estimator  105E-08    0426701    1033816    133848     1906961    8018962
                          bias          0000453    0224726    0454522    0661709    0900005    4414442
                          MSE           0          0194818    0334706    094973     0711939    7997074
                          RRMSE         0          0316402    0576343    6561235    1240175    13388294
60      30-70        100  EB estimator  105E-08    030085     0645848    0985327    1154975    728326
                          bias          62E-05     0190886    0406245    0567657    074167     3923952
                          MSE           0          0078006    0376514    0749436    0804116    3426488
                          RRMSE         0          0258286    0698814    2340612    1714808    3308816
70      43.33-73.33  100  EB estimator  105E-08    105E-08    0341315    0677841    1          5005491
                          bias          979E-05    0128017    0358257    0487174    0654423    3733981
                          MSE           0          0          0255331    1501268    1132152    264456
                          RRMSE         0          0          0688797    1346552    2500825    7899681
80      53.33-90     100  EB estimator  105E-08    105E-08    0142906    0445315    0859305    5
                          bias          0000161    0083397    0272773    0392826    0557213    3532556
                          MSE           0          0          0          1755486    1452962    1132741
                          RRMSE         0          0          0          7335062    3311711    3786684
90      70-100       100  EB estimator  1E-277     105E-08    105E-08    0225165    0135512    3
                          bias          0000495    0054221    0153374    027819     0350213    2736904
                          MSE           0          0          0          2954908    152E-06    613E+08
                          RRMSE         0          0          0          12E+278    416189     16E+281


Appendix 5 SAS program for generating the data

data b; /* generate x1 (covariate) and ei */
  input x1;
  /* the separators between the 30 covariate values did not survive in the source */
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

  data e; /* generate poisson-gamma with excess zero */
    do kk=1 to 30;
      set b;
      tetha = rangam(1,1);
      lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
      starlambda = log(lambda/tetha);
      output;
    end;
  run;

  proc reg;
    model starlambda = x1;
    ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_lr out=work.betha_lr_t;
  run;
  data _null_;
    set work.betha_lr_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
  run;

  data d;
    do kk=1 to 30;
      set e;
      mu = exp(&Intercept + &x1*x1);
      parmlambda = mu*tetha;
      ypoi = rand('poisson', parmlambda);
      output;
    end;
  run;

  ods trace on; /* to take the percentage of zeros in the data */
  proc freq data=d;
    tables ypoi;
    ods output OneWayFreqs=work.zero;
  run;
  data zero;
    set zero;
    keep percent;
  run;
  proc transpose data=zero out=zero1;
  run;
  data _null_;
    set work.zero1;
    call symput('pctz', col1);
  run;
  data d;
    set d;
    pzero=&pctz;
    r=&r;
  run;

  proc append data=d base=d1;
  run;

%end;
%mend;

%bangkit_data;


Appendix 6 SAS program for EB with NBR

%macro sae_nb;
%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  /* this genmod procedure estimates the response without zero-inflation */
  proc genmod data=a;
    model ypoi = x1 / dist=nb link=log;
    ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_nb out=work.betha_nb_t;
  run;

  data _null_;
    set work.betha_nb_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Dispersion', col3);
  run;

  /* EB with negbin-reg */
  data work.duga_nb;
    set a;
    mu_hat_b = exp(&Intercept + &x1*x1);
    w_bayes = mu_hat_b/(mu_hat_b + &Dispersion);
    teta_hat_bayes = w_bayes*ypoi + (1-w_bayes)*mu_hat_b;
    g1 = (&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
    bias_b = abs(teta_hat_bayes-parmlambda);
  run;

  proc append data=work.duga_nb base=work.duga_nb1;
  run;

  /* jackknife: refit the model with area h deleted */
  %do h=1 %to 30;

    data work.d;
      set work.duga_nb1;
      if ^(u=&x) then delete;
    run;
    data work.jacknb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc genmod data=work.jacknb&h;
      /* output p out=sasyi_est; */
      model ypoi = x1 / dist=nb link=log;
      ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
    run;
    proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
    run;
    data _null_;
      set work.betha_est_nbt&h;
      call symput('Intercept_', col1);
      call symput('x1_', col2);
      call symput('Dispersion_', col3);
    run;

    data work.duganb&h;
      set work.d;
      mu_hat_b_&h = exp(&Intercept_ + &x1_*x1);
      w_b_&h = mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
      teta_hat_&h = w_b_&h*ypoi + (1-w_b_&h)*mu_hat_b_&h;
      delta_&h = (teta_hat_&h - teta_hat_bayes)**2;
      g1_&h = (&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
      beda_g_&h = g1_&h - g1;
    run;

  %end; /* the 30 delete-one data sets must all exist before they are merged */

  data work.mse_nb_j;
    merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
          work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
          work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
          work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
          work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
          work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
    by kk;
  run;

  data work.mse_nb_j;
    set work.mse_nb_j;
    t_sum=0;
    g_sum=0;
    %do j=1 %to 30;
      g_sum = g_sum + beda_g_&j;
      t_sum = t_sum + delta_&j;
    %end;
    m2i = ((30-1)/30)*t_sum;
    m1i = g1 - ((30-1)/30)*g_sum;
    mse_j_b = m1i + m2i;
    rrmse_j_b = sqrt(mse_j_b)/teta_hat_bayes;
    ul = &x;
  run;

  proc append data=work.mse_nb_j base=work.mse_nb_j1;
  run;

  data work.hasilnb;
    merge work.d work.mse_nb_j;
    keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes
         teta_hat_bayes bias_b mse_j_b rrmse_j_b;
  run;

  ods listing;
  option nodate ls=130 ps=130;
  ods html;

  proc append data=work.hasilnb base=work.hasilnb1 appendver=v6;
  run;

%end;
%mend;

%sae_nb;


Appendix 7 SAS program for EB with ZINB

%macro sae_zinb;
%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  proc countreg data=a;
    model ypoi = x1 / dist=zinb method=qn;
    zeromodel ypoi ~ x1;
    ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
  run;

  proc transpose data=work.pe out=work.pe_t;
  run;

  data _null_;
    set work.pe_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Inf_Intercept', col3);
    call symput('Inf_x1', col4);
    call symput('_Alpha', col5);
  run;

  data work.duga;
    set a;
    mu_hat_b = exp(&Intercept + &x1*x1);
    w_bayes = mu_hat_b/(mu_hat_b + &_Alpha);
    teta_hat_bayes = w_bayes*ypoi + (1-w_bayes)*mu_hat_b;
    g1 = (&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
    bias_b = abs(teta_hat_bayes-parmlamdha);
  run;

  proc append data=work.duga base=work.duga1;
  run;

  /* jackknife: refit the model with area h deleted */
  %do h=1 %to 30;

    data work.d;
      set work.duga1;
      if ^(u=&x) then delete;
    run;
    data work.jackzinb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc countreg data=jackzinb&h;
      model ypoi = x1 / dist=zinb method=qn;
      zeromodel ypoi ~ x1;
      ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
    run;

    data _null_;
      set work.betha_est_ZINBt&h;
      call symput('Intercept', col1);
      call symput('x1', col2);
      call symput('Inf_Intercept', col3);
      call symput('Inf_x1', col4);
      call symput('_Alpha', col5);
    run;

    data work.dugaZINB&h;
      set work.d;
      mu_hat_b_&h = exp(&Intercept + &x1*x1);
      /* mu_hat_b_&h = &b_o- + &b_1- * x1; */
      w_b_&h = mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
      teta_hat_&h = w_b_&h*ypoi + (1-w_b_&h)*mu_hat_b_&h;
      delta_&h = (teta_hat_&h - teta_hat_bayes)**2;
      /* g1_&h = ((mu_hat_b_&h**2*&alpha_)**2)*(&alpha_+y_i)/((mu_hat_b_&h**2*&alpha_)+mu_hat_b_&h)**2; */
      g1_&h = (&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
      /* g1_&h = (A**2)*(&k- + y_i)/(a + mu_hat_b)**2; */
      beda_g_&h = g1_&h - g1;
    run;

  %end; /* the 30 delete-one data sets must all exist before they are merged */

  data work.mse_ZINB_j;
    merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
          work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
          work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
          work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
          work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
          work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
    by kk;
  run;

  data work.mse_ZINB_j;
    set work.mse_ZINB_j;
    t_sum=0;
    g_sum=0;
    %do j=1 %to 30;
      g_sum = g_sum + beda_g_&j;
      t_sum = t_sum + delta_&j;
    %end;
    m2i = ((30-1)/30)*t_sum;
    m1i = g1 - ((30-1)/30)*g_sum;
    mse_j_b = m1i + m2i;
    rrmse_j_b = sqrt(mse_j_b)/teta_hat_bayes;
  run;

  data work.hasilZINB;
    merge work.d work.mse_ZINB_j;
    keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes
         teta_hat_bayes bias_b mse_j_b rrmse_j_b;
  run;

  ods listing;
  option nodate ls=130 ps=130;
  ods html;

  proc append data=work.hasilZINB base=work.hasilZINB1;
  run;

%end;
%mend;

%sae_zinb;

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 10: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

3

Lawless (1987) elaborated the mixture model parameterization of the negative binomial providing formulas for its log likelihood mean variance and moments Later Breslow (1990) cited Lawlessrsquo work and since its inception to the late 1980rsquos the negative binomial regression model have been construed as a mixture model that is useful for accommodating otherwise over-dispersed Poisson data (Hardin amp Hilbe 2007) The negative binomial distribution function is written as

yk

kk

k

y

kyxyg

)1()(

)()|(

where y=012hellip k and are negative

binomial parameter with )(yE and

ky 2)var( k mention as disperse

parameter which is shown that the data consist of over-dispersed

Over-disperse at Count DataCount data for Poisson regression

including by over-disperse if variance bigger than mean or if the expected value of variance is smaller than expected This phenomenon is written as

)()( ii yEyVar (McCullagh amp Nelder 1989)

Zero-Inflated ModelsZero-Inflated models consider two distinct

sources of zero outcomes One source is generated from individuals who do not enter into the counting process the other from those who do enter the count process but result in a zero outcome (Hardin amp Hilbe 2007)

Lambert (1992) first described this type of mixture model in the context of process control in manufacturing It has since been used in many applications and is now founddiscussed in nearly every book or article dealing with count response models

For the zero-inflated model the probability of observing a zero outcome equals the probability that an individual is in the always-zero group plus the probability that individual is not that group times the probability that the counting process produces a zero If )0(B as

the probability that the binary process result in a zero outcomes and )0Pr( as the probability

that the counting of a zero outcomes the probability of a zero outcome for the system is then given by (Hardin amp Hilbe 2007)

)0Pr()1()0()0Pr( ZBy The probability of a nonzero count is

)Pr()]0(1[)0Pr( kBkky This model would produce two groups of

parameter one is zero-inflation parameter which shown that the covariate significantly contribute to having a zero outcomes And the other parameter is negative binomial parameter which modeling the response with the covariate

Zero-Inflated Negative BinomialThere are many kinds of zero-inflated

model each model has plus and minus and is used in different type of data Zero-Inflated negative binomial is one kind of them This model is used in over-disperse and excess-zero data As a result among parameter estimators there would be k parameters which indicate that over-disperse occur in data just as disperse parameter in negative binomial regression

The probability distribution of this model is as follow

)|( iii xyYP )|0()(1)( iii xgxx )|()(1 iii xygx

Where is a function of iz ix are vector

of zero-inflated covariate and is a vector of

zero-inflated coefficient which will be estimated Meanwhile )|( ii xyg is probability

distribution of negative binomial written asiy

i

i

iii

iii y

yxyg

)1()(

)()|(

Mean and variance of ZINB are

))(1)(1()|(

)1()|(

iiiiii

iiii

xyV

xyE

Jackknife Method of Estimating MSE( EBi )

Jackknife methods is one of general methods used in survey because itrsquos unpretentious concept (Jiang Lahiri and Wan 2002) This methods have been known by Tukey (1958) and developed to be a method that capable to be bias corrected of estimator by remove observation-i for i=1hellipm and performs parameter estimation

Rao (2003) the Jackknife step to estimate MSE( EB

i ) are

1 Assume that )ˆˆ(ˆ iiEBi yk

)ˆˆ(ˆ111 ii

EBi yk then calculate

m

l

EBi

EBii m

mM 2

12 )ˆˆ(1ˆ

2 Calculate the delete-i estimator 1

ˆ

and

1 then calculate

4

)]ˆˆ()ˆˆ([1

)ˆˆ(ˆ111111 ii

m

miiiiii ygyg

m

mygM

And )ˆ( 21 vig is the variance estimator of

posterior distribution which is used to measure the variability associated with i

The use of )ˆ( 21 vig is leads to severe of

underestimation of )ˆ( EBiJMSE related

with estimation in prior parameter Therefore the estimator

iM1ˆ correct the

bias of )ˆ( 21 vig

3 Calculate the jackknife estimator of MSE( EB

i ) as

iiEBiJ MMMSE 21

ˆˆ)ˆ(

METHODOLOGY

Data

This research assumed that the available auxiliary data are at the area level, so the basic area level model was used. The data were simulated with 30 small areas and one covariate. Each batch was generated under a different excess-zero condition, with the probability of zero in a small area ranging from 0.1 to 0.9. The relationship between the response and the covariate was assumed to be linear.

Methods

The following steps were used to generate the data with SAS 9.1:
1. Fix the value of $X_i$ for the $i$-th area.
2. Define the expected probability of zero in each small area, $P(Y_i = 0)$, then calculate $Lambda_i = -\log(P(Y_i = 0))$.
3. Generate $\theta_i \sim Gamma(1, 1)$.
4. Calculate $\lambda_i^* = \log(Lambda_i / \theta_i)$.
5. Fit a linear regression of $\lambda_i^*$ on $X_i$ to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.
6. Calculate $\mu_i = \exp(X_i'\hat{\beta})$.
7. Calculate $parmlambda_i = \mu_i \theta_i$.
8. Generate $y_i \sim Poisson(parmlambda_i)$.
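The eight generating steps can be sketched in Python for a single replicate. This is an illustration under stated assumptions rather than the SAS program of Appendix 5: step 4 is taken to be the ratio $Lambda_i / \theta_i$, the covariate values are hypothetical, and the helper names `poisson` and `simulate_batch` are introduced only for this example.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_batch(x, p_zero, rng):
    """One replicate: returns (theta, parmlambda, y) for the areas in x."""
    m = len(x)
    lam = -math.log(p_zero)                                   # step 2
    theta = [rng.gammavariate(1.0, 1.0) for _ in range(m)]    # step 3
    star = [math.log(lam / t) for t in theta]                 # step 4 (assumed ratio)
    # Step 5: ordinary least squares of star on x.
    x_bar = sum(x) / m
    s_bar = sum(star) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (si - s_bar) for xi, si in zip(x, star)) / sxx
    b0 = s_bar - b1 * x_bar
    mu = [math.exp(b0 + b1 * xi) for xi in x]                 # step 6
    parm = [m_i * t for m_i, t in zip(mu, theta)]             # step 7
    y = [poisson(rng, p) for p in parm]                       # step 8
    return theta, parm, y


x = [float(i) for i in range(1, 31)]  # hypothetical covariate for 30 areas
theta, parmlambda, y = simulate_batch(x, 0.6, random.Random(2009))
share_zero = sum(1 for v in y if v == 0) / len(y)
print(share_zero)
```

Because the regression in step 5 smooths $\lambda_i^*$, the realized share of zeros in a replicate generally differs from the target probability, which is consistent with the ranges of actual zero percentages reported in the appendices.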

The following steps were then applied to analyze the data:
1. Fit the negative binomial regression with the GENMOD procedure in SAS 9.1 and the zero-inflated negative binomial regression with the COUNTREG procedure in SAS 9.2.
2. Estimate the prior parameters.
3. Estimate $\theta_i$ using the EB method.
4. Calculate the MSE of the indirect estimator.
5. Calculate the RRMSE (Root Relative Mean Square Error):

\[
RRMSE(\hat{\theta}_i) = \frac{\sqrt{MSE(\hat{\theta}_i)}}{\hat{\theta}_i}
\]

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial Regression

In the case of non-excess-zero data, the estimator produced a small and consistent MSE. However, when the proportion of zeros is approximately 30% or more, with an expected probability of zero of 0.6, the estimates tend to be unreliable, and the EB estimation produced negative values.

The RRMSE of the estimator increases along with the number of zeros in the data. Furthermore, if the data contain at least 30% zeros, the estimator is unreliable.

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
01                    033           016             018             013
02                    035           020             026             020
03                    040           023             036             030
04                    042           027             050             042
05                    045           031             072             059
06                    -12875        033             -038            081
07                    253671        040             -1216           135
08                    -584495       030             30946           211
09                    39135606      016             116E+10         664

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
01                    033           016             018             013
02                    035           020             026             020
03                    040           023             036             030
04                    042           027             050             042
05                    046           031             071             058
06                    26197         033             -035            075
07                    950007        040             -1002           099
08                    1444250       030             22054           110
09                    41595285      016             677E+09         056


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE obtained after the negative MSE values have been changed to zero.

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the expected probability of zero is 0.8 to 0.9. However, when the expected probability of zero is 0.6 to 0.7, the mean of RRMSE (II) is negative because some EB estimates are negative.
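The effect of replacing negative MSE values with zero can be illustrated with a short Python sketch. The numbers below are hypothetical and the function names `truncate` and `rrmse` are assumptions made for this example; the sketch only shows that the truncated mean cannot be smaller than the raw mean, and that a negative EB estimate produces a negative RRMSE.

```python
import math


def truncate(mse_values):
    """MSE (II): replace every negative MSE value with zero."""
    return [max(v, 0.0) for v in mse_values]


def rrmse(mse, theta_hat):
    """RRMSE (II): sqrt of the truncated MSE divided by the EB estimate."""
    return math.sqrt(max(mse, 0.0)) / theta_hat


# Hypothetical jackknife MSE values and EB estimates for three areas.
mse = [0.25, -4.0, 0.09]
theta_hat = [2.0, 1.0, -0.5]

mse_ii = truncate(mse)
rrmse_ii = [rrmse(m, t) for m, t in zip(mse, theta_hat)]
mean_raw = sum(mse) / len(mse)
mean_ii = sum(mse_ii) / len(mse_ii)
print(mse_ii, rrmse_ii, mean_raw, mean_ii)
```

An EB estimate of exactly zero would make the ratio undefined, so a real implementation would need to handle that case separately.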

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to those produced by the NBR method, although NBR slightly outperforms ZINB when the data contain only a small number of zeros. In particular, as shown in Table 3, when the expected probability of zero is 0.1 to 0.5, ZINB produces a bigger MSE for the EB estimator than NBR does.

When the expected probability of zero is 0.6 to 0.7, however, ZINB gives better estimates. The estimates are also unbiased, as they cover the parameter values adequately. ZINB begins to produce inconsistent estimates when the expected probability of zero is 0.8 or more, owing to enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is bigger than the mean of MSE with ZINB. This is because, once the negative MSE values are changed to zero, they no longer reduce the mean.

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

The EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a bigger MSE than NBR does as long as the expected probability of zero stays below the 0.6 threshold.

The ZINB method performs better when the expected probability of zero is 0.6 to 0.7. In this case the EB estimates given by the NBR method are unstable and inconsistent because of negative estimates and a huge MSE that can be thousands of times larger than its acceptable value. In contrast, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimator with NBR.

Both methods perform poorly when the expected probability of zero is 0.8 or more. The EB estimators from both methods are inconsistent because of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
01                    045           017             024             014
02                    043           020             033             021
03                    071           028             052             032
04                    054           028             0632            042
05                    086           033             7322807         066
06                    061           038             29817           103
07                    058           025             218119          194
08                    -128          -14E-07         162697          375
09                    2954790       -1E-06          35E+278         609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero   Mean of MSE   Median of MSE   Mean of RRMSE   Median of RRMSE
01                    045           017             024             014
02                    0436          020             0324            021
03                    072           028             051             0311
04                    055           028             061             041
05                    095           033             6561235         058
06                    075           038             23406           070
07                    150           025             134655          069
08                    175           0               733506          0
09                    2954908       0               12E+278         0

CONCLUSION

Excess zeros in the data strongly influenced the results of EB estimation. A conventional method such as negative binomial regression in prior estimation produced an unbiased but unreliable EB estimator for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation by the ZINB method produced a more reliable estimator even when the expected probability of zero was 0.6 to 0.7.

The ZINB also provided a reliable estimator for data with less than 53.33% zeros. This means that the performance of ZINB declines when the expected probability of zero is 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research is based on many assumptions and has several limitations. Better results can be expected if the assumptions and restrictions are relaxed. Recommendations for further research are:
1. The data-generating process in this research does not reflect a real sampling process. A generating process similar to real sampling might give better results because it would be closer to real applications.
2. It would be interesting to run experiments with a larger number of areas, since the number of areas influences the modeling.
3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR in order to avoid negative MSE values.
4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, Paper 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008]

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B, et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. httpwwwodhohiogovDataOFHSurvofhsrfq7pdf [24 April 2008]

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. httpwwwbepresscomuwbiostatpaper286pdf [24 April 2008]


Appendix 1 Result of EB estimation with NBR

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                        bias          0000446   05164     0878579   1315093   1721091   8704671
                        MSE           0040547   0109118   0159448   0333613   0335256   4167064
                        RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      1333-3667  100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                        bias          0000587   0413611   079407    1100373   1454889   7906915
                        MSE           0055631   0131969   0196963   0353033   0386291   3778251
                        RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-5333    100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                        bias          0000151   0372382   067041    0916482   122012    5950225
                        MSE           0074364   0163462   0231014   0400207   0432371   5250254
                        RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      2333-5667  100  EB estimator  024882    064963    1219656   17107     2248716   930007
                        bias          0000564   0293602   0549809   0757937   1007851   486688
                        MSE           -100569   0194196   0271669   041875    045917    3239598
                        RRMSE         0123605   0300339   0422426   0503566   0642418   2202294
50      2333-6333  100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                        bias          000029    0250747   0453265   0622838   0803185   4009352
                        MSE           -237643   0235733   0306641   0452955   05091     3652167
                        RRMSE         0038956   0412708   0588924   0717336   0844735   3240156
60      30-70      100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                        bias          0000452   020433    0398131   0534095   0679938   3848209
                        MSE           -749011   0254097   0330078   -12875    0539873   2354887
                        RRMSE         -663045   051763    0813734   -038057   1287528   1767434
70      4333-7333  100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                        bias          0000375   0155154   0316124   0476883   0588926   8475103
                        MSE           -7513075  0235378   0402092   2536714   0876569   6051162
                        RRMSE         -10741    0704796   1355566   -121606   3040291   3332419
80      5333-90    100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                        bias          0000395   0116669   0254473   1091172   0497898   6297454
                        MSE           -6E+09    -016583   0301527   -584495   5718409   185E+09
                        RRMSE         -212936   0927338   2115163   3094627   1359703   4151289
90      70-100     100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                        bias          000016    0086      0177169   0425532   0314714   1092655
                        MSE           -38E+09   -130817   0159682   39135606  3074073   12E+11
                        RRMSE         -909131   1647188   6639631   116E+10   1585472   706E+11


Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                        bias          0000603   0485515   0882468   1315721   1750173   8704672
                        MSE           -053954   010933    0168797   0449506   0369775   360843
                        RRMSE         0022947   0096443   0136424   0238099   0241955   5518468
20      1333-3667  100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                        bias          0000368   0383426   0780984   1105029   1496623   7906918
                        MSE           -07309    0126202   0201463   0425844   0414597   1734815
                        RRMSE         0021807   0144983   0210692   0326097   0401786   3177943
30      20-5333    100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                        bias          0000508   0332427   0680187   0928947   1254604   6314973
                        MSE           -229891   0156942   0277017   0707983   0590466   7469014
                        RRMSE         0023998   0210095   0317195   0519524   0618802   3500387
40      2333-5667  100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                        bias          0000564   0268248   0544049   0771741   1067061   4889872
                        MSE           -125713   0181557   0284338   0540615   0498521   423089
                        RRMSE         0054916   028362    0420396   0630776   0778033   5394515
50      2333-6333  100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                        bias          0000453   0224726   0454522   0661709   0900005   4414442
                        MSE           -181856   0194818   0334706   0859252   0711939   7997074
                        RRMSE         0026206   0387294   0662251   7322807   1312302   13388294
60      30-70      100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                        bias          62E-05    0190886   0406245   0567657   074167    3923952
                        MSE           -34589    0078006   0376514   060793    0804116   3426488
                        RRMSE         000461    0502807   1033578   2981671   2012552   3308816
70      4333-7333  100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                        bias          979E-05   0128017   0358257   0487174   0654423   3733981
                        MSE           -142213   -001433   0255331   0584152   1132152   264456
                        RRMSE         0064209   0847956   1942286   2181192   4589042   7899681
80      5333-90    100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                        bias          0000161   0083397   0272773   0392826   0557213   3532556
                        MSE           -10651    -56E-05   -14E-07   -127819   1452962   1132741
                        RRMSE         0063244   1475413   3754705   162697    9221163   3786684
90      70-100     100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                        bias          0000495   0054221   0153374   027819    0350213   2736904
                        MSE           -175652   -33E-05   -1E-06    2954790   152E-06   613E+08
                        RRMSE         0040681   4059441   6095076   35E+278   5569021   16E+281


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                        bias          0000446   05164     0878579   1315093   1721091   8704671
                        MSE           0040547   0109118   0159448   0333613   0335256   4167064
                        RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      1333-3667  100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                        bias          0000587   0413611   079407    1100373   1454889   7906915
                        MSE           0055631   0131969   0196963   0353033   0386291   3778251
                        RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-5333    100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                        bias          0000151   0372382   067041    0916482   122012    5950225
                        MSE           0074364   0163462   0231014   0400207   0432371   5250254
                        RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      2333-5667  100  EB estimator  024882    064963    1219656   17107     2248716   930007
                        bias          0000564   0293602   0549809   0757937   1007851   486688
                        MSE           0         0194196   0271669   0419181   045917    3239598
                        RRMSE         0         0300116   0422209   0502895   0641904   2202294
50      2333-6333  100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                        bias          000029    0250747   0453265   0622838   0803185   4009352
                        MSE           0         0235733   0306641   0456258   05091     3652167
                        RRMSE         0         0410357   0585765   0712314   0841838   3240156
60      30-70      100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                        bias          0000452   020433    0398131   0534095   0679938   3848209
                        MSE           0         0254097   0330078   2619677   0539873   2354887
                        RRMSE         -663045   0448118   0750369   -034911   1209918   1767434
70      4333-7333  100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                        bias          0000375   0155154   0316124   0476883   0588926   8475103
                        MSE           0         0235378   0402092   9500073   0876569   6051162
                        RRMSE         -10741    0288999   0995659   -100163   2527784   3332419
80      5333-90    100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                        bias          0000395   0116669   0254473   1091172   0497898   6297454
                        MSE           0         0         0301527   1444250   5718409   185E+09
                        RRMSE         -212936   0         1104113   2205437   5656681   4151289
90      70-100     100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                        bias          000016    0086      0177169   0425532   0314714   1092655
                        MSE           0         0         0159682   41595285  3074073   12E+11
                        RRMSE         -909131   0         0557622   677E+09   9311925   706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                        bias          0000603   0485515   0882468   1315721   1750173   8704672
                        MSE           0         010933    0168797   0450626   0369775   360843
                        RRMSE         0         0095932   0135647   023675    0239669   5518468
20      1333-3667  100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                        bias          0000368   0383426   0780984   1105029   1496623   7906918
                        MSE           0         0126202   0201463   0428006   0414597   1734815
                        RRMSE         0         0142648   020709    0320663   0395479   3177943
30      20-5333    100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                        bias          0000508   0332427   0680187   0928947   1254604   6314973
                        MSE           0         0156942   0277017   0716543   0590466   7469014
                        RRMSE         0         0203913   0311937   0506882   0615401   3500387
40      2333-5667  100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                        bias          0000564   0268248   0544049   0771741   1067061   4889872
                        MSE           0         0181557   0284338   0549835   0498521   423089
                        RRMSE         0         0270309   0405926   0606317   0766631   5394515
50      2333-6333  100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                        bias          0000453   0224726   0454522   0661709   0900005   4414442
                        MSE           0         0194818   0334706   094973    0711939   7997074
                        RRMSE         0         0316402   0576343   6561235   1240175   13388294
60      30-70      100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                        bias          62E-05    0190886   0406245   0567657   074167    3923952
                        MSE           0         0078006   0376514   0749436   0804116   3426488
                        RRMSE         0         0258286   0698814   2340612   1714808   3308816
70      4333-7333  100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                        bias          979E-05   0128017   0358257   0487174   0654423   3733981
                        MSE           0         0         0255331   1501268   1132152   264456
                        RRMSE         0         0         0688797   1346552   2500825   7899681
80      5333-90    100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                        bias          0000161   0083397   0272773   0392826   0557213   3532556
                        MSE           0         0         0         1755486   1452962   1132741
                        RRMSE         0         0         0         7335062   3311711   3786684
90      70-100     100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                        bias          0000495   0054221   0153374   027819    0350213   2736904
                        MSE           0         0         0         2954908   152E-06   613E+08
                        RRMSE         0         0         0         12E+278   416189    16E+281


Appendix 5 Syntax program for generating data

/* generate x1 (covariate) and ei */
data b;
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

/* generate poisson-gamma with excess zero */
data e;
  do kk=1 to 30;
    set b;
    tetha = rangam(1,1);
    lambda = -log(0.1);   /* desired probability of zero (0.1-0.9) */
    starlambda = log(lambda/tetha);
    output;
  end;
run;

proc reg;
  model starlambda = x1;
  ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
run;

proc transpose data=work.betha_lr out=work.betha_lr_t;
run;

data _null_;
  set work.betha_lr_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
run;

data d;
  do kk=1 to 30;
    set e;
    mu = exp(&Intercept + &x1*x1);
    parmlambda = mu*tetha;
    ypoi = rand('poisson', parmlambda);
    output;
  end;
run;

ods trace on;
/* to take percent zero on data */
proc freq data=d;
  tables ypoi;
  ods output OneWayFreqs=work.zero;
run;
data zero;
  set zero;
  keep percent;
run;
proc transpose data=zero out=zero1;
run;
data _null_;
  set work.zero1;
  call symput('pctz', col1);
run;
data d;
  set d;
  pzero=&pctz;
  r=&r;
run;

proc append data=d base=d1;
run;

%end;
%mend;

%bangkit_data;


Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
  set work.betha_nb_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
  bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
  set work.duga_nb1;
  if ^(u=&x) then delete;
run;
data work.jacknb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
  /* output p out=sasyi_est */
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_est_nb&h (keep=Parameter Estimate);
run;
proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;
data _null_;
  set work.betha_est_nbt&h;
  call symput('Intercept_', col1);
  call symput('x1_', col2);
  call symput('Dispersion_', col3);
run;

data work.duganb&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
  w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
  teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
  beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
  merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
        work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
        work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
        work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
        work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
        work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
  by kk;
run;

data work.mse_nb_j;
  set work.mse_nb_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
  ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
  merge work.d work.mse_nb_j;
  keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb base=work.hasilnb1 appendver=v6;
run;

%end;
%mend;

%sae_nb;


Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

proc countreg data=a;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
  set work.pe_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.duga;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
  bias_b=abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
  set work.duga1;
  if ^(u=&x) then delete;
run;
data work.jackzinb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
  set work.betha_est_ZINBt&h;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept + &x1*x1);
  /* mu_hat_b_&h= &b_o- + &b_1- x1 */
  w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
  teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  /* g1_&h =((mu_hat_b_&h2&alpha_)2)(&alpha_+y_i)((mu_hat_b_&h2&alpha_)+mu_hat_b_&h)2 */
  g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
  /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
  beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
  merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
        work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
        work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
        work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
        work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
        work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
  by kk;
run;

data work.mse_ZINB_j;
  set work.mse_ZINB_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
  merge work.d work.mse_ZINB_j;
  keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;
ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB base=work.hasilZINB1;
run;

%end;
%mend;

%sae_zinb;

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 11: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

4

)]ˆˆ()ˆˆ([1

)ˆˆ(ˆ111111 ii

m

miiiiii ygyg

m

mygM

And )ˆ( 21 vig is the variance estimator of

posterior distribution which is used to measure the variability associated with i

The use of )ˆ( 21 vig is leads to severe of

underestimation of )ˆ( EBiJMSE related

with estimation in prior parameter Therefore the estimator

iM1ˆ correct the

bias of )ˆ( 21 vig

3 Calculate the jackknife estimator of MSE( EB

i ) as

iiEBiJ MMMSE 21

ˆˆ)ˆ(

METHODOLOGY

DataThis research assumed that the available

auxiliary data is on area level so this research used basic area level model The data were simulated with 30 small areas and one covariate Every batch generated different conditions of excess-zeros data start from 01 until 09 probability of zero in small area This research assumed structure of relation between respond and covariate was linear

MethodsThe following steps in generating data

using SAS 91 were used1 Fix the value of

iX for the- i th area

2 Define the expected probability of zero in each small area ))0(( iYP then

calculate ))0(log( ii YPLambda

3 Generate )11(~ Gammai4 Calculate )log(

iiLambda 5 Fit linear regression between and

iX to

obtain0 and

16 Calculate )`exp(X= ii 7 Calculate

iiparmlambda 8 Generate )(~ parmlambdaPoissonyi

Moreover in analyzing data the following steps were applied 1 Generate the negative binomial regression

with genmod procedure in SAS 91 and Zero-Inflated Negative binomial Regression with countreg procedure in SAS 92

2 Estimate the prior parameter which are and

3 Estimate using EB method4 Calculate MSE for indirect estimation5 Calculate RRMSE (Root Relative Mean

Square Error)

i

ii

MSERRMSE

ˆ)ˆ(

)ˆ(

RESULT AND DISCUSSION

Estimation of Prior Parameter is Based on EB Method with Negative Binomial

RegressionIn case of non-excess-zero data the

estimator produced small and consistent MSE Meanwhile if the number of excess-zero isapproximately 30 or more with expected probability of zero 06 the performance of estimates tends to be unreliable As a result EB estimation produced negative values

RRMSE of the estimator increasessimultaneously along with the increase of number of zero in the data Furthermore if thedata contain excess zero at least 30 theestimator is unreliable

Table 1 MSE and RRMSE of EB Estimator with NBR

Probability of zero

Mean of MSE

Median of MSE

Mean of RRMSE

Median of RRMSE

01 033 016 018 01302 035 020 026 02003 040 023 036 03004 042 027 050 04205 045 031 072 05906 -12875 033 -038 08107 253671 040 -1216 13508 -584495 030 30946 21109 39135606 016 116E+10 664

Table 2 MSE (II) and RRMSE (II) of EB Estimator with NBR

Probability of zero

Mean of MSE

Median of MSE

Mean of RRMSE

Median of RRMSE

01 033 016 018 01302 035 020 026 02003 040 023 036 03004 042 027 050 04205 046 031 071 05806 26197 033 -035 07507 950007 040 -1002 09908 1444250 030 22054 11009 41595285 016 677E+09 056

5

Table 1 show that the iterative process produced unexpected negative values of MSEThe simplest way to solve this problem is tochange the negative value to zero MSE (II) and RRMSE (II) in table 2 are the result of MSE and RRMSE after the negative value of MSE has been changed to zero

When data have expected probability of zero by 06 to 09 mean of MSE (II) increases drastically Similarly mean of RRMSE (II) increases sharply when data have 08 to 09 expected probability of zero However when data have 06 to 07 expected probability of zero the mean of RRMSE (II) is negative due to the negative value of EB estimates

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated

Negative Binomial RegressionThe EB estimates are similar to the

estimates produced by NBR method although they are slightly outperformed NBR method when the data only contain small number of zeros In particular as shown by table 3 if data have expected probability of zero by 01 to 05 ZINB produces bigger MSE for EB estimator than which NBR produces

Whereas if data have expected probability of zero by 06 to 07 ZINB gives better estimates The estimates were also unbiased as it covers parameter values adequately However ZINB begins to produce inconsistent estimates if data have expected probability of zero by 08 or more due to enormous MSE

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

Mean of MSE (II) with ZINB is biggerthan the mean of MSE with ZINB That is because when negative value of MSE changed to zero it doesnrsquot have reduction factor in the mean calculation

Comparison of EB estimator withNegative Binomial Regression and EB

estimator with ZINBEB estimates given by both NBR and

ZINB methods are similar for data with small numbers of zero However ZINB method produces bigger MSE than NBR do as long as expected probability of zero in data does not exceed 06 thresholds

But ZINB method performs better if data have expected probability of zero by 06 to 07 In this case EB estimates given by NBR method are unstable and inconsistent due to estimatesrsquo negative value and huge MSE that

can be thousand times larger than theiracceptable value On the other hand EB estimator with ZINB works well it givesunbiased estimates and its MSE values are more stable than EB estimates with NBR

Both methods would have performed poorly if data had expected probability of zero by 08 or more EB estimators with both methods were inconsistent as a result of very huge MSE values they produced

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero

Mean of MSE

Median ofMSE

Mean ofRRMSE

Median of RRMSE

01 045 017 024 01402 043 020 033 02103 071 028 052 03204 054 028 0632 04205 086 033 7322807 06606 061 038 29817 10307 058 025 218119 19408 -128 -14E-07 162697 37509 2954790 -1E-06 35E+278 609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero

Mean of MSE

Median of MSE

Mean of RRMSE

Median of RRMSE

01 045 017 024 01402 0436 020 0324 02103 072 028 051 031104 055 028 061 04105 095 033 6561235 05806 075 038 23406 07007 150 025 134655 06908 175 0 733506 009 2954908 0 12E+278 0

CONCLUSION

Excess-zero in data highly influenced the result of EB estimation Conventional method such as negative binomial regression in prior estimation has produced unbiased and unreliable EB estimator for data with expected probability of zero by 06 This is shown bybig number of MSE and negative value of estimator

Meanwhile EB estimation by ZINB method produced more reliable estimator even when the data have expected probability of zero by 06 to 07

The ZINB has also provided a reliable estimator for data with less than 5333 of zeros This means that performance of ZINB

6

declines when the data have expected probability of zero by 08 or more As shown by the big MSE and inconsistent estimator

RECOMMENDATION

This research is based on many assumptions and suffered by several limitations If the assumptions and boundaries can be relaxed can be expected better result There are some recommendations for the next research1 The generating process in this research

does not reflect the real sampling processIf the generating process similar to the real sampling process it might give better result because it will be closer with the real application

2 It will be more interesting to runexperiment which takes account of larger number of areas since the number of areas will influence data modeling

3 The Restricted Maximum Likelihood maybe applied when estimating prior parameter with ZINB and NBR in other to solve the negative value of MSE

4 Theoretical research of ZINB and Empirical Bayes estimator is important to understand the behavior of parameter estimates of ZINB in Empirical Bayes setting

REFERENCES

Erdman D, L Jackson, A Sinko. 2008. Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure. SAS Global Forum 2008, Paper 322-2008. http://www2.sas.com/proceedings/forum2008/322-2008.pdf [25 August 2008].

Famoye F, KP Singh. 2006. Zero-Inflated Generalized Poisson Regression Model with an Application to Domestic Violence Data. Journal of Data Science 4:117-130.

Hardin JW, JM Hilbe. 2007. Generalized Linear Models and Extensions. Texas: A Stata Press Publication.

Kurnia A, KA Notodiputro. 2006. Penerapan Metode Jackknife dalam Pendugaan Area Kecil. Forum Statistika dan Komputasi, April 2006, p. 12-15.

Kismiantini. 2007. Pendugaan Statistik Area Kecil Berbasis Model Poisson-Gamma [Thesis]. Bogor: Institut Pertanian Bogor, Fakultas Matematika dan Pengetahuan Alam.

McCullagh P, JA Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.

Ramsini B et al. 2001. Uninsured Estimates by County: A Review of Options and Issues. http://www.odh.ohio.gov/Data/OFHSurv/ofhsrfq7.pdf [24 April 2008].

Rao JNK. 2003. Small Area Estimation. New York: John Wiley & Sons.

Wakefield J. 2006. Disease mapping and spatial regression with count data. http://www.bepress.com/uwbiostat/paper286.pdf [24 April 2008].

Appendix 1 Result of EB estimation with NBR

P(Y=0)  Actual      r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333     100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                         bias          0000446   05164     0878579   1315093   1721091   8704671
                         MSE           0040547   0109118   0159448   0333613   0335256   4167064
                         RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      1333-3667   100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                         bias          0000587   0413611   079407    1100373   1454889   7906915
                         MSE           0055631   0131969   0196963   0353033   0386291   3778251
                         RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-5333     100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                         bias          0000151   0372382   067041    0916482   122012    5950225
                         MSE           0074364   0163462   0231014   0400207   0432371   5250254
                         RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      2333-5667   100  EB estimator  024882    064963    1219656   17107     2248716   930007
                         bias          0000564   0293602   0549809   0757937   1007851   486688
                         MSE           -100569   0194196   0271669   041875    045917    3239598
                         RRMSE         0123605   0300339   0422426   0503566   0642418   2202294
50      2333-6333   100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                         bias          000029    0250747   0453265   0622838   0803185   4009352
                         MSE           -237643   0235733   0306641   0452955   05091     3652167
                         RRMSE         0038956   0412708   0588924   0717336   0844735   3240156
60      30-70       100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                         bias          0000452   020433    0398131   0534095   0679938   3848209
                         MSE           -749011   0254097   0330078   -12875    0539873   2354887
                         RRMSE         -663045   051763    0813734   -038057   1287528   1767434
70      4333-7333   100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                         bias          0000375   0155154   0316124   0476883   0588926   8475103
                         MSE           -7513075  0235378   0402092   2536714   0876569   6051162
                         RRMSE         -10741    0704796   1355566   -121606   3040291   3332419
80      5333-90     100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                         bias          0000395   0116669   0254473   1091172   0497898   6297454
                         MSE           -6E+09    -016583   0301527   -584495   5718409   185E+09
                         RRMSE         -212936   0927338   2115163   3094627   1359703   4151289
90      70-100      100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                         bias          000016    0086      0177169   0425532   0314714   1092655
                         MSE           -38E+09   -130817   0159682   39135606  3074073   12E+11
                         RRMSE         -909131   1647188   6639631   116E+10   1585472   706E+11

Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual      r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333     100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                         bias          0000603   0485515   0882468   1315721   1750173   8704672
                         MSE           -053954   010933    0168797   0449506   0369775   360843
                         RRMSE         0022947   0096443   0136424   0238099   0241955   5518468
20      1333-3667   100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                         bias          0000368   0383426   0780984   1105029   1496623   7906918
                         MSE           -07309    0126202   0201463   0425844   0414597   1734815
                         RRMSE         0021807   0144983   0210692   0326097   0401786   3177943
30      20-5333     100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                         bias          0000508   0332427   0680187   0928947   1254604   6314973
                         MSE           -229891   0156942   0277017   0707983   0590466   7469014
                         RRMSE         0023998   0210095   0317195   0519524   0618802   3500387
40      2333-5667   100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                         bias          0000564   0268248   0544049   0771741   1067061   4889872
                         MSE           -125713   0181557   0284338   0540615   0498521   423089
                         RRMSE         0054916   028362    0420396   0630776   0778033   5394515
50      2333-6333   100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                         bias          0000453   0224726   0454522   0661709   0900005   4414442
                         MSE           -181856   0194818   0334706   0859252   0711939   7997074
                         RRMSE         0026206   0387294   0662251   7322807   1312302   13388294
60      30-70       100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                         bias          62E-05    0190886   0406245   0567657   074167    3923952
                         MSE           -34589    0078006   0376514   060793    0804116   3426488
                         RRMSE         000461    0502807   1033578   2981671   2012552   3308816
70      4333-7333   100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                         bias          979E-05   0128017   0358257   0487174   0654423   3733981
                         MSE           -142213   -001433   0255331   0584152   1132152   264456
                         RRMSE         0064209   0847956   1942286   2181192   4589042   7899681
80      5333-90     100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                         bias          0000161   0083397   0272773   0392826   0557213   3532556
                         MSE           -10651    -56E-05   -14E-07   -127819   1452962   1132741
                         RRMSE         0063244   1475413   3754705   162697    9221163   3786684
90      70-100      100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                         bias          0000495   0054221   0153374   027819    0350213   2736904
                         MSE           -175652   -33E-05   -1E-06    2954790   152E-06   613E+08
                         RRMSE         0040681   4059441   6095076   35E+278   5569021   16E+281

Appendix 3 Result of EB estimation (II) with NBR

P(Y=0)  Actual      r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333     100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                         bias          0000446   05164     0878579   1315093   1721091   8704671
                         MSE           0040547   0109118   0159448   0333613   0335256   4167064
                         RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      1333-3667   100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                         bias          0000587   0413611   079407    1100373   1454889   7906915
                         MSE           0055631   0131969   0196963   0353033   0386291   3778251
                         RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-5333     100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                         bias          0000151   0372382   067041    0916482   122012    5950225
                         MSE           0074364   0163462   0231014   0400207   0432371   5250254
                         RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      2333-5667   100  EB estimator  024882    064963    1219656   17107     2248716   930007
                         bias          0000564   0293602   0549809   0757937   1007851   486688
                         MSE           0         0194196   0271669   0419181   045917    3239598
                         RRMSE         0         0300116   0422209   0502895   0641904   2202294
50      2333-6333   100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                         bias          000029    0250747   0453265   0622838   0803185   4009352
                         MSE           0         0235733   0306641   0456258   05091     3652167
                         RRMSE         0         0410357   0585765   0712314   0841838   3240156
60      30-70       100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                         bias          0000452   020433    0398131   0534095   0679938   3848209
                         MSE           0         0254097   0330078   2619677   0539873   2354887
                         RRMSE         -663045   0448118   0750369   -034911   1209918   1767434
70      4333-7333   100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                         bias          0000375   0155154   0316124   0476883   0588926   8475103
                         MSE           0         0235378   0402092   9500073   0876569   6051162
                         RRMSE         -10741    0288999   0995659   -100163   2527784   3332419
80      5333-90     100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                         bias          0000395   0116669   0254473   1091172   0497898   6297454
                         MSE           0         0         0301527   1444250   5718409   185E+09
                         RRMSE         -212936   0         1104113   2205437   5656681   4151289
90      70-100      100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                         bias          000016    0086      0177169   0425532   0314714   1092655
                         MSE           0         0         0159682   41595285  3074073   12E+11
                         RRMSE         -909131   0         0557622   677E+09   9311925   706E+11

Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0)  Actual      r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333     100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                         bias          0000603   0485515   0882468   1315721   1750173   8704672
                         MSE           0         010933    0168797   0450626   0369775   360843
                         RRMSE         0         0095932   0135647   023675    0239669   5518468
20      1333-3667   100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                         bias          0000368   0383426   0780984   1105029   1496623   7906918
                         MSE           0         0126202   0201463   0428006   0414597   1734815
                         RRMSE         0         0142648   020709    0320663   0395479   3177943
30      20-5333     100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                         bias          0000508   0332427   0680187   0928947   1254604   6314973
                         MSE           0         0156942   0277017   0716543   0590466   7469014
                         RRMSE         0         0203913   0311937   0506882   0615401   3500387
40      2333-5667   100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                         bias          0000564   0268248   0544049   0771741   1067061   4889872
                         MSE           0         0181557   0284338   0549835   0498521   423089
                         RRMSE         0         0270309   0405926   0606317   0766631   5394515
50      2333-6333   100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                         bias          0000453   0224726   0454522   0661709   0900005   4414442
                         MSE           0         0194818   0334706   094973    0711939   7997074
                         RRMSE         0         0316402   0576343   6561235   1240175   13388294
60      30-70       100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                         bias          62E-05    0190886   0406245   0567657   074167    3923952
                         MSE           0         0078006   0376514   0749436   0804116   3426488
                         RRMSE         0         0258286   0698814   2340612   1714808   3308816
70      4333-7333   100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                         bias          979E-05   0128017   0358257   0487174   0654423   3733981
                         MSE           0         0         0255331   1501268   1132152   264456
                         RRMSE         0         0         0688797   1346552   2500825   7899681
80      5333-90     100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                         bias          0000161   0083397   0272773   0392826   0557213   3532556
                         MSE           0         0         0         1755486   1452962   1132741
                         RRMSE         0         0         0         7335062   3311711   3786684
90      70-100      100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                         bias          0000495   0054221   0153374   027819    0350213   2736904
                         MSE           0         0         0         2954908   152E-06   613E+08
                         RRMSE         0         0         0         12E+278   416189    16E+281

Appendix 5 Syntax program for generating data

data b; /* generate x1 (covariate) and ei */
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

  data e; /* generate Poisson-gamma data with excess zeros */
    do kk=1 to 30;
      set b;
      tetha = rangam(1,1);
      lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
      starlambda = log(lambda/tetha);
      output;
    end;
  run;

  proc reg;
    model starlambda = x1;
    ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_lr out=work.betha_lr_t;
  run;

  data _null_;
    set work.betha_lr_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
  run;

  data d;
    do kk=1 to 30;
      set e;
      mu = exp(&Intercept + &x1*x1);
      parmlambda = mu*tetha;
      ypoi = rand('poisson', parmlambda);
      output;
    end;
  run;

  ods trace on;
  /* take the percentage of zeros in the data */
  proc freq data=d;
    tables ypoi;
    ods output OneWayFreqs=work.zero;
  run;

  data zero;
    set zero;
    keep percent;
  run;

  proc transpose data=zero out=zero1;
  run;

  data _null_;
    set work.zero1;
    call symput('pctz', col1);
  run;

  data d;
    set d;
    pzero=&pctz;
    r=&r;
  run;

  proc append data=d base=d1;
  run;

%end;
%mend;

%bangkit_data;
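The generating steps of Appendix 5 can be sketched in Python for readers without SAS. This is a minimal illustration under stated assumptions, not the program used in the thesis: the function names, the covariate values in the usage note, and the seed are hypothetical; `starlambda` is taken to be log(lambda/theta), since the operator is not legible in the source; and Knuth's multiplication method stands in for the SAS Poisson generator.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (adequate for small rates)."""
    limit = math.exp(-rate)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def ols_line(x, y):
    """Ordinary least squares intercept and slope for a single covariate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def generate_areas(x1, p_zero, rng):
    """Generate one replicate of Poisson-gamma counts for the areas in x1."""
    lam = -math.log(p_zero)                          # Poisson mean for which P(Y = 0) = p_zero
    theta = [rng.gammavariate(1.0, 1.0) for _ in x1]  # gamma(1, 1) area effects
    star = [math.log(lam / t) for t in theta]         # assumed form of starlambda
    b0, b1 = ols_line(x1, star)                       # regression of starlambda on x1
    rows = []
    for x, t in zip(x1, theta):
        mu = math.exp(b0 + b1 * x)
        rate = mu * t                                 # parmlambda = mu * tetha
        rows.append({"x1": x, "theta": t, "mu": mu,
                     "lambda": rate, "y": poisson_draw(rate, rng)})
    return rows
```

A call such as `generate_areas([0.1 * i for i in range(30)], 0.1, random.Random(2009))` returns 30 area records; the share of records with `y == 0` plays the role of `pzero` in the SAS program.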

Appendix 6 Syntax program EB with NBR

%macro sae_nb;
%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  /* this genmod procedure estimates the response without zero-inflation */
  proc genmod data=a;
    model ypoi = x1 / dist=nb link=log;
    ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_nb out=work.betha_nb_t;
  run;

  data _null_;
    set work.betha_nb_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Dispersion', col3);
  run;

  /* EB with negbin-reg */
  data work.duga_nb;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
    bias_b=abs(teta_hat_bayes-parmlambda);
  run;

  proc append data=work.duga_nb base=work.duga_nb1;
  run;

  /* jackknife */
  %do h=1 %to 30;

    data work.d;
      set work.duga_nb1;
      if ^(u=&x) then delete;
    run;

    data work.jacknb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc genmod data=work.jacknb&h;
      /* output p out=sas.yi_est; */
      model ypoi = x1 / dist=nb link=log;
      ods output ParameterEstimates=work.betha_est_nb&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
    run;

    data _null_;
      set work.betha_est_nbt&h;
      call symput('Intercept_', col1);
      call symput('x1_', col2);
      call symput('Dispersion_', col3);
    run;

    data work.duganb&h;
      set work.d;
      mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
      w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
      teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
      delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
      g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
      beda_g_&h=g1_&h-g1;
    run;

    data work.mse_nb_j;
      merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
            work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
            work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
            work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
            work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
            work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
      by kk;
    run;

    data work.mse_nb_j;
      set work.mse_nb_j;
      t_sum=0;
      g_sum=0;
      %do j=1 %to 30;
        g_sum=g_sum+beda_g_&j;
        t_sum=t_sum+delta_&j;
      %end;
      m2i=((30-1)/30)*t_sum;
      m1i=g1-((30-1)/30)*g_sum;
      mse_j_b=m1i+m2i;
      rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
      ul=&x;
    run;

    proc append data=work.mse_nb_j base=work.mse_nb_j1;
    run;

    data work.hasilnb;
      merge work.d work.mse_nb_j;
      keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b
           w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

  %end;

  proc append data=work.hasilnb base=work.hasilnb1 appendver=v6;
  run;

%end;
%mend;

%sae_nb;
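The EB step in the Appendix 6 data step `work.duga_nb` is a shrinkage between the observed count and the regression mean. A minimal Python sketch of that single step follows; the function name and the example numbers are hypothetical, and the division in `g1` is an assumption taken from the usual Poisson-gamma posterior variance, since the operator is not legible in the source.

```python
def eb_estimate(y, mu_hat, dispersion):
    """Return (weight, EB estimate, g1) for one area, mirroring work.duga_nb."""
    w = mu_hat / (mu_hat + dispersion)                   # w_bayes
    theta_hat = w * y + (1.0 - w) * mu_hat               # teta_hat_bayes
    g1 = (dispersion + y) / (mu_hat + dispersion) ** 2   # assumed posterior-variance term
    return w, theta_hat, g1


# Hypothetical area: observed count 4, fitted mean 2, dispersion estimate 2
w, theta_hat, g1 = eb_estimate(4, 2.0, 2.0)
```

With these hypothetical inputs the weight is one half, so the estimate sits midway between the count and the fitted mean. A negative fitted dispersion would push the weight outside the unit interval, which is one way an EB estimate could become negative.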

Appendix 7 Syntax program EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  proc countreg data=a;
    model ypoi=x1 / dist=zinb method=qn;
    zeromodel ypoi ~ x1;
    ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
  run;

  proc transpose data=work.pe out=work.pe_t;
  run;

  data _null_;
    set work.pe_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Inf_Intercept', col3);
    call symput('Inf_x1', col4);
    call symput('_Alpha', col5);
  run;

  data work.duga;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
    bias_b=abs(teta_hat_bayes-parmlambda);
  run;

  proc append data=work.duga base=work.duga1;
  run;

  %do h=1 %to 30;

    data work.d;
      set work.duga1;
      if ^(u=&x) then delete;
    run;

    data work.jackzinb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc countreg data=jackzinb&h;
      model ypoi=x1 / dist=zinb method=qn;
      zeromodel ypoi ~ x1;
      ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
    run;

    data _null_;
      set work.betha_est_ZINBt&h;
      call symput('Intercept', col1);
      call symput('x1', col2);
      call symput('Inf_Intercept', col3);
      call symput('Inf_x1', col4);
      call symput('_Alpha', col5);
    run;

    data work.dugaZINB&h;
      set work.d;
      mu_hat_b_&h=exp(&Intercept + &x1*x1);
      /* mu_hat_b_&h= &b_o- + &b_1- x1 */
      w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
      teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
      delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
      /* g1_&h =((mu_hat_b_&h 2 &alpha_)2)(&alpha_+y_i)((mu_hat_b_&h 2 &alpha_)+mu_hat_b_&h)2 */
      g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
      /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
      beda_g_&h=g1_&h-g1;
    run;

    data work.mse_ZINB_j;
      merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
            work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
            work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
            work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
            work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
            work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
      by kk;
    run;

    data work.mse_ZINB_j;
      set work.mse_ZINB_j;
      t_sum=0;
      g_sum=0;
      %do j=1 %to 30;
        g_sum=g_sum+beda_g_&j;
        t_sum=t_sum+delta_&j;
      %end;
      m2i=((30-1)/30)*t_sum;
      m1i=g1-((30-1)/30)*g_sum;
      mse_j_b=m1i+m2i;
      rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
    run;

    data work.hasilZINB;
      merge work.d work.mse_ZINB_j;
      keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b
           w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

  %end;

  proc append data=work.hasilZINB base=work.hasilZINB1;
  run;

%end;
%mend;

%sae_zinb;
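The jackknife MSE computed at the end of Appendices 6 and 7 combines two terms, `m1i` and `m2i`. The Python sketch below restates that arithmetic for a single area. It is an illustration only: the function and argument names are hypothetical, and the division by the EB estimate in RRMSE is assumed from the variable layout of the SAS code.

```python
import math


def jackknife_mse(theta_hat, g1, theta_loo, g1_loo):
    """Jackknife MSE and RRMSE of one EB estimate.

    theta_loo and g1_loo hold the estimate and the g1 term recomputed
    with each area left out in turn (teta_hat_h and g1_h in the SAS code).
    """
    n = len(theta_loo)
    factor = (n - 1) / n
    m2 = factor * sum((t - theta_hat) ** 2 for t in theta_loo)  # m2i
    m1 = g1 - factor * sum(g - g1 for g in g1_loo)              # m1i, bias-corrected g1
    mse = m1 + m2
    # The square root is undefined when the bias correction drives the MSE below zero
    rrmse = math.sqrt(mse) / theta_hat if mse >= 0 else float("nan")
    return mse, rrmse
```

Because `m1` subtracts a correction from `g1`, the sum `m1 + m2` is not guaranteed to be positive. Leave-one-out `g1` values that are much larger than the full-sample `g1` produce exactly the kind of negative MSE reported in the appendix tables.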


Table 1 shows that the iterative process produced unexpected negative values of MSE. The simplest way to solve this problem is to change the negative values to zero. MSE (II) and RRMSE (II) in Table 2 are the MSE and RRMSE after the negative MSE values have been changed to zero.

When the data have an expected probability of zero of 0.6 to 0.9, the mean of MSE (II) increases drastically. Similarly, the mean of RRMSE (II) increases sharply when the data have an expected probability of zero of 0.8 to 0.9. However, when the data have an expected probability of zero of 0.6 to 0.7, the mean of RRMSE (II) is negative because some EB estimates are negative.
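The zero-truncation behind MSE (II) and RRMSE (II), and the way a negative EB estimate produces a negative RRMSE, can be sketched in Python. This is a minimal illustration with hypothetical names and numbers, assuming RRMSE is the square root of the MSE divided by the EB estimate, as the Appendix 6 syntax suggests.

```python
import math


def truncated_mse_and_rrmse(mse_values, eb_estimates):
    """Replace negative MSE values by zero, then recompute RRMSE per area."""
    mse_ii = [max(m, 0.0) for m in mse_values]
    # A negative EB estimate in the denominator makes RRMSE (II) negative;
    # an estimate of exactly zero is not handled in this sketch.
    rrmse_ii = [math.sqrt(m) / est for m, est in zip(mse_ii, eb_estimates)]
    return mse_ii, rrmse_ii


# Hypothetical areas: the second has a negative MSE, the third a negative EB estimate
mse_ii, rrmse_ii = truncated_mse_and_rrmse([0.25, -1.0, 0.04], [2.0, 1.0, -0.5])
```

In this hypothetical example the second area ends up with both measures equal to zero, while the third keeps a positive MSE (II) but gets a negative RRMSE (II).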

Estimation of Prior Parameter is Based on EB Method with Zero-Inflated Negative Binomial Regression

The EB estimates are similar to the estimates produced by the NBR method, although they are slightly outperformed by the NBR method when the data contain only a small number of zeros. In particular, as shown in Table 3, if the data have an expected probability of zero of 0.1 to 0.5, ZINB produces a bigger MSE for the EB estimator than NBR does.

In contrast, if the data have an expected probability of zero of 0.6 to 0.7, ZINB gives better estimates. The estimates are also unbiased, as they cover the parameter values adequately. However, ZINB begins to produce inconsistent estimates if the data have an expected probability of zero of 0.8 or more, due to an enormous MSE.

Besides when data have expected is because ZINB generates small estimates which is close to the parameter values

The mean of MSE (II) with ZINB is bigger than the mean of MSE with ZINB. This is because, once the negative MSE values are changed to zero, they no longer act as a reduction factor in the calculation of the mean.
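A small worked example makes the effect on the mean concrete. The three MSE values below are hypothetical and chosen only for the arithmetic; they are not taken from the simulation.

```latex
\[
\overline{\mathrm{MSE}} = \frac{-1.2 + 0.3 + 0.6}{3} = -0.1,
\qquad
\overline{\mathrm{MSE}}^{(\mathrm{II})} = \frac{0 + 0.3 + 0.6}{3} = 0.3 .
\]
```

Replacing the single negative value by zero removes its downward pull, so the mean after truncation can never be smaller than the mean before it.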

Comparison of EB estimator with Negative Binomial Regression and EB estimator with ZINB

EB estimates given by the NBR and ZINB methods are similar for data with a small number of zeros. However, the ZINB method produces a bigger MSE than NBR does as long as the expected probability of zero in the data does not exceed the 0.6 threshold.

The ZINB method performs better if the data have an expected probability of zero of 0.6 to 0.7. In this case, EB estimates given by the NBR method are unstable and inconsistent because of negative estimates and a huge MSE that can be thousands of times larger than the acceptable value. On the other hand, the EB estimator with ZINB works well: it gives unbiased estimates, and its MSE values are more stable than those of the EB estimates with NBR.

Both methods perform poorly if the data have an expected probability of zero of 0.8 or more. The EB estimators from both methods are inconsistent as a result of the very large MSE values they produce.

Table 3 MSE and RRMSE of EB Estimator with ZINB

Probability of zero  Mean of MSE  Median of MSE  Mean of RRMSE  Median of RRMSE
01                   045          017            024            014
02                   043          020            033            021
03                   071          028            052            032
04                   054          028            0632           042
05                   086          033            7322807        066
06                   061          038            29817          103
07                   058          025            218119         194
08                   -128         -14E-07        162697         375
09                   2954790      -1E-06         35E+278        609508

Table 4 MSE (II) and RRMSE (II) of EB Estimator with ZINB

Probability of zero  Mean of MSE  Median of MSE  Mean of RRMSE  Median of RRMSE
01                   045          017            024            014
02                   0436         020            0324           021
03                   072          028            051            0311
04                   055          028            061            041
05                   095          033            6561235        058
06                   075          038            23406          070
07                   150          025            134655         069
08                   175          0              733506         0
09                   2954908      0              12E+278        0

CONCLUSION

Excess zeros in the data highly influence the results of EB estimation. A conventional method such as negative binomial regression in prior estimation produced an unbiased but unreliable EB estimator for data with an expected probability of zero of 0.6. This is shown by the large MSE and the negative values of the estimator.

Meanwhile, EB estimation with the ZINB method produced a more reliable estimator even when the data have an expected probability of zero of 0.6 to 0.7.

ZINB also provided a reliable estimator for data with less than 53.33% zeros. This means that the performance of ZINB declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.


data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph

17

Appendix 7 Syntax program EB with ZINB (continued)

(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data workmse_ZINB_jmerge workdugaZINB1 workdugaZINB2 workdugaZINB3 workdugaZINB4 workdugaZINB5 workdugaZINB6 workdugaZINB7 workdugaZINB8 workdugaZINB9 workdugaZINB10 workdugaZINB11 workdugaZINB12workdugaZINB13 workdugaZINB14 workdugaZINB15 workdugaZINB16 workdugaZINB17workdugaZINB18 workdugaZINB19 workdugaZINB20 workdugaZINB21 workdugaZINB22workdugaZINB23 workdugaZINB24 workdugaZINB25 workdugaZINB26 workdugaZINB27workdugaZINB28 workdugaZINB29 workdugaZINB30by kkrun

data workmse_ZINB_jset workmse_ZINB_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampj

18

Appendix 7 Syntax program EB with ZINB (continued)

endm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesrun

data workhasilZINBmerge workd workmse_ZINB_j keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_brunods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilZINB BASE=workhasilZINB1run

ENDmend

sae_zinb


declines when the data have an expected probability of zero of 0.8 or more, as shown by the large MSE and the inconsistent estimator.

RECOMMENDATION

This research rests on many assumptions and has several limitations. If these assumptions and restrictions can be relaxed, better results can be expected. Recommendations for further research are:

1. The data-generating process in this research does not reflect a real sampling process. A generating process that resembles real sampling might give better results, because it would be closer to real applications.

2. It would be worthwhile to run experiments with a larger number of areas, since the number of areas influences the data modeling.

3. Restricted Maximum Likelihood may be applied when estimating the prior parameters with ZINB and NBR, in order to resolve the negative MSE values.

4. Theoretical research on ZINB and the Empirical Bayes estimator is important for understanding the behavior of ZINB parameter estimates in the Empirical Bayes setting.
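The negative MSE values mentioned in recommendation 3 can be traced to the form of the jackknife estimator coded in Appendix 6. Written out for m = 30 areas, with the subscript -l marking a quantity recomputed after deleting area l (this notation is assumed here for illustration and is not taken from the thesis text):

```latex
\begin{aligned}
\hat{M}_{1i} &= g_{1i} - \frac{m-1}{m}\sum_{l=1}^{m}\left(g_{1i,-l} - g_{1i}\right),\\
\hat{M}_{2i} &= \frac{m-1}{m}\sum_{l=1}^{m}\left(\hat{\theta}_{i,-l} - \hat{\theta}_{i}\right)^{2},\\
\widehat{\mathrm{MSE}}_{i} &= \hat{M}_{1i} + \hat{M}_{2i}.
\end{aligned}
```

Only the second term is non-negative by construction. The first term is a bias-corrected difference, so it can fall below zero when the leave-one-out values of g1 are much larger than the full-sample value, and the sum can then be negative as well.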



Appendix 1. Results of EB estimation with NBR

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                        bias          0000446   05164     0878579   1315093   1721091   8704671
                        MSE           0040547   0109118   0159448   0333613   0335256   4167064
                        RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      1333-3667  100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                        bias          0000587   0413611   079407    1100373   1454889   7906915
                        MSE           0055631   0131969   0196963   0353033   0386291   3778251
                        RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-5333    100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                        bias          0000151   0372382   067041    0916482   122012    5950225
                        MSE           0074364   0163462   0231014   0400207   0432371   5250254
                        RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      2333-5667  100  EB estimator  024882    064963    1219656   17107     2248716   930007
                        bias          0000564   0293602   0549809   0757937   1007851   486688
                        MSE           -100569   0194196   0271669   041875    045917    3239598
                        RRMSE         0123605   0300339   0422426   0503566   0642418   2202294
50      2333-6333  100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                        bias          000029    0250747   0453265   0622838   0803185   4009352
                        MSE           -237643   0235733   0306641   0452955   05091     3652167
                        RRMSE         0038956   0412708   0588924   0717336   0844735   3240156
60      30-70      100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                        bias          0000452   020433    0398131   0534095   0679938   3848209
                        MSE           -749011   0254097   0330078   -12875    0539873   2354887
                        RRMSE         -663045   051763    0813734   -038057   1287528   1767434
70      4333-7333  100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                        bias          0000375   0155154   0316124   0476883   0588926   8475103
                        MSE           -7513075  0235378   0402092   2536714   0876569   6051162
                        RRMSE         -10741    0704796   1355566   -121606   3040291   3332419
80      5333-90    100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                        bias          0000395   0116669   0254473   1091172   0497898   6297454
                        MSE           -6E+09    -016583   0301527   -584495   5718409   185E+09
                        RRMSE         -212936   0927338   2115163   3094627   1359703   4151289
90      70-100     100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                        bias          000016    0086      0177169   0425532   0314714   1092655
                        MSE           -38E+09   -130817   0159682   39135606  3074073   12E+11
                        RRMSE         -909131   1647188   6639631   116E+10   1585472   706E+11

Appendix 2. Results of EB estimation with ZINB

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                        bias          0000603   0485515   0882468   1315721   1750173   8704672
                        MSE           -053954   010933    0168797   0449506   0369775   360843
                        RRMSE         0022947   0096443   0136424   0238099   0241955   5518468
20      1333-3667  100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                        bias          0000368   0383426   0780984   1105029   1496623   7906918
                        MSE           -07309    0126202   0201463   0425844   0414597   1734815
                        RRMSE         0021807   0144983   0210692   0326097   0401786   3177943
30      20-5333    100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                        bias          0000508   0332427   0680187   0928947   1254604   6314973
                        MSE           -229891   0156942   0277017   0707983   0590466   7469014
                        RRMSE         0023998   0210095   0317195   0519524   0618802   3500387
40      2333-5667  100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                        bias          0000564   0268248   0544049   0771741   1067061   4889872
                        MSE           -125713   0181557   0284338   0540615   0498521   423089
                        RRMSE         0054916   028362    0420396   0630776   0778033   5394515
50      2333-6333  100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                        bias          0000453   0224726   0454522   0661709   0900005   4414442
                        MSE           -181856   0194818   0334706   0859252   0711939   7997074
                        RRMSE         0026206   0387294   0662251   7322807   1312302   13388294
60      30-70      100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                        bias          62E-05    0190886   0406245   0567657   074167    3923952
                        MSE           -34589    0078006   0376514   060793    0804116   3426488
                        RRMSE         000461    0502807   1033578   2981671   2012552   3308816
70      4333-7333  100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                        bias          979E-05   0128017   0358257   0487174   0654423   3733981
                        MSE           -142213   -001433   0255331   0584152   1132152   264456
                        RRMSE         0064209   0847956   1942286   2181192   4589042   7899681
80      5333-90    100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                        bias          0000161   0083397   0272773   0392826   0557213   3532556
                        MSE           -10651    -56E-05   -14E-07   -127819   1452962   1132741
                        RRMSE         0063244   1475413   3754705   162697    9221163   3786684
90      70-100     100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                        bias          0000495   0054221   0153374   027819    0350213   2736904
                        MSE           -175652   -33E-05   -1E-06    2954790   152E-06   613E+08
                        RRMSE         0040681   4059441   6095076   35E+278   5569021   16E+281

Appendix 3. Results of EB estimation (II) with NBR

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0426011   1525665   3188832   4252666   5752756   205939
                        bias          0000446   05164     0878579   1315093   1721091   8704671
                        MSE           0040547   0109118   0159448   0333613   0335256   4167064
                        RRMSE         0041258   0100045   01356     018188    0220426   0576793
20      1333-3667  100  EB estimator  0342831   1013993   2218265   2984668   3953417   1815693
                        bias          0000587   0413611   079407    1100373   1454889   7906915
                        MSE           0055631   0131969   0196963   0353033   0386291   3778251
                        RRMSE         0070449   015421    0205182   0262006   0352726   0788718
30      20-5333    100  EB estimator  0323311   0836545   1562163   2263684   2918741   1214482
                        bias          0000151   0372382   067041    0916482   122012    5950225
                        MSE           0074364   0163462   0231014   0400207   0432371   5250254
                        RRMSE         0102324   0214697   0299247   0361013   0474077   1192032
40      2333-5667  100  EB estimator  024882    064963    1219656   17107     2248716   930007
                        bias          0000564   0293602   0549809   0757937   1007851   486688
                        MSE           0         0194196   0271669   0419181   045917    3239598
                        RRMSE         0         0300116   0422209   0502895   0641904   2202294
50      2333-6333  100  EB estimator  0122548   0570083   1028619   1291758   1728067   6750472
                        bias          000029    0250747   0453265   0622838   0803185   4009352
                        MSE           0         0235733   0306641   0456258   05091     3652167
                        RRMSE         0         0410357   0585765   0712314   0841838   3240156
60      30-70      100  EB estimator  -077338   044443    0699758   0944038   1131071   6323352
                        bias          0000452   020433    0398131   0534095   0679938   3848209
                        MSE           0         0254097   0330078   2619677   0539873   2354887
                        RRMSE         -663045   0448118   0750369   -034911   1209918   1767434
70      4333-7333  100  EB estimator  -33274    0249515   0442513   0659375   0922519   9258959
                        bias          0000375   0155154   0316124   0476883   0588926   8475103
                        MSE           0         0235378   0402092   9500073   0876569   6051162
                        RRMSE         -10741    0288999   0995659   -100163   2527784   3332419
80      5333-90    100  EB estimator  -232889   017621    0305365   0569959   0576346   6303601
                        bias          0000395   0116669   0254473   1091172   0497898   6297454
                        MSE           0         0         0301527   1444250   5718409   185E+09
                        RRMSE         -212936   0         1104113   2205437   5656681   4151289
90      70-100     100  EB estimator  -108767   0111208   0230315   0212247   0353129   3625557
                        bias          000016    0086      0177169   0425532   0314714   1092655
                        MSE           0         0         0159682   41595285  3074073   12E+11
                        RRMSE         -909131   0         0557622   677E+09   9311925   706E+11

Appendix 4. Results of EB estimation (II) with ZINB

P(Y=0)  Actual     r    Statistic     min       Q1        median    mean      Q3        max
10      10-3333    100  EB estimator  0267752   1500256   3195861   4280907   5833922   2220705
                        bias          0000603   0485515   0882468   1315721   1750173   8704672
                        MSE           0         010933    0168797   0450626   0369775   360843
                        RRMSE         0         0095932   0135647   023675    0239669   5518468
20      1333-3667  100  EB estimator  105E-08   0914898   221594    3017228   401361    1815694
                        bias          0000368   0383426   0780984   1105029   1496623   7906918
                        MSE           0         0126202   0201463   0428006   0414597   1734815
                        RRMSE         0         0142648   020709    0320663   0395479   3177943
30      20-5333    100  EB estimator  0132041   0719086   1523909   2308745   3012309   1228058
                        bias          0000508   0332427   0680187   0928947   1254604   6314973
                        MSE           0         0156942   0277017   0716543   0590466   7469014
                        RRMSE         0         0203913   0311937   0506882   0615401   3500387
40      2333-5667  100  EB estimator  105E-08   0574265   1209034   1742928   2368713   104953
                        bias          0000564   0268248   0544049   0771741   1067061   4889872
                        MSE           0         0181557   0284338   0549835   0498521   423089
                        RRMSE         0         0270309   0405926   0606317   0766631   5394515
50      2333-6333  100  EB estimator  105E-08   0426701   1033816   133848    1906961   8018962
                        bias          0000453   0224726   0454522   0661709   0900005   4414442
                        MSE           0         0194818   0334706   094973    0711939   7997074
                        RRMSE         0         0316402   0576343   6561235   1240175   13388294
60      30-70      100  EB estimator  105E-08   030085    0645848   0985327   1154975   728326
                        bias          62E-05    0190886   0406245   0567657   074167    3923952
                        MSE           0         0078006   0376514   0749436   0804116   3426488
                        RRMSE         0         0258286   0698814   2340612   1714808   3308816
70      4333-7333  100  EB estimator  105E-08   105E-08   0341315   0677841   1         5005491
                        bias          979E-05   0128017   0358257   0487174   0654423   3733981
                        MSE           0         0         0255331   1501268   1132152   264456
                        RRMSE         0         0         0688797   1346552   2500825   7899681
80      5333-90    100  EB estimator  105E-08   105E-08   0142906   0445315   0859305   5
                        bias          0000161   0083397   0272773   0392826   0557213   3532556
                        MSE           0         0         0         1755486   1452962   1132741
                        RRMSE         0         0         0         7335062   3311711   3786684
90      70-100     100  EB estimator  1E-277    105E-08   105E-08   0225165   0135512   3
                        bias          0000495   0054221   0153374   027819    0350213   2736904
                        MSE           0         0         0         2954908   152E-06   613E+08
                        RRMSE         0         0         0         12E+278   416189    16E+281

Appendix 5. Syntax program for generating the data

    data b; /* generate x1 (covariate) and ei */
    input x1;
    cards;
    0222831971100013131702314625252218171412202210
    ;
    run;

    %macro bangkit_data;
    %do r=1 %to 100;

    data e; /* generate Poisson-gamma data with excess zeros */
    do kk=1 to 30;
    set b;
    tetha = rangam(1,1);
    lambda = -log(0.1); /* desired probability of a zero (0.1-0.9) */
    starlambda = log(lambda/tetha);
    output;
    end;
    run;

    proc reg;
    model starlambda = x1;
    ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_lr out=work.betha_lr_t;
    run;

    data _null_;
    set work.betha_lr_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    run;

    data d;
    do kk=1 to 30;
    set e;
    mu = exp(&Intercept + &x1*x1);
    parmlambda = mu*tetha;
    ypoi = rand('poisson', parmlambda);
    output;
    end;
    run;

    ods trace on; /* to take the percentage of zeros in the data */
    proc freq data=d;
    tables ypoi;
    ods output OneWayFreqs=work.zero;
    run;

    data zero;
    set zero;
    keep percent;
    run;

    proc transpose data=zero out=zero1;
    run;

    data _null_;
    set work.zero1;
    call symput('pctz', col1);
    run;

    data d;
    set d;
    pzero=&pctz;
    r=&r;
    run;

    proc append data=d base=d1;
    run;

    %end;
    %mend;

    %bangkit_data;
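The generating steps in Appendix 5 can be sketched outside SAS as follows. This is a minimal Python illustration using only the standard library, not the program used in the thesis. The function names (`poisson_draw`, `fit_line`, `generate_replicate`), the seed, and the covariate values are hypothetical; the gamma shape of 1 and the target zero probability follow the appendix. The key step is that a Poisson mean of -log(p) gives P(Y = 0) = p.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson variate by Knuth's multiplication method (adequate for small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def generate_replicate(x1, p_zero, rng):
    """One replicate: gamma area effects, regression-based mu, Poisson counts."""
    lam = -math.log(p_zero)                            # exp(-lam) equals p_zero
    theta = [rng.gammavariate(1.0, 1.0) for _ in x1]   # gamma with shape 1, scale 1
    star = [math.log(lam / t) for t in theta]          # response of the auxiliary regression
    intercept, slope = fit_line(x1, star)
    mu = [math.exp(intercept + slope * x) for x in x1]
    rate = [m * t for m, t in zip(mu, theta)]          # area-level Poisson mean
    y = [poisson_draw(r, rng) for r in rate]
    pct_zero = 100.0 * sum(v == 0 for v in y) / len(y)
    return {"lam": lam, "theta": theta, "mu": mu, "rate": rate, "y": y, "pct_zero": pct_zero}


rng = random.Random(2009)                              # hypothetical seed
covariate = [0.1 * i for i in range(30)]               # hypothetical covariate values
replicate = generate_replicate(covariate, 0.1, rng)
print(replicate["pct_zero"])
```

The realised share of zeros in a replicate generally differs from the target, which is why the appendix records it with PROC FREQ.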


Appendix 6. Syntax program for EB with NBR

    %macro sae_nb;
    %do x=1 %to 900;

    data work.a;
    set work.e;
    if ^(u=&x) then delete;
    run;

    /* this genmod procedure estimates the response without zero-inflation */
    proc genmod data=a;
    model ypoi = x1 / dist=nb link=log;
    ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_nb out=work.betha_nb_t;
    run;

    data _null_;
    set work.betha_nb_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Dispersion', col3);
    run;

    /* EB with negbin-reg */
    data work.duga_nb;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
    bias_b=abs(teta_hat_bayes-parmlambda);
    run;

    proc append data=work.duga_nb base=work.duga_nb1;
    run;

    /* jackknife */
    %do h=1 %to 30;

    data work.d;
    set work.duga_nb1;
    if ^(u=&x) then delete;
    run;

    data work.jacknb&h;
    set work.d;
    if u=&x;
    if kk=&h then delete;
    run;

    proc genmod data=work.jacknb&h;
    /* output p out=sasyi_est */
    model ypoi = x1 / dist=nb link=log;
    ods output parameterestimates=work.betha_est_nb&h (keep=parameter Estimate);
    run;

    proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
    run;

    data _null_;
    set work.betha_est_nbt&h;
    call symput('Intercept_', col1);
    call symput('x1_', col2);
    call symput('Dispersion_', col3);
    run;

    data work.duganb&h;
    set work.d;
    mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
    w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
    teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
    delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
    g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
    beda_g_&h=g1_&h-g1;
    run;

    data work.mse_nb_j;
    merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5 work.duganb6
          work.duganb7 work.duganb8 work.duganb9 work.duganb10 work.duganb11 work.duganb12
          work.duganb13 work.duganb14 work.duganb15 work.duganb16 work.duganb17 work.duganb18
          work.duganb19 work.duganb20 work.duganb21 work.duganb22 work.duganb23 work.duganb24
          work.duganb25 work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
    by kk;
    run;

    data work.mse_nb_j;
    set work.mse_nb_j;
    t_sum=0;
    g_sum=0;
    %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
    %end;
    m2i=((30-1)/30)*t_sum;
    m1i=g1-((30-1)/30)*g_sum;
    mse_j_b=m1i+m2i;
    rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
    ul = &x;
    run;

    proc append data=work.mse_nb_j base=work.mse_nb_j1;
    run;

    data work.hasilnb;
    merge work.d work.mse_nb_j;
    keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

    %end;

    proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
    run;

    %END;
    %mend;

    %sae_nb;
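The EB and jackknife data steps in Appendix 6 can be sketched compactly in Python. This is an illustration under stated assumptions rather than the thesis program: `eb_terms` and `jackknife_mse` are hypothetical names, and `toy_fit` is a deliberately crude stand-in for the negative binomial regression fit (a least-squares line on log(y + 0.5) with the dispersion fixed at 1), used only so that the example runs without a statistics library.

```python
import math


def eb_terms(y, x, params):
    """EB estimate and g1 term for one area, following the data step formulas."""
    b0, b1, alpha = params
    mu = math.exp(b0 + b1 * x)
    w = mu / (mu + alpha)                      # shrinkage weight
    theta = w * y + (1.0 - w) * mu             # EB estimate
    g1 = (alpha + y) / (mu + alpha) ** 2
    return theta, g1


def toy_fit(ys, xs):
    """Hypothetical stand-in for the regression fit; NOT a negative binomial fit."""
    logs = [math.log(y + 0.5) for y in ys]
    n = len(xs)
    mean_x, mean_l = sum(xs) / n, sum(logs) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_l) for a, b in zip(xs, logs))
    slope = sxy / sxx
    return mean_l - slope * mean_x, slope, 1.0


def jackknife_mse(ys, xs, fit):
    """Jackknife MSE per area: m1 (bias-corrected g1) plus m2 (estimator variability)."""
    m = len(ys)
    full = [eb_terms(y, x, fit(ys, xs)) for y, x in zip(ys, xs)]
    t_sum = [0.0] * m
    g_sum = [0.0] * m
    for h in range(m):
        # refit without area h, then re-evaluate every area with those parameters
        params_h = fit(ys[:h] + ys[h + 1:], xs[:h] + xs[h + 1:])
        for i in range(m):
            theta_h, g1_h = eb_terms(ys[i], xs[i], params_h)
            t_sum[i] += (theta_h - full[i][0]) ** 2
            g_sum[i] += g1_h - full[i][1]
    factor = (m - 1) / m
    results = []
    for i in range(m):
        theta, g1 = full[i]
        m1 = g1 - factor * g_sum[i]
        m2 = factor * t_sum[i]
        mse = m1 + m2
        # RRMSE is undefined when the MSE estimate is negative
        rrmse = math.sqrt(mse) / theta if mse >= 0 and theta != 0 else float("nan")
        results.append({"theta": theta, "m1": m1, "m2": m2, "mse": mse, "rrmse": rrmse})
    return results


counts = [0, 2, 1, 0, 3, 5, 0, 1]                        # hypothetical counts
covariate = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]     # hypothetical covariate
for row in jackknife_mse(counts, covariate, toy_fit):
    print(row)
```

Replacing `toy_fit` with a real negative binomial or ZINB fit leaves the rest of the calculation unchanged, which mirrors how Appendices 6 and 7 differ only in the fitting procedure.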


Appendix 7. Syntax program for EB with ZINB

    %macro sae_zinb;

    %do x=1 %to 900;

    data work.a;
    set work.e;
    if ^(u=&x) then delete;
    run;

    proc countreg data=a;
    model ypoi=x1 / dist=zinb method=qn;
    zeromodel ypoi ~ x1;
    ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
    run;

    proc transpose data=work.pe out=work.pe_t;
    run;

    data _null_;
    set work.pe_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Inf_Intercept', col3);
    call symput('Inf_x1', col4);
    call symput('_Alpha', col5);
    run;

    data work.duga;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
    bias_b=abs(teta_hat_bayes-parmlambda);
    run;

    proc append data=work.duga base=work.duga1;
    run;

    %do h=1 %to 30;

    data work.d;
    set work.duga1;
    if ^(u=&x) then delete;
    run;

    data work.jackzinb&h;
    set work.d;
    if u=&x;
    if kk=&h then delete;
    run;

    proc countreg data=jackzinb&h;
    model ypoi=x1 / dist=zinb method=qn;
    zeromodel ypoi ~ x1;
    ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
    run;

    data _null_;
    set work.betha_est_ZINBt&h;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Inf_Intercept', col3);
    call symput('Inf_x1', col4);
    call symput('_Alpha', col5);
    run;

    data work.dugaZINB&h;
    set work.d;
    mu_hat_b_&h=exp(&Intercept + &x1*x1);
    /* mu_hat_b_&h= &b_o- + &b_1- x1 */
    w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
    teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
    delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
    /* g1_&h =((mu_hat_b_&h 2 &alpha_)2)(&alpha_+y_i)((mu_hat_b_&h 2 &alpha_)+mu_hat_b_&h)2 */
    g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
    /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
    beda_g_&h=g1_&h-g1;
    run;

    data work.mse_ZINB_j;
    merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5 work.dugaZINB6
          work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10 work.dugaZINB11 work.dugaZINB12
          work.dugaZINB13 work.dugaZINB14 work.dugaZINB15 work.dugaZINB16 work.dugaZINB17 work.dugaZINB18
          work.dugaZINB19 work.dugaZINB20 work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24
          work.dugaZINB25 work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
    by kk;
    run;

    data work.mse_ZINB_j;
    set work.mse_ZINB_j;
    t_sum=0;
    g_sum=0;
    %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
    %end;
    m2i=((30-1)/30)*t_sum;
    m1i=g1-((30-1)/30)*g_sum;
    mse_j_b=m1i+m2i;
    rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
    run;

    data work.hasilZINB;
    merge work.d work.mse_ZINB_j;
    keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

    %end;

    proc append data=work.hasilZINB BASE=work.hasilZINB1;
    run;

    %END;
    %mend;

    %sae_zinb;
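To make the role of the five ZINB parameters read back in Appendix 7 more concrete, the sketch below evaluates a zero-inflated negative binomial probability function in Python. It assumes the common NB2 parameterization (variance mu + alpha * mu^2) and a logistic link for the zero-inflation part; whether these match the settings used by the COUNTREG call in the thesis is an assumption. The function names and numeric values are hypothetical.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial (NB2) probability of count y with mean mu and dispersion alpha."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def inflation_prob(x, gamma0, gamma1):
    """Probability of a structural zero under a logistic zero model."""
    return 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * x)))


def zinb_pmf(y, mu, alpha, phi):
    """ZINB probability: a point mass at zero mixed with the negative binomial."""
    base = (1.0 - phi) * nb_pmf(y, mu, alpha)
    return phi + base if y == 0 else base


mu, alpha = 2.0, 0.5                       # hypothetical count-model values
phi = inflation_prob(1.0, -1.0, 0.2)       # hypothetical zero-model coefficients
print(nb_pmf(0, mu, alpha), zinb_pmf(0, mu, alpha, phi))
```

The second printed value exceeds the first whenever phi is positive, which is the excess-zero mass that a plain negative binomial regression cannot represent.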

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 14: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

7

Appendix 1 Result of EB estimation with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE -100569 0194196 0271669 041875 045917 3239598RRMSE 0123605 0300339 0422426 0503566 0642418 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE -237643 0235733 0306641 0452955 05091 3652167RRMSE 0038956 0412708 0588924 0717336 0844735 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE -749011 0254097 0330078 -12875 0539873 2354887RRMSE -663045 051763 0813734 -038057 1287528 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE -7513075 0235378 0402092 2536714 0876569 6051162RRMSE -10741 0704796 1355566 -121606 3040291 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601

bias 0000395 0116669 0254473 1091172 0497898 6297454MSE -6E+09 -016583 0301527 -584495 5718409 185E+09RRMSE -212936 0927338 2115163 3094627 1359703 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655MSE -38E+09 -130817 0159682 39135606 3074073 12E+11

RRMSE -909131 1647188 6639631 116E+10 1585472 706E+11

8

Appendix 2 Result of EB estimation with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE -053954 010933 0168797 0449506 0369775 360843RRMSE 0022947 0096443 0136424 0238099 0241955 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE -07309 0126202 0201463 0425844 0414597 1734815RRMSE 0021807 0144983 0210692 0326097 0401786 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE -229891 0156942 0277017 0707983 0590466 7469014RRMSE 0023998 0210095 0317195 0519524 0618802 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE -125713 0181557 0284338 0540615 0498521 423089RRMSE 0054916 028362 0420396 0630776 0778033 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442

MSE -181856 0194818 0334706 0859252 0711939 7997074RRMSE 0026206 0387294 0662251 7322807 1312302 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE -34589 0078006 0376514 060793 0804116 3426488RRMSE 000461 0502807 1033578 2981671 2012552 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE -142213 -001433 0255331 0584152 1132152 264456RRMSE 0064209 0847956 1942286 2181192 4589042 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE -10651 -56E-05 -14E-07 -127819 1452962 1132741RRMSE 0063244 1475413 3754705 162697 9221163 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE -175652 -33E-05 -1E-06 2954790 152E-06 613E+08

RRMSE 0040681 4059441 6095076 35E+278 5569021 16E+281

9

Appendix 3 Result of EB estimation (II) with NBR

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0426011 1525665 3188832 4252666 5752756 205939

bias 0000446 05164 0878579 1315093 1721091 8704671MSE 0040547 0109118 0159448 0333613 0335256 4167064RRMSE 0041258 0100045 01356 018188 0220426 0576793

20 1333-3667 100 EB estimator 0342831 1013993 2218265 2984668 3953417 1815693bias 0000587 0413611 079407 1100373 1454889 7906915MSE 0055631 0131969 0196963 0353033 0386291 3778251RRMSE 0070449 015421 0205182 0262006 0352726 0788718

30 20-5333 100 EB estimator 0323311 0836545 1562163 2263684 2918741 1214482bias 0000151 0372382 067041 0916482 122012 5950225MSE 0074364 0163462 0231014 0400207 0432371 5250254RRMSE 0102324 0214697 0299247 0361013 0474077 1192032

40 2333-5667 100 EB estimator 024882 064963 1219656 17107 2248716 930007bias 0000564 0293602 0549809 0757937 1007851 486688MSE 0 0194196 0271669 0419181 045917 3239598RRMSE 0 0300116 0422209 0502895 0641904 2202294

50 2333-6333 100 EB estimator 0122548 0570083 1028619 1291758 1728067 6750472bias 000029 0250747 0453265 0622838 0803185 4009352MSE 0 0235733 0306641 0456258 05091 3652167RRMSE 0 0410357 0585765 0712314 0841838 3240156

60 30-70 100 EB estimator -077338 044443 0699758 0944038 1131071 6323352bias 0000452 020433 0398131 0534095 0679938 3848209MSE 0 0254097 0330078 2619677 0539873 2354887RRMSE -663045 0448118 0750369 -034911 1209918 1767434

70 4333-7333 100 EB estimator -33274 0249515 0442513 0659375 0922519 9258959bias 0000375 0155154 0316124 0476883 0588926 8475103MSE 0 0235378 0402092 9500073 0876569 6051162RRMSE -10741 0288999 0995659 -100163 2527784 3332419

80 5333-90 100 EB estimator -232889 017621 0305365 0569959 0576346 6303601bias 0000395 0116669 0254473 1091172 0497898 6297454MSE 0 0 0301527 1444250 5718409 185E+09RRMSE -212936 0 1104113 2205437 5656681 4151289

90 70-100 100 EB estimator -108767 0111208 0230315 0212247 0353129 3625557bias 000016 0086 0177169 0425532 0314714 1092655

MSE 0 0 0159682 41595285 3074073 12E+11

RRMSE -909131 0 0557622 677E+09 9311925 706E+11

10

Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE 0 010933 0168797 0450626 0369775 360843RRMSE 0 0095932 0135647 023675 0239669 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE 0 0126202 0201463 0428006 0414597 1734815RRMSE 0 0142648 020709 0320663 0395479 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE 0 0156942 0277017 0716543 0590466 7469014RRMSE 0 0203913 0311937 0506882 0615401 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE 0 0181557 0284338 0549835 0498521 423089RRMSE 0 0270309 0405926 0606317 0766631 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442MSE 0 0194818 0334706 094973 0711939 7997074RRMSE 0 0316402 0576343 6561235 1240175 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE 0 0078006 0376514 0749436 0804116 3426488RRMSE 0 0258286 0698814 2340612 1714808 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE 0 0 0255331 1501268 1132152 264456RRMSE 0 0 0688797 1346552 2500825 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE 0 0 0 1755486 1452962 1132741RRMSE 0 0 0 7335062 3311711 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE 0 0 0 2954908 152E-06 613E+08

RRMSE 0 0 0 12E+278 416189 16E+281

11

Appendix 5 Syntax program for generate data

data b generate x1(covariate) and ei input x1cards0222831971100013131702314625252218171412202210run

macro bangkit_datado r=1 to 100

data egenerate poisson-gamma with excess zerodo kk=1 to 30set btetha = rangam(11)lambda = -log(01) peluang munculnya nilai nol yang diinginkan (01-09)starlambda = log(lambdatetha)output endrun

proc regmodel starlambda = x1 ods output ParameterEstimates=workbetha_lr (keep=Parameter Estimate)run

proc transpose data=workbetha_lr out=workbetha_lr_t

12

Appendix 5 Syntax program for generate data (continued)

rundata _null_set workbetha_lr_tcall symput (Intercept col1)call symput (x1 col2)run

data ddo kk=1 to 30set emu = exp(ampIntercept + ampx1x1)parmlambda = mutethaypoi = rand(poissonparmlambda)output endrun

ods trace onto take percent zero on dataproc freq data=dtables ypoi ods output OneWayFreqs=workzerorundata zeroset zerokeep percentrunproc transpose data=zero out=zero1 rundata _null_set workzero1call symput (pctz col1)rundata dset dpzero=amppctzr=amprrun

proc append data=d base=d1run

endmend

bangkit_data

13

Appendix 6 Syntax of the program for EB with NBR

%macro sae_nb;
%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

/* this genmod procedure estimates the response without zero-inflation */
proc genmod data=a;
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
run;

proc transpose data=work.betha_nb out=work.betha_nb_t;
run;

data _null_;
  set work.betha_nb_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Dispersion', col3);
run;

/* EB with negbin-reg */
data work.duga_nb;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
  bias_b=abs(teta_hat_bayes-parmlambda);
run;

proc append data=work.duga_nb base=work.duga_nb1;
run;

/* jackknife */
%do h=1 %to 30;

data work.d;
  set work.duga_nb1;
  if ^(u=&x) then delete;
run;

data work.jacknb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc genmod data=work.jacknb&h;
  /* output p out=sas.yi_est; */
  model ypoi = x1 / dist=nb link=log;
  ods output ParameterEstimates=work.betha_est_nb&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
run;

data _null_;
  set work.betha_est_nbt&h;
  call symput('Intercept_', col1);
  call symput('x1_', col2);
  call symput('Dispersion_', col3);
run;

data work.duganb&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
  w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
  teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
  beda_g_&h=g1_&h-g1;
run;

data work.mse_nb_j;
  merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
        work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
        work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
        work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
        work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
        work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
  by kk;
run;

data work.mse_nb_j;
  set work.mse_nb_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
  ul = &x;
run;

proc append data=work.mse_nb_j base=work.mse_nb_j1;
run;

data work.hasilnb;
  merge work.d work.mse_nb_j;
  keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilnb base=work.hasilnb1 appendver=v6;
run;

%end;
%mend;

%sae_nb;
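The quantities w_bayes, teta_hat_bayes, g1, m1i, m2i and rrmse_j_b in the Appendix 6 listing can be traced by hand in a few lines of Python. This is a minimal sketch, not the thesis code. It assumes that the arithmetic operators lost from the extracted listing are the usual ones for a Poisson-gamma shrinkage estimator with a leave-one-out jackknife, and that the fitted values (mu_hat, dispersion) are supplied by the caller instead of being refitted. All function names are hypothetical.

```python
import math


def eb_estimate(y, mu_hat, dispersion):
    """Shrinkage estimate: weight w on the observed count, 1 - w on the model mean."""
    w = mu_hat / (mu_hat + dispersion)
    return w * y + (1.0 - w) * mu_hat


def g1(y, mu_hat, dispersion):
    """Leading variance term, written as assumed for the listing."""
    return (dispersion + y) / (mu_hat + dispersion) ** 2


def jackknife_mse(y, full_fit, loo_fits):
    """Jackknife MSE for one area.

    full_fit -- (mu_hat, dispersion) from the fit on all areas
    loo_fits -- list of (mu_hat, dispersion), one per leave-one-out refit
    """
    m = len(loo_fits)
    theta_full = eb_estimate(y, *full_fit)
    g1_full = g1(y, *full_fit)
    # Sum of g1 differences (beda_g) and of squared estimate differences (delta).
    g_sum = sum(g1(y, mu, d) - g1_full for mu, d in loo_fits)
    t_sum = sum((eb_estimate(y, mu, d) - theta_full) ** 2 for mu, d in loo_fits)
    factor = (m - 1) / m
    m1 = g1_full - factor * g_sum   # bias-corrected leading term
    m2 = factor * t_sum             # variability due to parameter estimation
    return m1 + m2


def rrmse(mse, theta):
    """Relative root MSE; math.sqrt raises ValueError if mse is negative."""
    return math.sqrt(mse) / theta


if __name__ == "__main__":
    fits = [(2.0, 2.0), (3.0, 2.0)]
    mse = jackknife_mse(4, (2.0, 2.0), fits)
    print(eb_estimate(4, 2.0, 2.0), mse, rrmse(mse, eb_estimate(4, 2.0, 2.0)))
```

Because m1 subtracts a correction from g1, the jackknife MSE is not guaranteed to be positive. The sketch leaves that case to raise an error and does not truncate it silently.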


Appendix 7 Syntax of the program for EB with ZINB

%macro sae_zinb;

%do x=1 %to 900;

data work.a;
  set work.e;
  if ^(u=&x) then delete;
run;

proc countreg data=a;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
run;

proc transpose data=work.pe out=work.pe_t;
run;

data _null_;
  set work.pe_t;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.duga;
  set a;
  mu_hat_b=exp(&Intercept + &x1*x1);
  w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
  teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
  g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
  bias_b=abs(teta_hat_bayes-parmlamdha);
run;

proc append data=work.duga base=work.duga1;
run;

%do h=1 %to 30;

data work.d;
  set work.duga1;
  if ^(u=&x) then delete;
run;

data work.jackzinb&h;
  set work.d;
  if u=&x;
  if kk=&h then delete;
run;

proc countreg data=jackzinb&h;
  model ypoi=x1 / dist=zinb method=qn;
  zeromodel ypoi ~ x1;
  ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
run;

proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
run;

data _null_;
  set work.betha_est_ZINBt&h;
  call symput('Intercept', col1);
  call symput('x1', col2);
  call symput('Inf_Intercept', col3);
  call symput('Inf_x1', col4);
  call symput('_Alpha', col5);
run;

data work.dugaZINB&h;
  set work.d;
  mu_hat_b_&h=exp(&Intercept + &x1*x1);
  /* mu_hat_b_&h= &b_o- + &b_1- x1 */
  w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
  teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
  delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
  /* g1_&h =((mu_hat_b_&h 2 &alpha_)2)(&alpha_+y_i)((mu_hat_b_&h 2 &alpha_)+mu_hat_b_&h)2 */
  g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
  /* g1_&h =(A2)(&k- + y_i)( a +mu_hat_b)2 */
  beda_g_&h=g1_&h-g1;
run;

data work.mse_ZINB_j;
  merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
        work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
        work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
        work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
        work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
        work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
  by kk;
run;

data work.mse_ZINB_j;
  set work.mse_ZINB_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
  merge work.d work.mse_ZINB_j;
  keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB base=work.hasilZINB1;
run;

%end;
%mend;

%sae_zinb;
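For orientation, the zero-inflated negative binomial distribution fitted in Appendix 7 mixes a point mass at zero with an ordinary negative binomial count. The Python sketch below writes out that probability mass function. It assumes the common NB2 parameterisation (mean mu, variance mu + alpha * mu^2) and a constant inflation probability `pi_zero`, whereas the listing models the inflation part through a zeromodel on x1. The function names are hypothetical.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2."""
    if mu <= 0.0 or alpha <= 0.0:
        raise ValueError("mu and alpha must be positive")
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, pi_zero):
    """Zero-inflated NB pmf: extra mass pi_zero at zero, NB scaled by 1 - pi_zero."""
    if not 0.0 <= pi_zero < 1.0:
        raise ValueError("pi_zero must lie in [0, 1)")
    base = (1.0 - pi_zero) * nb_pmf(y, mu, alpha)
    return pi_zero + base if y == 0 else base


if __name__ == "__main__":
    # Probability of a zero with and without inflation, same mu and alpha.
    print(nb_pmf(0, 2.0, 0.5), zinb_pmf(0, 2.0, 0.5, 0.3))
```

With pi_zero = 0 the function reduces to the plain negative binomial, which corresponds to the model without zero-inflation fitted in Appendix 6.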


Appendix 2 Result of EB estimation with ZINB

P(Y=0)  Actual      r    Statistic     min        Q1         median     mean       Q3         max
10      10-3333     100  EB estimator  0267752    1500256    3195861    4280907    5833922    2220705
                         bias          0000603    0485515    0882468    1315721    1750173    8704672
                         MSE           -053954    010933     0168797    0449506    0369775    360843
                         RRMSE         0022947    0096443    0136424    0238099    0241955    5518468
20      1333-3667   100  EB estimator  105E-08    0914898    221594     3017228    401361     1815694
                         bias          0000368    0383426    0780984    1105029    1496623    7906918
                         MSE           -07309     0126202    0201463    0425844    0414597    1734815
                         RRMSE         0021807    0144983    0210692    0326097    0401786    3177943
30      20-5333     100  EB estimator  0132041    0719086    1523909    2308745    3012309    1228058
                         bias          0000508    0332427    0680187    0928947    1254604    6314973
                         MSE           -229891    0156942    0277017    0707983    0590466    7469014
                         RRMSE         0023998    0210095    0317195    0519524    0618802    3500387
40      2333-5667   100  EB estimator  105E-08    0574265    1209034    1742928    2368713    104953
                         bias          0000564    0268248    0544049    0771741    1067061    4889872
                         MSE           -125713    0181557    0284338    0540615    0498521    423089
                         RRMSE         0054916    028362     0420396    0630776    0778033    5394515
50      2333-6333   100  EB estimator  105E-08    0426701    1033816    133848     1906961    8018962
                         bias          0000453    0224726    0454522    0661709    0900005    4414442
                         MSE           -181856    0194818    0334706    0859252    0711939    7997074
                         RRMSE         0026206    0387294    0662251    7322807    1312302    13388294
60      30-70       100  EB estimator  105E-08    030085     0645848    0985327    1154975    728326
                         bias          62E-05     0190886    0406245    0567657    074167     3923952
                         MSE           -34589     0078006    0376514    060793     0804116    3426488
                         RRMSE         000461     0502807    1033578    2981671    2012552    3308816
70      4333-7333   100  EB estimator  105E-08    105E-08    0341315    0677841    1          5005491
                         bias          979E-05    0128017    0358257    0487174    0654423    3733981
                         MSE           -142213    -001433    0255331    0584152    1132152    264456
                         RRMSE         0064209    0847956    1942286    2181192    4589042    7899681
80      5333-90     100  EB estimator  105E-08    105E-08    0142906    0445315    0859305    5
                         bias          0000161    0083397    0272773    0392826    0557213    3532556
                         MSE           -10651     -56E-05    -14E-07    -127819    1452962    1132741
                         RRMSE         0063244    1475413    3754705    162697     9221163    3786684
90      70-100      100  EB estimator  1E-277     105E-08    105E-08    0225165    0135512    3
                         bias          0000495    0054221    0153374    027819     0350213    2736904
                         MSE           -175652    -33E-05    -1E-06     2954790    152E-06    613E+08
                         RRMSE         0040681    4059441    6095076    35E+278    5569021    16E+281
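Each row of the result tables reports the minimum, first quartile, median, mean, third quartile and maximum of one quantity, presumably taken over the simulated areas and replications. A minimal Python sketch of such a six-number summary is given here for readers who want to reproduce the layout. The function name `six_number_summary` is hypothetical, and the quartile rule used (the "inclusive" method of the standard library) is an assumption that may differ from the rule applied by SAS.

```python
import statistics


def six_number_summary(values):
    """Return min, Q1, median, mean, Q3 and max of a sequence of numbers."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("at least two values are needed for quartiles")
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "Q1": q1,
        "median": median,
        "mean": statistics.fmean(data),
        "Q3": q3,
        "max": data[-1],
    }


if __name__ == "__main__":
    print(six_number_summary([5, 1, 4, 2, 3]))
```

Applied separately to the EB estimates, the absolute biases, the jackknife MSEs and the RRMSEs of one scenario, this yields the four rows shown for each target P(Y=0).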


Appendix 3 Result of EB estimation (II) with NBR

P(Y=0)  Actual      r    Statistic     min        Q1         median     mean       Q3         max
10      10-3333     100  EB estimator  0426011    1525665    3188832    4252666    5752756    205939
                         bias          0000446    05164      0878579    1315093    1721091    8704671
                         MSE           0040547    0109118    0159448    0333613    0335256    4167064
                         RRMSE         0041258    0100045    01356      018188     0220426    0576793
20      1333-3667   100  EB estimator  0342831    1013993    2218265    2984668    3953417    1815693
                         bias          0000587    0413611    079407     1100373    1454889    7906915
                         MSE           0055631    0131969    0196963    0353033    0386291    3778251
                         RRMSE         0070449    015421     0205182    0262006    0352726    0788718
30      20-5333     100  EB estimator  0323311    0836545    1562163    2263684    2918741    1214482
                         bias          0000151    0372382    067041     0916482    122012     5950225
                         MSE           0074364    0163462    0231014    0400207    0432371    5250254
                         RRMSE         0102324    0214697    0299247    0361013    0474077    1192032
40      2333-5667   100  EB estimator  024882     064963     1219656    17107      2248716    930007
                         bias          0000564    0293602    0549809    0757937    1007851    486688
                         MSE           0          0194196    0271669    0419181    045917     3239598
                         RRMSE         0          0300116    0422209    0502895    0641904    2202294
50      2333-6333   100  EB estimator  0122548    0570083    1028619    1291758    1728067    6750472
                         bias          000029     0250747    0453265    0622838    0803185    4009352
                         MSE           0          0235733    0306641    0456258    05091      3652167
                         RRMSE         0          0410357    0585765    0712314    0841838    3240156
60      30-70       100  EB estimator  -077338    044443     0699758    0944038    1131071    6323352
                         bias          0000452    020433     0398131    0534095    0679938    3848209
                         MSE           0          0254097    0330078    2619677    0539873    2354887
                         RRMSE         -663045    0448118    0750369    -034911    1209918    1767434
70      4333-7333   100  EB estimator  -33274     0249515    0442513    0659375    0922519    9258959
                         bias          0000375    0155154    0316124    0476883    0588926    8475103
                         MSE           0          0235378    0402092    9500073    0876569    6051162
                         RRMSE         -10741     0288999    0995659    -100163    2527784    3332419
80      5333-90     100  EB estimator  -232889    017621     0305365    0569959    0576346    6303601
                         bias          0000395    0116669    0254473    1091172    0497898    6297454
                         MSE           0          0          0301527    1444250    5718409    185E+09
                         RRMSE         -212936    0          1104113    2205437    5656681    4151289
90      70-100      100  EB estimator  -108767    0111208    0230315    0212247    0353129    3625557
                         bias          000016     0086       0177169    0425532    0314714    1092655
                         MSE           0          0          0159682    41595285   3074073    12E+11
                         RRMSE         -909131    0          0557622    677E+09    9311925    706E+11


Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0)  Actual      r    Statistic     min        Q1         median     mean       Q3         max
10      10-3333     100  EB estimator  0267752    1500256    3195861    4280907    5833922    2220705
                         bias          0000603    0485515    0882468    1315721    1750173    8704672
                         MSE           0          010933     0168797    0450626    0369775    360843
                         RRMSE         0          0095932    0135647    023675     0239669    5518468
20      1333-3667   100  EB estimator  105E-08    0914898    221594     3017228    401361     1815694
                         bias          0000368    0383426    0780984    1105029    1496623    7906918
                         MSE           0          0126202    0201463    0428006    0414597    1734815
                         RRMSE         0          0142648    020709     0320663    0395479    3177943
30      20-5333     100  EB estimator  0132041    0719086    1523909    2308745    3012309    1228058
                         bias          0000508    0332427    0680187    0928947    1254604    6314973
                         MSE           0          0156942    0277017    0716543    0590466    7469014
                         RRMSE         0          0203913    0311937    0506882    0615401    3500387
40      2333-5667   100  EB estimator  105E-08    0574265    1209034    1742928    2368713    104953
                         bias          0000564    0268248    0544049    0771741    1067061    4889872
                         MSE           0          0181557    0284338    0549835    0498521    423089
                         RRMSE         0          0270309    0405926    0606317    0766631    5394515
50      2333-6333   100  EB estimator  105E-08    0426701    1033816    133848     1906961    8018962
                         bias          0000453    0224726    0454522    0661709    0900005    4414442
                         MSE           0          0194818    0334706    094973     0711939    7997074
                         RRMSE         0          0316402    0576343    6561235    1240175    13388294
60      30-70       100  EB estimator  105E-08    030085     0645848    0985327    1154975    728326
                         bias          62E-05     0190886    0406245    0567657    074167     3923952
                         MSE           0          0078006    0376514    0749436    0804116    3426488
                         RRMSE         0          0258286    0698814    2340612    1714808    3308816
70      4333-7333   100  EB estimator  105E-08    105E-08    0341315    0677841    1          5005491
                         bias          979E-05    0128017    0358257    0487174    0654423    3733981
                         MSE           0          0          0255331    1501268    1132152    264456
                         RRMSE         0          0          0688797    1346552    2500825    7899681
80      5333-90     100  EB estimator  105E-08    105E-08    0142906    0445315    0859305    5
                         bias          0000161    0083397    0272773    0392826    0557213    3532556
                         MSE           0          0          0          1755486    1452962    1132741
                         RRMSE         0          0          0          7335062    3311711    3786684
90      70-100      100  EB estimator  1E-277     105E-08    105E-08    0225165    0135512    3
                         bias          0000495    0054221    0153374    027819     0350213    2736904
                         MSE           0          0          0          2954908    152E-06    613E+08
                         RRMSE         0          0          0          12E+278    416189     16E+281



10

Appendix 4 Result of EB estimation (II) with ZINB

P(Y=0) Aktual r min Q1 median mean Q3 max10 10-3333 100 EB estimator 0267752 1500256 3195861 4280907 5833922 2220705

bias 0000603 0485515 0882468 1315721 1750173 8704672MSE 0 010933 0168797 0450626 0369775 360843RRMSE 0 0095932 0135647 023675 0239669 5518468

20 1333-3667 100 EB estimator 105E-08 0914898 221594 3017228 401361 1815694bias 0000368 0383426 0780984 1105029 1496623 7906918MSE 0 0126202 0201463 0428006 0414597 1734815RRMSE 0 0142648 020709 0320663 0395479 3177943

30 20-5333 100 EB estimator 0132041 0719086 1523909 2308745 3012309 1228058bias 0000508 0332427 0680187 0928947 1254604 6314973MSE 0 0156942 0277017 0716543 0590466 7469014RRMSE 0 0203913 0311937 0506882 0615401 3500387

40 2333-5667 100 EB estimator 105E-08 0574265 1209034 1742928 2368713 104953bias 0000564 0268248 0544049 0771741 1067061 4889872MSE 0 0181557 0284338 0549835 0498521 423089RRMSE 0 0270309 0405926 0606317 0766631 5394515

50 2333-6333 100 EB estimator 105E-08 0426701 1033816 133848 1906961 8018962bias 0000453 0224726 0454522 0661709 0900005 4414442MSE 0 0194818 0334706 094973 0711939 7997074RRMSE 0 0316402 0576343 6561235 1240175 13388294

60 30-70 100 EB estimator 105E-08 030085 0645848 0985327 1154975 728326bias 62E-05 0190886 0406245 0567657 074167 3923952MSE 0 0078006 0376514 0749436 0804116 3426488RRMSE 0 0258286 0698814 2340612 1714808 3308816

70 4333-7333 100 EB estimator 105E-08 105E-08 0341315 0677841 1 5005491bias 979E-05 0128017 0358257 0487174 0654423 3733981MSE 0 0 0255331 1501268 1132152 264456RRMSE 0 0 0688797 1346552 2500825 7899681

80 5333-90 100 EB estimator 105E-08 105E-08 0142906 0445315 0859305 5bias 0000161 0083397 0272773 0392826 0557213 3532556MSE 0 0 0 1755486 1452962 1132741RRMSE 0 0 0 7335062 3311711 3786684

90 70-100 100 EB estimator 1E-277 105E-08 105E-08 0225165 0135512 3bias 0000495 0054221 0153374 027819 0350213 2736904MSE 0 0 0 2954908 152E-06 613E+08

RRMSE 0 0 0 12E+278 416189 16E+281

11

Appendix 5. SAS program for generating the data

data b; /* generate x1 (covariate) and ei */
  input x1;
  cards;
0222831971100013131702314625252218171412202210
;
run;

%macro bangkit_data;
%do r=1 %to 100;

  data e; /* generate Poisson-gamma data with excess zeros */
    do kk=1 to 30;
      set b;
      tetha = rangam(1,1);
      lambda = -log(0.1); /* desired probability of a zero value (0.1-0.9) */
      starlambda = log(lambda/tetha);
      output;
    end;
  run;

  proc reg;
    model starlambda = x1;
    ods output ParameterEstimates=work.betha_lr (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_lr out=work.betha_lr_t;
  run;

  data _null_;
    set work.betha_lr_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
  run;

  data d;
    do kk=1 to 30;
      set e;
      mu = exp(&Intercept + &x1*x1);
      parmlambda = mu*tetha;
      ypoi = rand('poisson', parmlambda);
      output;
    end;
  run;

  ods trace on;
  /* get the percentage of zeros in the data */
  proc freq data=d;
    tables ypoi;
    ods output OneWayFreqs=work.zero;
  run;

  data zero;
    set zero;
    keep percent;
  run;

  proc transpose data=zero out=zero1;
  run;

  data _null_;
    set work.zero1;
    call symput('pctz', col1);
  run;

  data d;
    set d;
    pzero=&pctz;
    r=&r;
  run;

  proc append data=d base=d1;
  run;

%end;
%mend;

%bangkit_data;
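The generating scheme in this listing can be summarized in equation form. This is a reading of the code rather than a formula stated in the text: the symbols p_0, b_0, b_1 and \varepsilon_i are introduced here for illustration, and the quotient inside the logarithm is an assumption about the `starlambda` line.

```latex
\[
\begin{aligned}
\lambda &= -\ln p_0
  \quad\text{so that}\quad P(Y = 0 \mid \lambda) = e^{-\lambda} = p_0, \\
\theta_i &\sim \text{Gamma}
  \quad\text{(drawn with \texttt{rangam})}, \\
\ln(\lambda / \theta_i) &= b_0 + b_1 x_{1i} + \varepsilon_i
  \quad\text{(least squares fit, \texttt{proc reg})}, \\
\mu_i &= \exp(\hat b_0 + \hat b_1 x_{1i}), \qquad
\lambda_i = \mu_i \theta_i, \qquad
y_i \sim \text{Poisson}(\lambda_i), \qquad i = 1, \dots, 30 .
\end{aligned}
\]
```

For example, a target zero probability of p_0 = 0.1 gives \lambda = -\ln 0.1 \approx 2.3026, the value set in the listing.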


Appendix 6. SAS program for the EB estimator with NBR

%macro sae_nb;
%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  /* this genmod procedure estimates the response without zero-inflation */
  proc genmod data=a;
    model ypoi = x1 / dist=nb link=log;
    ods output ParameterEstimates=work.betha_nb (keep=Parameter Estimate);
  run;

  proc transpose data=work.betha_nb out=work.betha_nb_t;
  run;

  data _null_;
    set work.betha_nb_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Dispersion', col3);
  run;

  /* EB with negative binomial regression */
  data work.duga_nb;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &Dispersion);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&Dispersion+ypoi)/((mu_hat_b+&Dispersion)**2);
    bias_b=abs(teta_hat_bayes-parmlambda);
  run;

  proc append data=work.duga_nb base=work.duga_nb1;
  run;

  /* jackknife */
  %do h=1 %to 30;

    data work.d;
      set work.duga_nb1;
      if ^(u=&x) then delete;
    run;

    data work.jacknb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc genmod data=work.jacknb&h;
      /* output p out=sas.yi_est; */
      model ypoi = x1 / dist=nb link=log;
      ods output ParameterEstimates=work.betha_est_nb&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_nb&h out=work.betha_est_nbt&h;
    run;

    data _null_;
      set work.betha_est_nbt&h;
      call symput('Intercept_', col1);
      call symput('x1_', col2);
      call symput('Dispersion_', col3);
    run;

    data work.duganb&h;
      set work.d;
      mu_hat_b_&h=exp(&Intercept_ + &x1_*x1);
      w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + &Dispersion_);
      teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
      delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
      g1_&h=(&Dispersion_+ypoi)/((mu_hat_b_&h+&Dispersion_)**2);
      beda_g_&h=g1_&h-g1;
    run;

    data work.mse_nb_j;
      merge work.duganb1 work.duganb2 work.duganb3 work.duganb4 work.duganb5
            work.duganb6 work.duganb7 work.duganb8 work.duganb9 work.duganb10
            work.duganb11 work.duganb12 work.duganb13 work.duganb14 work.duganb15
            work.duganb16 work.duganb17 work.duganb18 work.duganb19 work.duganb20
            work.duganb21 work.duganb22 work.duganb23 work.duganb24 work.duganb25
            work.duganb26 work.duganb27 work.duganb28 work.duganb29 work.duganb30;
      by kk;
    run;

    data work.mse_nb_j;
      set work.mse_nb_j;
      t_sum=0;
      g_sum=0;
      %do j=1 %to 30;
        g_sum=g_sum+beda_g_&j;
        t_sum=t_sum+delta_&j;
      %end;
      m2i=((30-1)/30)*t_sum;
      m1i=g1-((30-1)/30)*g_sum;
      mse_j_b=m1i+m2i;
      rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
      ul = &x;
    run;

    proc append data=work.mse_nb_j base=work.mse_nb_j1;
    run;

    data work.hasilnb;
      merge work.d work.mse_nb_j;
      keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

  %end;

  proc append data=work.hasilnb BASE=work.hasilnb1 appendver=v6;
  run;

%END;
%mend;

%sae_nb;
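The quantities computed in this listing (`mu_hat_b`, `w_bayes`, `teta_hat_bayes`, `g1`, `m1i`, `m2i`, `mse_j_b`, `rrmse_j_b`) can be written out as follows. This is a sketch read off the code, assuming the missing operators are a quotient in the weight and in g1 and a square on the bracketed terms. The symbols are introduced here: \hat\alpha stands for the value stored in the `Dispersion` macro variable, m = 30 is the number of areas, and the subscript (-l) marks a fit with area l left out.

```latex
\[
\begin{aligned}
\hat\mu_i &= \exp(\hat\beta_0 + \hat\beta_1 x_{1i}), &
w_i &= \frac{\hat\mu_i}{\hat\mu_i + \hat\alpha}, \\
\hat\theta_i^{EB} &= w_i\, y_i + (1 - w_i)\,\hat\mu_i, &
g_{1i} &= \frac{\hat\alpha + y_i}{(\hat\mu_i + \hat\alpha)^2}, \\
\hat M_{1i} &= g_{1i} - \frac{m-1}{m} \sum_{l=1}^{m} \bigl( g_{1i(-l)} - g_{1i} \bigr), &
\hat M_{2i} &= \frac{m-1}{m} \sum_{l=1}^{m} \bigl( \hat\theta^{EB}_{i(-l)} - \hat\theta^{EB}_{i} \bigr)^2, \\
\mathrm{MSE}_J(\hat\theta_i^{EB}) &= \hat M_{1i} + \hat M_{2i}, &
\mathrm{RRMSE}_i &= \frac{\sqrt{\mathrm{MSE}_J(\hat\theta_i^{EB})}}{\hat\theta_i^{EB}} .
\end{aligned}
\]
```

As a hypothetical numerical check, \hat\mu_i = 2, \hat\alpha = 1 and y_i = 0 give w_i = 2/3, \hat\theta_i^{EB} = 2/3 and g_{1i} = 1/9.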


Appendix 7. SAS program for the EB estimator with ZINB

%macro sae_zinb;

%do x=1 %to 900;

  data work.a;
    set work.e;
    if ^(u=&x) then delete;
  run;

  proc countreg data=a;
    model ypoi=x1 / dist=zinb method=qn;
    zeromodel ypoi ~ x1;
    ods output ParameterEstimates=work.pe (keep=Parameter Estimate);
  run;

  proc transpose data=work.pe out=work.pe_t;
  run;

  data _null_;
    set work.pe_t;
    call symput('Intercept', col1);
    call symput('x1', col2);
    call symput('Inf_Intercept', col3);
    call symput('Inf_x1', col4);
    call symput('_Alpha', col5);
  run;

  data work.duga;
    set a;
    mu_hat_b=exp(&Intercept + &x1*x1);
    w_bayes=mu_hat_b/(mu_hat_b + &_Alpha);
    teta_hat_bayes=w_bayes*ypoi+(1-w_bayes)*mu_hat_b;
    g1=(&_Alpha+ypoi)/((mu_hat_b+&_Alpha)**2);
    bias_b=abs(teta_hat_bayes-parmlamdha);
  run;

  proc append data=work.duga base=work.duga1;
  run;

  %do h=1 %to 30;

    data work.d;
      set work.duga1;
      if ^(u=&x) then delete;
    run;

    data work.jackzinb&h;
      set work.d;
      if u=&x;
      if kk=&h then delete;
    run;

    proc countreg data=jackzinb&h;
      model ypoi=x1 / dist=zinb method=qn;
      zeromodel ypoi ~ x1;
      ods output ParameterEstimates=work.betha_est_ZINB&h (keep=Parameter Estimate);
    run;

    proc transpose data=work.betha_est_ZINB&h out=work.betha_est_ZINBt&h;
    run;

    data _null_;
      set work.betha_est_ZINBt&h;
      call symput('Intercept', col1);
      call symput('x1', col2);
      call symput('Inf_Intercept', col3);
      call symput('Inf_x1', col4);
      call symput('_Alpha', col5);
    run;

    data work.dugaZINB&h;
      set work.d;
      mu_hat_b_&h=exp(&Intercept + &x1*x1);
      /* mu_hat_b_&h= &b_o- + &b_1- x1 */
      w_b_&h=mu_hat_b_&h/(mu_hat_b_&h + (&_Alpha));
      teta_hat_&h=w_b_&h*ypoi+(1-w_b_&h)*mu_hat_b_&h;
      delta_&h=(teta_hat_&h - teta_hat_bayes)**2;
      /* g1_&h =((mu_hat_b_&h**2*&alpha_)**2)*(&alpha_+y_i)/((mu_hat_b_&h**2*&alpha_)+mu_hat_b_&h)**2 */
      g1_&h=(&_Alpha+ypoi)/((mu_hat_b_&h+&_Alpha)**2);
      /* g1_&h =(A**2)*(&k- + y_i)/( a +mu_hat_b)**2 */
      beda_g_&h=g1_&h-g1;
    run;

    data work.mse_ZINB_j;
      merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
            work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
            work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
            work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
            work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
            work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
      by kk;
    run;

    data work.mse_ZINB_j;
      set work.mse_ZINB_j;
      t_sum=0;
      g_sum=0;
      %do j=1 %to 30;
        g_sum=g_sum+beda_g_&j;
        t_sum=t_sum+delta_&j;
      %end;
      m2i=((30-1)/30)*t_sum;
      m1i=g1-((30-1)/30)*g_sum;
      mse_j_b=m1i+m2i;
      rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
    run;

    data work.hasilZINB;
      merge work.d work.mse_ZINB_j;
      keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b;
    run;

    ods listing;
    option nodate ls=130 ps=130;
    ods html;

  %end;

  proc append data=work.hasilZINB BASE=work.hasilZINB1;
  run;

%END;
%mend;

%sae_zinb;
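For reference, the zero-inflated negative binomial model requested with `dist=zinb` and `zeromodel` is usually written as below. This is the standard textbook form rather than a formula taken from this appendix. The symbols \pi_i, \gamma_0 and \gamma_1 are introduced here; \gamma_0 and \gamma_1 correspond to the `Inf_Intercept` and `Inf_x1` estimates, and the logistic form of \pi_i assumes the procedure's default link for the zero model.

```latex
\[
P(Y_i = y) =
\begin{cases}
\pi_i + (1 - \pi_i)\,(1 + \alpha\mu_i)^{-1/\alpha}, & y = 0, \\[6pt]
(1 - \pi_i)\,
\dfrac{\Gamma(y + 1/\alpha)}{\Gamma(1/\alpha)\, y!}
\left( \dfrac{\alpha\mu_i}{1 + \alpha\mu_i} \right)^{y}
(1 + \alpha\mu_i)^{-1/\alpha}, & y = 1, 2, \dots
\end{cases}
\]
\[
\mu_i = \exp(\beta_0 + \beta_1 x_{1i}), \qquad
\pi_i = \frac{\exp(\gamma_0 + \gamma_1 x_{1i})}{1 + \exp(\gamma_0 + \gamma_1 x_{1i})} .
\]
```

In the listing, the EB step uses only the count-model coefficients (`Intercept`, `x1`) and `_Alpha`; the inflation coefficients are stored as macro variables but do not enter `mu_hat_b`, `w_bayes` or `g1`.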

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 18: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

11

Appendix 5 Syntax program for generate data

data b generate x1(covariate) and ei input x1cards0222831971100013131702314625252218171412202210run

macro bangkit_datado r=1 to 100

data egenerate poisson-gamma with excess zerodo kk=1 to 30set btetha = rangam(11)lambda = -log(01) peluang munculnya nilai nol yang diinginkan (01-09)starlambda = log(lambdatetha)output endrun

proc regmodel starlambda = x1 ods output ParameterEstimates=workbetha_lr (keep=Parameter Estimate)run

proc transpose data=workbetha_lr out=workbetha_lr_t

12

Appendix 5 Syntax program for generate data (continued)

rundata _null_set workbetha_lr_tcall symput (Intercept col1)call symput (x1 col2)run

data ddo kk=1 to 30set emu = exp(ampIntercept + ampx1x1)parmlambda = mutethaypoi = rand(poissonparmlambda)output endrun

ods trace onto take percent zero on dataproc freq data=dtables ypoi ods output OneWayFreqs=workzerorundata zeroset zerokeep percentrunproc transpose data=zero out=zero1 rundata _null_set workzero1call symput (pctz col1)rundata dset dpzero=amppctzr=amprrun

proc append data=d base=d1run

endmend

bangkit_data

13

Appendix 6 Syntax program EB with NBR

macro sae_nbdo x=1 to 900

data workaset workeif ^(u=ampx) then deleterun

this genmod procedure estimates the response without zero-inflation proc genmod data=amodel ypoi = x1 dist=nb link=logods output ParameterEstimates=workbetha_nb (keep=Parameter Estimate)run

proc transpose data=workbetha_nb out=workbetha_nb_trun

data _null_set workbetha_nb_tcall symput (Intercept col1)call symput (x1 col2)call symput (Dispersion col3)run

EB with negbin-regdata workduga_nbset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + ampDispersion)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(ampDispersion+ypoi)((mu_hat_b+ampDispersion)2)bias_b=abs(teta_hat_bayes-parmlambda)run

proc append data=workduga_nb base=workduga_nb1run

jacknifedo h=1 to 30

data workdset workduga_nb1if ^(u=ampx) then deleterundata workjacknbamphset workdif u=ampxif kk=amph then deleterun

proc genmod data=workjacknbamph output p out=sasyi_estmodel ypoi = x1 dist = nb link=logods output parameterestimates=workbetha_est_nbamph (keep=parameter Estimate)

14

Appendix 6 Syntax program EB with NBR (continued)

runproc transpose data=workbetha_est_nbamph out=workbetha_est_nbtamphrundata _null_set workbetha_est_nbtamphcall symput (Intercept_ col1)call symput (x1_ col2)call symput (Dispersion_ col3)run

data workduganbamphset workdmu_hat_b_amph=exp(ampIntercept_ + ampx1_x1)w_b_amph=mu_hat_b_amph (mu_hat_b_amph + ampDispersion_)teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2g1_amph=(ampDispersion_+ypoi)((mu_hat_b_amph+ampDispersion_)2)beda_g_amph=g1_amph-g1run

data workmse_nb_jmerge workduganb1 workduganb2 workduganb3 workduganb4 workduganb5 workduganb6 workduganb7 workduganb8 workduganb9 workduganb10 workduganb11 workduganb12workduganb13 workduganb14 workduganb15 workduganb16 workduganb17workduganb18 workduganb19 workduganb20 workduganb21 workduganb22workduganb23 workduganb24 workduganb25 workduganb26 workduganb27workduganb28 workduganb29 workduganb30by kkrun

data workmse_nb_jset workmse_nb_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampjendm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesul = ampxrun

proc append data=workmse_nb_j base=workmse_nb_j1run

data workhasilnbmerge workd workmse_nb_j keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b

15

Appendix 6 Syntax program EB with NBR (continued)

run

ods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilnb BASE=workhasilnb1 appendver=v6run

ENDmend

sae_nb

16

Appendix 7 Syntax program EB with ZINB

macro sae_zinb

do x=1 to 900

data workaset work eif ^(u=ampx) then deleterun

proc countreg data=amodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workpe(keep=Parameter Estimate)run

proc transpose data=workpe out=workpe_trun

data _null_set workpe_tcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph

17

Appendix 7 Syntax program EB with ZINB (continued)

(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data workmse_ZINB_jmerge workdugaZINB1 workdugaZINB2 workdugaZINB3 workdugaZINB4 workdugaZINB5 workdugaZINB6 workdugaZINB7 workdugaZINB8 workdugaZINB9 workdugaZINB10 workdugaZINB11 workdugaZINB12workdugaZINB13 workdugaZINB14 workdugaZINB15 workdugaZINB16 workdugaZINB17workdugaZINB18 workdugaZINB19 workdugaZINB20 workdugaZINB21 workdugaZINB22workdugaZINB23 workdugaZINB24 workdugaZINB25 workdugaZINB26 workdugaZINB27workdugaZINB28 workdugaZINB29 workdugaZINB30by kkrun

data workmse_ZINB_jset workmse_ZINB_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampj

18

Appendix 7 Syntax program EB with ZINB (continued)

endm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesrun

data workhasilZINBmerge workd workmse_ZINB_j keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_brunods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilZINB BASE=workhasilZINB1run

ENDmend

sae_zinb

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 19: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

12

Appendix 5 Syntax program for generate data (continued)

rundata _null_set workbetha_lr_tcall symput (Intercept col1)call symput (x1 col2)run

data ddo kk=1 to 30set emu = exp(ampIntercept + ampx1x1)parmlambda = mutethaypoi = rand(poissonparmlambda)output endrun

ods trace onto take percent zero on dataproc freq data=dtables ypoi ods output OneWayFreqs=workzerorundata zeroset zerokeep percentrunproc transpose data=zero out=zero1 rundata _null_set workzero1call symput (pctz col1)rundata dset dpzero=amppctzr=amprrun

proc append data=d base=d1run

endmend

bangkit_data

13

Appendix 6 Syntax program EB with NBR

macro sae_nbdo x=1 to 900

data workaset workeif ^(u=ampx) then deleterun

this genmod procedure estimates the response without zero-inflation proc genmod data=amodel ypoi = x1 dist=nb link=logods output ParameterEstimates=workbetha_nb (keep=Parameter Estimate)run

proc transpose data=workbetha_nb out=workbetha_nb_trun

data _null_set workbetha_nb_tcall symput (Intercept col1)call symput (x1 col2)call symput (Dispersion col3)run

EB with negbin-regdata workduga_nbset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + ampDispersion)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(ampDispersion+ypoi)((mu_hat_b+ampDispersion)2)bias_b=abs(teta_hat_bayes-parmlambda)run

proc append data=workduga_nb base=workduga_nb1run

jacknifedo h=1 to 30

data workdset workduga_nb1if ^(u=ampx) then deleterundata workjacknbamphset workdif u=ampxif kk=amph then deleterun

proc genmod data=workjacknbamph output p out=sasyi_estmodel ypoi = x1 dist = nb link=logods output parameterestimates=workbetha_est_nbamph (keep=parameter Estimate)

14

Appendix 6 Syntax program EB with NBR (continued)

runproc transpose data=workbetha_est_nbamph out=workbetha_est_nbtamphrundata _null_set workbetha_est_nbtamphcall symput (Intercept_ col1)call symput (x1_ col2)call symput (Dispersion_ col3)run

data workduganbamphset workdmu_hat_b_amph=exp(ampIntercept_ + ampx1_x1)w_b_amph=mu_hat_b_amph (mu_hat_b_amph + ampDispersion_)teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2g1_amph=(ampDispersion_+ypoi)((mu_hat_b_amph+ampDispersion_)2)beda_g_amph=g1_amph-g1run

data workmse_nb_jmerge workduganb1 workduganb2 workduganb3 workduganb4 workduganb5 workduganb6 workduganb7 workduganb8 workduganb9 workduganb10 workduganb11 workduganb12workduganb13 workduganb14 workduganb15 workduganb16 workduganb17workduganb18 workduganb19 workduganb20 workduganb21 workduganb22workduganb23 workduganb24 workduganb25 workduganb26 workduganb27workduganb28 workduganb29 workduganb30by kkrun

data workmse_nb_jset workmse_nb_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampjendm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesul = ampxrun

proc append data=workmse_nb_j base=workmse_nb_j1run

data workhasilnbmerge workd workmse_nb_j keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b

15

Appendix 6 Syntax program EB with NBR (continued)

run

ods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilnb BASE=workhasilnb1 appendver=v6run

ENDmend

sae_nb

16

Appendix 7 Syntax program EB with ZINB

macro sae_zinb

do x=1 to 900

data workaset work eif ^(u=ampx) then deleterun

proc countreg data=amodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workpe(keep=Parameter Estimate)run

proc transpose data=workpe out=workpe_trun

data _null_set workpe_tcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph

17

Appendix 7 Syntax program EB with ZINB (continued)

(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data workmse_ZINB_jmerge workdugaZINB1 workdugaZINB2 workdugaZINB3 workdugaZINB4 workdugaZINB5 workdugaZINB6 workdugaZINB7 workdugaZINB8 workdugaZINB9 workdugaZINB10 workdugaZINB11 workdugaZINB12workdugaZINB13 workdugaZINB14 workdugaZINB15 workdugaZINB16 workdugaZINB17workdugaZINB18 workdugaZINB19 workdugaZINB20 workdugaZINB21 workdugaZINB22workdugaZINB23 workdugaZINB24 workdugaZINB25 workdugaZINB26 workdugaZINB27workdugaZINB28 workdugaZINB29 workdugaZINB30by kkrun

data workmse_ZINB_jset workmse_ZINB_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampj

18

Appendix 7 Syntax program EB with ZINB (continued)

endm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesrun

data workhasilZINBmerge workd workmse_ZINB_j keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_brunods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilZINB BASE=workhasilZINB1run

ENDmend

sae_zinb

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 20: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

13

Appendix 6 Syntax program EB with NBR

macro sae_nbdo x=1 to 900

data workaset workeif ^(u=ampx) then deleterun

this genmod procedure estimates the response without zero-inflation proc genmod data=amodel ypoi = x1 dist=nb link=logods output ParameterEstimates=workbetha_nb (keep=Parameter Estimate)run

proc transpose data=workbetha_nb out=workbetha_nb_trun

data _null_set workbetha_nb_tcall symput (Intercept col1)call symput (x1 col2)call symput (Dispersion col3)run

EB with negbin-regdata workduga_nbset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + ampDispersion)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(ampDispersion+ypoi)((mu_hat_b+ampDispersion)2)bias_b=abs(teta_hat_bayes-parmlambda)run

proc append data=workduga_nb base=workduga_nb1run

jacknifedo h=1 to 30

data workdset workduga_nb1if ^(u=ampx) then deleterundata workjacknbamphset workdif u=ampxif kk=amph then deleterun

proc genmod data=workjacknbamph output p out=sasyi_estmodel ypoi = x1 dist = nb link=logods output parameterestimates=workbetha_est_nbamph (keep=parameter Estimate)

14

Appendix 6 Syntax program EB with NBR (continued)

runproc transpose data=workbetha_est_nbamph out=workbetha_est_nbtamphrundata _null_set workbetha_est_nbtamphcall symput (Intercept_ col1)call symput (x1_ col2)call symput (Dispersion_ col3)run

data workduganbamphset workdmu_hat_b_amph=exp(ampIntercept_ + ampx1_x1)w_b_amph=mu_hat_b_amph (mu_hat_b_amph + ampDispersion_)teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2g1_amph=(ampDispersion_+ypoi)((mu_hat_b_amph+ampDispersion_)2)beda_g_amph=g1_amph-g1run

data workmse_nb_jmerge workduganb1 workduganb2 workduganb3 workduganb4 workduganb5 workduganb6 workduganb7 workduganb8 workduganb9 workduganb10 workduganb11 workduganb12workduganb13 workduganb14 workduganb15 workduganb16 workduganb17workduganb18 workduganb19 workduganb20 workduganb21 workduganb22workduganb23 workduganb24 workduganb25 workduganb26 workduganb27workduganb28 workduganb29 workduganb30by kkrun

data workmse_nb_jset workmse_nb_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampjendm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesul = ampxrun

proc append data=workmse_nb_j base=workmse_nb_j1run

data workhasilnbmerge workd workmse_nb_j keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b

15

Appendix 6 Syntax program EB with NBR (continued)

run

ods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilnb BASE=workhasilnb1 appendver=v6run

ENDmend

sae_nb

16

Appendix 7 Syntax program EB with ZINB

macro sae_zinb

do x=1 to 900

data workaset work eif ^(u=ampx) then deleterun

proc countreg data=amodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workpe(keep=Parameter Estimate)run

proc transpose data=workpe out=workpe_trun

data _null_set workpe_tcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph

17

Appendix 7 Syntax program EB with ZINB (continued)

(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data workmse_ZINB_jmerge workdugaZINB1 workdugaZINB2 workdugaZINB3 workdugaZINB4 workdugaZINB5 workdugaZINB6 workdugaZINB7 workdugaZINB8 workdugaZINB9 workdugaZINB10 workdugaZINB11 workdugaZINB12workdugaZINB13 workdugaZINB14 workdugaZINB15 workdugaZINB16 workdugaZINB17workdugaZINB18 workdugaZINB19 workdugaZINB20 workdugaZINB21 workdugaZINB22workdugaZINB23 workdugaZINB24 workdugaZINB25 workdugaZINB26 workdugaZINB27workdugaZINB28 workdugaZINB29 workdugaZINB30by kkrun

data workmse_ZINB_jset workmse_ZINB_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampj

18

Appendix 7 Syntax program EB with ZINB (continued)

endm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesrun

data workhasilZINBmerge workd workmse_ZINB_j keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_brunods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilZINB BASE=workhasilZINB1run

ENDmend

sae_zinb

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 21: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

14

Appendix 6 Syntax program EB with NBR (continued)

runproc transpose data=workbetha_est_nbamph out=workbetha_est_nbtamphrundata _null_set workbetha_est_nbtamphcall symput (Intercept_ col1)call symput (x1_ col2)call symput (Dispersion_ col3)run

data workduganbamphset workdmu_hat_b_amph=exp(ampIntercept_ + ampx1_x1)w_b_amph=mu_hat_b_amph (mu_hat_b_amph + ampDispersion_)teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2g1_amph=(ampDispersion_+ypoi)((mu_hat_b_amph+ampDispersion_)2)beda_g_amph=g1_amph-g1run

data workmse_nb_jmerge workduganb1 workduganb2 workduganb3 workduganb4 workduganb5 workduganb6 workduganb7 workduganb8 workduganb9 workduganb10 workduganb11 workduganb12workduganb13 workduganb14 workduganb15 workduganb16 workduganb17workduganb18 workduganb19 workduganb20 workduganb21 workduganb22workduganb23 workduganb24 workduganb25 workduganb26 workduganb27workduganb28 workduganb29 workduganb30by kkrun

data workmse_nb_jset workmse_nb_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampjendm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesul = ampxrun

proc append data=workmse_nb_j base=workmse_nb_j1run

data workhasilnbmerge workd workmse_nb_j keep kk x1 tetha mu parmlambda ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_b

15

Appendix 6 Syntax program EB with NBR (continued)

run

ods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilnb BASE=workhasilnb1 appendver=v6run

ENDmend

sae_nb

16

Appendix 7 Syntax program EB with ZINB

macro sae_zinb

do x=1 to 900

data workaset work eif ^(u=ampx) then deleterun

proc countreg data=amodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workpe(keep=Parameter Estimate)run

proc transpose data=workpe out=workpe_trun

data _null_set workpe_tcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph

17

Appendix 7 Syntax program EB with ZINB (continued)

(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data workmse_ZINB_jmerge workdugaZINB1 workdugaZINB2 workdugaZINB3 workdugaZINB4 workdugaZINB5 workdugaZINB6 workdugaZINB7 workdugaZINB8 workdugaZINB9 workdugaZINB10 workdugaZINB11 workdugaZINB12workdugaZINB13 workdugaZINB14 workdugaZINB15 workdugaZINB16 workdugaZINB17workdugaZINB18 workdugaZINB19 workdugaZINB20 workdugaZINB21 workdugaZINB22workdugaZINB23 workdugaZINB24 workdugaZINB25 workdugaZINB26 workdugaZINB27workdugaZINB28 workdugaZINB29 workdugaZINB30by kkrun

data workmse_ZINB_jset workmse_ZINB_jt_sum=0g_sum=0do j=1 to 30g_sum=g_sum+beda_g_ampjt_sum=t_sum+delta_ampj

18

Appendix 7 Syntax program EB with ZINB (continued)

endm2i=((30-1)30)t_summ1i=g1-((30-1)30)g_summse_j_b=m1i+m2irrmse_j_b=sqrt(mse_j_b)teta_hat_bayesrun

data workhasilZINBmerge workd workmse_ZINB_j keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes teta_hat_bayes bias_b mse_j_b rrmse_j_brunods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilZINB BASE=workhasilZINB1run

ENDmend

sae_zinb

  • KOPER AMPE PRAKATA_2pdf
  • isiirenepdf
Page 22: Zero-Inflated Negative Binomial in Small Area Estimation · Irianto Oetomo and Fine Analisa Maharani. She has two siblings. In 1998, she graduated from SD Dukuh 09 East Jakarta and

15

Appendix 6 Syntax program EB with NBR (continued)

run

ods listingoption nodate ls=130 ps=130ods html

end

proc append data=workhasilnb BASE=workhasilnb1 appendver=v6run

ENDmend

sae_nb

16

Appendix 7 Syntax program EB with ZINB

macro sae_zinb

do x=1 to 900

data workaset work eif ^(u=ampx) then deleterun

proc countreg data=amodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workpe(keep=Parameter Estimate)run

proc transpose data=workpe out=workpe_trun

data _null_set workpe_tcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaset amu_hat_b=exp(ampIntercept + ampx1x1) w_bayes=mu_hat_b(mu_hat_b + amp_Alpha)teta_hat_bayes=w_bayesypoi+(1-w_bayes)mu_hat_bg1=(amp_Alpha+ypoi)((mu_hat_b+amp_Alpha)2)bias_b=abs(teta_hat_bayes-parmlamdha)

run

proc append data=workduga base=workduga1run

do h=1 to 30

data workdset workduga1if ^(u=ampx) then deleterundata workjackzinbamphset workdif u=ampxif kk=amph then deleterun

proc countreg data=jackzinbamphmodel ypoi=x1dist=zinb method=qnzeromodel ypoi ~ x1ods output ParameterEstimates=workbetha_est_ZINBamph

17

Appendix 7 Syntax program EB with ZINB (continued)

(keep=Parameter Estimate)run

proc transpose data=workbetha_est_ZINBamph out=workbetha_est_ZINBtamphrun

data _null_set workbetha_est_ZINBtamphcall symput (Intercept col1)call symput (x1 col2)call symput (Inf_Intercept col3)call symput (Inf_x1 col4)call symput (_Alpha col5)run

data workdugaZINBamphset workdmu_hat_b_amph=exp(ampIntercept + ampx1x1)mu_hat_b_amph= ampb_o- + ampb_1- x1w_b_amph=mu_hat_b_amph (mu_hat_b_amph + (amp_Alpha))teta_hat_amph=w_b_amph ypoi+(1-w_b_amph)mu_hat_b_amphdelta_amph=(teta_hat_amph - teta_hat_bayes)2

g1_amph =((mu_hat_b_amph2ampalpha_)2)(ampalpha_+y_i)((mu_hat_b_amph2ampalpha_)+mu_hat_b_amph)2

g1_amph=(amp_Alpha+ypoi)((mu_hat_b_amph+amp_Alpha)2)

g1_amph =(A2)(ampk- + y_i)( a +mu_hat_b)2

beda_g_amph=g1_amph-g1run

data work.mse_ZINB_j;
  merge work.dugaZINB1 work.dugaZINB2 work.dugaZINB3 work.dugaZINB4 work.dugaZINB5
        work.dugaZINB6 work.dugaZINB7 work.dugaZINB8 work.dugaZINB9 work.dugaZINB10
        work.dugaZINB11 work.dugaZINB12 work.dugaZINB13 work.dugaZINB14 work.dugaZINB15
        work.dugaZINB16 work.dugaZINB17 work.dugaZINB18 work.dugaZINB19 work.dugaZINB20
        work.dugaZINB21 work.dugaZINB22 work.dugaZINB23 work.dugaZINB24 work.dugaZINB25
        work.dugaZINB26 work.dugaZINB27 work.dugaZINB28 work.dugaZINB29 work.dugaZINB30;
  by kk;
run;

data work.mse_ZINB_j;
  set work.mse_ZINB_j;
  t_sum=0;
  g_sum=0;
  %do j=1 %to 30;
    g_sum=g_sum+beda_g_&j;
    t_sum=t_sum+delta_&j;
  %end;
  m2i=((30-1)/30)*t_sum;
  m1i=g1-((30-1)/30)*g_sum;
  mse_j_b=m1i+m2i;
  rrmse_j_b=sqrt(mse_j_b)/teta_hat_bayes;
run;

data work.hasilZINB;
  merge work.d work.mse_ZINB_j;
  keep kk x1 tetha mu lamdha ypoi pzero r peluangnol u mu_hat_b w_bayes
       teta_hat_bayes bias_b mse_j_b rrmse_j_b;
run;

ods listing;
option nodate ls=130 ps=130;
ods html;

%end;

proc append data=work.hasilZINB base=work.hasilZINB1;
run;

%end;
%mend;

%sae_zinb;
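
The quantities computed by the macro can be summarised in formulas. This is a sketch under assumptions: the arithmetic operators in the listing were lost in extraction, so the divisions and squares below are inferred from the usual form of a shrinkage estimator and of a jackknife MSE estimator, and the symbols ($\hat\mu_i$, $\hat w_i$, $\hat\theta_i$, $g_{1i}$, $m$) are notation chosen here to mirror the variables `mu_hat_b`, `w_bayes`, `teta_hat_bayes`, `g1` and the 30 jackknife replicates. A subscript $-h$ marks a quantity recomputed after deleting the observation with `kk = h` and refitting the ZINB model.

```latex
\begin{align*}
\hat\mu_i    &= \exp(\hat\beta_0 + \hat\beta_1 x_i), \\
\hat w_i     &= \frac{\hat\mu_i}{\hat\mu_i + \hat\alpha}, \\
\hat\theta_i &= \hat w_i\, y_i + (1 - \hat w_i)\,\hat\mu_i, \\
g_{1i}       &= \frac{\hat\alpha + y_i}{(\hat\mu_i + \hat\alpha)^2}, \\[6pt]
M_{2i}       &= \frac{m-1}{m} \sum_{h=1}^{m} \bigl(\hat\theta_{i,-h} - \hat\theta_i\bigr)^2, \\
M_{1i}       &= g_{1i} - \frac{m-1}{m} \sum_{h=1}^{m} \bigl(g_{1i,-h} - g_{1i}\bigr), \\
\widehat{\mathrm{mse}}_J(\hat\theta_i) &= M_{1i} + M_{2i}, \qquad
\mathrm{rrmse}_i = \frac{\sqrt{\widehat{\mathrm{mse}}_J(\hat\theta_i)}}{\hat\theta_i},
\qquad m = 30.
\end{align*}
```

In the listing, `delta_&h` and `beda_g_&h` hold the individual summands of $M_{2i}$ and $M_{1i}$, and `t_sum` and `g_sum` accumulate them over the 30 replicates.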
