

Research Article

Group Identification and Variable Selection in Quantile Regression

Ali Alkenani 1 and Basim Shlaibah Msallam 2

1 Department of Statistics, College of Administration and Economics, University of Al-Qadisiyah, Iraq
2 Supervision & Scientific Evaluation Office, Ministry of Higher Education and Scientific Research, Iraq

Correspondence should be addressed to Ali Alkenani; [email protected]

Received 24 November 2018; Revised 21 February 2019; Accepted 28 March 2019; Published 10 April 2019

Guest Editor: Himel Mallick

Copyright © 2019 Ali Alkenani and Basim Shlaibah Msallam. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Using the Pairwise Absolute Clustering and Sparsity (PACS) penalty, we propose a regularized quantile regression (QR) method, QR-PACS. The PACS penalty achieves the elimination of insignificant predictors and the combination of predictors with indistinguishable coefficients (IC), which are the two issues raised in the search for the true model. QR-PACS extends PACS from mean regression settings to QR settings. The paper shows that QR-PACS can yield promising predictive precision as well as identify related groups in both simulation and real data.

1. Introduction

The regression model is one of the most important statistical models. Ordinary least squares (OLS) regression estimates the conditional mean function of the response. Least absolute deviation regression (LADR) estimates the conditional median function and is resistant to outliers. QR was introduced by Koenker and Bassett [1] as a generalization of LADR to estimate the conditional quantile function of the response. Consequently, QR gives much more information about the conditional distribution of the response. QR has attracted a vast amount of interest in the literature. It is applied in many different areas, such as economics, finance, survival analysis, and growth charts.

Variable selection (VS) is very important in the process of model building. In many applications, the number of variables is huge. Keeping irrelevant variables in the model is undesirable, because it makes the model difficult to interpret and may negatively affect its predictive ability. Many different penalties have been suggested to achieve VS, for example, Lasso [2], SCAD [3], fused Lasso [4], elastic-net [5], group Lasso [6], adaptive Lasso [7], adaptive elastic-net [8], and MCP [9].

Under the QR framework, Koenker [10] combined the Lasso with the mixed-effect QR model to encourage shrinkage in estimating the random effects. Wang, Li, and Jiang [11] combined LADR with the adaptive Lasso penalty. Li and Zhu [12] proposed L1-norm penalized QR (PQR) by combining QR with the Lasso penalty. Wu and Liu [13] introduced PQR with the SCAD and adaptive Lasso penalties. Slawski [14] proposed the structured elastic-net regularizer for QR.

In the setting p > n, where p represents the number of predictors and n represents the sample size, Belloni and Chernozhukov [15] studied the theory of PQR for the Lasso penalty. They considered QR in high-dimensional sparse models. Wang, Wu, and Li [16] investigated the methodology of PQR in ultrahigh dimension for nonconvex penalties such as SCAD and MCP. Peng and Wang [17] proposed and studied a new iterative coordinate descent algorithm for solving nonconvex PQR in high dimension.

The search for the true model focuses on two issues: deleting irrelevant predictors and merging predictors with IC [18]. Although the above penalties can achieve the first issue, they fail in achieving the second one. The two issues can be achieved through Pairwise Absolute Clustering and Sparsity (PACS) [18]. Moreover, PACS is an oracle method for simultaneous group identification and VS.

Hindawi, Journal of Probability and Statistics, Volume 2019, Article ID 8504174, 7 pages. https://doi.org/10.1155/2019/8504174

The limitations of existing variable selection methods motivate the authors to write this paper. The aim of the current research is to find an effective procedure for simultaneous group identification and VS under the QR framework.

In this paper, we suggest QR-PACS to gain advantages over the existing PQR methods. QR-PACS benefits from the ability of PACS to achieve the two mentioned issues in the discovery of the true model, which is unavailable in Lasso, adaptive Lasso, SCAD, MCP, elastic-net, and structured elastic-net.

The rest of the paper is organized as follows. In Section 2, penalized linear QR is reviewed briefly. QR-PACS is introduced in Section 3. The numerical results of simulations and real data are presented in Sections 4 and 5, respectively. The conclusions are reported in Section 6.

2. Penalized Linear QR

QR is a widespread technique used to describe the distribution of an outcome variable y_i given a set of predictors x_i. Let x_i be a p × 1 vector of predictors for the ith observation and q_τ(x_i) be the inverse cumulative distribution function of y_i given x_i. Then q_τ(x_i) = x_i'β_τ = Σ_{j=1}^p x_{ij} β_{jτ}, where β_τ is a vector of p unknown parameters and τ is the quantile level.

Koenker and Bassett [1] suggested estimating β_τ as follows:

min_{β_τ} Σ_{i=1}^n ρ_τ( y_i - Σ_{j=1}^p x_{ij} β_{jτ} ),    (1)

where ρ_τ(·) is the check loss function defined as

ρ_τ(u) = τ u I_[0,∞)(u) - (1 - τ) u I_(-∞,0)(u).    (2)
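For concreteness, the check loss in (2) simplifies to ρ_τ(u) = u(τ - I(u < 0)), which is easy to code directly. Below is a minimal sketch; the function names are ours, not from the paper:

```python
import numpy as np

def check_loss(u, tau):
    """Check loss of eq. (2): tau*u for u >= 0 and (tau - 1)*u for u < 0."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def qr_objective(beta, X, y, tau):
    """Objective of eq. (1): total check loss of the residuals."""
    return float(np.sum(check_loss(y - X @ beta, tau)))
```

At τ = 0.5 the loss is half the absolute error, so minimizing (1) reproduces LADR up to a constant factor.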

Under the regularization framework, Li and Zhu [12], Wu and Liu [13], Slawski [14], and Wang, Wu, and Li [16], among others, proposed penalized versions of (1) by adding different penalties as follows:

min_{β_τ} Σ_{i=1}^n ρ_τ( y_i - Σ_{j=1}^p x_{ij} β_{jτ} ) + λ Σ_{j=1}^p P(β_{jτ}),    (3)

where λ > 0 is the penalization parameter and P(β_{jτ}) is the penalty function.

For the rest of this paper, the subscript τ is omitted for notational convenience.
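Because the objective in (1) is piecewise linear, it can be minimized exactly as a linear program, which is how Koenker-Bassett estimates are typically computed. The following is a minimal sketch using `scipy.optimize.linprog`; it is our own illustration, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import linprog

def qr_fit_lp(X, y, tau):
    """Solve problem (1) exactly as a linear program.

    Split beta = b_plus - b_minus and the residual = u_plus - u_minus,
    then minimize tau*sum(u_plus) + (1 - tau)*sum(u_minus)
    subject to X(b_plus - b_minus) + u_plus - u_minus = y, all parts >= 0.
    """
    n, p = X.shape
    c = np.concatenate([np.zeros(2 * p), tau * np.ones(n), (1.0 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.x[:p] - res.x[p:2 * p]
```

Splitting the coefficients and residuals into positive and negative parts keeps all LP variables nonnegative; at the optimum the LP objective equals Σ ρ_τ(y_i - x_i'β).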

3. Penalized Linear QR through PACS (QR-PACS)

In this section, we incorporate PACS into the optimization of (1) to propose QR-PACS. Under the QR setup, the predictors x_{ij} are standardized and the response y_i is centered, i = 1, 2, ..., n and j = 1, 2, ..., p. QR-PACS is proposed for simultaneous group identification and VS in QR. It encourages correlated variables to have equal coefficient values. The equality of coefficients is attained by adding a group identification penalty on the pairwise differences and sums of coefficients. The QR-PACS estimates are proposed as minimizers of

Σ_{i=1}^n ρ( y_i - Σ_{j=1}^p x_{ij} β_j ) + λ [ Σ_{j=1}^p ω_j |β_j| + Σ_{1≤j<k≤p} ω_{jk(-)} |β_k - β_j| + Σ_{1≤j<k≤p} ω_{jk(+)} |β_k + β_j| ],    (4)

where the ω's are nonnegative weights.

The PACS penalty in (4) consists of λ Σ_{j=1}^p ω_j |β_j|, which encourages sparseness, together with λ Σ_{1≤j<k≤p} ω_{jk(-)} |β_k - β_j| and λ Σ_{1≤j<k≤p} ω_{jk(+)} |β_k + β_j|, which are employed for group identification and encourage equality of coefficients. The second term of the penalty encourages same-sign coefficients to be set equal, while the third term encourages opposite-sign coefficients to be set equal in magnitude.

Choosing appropriate adaptive weights is very important for PACS. In QR-PACS, we employed adaptive weights that incorporate correlations into the weights, as suggested by Sharma et al. [18], with a small modification, as follows:

ω_j = |β̃_j|^(-1),  ω_{jk(-)} = (1 - r^b_{jk})^(-1) |β̃_k - β̃_j|^(-1),  and  ω_{jk(+)} = (1 + r^b_{jk})^(-1) |β̃_k + β̃_j|^(-1),  for 1 ≤ j < k ≤ p,    (5)

where β̃ is a √n-consistent estimator of β, such as the PACS [18] estimates or other shrinkage QR estimates, and r^b_{jk} is the biweight midcorrelation of the (j, k)th pair of predictors. We propose to employ the biweight midcorrelation [19, 20] instead of the Pearson correlation used in the adaptive weights in [18], to obtain robust correlations and robust weights.

In this paper, ridge quantile estimates were employed as initial estimates for the β's, to obtain weights that perform well in studies with collinear predictors.
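The biweight midcorrelation and the pairwise weights of (5) can be sketched as follows. This uses a standard textbook form of the bicor estimator; the helper names are ours, and note that a pair with β̃_k = β̃_j would receive an infinite weight (i.e., the pair is forced to be equal), so the initial estimates should be distinct:

```python
import numpy as np

def bicor(x, y):
    """Biweight midcorrelation (Wilcox [19]): a robust substitute for the
    Pearson correlation, used for r^b_jk in the weights of eq. (5)."""
    def centred(v):
        v = np.asarray(v, dtype=float)
        med = np.median(v)
        mad = np.median(np.abs(v - med))          # unscaled median absolute deviation
        u = (v - med) / (9.0 * mad)
        w = (1.0 - u ** 2) ** 2 * (np.abs(u) < 1)  # biweight, zero outside |u| < 1
        return (v - med) * w
    a, b = centred(x), centred(y)
    return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)))

def pacs_pair_weights(bj, bk, r):
    """Adaptive weights of eq. (5) for one pair (j, k): bj, bk are initial
    coefficient estimates, r is the robust correlation of predictors j, k."""
    w_j = 1.0 / abs(bj)
    w_minus = 1.0 / ((1.0 - r) * abs(bk - bj))
    w_plus = 1.0 / ((1.0 + r) * abs(bk + bj))
    return w_j, w_minus, w_plus
```

A strongly positively correlated pair makes (1 - r) small, inflating ω_{jk(-)} and pushing β_j and β_k together, which is exactly the grouping behavior described above.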

4. Simulation Study

In this section, five examples were carried out to assess the QR-PACS method by comparing it with existing selection approaches under the QR setting, in terms of both prediction precision and model discovery. A regression model was generated as follows:

y = Xβ + σϵ.    (6)

In all examples, the predictors X and the error term ϵ were standard normal.


Table 1: ME, SA, GA, and SGA results of Example 1 for n = 100 (SA, GA, and SGA in %).

σ = 1
τ     Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adaptive LASSO   QR-elastic-net   QR-PACS
0.10  ME(SE)      0.1144(0.0176)   0.0834(0.0182)   0.0684(0.0165)   0.0673(0.0153)      0.0623(0.0112)   0.0432(0.0093)
      SA          0                62               79               80                  87               79
      GA          0                0                0                0                   0                88
      SGA         0                0                0                0                   0                72
0.50  ME(SE)      0.0832(0.0056)   0.0524(0.0065)   0.0443(0.0056)   0.0431(0.0050)      0.0404(0.0041)   0.0143(0.0020)
      SA          0                64               81               82                  86               80
      GA          0                0                0                0                   0                89
      SGA         0                0                0                0                   0                73
0.75  ME(SE)      0.1140(0.0173)   0.0829(0.0177)   0.0672(0.0150)   0.0668(0.0146)      0.0619(0.0105)   0.0429(0.0082)
      SA          0                63               81               81                  88               80
      GA          0                0                0                0                   0                89
      SGA         0                0                0                0                   0                72

σ = 3
0.10  ME(SE)      0.1471(0.0183)   0.1168(0.0188)   0.1062(0.0162)   0.1056(0.0162)      0.0953(0.0120)   0.0755(0.0106)
      SA          0                59               77               77                  84               77
      GA          0                0                0                0                   0                86
      SGA         0                0                0                0                   0                70
0.50  ME(SE)      0.1153(0.0066)   0.0855(0.0071)   0.0763(0.0060)   0.0756(0.0060)      0.0730(0.0049)   0.0473(0.0034)
      SA          0                61               80               80                  83               78
      GA          0                0                0                0                   0                87
      SGA         0                0                0                0                   0                71
0.75  ME(SE)      0.1466(0.0179)   0.1161(0.0185)   0.1056(0.0161)   0.1044(0.0158)      0.0945(0.0117)   0.0744(0.0099)
      SA          0                61               79               79                  85               78
      GA          0                0                0                0                   0                87
      SGA         0                0                0                0                   0                70

We compared QR-PACS with ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net. The performance of the methods was compared using the model error (ME) criterion for prediction accuracy, defined by (β̂ - β)'V(β̂ - β), where V represents the population covariance matrix of X, and the resulting model complexity for model discovery. The median and standard error (SE) of ME were reported. Also, selection accuracy (SA, % of true models identified), grouping accuracy (GA, % of true groups identified), and the % of both selection and grouping accuracy (SGA) were computed and reported. Note that none of ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net perform grouping. The sample size was 100, and the simulated model was replicated 100 times. Some typical examples are reported as follows.

Example 1. In this example, we assumed the true parameters for the model of study as β = (2, 2, 2, 0, 0, 0, 0, 0)', X ∈ R^8. The first three predictors were highly correlated, with correlation equal to 0.7, and their coefficients were equal in magnitude, while the rest were uncorrelated.

Example 2. In this example, the true coefficients were assumed as β = (0.5, 1, 2, 0, 0, 0, 0, 0)', X ∈ R^8. The first three predictors were highly correlated, with correlation equal to 0.7, and their coefficients were different in magnitude, while the rest were uncorrelated.

Example 3. In this example, the true parameters were β = (1, 1, 1, 0.5, 1, 2, 0, 0, 0, 0)', X ∈ R^10. The first three predictors were highly correlated, with correlation equal to 0.7, and their coefficients were equal in magnitude, while the correlation for the second three predictors was equal to 0.3 and their coefficients were different in magnitudes. The remaining predictors were uncorrelated.

Example 4. In this example, the true parameters were β = (1, 1, 1, 0.5, 1, 2, 0, 0, 0, 0)', X ∈ R^10. The first three predictors were correlated, with correlation equal to 0.3, and their coefficients were equal in magnitude, while the correlation for the second three predictors was 0.7 and their coefficients were different in magnitudes. The remaining predictors were uncorrelated.

Example 5. In this example, the true parameters were assumed as β = (2, 2, 2, 1, 1, 0, 0, 0, 0, 0)', X ∈ R^10. The first three and the second two predictors were highly correlated, with correlation equal to 0.7, and their coefficients were different in magnitude between the two groups, while the rest were uncorrelated.
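To make the setup concrete, the data generation of model (6) under Example 1 and the ME criterion can be sketched as follows. This is our own illustration (the paper publishes no code), and the function names are ours:

```python
import numpy as np

def simulate_example1(n=100, sigma=1.0, rho=0.7, seed=0):
    """Draw (X, y) from model (6) under the design of Example 1:
    beta = (2, 2, 2, 0, 0, 0, 0, 0)', the first three predictors
    equicorrelated at rho = 0.7, the rest independent standard normal."""
    rng = np.random.default_rng(seed)
    p = 8
    cov = np.eye(p)
    cov[:3, :3] = rho               # equicorrelation block for x1, x2, x3
    np.fill_diagonal(cov, 1.0)
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    beta = np.array([2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0])
    y = X @ beta + sigma * rng.standard_normal(n)
    return X, y, beta

def model_error(beta_hat, beta_true, V):
    """ME criterion: (beta_hat - beta)' V (beta_hat - beta)."""
    d = np.asarray(beta_hat, dtype=float) - np.asarray(beta_true, dtype=float)
    return float(d @ V @ d)
```

The other examples only change β and the correlation blocks of the covariance matrix.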

For all the values of τ and σ, Table 1 shows that the QR-PACS method has the lowest ME. Although the QR-elastic-net and QR-adaptive Lasso have the highest SA, it is clear that all the considered methods except QR-PACS do not perform grouping. The QR-PACS method successfully identifies the groups of predictors, as seen in the GA and SGA rows.


Table 2: ME, SA, NG, and SNG results of Example 2 for n = 100 (SA, NG, and SNG in %).

σ = 1
τ     Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adaptive LASSO   QR-elastic-net   QR-PACS
0.10  ME(SE)      0.1170(0.0201)   0.0848(0.0195)   0.0696(0.0177)   0.0665(0.0165)      0.0639(0.0106)   0.0445(0.0101)
      SA          0                60               78               78                  84               78
      NG          100              100              100              100                 100              100
      SNG         0                69               70               70                  78               69
0.50  ME(SE)      0.0857(0.0075)   0.0538(0.0068)   0.0456(0.0062)   0.0439(0.0057)      0.0421(0.0055)   0.0156(0.0035)
      SA          0                63               81               81                  84               79
      NG          100              100              100              100                 100              100
      SNG         0                70               71               71                  80               70
0.75  ME(SE)      0.1155(0.0185)   0.0841(0.0179)   0.0681(0.0161)   0.0679(0.0155)      0.0630(0.0111)   0.0437(0.0090)
      SA          0                61               80               81                  84               79
      NG          100              100              100              100                 100              100
      SNG         0                70               70               71                  79               70

σ = 3
0.10  ME(SE)      0.1275(0.0213)   0.0968(0.0199)   0.0788(0.0185)   0.0745(0.0183)      0.0728(0.0106)   0.0419(0.0117)
      SA          0                57               77               77                  83               77
      NG          100              100              100              100                 100              100
      SNG         0                67               69               69                  76               67
0.50  ME(SE)      0.0876(0.0078)   0.0549(0.0072)   0.0464(0.0064)   0.0447(0.0062)      0.0430(0.0060)   0.0156(0.0040)
      SA          0                63               80               80                  83               79
      NG          100              100              100              100                 100              100
      SNG         0                69               70               70                  79               69
0.75  ME(SE)      0.1164(0.0190)   0.0850(0.0187)   0.0692(0.0170)   0.0688(0.0162)      0.0639(0.0120)   0.0446(0.0098)
      SA          0                60               79               80                  84               79
      NG          100              100              100              100                 100              100
      SNG         0                69               70               70                  78               69

In Table 2, the percentage of no-grouping (NG, % of replications with no groups found) and the percentage of selection and no-grouping (SNG) were reported instead of GA and SGA, respectively. In terms of prediction and selection, the QR-PACS method does not perform well, while the QR-elastic-net, QR-adaptive Lasso, and QR-SCAD perform the best, respectively. All the methods under consideration perform well in terms of not identifying a group. Thus, QR-PACS is not a recommended method when there is high correlation but the significant variables do not form a group.

Table 3 demonstrates that the QR-elastic-net and QR-adaptive Lasso have the best SA, respectively; however, QR-PACS performs better in terms of ME. It is obvious that QR-PACS identifies the important group, with high GA and SGA.

Table 4 shows that the QR-elastic-net, QR-adaptive Lasso, and QR-SCAD have the best SA. It is clear that QR-PACS has the best results among the methods in terms of ME. In terms of GA and SGA, it can be observed that QR-PACS performs well.

From Table 5, it can be noticed that the QR-elastic-net has the best SA. QR-PACS has excellent GA. Also, it is clear that QR-PACS successfully identifies the groups of predictors, as seen in the GA and SGA rows.

5. NCAA Sports Data

In this section, the behavior of QR-PACS, together with ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net, was illustrated in the analysis of the NCAA sports data [21]. We standardized the predictors and centered the response before the data analysis.

In each repetition, the authors randomly split the data into a training and a testing dataset; the percentage of the testing data was 20%, and models were fit on the training set. The NCAA sports data were randomly split 100 times to allow for more stable comparisons. We reported the average and SE of the ratio of test error (RTE) over QR for all methods, along with the effective model size (MZ) after accounting for equality of absolute coefficient estimates.
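The repeated random splitting described above amounts to simple index bookkeeping; a sketch under our own naming (with n = 94 and a 20% test fraction, the test set has 19 rows, matching this setup):

```python
import numpy as np

def train_test_split_indices(n, test_frac=0.2, seed=0):
    """Randomly split the row indices 0..n-1 into (train, test) sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = int(round(test_frac * n))
    return idx[n_test:], idx[:n_test]
```

Repeating this over 100 seeds and averaging the per-split test errors, each divided by the test error of plain QR, gives the reported RTE means and standard errors.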

The NCAA data was taken from a study of the effects of sociodemographic indicators and sports programs on graduation rates.

The data size is n = 94 with p = 19 predictors. The dataset and its description are available from the website http://www4.stat.ncsu.edu/~boos/var.select/ncaa.html. The predictors are: students in top 10 HS (x1), ACT COMPOSITE 25TH (x2), on-campus living (x3), first-time undergraduates (x4), total enrollment/1000 (x5), courses taught by TAs (x6), composite of basketball ranking (x7), in-state tuition/1000 (x8), room and board/1000 (x9), avg. basketball home attendance (x10), professor salary (x11), student/faculty ratio (x12), white (x13), assistant professor salary (x14), population of city (x15), faculty with PhD (x16), acceptance rate (x17), receiving loans (x18), and out of state (x19).

From Table 6, the results indicate that QR-PACS does significantly better than ridge QR, QR-Lasso, QR-SCAD,


Table 3: ME, SA, GA, and SGA results of Example 3 for n = 100 (SA, GA, and SGA in %).

σ = 1
τ     Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adaptive LASSO   QR-elastic-net   QR-PACS
0.10  ME(SE)      0.1383(0.0195)   0.1176(0.0196)   0.0933(0.0164)   0.0908(0.0151)      0.0902(0.0119)   0.0558(0.0098)
      SA          0                58               79               79                  82               78
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70
0.50  ME(SE)      0.1139(0.0071)   0.0843(0.0062)   0.0662(0.0059)   0.0643(0.0054)      0.0638(0.0051)   0.0367(0.0029)
      SA          0                62               80               81                  84               79
      GA          0                0                0                0                   0                86
      SGA         0                0                0                0                   0                71
0.75  ME(SE)      0.1289(0.0180)   0.1083(0.0171)   0.0861(0.0161)   0.0843(0.0147)      0.0834(0.0108)   0.0479(0.0081)
      SA          0                61               80               80                  84               79
      GA          0                0                0                0                   0                86
      SGA         0                0                0                0                   0                71

σ = 3
0.10  ME(SE)      0.1490(0.0199)   0.1279(0.0198)   0.1030(0.0169)   0.1016(0.0155)      0.0998(0.0124)   0.0651(0.0104)
      SA          0                57               78               78                  80               77
      GA          0                0                0                0                   0                83
      SGA         0                0                0                0                   0                69
0.50  ME(SE)      0.1244(0.0076)   0.0945(0.0065)   0.0867(0.0062)   0.0742(0.0056)      0.0735(0.0054)   0.0464(0.0032)
      SA          0                61               79               80                  83               78
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70
0.75  ME(SE)      0.1399(0.0185)   0.1189(0.0171)   0.0958(0.0166)   0.0946(0.0160)      0.0930(0.0111)   0.0567(0.0086)
      SA          0                60               79               80                  83               78
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70

Table 4: ME, SA, GA, and SGA results of Example 4 for n = 100 (SA, GA, and SGA in %).

σ = 1
τ     Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adaptive LASSO   QR-elastic-net   QR-PACS
0.10  ME(SE)      0.1380(0.0193)   0.1149(0.0195)   0.0939(0.0165)   0.0937(0.0153)      0.0832(0.0130)   0.0560(0.0105)
      SA          0                57               79               79                  82               78
      GA          0                0                0                0                   0                86
      SGA         0                0                0                0                   0                71
0.50  ME(SE)      0.1132(0.0069)   0.0816(0.0062)   0.0672(0.0060)   0.0672(0.0056)      0.0568(0.0062)   0.0369(0.0036)
      SA          0                62               81               81                  85               80
      GA          0                0                0                0                   0                86
      SGA         0                0                0                0                   0                72
0.75  ME(SE)      0.1280(0.0178)   0.1056(0.0171)   0.0873(0.0164)   0.0872(0.0149)      0.0764(0.0119)   0.0481(0.0088)
      SA          0                61               80               80                  84               79
      GA          0                0                0                0                   0                86
      SGA         0                0                0                0                   0                71

σ = 3
0.10  ME(SE)      0.1495(0.0200)   0.1270(0.0197)   0.1016(0.0169)   0.1008(0.0154)      0.0970(0.0120)   0.0639(0.0100)
      SA          0                57               78               78                  81               77
      GA          0                0                0                0                   0                84
      SGA         0                0                0                0                   0                70
0.50  ME(SE)      0.1239(0.0072)   0.0928(0.0064)   0.0859(0.0060)   0.0738(0.0054)      0.0730(0.0051)   0.0458(0.0029)
      SA          0                62               80               80                  83               79
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70
0.75  ME(SE)      0.1391(0.0181)   0.1182(0.0167)   0.0953(0.0163)   0.0941(0.0158)      0.0921(0.0106)   0.0559(0.0081)
      SA          0                60               79               80                  83               78
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70


Table 5: ME, SA, GA, and SGA results of Example 5 for n = 100 (SA, GA, and SGA in %).

σ = 1
τ     Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adaptive LASSO   QR-elastic-net   QR-PACS
0.10  ME(SE)      0.1330(0.0161)   0.1106(0.0168)   0.0914(0.0154)   0.0910(0.0142)      0.0818(0.0122)   0.0548(0.0981)
      SA          0                58               79               79                  84               79
      GA          0                0                0                0                   0                87
      SGA         0                0                0                0                   0                72
0.50  ME(SE)      0.1119(0.0061)   0.0807(0.0053)   0.0666(0.0054)   0.0663(0.0051)      0.0561(0.0054)   0.0358(0.0031)
      SA          0                63               82               82                  86               81
      GA          0                0                0                0                   0                87
      SGA         0                0                0                0                   0                73
0.75  ME(SE)      0.1274(0.0169)   0.1048(0.0162)   0.0867(0.0157)   0.0865(0.0143)      0.0756(0.0109)   0.0476(0.0079)
      SA          0                62               81               81                  84               79
      GA          0                0                0                0                   0                87
      SGA         0                0                0                0                   0                72

σ = 3
0.10  ME(SE)      0.1486(0.0188)   0.1260(0.0189)   0.1001(0.0156)   0.1000(0.0148)      0.0960(0.0110)   0.0625(0.0088)
      SA          0                57               77               77                  81               78
      GA          0                0                0                0                   0                84
      SGA         0                0                0                0                   0                70
0.50  ME(SE)      0.1227(0.0069)   0.0922(0.0059)   0.0850(0.0053)   0.0731(0.0047)      0.0724(0.0045)   0.0445(0.0020)
      SA          0                63               80               80                  83               80
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70
0.75  ME(SE)      0.1385(0.0177)   0.1172(0.0154)   0.0940(0.0154)   0.0935(0.0151)      0.0926(0.0100)   0.0552(0.0075)
      SA          0                60               80               80                  83               79
      GA          0                0                0                0                   0                85
      SGA         0                0                0                0                   0                70

Table 6: The test error and the effective model size values for the methods under consideration, based on the NCAA sports data.

Method              τ = 0.25              τ = 0.50              τ = 0.75
                    RTE           MZ      RTE           MZ      RTE           MZ
Ridge QR            1.135(0.035)  19      1.131(0.037)  19      1.134(0.034)  19
QR-LASSO            1.126(0.033)  6       1.123(0.032)  6       1.124(0.033)  6
QR-SCAD             1.115(0.036)  6       1.111(0.035)  6       1.113(0.035)  6
QR-adaptive LASSO   1.110(0.029)  5       1.101(0.030)  5       1.108(0.029)  5
QR-elastic-net      1.029(0.029)  5       1.022(0.026)  5       1.025(0.030)  5
QR-PACS             0.913(0.014)  5       0.896(0.012)  5       0.912(0.015)  5

QR-adaptive Lasso, and QR-elastic-net in test error. In fact, ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net perform worse than plain QR in test error. The effective model size is 5 for QR-PACS, although it includes all variables in the models.

6. Conclusions

In this paper, QR-PACS for group identification and VS under QR settings is developed, which combines the strength of QR and the ability of PACS for consistent group identification and VS. QR-PACS can achieve the two goals simultaneously, and it extends PACS from mean regression settings to QR settings. It is demonstrated computationally that it can be simply carried out with an effective computational algorithm. The paper shows that QR-PACS can yield promising predictive precision as well as identify related groups in both simulation and real data. A future direction or extension of the current paper is QR-PACS under a Bayesian framework. Also, robust QR-PACS is another possible extension of the current paper.

Data Availability

The data studied in our paper is the NCAA sports data from Mangold et al. [21]. It is public and available from the website http://www4.stat.ncsu.edu/~boos/var.select/ncaa.html [21].


Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] R. Koenker and G. Bassett Jr., "Regression quantiles," Econometrica, vol. 46, no. 1, pp. 33–50, 1978.

[2] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 58, no. 1, pp. 267–288, 1996.

[3] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.

[4] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 67, no. 1, pp. 91–108, 2005.

[5] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.

[6] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.

[7] H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418–1429, 2006.

[8] H. Zou and H. H. Zhang, "On the adaptive elastic-net with a diverging number of parameters," The Annals of Statistics, vol. 37, no. 4, pp. 1733–1751, 2009.

[9] C.-H. Zhang, "Nearly unbiased variable selection under minimax concave penalty," The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.

[10] R. Koenker, "Quantile regression for longitudinal data," Journal of Multivariate Analysis, vol. 91, no. 1, pp. 74–89, 2004.

[11] H. Wang, G. Li, and G. Jiang, "Robust regression shrinkage and consistent variable selection through the LAD-Lasso," Journal of Business & Economic Statistics, vol. 25, no. 3, pp. 347–355, 2007.

[12] Y. Li and J. Zhu, "L1-norm quantile regression," Journal of Computational and Graphical Statistics, vol. 17, no. 1, pp. 163–185, 2008.

[13] Y. Wu and Y. Liu, "Variable selection in quantile regression," Statistica Sinica, vol. 19, no. 2, pp. 801–817, 2009.

[14] M. Slawski, "The structured elastic net for quantile regression and support vector classification," Statistics and Computing, vol. 22, no. 1, pp. 153–168, 2012.

[15] A. Belloni and V. Chernozhukov, "L1-penalized quantile regression in high-dimensional sparse models," 2011.

[16] L. Wang, Y. Wu, and R. Li, "Quantile regression for analyzing heterogeneity in ultra-high dimension," Journal of the American Statistical Association, vol. 107, no. 497, pp. 214–222, 2012.

[17] B. Peng and L. Wang, "An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression," Journal of Computational and Graphical Statistics, vol. 24, no. 3, pp. 676–694, 2015.

[18] D. B. Sharma, H. D. Bondell, and H. H. Zhang, "Consistent group identification and variable selection in regression with correlated predictors," Journal of Computational and Graphical Statistics, vol. 22, no. 2, pp. 319–340, 2013.

[19] R. Wilcox, Introduction to Robust Estimation and Hypothesis Testing, Academic Press, San Diego, Calif, USA, 2nd edition, 1997.

[20] A. Alkenani and K. Yu, "A comparative study for robust canonical correlation methods," Journal of Statistical Computation and Simulation, vol. 83, no. 4, pp. 690–720, 2013.

[21] W. D. Mangold, L. Bean, and D. Adams, "The impact of intercollegiate athletics on graduation rates among major NCAA Division I universities: implications for college persistence theory and practice," The Journal of Higher Education, vol. 74, no. 5, pp. 540–563, 2003.



2 Journal of Probability and Statistics

Sparsity (PACS) [18]. Moreover, PACS is an oracle method for simultaneous group identification and VS.

The limitations of existing variable selection methods motivate the authors to write this paper. The aim of the current research is to find an effective procedure for simultaneous group identification and VS under the QR framework.

In this paper, we suggest QR-PACS to gain advantages over the existing PQR methods. QR-PACS benefits from the ability of PACS to address both issues in the discovery of the true model, which is unavailable in Lasso, adaptive Lasso, SCAD, MCP, elastic-net, and the structured elastic-net.

The rest of the paper is organized as follows. In Section 2, penalized linear QR is reviewed briefly. QR-PACS is introduced in Section 3. The numerical results of simulations and real data are presented in Sections 4 and 5, respectively. The conclusions are reported in Section 6.

2. Penalized Linear QR

QR is a widespread technique used to describe the distribution of an outcome variable $y_i$ given a set of predictors $\mathbf{x}_i$. Let $\mathbf{x}_i$ be a $p \times 1$ vector of predictors for the $i$th observation and $q_\tau(\mathbf{x}_i)$ be the inverse cumulative distribution function of $y_i$ given $\mathbf{x}_i$. Then $q_\tau(\mathbf{x}_i) = \mathbf{x}_i^T \beta_\tau = \sum_{j=1}^{p} x_{ij}\beta_{j\tau}$, where $\beta_\tau$ is a vector of $p$ unknown parameters and $\tau$ is the quantile level.

Koenker and Bassett [1] suggested estimating $\beta_\tau$ as follows:

$$\min_{\beta_\tau} \sum_{i=1}^{n} \rho_\tau \Big( y_i - \sum_{j=1}^{p} x_{ij}\beta_{j\tau} \Big), \tag{1}$$

where $\rho_\tau(\cdot)$ is the check loss function defined as

$$\rho_\tau(u) = \tau u I_{[0,\infty)}(u) - (1-\tau) u I_{(-\infty,0)}(u). \tag{2}$$
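As a quick numerical illustration (a minimal sketch, not the authors' code), the check loss in (2) can be implemented directly; minimizing the empirical check risk over a constant recovers the sample $\tau$-quantile, since a minimizer always lies at a sample point:

```python
import numpy as np

def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - I(u < 0)),
    equivalent to tau*u for u >= 0 and -(1-tau)*u for u < 0."""
    return u * (tau - (u < 0))

def quantile_by_check_loss(y, tau):
    """Return the sample point minimizing sum_i rho_tau(y_i - c)."""
    candidates = np.sort(y)
    risks = [check_loss(y - c, tau).sum() for c in candidates]
    return candidates[int(np.argmin(risks))]

y = np.arange(1.0, 101.0)            # 1, 2, ..., 100
c = quantile_by_check_loss(y, 0.25)  # near the 25th percentile
```

For $y = 1, \ldots, 100$ and $\tau = 0.25$, the empirical risk is flat between 25 and 26, so any point in that interval minimizes it.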

Under the regularization framework, Li and Zhu [12], Wu and Liu [13], Slawski [14], and Wang, Wu, and Li [16], among others, proposed penalized versions of (1) by adding different penalties, as follows:

$$\min_{\beta_\tau} \sum_{i=1}^{n} \rho_\tau \Big( y_i - \sum_{j=1}^{p} x_{ij}\beta_{j\tau} \Big) + \lambda \sum_{j=1}^{p} P(\beta_{j\tau}), \tag{3}$$

where $\lambda > 0$ is the penalization parameter and $P(\beta_{j\tau})$ is the penalty function.

For the rest of this paper, the subscript $\tau$ is omitted for notational convenience.
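The generic penalized objective (3) can be sketched as a function that takes the penalty as a plug-in; here the absolute value (Lasso) is used for illustration, and the data are simulated (a hedged sketch, not the authors' implementation):

```python
import numpy as np

def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def penalized_qr_objective(beta, y, X, tau, lam, penalty=np.abs):
    """Objective (3): check-loss fit term plus lambda * sum_j P(beta_j).

    `penalty` defaults to the absolute value (Lasso); SCAD, MCP, etc.
    could be substituted in its place.
    """
    fit = check_loss(y - X @ beta, tau).sum()
    return fit + lam * penalty(beta).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.0, -1.0]) + rng.normal(size=50)
obj0 = penalized_qr_objective(np.zeros(3), y, X, tau=0.5, lam=1.0)
```

Evaluating the objective with $\lambda = 0$ versus $\lambda > 0$ shows the penalty contributing exactly $\lambda \sum_j |\beta_j|$.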

3. Penalized Linear QR through PACS (QR-PACS)

In this section, we incorporate PACS into the optimization of (1) to propose QR-PACS. Under the QR setup, the predictors $x_{ij}$ are standardized and the response $y_i$ is centered, $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, p$. QR-PACS is proposed for simultaneous group identification and VS in QR. QR-PACS encourages correlated variables to have equal coefficient values. The equality of coefficients is attained by adding a group-identification penalty to the pairwise differences and sums of coefficients. The QR-PACS estimates are proposed as minimizers of

$$\sum_{i=1}^{n} \rho\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big) + \lambda\Big[\sum_{j=1}^{p} \omega_j |\beta_j| + \sum_{1\le j<k\le p} \omega_{jk(-)} |\beta_k - \beta_j| + \sum_{1\le j<k\le p} \omega_{jk(+)} |\beta_k + \beta_j|\Big], \tag{4}$$

where the $\omega$'s are nonnegative weights. The PACS penalty in (4) consists of $\lambda\sum_{j=1}^{p}\omega_j|\beta_j|$, which encourages sparsity, together with $\lambda\sum_{1\le j<k\le p}\omega_{jk(-)}|\beta_k-\beta_j|$ and $\lambda\sum_{1\le j<k\le p}\omega_{jk(+)}|\beta_k+\beta_j|$, which are employed for group identification and encourage equality of coefficients. The second term of the penalty encourages same-sign coefficients to be set equal, while the third term encourages opposite-sign coefficients to be set equal in magnitude.
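The PACS penalty in (4) can be sketched numerically (illustrative code with unit weights; the weight names are this sketch's, not the paper's). With a common coefficient value across a group, the pairwise-difference term vanishes, which is exactly how grouping is encouraged:

```python
import numpy as np
from itertools import combinations

def pacs_penalty(beta, w, w_minus, w_plus, lam):
    """PACS penalty from (4): sparsity term plus pairwise terms.

    w       : length-p nonnegative weights on |beta_j|
    w_minus : dict {(j, k): weight} on |beta_k - beta_j|
    w_plus  : dict {(j, k): weight} on |beta_k + beta_j|
    """
    p = len(beta)
    total = np.sum(w * np.abs(beta))
    for j, k in combinations(range(p), 2):
        total += w_minus[(j, k)] * abs(beta[k] - beta[j])
        total += w_plus[(j, k)] * abs(beta[k] + beta[j])
    return lam * total

p = 3
unit = {(j, k): 1.0 for j in range(p) for k in range(j + 1, p)}
# Equal coefficients: all pairwise-difference terms are zero.
grouped = pacs_penalty(np.array([2.0, 2.0, 2.0]), np.ones(p), unit, unit, lam=1.0)
# Spread-out coefficients of the same total magnitude pay extra.
spread = pacs_penalty(np.array([1.0, 2.0, 3.0]), np.ones(p), unit, unit, lam=1.0)
```

Here `grouped` is 6 (sparsity) + 0 (differences) + 12 (sums) = 18, while `spread` additionally pays 4 in the difference term.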

Choosing appropriate adaptive weights is very important for PACS. In QR-PACS, we employed adaptive weights that incorporate correlations, as suggested by Sharma et al. [18], with a small modification, as follows:

$$\omega_j = |\tilde{\beta}_j|^{-1}, \qquad \omega_{jk(-)} = (1 - r_{b,jk})^{-1}\,|\tilde{\beta}_k - \tilde{\beta}_j|^{-1}, \qquad \omega_{jk(+)} = (1 + r_{b,jk})^{-1}\,|\tilde{\beta}_k + \tilde{\beta}_j|^{-1}, \quad 1 \le j < k \le p, \tag{5}$$

where $\tilde{\beta}$ is a $\sqrt{n}$-consistent estimator of $\beta$, such as the PACS [18] estimates or other shrinkage QR estimates, and $r_{b,jk}$ is the biweight midcorrelation of the $(j,k)$th pair of predictors. We propose to employ the biweight midcorrelation [19, 20] instead of the Pearson correlation used in the adaptive weights of [18], in order to obtain robust correlations and robust weights.

In this paper, ridge quantile estimates were employed as initial estimates for the $\beta$'s, to obtain weights that perform well in studies with collinear predictors.
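The biweight midcorrelation entering (5) can be sketched via its standard construction from the robust-statistics literature (a sketch, not the authors' code): deviations from the median are downweighted by Tukey's biweight, so observations far from the median get zero weight.

```python
import numpy as np

def bicor(x, y, c=9.0):
    """Biweight midcorrelation of two samples.

    Observations with |u| >= 1 (roughly, beyond c MADs from the
    median) receive zero weight, giving resistance to outliers.
    """
    def normalized(v):
        a = v - np.median(v)
        mad = np.median(np.abs(a))           # median absolute deviation
        u = a / (c * mad)
        w = (1.0 - u**2) ** 2 * (np.abs(u) < 1.0)
        t = a * w
        return t / np.sqrt(np.sum(t**2))
    return float(np.sum(normalized(x) * normalized(y)))

x = np.arange(20.0)
r = bicor(x, 3.0 * x + 1.0)   # exact positive linear relation
```

For an exact increasing linear relation the two normalized vectors coincide, so `r` is 1 up to floating-point error.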

4. Simulation Study

In this section, five examples were carried out to assess the QR-PACS method by comparing it with existing selection approaches under the QR setting, in terms of both prediction precision and model discovery. A regression model was generated as follows:

$$y = X\beta + \sigma\epsilon. \tag{6}$$

In all examples, the predictors $X$ and the error term $\epsilon$ were standard normal.
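The data-generating step (6) with an equicorrelated block of predictors, as in Example 1 below, can be sketched as follows (illustrative only; it assumes the correlated predictors are drawn jointly multivariate normal, consistent with the standard-normal marginals the paper states):

```python
import numpy as np

def simulate(beta, block, rho, sigma, n, rng):
    """Draw y = X beta + sigma * eps with one equicorrelated block.

    block : indices of predictors sharing pairwise correlation rho;
            all other predictors are independent standard normal.
    """
    p = len(beta)
    cov = np.eye(p)
    for j in block:
        for k in block:
            if j != k:
                cov[j, k] = rho
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y = X @ beta + sigma * rng.normal(size=n)
    return X, y

rng = np.random.default_rng(1)
beta = np.array([2.0, 2.0, 2.0, 0, 0, 0, 0, 0])   # Example 1 coefficients
X, y = simulate(beta, block=[0, 1, 2], rho=0.7, sigma=1.0, n=20000, rng=rng)
r01 = np.corrcoef(X[:, 0], X[:, 1])[0, 1]          # should be close to 0.7
```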


Table 1: ME, SA, GA, and SGA results of Example 1 for n=100 (SA, GA, and SGA in %). Columns: Ridge QR, QR-LASSO, QR-SCAD, QR-adaptive LASSO, QR-elastic-net, QR-PACS.

σ=1, τ=0.10
  ME(SE): 0.1144(0.0176), 0.0834(0.0182), 0.0684(0.0165), 0.0673(0.0153), 0.0623(0.0112), 0.0432(0.0093)
  SA:  0, 62, 79, 80, 87, 79
  GA:  0, 0, 0, 0, 0, 88
  SGA: 0, 0, 0, 0, 0, 72
σ=1, τ=0.50
  ME(SE): 0.0832(0.0056), 0.0524(0.0065), 0.0443(0.0056), 0.0431(0.0050), 0.0404(0.0041), 0.0143(0.0020)
  SA:  0, 64, 81, 82, 86, 80
  GA:  0, 0, 0, 0, 0, 89
  SGA: 0, 0, 0, 0, 0, 73
σ=1, τ=0.75
  ME(SE): 0.1140(0.0173), 0.0829(0.0177), 0.0672(0.0150), 0.0668(0.0146), 0.0619(0.0105), 0.0429(0.0082)
  SA:  0, 63, 81, 81, 88, 80
  GA:  0, 0, 0, 0, 0, 89
  SGA: 0, 0, 0, 0, 0, 72
σ=3, τ=0.10
  ME(SE): 0.1471(0.0183), 0.1168(0.0188), 0.1062(0.0162), 0.1056(0.0162), 0.0953(0.0120), 0.0755(0.0106)
  SA:  0, 59, 77, 77, 84, 77
  GA:  0, 0, 0, 0, 0, 86
  SGA: 0, 0, 0, 0, 0, 70
σ=3, τ=0.50
  ME(SE): 0.1153(0.0066), 0.0855(0.0071), 0.0763(0.0060), 0.0756(0.0060), 0.0730(0.0049), 0.0473(0.0034)
  SA:  0, 61, 80, 80, 83, 78
  GA:  0, 0, 0, 0, 0, 87
  SGA: 0, 0, 0, 0, 0, 71
σ=3, τ=0.75
  ME(SE): 0.1466(0.0179), 0.1161(0.0185), 0.1056(0.0161), 0.1044(0.0158), 0.0945(0.0117), 0.0744(0.0099)
  SA:  0, 61, 79, 79, 85, 78
  GA:  0, 0, 0, 0, 0, 87
  SGA: 0, 0, 0, 0, 0, 70

We compared QR-PACS with ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net. The performance of the methods was compared using the model error (ME) criterion for prediction accuracy, defined by $(\hat{\beta} - \beta)' V (\hat{\beta} - \beta)$, where $V$ represents the population covariance matrix of $X$, together with the resulting model complexity for model discovery. The median and standard error (SE) of ME were reported. Also, the selection accuracy (SA, % of true models identified), grouping accuracy (GA, % of true groups identified), and the % of both selection and grouping accuracy (SGA) were computed and reported. Note that none of ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net performs grouping. The sample size was 100, and the simulated model was replicated 100 times. Some typical examples are reported as follows.
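The ME criterion above is a direct quadratic form and can be computed as follows (a minimal sketch; the coefficient values here are illustrative):

```python
import numpy as np

def model_error(beta_hat, beta, V):
    """ME = (beta_hat - beta)' V (beta_hat - beta),
    with V the population covariance matrix of X."""
    d = beta_hat - beta
    return float(d @ V @ d)

beta = np.array([2.0, 2.0, 2.0, 0.0])
beta_hat = np.array([1.9, 2.1, 2.0, 0.1])
me = model_error(beta_hat, beta, np.eye(4))   # identity V: squared L2 distance
```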

Example 1. In this example, we assumed the true parameters for the model of study as $\beta = (2, 2, 2, 0, 0, 0, 0, 0)^T$, $X \in \mathbb{R}^8$. The first three predictors were highly correlated, with correlation equal to 0.7, and their coefficients were equal in magnitude, while the rest were uncorrelated.

Example 2. In this example, the true coefficients were assumed as $\beta = (0.5, 1, 2, 0, 0, 0, 0, 0)^T$, $X \in \mathbb{R}^8$. The first three predictors were highly correlated, with correlation equal to 0.7, and their coefficients were different in magnitude, while the rest were uncorrelated.

Example 3. In this example, the true parameters were $\beta = (1, 1, 1, 0.5, 1, 2, 0, 0, 0, 0)^T$, $X \in \mathbb{R}^{10}$. The first three predictors were highly correlated, with correlation equal to 0.7, and their coefficients were equal in magnitude, while the correlation for the second three predictors was equal to 0.3 and their coefficients were different in magnitude. The remaining predictors were uncorrelated.

Example 4. In this example, the true parameters were $\beta = (1, 1, 1, 0.5, 1, 2, 0, 0, 0, 0)^T$, $X \in \mathbb{R}^{10}$. The first three predictors were correlated, with correlation equal to 0.3, and their coefficients were equal in magnitude, while the correlation for the second three predictors was 0.7 and their coefficients were different in magnitude. The remaining predictors were uncorrelated.

Example 5. In this example, the true parameters were assumed as $\beta = (2, 2, 2, 1, 1, 0, 0, 0, 0, 0)^T$, $X \in \mathbb{R}^{10}$. The first three and the second two predictors were highly correlated, with correlation equal to 0.7, and their coefficients were different in magnitude, while the rest were uncorrelated.

For all the values of $\tau$ and $\sigma$, Table 1 shows that the QR-PACS method has the lowest ME. Although QR-elastic-net and QR-adaptive Lasso have the highest SA, it is clear that none of the considered methods performs grouping except QR-PACS. The QR-PACS method successfully identifies the groups of predictors, as seen in the GA and SGA rows.


Table 2: ME, SA, NG, and SNG results of Example 2 for n=100 (SA, NG, and SNG in %). Columns: Ridge QR, QR-LASSO, QR-SCAD, QR-adaptive LASSO, QR-elastic-net, QR-PACS.

σ=1, τ=0.10
  ME(SE): 0.1170(0.0201), 0.0848(0.0195), 0.0696(0.0177), 0.0665(0.0165), 0.0639(0.0106), 0.0445(0.0101)
  SA:  0, 60, 78, 78, 84, 78
  NG:  100, 100, 100, 100, 100, 100
  SNG: 0, 69, 70, 70, 78, 69
σ=1, τ=0.50
  ME(SE): 0.0857(0.0075), 0.0538(0.0068), 0.0456(0.0062), 0.0439(0.0057), 0.0421(0.0055), 0.0156(0.0035)
  SA:  0, 63, 81, 81, 84, 79
  NG:  100, 100, 100, 100, 100, 100
  SNG: 0, 70, 71, 71, 80, 70
σ=1, τ=0.75
  ME(SE): 0.1155(0.0185), 0.0841(0.0179), 0.0681(0.0161), 0.0679(0.0155), 0.0630(0.0111), 0.0437(0.0090)
  SA:  0, 61, 80, 81, 84, 79
  NG:  100, 100, 100, 100, 100, 100
  SNG: 0, 70, 70, 71, 79, 70
σ=3, τ=0.10
  ME(SE): 0.1275(0.0213), 0.0968(0.0199), 0.0788(0.0185), 0.0745(0.0183), 0.0728(0.0106), 0.0419(0.0117)
  SA:  0, 57, 77, 77, 83, 77
  NG:  100, 100, 100, 100, 100, 100
  SNG: 0, 67, 69, 69, 76, 67
σ=3, τ=0.50
  ME(SE): 0.0876(0.0078), 0.0549(0.0072), 0.0464(0.0064), 0.0447(0.0062), 0.0430(0.0060), 0.0156(0.0040)
  SA:  0, 63, 80, 80, 83, 79
  NG:  100, 100, 100, 100, 100, 100
  SNG: 0, 69, 70, 70, 79, 69
σ=3, τ=0.75
  ME(SE): 0.1164(0.0190), 0.0850(0.0187), 0.0692(0.0170), 0.0688(0.0162), 0.0639(0.0120), 0.0446(0.0098)
  SA:  0, 60, 79, 80, 84, 79
  NG:  100, 100, 100, 100, 100, 100
  SNG: 0, 69, 70, 70, 78, 69

In Table 2, the percentage of no grouping (NG, % of cases in which no groups were found) and the percentage of selection and no grouping (SNG) were reported instead of GA and SGA, respectively. In terms of prediction and selection, the QR-PACS method does not perform well, while QR-elastic-net, QR-adaptive Lasso, and QR-SCAD perform the best, respectively. All the methods under consideration perform well in terms of not identifying a group. Thus, QR-PACS is not a recommended method when there is high correlation but the significant variables do not form a group.

Table 3 demonstrates that QR-elastic-net and QR-adaptive Lasso have the best SA, respectively; however, QR-PACS performs better in terms of ME. It is obvious that QR-PACS identifies the important group, with high GA and SGA.

Table 4 shows that QR-elastic-net, QR-adaptive Lasso, and QR-SCAD have the best SA. It is clear that QR-PACS has the best results among the methods in terms of ME. In terms of GA and SGA, it can be observed that QR-PACS performs well.

From Table 5, it can be noticed that QR-elastic-net has the best SA. QR-PACS has excellent GA. Also, it is clear that QR-PACS successfully identifies the groups of predictors, as seen in the GA and SGA.

5. NCAA Sports Data

In this section, the behavior of QR-PACS together with ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net was illustrated in the analysis of the NCAA sports data [21]. We standardized the predictors and centered the response before the data analysis.

In each repetition, the authors randomly split the data into a training and a testing dataset; the percentage of the testing data was 20%, and the models were fit on the training set. The NCAA sports data were randomly split 100 times to allow for more stable comparisons. We reported the average and SE of the ratio of test error (RTE) over QR for all methods, and the effective model size (MZ) after accounting for equality of absolute coefficient estimates.
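The repeated-split evaluation protocol can be sketched as below. The estimator interface (`fit_a`, `fit_b`) and the constant-quantile estimator are this sketch's assumptions, not the paper's code; test error is taken as the average check loss on the held-out 20%:

```python
import numpy as np

def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def ratio_test_error(y, X, fit_a, fit_b, tau, n_splits=100,
                     test_frac=0.2, seed=0):
    """Average over random 80/20 splits of
    (test error of fit_a) / (test error of fit_b).

    Each `fit_*` takes (y_train, X_train, tau) and returns a
    prediction function of the test design matrix.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    ratios = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        te, tr = idx[:n_test], idx[n_test:]
        pred_a = fit_a(y[tr], X[tr], tau)
        pred_b = fit_b(y[tr], X[tr], tau)
        err_a = check_loss(y[te] - pred_a(X[te]), tau).mean()
        err_b = check_loss(y[te] - pred_b(X[te]), tau).mean()
        ratios.append(err_a / err_b)
    return float(np.mean(ratios))

# Trivial estimator: predict the training tau-quantile everywhere.
def const_fit(y_tr, X_tr, tau):
    q = np.quantile(y_tr, tau)
    return lambda Xt: np.full(len(Xt), q)

rng = np.random.default_rng(2)
X = rng.normal(size=(94, 3))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=94)
rte = ratio_test_error(y, X, const_fit, const_fit, tau=0.5, n_splits=10)
```

Comparing an estimator against itself gives a ratio of exactly 1, which is a useful sanity check on the split machinery.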

The NCAA data was taken from a study of the effects of sociodemographic indicators and sports programs on graduation rates.

The data size is n=94 with p=19 predictors. The dataset and its description are available from the website (http://www4.stat.ncsu.edu/~boos/var.select/ncaa.html). The predictors are: students in the top 10% of HS class (x1), ACT composite 25th percentile (x2), living on campus (x3), first-time undergraduates (x4), total enrollment/1000 (x5), courses taught by TAs (x6), composite of basketball ranking (x7), in-state tuition/1000 (x8), room and board/1000 (x9), average basketball home attendance (x10), professor salary (x11), student/faculty ratio (x12), white (x13), assistant professor salary (x14), population of city (x15), faculty with PhD (x16), acceptance rate (x17), receiving loans (x18), and out of state (x19).

From Table 6, the results indicate that QR-PACS does significantly better than ridge QR, QR-Lasso, QR-SCAD,


Table 3: ME, SA, GA, and SGA results of Example 3 for n=100 (SA, GA, and SGA in %). Columns: Ridge QR, QR-LASSO, QR-SCAD, QR-adaptive LASSO, QR-elastic-net, QR-PACS.

σ=1, τ=0.10
  ME(SE): 0.1383(0.0195), 0.1176(0.0196), 0.0933(0.0164), 0.0908(0.0151), 0.0902(0.0119), 0.0558(0.0098)
  SA:  0, 58, 79, 79, 82, 78
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70
σ=1, τ=0.50
  ME(SE): 0.1139(0.0071), 0.0843(0.0062), 0.0662(0.0059), 0.0643(0.0054), 0.0638(0.0051), 0.0367(0.0029)
  SA:  0, 62, 80, 81, 84, 79
  GA:  0, 0, 0, 0, 0, 86
  SGA: 0, 0, 0, 0, 0, 71
σ=1, τ=0.75
  ME(SE): 0.1289(0.0180), 0.1083(0.0171), 0.0861(0.0161), 0.0843(0.0147), 0.0834(0.0108), 0.0479(0.0081)
  SA:  0, 61, 80, 80, 84, 79
  GA:  0, 0, 0, 0, 0, 86
  SGA: 0, 0, 0, 0, 0, 71
σ=3, τ=0.10
  ME(SE): 0.1490(0.0199), 0.1279(0.0198), 0.1030(0.0169), 0.1016(0.0155), 0.0998(0.0124), 0.0651(0.0104)
  SA:  0, 57, 78, 78, 80, 77
  GA:  0, 0, 0, 0, 0, 83
  SGA: 0, 0, 0, 0, 0, 69
σ=3, τ=0.50
  ME(SE): 0.1244(0.0076), 0.0945(0.0065), 0.0867(0.0062), 0.0742(0.0056), 0.0735(0.0054), 0.0464(0.0032)
  SA:  0, 61, 79, 80, 83, 78
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70
σ=3, τ=0.75
  ME(SE): 0.1399(0.0185), 0.1189(0.0171), 0.0958(0.0166), 0.0946(0.0160), 0.0930(0.0111), 0.0567(0.0086)
  SA:  0, 60, 79, 80, 83, 78
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70

Table 4: ME, SA, GA, and SGA results of Example 4 for n=100 (SA, GA, and SGA in %). Columns: Ridge QR, QR-LASSO, QR-SCAD, QR-adaptive LASSO, QR-elastic-net, QR-PACS.

σ=1, τ=0.10
  ME(SE): 0.1380(0.0193), 0.1149(0.0195), 0.0939(0.0165), 0.0937(0.0153), 0.0832(0.0130), 0.0560(0.0105)
  SA:  0, 57, 79, 79, 82, 78
  GA:  0, 0, 0, 0, 0, 86
  SGA: 0, 0, 0, 0, 0, 71
σ=1, τ=0.50
  ME(SE): 0.1132(0.0069), 0.0816(0.0062), 0.0672(0.0060), 0.0672(0.0056), 0.0568(0.0062), 0.0369(0.0036)
  SA:  0, 62, 81, 81, 85, 80
  GA:  0, 0, 0, 0, 0, 86
  SGA: 0, 0, 0, 0, 0, 72
σ=1, τ=0.75
  ME(SE): 0.1280(0.0178), 0.1056(0.0171), 0.0873(0.0164), 0.0872(0.0149), 0.0764(0.0119), 0.0481(0.0088)
  SA:  0, 61, 80, 80, 84, 79
  GA:  0, 0, 0, 0, 0, 86
  SGA: 0, 0, 0, 0, 0, 71
σ=3, τ=0.10
  ME(SE): 0.1495(0.0200), 0.1270(0.0197), 0.1016(0.0169), 0.1008(0.0154), 0.0970(0.0120), 0.0639(0.0100)
  SA:  0, 57, 78, 78, 81, 77
  GA:  0, 0, 0, 0, 0, 84
  SGA: 0, 0, 0, 0, 0, 70
σ=3, τ=0.50
  ME(SE): 0.1239(0.0072), 0.0928(0.0064), 0.0859(0.0060), 0.0738(0.0054), 0.0730(0.0051), 0.0458(0.0029)
  SA:  0, 62, 80, 80, 83, 79
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70
σ=3, τ=0.75
  ME(SE): 0.1391(0.0181), 0.1182(0.0167), 0.0953(0.0163), 0.0941(0.0158), 0.0921(0.0106), 0.0559(0.0081)
  SA:  0, 60, 79, 80, 83, 78
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70


Table 5: ME, SA, GA, and SGA results of Example 5 for n=100 (SA, GA, and SGA in %). Columns: Ridge QR, QR-LASSO, QR-SCAD, QR-adaptive LASSO, QR-elastic-net, QR-PACS.

σ=1, τ=0.10
  ME(SE): 0.1330(0.0161), 0.1106(0.0168), 0.0914(0.0154), 0.0910(0.0142), 0.0818(0.0122), 0.0548(0.0981)
  SA:  0, 58, 79, 79, 84, 79
  GA:  0, 0, 0, 0, 0, 87
  SGA: 0, 0, 0, 0, 0, 72
σ=1, τ=0.50
  ME(SE): 0.1119(0.0061), 0.0807(0.0053), 0.0666(0.0054), 0.0663(0.0051), 0.0561(0.0054), 0.0358(0.0031)
  SA:  0, 63, 82, 82, 86, 81
  GA:  0, 0, 0, 0, 0, 87
  SGA: 0, 0, 0, 0, 0, 73
σ=1, τ=0.75
  ME(SE): 0.1274(0.0169), 0.1048(0.0162), 0.0867(0.0157), 0.0865(0.0143), 0.0756(0.0109), 0.0476(0.0079)
  SA:  0, 62, 81, 81, 84, 79
  GA:  0, 0, 0, 0, 0, 87
  SGA: 0, 0, 0, 0, 0, 72
σ=3, τ=0.10
  ME(SE): 0.1486(0.0188), 0.1260(0.0189), 0.1001(0.0156), 0.1000(0.0148), 0.0960(0.0110), 0.0625(0.0088)
  SA:  0, 57, 77, 77, 81, 78
  GA:  0, 0, 0, 0, 0, 84
  SGA: 0, 0, 0, 0, 0, 70
σ=3, τ=0.50
  ME(SE): 0.1227(0.0069), 0.0922(0.0059), 0.0850(0.0053), 0.0731(0.0047), 0.0724(0.0045), 0.0445(0.0020)
  SA:  0, 63, 80, 80, 83, 80
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70
σ=3, τ=0.75
  ME(SE): 0.1385(0.0177), 0.1172(0.0154), 0.0940(0.0154), 0.0935(0.0151), 0.0926(0.0100), 0.0552(0.0075)
  SA:  0, 60, 80, 80, 83, 79
  GA:  0, 0, 0, 0, 0, 85
  SGA: 0, 0, 0, 0, 0, 70

Table 6: The test error and the effective model size values for the methods under consideration, based on the NCAA sports data.

Method               τ=0.25            τ=0.50            τ=0.75
                     RTE (SE)      MZ  RTE (SE)      MZ  RTE (SE)      MZ
Ridge QR             1.135 (0.035) 19  1.131 (0.037) 19  1.134 (0.034) 19
QR-LASSO             1.126 (0.033)  6  1.123 (0.032)  6  1.124 (0.033)  6
QR-SCAD              1.115 (0.036)  6  1.111 (0.035)  6  1.113 (0.035)  6
QR-adaptive LASSO    1.110 (0.029)  5  1.101 (0.030)  5  1.108 (0.029)  5
QR-elastic-net       1.029 (0.029)  5  1.022 (0.026)  5  1.025 (0.030)  5
QR-PACS              0.913 (0.014)  5  0.896 (0.012)  5  0.912 (0.015)  5

QR-adaptive Lasso, and QR-elastic-net in test error. In fact, ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net perform worse than plain QR in test error. The effective model size for QR-PACS is 5, even though it retains all variables in the model, because coefficients with equal absolute estimates are counted once.

6. Conclusions

In this paper, QR-PACS is developed for group identification and VS under QR settings, combining the strength of QR with the ability of PACS for consistent group identification and VS. QR-PACS can achieve the two goals simultaneously, and it extends PACS from mean regression settings to QR settings. It is shown computationally that it can be carried out simply with an effective computational algorithm. The paper shows that QR-PACS can yield promising predictive precision as well as identify related groups in both simulation and real data. A future direction or extension of the current paper is QR-PACS under a Bayesian framework. Robust QR-PACS is another possible extension.

Data Availability

The data studied in our paper is the NCAA sports data from Mangold et al. [21]. It is public and available from the website (http://www4.stat.ncsu.edu/~boos/var.select/ncaa.html).


Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] R. Koenker and G. Bassett Jr., "Regression quantiles," Econometrica, vol. 46, no. 1, pp. 33–50, 1978.

[2] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 58, no. 1, pp. 267–288, 1996.

[3] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.

[4] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 67, no. 1, pp. 91–108, 2005.

[5] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.

[6] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.

[7] H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418–1429, 2006.

[8] H. Zou and H. H. Zhang, "On the adaptive elastic-net with a diverging number of parameters," The Annals of Statistics, vol. 37, no. 4, pp. 1733–1751, 2009.

[9] C. H. Zhang, "Nearly unbiased variable selection under minimax concave penalty," The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.

[10] R. Koenker, "Quantile regression for longitudinal data," Journal of Multivariate Analysis, vol. 91, no. 1, pp. 74–89, 2004.

[11] H. Wang, G. Li, and G. Jiang, "Robust regression shrinkage and consistent variable selection through the LAD-Lasso," Journal of Business & Economic Statistics, vol. 25, no. 3, pp. 347–355, 2007.

[12] Y. Li and J. Zhu, "L1-norm quantile regression," Journal of Computational and Graphical Statistics, vol. 17, no. 1, pp. 163–185, 2008.

[13] Y. Wu and Y. Liu, "Variable selection in quantile regression," Statistica Sinica, vol. 19, no. 2, pp. 801–817, 2009.

[14] M. Slawski, "The structured elastic net for quantile regression and support vector classification," Statistics and Computing, vol. 22, no. 1, pp. 153–168, 2012.

[15] A. Belloni and V. Chernozhukov, "ℓ1-penalized quantile regression in high-dimensional sparse models," The Annals of Statistics, vol. 39, no. 1, pp. 82–130, 2011.

[16] L. Wang, Y. Wu, and R. Li, "Quantile regression for analyzing heterogeneity in ultra-high dimension," Journal of the American Statistical Association, vol. 107, no. 497, pp. 214–222, 2012.

[17] B. Peng and L. Wang, "An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression," Journal of Computational and Graphical Statistics, vol. 24, no. 3, pp. 676–694, 2015.

[18] D. B. Sharma, H. D. Bondell, and H. H. Zhang, "Consistent group identification and variable selection in regression with correlated predictors," Journal of Computational and Graphical Statistics, vol. 22, no. 2, pp. 319–340, 2013.

[19] R. Wilcox, Introduction to Robust Estimation and Hypothesis Testing, Academic Press, San Diego, Calif, USA, 2nd edition, 1997.

[20] A. Alkenani and K. Yu, "A comparative study for robust canonical correlation methods," Journal of Statistical Computation and Simulation, vol. 83, no. 4, pp. 690–720, 2013.

[21] W. D. Mangold, L. Bean, and D. Adams, "The impact of intercollegiate athletics on graduation rates among major NCAA Division I universities: implications for college persistence theory and practice," Journal of Higher Education, vol. 74, no. 5, pp. 540–563, 2003.


Page 3: Group Identification and Variable Selection in Quantile ...downloads.hindawi.com/journals/jps/2019/8504174.pdf · ResearchArticle Group Identification and Variable Selection in Quantile

Journal of Probability and Statistics 3

Table 1 ME SA GA and SGA results of Example 1 for n=100

120591 Criterion Ridge QR- QR- QR- QR- QR-PACSQR LASSO SCAD adaptive LASSO elastic-net

120590=1

010

ME(SE) 01144(00176) 00834(00182) 00684(00165) 00673(00153) 00623(00112) 00432(00093)SA 0 62 79 80 87 79GA 0 0 0 0 0 88SGA 0 0 0 0 0 72

050

ME(SE) 00832(00056) 00524(00065) 00443(00056) 00431(00050) 00404(00041) 00143(00020)SA 0 64 81 82 86 80GA 0 0 0 0 0 89SGA 0 0 0 0 0 73

075

ME(SE) 01140(00173) 00829(00177) 00672(00150) 00668(00146) 00619(00105) 00429(00082)SA 0 63 81 81 88 80GA 0 0 0 0 0 89SGA 0 0 0 0 0 72

120590=3

010

ME(SE) 01471(00183) 01168(00188) 01062(00162) 01056(00162) 00953(00120) 00755(00106)SA 0 59 77 77 84 77GA 0 0 0 0 0 86SGA 0 0 0 0 0 70

050

ME(SE) 01153(00066) 00855(00071) 00763(00060) 00756(00060) 00730(00049) 00473(00034)SA 0 61 80 80 83 78GA 0 0 0 0 0 87SGA 0 0 0 0 0 71

075

ME(SE) 01466(00179) 01161(00185) 01056(00161) 01044(00158) 00945(00117) 00744(00099)SA 0 61 79 79 85 78GA 0 0 0 0 0 87SGA 0 0 0 0 0 70

We compared QR-PACS with ridge QR QR-Lasso QR-SCAD QR-adaptive Lasso and QR-elastic-net The perfor-mance of the methods was compared using model error(ME) criterion for prediction accuracy which was definedby (120573 minus 120573)1015840119881(120573 minus 120573) where 119881 represents the populationcovariance matrix of X and the resulting model complexityfor model discovery The median and standard error (SE)of ME were reported Also selection accuracy (SA oftrue models identified) grouping accuracy (GA of truegroups identified) and of both selection and groupingaccuracy (SGA) were computed and reported Note that noneof ridge QR QR-Lasso QR-SCAD QR-adaptive Lasso andQR- elastic-net perform grouping The sample size was 100and the simulated model was replicated 100 times Sometypical examples are reported as follows

Example 1 In this example we assumed the true parametersfor the model of study as 120573 = (2 2 2 0 0 0 0 0)119879 119883 isinR8 The first three predictors were highly correlated withcorrelation equal to 07 and their coefficients were equal inmagnitude while the rest were uncorrelated

Example 2 In this example the true coefficients wereassumed as120573 = (05 1 2 0 0 0 0 0)119879119883 isin R8The first threepredictors were highly correlatedwith correlation equal to 07and their coefficients were different in magnitude while therest were uncorrelated

Example 3 In this example the true parameters were 120573 =(1 1 1 05 1 2 0 0 0 0)119879119883 isin R10The first three predictorswere highly correlated with correlation equal to 07 and theircoefficients were equal in magnitude while the correlationfor the second three predictors was equal to 03 and theircoefficients were different in magnitudes The remainingpredictors were uncorrelated

Example 4 In this example the true parameters were 120573 =(1 1 1 05 1 2 0 0 0 0)119879119883 isin R10The first three predictorswere correlated with correlation equal to 03 and theircoefficients were equal in magnitude while the correlationfor the second three predictors was 07 and their coefficientswere different in magnitudes The remaining predictors wereuncorrelated

Example 5 In this example the true parameters wereassumed as 120573 = (2 2 2 1 1 0 0 0 0 0)119879 119883 isin R10 The firstthree and the second two predictors were highly correlatedwith correlation equal to 07 and their coefficients weredifferent in magnitude while the rest were uncorrelated

For all the values of 120591 and 120590 Table 1 shows that the QR-PACS method has the lowest ME Although the QR-elastic-net andQR-adaptive Lasso have the highest SA it is clear thatall the considered methods do not perform grouping exceptQR-PACS The QR-PACS method successfully identifies thegroups of predictors as seen in the GA and SGA rows

4 Journal of Probability and Statistics

Table 2 ME SA GA and SGA results of Example 2 for n=100

120591 Criterion Ridge QR- QR- QR- QR- QR-PACSQR LASSO SCAD adaptive LASSO elastic-net

120590=1

010

ME(SE) 01170(00201) 00848(00195) 00696(00177) 00665(00165) 00639(00106) 00445(00101)SA 0 60 78 78 84 78NG 100 100 100 100 100 100SNG 0 69 70 70 78 69

050

ME(SE) 00857(00075) 00538(00068) 00456(00062) 00439(00057) 00421(00055) 00156(00035)SA 0 63 81 81 84 79NG 100 100 100 100 100 100SNG 0 70 71 71 80 70

075

ME(SE) 01155(00185) 00841(00179) 00681(00161) 00679(00155) 00630(00111) 00437(00090)SA 0 61 80 81 84 79NG 100 100 100 100 100 100SNG 0 70 70 71 79 70

120590=3

010

ME(SE) 01275(00213) 00968(00199) 00788(00185) 00745(00183) 00728(00106) 00419(00117)SA 0 57 77 77 83 77NG 100 100 100 100 100 100SNG 0 67 69 69 76 67

050

ME(SE) 00876(00078) 00549(00072) 00464(00064) 00447(00062) 00430(00060) 00156(00040)SA 0 63 80 80 83 79NG 100 100 100 100 100 100SNG 0 69 70 70 79 69

075

ME(SE) 01164(00190) 00850(00187) 00692(00170) 00688(00162) 00639(00120) 00446(00098)SA 0 60 79 80 84 79NG 100 100 100 100 100 100SNG 0 69 70 70 78 69

In Table 2 the percentage of no-grouping (NG nogroups found) and percentage of selection and no-grouping(SNG) were reported instead of GA and SGA respectivelyIn terms of prediction and selection the QR-PACS methoddoes not performwell while the QR-elastic-net QR-adaptiveLasso and QR-SCAD perform the best respectively Allthe methods under consideration perform well in termsof not identifying the group Thus the QR-PACS is not arecommendedmethodwhen there is high correlation and thesignificant variables do not form a group

Table 3 demonstrates that the QR-elastic-net and QR-adaptive Lasso have the best SA respectively however theQR-PACS performs better in terms of ME It is obvious thatQR-PACS identifies the important group with high GA andSGA

Table 4 shows that theQR-elastic-net QR-adaptive Lassoand QR-SCAD have the best SA It is clear that the QR-PACShas the best results among the other methods in terms ofMEIn terms ofGAand SGA it can be observed that theQR-PACSperforms well

FromTable 5 it can be noticed that theQR-elastic-net hasthe best SA The QR-PACS has excellent GA Also it is clearthat QR-PACS successfully identifies the groups of predictorsas seen in the GA and SGA

5 NCAA Sports Data

In this section the behavior of the QR-PACS with ridge QRQR-Lasso QR-SCADQR-adaptive Lasso andQR-elastic-net

was illustrated in the analysis of NCAA sports data [21] Westandardized the predictors and centered the response beforethe data analysis

In each repetition the authors randomly split the datainto a training and a testing dataset the percentage of thetesting data was 20 and models were fit onto the trainingset The NCAA sports data were randomly split 100 timeseach to allow for more stable comparisons We reported theaverage and SE of the ratio of test error (RTE) over QR of allmethods and the effective model size (MZ) after accountingfor equality of absolute coefficient estimates

The NCAA data was taken from a study of the effectsof sociodemographic indicators and the sports programs ongraduation rates

The data size is n=94 and p=19 predictorsThe dataset andits description are available from the website (httpwww4statncsuedusimboosvarselectncaahtml) The predictorsare students in top 10 HS (x1) ACT COMPOSITE 25TH(x2) On living campus (x3) first-time undergraduates (x4)Total Enrolment1000 (x5) courses taught by TAs (x6)composite of basketball ranking (x7) in-state tuition1000(x8) room and board1000 (x9) avg BB home attendance(x10) Professor Salary (x11) Ratio of Studentfaculty (x12)white (x13) Assistant professor salary (x14) population ofcity (x15) faculty with PHD (x16) Acceptance rate (x17)receiving loans (x18) and Out of state (x19)

From Table 6 the results indicate that QR-PACS doessignificantly better than ridge QR QR-Lasso QR-SCAD

Journal of Probability and Statistics 5

Table 3: ME, SA, GA, and SGA results of Example 3 for n = 100.

τ      Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adapt. LASSO  QR-elastic-net   QR-PACS

σ = 1
0.10   ME(SE)      0.1383(0.0195)   0.1176(0.0196)   0.0933(0.0164)   0.0908(0.0151)   0.0902(0.0119)   0.0558(0.0098)
       SA          0                58               79               79               82               78
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70
0.50   ME(SE)      0.1139(0.0071)   0.0843(0.0062)   0.0662(0.0059)   0.0643(0.0054)   0.0638(0.0051)   0.0367(0.0029)
       SA          0                62               80               81               84               79
       GA          0                0                0                0                0                86
       SGA         0                0                0                0                0                71
0.75   ME(SE)      0.1289(0.0180)   0.1083(0.0171)   0.0861(0.0161)   0.0843(0.0147)   0.0834(0.0108)   0.0479(0.0081)
       SA          0                61               80               80               84               79
       GA          0                0                0                0                0                86
       SGA         0                0                0                0                0                71

σ = 3
0.10   ME(SE)      0.1490(0.0199)   0.1279(0.0198)   0.1030(0.0169)   0.1016(0.0155)   0.0998(0.0124)   0.0651(0.0104)
       SA          0                57               78               78               80               77
       GA          0                0                0                0                0                83
       SGA         0                0                0                0                0                69
0.50   ME(SE)      0.1244(0.0076)   0.0945(0.0065)   0.0867(0.0062)   0.0742(0.0056)   0.0735(0.0054)   0.0464(0.0032)
       SA          0                61               79               80               83               78
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70
0.75   ME(SE)      0.1399(0.0185)   0.1189(0.0171)   0.0958(0.0166)   0.0946(0.0160)   0.0930(0.0111)   0.0567(0.0086)
       SA          0                60               79               80               83               78
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70

Table 4: ME, SA, GA, and SGA results of Example 4 for n = 100.

τ      Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adapt. LASSO  QR-elastic-net   QR-PACS

σ = 1
0.10   ME(SE)      0.1380(0.0193)   0.1149(0.0195)   0.0939(0.0165)   0.0937(0.0153)   0.0832(0.0130)   0.0560(0.0105)
       SA          0                57               79               79               82               78
       GA          0                0                0                0                0                86
       SGA         0                0                0                0                0                71
0.50   ME(SE)      0.1132(0.0069)   0.0816(0.0062)   0.0672(0.0060)   0.0672(0.0056)   0.0568(0.0062)   0.0369(0.0036)
       SA          0                62               81               81               85               80
       GA          0                0                0                0                0                86
       SGA         0                0                0                0                0                72
0.75   ME(SE)      0.1280(0.0178)   0.1056(0.0171)   0.0873(0.0164)   0.0872(0.0149)   0.0764(0.0119)   0.0481(0.0088)
       SA          0                61               80               80               84               79
       GA          0                0                0                0                0                86
       SGA         0                0                0                0                0                71

σ = 3
0.10   ME(SE)      0.1495(0.0200)   0.1270(0.0197)   0.1016(0.0169)   0.1008(0.0154)   0.0970(0.0120)   0.0639(0.0100)
       SA          0                57               78               78               81               77
       GA          0                0                0                0                0                84
       SGA         0                0                0                0                0                70
0.50   ME(SE)      0.1239(0.0072)   0.0928(0.0064)   0.0859(0.0060)   0.0738(0.0054)   0.0730(0.0051)   0.0458(0.0029)
       SA          0                62               80               80               83               79
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70
0.75   ME(SE)      0.1391(0.0181)   0.1182(0.0167)   0.0953(0.0163)   0.0941(0.0158)   0.0921(0.0106)   0.0559(0.0081)
       SA          0                60               79               80               83               78
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70


Table 5: ME, SA, GA, and SGA results of Example 5 for n = 100.

τ      Criterion   Ridge QR         QR-LASSO         QR-SCAD          QR-adapt. LASSO  QR-elastic-net   QR-PACS

σ = 1
0.10   ME(SE)      0.1330(0.0161)   0.1106(0.0168)   0.0914(0.0154)   0.0910(0.0142)   0.0818(0.0122)   0.0548(0.0981)
       SA          0                58               79               79               84               79
       GA          0                0                0                0                0                87
       SGA         0                0                0                0                0                72
0.50   ME(SE)      0.1119(0.0061)   0.0807(0.0053)   0.0666(0.0054)   0.0663(0.0051)   0.0561(0.0054)   0.0358(0.0031)
       SA          0                63               82               82               86               81
       GA          0                0                0                0                0                87
       SGA         0                0                0                0                0                73
0.75   ME(SE)      0.1274(0.0169)   0.1048(0.0162)   0.0867(0.0157)   0.0865(0.0143)   0.0756(0.0109)   0.0476(0.0079)
       SA          0                62               81               81               84               79
       GA          0                0                0                0                0                87
       SGA         0                0                0                0                0                72

σ = 3
0.10   ME(SE)      0.1486(0.0188)   0.1260(0.0189)   0.1001(0.0156)   0.1000(0.0148)   0.0960(0.0110)   0.0625(0.0088)
       SA          0                57               77               77               81               78
       GA          0                0                0                0                0                84
       SGA         0                0                0                0                0                70
0.50   ME(SE)      0.1227(0.0069)   0.0922(0.0059)   0.0850(0.0053)   0.0731(0.0047)   0.0724(0.0045)   0.0445(0.0020)
       SA          0                63               80               80               83               80
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70
0.75   ME(SE)      0.1385(0.0177)   0.1172(0.0154)   0.0940(0.0154)   0.0935(0.0151)   0.0926(0.0100)   0.0552(0.0075)
       SA          0                60               80               80               83               79
       GA          0                0                0                0                0                85
       SGA         0                0                0                0                0                70

Table 6: The test error and the effective model size values for the methods under consideration, based on the NCAA sports data.

                    τ = 0.25              τ = 0.50              τ = 0.75
Method              RTE           MZ      RTE           MZ      RTE           MZ

Ridge QR            1.135(0.035)  19      1.131(0.037)  19      1.134(0.034)  19
QR-LASSO            1.126(0.033)   6      1.123(0.032)   6      1.124(0.033)   6
QR-SCAD             1.115(0.036)   6      1.111(0.035)   6      1.113(0.035)   6
QR-adaptive LASSO   1.110(0.029)   5      1.101(0.030)   5      1.108(0.029)   5
QR-elastic-net      1.029(0.029)   5      1.022(0.026)   5      1.025(0.030)   5
QR-PACS             0.913(0.014)   5      0.896(0.012)   5      0.912(0.015)   5

QR-adaptive Lasso, and QR-elastic-net in test error. In fact, ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net all perform worse than plain QR in test error. The effective model size is 5 for QR-PACS, even though it includes all variables in the model, because coefficients merged into a group share a single absolute estimate.
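One plausible reading of "effective model size after accounting for equality of absolute coefficient estimates" is the number of distinct nonzero absolute coefficient values; a sketch under that assumption:

```python
import numpy as np

def effective_model_size(beta, tol=1e-8):
    """Count coefficient groups: nonzero coefficients whose absolute values
    agree (within tol) are treated as a single merged predictor.
    (Assumed interpretation of the MZ criterion, not from the paper.)"""
    nonzero = np.abs(beta)[np.abs(beta) > tol]
    if nonzero.size == 0:
        return 0
    vals = np.sort(nonzero)
    # start a new group whenever the gap to the previous value exceeds tol
    return int(1 + np.sum(np.diff(vals) > tol))

beta = np.array([0.0, 1.2, -1.2, 1.2, 0.7, 0.0, -0.7])
print(effective_model_size(beta))  # → 2 (groups {|1.2|} and {|0.7|})
```

Under this reading, a QR-PACS fit that keeps all 19 NCAA predictors but collapses them into 5 equal-magnitude clusters has MZ = 5, matching Table 6.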

6. Conclusions

In this paper, QR-PACS for group identification and VS under QR settings is developed, which combines the strength of QR with the ability of PACS to perform consistent group identification and VS. QR-PACS can achieve the two goals simultaneously, extending PACS from mean regression settings to QR settings. It is shown computationally that QR-PACS can be carried out simply with an effective algorithm. The paper shows that QR-PACS can yield promising predictive precision as well as identify related groups in both simulation and real data. A future direction for the current work is QR-PACS under a Bayesian framework; robust QR-PACS is another possible extension.
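The objective recapped above combines the quantile check loss with the PACS penalty of Sharma et al. [18], which shrinks individual coefficients to zero and shrinks pairwise differences and sums of coefficients toward zero so that coefficients can merge. A minimal numerical sketch follows; the adaptive weights `w`, `w_minus`, and `w_plus` are assumed precomputed, and this evaluates the objective rather than optimizing it:

```python
import numpy as np

def check_loss(u, tau):
    """Koenker-Bassett check (pinball) loss summed over residuals u."""
    return np.sum(u * (tau - (u < 0).astype(float)))

def qr_pacs_objective(beta, X, y, tau, lam, w, w_minus, w_plus):
    """QR-PACS objective: quantile check loss plus the PACS penalty.
    w penalizes single coefficients; w_minus/w_plus penalize pairwise
    differences and sums, encouraging equal-magnitude coefficient groups."""
    fit = check_loss(y - X @ beta, tau)
    single = np.sum(w * np.abs(beta))
    pair = 0.0
    p = len(beta)
    for j in range(p):
        for k in range(j + 1, p):
            pair += w_minus[j, k] * abs(beta[k] - beta[j])
            pair += w_plus[j, k] * abs(beta[k] + beta[j])
    return fit + lam * (single + pair)

# Tiny demonstration with a perfectly fitting beta (hypothetical numbers)
beta = np.array([1.0, 1.0, 0.0])
X = np.eye(3)
y = np.array([1.0, 1.0, 0.0])
w = np.ones(3)
wm = np.ones((3, 3))
wp = np.ones((3, 3))
val = qr_pacs_objective(beta, X, y, 0.5, 0.1, w, wm, wp)
print(val)
```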

Data Availability

The data studied in our paper are the NCAA sports data from Mangold et al. [21]. They are public and available from the website (http://www4.stat.ncsu.edu/~boos/var.select/ncaa.html).


Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] R. Koenker and G. Bassett Jr., "Regression quantiles," Econometrica, vol. 46, no. 1, pp. 33–50, 1978.

[2] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 58, no. 1, pp. 267–288, 1996.

[3] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.

[4] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 67, no. 1, pp. 91–108, 2005.

[5] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.

[6] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.

[7] H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418–1429, 2006.

[8] H. Zou and H. H. Zhang, "On the adaptive elastic-net with a diverging number of parameters," The Annals of Statistics, vol. 37, no. 4, pp. 1733–1751, 2009.

[9] C.-H. Zhang, "Nearly unbiased variable selection under minimax concave penalty," The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.

[10] R. Koenker, "Quantile regression for longitudinal data," Journal of Multivariate Analysis, vol. 91, no. 1, pp. 74–89, 2004.

[11] H. Wang, G. Li, and G. Jiang, "Robust regression shrinkage and consistent variable selection through the LAD-Lasso," Journal of Business & Economic Statistics, vol. 25, no. 3, pp. 347–355, 2007.

[12] Y. Li and J. Zhu, "L1-norm quantile regression," Journal of Computational and Graphical Statistics, vol. 17, no. 1, pp. 163–185, 2008.

[13] Y. Wu and Y. Liu, "Variable selection in quantile regression," Statistica Sinica, vol. 19, no. 2, pp. 801–817, 2009.

[14] M. Slawski, "The structured elastic net for quantile regression and support vector classification," Statistics and Computing, vol. 22, no. 1, pp. 153–168, 2012.

[15] A. Belloni and V. Chernozhukov, "ℓ1-penalized quantile regression in high-dimensional sparse models," The Annals of Statistics, vol. 39, no. 1, pp. 82–130, 2011.

[16] L. Wang, Y. Wu, and R. Li, "Quantile regression for analyzing heterogeneity in ultra-high dimension," Journal of the American Statistical Association, vol. 107, no. 497, pp. 214–222, 2012.

[17] B. Peng and L. Wang, "An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression," Journal of Computational and Graphical Statistics, vol. 24, no. 3, pp. 676–694, 2015.

[18] D. B. Sharma, H. D. Bondell, and H. H. Zhang, "Consistent group identification and variable selection in regression with correlated predictors," Journal of Computational and Graphical Statistics, vol. 22, no. 2, pp. 319–340, 2013.

[19] R. Wilcox, Introduction to Robust Estimation and Hypothesis Testing, Academic Press, San Diego, Calif, USA, 2nd edition, 1997.

[20] A. Alkenani and K. Yu, "A comparative study for robust canonical correlation methods," Journal of Statistical Computation and Simulation, vol. 83, no. 4, pp. 690–720, 2013.

[21] W. D. Mangold, L. Bean, and D. Adams, "The impact of intercollegiate athletics on graduation rates among major NCAA Division I universities: implications for college persistence theory and practice," Journal of Higher Education, vol. 74, no. 5, pp. 540–563, 2003.




Hindawiwwwhindawicom Volume 2018

MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Applied MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of

Hindawiwwwhindawicom Volume 2018

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawiwwwhindawicom Volume 2018

OptimizationJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

Hindawiwwwhindawicom Volume 2018

Operations ResearchAdvances in

Journal of

Hindawiwwwhindawicom Volume 2018

Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018

Hindawiwwwhindawicom Volume 2018

Decision SciencesAdvances in

Hindawiwwwhindawicom Volume 2018

AnalysisInternational Journal of

Hindawiwwwhindawicom Volume 2018

Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

Page 5: Group Identification and Variable Selection in Quantile ...downloads.hindawi.com/journals/jps/2019/8504174.pdf · ResearchArticle Group Identification and Variable Selection in Quantile

Journal of Probability and Statistics 5

Table 3 ME SA GA and SGA results of Example 3 for n=100

120591 Criterion Ridge QR- QR- QR- QR- QR-PACSQR LASSO SCAD adaptive LASSO elastic-net

120590=1

010

ME(SE) 01383(00195) 01176(00196) 00933(00164) 00908(00151) 00902(00119) 00558(00098)SA 0 58 79 79 82 78GA 0 0 0 0 0 85SGA 0 0 0 0 0 70

050

ME(SE) 01139(00071) 00843(00062) 00662(00059) 00643(00054) 00638(00051) 00367(00029)SA 0 62 80 81 84 79GA 0 0 0 0 0 86SGA 0 0 0 0 0 71

075

ME(SE) 01289(00180) 01083(00171) 00861(00161) 00843(00147) 00834(00108) 00479(00081)SA 0 61 80 80 84 79GA 0 0 0 0 0 86SGA 0 0 0 0 0 71

120590=3

010

ME(SE) 01490(00199) 01279(00198) 01030(00169) 01016(00155) 00998(00124) 00651(00104)SA 0 57 78 78 80 77GA 0 0 0 0 0 83SGA 0 0 0 0 0 69

050

ME(SE) 01244(00076) 00945(00065) 00867(00062) 00742(00056) 00735(00054) 00464(00032)SA 0 61 79 80 83 78GA 0 0 0 0 0 85SGA 0 0 0 0 0 70

075

ME(SE) 01399(00185) 01189(00171) 00958(00166) 00946(00160) 00930(00111) 00567(00086)SA 0 60 79 80 83 78GA 0 0 0 0 0 85SGA 0 0 0 0 0 70

Table 4 ME SA GA and SGA results of Example 4 for n=100

120591 Criterion Ridge QR- QR- QR- QR- QR-PACSQR LASSO SCAD adaptive LASSO elastic-net

120590=1

010

ME(SE) 01380(00193) 01149(00195) 00939(00165) 00937(00153) 00832(00130) 00560(00105)SA 0 57 79 79 82 78GA 0 0 0 0 0 86SGA 0 0 0 0 0 71

050

ME(SE) 01132(00069) 00816(00062) 00672(00060) 00672(00056) 00568(00062) 00369(00036)SA 0 62 81 81 85 80GA 0 0 0 0 0 86SGA 0 0 0 0 0 72

075

ME(SE) 01280(00178) 01056(00171) 00873(00164) 00872(00149) 00764(00119) 00481(00088)SA 0 61 80 80 84 79GA 0 0 0 0 0 86SGA 0 0 0 0 0 71

120590=3

010

ME(SE) 01495(00200) 01270(00197) 01016(00169) 01008(00154) 00970(00120) 00639(00100)SA 0 57 78 78 81 77GA 0 0 0 0 0 84SGA 0 0 0 0 0 70

050

ME(SE) 01239(00072) 00928(00064) 00859(00060) 00738(00054) 00730(00051) 00458(00029)SA 0 62 80 80 83 79GA 0 0 0 0 0 85SGA 0 0 0 07 0 70

075

ME(SE) 01391(00181) 01182(00167) 00953(00163) 00941(00158) 00921(00106) 00559(00081)SA 0 60 79 80 83 78GA 0 0 0 0 0 85SGA 0 0 0 0 0 70

6 Journal of Probability and Statistics

Table 5 ME SA GA and SGA results of Example 5 for n=100

120591 Criterion Ridge QR- QR- QR- QR- QR-PACSQR LASSO SCAD adaptive LASSO elastic-net

120590=1

010

ME(SE) 01330(00161) 01106(00168) 00914(00154) 00910(00142) 00818(00122) 00548(00981)SA 0 58 79 79 84 79GA 0 0 0 0 0 87SGA 0 0 0 0 0 72

050

ME(SE) 01119(00061) 00807(00053) 00666(00054) 00663(00051) 00561(00054) 00358(00031)SA 0 63 82 82 86 81GA 0 0 0 0 0 87SGA 0 0 0 0 0 73

075

ME(SE) 01274(00169) 01048(00162) 00867(00157) 00865(00143) 00756(00109) 00476(00079)SA 0 62 81 81 84 79GA 0 0 0 0 0 87SGA 0 0 0 0 0 72

120590=3

010

ME(SE) 01486(00188) 01260(00189) 01001(00156) 01000(00148) 00960(00110) 00625(00088)SA 0 57 77 77 81 78GA 0 0 0 0 0 84SGA 0 0 0 0 0 70

050

ME(SE) 01227(00069) 00922(00059) 00850(00053) 00731(00047) 00724(00045) 00445(00020)SA 0 63 80 80 83 80GA 0 0 0 0 0 85SGA 0 0 0 0 0 70

075

ME(SE) 01385(00177) 01172(00154) 00940(00154) 00935(00151) 00926(00100) 00552(00075)SA 0 60 80 80 83 79GA 0 0 0 0 0 85SGA 0 0 0 0 0 70

Table 6 The test error and the effective model size values for the methods under consideration based on the NCAA sport data

Method 120591 = 025 120591 = 050 120591 = 075RTE MZ RTE MZ RTE MZ

Ridge QR 1135 (0035) 19 1131 (0037) 19 1134 (0034) 19QR-LASSO 1126(0033) 6 1123(0032) 6 1124(0033) 6QR-SCAD 1115 (0036) 6 1111 (0035) 6 1113 (0035) 6QR-adaptive LASSO 1110 (0029) 5 1101 (0030) 5 1108 (0029) 5QR-elastic-net 1029(0029) 5 1022 (0026) 5 1025(0030) 5QR-PACS 0913(0014) 5 0896(0012) 5 0912(0015) 5

QR-adaptive Lasso and QR-elastic-net in test error In factridge QR QR-Lasso QR-SCAD QR-adaptive Lasso andQR-elastic-net perform worse than the QR in test error Theeffective model size is 5 for QR-PACS although it includesall variables in the models

6 Conclusions

In this paper QR-PACS for group identification andVS underQR settings is developed which combines the strength of QRand the ability of PACS for consistent group identificationand VS QR-PACS can achieve the two goals simultaneouslyQR-PACS extends PACS frommean regression settings toQRsettings It is proved computationally that it can be simply

carried out with an effective computational algorithm Thepaper shows that QR-PACS can yield promising predictiveprecision as well as identifying related groups in both sim-ulation and the real data Future direction or extension of thecurrent paper is QR-PACS under Bayesian framework Alsorobust QR-PACS is another possible extension of the currentpaper

Data Availability

The data which is studied in our paper is the NCAA sportsdata from Mangold et al [21] It is public and available fromthewebsite (httpwww4statncsuedusimboosvarselectncaahtml) [21]

Journal of Probability and Statistics 7

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

References

[1] R Koenker andG Bassett Jr ldquoRegression quantilesrdquo Economet-rica vol 46 no 1 pp 33ndash50 1978

[2] R Tibshirani ldquoRegression shrinkage and selection via thelassordquo Journal of the Royal Statistical Society Series B (StatisticalMethodology) vol 58 no 1 pp 267ndash288 1996



6 Journal of Probability and Statistics

Table 5: ME, SA, GA, and SGA results of Example 5 for n = 100.

σ = 1:

| τ    | Criterion | Ridge QR        | QR-Lasso        | QR-SCAD         | QR-adaptive Lasso | QR-elastic-net  | QR-PACS         |
|------|-----------|-----------------|-----------------|-----------------|-------------------|-----------------|-----------------|
| 0.10 | ME (SE)   | 0.1330 (0.0161) | 0.1106 (0.0168) | 0.0914 (0.0154) | 0.0910 (0.0142)   | 0.0818 (0.0122) | 0.0548 (0.0981) |
|      | SA        | 0               | 58              | 79              | 79                | 84              | 79              |
|      | GA        | 0               | 0               | 0               | 0                 | 0               | 87              |
|      | SGA       | 0               | 0               | 0               | 0                 | 0               | 72              |
| 0.50 | ME (SE)   | 0.1119 (0.0061) | 0.0807 (0.0053) | 0.0666 (0.0054) | 0.0663 (0.0051)   | 0.0561 (0.0054) | 0.0358 (0.0031) |
|      | SA        | 0               | 63              | 82              | 82                | 86              | 81              |
|      | GA        | 0               | 0               | 0               | 0                 | 0               | 87              |
|      | SGA       | 0               | 0               | 0               | 0                 | 0               | 73              |
| 0.75 | ME (SE)   | 0.1274 (0.0169) | 0.1048 (0.0162) | 0.0867 (0.0157) | 0.0865 (0.0143)   | 0.0756 (0.0109) | 0.0476 (0.0079) |
|      | SA        | 0               | 62              | 81              | 81                | 84              | 79              |
|      | GA        | 0               | 0               | 0               | 0                 | 0               | 87              |
|      | SGA       | 0               | 0               | 0               | 0                 | 0               | 72              |

σ = 3:

| τ    | Criterion | Ridge QR        | QR-Lasso        | QR-SCAD         | QR-adaptive Lasso | QR-elastic-net  | QR-PACS         |
|------|-----------|-----------------|-----------------|-----------------|-------------------|-----------------|-----------------|
| 0.10 | ME (SE)   | 0.1486 (0.0188) | 0.1260 (0.0189) | 0.1001 (0.0156) | 0.1000 (0.0148)   | 0.0960 (0.0110) | 0.0625 (0.0088) |
|      | SA        | 0               | 57              | 77              | 77                | 81              | 78              |
|      | GA        | 0               | 0               | 0               | 0                 | 0               | 84              |
|      | SGA       | 0               | 0               | 0               | 0                 | 0               | 70              |
| 0.50 | ME (SE)   | 0.1227 (0.0069) | 0.0922 (0.0059) | 0.0850 (0.0053) | 0.0731 (0.0047)   | 0.0724 (0.0045) | 0.0445 (0.0020) |
|      | SA        | 0               | 63              | 80              | 80                | 83              | 80              |
|      | GA        | 0               | 0               | 0               | 0                 | 0               | 85              |
|      | SGA       | 0               | 0               | 0               | 0                 | 0               | 70              |
| 0.75 | ME (SE)   | 0.1385 (0.0177) | 0.1172 (0.0154) | 0.0940 (0.0154) | 0.0935 (0.0151)   | 0.0926 (0.0100) | 0.0552 (0.0075) |
|      | SA        | 0               | 60              | 80              | 80                | 83              | 79              |
|      | GA        | 0               | 0               | 0               | 0                 | 0               | 85              |
|      | SGA       | 0               | 0               | 0               | 0                 | 0               | 70              |

Table 6: The test error and the effective model size values for the methods under consideration based on the NCAA sports data.

| Method            | RTE (τ = 0.25) | MZ | RTE (τ = 0.50) | MZ | RTE (τ = 0.75) | MZ |
|-------------------|----------------|----|----------------|----|----------------|----|
| Ridge QR          | 1.135 (0.035)  | 19 | 1.131 (0.037)  | 19 | 1.134 (0.034)  | 19 |
| QR-Lasso          | 1.126 (0.033)  | 6  | 1.123 (0.032)  | 6  | 1.124 (0.033)  | 6  |
| QR-SCAD           | 1.115 (0.036)  | 6  | 1.111 (0.035)  | 6  | 1.113 (0.035)  | 6  |
| QR-adaptive Lasso | 1.110 (0.029)  | 5  | 1.101 (0.030)  | 5  | 1.108 (0.029)  | 5  |
| QR-elastic-net    | 1.029 (0.029)  | 5  | 1.022 (0.026)  | 5  | 1.025 (0.030)  | 5  |
| QR-PACS           | 0.913 (0.014)  | 5  | 0.896 (0.012)  | 5  | 0.912 (0.015)  | 5  |

QR-adaptive Lasso, and QR-elastic-net in test error. In fact, ridge QR, QR-Lasso, QR-SCAD, QR-adaptive Lasso, and QR-elastic-net perform worse than QR in test error. The effective model size is 5 for QR-PACS, although ridge QR includes all variables in the models.

6. Conclusions

In this paper, QR-PACS for group identification and VS under QR settings is developed, combining the strength of QR with the ability of PACS to perform consistent group identification and VS. QR-PACS can achieve the two goals simultaneously, extending PACS from mean regression settings to QR settings. It is demonstrated computationally that the method can be carried out simply with an effective computational algorithm. The paper shows that QR-PACS can yield promising predictive precision as well as identify related groups in both simulation and real data. A future direction for the current work is QR-PACS under a Bayesian framework; robust QR-PACS is another possible extension.

Data Availability

The data studied in our paper are the NCAA sports data from Mangold et al. [21]. They are public and available from the website (http://www4.stat.ncsu.edu/~boos/var.select/ncaa.html) [21].


Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] R. Koenker and G. Bassett Jr., "Regression quantiles," Econometrica, vol. 46, no. 1, pp. 33–50, 1978.

[2] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 58, no. 1, pp. 267–288, 1996.

[3] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.

[4] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 67, no. 1, pp. 91–108, 2005.

[5] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.

[6] M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.

[7] H. Zou, "The adaptive lasso and its oracle properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418–1429, 2006.

[8] H. Zou and H. H. Zhang, "On the adaptive elastic-net with a diverging number of parameters," The Annals of Statistics, vol. 37, no. 4, pp. 1733–1751, 2009.

[9] C.-H. Zhang, "Nearly unbiased variable selection under minimax concave penalty," The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.

[10] R. Koenker, "Quantile regression for longitudinal data," Journal of Multivariate Analysis, vol. 91, no. 1, pp. 74–89, 2004.

[11] H. Wang, G. Li, and G. Jiang, "Robust regression shrinkage and consistent variable selection through the LAD-Lasso," Journal of Business & Economic Statistics, vol. 25, no. 3, pp. 347–355, 2007.

[12] Y. Li and J. Zhu, "L1-norm quantile regression," Journal of Computational and Graphical Statistics, vol. 17, no. 1, pp. 163–185, 2008.

[13] Y. Wu and Y. Liu, "Variable selection in quantile regression," Statistica Sinica, vol. 19, no. 2, pp. 801–817, 2009.

[14] M. Slawski, "The structured elastic net for quantile regression and support vector classification," Statistics and Computing, vol. 22, no. 1, pp. 153–168, 2012.

[15] A. Belloni and V. Chernozhukov, "ℓ1-penalized quantile regression in high-dimensional sparse models," The Annals of Statistics, vol. 39, no. 1, pp. 82–130, 2011.

[16] L. Wang, Y. Wu, and R. Li, "Quantile regression for analyzing heterogeneity in ultra-high dimension," Journal of the American Statistical Association, vol. 107, no. 497, pp. 214–222, 2012.

[17] B. Peng and L. Wang, "An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression," Journal of Computational and Graphical Statistics, vol. 24, no. 3, pp. 676–694, 2015.

[18] D. B. Sharma, H. D. Bondell, and H. H. Zhang, "Consistent group identification and variable selection in regression with correlated predictors," Journal of Computational and Graphical Statistics, vol. 22, no. 2, pp. 319–340, 2013.

[19] R. Wilcox, Introduction to Robust Estimation and Hypothesis Testing, Academic Press, San Diego, Calif, USA, 2nd edition, 1997.

[20] A. Alkenani and K. Yu, "A comparative study for robust canonical correlation methods," Journal of Statistical Computation and Simulation, vol. 83, no. 4, pp. 690–720, 2013.

[21] W. D. Mangold, L. Bean, and D. Adams, "The impact of intercollegiate athletics on graduation rates among major NCAA Division I universities: implications for college persistence theory and practice," Journal of Higher Education, vol. 74, no. 5, pp. 540–563, 2003.
