
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 62, NO. 2, FEBRUARY 2013 809

Multitask Spectrum Sensing in Cognitive Radio Networks via Spatiotemporal Data Mining

Xin-Lin Huang, Member, IEEE, Gang Wang, Member, IEEE, and Fei Hu, Member, IEEE

Abstract: Recently, compressive sensing (CS) and spectrum sensing have been two hot topics in the signal processing and cognitive radio network (CRN) fields, respectively. Due to the sampling rate limitation of the analog-to-digital converter in spectrum-sensing circuits, some works have proposed integrating these two techniques to achieve low-overhead spectrum sensing in CRNs. These works aim to minimize spectrum reconstruction errors based on linear regression methods, and the $\ell_1$-norm is typically used to make a tradeoff between spectrum sparseness and reconstruction accuracy. However, since the interference range of primary users is limited, multiple clusters in the CRN may not share a common sparse spectrum, and thus, the $\ell_1$-norm may not be appropriate to handle all clusters in CS inversion. Hence, we propose a novel multitask spectrum-sensing method based on spatiotemporal data mining methods. In each cluster, we assume that the spectrum sensing is executed in a synchronized way. The cluster head (CH) manages the operations, and a common sparseness hyperparameter is used to make a consensus decision. Among multiple clusters, synchronized CS sampling is not required in our scheme; instead, the Dirichlet process prior is employed to make an automatic grouping of the spectrum-sensing results among different clusters, with a common sparseness hyperparameter shared inside each group. To exploit the time-domain relevance among consecutive CS observations, a hidden Markov model is employed to describe the relationship between the hidden subcarrier states and the consecutive CS observations, and the Viterbi algorithm is used to make an accurate spectrum decision for each secondary user.
Simulation results show that our proposed algorithm can successfully exploit the spatiotemporal relationship to achieve higher spectrum-sensing performance in terms of normalized mean square error, probability of correct detection, and probability of false alarm, compared with a few other related works.

Index Terms: Cognitive radio network (CRN), Dirichlet process (DP), hidden Markov model (HMM), spatiotemporal data mining, spectrum sensing.

Manuscript received November 27, 2011; revised March 22, 2012, June 30, 2012, and September 6, 2012; accepted October 5, 2012. Date of publication October 10, 2012; date of current version February 12, 2013. This work was supported in part by the National Natural Science Foundation of China under Grant No. 61201225 and in part by the Natural Science Foundation of Shanghai under Grant No. 12ZR1450800. The review of this paper was coordinated by Prof. B. Hamdaoui.

X.-L. Huang is with the Department of Information and Communication Engineering, Tongji University, Shanghai 201804, China (e-mail: [email protected]).

G. Wang is with the Communication Research Center, Harbin Institute of Technology, Harbin 150001, China (e-mail: [email protected]).

F. Hu is with the Department of Electrical and Computer Engineering, The University of Alabama, Tuscaloosa, AL 35487 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVT.2012.2223767

I. INTRODUCTION

TODAY, the spectrum assignment policy in wireless communications is regulated by governmental agencies. The huge band of wireless spectrum is segmented and authorized to licensed holders or services. With the dramatic increase of high-definition audio/video applications through wireless access, hundreds of megahertz to many gigahertz of wireless bandwidth are required, which causes scarcity of the limited wireless spectrum resource.
On the other hand, according to a report from the Federal Communications Commission (FCC) [1], the temporal and geographical variations in the utilization of the licensed spectrum range from 15% to 85%. This means that much of the spectrum is not efficiently utilized. The increasing high-quality service requirement, limited available spectrum, and inefficient spectrum utilization necessitate a new communication pattern to exploit the existing wireless spectrum opportunistically [2], [3]. Dynamic spectrum access (DSA) has been proposed to solve the spectrum inefficiency problems and is implemented in cognitive radio networks (CRNs) [4]. In a CRN, through the opportunistic use of free spectrum (also called spectrum holes), a device can gain access to more wireless bandwidth, which is the main goal of the FCC regulations [5].

Cognitive radio (CR) techniques provide the capability of detecting spectrum holes and sharing the spectrum in an opportunistic manner. DSA techniques can select the best available channel from the spectrum pool for CR devices to operate [4], [6]. More specifically, CR enables secondary users (SUs) to perform a series of operations as follows: 1) spectrum sensing to predict what spectrum is available and recognize the presence of the primary user (PU) when a PU reoccupies the licensed channel; 2) spectrum management to select the best available channel from the spectrum pool for special services; 3) spectrum sharing to coordinate access to all available channels with other SUs; and 4) spectrum mobility to vacate the channel as soon as possible when a PU is detected [4]. Spectrum sensing is one of the most important components in the cognition cycle (see Fig. 1).

In Fig. 1, the spectrum-sensing module helps the SUs to recognize the radio environment, i.e., identifying the spectrum occupancy states of both PUs and other SUs.
The spectrum information can be further used by spectrum analysis and spectrum decision modules to analyze the available channel quality and then make a channel assignment decision, respectively [4].

Fig. 1. Basic cognition cycle [4], [7].

0018-9545/$31.00 © 2012 IEEE

Recently, many signal processing techniques have been developed for spectrum sensing, and these can be classified as either noncooperative detection or cooperative detection. In noncooperative detection methods, different spectrum-sensing methods can be chosen by individual SUs [8], such as matched filter detection [9], energy detection [10], and cyclostationary feature detection [11]. These noncooperative detection models are designed for individual SUs to process their local spectrum observations and thus are easy to implement. However, without consideration of spatial diversity information [12], their performance is limited. Hence, many schemes [12]–[18] resort to cooperative detection. It is more accurate than noncooperative detection since the uncertainty in the individual SU's detection can be significantly reduced [12]. In the cooperative detection model, two typical data fusion methods are widely used today: 1) centralized and 2) decentralized fusions [13]–[18]. In the centralized fusion model, all observations from different SUs are collected by a fusion center through multihop transmissions in the CRN [13], [14]. The final result of centralized fusion can be globally optimal due to the spatial information considered in the decision process [12]. However, it also results in high communication cost since it requires the transmission of local information to a fusion center as well as the broadcast of decision information to each SU. Moreover, the fusion center needs to have powerful computation resources to process the huge amount of information quickly. Hence, decentralized fusion methods are proposed in [15]–[18] to reduce power consumption and operation load in the fusion center.
Through local sensing and decision information exchange with neighboring SUs, each SU's local computation can eventually converge to a global decision. Much research work has been done on cooperative decentralized fusion [19], [20].

In [21]–[23], cooperative spectrum-sensing methods are proposed for a simple two-SU network and a multi-SU network, respectively, to reduce the detection time and improve the probability of correct detection in distributed networks. Reference [21] also assumes that the positions of all PUs are known to SUs. In [24] and [25], the Markov chain is used to describe the channel state transition, and a decentralized strategy is proposed for SUs to decide which channels to sense and access, to improve the network throughput. In [26], the distributed detection theory is used to realize cooperative spectrum sensing in an indoor environment to improve the radio awareness of CRNs. In [27], a censoring method with quantization is proposed to reduce the average number of sensing bits for low-overhead information exchange. In [12], a decentralized fusion solution to cooperative compressive sensing (CS) is proposed to obtain global optimality and make a consensus decision. The $\ell_1$-norm has been used widely and is verified to be one of the best inversion methods to make a tradeoff between signal sparseness and reconstruction errors [28]. The orthogonal property between the spectrum statistics of PUs and SUs is used to distinguish between sparse common and sparse innovations in [12], where sparse common and sparse innovations are defined as the global spectrum occupation of PUs and the local interference from neighboring SUs, respectively.

The works most related to our proposed multitask spectrum sensing are discussed in [29] and [30]. These works are designed for signal reconstruction after sub-Nyquist sampling of multitask signals (here, the local spectrum sensing in a specific SU is called a task). Those schemes can reduce reconstruction errors through a centralized fusion. In [29], the multitask CS (MT-CS) is investigated.
It estimates a common sparseness hyperparameter after collecting local observations from all users. Based on the estimated common sparseness hyperparameter and local observations, each user reconstructs the original signals. In [30], all local observations from SUs are assumed not to share the same sparseness hyperparameter. Hence, the cooperative fusion is only executed among some specific users (called one group) rather than the whole network. The Dirichlet process (DP) prior is proposed to realize automatic nonparametric grouping and inversion of observations (i.e., using CS signals to recover the original Nyquist-rate-sampled signals) within each group [30]. In this paper, we will use the multitask Bayesian CS (BCS) model [29] to solve the spectrum-sensing issues in each cluster. Meanwhile, a DP prior [30] will be used to realize automatic nonparametric grouping for the CS data collected from different clusters. We further integrate the DP-based spectrum-sensing model with the hidden Markov model (HMM) to exploit the temporal-dependent information. Moreover, we also consider the effects of sub-Nyquist sampling and the deep-fading characteristic of intersymbol interference (ISI) channels in PUs and SUs.

The main contributions of this paper include the following three aspects:

1) A cooperative and decentralized BCS inversion is proposed as a spectrum information sharing mechanism through a common sparseness hyperparameter. A hierarchical sparseness prior is employed for sharing information based on all local CS observations. The complex-valued observations and unknown ISI channels are also considered in the spectrum sampling process.

2) We extend our work from the preceding sharing mechanism to a nonparametric grouping mechanism.
Since the CS observations from different clusters may not be appropriate for sharing due to the limited radio coverage range of each PU, a DP-based hierarchical model is proposed to realize simultaneous grouping and BCS inversion of the CS observations.

3) After the spatial spectrum diversity information is exploited, the HMM is used in each cluster to exploit the internal relationship among temporal-dependent CS observations. The Viterbi algorithm is used to choose the optimal hyperparameter, which is finally used to make a local spectrum decision in each SU.

The rest of this paper is organized as follows. In Section II, the challenges of spectrum sensing and the system model are described. In Section III, we first consider the case of only one cluster that implements CS sampling and inversion. We assume that all member nodes (also called SUs in this paper) in the same cluster can share spectrum information, and a hierarchical sparseness prior is employed to derive the common sparseness property of the spectrum. In Section IV, we extend the one-cluster case to the whole CRN (i.e., multicluster case) and use the DP-based hierarchical model and HMM to realize spatiotemporal data mining. In Section V, simulation results are provided to show the efficiency of our cooperative distributed spectrum-sensing method in terms of normalized mean square error (MSE), probability of correct detection, and probability of false alarm. Section VI concludes this paper.

II. PROBLEM STATEMENT

Spectrum sensing is a critical task in a CRN that adopts DSA to improve network spectrum utilization [31]. The spectrum sensing in an SU aims to accurately identify the spectrum occupancy status of both PUs and other SUs to facilitate the utilization of idle spectrum holes while at the same time strictly protecting the PUs' transmissions and avoiding harmful interference among SUs.
To reduce the impact of deeply fading ISI channels and improve the spectrum detection performance, we adopt collaborative spectrum sensing that exploits the spatial diversity information among multiple SUs. It stands out as an effective approach to alleviate the problem of detection failure as well as the hidden terminal problem. Another advantage of cooperative sensing is that the effective signal-to-noise ratio may increase proportionally as the number of cooperating SUs increases [12]. This can reduce the spectrum-sensing cost of each individual SU (i.e., the number of CS measurements).

To enable cooperative decentralized spectrum sensing in a huge band CRN, several major challenges have to be addressed. First, according to the Nyquist sampling theorem, spectrum sensing over a wide frequency band requires a high spectrum sampling rate, which is a bottleneck of the analog-to-digital converter (ADC). Second, conventional cooperative spectrum-sensing algorithms require a fusion center to collect observations from all SUs and make centralized sensing decisions [32]. This centralized processing may incur high communication costs and render the entire network vulnerable to node failure [12]. Third, spatially distant SUs might not be ideally synchronized during sampling in the sensing stage, and the observations from distant SUs might not be appropriate for sharing since the transmission range of PUs may not cover all the clusters. As a result, all SUs might not share one common sparseness spectrum decision.

Fig. 2. Cooperative decentralized spectrum sensing for cluster-based CRN.

Existing works assume the following: 1) All CRs stay silent such that only the PUs are emitting power during the SUs' spectrum sensing [18], [32], and 2) the transmitting power of PUs is high enough such that it can be heard by all SUs. Hence, a common sparseness spectrum is shared among all SUs. The first assumption imposes a stringent requirement on the synchronization among all SUs, which is difficult to implement in a large-scale CRN.
The second assumption implies that the PUs are very powerful or the whole CRN is deployed in a small area, and that no interference or deep fading exists in the radio environment. Those conditions may not be realistic in many CRN applications.

In this paper, we consider the cluster-based CRN shown in Fig. 2. In Fig. 2, the SUs are deployed in a wide area, and the clusters are formed based on some metrics, such as location, mobility, etc. (please refer to our previous work [3] on CRN clustering strategies). This paper assumes the same clustering criterion as in [3]. Therefore, we can assume that all cluster members share the same spectrum map due to their close distances to each other. However, due to spectrum-sensing noise and errors in each cluster member, there could be minor discrepancies among their sensing results. Thus, an efficient spectrum fusion algorithm is needed to reach a consensus in terms of the entire cluster's spectrum patterns. In the spectrum-sensing stage, all SUs in the same cluster first individually operate CS sampling in synchronization mode based on the spectrum management commands sent from the CH. The CH then collects the local result from each cluster member to make a fusion result and then broadcasts the fused information to the CHs in other clusters.

To realize our cooperative decentralized spatiotemporal data mining, we assume the following:

1) Sampling synchronization in the whole CRN is not required, and synchronized sampling only occurs inside each cluster. When a cluster starts spectrum sensing, it may also receive the signal from neighboring clusters. Such a signal can be seen as interference from the viewpoint of spectrum sensing. Hence, we need to identify which channels are occupied by the PUs or by SUs from the neighboring clusters.

2) The transmission power of PUs may not be high enough to cover the whole CRN area, and different clusters may have different spectrum occupancy status due to the geographical-dependent PUs. Hence, the sparseness spectrum is determined by the geographical-dependent PUs as well as the distant SUs in other clusters.

3) The CS algorithm is adopted in each SU during spectrum sensing since wideband spectrum sensing requires a high sampling rate and a high cost on the ADC circuit according to the Nyquist sampling theorem.

III. MULTITASK BAYESIAN COMPRESSIVE SENSING MODELING WITH HIERARCHICAL PRIOR: ONE-CLUSTER CASE

For simplicity, we first consider the cooperative spectrum sensing in one cluster only and then extend it to the multicluster case in the next section. In each cluster, the CH periodically broadcasts the spectrum-sensing commands, and all member nodes execute signal sampling and exchange information with their CH. In the one-cluster case, we assume that all member nodes share one common sparseness spectrum. In this section, we propose using MT-CS modeling with a hierarchical prior to detect spectrum holes in a cooperative manner. Unlike conventional cooperative spectrum-sensing schemes [12], [29], [33]–[35], here we consider the following: 1) unknown channel impulse response (CIR) between PUs and SUs; 2) complex-valued sampling signals (including horizontal component and orthogonal component); and 3) automatic identification of PUs' and SUs' spectrum occupancy states based on the spectrum assignment records from its neighboring CHs.

We assume that there are $M$ subcarriers (also called channels) and $N$ SUs in the target cluster. Through Nyquist sampling, the received signals in SU $j$ can be represented as

$$ r_j(n) = r_j^{P}(n) + r_j^{C}(n) + w_j(n), \quad n = 0, 1, \ldots, M-1 \tag{1} $$

where $r_j^{P}(n) = \sum_{i=1}^{I} h_{i,j}(n) * x_i(n)$ and $r_j^{C}(n) = \sum_{i} g_{i,j}(n) * y_i(n)$ correspond to the received signals from a total of $I$ PUs and the interference from neighboring clusters, respectively. $h_{i,j}(n)$ is the CIR between PU $i$ and SU $j$ ($j = 1, 2, \ldots, N$), and $(*)$ denotes convolution. $g_{i,j}(n)$ is the CIR between a neighboring cluster $i$ and SU $j$. $x_i(n)$ and $y_i(n)$ correspond to the original transmitted signal from PU $i$ and neighboring cluster $i$, respectively. $w_j(n)$ represents the additive white Gaussian noise.

After the $M$-point discrete Fourier transform (DFT) of the observed signals, (1) can be further rewritten as [12]

$$ R_j(k) = \sum_{i=1}^{I} H_{i,j}(k) X_i(k) + \sum_{i} G_{i,j}(k) Y_i(k) + W_j(k) \tag{2} $$

where $H_{i,j}(k)$, $X_i(k)$, $G_{i,j}(k)$, $Y_i(k)$, and $W_j(k)$ ($k = 0, 1, \ldots, M-1$) are the complex-valued frequency-domain discrete versions of $h_{i,j}(n)$, $x_i(n)$, $g_{i,j}(n)$, $y_i(n)$, and $w_j(n)$, respectively.

Due to the sampling rate limitation of the ADC and the sparse nature of the received signals in (2), we use the CS technique in spectrum sampling as follows:

$$ \nu_j = \Phi_j F_M F_M^{-1} R_j^{P} + \Phi_j F_M F_M^{-1} R_j^{C} + \Phi_j F_M F_M^{-1} W_j \tag{3} $$

where $\nu_j$ is an $m_j \times 1$ vector ($m_j \ll M$) of the CS observations, and $R_j^{P} = \sum_{i=1}^{I} H_{i,j}(k) X_i(k)$ is orthogonal to $R_j^{C} = \sum_{i} G_{i,j}(k) Y_i(k)$ (i.e., $(R_j^{P})^{T} R_j^{C} = 0$). $\Phi_j F_M$ is the observation matrix, where $\Phi_j$ is an $m_j \times M$ matrix with elements constituted randomly [29], and $F_M$ is the $M \times M$ DFT matrix [12]. In (3), $\nu_j$ is the time-domain sampling signal and can be further rewritten as

$$ \nu_j = \Phi_j R_j^{P} + \Phi_j R_j^{C} + \Phi_j W_j = \Phi_j \,\mathrm{Re}\{R_j^{P} + R_j^{C} + W_j\} + \sqrt{-1}\, \Phi_j \,\mathrm{Im}\{R_j^{P} + R_j^{C} + W_j\}. \tag{4} $$

Since the $\mathrm{Re}\{\cdot\}$ part is orthogonal to the $\mathrm{Im}\{\cdot\}$ part and both of them have the same structure [see (4)], we only show the analysis of the $\mathrm{Re}\{\cdot\}$ part here (to save space); the $\mathrm{Im}\{\cdot\}$ part can be analyzed in the same manner, and any analysis result for $\mathrm{Re}\{\cdot\}$ extends symmetrically to the $\mathrm{Im}\{\cdot\}$ case. Hence, we have

$$ \nu_j = \Phi_j \theta_j + \varepsilon_j \tag{5} $$

where $\theta_j = \mathrm{Re}\{R_j^{P} + R_j^{C}\}$ is an $M \times 1$ vector that represents the spectrum occupation states, and $\varepsilon_j$ is an $m_j \times 1$ vector whose components are independent identically distributed (i.i.d.) Gaussian variables. We employ a hierarchical MT-CS model for our cooperative spectrum-sensing scheme based on (5). Since the components of $\varepsilon_j$ are i.i.d. draws from a zero-mean Gaussian distribution with unknown precision $\alpha_0$, the likelihood function for the parameters $\theta_j$ and $\alpha_0$, based on the observations $\nu_j$, can be represented as

$$ p\{\nu_j \mid \theta_j, \alpha_0\} = (2\pi/\alpha_0)^{-m_j/2} \exp\left\{ -\frac{\alpha_0}{2} \left\| \nu_j - \Phi_j \theta_j \right\|_2^2 \right\} \tag{6} $$

where $m_j$ is the number of measurements. It is much smaller than the Nyquist sampling rate $M$ in (1).

The parameters $\theta_j$ are assumed to be drawn from a product of zero-mean Gaussian distributions that are shared by the SUs in one cluster, and therefore, the $N$ tasks are statistically related to each other. Specifically, letting $\theta_{j,k}$ represent the $k$th element of the vector $\theta_j$, we have

$$ p\{\theta_j \mid \alpha, \alpha_0\} = \prod_{k=1}^{M} \mathcal{N}\left(\theta_{j,k} \mid 0, \alpha_0^{-1} \alpha_k^{-1}\right) \tag{7} $$

where $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_M\}$ is a hyperparameter. To promote sparseness over $\theta_j$, a Gamma prior can be placed on the hyperparameter $\alpha_0$, i.e.,

$$ p\{\alpha_0 \mid a, b\} = \mathrm{Ga}(\alpha_0 \mid a, b) = \frac{b^{a}}{\Gamma(a)} \alpha_0^{a-1} \exp(-b\alpha_0). \tag{8} $$

Thus, the posterior probability of $\theta_j$ based on the observed signals $\nu_j$ and hyperparameter $\alpha$ can be represented as

$$ \begin{aligned} p\{\theta_j \mid \nu_j, \alpha\} &= \int p(\theta_j \mid \nu_j, \alpha, \alpha_0)\, p(\alpha_0 \mid a, b)\, d\alpha_0 \\ &= \int \frac{p(\nu_j \mid \theta_j, \alpha_0)\, p(\theta_j \mid \alpha, \alpha_0)}{\int p(\nu_j \mid \theta_j, \alpha_0)\, p(\theta_j \mid \alpha, \alpha_0)\, d\theta_j}\, p(\alpha_0 \mid a, b)\, d\alpha_0 \\ &= \frac{\Gamma(a + M/2) \left[ 1 + \frac{1}{2b} (\theta_j - \mu_j)^{T} \Sigma_j^{-1} (\theta_j - \mu_j) \right]^{-(a + M/2)}}{\Gamma(a)\, (2\pi b)^{M/2}\, |\Sigma_j|^{1/2}} \end{aligned} \tag{9} $$

where

$$ \mu_j = \Sigma_j \Phi_j^{T} \nu_j \tag{10} $$

$$ \Sigma_j = \left( \Phi_j^{T} \Phi_j + A \right)^{-1}. \tag{11} $$

In (11), $A$ is a diagonal matrix ($A = \mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_M)$). From (9)–(11), to obtain the posterior probability of $\theta_j$, we should first seek the point value of the hyperparameter $\alpha$ based on the $N$ observations.

The maximum likelihood (ML) function of the observations from the $N$ SUs can be written as (see the Appendix for detailed deduction steps)

$$ \begin{aligned} \mathcal{L}(\alpha) &= \sum_{j=1}^{N} \log p(\nu_j \mid \alpha) \\ &= \sum_{j=1}^{N} \log \int p(\nu_j \mid \theta_j, \alpha_0)\, p(\theta_j \mid \alpha, \alpha_0)\, p(\alpha_0 \mid a, b)\, d\theta_j\, d\alpha_0 \\ &= -\frac{1}{2} \sum_{j=1}^{N} \left[ (m_j + 2a) \log\left( \nu_j^{T} B_j^{-1} \nu_j + 2b \right) + \log |B_j| \right] + \mathrm{Const}. \end{aligned} \tag{12} $$

Here

$$ B_j = E + \Phi_j A^{-1} \Phi_j^{T} \tag{13} $$

where $E$ is the identity matrix. According to [33], $B_j$ can be decomposed as

$$ B_j = E + \Phi_j A^{-1} \Phi_j^{T} = E + \sum_{n \neq k} \alpha_n^{-1} \Phi_{j,n} \Phi_{j,n}^{T} + \alpha_k^{-1} \Phi_{j,k} \Phi_{j,k}^{T} = B_{j,-k} + \alpha_k^{-1} \Phi_{j,k} \Phi_{j,k}^{T} \tag{14} $$

where $\Phi_{j,k}$ is the $k$th column (basis vector) of $\Phi_j$ and $B_{j,-k} = E + \sum_{n \neq k} \alpha_n^{-1} \Phi_{j,n} \Phi_{j,n}^{T}$ ($k = 1, 2, \ldots, M$). Hence, the determinant and inverse of matrix $B_j$ can be expressed as

$$ |B_j| = |B_{j,-k}| \left| 1 + \alpha_k^{-1} \Phi_{j,k}^{T} B_{j,-k}^{-1} \Phi_{j,k} \right| \tag{15} $$

$$ B_j^{-1} = B_{j,-k}^{-1} - \frac{B_{j,-k}^{-1} \Phi_{j,k} \Phi_{j,k}^{T} B_{j,-k}^{-1}}{\alpha_k + \Phi_{j,k}^{T} B_{j,-k}^{-1} \Phi_{j,k}}. \tag{16} $$

Then, the contribution of the basis vector $\Phi_{j,k}$ in the likelihood function (12) can be separated from the others, i.e.,

$$ \begin{aligned} \mathcal{L}(\alpha) &= -\frac{1}{2} \sum_{j=1}^{N} \left[ (m_j + 2a) \log\left( \nu_j^{T} B_{j,-k}^{-1} \nu_j + 2b \right) + \log |B_{j,-k}| \right] + \mathrm{Const} \\ &\quad - \frac{1}{2} \sum_{j=1}^{N} \left[ \log\left( 1 + \alpha_k^{-1} s_{j,k} \right) + (m_j + 2a) \log\left( 1 - \frac{q_{j,k}^2 / e_{j,k}}{\alpha_k + s_{j,k}} \right) \right] \\ &= \mathcal{L}(\alpha_{-k}) + \ell(\alpha_k) \end{aligned} \tag{17} $$

where $\alpha_{-k} = \{\alpha_1, \alpha_2, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_M\}$. $s_{j,k}$, $q_{j,k}$, and $e_{j,k}$ are defined as

$$ s_{j,k} = \Phi_{j,k}^{T} B_{j,-k}^{-1} \Phi_{j,k}, \qquad q_{j,k} = \Phi_{j,k}^{T} B_{j,-k}^{-1} \nu_j, \qquad e_{j,k} = \nu_j^{T} B_{j,-k}^{-1} \nu_j + 2b. \tag{18} $$

To update $\alpha_k$ in each iteration, we fix the other hyperparameters $\alpha_{-k}$ at their latest values, differentiate the likelihood function $\ell(\alpha_k)$ with respect to $\alpha_k$, and set the result to zero, i.e.,

$$ \frac{\partial \ell(\alpha_k)}{\partial \alpha_k} = \sum_{j=1}^{N} \frac{s_{j,k}\left( s_{j,k} - q_{j,k}^2 / e_{j,k} \right) / \alpha_k - (m_j + 2a)\, q_{j,k}^2 / e_{j,k} + s_{j,k}}{2 (\alpha_k + s_{j,k}) \left( \alpha_k + s_{j,k} - q_{j,k}^2 / e_{j,k} \right)} = 0. \tag{19} $$

Since $\alpha_k$ is the precision of the Gaussian distribution, we have $\alpha_k > 0$. We assume $\alpha_k \ll s_{j,k}$ (this is an empirical result stated in [29]), and thus, $\alpha_k + s_{j,k} \approx s_{j,k}$ in (19). Then, we can derive the new $\alpha_k$ from (19) as follows: if

$$ \sum_{j=1}^{N} \frac{(m_j + 2a)\, q_{j,k}^2 / e_{j,k} - s_{j,k}}{s_{j,k} \left( s_{j,k} - q_{j,k}^2 / e_{j,k} \right)} > 0 $$

then

$$ \alpha_k \approx \frac{N}{\displaystyle\sum_{j=1}^{N} \frac{(m_j + 2a)\, q_{j,k}^2 / e_{j,k} - s_{j,k}}{s_{j,k} \left( s_{j,k} - q_{j,k}^2 / e_{j,k} \right)}} \tag{20} $$

else

$$ \alpha_k = \infty. \tag{21} $$

Hence, the SU calculates $\left( (m_j + 2a)\, q_{j,k}^2 / e_{j,k} - s_{j,k} \right) / \left( s_{j,k} ( s_{j,k} - q_{j,k}^2 / e_{j,k} ) \right)$ in each iteration and sends this value to its CH. From (20) and (21), we can update the hyperparameter $\alpha_k$ ($k = 1, 2, \ldots, M$) after each iteration. After reaching the upper bound on the number of iterations, or if the increment of the likelihood in (17) is less than a threshold (which means that we have almost reached the maximum value of the likelihood), the CH obtains the spectrum decision for its cluster. In (21), $\alpha_k = \infty$ means $\theta_{j,k} = 0$ ($j = 1, 2, \ldots, N$), and the subcarrier $k$ is available to SUs.

From the foregoing analysis, one can see that: 1) the member nodes in one cluster seek a consensus spectrum map based on the multitask BCS model, and 2) the information exchanged among member nodes can be used to derive the shared hyperparameter $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_M\}$. An advantage of our proposed hierarchical prior [see (7) and (8)] is that it collects the spatial contribution from all member nodes to derive the common sparseness spectrum and thus removes ISI channel fading.

After several iterations, the result $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_M\}$ will converge, and we can then make a binary spectrum decision of $d_{\mathrm{PU}}$ and $d_{\mathrm{SU}}$ [to be discussed in (57) and (58)], which represents the spectrum occupancy states of PUs and SUs.

IV. DISTRIBUTED INFORMATION EXCHANGE AND SPATIOTEMPORAL DATA MINING: MULTICLUSTER CASE

In this section, we will extend the one-cluster spectrum-sensing case to a multicluster case, i.e., the whole CRN with observations from different clusters. In Section III, one cluster is assumed to share one common sparseness spectrum. However, the CRN may be deployed over a large-scale area, and the sparseness spectrum decisions may vary in different positions due to the geographical-dependent PUs and signal attenuation along a path. Hence, different clusters may not be statistically interrelated to each other, and the CS observations from different clusters may not be appropriate for sharing.
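To make the shared-hyperparameter update of the one-cluster case concrete before it is generalized, the rule in (20) and (21) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `update_alpha_k` and the numeric inputs are hypothetical, and each SU is assumed to have already computed its own $s_{j,k}$, $q_{j,k}$, and $e_{j,k}$ from (18).

```python
import math


def update_alpha_k(s, q, e, m, a):
    """Approximate update of one shared precision alpha_k, following (20)-(21).

    s, q, e, m are per-SU lists holding s_{j,k}, q_{j,k}, e_{j,k}, and m_j;
    a is the shape parameter of the Gamma prior in (8).
    All names here are illustrative assumptions, not part of the paper.
    """
    total = 0.0
    for s_j, q_j, e_j, m_j in zip(s, q, e, m):
        ratio = q_j * q_j / e_j  # q_{j,k}^2 / e_{j,k}
        # Per-SU term that each SU reports to its cluster head.
        total += ((m_j + 2 * a) * ratio - s_j) / (s_j * (s_j - ratio))
    if total > 0:
        return len(s) / total  # (20): alpha_k ~= N / sum of per-SU terms
    return math.inf  # (21): basis k is pruned, i.e., theta_{j,k} = 0


# Hypothetical statistics reported by N = 2 SUs for one subcarrier k.
alpha_k = update_alpha_k(s=[2.0, 2.0], q=[1.0, 1.0], e=[1.0, 1.0], m=[3, 3], a=0.5)
print(alpha_k)  # 1.0
```

An infinite return value corresponds to the case in (21), in which subcarrier $k$ is declared available to the SUs of the cluster.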
Forexample, one cluster may be located near a high TV tower (basestation), which makes less IEEE 802.22 channels available forSUs. Inthemulticluster case, weshoulddesignanefcientalgorithm that rst groups the CS observations from differentclusters (multiple clusters CS observations may belong to thesame group as long as they obey the same spectrum statistics)andthenusesthemultitaskBCSmodel (seeSectionIII) ineach group to discover the common sparseness spectrum withineachgroup.Forthispurpose,weintroduceaDPpriortothehierarchical BCS model that has been discussed in Section III.The DP prior [30] has shown a powerful capability of automat-ically classifying different samples into groups based on theirstatistical patterns. In our application, the DP prior will be usedto realize both spectrum grouping and CS inversion.A. DPDP is a distribution over probability measure and has two pa-rameters: 1) precision and 2) base distribution G0 [30]. In themulticluster spectrum-sensing case, different clusters may havedifferent hyperparameters, that is, i= {i1, i2, . . . , iM},the cluster ID i = 1, 2, . . . , C (C is the total number of clustersintheCRN). Weassume {i, i = 1, . . . , C}isdrawnidenti-cally from distribution G, which is a random draw from the DP,i.e.,i|G G, i = 1, . . . , C (22)G DP(, G0) (23)E(G) = G0. (24)Equation (22) is the likelihood function for G, and the hyper-parameter i has been derived in the multitask BCS model (seeSection III). Equation (23) is the prior knowledge of G.Whenweintegral out Gaccordingto(22) and(23), iobeys the base distribution G0. In our cluster-based CRN, whenone cluster collects the hyperparameter information i={1, 2, . . . , i1, i+1, . . . , C} from other clusters, the basedistribution G0 is updated, and we have [36]p(i|i, , G0)=+C1G0+1+C1C

k=1,k=ik(25)where krepresents a mass point concentrate at kwithprobability 1/( +C 1). { k}Kk=1 (K C) represents a setof distinct hyperparameters in {k}Ck=1. We assume that thereareniknumber of clusters that choose kin { k}Kk=1. Then,(25) can be further written asp(i|i, , G0)=+C1G0+1+C1K

k=1nik k. (26)Equation (26) clearly shows the important sharing propertyof DP distribution: a new sampleiprefers to select a group k with a large population nik.In(22), thedistributionGcanbegeneratedbythestick-breakingprocess, whichintroducestwoindependent randomvariables k and k(k = 1, 2, . . . , ), i.e.,G =

k=1kk(27)wherek= kk1

n=1(1 n) (28)k| Beta(1, ) (29)k|G0 G0. (30)In(27)and(28), kandkaredrawni.i.d. fromaBetadistribution [(29)] and base distribution G0 [(30)], respectively.To promote sparseness over j, we assume that G0is a multi-plication of Gamma distribution [29], i.e.,G0 M

k=1Ga(c, d). (31)In (27), we can see that the number of mass points is innite.However, thetotal number of uniquevaluesof kisnite.Hence, we can use nite approximation to represent DP via amodied distribution G, i.e.,G =J

k=1lkk(32)wherelkrepresentstheweightofmasspoint k, and Jk=1lk= 1. Hence, (22) can be rewritten asp(i|G) = p_i|{lk}k=1,J, {k}k=1,J_ =J

k=1lkk(33)where Jis the number of unique values of the hyperparameter,obviously, J C. Moreover, {l1, l2, . . . , lJ} obeys the Dirich-let distribution{l1, l2, . . . , lJ} Dir(1, 2, . . . , J). (34)B. Automatically Grouping and DistributedInformation ExchangeBased on the preceding DP, the hidden model shown in (5)can be dened asj|j, 0 N_jj, 10E_, j= 1, 2, . . . , Cj,k|zj,k N_0, 101zj,k_, k = 1, 2, . . . , M0 Ga(a, b)j|{lk}k=1,J, {k}k=1,J J

k=1lkkzj Categorical(l1, l2, . . . , lJ){l1, l2, . . . , lJ} Dir(1, 2, . . . , J) (35)HUANG et al.: MULTITASK SPECTRUM SENSING IN CRNs VIA SPATIOTEMPORAL DATA MINING 815where zjis an index variable to indicate to which groupthecluster jbelongs. IntheDPmodel, weareinterestedin{k}k=1,Jand {zj}j=1,C, which are the required informationfor spectrum decision [see (57) and (58)]. The lower bound ofthe marginal log-likelihood function can be written as(, )=__q(z, l)[log p(, z, l|, )log q(z, l)] dzdl=__q(z, l)[log p(, z, l|, )log q(z, l)] dzdl=__q(l)C

j=1q(zj)___log p(l|) +C

j=1_log p(zj|l) + log p_j|zj__log q(l)C

j=1log q(zj)___dzdl. (36)We can use the variational Bayesian inference, i.e., a varia-tional posterior distribution q({zj}j=1,C,{lk}k=1,J)=

Cj=1q(zj)q({lk}k=1,J),toapproximatethetrueposteriorp({zj}j=1,C,{lk}k=1,J|{j}j=1,C) [30]. In (36), estimation of and canbeobtainedbymaximizingthelowerbound(, )viatheexpectation-maximization (EM) algorithm as follows:1) IntheE-step, is estimatedbymaximizing(, )given = {k}k=1,Jas the latest estimated values.Specically, q(l) and q(z) are updated separately bymaximizingthelower boundin(36) givenother q()values and .2) IntheM-step,thevaluesofareestimatedbymaxi-mizing (36) given the most current values of , q(l), andq(z). Letting j,k= q(zj= k), (36) then becomes() =J

k=1

k (k) (37)

k (k) =C

j=1j,k log p (j|k)=C

j=1j,k log_p(j|j, 0)p (j|k, 0)p(0|a, b)djd0= 12C

j=1j,k_(mj+ 2a) log_TjB1j,kj+b_+ log |Bj,k|_+ const (38)whereBj,k=E + jA1kTj(39)Ak=diag__k,j_j=1,M_. (40)From(37)and(38),wecanseethattheelementsof={1, 2, . . . , J}areindependenttoeachotherandcanthusbeobtainedseparatelybymaximizing(38)intheM-step.Toobtain the optimal values= {1, 2, . . . , J}, we decom-pose the matrix Bj,k in the same way as what we have done in(14), i.e.,Bj,k=E +M

t=1,t=n1k,tj,tTj,t +1k,nj,nTj,n=Bj,k,n +1k,nj,nTj,n(41)where Bj,k,n (n = 1, 2, . . . , M) is used to denote the accumu-lated effects of {k,1, k,2, . . . , k,n1, k,n+1, . . . , k,M}. In(41), we separate the contribution of k,n from other items. Thematrixdeterminantandinverseidentitiesin(38)canthenberewritten as|Bj,k| =|Bj,k,n|1 +1k,nTj,nB1j,k,nj,n (42)B1j,k=B1j,k,n B1j,k,nj,nTj,nB1j,k,nk,nB1j,k,nj,n. (43)Substituting(42)and(43)into(38), wecanfurthersolve(38) as

k(k)= 12C

j=1j,k_(mj+ 2a) log_TjB1j,k,nj+b_+ log |Bj,k,n|_+const 12C

j=1j,k_log_1 +1k,nsj,k,n_+ (mj+ 2a)log_1 q2j,k,n/ej,k,nk,n +sj,k,n__= k_k,n_+k_k,n_(44)wheresj,k,n=Tj,nB1j,k,nj,nqj,k,n=Tj,nB1j,k,njej,k,n=TjB1j,k,nj+ 2b. (45)Equation(44) indicatesthedependenceof k(k)onthehyperparameter k,n, which can be isolated from all the otherparameters k,n. Weassumethat cluster j has Njnodes,andthe sub-Nyquist samplingrates of these Njnodes are{mj,1, mj,2, . . . , mj,Nj}. Hence,k(k,n) in (44) can be fur-ther rewritten as

$$\ell_k(\alpha_{k,n})=-\frac{1}{2}\sum_{j=1}^{C}\pi_{j,k}\sum_{l=1}^{N_j}\left[\log\left(1+\alpha_{k,n}^{-1}s_{j,l,k,n}\right)+(m_{j,l}+2a)\log\left(1-\frac{q_{j,l,k,n}^{2}/e_{j,l,k,n}}{\alpha_{k,n}+s_{j,l,k,n}}\right)\right]. \tag{46}$$

According to [30], $\pi_{j,k}$ and $(\lambda_1,\lambda_2,\ldots,\lambda_J)$ can be updated in the E-step for cooperative CS inversion. We extend [30] to our multicluster CRN application and consider the contributions of all member nodes in each cluster. We assume that all member nodes in cluster $j$ share the same membership $\pi_{j,k}$ ($k=1,2,\ldots,J$). Then, we have [30]

$$\pi_{j,k}=\frac{\sum_{l=1}^{N_j}e^{r_{j,l,k}}}{\sum_{m=1}^{J}\sum_{l=1}^{N_j}e^{r_{j,l,m}}} \tag{47}$$

$$\lambda_k=\frac{1}{J}+\sum_{j=1}^{C}\pi_{j,k} \tag{48}$$

where

$$r_{j,l,k}=\left[\psi(\lambda_k)-\psi\left(\sum_{m=1}^{J}\lambda_m\right)\right]-\frac{1}{2}\left[(m_{j,l}+2a)\log\left(\boldsymbol{\nu}_{j,l}^{T}\mathbf{B}_{j,l,k}^{-1}\boldsymbol{\nu}_{j,l}+2b\right)+\log\left|\mathbf{B}_{j,l,k}\right|\right] \tag{49}$$

$$\psi(x)=\frac{\partial\log\Gamma(x)}{\partial x}. \tag{50}$$

Maximizing $\ell_k(\alpha_{k,n})$ in (46) (i.e., solving $\partial\ell_k(\alpha_{k,n})/\partial\alpha_{k,n}=0$) cannot be done in closed form because the denominator of each term is a second-order polynomial of $\alpha_{k,n}$. Moreover, the entire equation is the sum of $\sum_{j=1}^{C}N_j$ terms. Hence, we obtain a polynomial equation of order $2\left(\sum_{j=1}^{C}N_j-1\right)+1=2\sum_{j=1}^{C}N_j-1$, which cannot be solved in closed form. As in (20) and (21), we also assume here that $\alpha_{k,n}\gg s_{j,l,k,n}$ (this is an empirical result stated in [29]). By maximizing $\ell_k(\alpha_{k,n})$ in the M-step, we get the following. If

$$\sum_{j=1}^{C}\pi_{j,k}\sum_{l=1}^{N_j}\frac{(m_{j,l}+2a)\,q_{j,l,k,n}^{2}/e_{j,l,k,n}-s_{j,l,k,n}}{s_{j,l,k,n}\left(s_{j,l,k,n}-q_{j,l,k,n}^{2}/e_{j,l,k,n}\right)}>0$$

then

$$\alpha_{k,n}\approx\frac{\sum_{j=1}^{C}N_j\pi_{j,k}}{\sum_{j=1}^{C}\pi_{j,k}\sum_{l=1}^{N_j}\dfrac{(m_{j,l}+2a)\,q_{j,l,k,n}^{2}/e_{j,l,k,n}-s_{j,l,k,n}}{s_{j,l,k,n}\left(s_{j,l,k,n}-q_{j,l,k,n}^{2}/e_{j,l,k,n}\right)}} \tag{51}$$

else

$$\alpha_{k,n}=\infty. \tag{52}$$

Hence, a CH exchanges its fusion result

$$\pi_{j,k}\sum_{l=1}^{N_j}\frac{(m_{j,l}+2a)\,q_{j,l,k,n}^{2}/e_{j,l,k,n}-s_{j,l,k,n}}{s_{j,l,k,n}\left(s_{j,l,k,n}-q_{j,l,k,n}^{2}/e_{j,l,k,n}\right)}$$

with other CHs in each iteration. In (52), $\alpha_{k,n}=\infty$ means that channel $n$ is unoccupied by PUs and other SUs. The detailed steps of our proposed algorithm are described as follows.

Our proposed DP-based hierarchical BCS algorithm:
(1) Initialize $\lambda_k$, $\alpha_{k,n}$, and the corresponding $\boldsymbol{\Phi}_{j,n}$ ($k=1,2,\ldots,J$, $j=1,2,\ldots,C$, and $n=1,2,\ldots,M$).
(2) The member node $l$ in cluster $j$ updates $r_{j,l,k}$ according to (49) and (50), and calculates $\left((m_{j,l}+2a)\,q_{j,l,k,n}^{2}/e_{j,l,k,n}-s_{j,l,k,n}\right)/\left(s_{j,l,k,n}\left(s_{j,l,k,n}-q_{j,l,k,n}^{2}/e_{j,l,k,n}\right)\right)$ based on its local observations. These two values are collected by the CH in cluster $j$.
(3) The CH in cluster $j$ updates the membership $\pi_{j,k}$ and $\lambda_k$ according to (47) and (48), respectively.
(4) For $k=1,2,\ldots,J$, the CH in cluster $j$ selects a candidate basis $\boldsymbol{\Phi}_{j,l,n}$ and updates $\alpha_{k,n}$ according to (51) and (52). Here, we choose the element $\alpha_{k,n}$ with the maximal increment $\ell_k(\alpha_{k,n})$ [see (46)] in each iteration.
(5) The CH broadcasts the fusion result $\pi_{j,k}\sum_{l=1}^{N_j}\left((m_{j,l}+2a)\,q_{j,l,k,n}^{2}/e_{j,l,k,n}-s_{j,l,k,n}\right)/\left(s_{j,l,k,n}\left(s_{j,l,k,n}-q_{j,l,k,n}^{2}/e_{j,l,k,n}\right)\right)$ to its neighboring CHs.
(6) Check the termination criterion, which could be a) an upper bound on the number of iterations or b) the increment of $\mathcal{L}(\boldsymbol{\alpha},\boldsymbol{\lambda})$ in each iteration being less than a threshold. (Note: when $\mathcal{L}(\boldsymbol{\alpha},\boldsymbol{\lambda})$ cannot be increased much, we have almost reached the maximum of the likelihood. We can then stop the iterations since our goal is to seek the ML solution.) If either criterion is met, stop; otherwise, go back to step (2).

Reference [37] pointed out that if the joint distribution of the hidden variables belongs to a curved exponential family, then the EM algorithm can find a stationary value of the likelihood function. In our case, $p(\mathbf{z},\mathbf{l}\mid\boldsymbol{\alpha},\boldsymbol{\lambda})=p(\mathbf{z}\mid\mathbf{l})\,p(\mathbf{l}\mid\boldsymbol{\lambda})$, where $p(\mathbf{z}\mid\mathbf{l})$ follows a Categorical distribution and $p(\mathbf{l}\mid\boldsymbol{\lambda})$ follows a Dirichlet distribution. Since the Categorical and Dirichlet distributions both belong to the exponential family, and the Dirichlet distribution is the conjugate prior of the Categorical distribution, $p(\mathbf{z},\mathbf{l}\mid\boldsymbol{\alpha},\boldsymbol{\lambda})$ should belong to a curved exponential family. Hence, our proposed EM algorithm will finally converge to a stationary point.

After the marginal log-likelihood function [see (36)] converges to a stationary point, we obtain $\{\boldsymbol{\alpha}_k,\ k=1,2,\ldots,J\}$ as well as the memberships $\{\pi_{j,k},\ j=1,2,\ldots,C,\ k=1,2,\ldots,J\}$. In our proposed DP-based hierarchical BCS model, we update $\boldsymbol{\alpha}_k$ by monotonically increasing the likelihood function $\mathcal{L}(\boldsymbol{\alpha})$ in each iteration until convergence is achieved.

In the foregoing discussions, we fully exploited the spatial relationship among the CS observations from all clusters to infer the spectrum map. To further increase the accuracy of the spectrum-sensing decision, we next employ the HMM to exploit the time-domain relevance of subcarrier states and select the most probable candidate $\boldsymbol{\alpha}_k$ for the final spectrum decision.

C. HMM

In Fig. 3, the relationship between hidden subcarrier states and CS observations is plotted.
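The memberships from (47) are reused in the HMM as observation likelihoods, so it may help to see how they and the hyperparameter update in (51) and (52) are formed from the quantities each CH collects. The following Python fragment is only a minimal sketch under stated assumptions: the function names `membership` and `update_alpha`, the list-based data layout, and the max-shift used for numerical stability are illustrative choices and are not taken from the paper.

```python
import math

def membership(r):
    """Membership of one cluster over J candidates, following the form of (47).

    r[l][k] is the log-score r_{j,l,k} reported by member node l for
    candidate k (hypothetical layout). Returns a list of J weights that
    sum to one.
    """
    num_candidates = len(r[0])
    # Subtracting the largest score avoids overflow; it cancels in the ratio.
    shift = max(max(row) for row in r)
    pooled = [sum(math.exp(row[k] - shift) for row in r)
              for k in range(num_candidates)]
    total = sum(pooled)
    return [p / total for p in pooled]

def update_alpha(pi_k, cluster_sizes, fused):
    """Hyperparameter update for one (k, n) pair, following (51) and (52).

    pi_k[j]          : membership of cluster j for candidate k
    cluster_sizes[j] : number of member nodes N_j
    fused[j]         : per-cluster sum over nodes of
                       ((m + 2a) q^2 / e - s) / (s (s - q^2 / e)),
                       assumed to be precomputed by the member nodes
    """
    denom = sum(p * f for p, f in zip(pi_k, fused))
    if denom > 0:
        return sum(n * p for n, p in zip(cluster_sizes, pi_k)) / denom
    # Non-positive denominator: the basis is pruned (subcarrier unoccupied).
    return math.inf
```

For example, `update_alpha([0.5, 0.5], [5, 5], [2.0, 2.0])` evaluates to 2.5, while a non-positive weighted sum yields `math.inf`, which corresponds to the unoccupied case in (52).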
Since the subcarrier states should be time relevant, only a small number of subcarriers change their binary states between two consecutive CS observations.

Fig. 3. Relationship between hidden subcarrier states and CS observations.

Fig. 4. First-order Markov model for each subcarrier state.

Here, we consider the time-domain relevance when assigning the final hyperparameter $\boldsymbol{\alpha}_k$ to each cluster for the $t$th spectrum sensing. In Fig. 3, the previous states are considered in the HMM as well as the final spectrum decision. The probability that cluster $j$ selects $\boldsymbol{\alpha}_k(t)$ as its hyperparameter can be calculated, and the hyperparameter that leads to the maximal probability is selected as the final hyperparameter, i.e.,

$$V_{\boldsymbol{\alpha}_k(1)}(1)=p\left(\boldsymbol{\nu}_j(1)\mid\boldsymbol{\alpha}_k(1)\right)p\left(\boldsymbol{\alpha}_k(1)\right) \tag{53}$$

$$V_{\boldsymbol{\alpha}_k(t)}(t)=p\left(\boldsymbol{\nu}_j(t)\mid\boldsymbol{\alpha}_k(t)\right)\max_{\boldsymbol{\alpha}_z(t-1)}\left[p\left(\boldsymbol{\alpha}_k(t)\mid\boldsymbol{\alpha}_z(t-1)\right)V_{\boldsymbol{\alpha}_z(t-1)}(t-1)\right] \tag{54}$$

$$\boldsymbol{\alpha}(t)=\arg\max_{\boldsymbol{\alpha}_z(t)}\left(V_{\boldsymbol{\alpha}_z(t)}(t)\right). \tag{55}$$

Equations (53)–(55) are the standard Viterbi recursion. The computational complexity of the Viterbi algorithm is $O(J^2)$. Here, $V_{\boldsymbol{\alpha}_k(t)}(t)$ is the probability of the most probable hyperparameter sequence that is responsible for the first $t$ observations. Moreover, the corresponding $\boldsymbol{\alpha}_k(t)$ is used as the final hyperparameter to make a decision on the subcarrier states.

In (53) and (54), $p(\boldsymbol{\nu}_j(t)\mid\boldsymbol{\alpha}_k(t))$ is equal to the membership $\pi_{j,k}$ in (47). Moreover, the transition between two adjacent hidden hyperparameters is modeled as a first-order Markov model [38]. Note that we are only interested in the binary subcarrier state $d_{k,n}(t)$ that corresponds to two cases, i.e., $0 < \alpha_{k,n}(t) < \infty$ and $\alpha_{k,n}(t)=\infty$, instead of the exact value of $\alpha_{k,n}(t)$. Hence, the transition probability $p(\boldsymbol{\alpha}_k(t)\mid\boldsymbol{\alpha}_z(t-1))$ can be further rewritten as

$$p\left(\boldsymbol{\alpha}_k(t)\mid\boldsymbol{\alpha}_z(t-1)\right)\approx p\left\{\mathbf{d}_k(t)\mid\mathbf{d}_z(t-1)\right\}. \tag{56}$$

The transition probability of the binary subcarrier state is described as a Markov model in Fig. 4. Here, we assume that the current subcarrier state (at the $t$th time) depends only on the last subcarrier state (i.e., at the $(t-1)$th time).
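The recursion in (53)–(55) can be illustrated with a short Python sketch. It is a generic Viterbi routine written under assumptions that are not in the paper: the function name `viterbi`, the nested-list inputs, and the returned back-traced path are illustrative. In the setting above, the emission values would be the memberships from (47) and the transition table would come from the binary-state model in (56).

```python
def viterbi(emissions, transition, prior):
    """Most probable state sequence for a first-order hidden Markov model.

    emissions[t][k]  : likelihood of observation t under candidate k
    transition[z][k] : probability of moving from candidate z to candidate k
    prior[k]         : initial probability of candidate k
    Returns (path, probability of that path).
    """
    num_states = len(prior)
    # Initialisation, as in (53).
    v = [prior[k] * emissions[0][k] for k in range(num_states)]
    backpointers = []
    # Recursion, as in (54): each step costs O(J^2) for J candidates.
    for t in range(1, len(emissions)):
        new_v, pointers = [], []
        for k in range(num_states):
            best_z = max(range(num_states),
                         key=lambda z: transition[z][k] * v[z])
            pointers.append(best_z)
            new_v.append(emissions[t][k] * transition[best_z][k] * v[best_z])
        v = new_v
        backpointers.append(pointers)
    # Termination, as in (55): pick the best candidate at the last step.
    last = max(range(num_states), key=lambda k: v[k])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, v[last]
```

Only the last element of the returned path is needed for the decision at time $t$; the full path is returned here merely to make the recursion easy to inspect.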
The $n$th elements of $\mathbf{d}_z(t-1)$ and $\mathbf{d}_k(t)$, i.e., $d_{z,n}(t-1)$ and $d_{k,n}(t)$, each take a single binary state (1 or 0). In our first-order Markov model, we assume that the transition probabilities $p_{0,0}$, $p_{0,1}$, $p_{1,0}$, and $p_{1,1}$ are fixed [39]–[41] and can be detected by SUs.

After going through $\{\boldsymbol{\alpha}_k(t)\}_{k=1,\ldots,J}$ in (54), we can obtain the candidate $\boldsymbol{\alpha}_k(t)$ with the maximal value as the final hyperparameter for cluster $j$. As we did in Section III, here we again only consider the real part $\mathrm{Re}\{\cdot\}$ in the mathematical analysis since the imaginary part $\mathrm{Im}\{\cdot\}$ can be analyzed in the same manner. Hence, we can get two hyperparameters $\boldsymbol{\alpha}_k(t)$ and $\boldsymbol{\alpha}_z(t)$ for the real and imaginary parts, respectively. The final binary subcarrier state is determined by a threshold and the two hyperparameters, i.e., if

$$\sqrt{\left(\alpha_{k,n}(t)\right)^{2}+\left(\alpha_{z,n}(t)\right)^{2}} < \text{threshold}$$

then

$$d_n(t)=1 \tag{57}$$

else

$$d_n(t)=0 \tag{58}$$

where $n\in\{1,2,\ldots,M\}$.

D. Identify Spectrum Decision

After we apply the previously discussed spatiotemporal data mining scheme, we further employ the spectrum assignment records in neighboring CHs to identify which subcarriers are occupied by PUs; the remaining subcarriers are regarded as interference from the SUs in other clusters. The subcarriers temporarily occupied by SUs in neighboring clusters are also spectrum opportunities that can be accessed through negotiation or competition schemes among CHs. The binary spectrum decision $d_n(t)$ obtained previously is a mixture of the PUs' and SUs' occupancy states in each subcarrier. Each CH should send out the binary spectrum decision $d_n(t)$ and its corresponding sensing time to its neighboring CHs.

When the neighboring CHs receive $d_n(t)$ and the corresponding sensing time, they revise the states of the subcarriers assigned for their own data transmissions during the sensing time (i.e., change $d_n(t)=1$ to $d_n(t)=0$) and return the modified binary spectrum decision to the source CH. After exchanging the results with the neighboring CHs, a CH obtains the final binary spectrum decision, which is determined only by the PUs' activities.

V. PERFORMANCE ANALYSIS

In this section, we test the performance of our proposed spatiotemporal data mining algorithm and compare it with orthogonal matching pursuit (OMP) [35], single-task BCS [34], the decentralized fusion scheme [12], and MT-CS [29]. These schemes were chosen for comparison because they all use CS algorithms to sample the spectrum and then recover the spectrum map. The OMP algorithm is a fast greedy strategy that iteratively selects the basis functions most aligned with the current residual, and its solution is based on the $\ell_1$-norm [28]. The BCS algorithm builds a hierarchical sparseness prior and uses the relevance vector machine for single-task BCS inversion. The OMP and BCS algorithms are executed in each individual SU, and the spatial diversity from the other SUs' CS observations is not considered. The decentralized fusion scheme [12] uses the $\ell_1$-norm to collect spatial diversity against wireless fading, and the Metropolis–Hastings weight set is adopted to enforce a consensus spectrum decision. The MT-CS algorithm is an extension of BCS and considers the case in which multiple users detect one common sparse signal simultaneously. The MT-CS algorithm assumes that there is a common hyperparameter shared among all tasks and tries to discover this common hyperparameter based on multiple observations from different users. The decentralized fusion scheme, MT-CS, and our proposed spatiotemporal data mining algorithm are all multitask CS schemes.

Fig. 5. Characteristics of the ISI channels.

In our simulation, we consider a CRN that consists of 20 clusters, and each cluster has five member nodes. There are $M=512$ subcarriers available. We assume that there are in total $I=38$ PUs in the CRN and that each PU occupies one subcarrier. In these 20 clusters, each group of two neighboring clusters shares one common sparse spectrum, and 20 PUs are assumed to exist in the area of those two clusters.
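To make this layout concrete, the following Python sketch builds one possible ground-truth occupancy map. It rests on assumptions that the text does not state: the function name `build_occupancy` is hypothetical, the 38 PU subcarriers are drawn uniformly at random, and each pair of neighboring clusters is assumed to see a random subset of 20 of them.

```python
import random

def build_occupancy(num_clusters=20, num_subcarriers=512,
                    total_pus=38, pus_per_group=20, seed=0):
    """Return one binary occupancy vector per cluster (1 = PU present).

    Clusters are paired; both clusters in a pair share the same vector,
    which marks the subcarriers of the PUs assumed to be in their area.
    """
    rng = random.Random(seed)
    # Subcarriers used by the PUs of the whole network (one per PU).
    pu_subcarriers = rng.sample(range(num_subcarriers), total_pus)
    occupancy = []
    for _ in range(num_clusters // 2):
        # Assumed: each pair of clusters observes a random subset of the PUs.
        visible = set(rng.sample(pu_subcarriers, pus_per_group))
        state = [1 if n in visible else 0 for n in range(num_subcarriers)]
        occupancy.append(state)
        occupancy.append(list(state))
    return occupancy
```

Under these assumptions, the two clusters of a pair share one sparse spectrum, whereas different pairs generally see different subsets of the PUs.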
Hence, a common sparse spectrum exists among several clusters, but all clusters are unlikely to share one common sparse spectrum due to different PU situations (see Section IV).

Since frequency-selective fading exists in ISI channels, we assume that each signal received by an SU experiences one of the following three channels. These three fixed ISI channels are selected from the examples used in [42] and [43]. The characteristics of these three channels are plotted in Fig. 5. We can see the following:
1) Channel A: h = [0.407, 0.815, 0.407], which is a spectral-null channel;
2) Channel B: h = [0.8, 0.6], which does not have spectral nulls, although its Fourier transform values at some frequencies are small;
3) Channel C: h = [0.0001 + 0.0001j, 0.0485 + 0.0194j, 0.0573 + 0.0253j, 0.0786 + 0.0282j, 0.0874 + 0.0447j, 0.9222 + 0.3031j, 0.1427 + 0.0349j, 0.0835 + 0.0157j, 0.0621 + 0.0078j, 0.0359 + 0.0049j, 0.0214 + 0.0019j], which has neither spectral nulls nor small Fourier transform values.

For comparison purposes (with BCS), we initialize $\alpha_0=10^{2}/\mathrm{std}(\boldsymbol{\nu})^{2}$ [34]. As in [29], zero-mean Gaussian noise with a standard deviation of 0.05 is added to each of the measurements. We set $a=10^{2}/\mathrm{std}(\boldsymbol{\nu})^{2}$ and $b=1$ [29] in MT-CS and in our proposed spatiotemporal data mining algorithm such that the mean of the Gamma prior $\mathrm{Ga}(\alpha_0\mid a,b)$ is aligned with the fixed value of $\alpha_0$. We assume that the hidden state transition model can be obtained by SUs. For simplicity, we set $p_{0,0}=p_{1,1}=0.7$ and $p_{0,1}=p_{1,0}=0.3$ in our simulation.

Fig. 6. Performance comparisons of the reconstructed signals in terms of normalized MSE.

A. Performance of Reconstruction Errors

We first study the spectrum map reconstruction errors during the CS inversion process in our proposed spatiotemporal data mining algorithm. We use the normalized MSE of the reconstructed spectrum as in [29], [30], and [34], i.e.,

$$\frac{\left\|\mathbf{RP}-\mathbf{RP}_{\text{reconstructed}}\right\|_2}{\left\|\mathbf{RP}\right\|_2}. \tag{59}$$

In our simulation, there are five SUs in each cluster. Each SU collects 100 consecutive CS observations and analyzes these observations using different CS inversion algorithms, i.e., OMP, BCS, the decentralized fusion scheme, MT-CS, and our proposed spatiotemporal data mining. The normalized MSE of the reconstructed signal is plotted in Fig. 6.

From Fig. 6, it can be seen that our proposed algorithm and MT-CS both perform better than the other schemes (OMP, BCS, and the decentralized fusion scheme). This is due to the use of multiple observations from different SUs to discover the hidden hyperparameter for spectrum modeling. However, the observations from all SUs do not share the same sparse spectrum. Our proposed algorithm uses the DP prior to automatically group the observations and discover the hidden hyperparameters in each group simultaneously. Furthermore, the time-domain dependency among the consecutive CS observations is also exploited since the hidden subcarrier states do not change dramatically. Hence, our proposed algorithm performs better than the MT-CS algorithm.

Although the performance of the MT-CS algorithm is closer to our scheme than that of the other three schemes, the reconstruction errors of our proposed algorithm are reduced by 15% to 22% (corresponding to 40 to 100 measurements, respectively) in terms of normalized MSE. Such an improvement is important since it means that we can achieve 15% to 22% higher spectrum map reconstruction accuracy in each SU. Note that the CS signal reconstruction accuracy has been a challenging issue due to the approximate nature of the $\ell_1$-norm minimization that is used in most CS reconstruction methods. From Fig. 6, one can see that our proposed algorithm has the smallest normalized MSE, which indicates that our scheme has the best spectrum-sensing performance among all the schemes. The improvement in spectrum sensing can be further seen in Fig. 7 (to be elaborated next), particularly when the probability of correct detection $P_{CD}=1$ (which means no interference to the PUs).

Fig. 7. Comparisons of spectrum detection performance. Number of measurements: (a) 40. (b) 50. (c) 60. (d) 70. (e) 80. (f) 90. (g) 100.

Another important advantage of our scheme is that the HMM is integrated into our proposed algorithm, which means that a channel state prediction capability can be further exploited. For example, we could use the HMM to predict which channels will have high quality and send data only in those channels. When such a prediction capability is combined with the spectrum-sensing algorithm, the spectrum-sensing overhead can be dramatically reduced. Our next-step research will design a cross-layer CRN protocol based on the HMM prediction results.

B. Performance of Spectrum Sensing

The main goal of spectrum sensing is to detect the characteristics of spectrum holes, and these detection results can then be used by the spectrum decision module (see Fig. 1). Hence, here we present the spectrum hole detection performance of our proposed algorithm [particularly (57) and (58)]. Since the key metrics in spectrum sensing are the probability of correct spectrum detection $P_{CD}$ and the probability of false alarm $P_{FA}$ [31], we evaluate these two metrics by comparing the binary spectrum sparseness decision $\hat{\mathbf{d}}_{PU}$ with the true subcarrier state $\mathbf{d}_{PU}$ as follows [12]:

$$P_{CD}=E\left[\frac{\mathbf{d}_{PU}^{T}\left(\hat{\mathbf{d}}_{PU}=\mathbf{d}_{PU}\right)}{\mathbf{1}^{T}\mathbf{d}_{PU}}\right] \tag{60}$$

$$P_{FA}=E\left[\frac{\left(\mathbf{1}-\mathbf{d}_{PU}\right)^{T}\left(\hat{\mathbf{d}}_{PU}\neq\mathbf{d}_{PU}\right)}{M-\mathbf{1}^{T}\mathbf{d}_{PU}}\right]. \tag{61}$$

In a CRN, we expect the spectrum-sensing scheme to provide a high probability of correct detection $P_{CD}$ and a low probability of false alarm $P_{FA}$. A high $P_{CD}$ can reduce the interference to the PUs [31], whereas a low $P_{FA}$ can provide more bandwidth to the SUs. The spectrum-sensing performance of our proposed algorithm is shown in Fig. 7.

For comparison purposes, we also plot the spectrum hole detection performance of OMP, BCS, the decentralized fusion scheme, and the MT-CS algorithm in Fig. 7. The number of measurements for each observation varies from 40 to 100. In Fig. 7, it can be seen that as the number of measurements increases, the performance of the OMP and BCS algorithms improves gradually. The more measurements are acquired, the more spectrum information is obtained, and thus the better the spectrum hole detection performance we can achieve. Note that the local observation is the only information exploited in the OMP and BCS algorithms [see Fig. 7(a)–(g)]. However, increasing the number of measurements alone cannot dramatically improve the performance of our spatiotemporal data mining scheme, and the same conclusion holds for the decentralized fusion scheme and the MT-CS algorithm, both of which can collect spatial diversity information from other SUs. Fig. 7 shows that under the same $P_{CD}$ and $P_{FA}$ requirements, our proposed algorithm can further reduce the sampling rate requirement and thus has a lower sampling cost.

Fig. 8. Tradeoff between the number of member nodes and the number of measurements.

In Fig. 7, one can also see the following: 1) When the number of measurements increases, a higher probability of correct detection can be achieved since richer spectrum information can be exploited in each SU. Meanwhile, the spectrum-sensing performance curves start to deviate from the X-axis more obviously (see Fig. 7) since more spectrum information exchange among SUs also helps to reduce the probability of false alarm. In such a case, Fig. 7 clearly shows that our scheme has the lowest probability of false alarm among all schemes. 2) In a CRN, since interference to PUs is not allowed, one may be interested only in the $P_{CD}=1$ case. For such a case, Fig. 7 shows that the probability of false alarm of our proposed algorithm is reduced by 70% compared with the MT-CS algorithm. This is because more spectrum holes can be accurately detected via our proposed algorithm.

C. Tradeoff Between the Number of Measurements and the Number of Tasks

Since our proposed spatiotemporal data mining is a multitask spectrum-sensing algorithm, we can strike a good balance between the number of member nodes in each cluster and the number of spectrum measurements. In Fig. 8, we plot the normalized MSE performance for three different cases (i.e., 1, 3, and 5 member nodes in each cluster, respectively). From Fig. 8, it can be seen that if the number of tasks is reduced (i.e., a smaller number of member nodes in each cluster), we can increase the number of measurements to collect more information for the CS inversion process.

When the required normalized MSE values are 0.12, 0.13, 0.14, and 0.15, the number of measurements for cluster size = 5 can be reduced by 7.4%, 7.7%, 9.4%, and 3.9%, respectively, compared with the case of cluster size = 3. Hence, under the same performance requirements (here, normalized MSE), a CRN with more nodes in each cluster can reduce its sampling rate and thus lower the sampling cost.

VI. CONCLUSION

In this paper, we proposed spatiotemporal data mining schemes for low-cost spectrum sensing in CRNs. First, we employed the DP-based MT-CS method to group the observations from different clusters that may not share one common sparse spectrum. Meanwhile, the BCS inversion was used to infer the hyperparameter for each group as well as the emission probability for each hidden subcarrier state. In addition to the DP-based spatial spectrum-sensing model, we also used the HMM to further exploit the time-domain dependency among consecutive observations. The Viterbi algorithm was used to deduce the hidden hyperparameter to make a correct spectrum decision. Finally, the spectrum assignment records in the neighboring CHs were used to identify which subcarriers are occupied by the PUs, and the others can be regarded as the interference from the SUs in other clusters.
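Our simulation results illustrated the efficiency of our proposed spectrum-sensing algorithm, which utilizes spatial and temporal data mining to discover the hidden subcarrier states. Our results also show the following: 1) Our proposed algorithm produces the smallest normalized reconstruction MSE compared with the other four CS-based algorithms. 2) It has the best spectrum hole detection performance in terms of the two key metrics, i.e., the probability of correct detection and the probability of false alarm. The results illustrate that spatiotemporal data mining can effectively collect the spatial diversity information from different SUs and reflect the time-domain dependency among consecutive spectrum observations.

Our future work will focus on cross-layer CRN protocol design to maintain a stable spectrum-sensing performance and thus provide a more stable quality of service (such as delay, throughput, etc.) for SUs' traffic.

APPENDIX
DETAILED STEPS FOR (12)

In (12), the ML function can be rewritten as

$$\begin{aligned}\mathcal{L}(\boldsymbol{\alpha})&=\sum_{j=1}^{N}\log p(\boldsymbol{\nu}_j\mid\boldsymbol{\alpha})\\&=\sum_{j=1}^{N}\log\iint p(\boldsymbol{\nu}_j\mid\boldsymbol{\theta}_j,\alpha_0)\,p(\boldsymbol{\theta}_j\mid\boldsymbol{\alpha},\alpha_0)\,p(\alpha_0\mid a,b)\,d\boldsymbol{\theta}_j\,d\alpha_0\\&=\sum_{j=1}^{N}\log\int p(\alpha_0\mid a,b)\left[\int p(\boldsymbol{\nu}_j\mid\boldsymbol{\theta}_j,\alpha_0)\,p(\boldsymbol{\theta}_j\mid\boldsymbol{\alpha},\alpha_0)\,d\boldsymbol{\theta}_j\right]d\alpha_0\end{aligned} \tag{62}$$

where

$$\begin{aligned}p(\boldsymbol{\nu}_j\mid\boldsymbol{\alpha},\alpha_0)&=\int p(\boldsymbol{\nu}_j\mid\boldsymbol{\theta}_j,\alpha_0)\,p(\boldsymbol{\theta}_j\mid\boldsymbol{\alpha},\alpha_0)\,d\boldsymbol{\theta}_j\\&=\int\left(\frac{2\pi}{\alpha_0}\right)^{-\frac{m_j}{2}}\exp\left(-\frac{\alpha_0}{2}\left\|\boldsymbol{\nu}_j-\boldsymbol{\Phi}_j\boldsymbol{\theta}_j\right\|_2^{2}\right)\prod_{k=1}^{M}\left(\frac{2\pi}{\alpha_0\alpha_k}\right)^{-\frac{1}{2}}\exp\left(-\frac{\alpha_0\alpha_k}{2}\theta_{j,k}^{2}\right)d\boldsymbol{\theta}_j\\&=\left(\frac{2\pi}{\alpha_0}\right)^{-\frac{m_j}{2}}\prod_{k=1}^{M}\left(\frac{2\pi}{\alpha_0\alpha_k}\right)^{-\frac{1}{2}}\int\exp\left(-\frac{\alpha_0}{2}(\boldsymbol{\nu}_j-\boldsymbol{\Phi}_j\boldsymbol{\theta}_j)^{T}(\boldsymbol{\nu}_j-\boldsymbol{\Phi}_j\boldsymbol{\theta}_j)\right)\exp\left(-\frac{\alpha_0}{2}\boldsymbol{\theta}_j^{T}\mathbf{A}\boldsymbol{\theta}_j\right)d\boldsymbol{\theta}_j\\&=\left(\frac{2\pi}{\alpha_0}\right)^{-\frac{m_j}{2}}\prod_{k=1}^{M}\left(\frac{2\pi}{\alpha_0\alpha_k}\right)^{-\frac{1}{2}}\int\exp\left(-\frac{\alpha_0}{2}\left[\boldsymbol{\nu}_j^{T}\boldsymbol{\nu}_j-\boldsymbol{\nu}_j^{T}\boldsymbol{\Phi}_j\boldsymbol{\theta}_j-\boldsymbol{\theta}_j^{T}\boldsymbol{\Phi}_j^{T}\boldsymbol{\nu}_j+\boldsymbol{\theta}_j^{T}\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j\boldsymbol{\theta}_j+\boldsymbol{\theta}_j^{T}\mathbf{A}\boldsymbol{\theta}_j\right]\right)d\boldsymbol{\theta}_j\\&=\left(\frac{2\pi}{\alpha_0}\right)^{-\frac{m_j}{2}}\prod_{k=1}^{M}\left(\frac{2\pi}{\alpha_0\alpha_k}\right)^{-\frac{1}{2}}\int\exp\left(-\frac{\alpha_0}{2}\boldsymbol{\nu}_j^{T}\boldsymbol{\nu}_j-\frac{\alpha_0}{2}\left(\boldsymbol{\theta}_j-\boldsymbol{\mu}_j\right)^{T}\left(\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j+\mathbf{A}\right)\left(\boldsymbol{\theta}_j-\boldsymbol{\mu}_j\right)+\frac{\alpha_0}{2}\boldsymbol{\nu}_j^{T}\boldsymbol{\Phi}_j\left(\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j+\mathbf{A}\right)^{-1}\boldsymbol{\Phi}_j^{T}\boldsymbol{\nu}_j\right)d\boldsymbol{\theta}_j\\&=\left[\left(\frac{2\pi}{\alpha_0}\right)^{-\frac{m_j}{2}}\prod_{k=1}^{M}\left(\frac{2\pi}{\alpha_0\alpha_k}\right)^{-\frac{1}{2}}\int\exp\left(-\frac{\alpha_0}{2}\left(\boldsymbol{\theta}_j-\boldsymbol{\mu}_j\right)^{T}\left(\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j+\mathbf{A}\right)\left(\boldsymbol{\theta}_j-\boldsymbol{\mu}_j\right)\right)d\boldsymbol{\theta}_j\right]\exp\left(-\frac{\alpha_0}{2}\boldsymbol{\nu}_j^{T}\left[\mathbf{E}-\boldsymbol{\Phi}_j\left(\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j+\mathbf{A}\right)^{-1}\boldsymbol{\Phi}_j^{T}\right]\boldsymbol{\nu}_j\right)\end{aligned} \tag{63}$$

with the shorthand $\boldsymbol{\mu}_j=\left(\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j+\mathbf{A}\right)^{-1}\boldsymbol{\Phi}_j^{T}\boldsymbol{\nu}_j$.

From (63), one can see that $p(\boldsymbol{\nu}_j\mid\boldsymbol{\alpha},\alpha_0)$ follows a zero-mean Gaussian distribution with covariance matrix $\left\{\alpha_0\left[\mathbf{E}-\boldsymbol{\Phi}_j\left(\boldsymbol{\Phi}_j^{T}\boldsymbol{\Phi}_j+\mathbf{A}\right)^{-1}\boldsymbol{\Phi}_j^{T}\right]\right\}^{-1}=(1/\alpha_0)\left(\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right)$. Hence, (62) can be further rewritten as

$$\begin{aligned}\mathcal{L}(\boldsymbol{\alpha})&=\sum_{j=1}^{N}\log\int p(\alpha_0\mid a,b)\left[\int p(\boldsymbol{\nu}_j\mid\boldsymbol{\theta}_j,\alpha_0)\,p(\boldsymbol{\theta}_j\mid\boldsymbol{\alpha},\alpha_0)\,d\boldsymbol{\theta}_j\right]d\alpha_0\\&=\sum_{j=1}^{N}\log\int p(\alpha_0\mid a,b)\left(\frac{2\pi}{\alpha_0}\right)^{-\frac{m_j}{2}}\frac{1}{\left|\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right|^{\frac{1}{2}}}\exp\left(-\frac{\alpha_0}{2}\boldsymbol{\nu}_j^{T}\left(\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right)^{-1}\boldsymbol{\nu}_j\right)d\alpha_0\\&=\sum_{j=1}^{N}\log\left\{(2\pi)^{-\frac{m_j}{2}}\frac{1}{\left|\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right|^{\frac{1}{2}}}\int p(\alpha_0\mid a,b)\,\alpha_0^{\frac{m_j}{2}}\exp\left(-\frac{\alpha_0}{2}\boldsymbol{\nu}_j^{T}\left(\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right)^{-1}\boldsymbol{\nu}_j\right)d\alpha_0\right\}\\&=\sum_{j=1}^{N}\log\left\{(2\pi)^{-\frac{m_j}{2}}\frac{1}{\left|\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right|^{\frac{1}{2}}}\frac{b^{a}}{\Gamma(a)}\int\alpha_0^{a-1+\frac{m_j}{2}}\exp\left(-\alpha_0\left[b+\frac{1}{2}\boldsymbol{\nu}_j^{T}\left(\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right)^{-1}\boldsymbol{\nu}_j\right]\right)d\alpha_0\right\}\\&=\sum_{j=1}^{N}\log\left\{(2\pi)^{-\frac{m_j}{2}}\frac{1}{\left|\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right|^{\frac{1}{2}}}\frac{b^{a}}{\Gamma(a)}\frac{1}{\left[b+\frac{1}{2}\boldsymbol{\nu}_j^{T}\left(\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right)^{-1}\boldsymbol{\nu}_j\right]^{a+\frac{m_j}{2}}}\right\}\\&=\sum_{j=1}^{N}\left\{-\frac{m_j}{2}\log(2\pi)-\frac{1}{2}\log\left|\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right|+\log\left(\frac{b^{a}}{\Gamma(a)}\right)-\left(a+\frac{m_j}{2}\right)\log\left[b+\frac{1}{2}\boldsymbol{\nu}_j^{T}\left(\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right)^{-1}\boldsymbol{\nu}_j\right]\right\}\\&=-\frac{1}{2}\sum_{j=1}^{N}\left[(m_j+2a)\log\left(\boldsymbol{\nu}_j^{T}\mathbf{B}_j^{-1}\boldsymbol{\nu}_j+2b\right)+\log\left|\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}\right|\right]+\sum_{j=1}^{N}\left\{-\frac{m_j}{2}\log(2\pi)+\log\left(\frac{b^{a}}{\Gamma(a)}\right)+\left(a+\frac{m_j}{2}\right)\log 2\right\}\\&=-\frac{1}{2}\sum_{j=1}^{N}\left[(m_j+2a)\log\left(\boldsymbol{\nu}_j^{T}\mathbf{B}_j^{-1}\boldsymbol{\nu}_j+2b\right)+\log\left|\mathbf{B}_j\right|\right]+\text{Const}\end{aligned} \tag{64}$$

where $\mathbf{B}_j=\mathbf{E}+\boldsymbol{\Phi}_j\mathbf{A}^{-1}\boldsymbol{\Phi}_j^{T}$, and $\mathbf{E}$ is the identity matrix. We can also see the existence of a constant in the foregoing result, that is, $\text{Const}=\sum_{j=1}^{N}\left\{-(m_j/2)\log(2\pi)+\log\left(b^{a}/\Gamma(a)\right)+\left(a+(m_j/2)\right)\log 2\right\}$.

REFERENCES

[1] "Spectrum policy task force report," FCC, Washington, DC, Rep. ET Docket 02-135, Nov. 2002.
[2] H. Khalife, S. Ahuja, N. Malouch, and M. Krunz, "Probabilistic path selection in opportunistic cognitive radio networks," in Proc. IEEE GLOBECOM, Dec. 2008, pp. 1–5.
[3] X.-L. Huang, G. Wang, F. Hu, and S. Kumar, "Stability-capacity-adaptive routing for high-mobility multihop cognitive radio networks," IEEE Trans. Veh. Technol., vol. 60, no. 6, pp. 2714–2729, Jul. 2011.
[4] I. F. Akyildiz, W. Lee, M. C. Vuran, and S. Mohanty, "Next generation/dynamic spectrum access/cognitive radio wireless networks: A survey," Comput. Netw., vol. 50, no. 13, pp. 2127–2159, Sep. 2006.
[5] K. R. Chowdhury and M. D. Felice, "SEARCH: A routing protocol for mobile cognitive radio ad-hoc networks," Comput. Commun., vol. 32, no. 18, pp. 1983–1997, Dec. 2009.
[6] G. Zhu, I. F. Akyildiz, and G. Kuo, "STOD-RP: A spectrum-tree based on-demand routing protocol for multi-hop cognitive radio networks," in Proc. IEEE GLOBECOM, 2008, pp. 1–5.
[7] H. Kushwaha, Y. Xing, R. Chandramouli, and H. Heffes, "Reliable multimedia transmission over cognitive radio networks using fountain codes," Proc. IEEE, vol. 96, no. 1, pp. 155–165, Jan. 2008.
[8] T. Yücek and H. Arslan, "A survey of spectrum sensing algorithms for cognitive radio applications," IEEE Commun. Surveys Tuts., vol. 11, no. 1, pp. 116–130, 1st Quart. 2009.
[9] R. Tandra and A. Sahai, "Fundamental limits on detection in low SNR under noise uncertainty," in Proc. IEEE WNCMC, 2005, vol. 1, pp. 464–469.
[10] S. Shankar, C. Cordeiro, and K. Challapali, "Spectrum agile radios: Utilization and sensing architectures," in Proc. IEEE DySPAN, 2005, pp. 160–169.
[11] M. Ghozzi, F. Marx, M. Dohler, and J. Palicot, "Cyclostationarility-based test for detection of vacant frequency bands," in Proc. IEEE CROWNCOM, 2006, pp. 1–5.
[12] F. Zeng, C. Li, and Z. Tian, "Distributed compressive spectrum sensing in cooperative multihop cognitive networks," IEEE J. Sel. Topics Signal Process., vol. 5, no. 1, pp. 37–48, Feb. 2011.
[13] M. R. Duarte, M. B. Wakin, D. Baron, and R. G. Baraniuk, "Universal distributed sensing via random projections," in Proc. IEEE IPSN, 2006, pp. 177–185.
[14] Y. Wang, A. Pandharipande, Y. Polo, and G. Leus, "Distributed compressive wide-band spectrum sensing," in Proc. IEEE ITA, 2009, pp. 178–183.
[15] M. E. Yildiz, T. C. Aysal, and K. E. Barner, "In-network cooperative spectrum sensing," in Proc. EUSIPCO, 2009, pp. 1–5.
[16] Z. Li, F. R. Yu, and M. Huang, "A distributed consensus-based cooperative spectrum sensing scheme in cognitive radios," IEEE Trans. Veh. Technol., vol. 59, no. 1, pp. 383–393, Jan. 2010.
[17] J. A. Bazerque and G. B. Giannakis, "Distributed spectrum sensing for cognitive radio networks by exploiting sparsity," IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1847–1862, Mar. 2010.
[18] Z. Tian, "Compressed wideband sensing in cooperative cognitive radio networks," in Proc. IEEE GLOBECOM, 2008, pp. 1–5.
[19] L. Xiao, S. P. Boyd, and S.-J. Kim, "Distributed average consensus with least-mean-square deviation," J. Parallel Distrib. Comput., vol. 67, no. 1, pp. 33–46, Jan. 2007.
[20] I. D. Schizas, A. Ribeiro, and G. B. Giannakis, "Consensus in ad hoc WSNs with noisy links, Part I: Distributed estimation of deterministic signals," IEEE Trans. Signal Process., vol. 56, no. 1, pp. 350–364, Jan. 2008.
[21] G. Ganesan and Y. Li, "Cooperative spectrum sensing in cognitive radio networks," in Proc. IEEE DySPAN, 2005, pp. 137–143.
[22] G. Ganesan and Y. Li, "Cooperative spectrum sensing in cognitive radio, Part II: Multiuser networks," IEEE Trans. Wireless Commun., vol. 6, no. 6, pp. 2214–2222, Jun. 2007.
[23] C. Sun, W. Zhang, and K. B. Letaief, "Cluster-based cooperative spectrum sensing in cognitive radio systems," in Proc. IEEE ICC, 2007, pp. 2511–2515.
[24] Q. Zhao, L. Tong, and A. Swami, "Decentralized cognitive MAC for dynamic spectrum access," in Proc. IEEE DySPAN, 2005, pp. 224–232.
[25] Q. Zhao, L. Tong, A. Swami, and Y. Chen, "Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework," IEEE J. Sel. Areas Commun., vol. 25, no. 3, pp. 589–600, Apr. 2007.
[26] M. Gandetto and C. Regazzoni, "Spectrum sensing: A distributed approach for cognitive terminals," IEEE J. Sel. Areas Commun., vol. 25, no. 3, pp. 546–557, Apr. 2007.
[27] C. Sun, W. Zhang, and K. B. Letaief, "Cooperative spectrum sensing for cognitive radios under bandwidth constraints," in Proc. IEEE WCNC, 2007, pp. 1–5.
[28] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
[29] S. Ji, D. Dunson, and L. Carin, "Multitask compressive sensing," IEEE Trans. Signal Process., vol. 57, no. 1, pp. 92–106, Jan. 2009.
[30] Y. Qi, D. Liu, D. Dunson, and L. Carin, "Multi-task compressive sensing with Dirichlet process priors," in Proc. ACM ICML, 2008, pp. 768–775.
[31] X.-L. Huang, G. Wang, F. Hu, and S. Kumar, "The impact of spectrum sensing frequency and packet-loading scheme on multimedia transmission over cognitive radio networks," IEEE Trans. Multimedia, vol. 13, no. 4, pp. 748–761, Aug. 2011.
[32] Z. Quan, S. Cui, and A. H. Sayed, "Optimal linear cooperation for spectrum sensing in cognitive radio networks," IEEE J. Sel. Topics Signal Process., vol. 2, no. 1, pp. 28–40, Feb. 2008.
[33] M. E. Tipping and A. Faul, "Fast marginal likelihood maximisation for sparse Bayesian models," in Proc. AISTATS, 2003, pp. 3–5.
[34] S. Ji, Y. Xue, and L. Carin, "Bayesian compressive sensing," IEEE Trans. Signal Process., vol. 56, no. 6, pp. 2346–2356, Jun. 2008.
[35] J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[36] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, "Hierarchical Dirichlet processes," J. Amer. Stat. Assoc., vol. 101, no. 476, pp. 1566–1581, Dec. 2006.
[37] C. F. Jeff Wu, "On the convergence properties of the EM algorithm," Ann. Stat., vol. 11, no. 1, pp. 95–103, Mar. 1983.
[38] H. Ishikawa, "Transformation of general binary MRF minimization to the first-order case," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 6, pp. 1234–1249, Jun. 2011.
[39] J. Borges and M. Levene, "Evaluating variable-length Markov chain models for analysis of user web navigation sessions," IEEE Trans. Knowl. Data Eng., vol. 19, no. 4, pp. 441–452, Apr. 2007.
[40] J. Wang, F. Wang, C. Zhang, H. C. Shen, and L. Quan, "Linear neighborhood propagation and its applications," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 9, pp. 1600–1615, Sep. 2009.
[41] H.-M. Lu, D. Zeng, and H. Chen, "Prospective infectious disease outbreak detection using Markov switching models," IEEE Trans. Knowl. Data Eng., vol. 22, no. 4, pp. 565–577, Apr. 2010.
[42] X.-L. Huang, G. Wang, and F. Hu, "Minimal Euclidean distance-inspired optimal and suboptimal modulation schemes for vector OFDM system," Int. J. Commun. Syst., vol. 24, no. 5, pp. 553–567, May 2011.
[43] X. G. Xia, "Precoded and vector OFDM robust to channel spectral nulls and with reduced cyclic prefix length in single transmit antenna systems," IEEE Trans. Commun., vol. 49, no. 8, pp. 1363–1374, Aug. 2001.

Xin-Lin Huang (S'09–M'12) received the M.E. and Ph.D. degrees in communication engineering from Harbin Institute of Technology, Harbin, China, in 2008 and 2011, respectively.
He is an Associate Professor with the Department of Information and Communication Engineering, Tongji University, Shanghai, China. He has published over 25 research papers and has two patents. His research focuses on joint source-channel coding, OFDM technology, cognitive radio networks, and machine learning.
Dr. Huang was the recipient of the Chinese Government Award for Outstanding Ph.D. Students in 2010. From August 2010 to September 2011, he was supported by the China Scholarship Council to do research in the Department of Electrical and Computer Engineering, University of Alabama, as a Visiting Scholar.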
He is a Paper Reviewer for IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, IEEE COMMUNICATIONS LETTERS, Wireless Personal Communications, and the International Journal of Communication Systems.

Gang Wang (M'11) received the B.E., M.E., and Ph.D. degrees in communication engineering from Harbin Institute of Technology, Harbin, China, in 1984, 1987, and 2007, respectively.
He is a Professor with the Communication Research Center, Harbin Institute of Technology, Harbin, China. He is the Chairman of the Department of Communication Engineering. He has published over 60 research papers and four books. His general interests include ad hoc networks, wireless communications, and artificial intelligence.
Dr. Wang was the recipient of the National Grade II Prize of Science and Technology Progress and the National Grade III Prize of Science and Technology Progress.

Fei Hu (M'12) received the Ph.D. degree in signal processing from Tongji University, Shanghai, China, in 1999, and the Ph.D. degree in electrical and computer engineering from Clarkson University, Potsdam, NY, in 2002.
He is currently an Associate Professor with the Department of Electrical and Computer Engineering, University of Alabama, Tuscaloosa, AL. He has published over 170 journal/conference papers and book chapters. His research has been supported by U.S. NSF, Cisco, Sprint, and other sources. His research expertise is in cognitive radio networks and security.
