Download pdf - Machine Learning Approaches for Wireless Spectrum and Energy …chintha/pdf/thesis/ppt_keyu.pdf · 2018. 10. 2. · Cognitive radio Energy harvesting 2/20. ML for wireless spectrum

PhD Final Oral Defense

Machine Learning Approaches for Wireless Spectrum andEnergy Intelligence

Keyu Wu

Department of Electrical and Computer EngineeringUniversity of Alberta, Edmonton, Alberta T6G 1H9, Canada

September, 2018

Outlines:

1 Background and Motivations

2 Sensing-Probing-Transmitting Control of EH CR

3 Selective Transmission for EH Sensors

4 CSS under Spectrum Heterogeneity

5 Conclusion and Future Research

Outline






Spectrum and Energy Consideration of WirelessCommunications

With the increase of data volume, service types and devices, wirelesscommunication face

Spectrum scarcity

Limited spectrum for wireless applications(33% for all commercial applications in 225 to 3700 MHz);Spectrum reallocation is expensive and slow(70 MHz band costs 19.8 billion dollars).

Energy issue

Huge energy consumption without careful design(communication industry may use 51% of global electricity in 2030);Difficult for powering massive amount of IoT devices.

1 / 20

Cognitive Radio and Energy Harvesting

Cognitive radioEnergy harvesting

2 / 20

ML for wireless spectrum & energy intelligence

Machine learning, a data-driven methodology, is promising for handlingrelevant spectrum and energy management problems.

With ML as a primary tool, three research contributions are made

Joint sensing-probing-transmitting control for EH CR

Optimal transmission for an EH sensor with data priority consideration

Cooperative spectrum sensing under spectrum heterogeneity

3 / 20

Outline






Sensing-probing-transmitting

Harvest: energy package arrives each time slot (uncontrollable)

Sense: measure channel output to detect and track PU activity

Probe: estimate CSI via pilot sequence

Transmit: based on CSI, adapt transmission power and send data

4 / 20






4 / 20






4 / 20






4 / 20






Problem

Based on energy status, PU activity and CSI, the node needs to decidewhether or not to sense and probe, and how much power for transmission,in order to maximize long-term throughput.

4 / 20

Two-stage MDP for long-term optimization

Goal: solving a policy π∗ that maximizes expected throughput

π∗ = arg maxπ

{E

[ ∞∑t=0

γtr(st , π(st))

]}

5 / 20

After-state simplification

s

r

a

s'

Currentstate

After-state Nextstate

π∗(s) = arg maxa∈A(s)

{r(s, a) + γE[V ∗(s ′)|s, a]

}= arg max {immed. reward + expected furture value}= arg max

a∈A(s){r(s, a) + J∗( %(s, a)︸︷︷︸

after-state β

)}

6 / 20

Solve J∗ with RL

Exactly solving J∗ requires the pdfs of EH and fading processes, which canbe hard to obtain.

7 / 20

Solve J∗ with RL

Exactly solving J∗ requires the pdfs of EH and fading processes, which canbe hard to obtain.

We consider to (approximately) learn J∗ with RL algorithm withoutdistribution information.

EH/fadingsample RL

7 / 20

Learned policies

(a) Sensing-probing sub-policy (b) Transmitting sub-policy

aSP : ‘00’, no sense; ‘10’, sense but no probe; ‘11’, sense and probeEnergy for: sense, 1; probe, 2; transmit, {no tx, 3, 4, 5, 6}.

8 / 20

Outline






Selective transmission for EH sensor

Incorporating data-centric consideration

packets associated different priorities

drop low priority packet to save energy

9 / 20

Selective transmission for EH sensor

Problem

Based on EH, energy status, CSI and packet priority, the node needs todecide whether or not to send each packet, in order to maximize the totalpriority values of sent packets.

9 / 20

MDP formulation with after-state

10 / 20

Structural results of J∗ and π∗

Theorem

After-state value function J∗(p) is differentiable and non-decreasing.

Theorem

The optimal policy π∗ has the following structure

π∗([b, h, d ]) =

{1 if b ≥ h and d ≥ J∗(b)− J∗(b − h),

0 otherwise.

11 / 20

Learn J∗ with monotone neural network

FittedValue

Iteration

z-1Data set

FMNN

new MNN

current MNN

ifCnvg.

12 / 20

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

1.2

1.4

0 0.2 0.4 0.6 0.8 1

0.5

1

1.5

2

2.5

3

13 / 20

Learning efficiency

5e2 1e3 1e4 1e5 1e6

Consumed data samples

0.06

0.08

0.1

0.12

0.14

0.16

0.18

Aver

aged

rew

ard

s

Online-DIS

FNN

FMNN

5 10

105

0.16

0.18

Learning curve

14 / 20

Outline






CSS under spectrum heterogeneity

Spectrum heterogeneity

SUs at different spatial locations mayexperience different spectrumstatuses.

15 / 20

CSS under spectrum heterogeneity

Spectrum heterogeneity

SUs at different spatial locations mayexperience different spectrumstatuses.

Problem

Under spectrum heterogeneity, how to exploit neighbor information to fuseSU observations for improving sensing performance.

15 / 20

Fuse data via MAP-MRF

MAP-MRF CSS framework

Compute maximum a posterior estimation

xMAP = arg maxx

{ΦX (x)

N∏i=1

γxi fY |X (yi | xi )

},

weight γ > 0 introduces tradeoff

Existing works in references [99–102] fuse data via solving marginaldistributions.Compared with [99–102], the proposed MAP-MRF can be solved moreflexibly and efficiently.

16 / 20

Three CSS algorithms based on MAP-MRF

Via graph cut theory, GC-CSS algorithm solves xMAP exactly; complexityorder: O(N · |E|2).

17 / 20


Via dual decomposition theory, DD-CSS estimates xMAP distributedly (atcluster-level); complexity: O(T · Nl · |El |2).

17 / 20


Distributed network: clusters with size 1, DD1-CSS becomes fullydistributedly message passing algorithm; guaranteed for solving xMAP ;complexity order O(T · |N (i)|3).

17 / 20


Existing algorithms (based on belief propagation) only work in distributedsetting; complexity: O(T · |N (i)| · 2|N (i)|).

17 / 20

Performance comparison

0 0.1 0.2 0.3 0.40.5

0.6

0.7

0.8

0.9

1

GC-CSS

DD-CSS

DD1-CSS

BP-CSS

Ind-SS

ROC for various algorithms.18 / 20

Outline






Conclusion

In EH CR, the joint optimization of sensing, probing and transmittingis modeled as a two-stage MDP, whose structure is exploited forafter-state simplification.

In EH WSNs, the optimal selective transmission policy is investigated,which is proved to be threshold-based and derived by training amonotone neural network.

CSS under spectrum heterogeneity is formulated via MAP-MRF,which can be effectively solved by graph cut theory and dualdecomposition theory with polynomial complexity.

19 / 20

Future research

Optimal sensing-probing policy without primary user model

Multi-link selective transmission for energy-harvesting sensors

Learn MRF model from data

20 / 20

Thank you!