Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Wai-Kei Mak, Dept. of Computer Science and Engineering, University of South Florida

Evangeline F.Y. Young, Dept. of Computer Science and Engineering, The Chinese University of Hong Kong

Page 1: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Wai-Kei Mak
Dept. of Computer Science and Engineering
University of South Florida

Evangeline F.Y. Young
Dept. of Computer Science and Engineering
The Chinese University of Hong Kong

Page 2: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Outline

I. Dynamically reconfigurable FPGA

II. Temporal partitioning = Conventional partitioning?

III. Temporal logic replication

What? Why? How?

IV. Experimental results

V. Conclusions

Page 3: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Dynamically Reconfigurable FPGA

Store multiple contexts on chip.

Reuse logic blocks and wire segments dynamically.

The contexts stored can correspond to the multiple stages of a large circuit.

Page 4: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Temporal Circuit Partitioning

Temporal partitioning: multiple stages execute sequentially

Spatial partitioning: multiple components execute concurrently

Page 5: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Temporal Logic Replication

Can reduce buffering requirement.

Effectively utilize available slack logic capacity.

Page 6: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Temporal Constraints

For a net n = (v1, {v2, …, vp}) with source v1 and sinks v2, …, vp, where s(v) is the stage assigned to node v,

require s(v1) ≤ s(vj), j=2,…,p, if v1 is a combinational node

Page 7: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Temporal Constraints (Cont’d)

require s(vj) ≤ s(v1), j=2,…,p, if v1 is a flip-flop node
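
As a rough illustration, the two temporal constraints can be checked mechanically for a candidate stage assignment. The sketch below assumes the relation in both constraints is ≤, that each net is given as a (source, sinks) pair, and that the stage function s is stored in a dictionary. The names `violations`, `stage`, `flip_flops`, and the three-node circuit are made up for this example and do not come from the slides.

```python
def violations(nets, stage, flip_flops):
    """Return the (source, sink) pairs that break the temporal constraints.

    nets:       list of (v1, [v2, ..., vp]) pairs, with v1 driving the others
    stage:      dict mapping each node v to its stage number s(v)
    flip_flops: set of flip-flop nodes (every other node is combinational)
    """
    broken = []
    for source, sinks in nets:
        for sink in sinks:
            if source in flip_flops:
                # Flip-flop source: s(vj) <= s(v1)
                ok = stage[sink] <= stage[source]
            else:
                # Combinational source: s(v1) <= s(vj)
                ok = stage[source] <= stage[sink]
            if not ok:
                broken.append((source, sink))
    return broken


# Hypothetical circuit: gate g1 drives g2 and ff1; ff1 feeds back into g1.
nets = [("g1", ["g2", "ff1"]), ("ff1", ["g1"])]
flip_flops = {"ff1"}

valid_stage = {"g1": 1, "g2": 2, "ff1": 2}
invalid_stage = {"g1": 2, "g2": 1, "ff1": 2}

print(violations(nets, valid_stage, flip_flops))    # no violations
print(violations(nets, invalid_stage, flip_flops))  # g1 scheduled after g2
```

An empty result means the assignment is a legal temporal partition with respect to these constraints; capacity limits per stage would need a separate check.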

Page 8: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Temporal Partitioning with Replication

Problem: Partition the given circuit into a pre-defined number of stages satisfying all temporal constraints.

Objective: Minimize buffers required between stages.

Proposal: Utilize available slack logic capacity to reduce signal buffering.

Solution: An effective 2-step approach.

Page 9: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

2-Step Approach

Step 1: Compute a temporal partition without replication.

Step 2: Repeatedly identify the bottleneck stage and apply replication for that stage.

Page 10: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Advantages of 2-Step Approach

Will not replicate unnecessarily.

All temporal constraints are already satisfied when replicating.

Page 11: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Min-Area Min-Cut Replication

Let stage i be the bottleneck stage.

Min-Cut Replication

Compute a subset of nodes Ri in stage i for replication into stage i+1 to maximally reduce the communication cost at stage i.

Min-Area Min-Cut Replication

Compute a minimum subset of nodes Ri in stage i for replication into stage i+1 to maximally reduce the communication cost at stage i.

Page 12: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Optimal Solution for Min-Area Min-Cut Replication

Let Vi = set of nodes in stage i.

Observation 1: The min-cut replication problem can be solved by computing a minimum cut (Vi-Ri,Ri) in stage i.

Observation 2: The min-area min-cut replication problem can be solved by computing a minimum cut (Vi-Ri,Ri) in stage i s.t. |Ri| is minimized.

Page 13: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Example

A pre-partition:

Computing a minimum cut in stage 2:

Page 14: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Example (Cont’d)

Computed R2 = {j}

Page 15: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Network Modeling

Need to ensure that cut size = buffer requirement.

For a net (v1, {v2, …, vp}),

Page 16: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

The Case of Limited Slack Logic Capacity

The solution of min-area min-cut replication suffices if slack logic capacity is sufficiently large.

Otherwise, if |Ri| exceeds the slack, use a heuristic to reduce Ri.

Use a repeated max-flow min-cut heuristic to gradually reduce Ri (so cut size is only increased gradually).

H. Yang, D.F. Wong, “Efficient Network Flow based Min-Cut Balanced Partitioning”, ICCAD’94.

Page 17: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Algorithm

Input: Stage area bound A.

1. Network modeling for bottleneck stage i.

2. Compute min-cut (Vi-Ri,Ri) s.t. |Ri| is minimized.

3. If |Vi+1|+|Ri| ≤ A, stop and return Ri. (Vi+1 denotes the set of nodes in stage i+1.)

4. Collapse a node in Ri with all nodes in Vi-Ri, goto 2.
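
The loop above can be sketched in a few dozen lines. This is only an illustration under stated assumptions, not the authors' implementation: the flow network for stage i is taken as already built (the slides' net model is not reproduced here), with a hypothetical super-source `s` standing for the side that stays in stage i and a super-sink `t` for stage i+1. Step 2 is done with a plain Edmonds-Karp max flow followed by a backward search from `t`, which yields the min cut with the smallest sink side. Step 4 is emulated by adding a very large capacity edge from `s` to one node of Ri, and the node is picked arbitrarily since the slides do not say which one to collapse.

```python
from collections import deque


def max_flow_residual(cap, s, t):
    """Edmonds-Karp max flow; returns (flow value, residual capacities)."""
    res = {s: {}, t: {}}
    for u, nbrs in cap.items():
        for v, c in nbrs.items():
            res.setdefault(u, {})
            res.setdefault(v, {})
            res[u][v] = res[u].get(v, 0) + c
            res[v].setdefault(u, 0)  # reverse edge for flow cancellation
    value = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value, res
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push


def minimal_sink_side(res, t):
    """Nodes that can still reach t in the residual graph.

    This set is the sink side of the min cut with the fewest sink-side nodes.
    """
    side = {t}
    queue = deque([t])
    while queue:
        v = queue.popleft()
        for u in res:
            if u not in side and res[u].get(v, 0) > 0:
                side.add(u)
                queue.append(u)
    return side


def replication_set(cap, s, t, next_stage_size, area_bound):
    """Return (cut size, Ri) with |Vi+1| + |Ri| <= area_bound."""
    cap = {u: dict(nbrs) for u, nbrs in cap.items()}  # work on a copy
    # Larger than any cut of the original network, so it is never cut.
    big = 1 + sum(c for nbrs in cap.values() for c in nbrs.values())
    while True:
        cut, res = max_flow_residual(cap, s, t)              # step 2
        r = minimal_sink_side(res, t) - {t}
        if not r or next_stage_size + len(r) <= area_bound:  # step 3
            return cut, r
        node = min(r)                   # step 4: arbitrary choice (assumption)
        cap.setdefault(s, {})[node] = big  # ties the node to the source side


# Hypothetical network for a bottleneck stage with nodes a, b, c.
example = {"s": {"a": 3}, "a": {"b": 1, "c": 1}, "b": {"t": 2}, "c": {"t": 2}}
print(replication_set(example, "s", "t", next_stage_size=3, area_bound=5))
print(replication_set(example, "s", "t", next_stage_size=3, area_bound=4))
```

With enough slack (bound 5) the sketch returns the unconstrained min cut of size 2 with Ri = {b, c}. With a tighter bound of 4, one node is forced back to the source side and the cut grows only to 3 with Ri = {c}, which mirrors the gradual increase the heuristic aims for.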

Page 18: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Experimental Results

Circuit   #buf w/o rep.   #buf w/ rep.   Imprv. %   Rep. %

C3540               198            194       2.02     0.48

C5315               140            129       7.86     0.67

C6288                83             63      24.10     4.41

C7552               210            176      16.19     3.12

S13207              688            669       2.76     2.54

S15850              761            699       8.15     3.59

S35932             2729           2636       3.41     2.48

S38417             2194           2104       4.10     0.63

S38584             2280           2137       6.27     0.98
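
The improvement column appears to be the relative reduction in buffer count, (before − after) / before. That formula is inferred from the numbers rather than stated on the slide; the short check below recomputes it from the two buffer columns. The names `results` and `improvement_percent` are illustrative.

```python
# Buffer counts copied from the table: (without replication, with replication).
results = {
    "C3540": (198, 194),
    "C5315": (140, 129),
    "C6288": (83, 63),
    "C7552": (210, 176),
    "S13207": (688, 669),
    "S15850": (761, 699),
    "S35932": (2729, 2636),
    "S38417": (2194, 2104),
    "S38584": (2280, 2137),
}


def improvement_percent(before, after):
    """Relative buffer reduction in percent, rounded to two decimals."""
    return round(100.0 * (before - after) / before, 2)


improvements = {
    name: improvement_percent(before, after)
    for name, (before, after) in results.items()
}

for name, value in improvements.items():
    print(f"{name:8s} {value:6.2f}")
```

The recomputed values match the reported column for all nine circuits, with C6288 showing the largest reduction and C3540 the smallest.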

Page 19: Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning

Conclusions

Proposed temporal logic replication to reduce buffering requirement in DRFPGA partitioning.

Presented an effective 2-step approach.

Formulated and optimally solved the min-area min-cut replication problem.

Extended to case of limited slack logic capacity.

In the paper, a new timing-driven temporal partitioning algorithm was introduced to compute pre-partition.