Optimization of Microprogram Control Unit with Code Sharing
A. Barkalov, L. Titarenko, L. Smolinski
Institute of Computer Engineering and Electronics, Podgórna 50, Zielona Góra, Poland
E-mail: {A.Barkalov, L.Titarenko}@iie.uz.zgora.pl, [email protected]
Abstract
A hardware reduction method is proposed for compositional microprogram control units with code sharing implemented on PAL-based CPLD chips. The method exploits the wide fan-in of PAL macrocells, which allows more than one source to be used for the codes of operational linear chains. An example of the application of the proposed method is given.
1. Introduction
Control units are very important parts of digital systems [1]. Nowadays, complex programmable logic devices (CPLD) are widely used for implementing logic circuits of control units [2, 3]. They include macrocells of programmable array logic (PAL), which have a wide fan-in and a limited number of terms per macrocell. To reduce the amount of hardware in the logic circuit of a control unit, the peculiarities of CPLD should be taken into account, as well as the features of the control algorithm to be implemented. If a control algorithm is represented by a linear graph-scheme of algorithm (GSA), then the model of compositional microprogram control unit (CMCU) with code sharing [4] can be used for its interpretation. In this case the codes of classes of pseudoequivalent operational linear chains (OLC) [4] can be represented by more than one source due to the wide fan-in of PAL macrocells. In this article we propose a method of CMCU logic circuit optimization based on the use of two sources of codes.
2. CMCU with code sharing
Let a GSA $\Gamma$ be represented by sets of vertices $B$ and arcs $E$. Let $B = \{b_0, b_E\} \cup E_1 \cup E_2$, where $b_0$ is an initial vertex, $b_E$ is a final vertex, $E_1$ is a set of operator vertices, where $|E_1| = M$, and $E_2$ is a set of conditional vertices. A vertex $b_q \in E_1$ contains a microinstruction $Y(b_q) \subseteq Y$, where $Y = \{y_1, \ldots, y_N\}$ is a set of data-path microoperations [1]. Each vertex $b_q \in E_2$ contains a single element of the set of logical conditions $X = \{x_1, \ldots, x_L\}$.
Let us form a set of operational linear chains (OLC) $C = \{\alpha_1, \ldots, \alpha_G\}$ for GSA $\Gamma$, where each OLC $\alpha_g \in C$ is a sequence of operator vertices and each pair of its adjacent components corresponds to some arc of the GSA. Each OLC $\alpha_g \in C$ has only one output $O_g$ and an arbitrary number of inputs. Formal definitions of an OLC, its inputs and its output can be found in [4].
Let us call a GSA $\Gamma$ a linear GSA (LGSA) if the following condition takes place:
$M / G \geq 2$ (1)
Each vertex $b_q \in E_1$ corresponds to a microinstruction $MI_q$ kept in the control memory (CM) of the CMCU and has an address $A(b_q)$. The microinstruction can be addressed using
$R = \lceil \log_2 M \rceil$ (2)
978-1-4577-1958-5/11/$26.00 ©2011 IEEE
bits represented by variables $T_r \in T = \{T_1, \ldots, T_R\}$.
Let OLC $\alpha_g \in C$ include $F_g$ components and let $Q = \max(F_1, \ldots, F_G)$. Let a binary code $K(\alpha_g)$ correspond to OLC $\alpha_g \in C$ and let it have $R_G$ bits, where
$R_G = \lceil \log_2 G \rceil$ (3)
Let a binary code $K(b_q)$ correspond to each component of OLC $\alpha_g \in C$ and let it have $R_Q$ bits, where
$R_Q = \lceil \log_2 Q \rceil$ (4)
Let the components be encoded in such a way that the following condition takes place:
$K(b_{g_{i+1}}) = K(b_{g_i}) + 1$ (5)
where $i = 1, \ldots, F_g - 1$ and $g = 1, \ldots, G$. Let us use variables $\tau_r \in \tau$ for encoding of OLC and variables $T_r \in T$ for encoding of their components, where $|\tau| = R_G$ and $|T| = R_Q$. Let the following condition take place:
$R_G + R_Q = R$ (6)
In this case a linear GSA $\Gamma$ can be interpreted using the model of CMCU $U_1$ with code sharing (Fig. 1).
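The bit widths in (2)-(4) and the checks (1) and (6) are simple to evaluate numerically. The following is a minimal sketch, not part of the original method description; the function names `bits_for` and `code_sharing_widths` are hypothetical, and condition (1) is read here as $M / G \geq 2$. The sample values $M = 24$, $G = 8$, $Q = 4$ are those of the example in Section 4.

```python
def bits_for(n: int) -> int:
    """Return ceil(log2(n)) for n >= 1 using exact integer arithmetic."""
    return (n - 1).bit_length()


def code_sharing_widths(M: int, G: int, Q: int) -> dict:
    """Evaluate (1)-(4) and (6).

    M: number of operator vertices, G: number of OLC,
    Q: number of components in the longest OLC.
    """
    R, R_G, R_Q = bits_for(M), bits_for(G), bits_for(Q)
    return {
        "R": R,                              # address width, formula (2)
        "R_G": R_G,                          # OLC code width, formula (3)
        "R_Q": R_Q,                          # component code width, formula (4)
        "is_linear": M >= 2 * G,             # condition (1), as read here
        "code_sharing_ok": R_G + R_Q == R,   # condition (6)
    }


print(code_sharing_widths(M=24, G=8, Q=4))
```

If `code_sharing_ok` is false, the concatenated code would be wider than the minimal address, so the control memory would need more words than the plain addressing of (2) requires.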
Figure 1. Structural diagram of CMCU U1
In CMCU $U_1$, a block of microinstruction addressing (BMA) implements the system of input memory functions for counter CT and register RG:
$\Phi = \Phi(\tau, X),$
$\Psi = \Psi(\tau, X).$
(7)
Let us point out that in the case of CMCU $U_1$ the address of a microinstruction is represented as
$A(b_q) = K(\alpha_g) * K(b_q),$ (8)
where $b_q$ is a component of OLC $\alpha_g \in C$ and "*" is the sign of concatenation. The CMCU $U_1$ operates in the following order.
If $Start = 1$, then an initial address (all zeros) is loaded into RG and CT. At the same time, flip-flop TF is set, which causes $Fetch = 1$, and microinstructions can be read out of the control memory. Each cell of CM keeps microoperations $y_n \in Y$ and special variables $y_0$ and $y_E$. If $y_0 = 1$, then the current content of CT is incremented; otherwise both CT and RG are loaded from BMA. The first case corresponds to a transition from any OLC component except its output. The second case corresponds to a transition from an OLC output. If $y_E = 1$, then flip-flop TF is reset, $Fetch = 0$, and the operation of the CMCU is terminated. This corresponds to a transition from a vertex $b_q \in E_1$ such that $\langle b_q, b_E \rangle \in E$. Pulse Clock is used for timing of the CMCU.
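The operating cycle described above can be sketched as a small behavioural model. This is an illustrative simplification under stated assumptions, not the authors' implementation: the names `run_cmcu`, `cm` and `bma` are hypothetical, the control memory is a dictionary keyed by the integer address, and the BMA is modelled as a callable that hides the logical conditions $X$.

```python
def run_cmcu(cm, bma, r_q, max_steps=1000):
    """Behavioural sketch of the CMCU U1 sequencing.

    cm:  dict mapping address -> (microoperations, y0, yE)
    bma: callable (rg, ct) -> (new_rg, new_ct), a stand-in for the BMA
    r_q: number of bits of the counter CT (R_Q)
    """
    rg, ct = 0, 0                     # Start = 1: all-zero address is loaded
    trace = []
    for _ in range(max_steps):        # Fetch = 1 while TF is set
        address = (rg << r_q) | ct    # concatenation K(alpha_g) * K(b_q)
        microops, y0, yE = cm[address]
        trace.append((address, microops))
        if yE:                        # TF is reset, Fetch = 0, operation stops
            break
        if y0:                        # inside an OLC: increment CT
            ct += 1
        else:                         # OLC output: load RG and CT from BMA
            rg, ct = bma(rg, ct)
    return trace


# Toy control memory with two OLC of two components each (hypothetical data).
toy_cm = {
    0: (["y1"], 1, 0),   # first component of OLC 0
    1: (["y2"], 0, 0),   # output of OLC 0
    2: (["y3"], 1, 0),   # first component of OLC 1
    3: (["y4"], 0, 1),   # output of OLC 1, connected with the final vertex
}
toy_trace = run_cmcu(toy_cm, bma=lambda rg, ct: (1, 0), r_q=1)
print([address for address, _ in toy_trace])
```

Keeping RG and CT as separate values mirrors the code-sharing idea: only CT changes inside a chain, while both registers change on a transition between chains.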
Let us point out that OLC $\alpha_i, \alpha_j \in C$ are pseudoequivalent OLC (POLC) [4] if their outputs are connected with the input of the same vertex of GSA $\Gamma$. The amount of hardware in the logic circuit of BMA can be decreased by introducing a special block, named a code transformer (TC) [4], which transforms the OLC codes into the codes of the classes of pseudoequivalent OLC. However, the TC consumes some resources of the chip in use. In this article we propose to use free cells of CM for representing the codes of the classes of POLC (the first source of codes). At the same time, the register RG can be used as the second source of the codes of the classes of POLC.
3. Main idea of proposed method
Let $C_1 \subseteq C$ be a set of OLC, where $\alpha_g \in C_1$ if its output is not connected with the final vertex $b_E$. Let $\Pi_C = \{B_1, \ldots, B_I\}$ be a partition of the set $C_1$ into the classes of POLC. Let us encode the OLC $\alpha_g \in C_1$ in such a way that the majority of classes $B_i \in \Pi_C$ are represented by a single generalized interval of the $R_G$-dimensional Boolean space. The well-known algorithm ESPRESSO [5] can be used for this encoding. Let $\Pi_C = \Pi_A \cup \Pi_B$, where $B_i \in \Pi_A$ if the class is represented by a single interval, otherwise $B_i \in \Pi_B$.
If condition
$\Pi_B = \emptyset$ (9)
takes place, then the register RG represents all codes $K(B_i)$ for classes $B_i \in \Pi_C$. In this case the BMA implements $H_0$ terms, where $H_0$ is the number of transitions of the Mealy finite state machine (FSM) used for microinstruction addressing [4]. This is the minimal possible number of transitions.
Let us point out that the logic circuits for BMA, RG, CT and TF are implemented using PAL macrocells. To implement the CM, some external PROM chips should be used having $t$ outputs, where $t \in \{1, 2, 4, 8, 16\}$ [2, 3]. Let the one-hot encoding of microoperations [6] be used in CMCU $U_1$. In this case each word of CM includes $N + 2$ bits. The number 2 is added to $N$ to take into account the additional variables $y_0$ and $y_E$ (Fig. 1). If each PROM has $t$ outputs and not less than $M$ words, then $K_0$ chips are sufficient for implementing CM, where
$K_0 = \lceil (N + 2) / t \rceil$ (10)
Obviously, there are $R_0$ free outputs of PROM chips, where
$R_0 = K_0 t - N - 2$ (11)
These outputs can be used for encoding of classes $B_i \in \Pi_B$, where
$R_B = \lceil \log_2 (|\Pi_B| + 1) \rceil$ (12)
The number 1 is added to $|\Pi_B|$ to indicate the situation when $B_i \notin \Pi_B$. If condition
$R_0 \geq R_B$ (13)
takes place, then all classes $B_i \in \Pi_B$ are represented by CM. Otherwise, only $I_{CM}$ classes can be represented by the control memory, where
$I_{CM} = 2^{R_0}$ (14)
The rest of the classes $B_i \in \Pi_B$ should be moved into $\Pi_A$ and represented by RG. In both cases we propose to use the CMCU $U_2$ (Fig. 2) for interpretation of an LGSA $\Gamma$.
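The arithmetic of (10)-(13) can be checked with a few lines of code. This is a minimal sketch under the reading of those formulas given in the comments; the function name `cm_free_outputs` and its parameter names are hypothetical. The sample call uses the values of the example in Section 4 ($N = 13$, $t = 4$, one class in $\Pi_B$).

```python
def cm_free_outputs(N: int, t: int, classes_in_pi_b: int):
    """Return (K0, R0, R_B, fits) for a control memory built from PROM chips.

    N: number of microoperations (one-hot encoded),
    t: number of outputs of one PROM chip,
    classes_in_pi_b: number of classes that RG cannot represent (|Pi_B|).
    """
    K0 = -(-(N + 2) // t)                 # ceil((N + 2) / t), formula (10)
    R0 = K0 * t - N - 2                   # free PROM outputs, formula (11)
    # ceil(log2(n + 1)) equals n.bit_length() for n >= 0, formula (12)
    R_B = classes_in_pi_b.bit_length()
    return K0, R0, R_B, R0 >= R_B         # last item is condition (13)


print(cm_free_outputs(N=13, t=4, classes_in_pi_b=1))
```

When the last element is false, only part of the classes can be encoded in the free outputs, and the remaining ones have to be represented by RG.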
Figure 2. Structural diagram of CMCU U2
In CMCU $U_2$, the block BMA implements functions
$\Phi = \Phi(\tau, Z, X),$
$\Psi = \Psi(\tau, Z, X),$
(15)
where variables $z_r \in Z$ are used for encoding of classes $B_i \in \Pi_B$, $|Z| = R_0$. The control memory CM implements functions $Z$, $Y$, $y_0$ and $y_E$, which depend on variables from the sets $\tau$ and $T$. The principles of operation for both $U_1$ and $U_2$ are practically identical. The following design method for CMCU $U_2$ is proposed in this article:
1. Construction of sets $C$, $C_1$ and $\Pi_C$ for LGSA $\Gamma$.
2. Encoding of OLC $\alpha_g \in C$ and their components.
3. Construction of partitions $\Pi_A$ and $\Pi_B$.
4. Encoding of classes $B_i \in \Pi_B$.
5. Construction of CMCU transition table.
6. Specification of the control memory.
7. Implementation of CMCU logic circuit.
4. Example of proposed method
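Let the sets $C = \{\alpha_1, \ldots, \alpha_8\}$, $C_1 = \{\alpha_1, \ldots, \alpha_7\}$ and $\Pi_C = \{B_1, \ldots, B_4\}$ be formed for a GSA $\Gamma_1$, where $B_1 = \{\alpha_1\}$, $B_2 = \{\alpha_2, \alpha_3, \alpha_4\}$, $B_3 = \{\alpha_5, \alpha_6\}$, $B_4 = \{\alpha_7\}$, $\alpha_1 = \langle b_1, b_2, b_3 \rangle$, $\alpha_2 = \langle b_4, b_5, b_6, b_7 \rangle$, $\alpha_3 = \langle b_8, b_9, b_{10} \rangle$, $\alpha_4 = \langle b_{11}, \ldots, b_{14} \rangle$, $\alpha_5 = \langle b_{15}, b_{16} \rangle$, $\alpha_6 = \langle b_{17}, b_{18}, b_{19} \rangle$, $\alpha_7 = \langle b_{20}, b_{21} \rangle$, $\alpha_8 = \langle b_{22}, b_{23}, b_{24} \rangle$.
As we can see, $G = 8$, $R_G = 3$, $Q = 4$, $R_Q = 2$, $M = 24$, $R = 5$. Therefore, condition (6) takes place and code sharing can be applied. Let us point out that condition (1) takes place and GSA $\Gamma_1$ is an LGSA.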
Let us encode OLC $\alpha_g \in C$ in the way shown in Fig. 3 using variables $\tau_r \in \tau$, where $|\tau| = 3$.
Figure 3. Codes of OLC for GSA $\Gamma_1$
The components of OLC $\alpha_g \in C$ are encoded in a trivial way [4]: the first component has the code 00, the second has the code 01, and so on, to satisfy condition (5). Recall that variables $T_1$, $T_2$ are used for the component encoding.
From Fig. 3 it can be found that classes $B_1$, $B_3$ and $B_4$ are represented by single intervals of Boolean space. These intervals are treated as the codes $K(B_i)$, namely $K(B_1) = 000$, $K(B_3) = 10*$, $K(B_4) = 111$. The class $B_2 \in \Pi_C$ is represented by the intervals 00* and 01*. This analysis shows that $\Pi_A = \{B_1, B_3, B_4\}$ and $\Pi_B = \{B_2\}$, $R_B = 1$.
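Whether a class of POLC falls into the set of classes represented by RG can be tested mechanically once the OLC codes are known. The sketch below is a simplified illustration, not the ESPRESSO-based encoding itself: the function name `single_interval` is hypothetical, and unused codes are not treated as don't-cares, which a real encoder could exploit. The codes 100 and 101 for the two OLC of class $B_3$ follow from the addresses quoted in this section; the three-code set in the second call is a hypothetical example.

```python
def single_interval(codes):
    """Return the interval (cube) covering the codes if they fill it exactly.

    codes: iterable of equal-length binary strings, e.g. {"100", "101"}.
    Returns a string over {'0', '1', '*'} or None if one interval is not enough.
    """
    codes = set(codes)
    # A position keeps its value if all codes agree there, else becomes '*'.
    cube = "".join(
        column[0] if len(set(column)) == 1 else "*"
        for column in zip(*codes)
    )
    # The cube contains 2**k points for k free positions; the class is a
    # single interval only if its codes occupy every point of the cube.
    return cube if len(codes) == 2 ** cube.count("*") else None


print(single_interval({"100", "101"}))         # class with two adjacent codes
print(single_interval({"001", "010", "011"}))  # hypothetical three-code class
```

A class for which the function returns `None` needs more than one interval and is therefore a candidate for encoding in the free outputs of the control memory.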
Let $N = 13$ for the GSA $\Gamma_1$ and let us use PROM chips with $t = 4$ for implementation of CM. In this case $K_0 = 4$, $R_0 = 1$ and condition (13) takes place. The variable $z_1 \in Z$ is used for encoding of the class $B_2$. Let $K(B_2) = 1$, whereas $z_1 = 0$ indicates the situation $B_i \notin \Pi_B$.
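Let transitions for classes $B_i \in \Pi_C$ be represented by the following system of generalized formulae of transitions (GFT) [4]:
$B_1 \to x_1 b_4 \vee \bar{x}_1 x_2 b_6 \vee \bar{x}_1 \bar{x}_2 x_3 b_8 \vee \bar{x}_1 \bar{x}_2 \bar{x}_3 b_{14};$
$B_2 \to x_2 b_{15} \vee \bar{x}_2 x_3 b_{17} \vee \bar{x}_2 \bar{x}_3 x_4 b_{20} \vee \bar{x}_2 \bar{x}_3 \bar{x}_4 b_{?};$
$B_3 \to x_5 b_{19} \vee \bar{x}_5 b_{11};$
$B_4 \to x_1 b_{17} \vee \bar{x}_1 b_{22}.$
(16)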
This system is the base for construction of the transition table having the following columns: $B_i$, $K_A(B_i)$, $K_B(B_i)$, $b_q$, $A(b_q)$, $X_h$, $\Psi_h$, $\Phi_h$, $h$. Here $K_A(B_i)$ is a code $K(B_i)$, where $B_i \in \Pi_A$; $K_B(B_i)$ is a code for $B_i \in \Pi_B$; $X_h$ is an input signal taken from the system of GFT; $\Psi_h$ is a set of input memory functions loading the code $K(\alpha_g)$ into RG; $\Phi_h$ is a set of input memory functions loading the code $K(b_q)$ into CT; $h = 1, \ldots, H_0$ is the number of the transition. In our case $\Psi = \{D_1, D_2, D_3\}$, $\Phi = \{D_4, D_5\}$. Let us find some addresses of microinstructions. For example, the vertex $b_{20}$ is the first component of OLC $\alpha_7 \in C$. Therefore, $A(b_{20}) = 11100$. By analogy we can find that $A(b_{11}) = 01000$, $A(b_{15}) = 10000$, $A(b_{17}) = 10100$ and $A(b_{19}) = 10110$.
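The addresses above follow directly from the concatenation rule (8) and the trivial component encoding. A minimal sketch is given below; the function name `microinstruction_address` is hypothetical, the component index is counted from zero, and the OLC codes 111, 010 and 101 are the ones implied by the addresses quoted in the text (Fig. 3 itself is not reproduced here).

```python
def microinstruction_address(olc_code: str, component_index: int, r_q: int) -> str:
    """Concatenate the OLC code with the binary code of the component.

    olc_code:        binary code K(alpha_g) as a string, e.g. "111"
    component_index: position of the vertex inside its OLC, starting at 0
    r_q:             number of bits reserved for component codes (R_Q)
    """
    if component_index >= 2 ** r_q:
        raise ValueError("component index does not fit into r_q bits")
    return olc_code + format(component_index, f"0{r_q}b")


print(microinstruction_address("111", 0, 2))  # first component of alpha_7
print(microinstruction_address("101", 2, 2))  # third component of alpha_6
```

Because consecutive components differ by one in the low-order bits, moving along a chain only requires incrementing the counter, which is what condition (5) guarantees.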
Let the symbol $U_i(\Gamma_j)$ stand for interpretation of GSA $\Gamma_j$ using the model $U_i$ ($i = 1, 2$). The part of the table of transitions for classes $B_2, B_3 \in \Pi_C$ is shown in Table 1.
Table 1. Table of CT for FSM $U_2(\Gamma_1)$
This table is used for construction of system (15). For example, the following equations can be derived from Table 1: $D_2 = z_1 \bar{x}_2 \bar{x}_3 \vee \tau_1 \bar{\tau}_2 \bar{z}_1 \bar{x}_5$ (after minimization); $D_4 = \tau_1 \bar{\tau}_2 \bar{z}_1 x_5$. The control memory of CMCU $U_2(\Gamma_1)$ is specified using well-known methods [4]. This step is omitted in our article, as well as the step of implementation of the logic circuit. Let us point out that $H_0 = 12$, which is determined by the total number of terms in system (16). In the case of CMCU $U_1(\Gamma_1)$ we have $H_0 = 20$. This means that application of the proposed method decreases this value by a factor of 1.67. We can expect that the numbers of PAL macrocells in the logic circuits of BMA have the same ratio [4].
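The way rows of the transition table turn into terms of the input memory functions can be sketched as follows. This is an illustration under explicit assumptions, not the authors' tool: the function name `input_memory_terms` and the textual term notation (`t1` for a code variable of RG, `~` for negation) are hypothetical, and the two rows encode the transitions of class $B_3$ as read here from system (16), namely to $b_{19}$ (address 10110) under $x_5$ and to $b_{11}$ (address 01000) otherwise.

```python
def input_memory_terms(rows, address_bits):
    """Collect product terms for D_1..D_R from transition-table rows.

    rows: list of (class_term, condition, target_address) where
          class_term identifies the source class, condition is the input
          signal X_h and target_address is a binary string A(b_q).
    Each row contributes its term to every D_r whose address bit is 1.
    """
    terms = {f"D{r}": [] for r in range(1, address_bits + 1)}
    for class_term, condition, address in rows:
        product = f"{class_term} {condition}".strip()
        for r, bit in enumerate(address, start=1):
            if bit == "1":
                terms[f"D{r}"].append(product)
    return terms


# Assumed rows for class B3: code 10* in RG and z1 = 0 (class not in CM).
rows_b3 = [
    ("t1 ~t2 ~z1", "x5", "10110"),    # transition to b19
    ("t1 ~t2 ~z1", "~x5", "01000"),   # transition to b11
]
b3_terms = input_memory_terms(rows_b3, address_bits=5)
print(b3_terms["D2"], b3_terms["D4"])
```

The total number of rows processed this way corresponds to the number of terms $H_0$ that the BMA has to implement before minimization.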
6. References
[1] Baranov S. Logic Synthesis for Control Automata. Kluwer Academic Publishers, 1994. 312 pp.
[2] http://www.altera.com.
[3] http://www.xilinx.com.
[4] Barkalov A., Titarenko L. Logic Synthesis for Compositional Microprogram Control Units. Berlin: Springer, 2008. 272 pp.
[5] Maxfield C. The Design Warrior's Guide to FPGAs. Amsterdam: Elsevier, 2004. 541 pp.