VLSI Implementation Styles


44-1 2006 by CRC Press LLC

44 Full-Custom and Semi-Custom Design

CONTENTS

44.1 Introduction

44.1.1 Semi-Custom Design

44.1.2 Full-Custom Design

44.1.3 Motivation for Semi-Custom Design

44.2 Full-Custom Design Sequence of a Digital System

References

44.1 Introduction

As integrated circuits become more inexpensive and compact, many new types of products, such as digital

cameras, digital camcorders, and digital television [2], are being introduced, based on digital systems.

Consequently, logic design must be done under many different motivations. Since each case is different,

we have different design problems. For example, we have to choose an appropriate IC (integrated circuit)

logic family, since these cases have different performance requirements (scientific computers require high

speed, but wristwatches require very low power consumption), although in recent years, CMOS has been more widely used than other IC logic families, such as ECL, which has been used for fast computers.

Logic functions that are frequently used by many designers, such as a full adder, are commercially

available as off-the-shelf IC packages. (A package means an IC chip or a discrete component encased in

a container.) Logic networks that realize such logic functions are often called standard (logic) networks.

A single component, such as a resistor or a capacitor, is also commercially available as an off-the-shelf

discrete component package. Logic networks can be assembled with these off-the-shelf packages. In many

cases, not only performance requirements but also compactness and low cost are very important for

products such as digital cameras. So, digital systems must accordingly be realized in IC packages that are

designed, being tailored to specific objectives, rather than assembling many of these off-the-shelf packages

on pc-boards, although assembling with these off-the-shelf packages has the advantage of ease of partial

design changes.

Here, however, let us consider two important cases of designing the IC chip inside such a tailored

(not off-the-shelf) IC package. They lead to two sharply contrasting logic design approaches: quick design

and high-performance design. Quick design of IC chips is called semi-custom design (recently called

ASIC design, abbreviating Application Specific Integrated Circuit design), whereas deliberate design for

high performance is called full-custom design because full-custom design is fully customized to high

performance. Full-custom design is discussed in this chapter, and different approaches to semi-custom

design are discussed in the succeeding chapters.

Saburo Muroga, University of Illinois at Urbana-Champaign



The second term on the right-hand side of Eq. 44.1, [Manufacturing cost per IC package], is fairly

proportional to the size of each chip when the complexity of manufacturing is determined, being usually

on the order of dollars, or tens of dollars in the case of commercial chips. In the case of full-custom

design, chips are deliberately designed by many designers spending many months. So, [Design expenses],

the first term on the right-hand side of Eq. 44.1 is very high and can easily be on the order of tens of

millions of dollars. Thus, the first term is far greater than the second term, making [Total cost of an IC

package] very expensive, unless [Production volume] is very large, being on the order of more than tens

of millions. Many digital systems that use IC chips are produced in low volume and [Design expenses]

must be very low. Semi-custom design is for this purpose and CAD programs need to be used extensively

for shortening design time and reducing manpower in order to reduce [Design expenses]. In this case,

[Manufacturing cost per IC package] is higher than that in the case of full-custom design because the size of each

chip is larger.

Thus, we can see the following from the formula in Eq. 44.1: chips by semi-custom design are cheaper

than those by full-custom design in low production volume, but more expensive in high production

volume. Conversely, chips by full-custom design are cheaper in high-volume production but more

expensive in low-volume production.
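
The trade-off can be made concrete with a small numerical sketch. It assumes that Eq. 44.1 (not reproduced in this excerpt) has the form [Total cost of an IC package] = [Design expenses]/[Production volume] + [Manufacturing cost per IC package], which is what the references to its first and second terms suggest. The function names and all dollar figures are hypothetical and chosen only to show the shape of the comparison.

```python
def total_cost_per_package(design_expenses, production_volume, manufacturing_cost):
    """Assumed form of Eq. 44.1: amortized design cost plus manufacturing cost."""
    return design_expenses / production_volume + manufacturing_cost


def break_even_volume(design_a, mfg_a, design_b, mfg_b):
    """Volume v at which approaches A and B cost the same per package.

    Solves design_a / v + mfg_a == design_b / v + mfg_b for v.
    """
    return (design_a - design_b) / (mfg_b - mfg_a)


# Hypothetical figures (assumptions, not taken from the text).
FULL_CUSTOM = {"design": 20_000_000.0, "mfg": 5.0}   # high design cost, small chip
SEMI_CUSTOM = {"design": 500_000.0, "mfg": 15.0}     # low design cost, larger chip

if __name__ == "__main__":
    for volume in (10_000, 1_000_000, 10_000_000):
        full = total_cost_per_package(FULL_CUSTOM["design"], volume, FULL_CUSTOM["mfg"])
        semi = total_cost_per_package(SEMI_CUSTOM["design"], volume, SEMI_CUSTOM["mfg"])
        print(f"volume={volume:>10,}  full-custom=${full:,.2f}  semi-custom=${semi:,.2f}")
    crossover = break_even_volume(
        FULL_CUSTOM["design"], FULL_CUSTOM["mfg"],
        SEMI_CUSTOM["design"], SEMI_CUSTOM["mfg"],
    )
    print(f"break-even volume: {crossover:,.0f} packages")
```

With these assumed figures, semi-custom design is cheaper below the break-even volume and full-custom design is cheaper above it, which is the qualitative behavior described above.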

44.2 Full-Custom Design Sequence of a Digital System

Full-custom design flow of a digital system follows a long sequence of different design stages, as follows.

First, the architecture of a digital system is designed by a few people. The performance or cost of the

entire system is predominantly determined by architectural design, which must be done based on good

knowledge of all other aspects of the system, including logic design and also software to be run. If an

inappropriate architecture is chosen, the best performance or lowest cost of the system cannot be achieved,

even if logic networks, or other aspects like software, are designed to yield the best results. For example,

if ROM-based microprogramming is chosen for the control logic of a microcomputer, the ROM occupies too

much of the precious chip area, sacrificing performance and cost, although we have the advantages of

short design time and design flexibility. Thus, if performance or manufacturing cost is important,

realization of control logic by logic networks (i.e., hard-wired control logic) is preferred. Actually, every

design stage is important for the performance of the entire system. Logic design is also one of the key factors

for computer performance, along with architecture design, transistor circuit design, layout design, compilers,

and application programs. Even if other factors are the same, computer speed can be significantly

improved by deliberate logic design.

Next, appropriate IC logic families and the corresponding transistor circuit technology are chosen for

each segment of the system. Other aspects such as memories are simultaneously determined in greater

detail. We do not use expensive, high-speed IC logic families where speed is not required.

Architecture and transistor circuits are outside the scope of this handbook, so they are not discussed

here further.

The next stage in the design sequence is the design of logic networks, considering cost reduction and

the highest performance, realizing functions for different segments of the digital system. Logic design

requires many engineers for a fairly long time.

Then, logic networks are converted into transistor circuits. This conversion is called technology mapping. It is difficult to realize the functions of the digital system with transistor circuits directly,

skipping logic design, although experienced engineers can carry out logic design and technology mapping

at the same time, at least partly. Logic design with AND, OR, and NOT gates, using conventional switching

theory, is convenient for human minds because AND, OR, and NOT gates in logic networks directly

correspond, respectively, to basic logic operations, AND, OR, and NOT in logic expressions. Thus, logic

design with AND, OR, and NOT gates is usually favored for manual design by designers and then followed

by technology mapping. For example, the logic network with AND and OR gates shown in Figure 44.1(a)

is technology-mapped into the MOS circuit shown in Figure 44.1(c). A variety of IC logic families, such



45 Programmable Logic Devices

CONTENTS

45.1 Introduction

45.2 PLAs and Variations

45.3 Logic Design with PLAs

45.4 Dynamic PLA

45.5 Advantages and Disadvantages of PLAs

45.5.1 Applications of PLAs

45.6 Programmable Array Logic

References

45.1 Introduction

Hardware realization of logic networks is generally very time-consuming and expensive. Also, once logic

functions are realized in hardware, it is difficult to change them. In some cases, we need logic networks

that are easily changeable. One such case is logic networks whose output functions need to be changed

frequently, such as control logic in microprocessors, or logic networks whose outputs need to be flexible, such as additional functions in wristwatches and calculators. Another case is logic networks that need

to be debugged before finalizing. Programmable logic devices (i.e., PLDs) are for this purpose. On these

PLDs, all transistor circuits are laid out on IC chips prior to designers' use, considering all anticipated

cases. With PLDs, designers can realize logic networks on an IC chip, by only deriving concise logic

expressions such as minimal sums or minimal products, and then making connections among pre-laid logic

gates on the chip. So, designers can realize their own logic networks quickly and inexpensively using these

pre-laid chips, because they need not design logic networks, transistor circuits, and layout for each design

problem. Thus, designers can save months of hardware design time. CAD programs for

deriving minimal sums or minimal products are well developed [1], so logic functions can be realized very

easily and quickly as hardware, using these CAD programs. The ease in changing logic functions without

changing hardware is just like programming in software, so the hardware in this case is regarded as

programmable. Programmable logic arrays (i.e., PLAs) and FPGAs are typical programmable logic devices.

PLDs consist of mask-programmable PLDs and field-programmable PLDs. Mask-programmable PLDs (i.e., MPLDs) can be made only by semiconductor manufacturers because connections are made

by custom masks. Manufacturers need to make only a few custom masks for connections, out of more than

20 masks in all, according to customers' specifications on what logic functions are to be realized. Unlike

mask-programmable PLDs, field-programmable PLDs (i.e., FPLDs) can be programmed by users and are

economical only for small production volume, whereas MPLDs are economical for high production

volume. Logic functions can be realized quicker on FPLDs than on MPLDs, saving payment of charges

Saburo Muroga, University of Illinois at Urbana-Champaign



45-4 The VLSI Handbook


by De Morgan's theorem. Thus, this is interpreted as a network of AND gates in the first level and

OR gates in the second (output) level, as illustrated in Figure 45.1(d). This is the reason why the

upper and lower matrices in Figure 45.1(a) are called AND and OR arrays, respectively. The vertical

lines which run through the two arrays in Figure 45.1(a) are called the product lines, since they

correspond to the product terms in disjunctive forms for the output functions f1, f2, and f3. Thus,

any combinational network (or networks) of AND and OR gates in two levels can be realized by a

PLA. The connections of MOSFET gates to horizontal or vertical lines are usually denoted by dots,

as shown in Figure 45.2.
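
The two-level structure can be sketched in a few lines of Python. This is a behavioral model only, under stated assumptions: each product line is described by the literals connected to it in the AND array, and each output by the product lines connected to it in the OR array. The names `evaluate_pla`, `AND_ARRAY`, and `OR_ARRAY` and the two example functions are hypothetical; they are not the functions of Figure 45.1.

```python
from itertools import product


def evaluate_pla(and_array, or_array, inputs):
    """Evaluate a two-level AND-OR network described as a PLA.

    and_array: one dict per product line, mapping an input name to the value
               (1 for the input itself, 0 for its complement) required on it.
    or_array:  maps each output name to the indices of its product lines.
    """
    product_lines = [
        all(inputs[name] == value for name, value in term.items())
        for term in and_array
    ]
    return {
        out: int(any(product_lines[i] for i in lines))
        for out, lines in or_array.items()
    }


# Hypothetical example: f1 = x.y + x'.z and f2 = x'.z share one product line.
AND_ARRAY = [{"x": 1, "y": 1}, {"x": 0, "z": 1}]
OR_ARRAY = {"f1": [0, 1], "f2": [1]}

if __name__ == "__main__":
    for x, y, z in product((0, 1), repeat=3):
        print(x, y, z, evaluate_pla(AND_ARRAY, OR_ARRAY, {"x": x, "y": y, "z": z}))
```

Sharing product line 1 between f1 and f2 is what lets a PLA realize several outputs with fewer product lines than separate networks would need.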

Sequential networks can also be easily realized on a PLA, as shown in Figure 45.2. Some outputs of

the OR array are connected to the inputs of master-slave flip-flops (usually J-K master-slave flip-flops),

whose outputs are in turn connected to the AND array as its inputs. More than one sequential network

can be realized on a single PLA, along with many combinational networks. Flip-flops can also be realized

inside the AND and OR arrays without providing them outside the arrays.
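
A minimal behavioral sketch of this feedback arrangement follows, assuming J-K flip-flops that obey the usual characteristic equation Q+ = J·Q' + K'·Q and ignoring master-slave timing. The 2-bit counter, the array contents, and all names are hypothetical illustrations, not the network of Figure 45.2.

```python
def evaluate_pla(and_array, or_array, signals):
    """Two-level AND-OR evaluation: product lines first, then the OR array."""
    product_lines = [
        all(signals[name] == value for name, value in term.items())
        for term in and_array
    ]
    return {
        out: int(any(product_lines[i] for i in lines))
        for out, lines in or_array.items()
    }


def jk_next_state(q, j, k):
    """Characteristic equation of a J-K flip-flop: Q+ = J.Q' + K'.Q."""
    return int((j and not q) or (not k and q))


# Hypothetical 2-bit counter. Product line 0 has no literals, so it is always 1;
# product line 1 is the fed-back flip-flop output q0.
AND_ARRAY = [{}, {"q0": 1}]
OR_ARRAY = {"j0": [0], "k0": [0], "j1": [1], "k1": [1]}


def clock_tick(state):
    """Advance the flip-flops by one clock pulse; state is {'q0': bit, 'q1': bit}."""
    pla_out = evaluate_pla(AND_ARRAY, OR_ARRAY, state)
    return {
        "q0": jk_next_state(state["q0"], pla_out["j0"], pla_out["k0"]),
        "q1": jk_next_state(state["q1"], pla_out["j1"], pla_out["k1"]),
    }


def run(n_ticks):
    """Return the sequence of (q1, q0) pairs starting from the reset state."""
    state = {"q0": 0, "q1": 0}
    sequence = [(state["q1"], state["q0"])]
    for _ in range(n_ticks):
        state = clock_tick(state)
        sequence.append((state["q1"], state["q0"]))
    return sequence


if __name__ == "__main__":
    print(run(4))
```

The OR-array outputs drive the J and K inputs, and the flip-flop outputs re-enter the AND array as ordinary inputs, which is all that is needed to turn the combinational PLA into a sequential network.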

In many PLAs, the option of an output f1 or its complement is provided in order to give flexibility,

as illustrated in the lower right-hand corner of Figure 45.2. By disconnecting one of the two marked connections at each

output, we can have either f1 or its complement as output, as illustrated in Figure 45.3. When f1 has too many products in its disjunctive form and cannot be realized on a PLA, its complement may have a sufficiently small

number of terms to be realizable on the PLA, or vice versa.

If the number of product lines in a PLA is too large, each horizontal line gets too long, with a significant

increase in parasitic capacitance. Then, if the majority of the MOSFET gates provided are connected to

this horizontal line, the input or its inverter has too many fan-out connections on this horizontal line.

Similarly, the total number of horizontal lines cannot be too large. In other words, the array size of a

PLA is limited because of speed considerations. In contrast, the size of a ROM can be much larger, since

we can use more than one decoder, or use a complex decoding scheme.

FIGURE 45.2 PLA with flip-flops and output-complementation choice. (Figure labels: inputs x, y, z; AND array; OR array; J-K master-slave flip-flops with Clock and Reset; outputs f1, f2, f3.)





The PLA shown in Figure 45.3, for example, is minimized for the given functions f1, f2, and f3, with 8

product lines and array size (2 × 4 + 3) × 8 = 88.

However, the minimization of the number of connections in a minimal two-level AND-OR network

may not be as important as the minimization of the number of AND gates, although it tends to reduce

the power consumption, because the chances of faulty PLAs can be greatly reduced by careful fabrication

of chips. But the PLA size is determined by the number of AND gates and cannot be changed by any

other factors. Also, instead of making connections (i.e., dots) as they become necessary on a PLA, a

PLA is sometimes prepared by disconnecting unnecessary connections by laser beam or by blowing

fuses after it has been manufactured with all MOSFET gates connected to the lines. In this case, the

chances of faults can be reduced by increasing the number of connections (i.e., the number of dots) in

the two-level AND-OR network.

For comparison with a PLA, the MOS realization of a ROM is shown in Figure 45.4. The upper matrix

is a decoder which has 2^n vertical lines if there are n input variables. The lower matrix stores information

by connecting or not connecting MOSFET gates. Figure 45.4 actually realizes the same output functions

(in negative logic) as those in Figure 45.1(a). The AND array in Figure 45.1(a) is essentially a counterpart

of the decoder in Figure 45.4, or the decoder may be regarded as a fixed AND array with 2^n product

lines, which is the maximum number of the product lines in a PLA. The AND array in Figure 45.1(a)

has only three vertical lines, whereas the decoder in Figure 45.4 has eight fixed vertical lines. This indicates

the compact information packing capability of PLAs. PLAs are smaller than ROMs, although the packing

advantage of PLAs varies, depending on functions. For example, if we construct a ROM that realizes

the functions of the PLA of Figure 45.3, in a manner similar to Figure 45.4, the decoder consists of 8

horizontal lines and 16 vertical lines, and the lower matrix for information storage consists of 16 vertical

lines and 3 horizontal lines. Thus, the ROM requires the array size of 16 × (8 + 3) = 176, compared with

88 in Figure 45.3.
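
The two array-size figures follow from the same row count, 2n + m for n inputs and m outputs, multiplied by the number of vertical lines: p product lines for a PLA and 2^n decoder lines for a ROM. The sketch below reproduces the arithmetic; the function names are hypothetical, and holding the PLA at 8 product lines while n grows is an assumption made only to show the trend.

```python
def pla_array_size(n_inputs, n_outputs, n_product_lines):
    """(2n + m) horizontal lines crossed by p vertical product lines."""
    return (2 * n_inputs + n_outputs) * n_product_lines


def rom_array_size(n_inputs, n_outputs):
    """Same horizontal lines, but the decoder fixes the vertical lines at 2**n."""
    return (2 ** n_inputs) * (2 * n_inputs + n_outputs)


if __name__ == "__main__":
    print("PLA:", pla_array_size(4, 3, 8))
    print("ROM:", rom_array_size(4, 3))
    # Assumed scenario: the functions still need only 8 product lines as n grows.
    for n in (4, 8, 12, 16):
        ratio = rom_array_size(n, 3) / pla_array_size(n, 3, 8)
        print(f"n={n:2d}  ROM/PLA size ratio = {ratio:,.0f}")
```

Because the row count cancels, the ratio is simply 2^n / p, so the ROM's size penalty doubles with every added input when the number of product lines stays fixed.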

FIGURE 45.4 ROM that corresponds to the PLA in Figure 45.1. (Figure labels: inputs x, y, z; decoder with vertical lines (111), (110), (101), (100), (011), (010), (001), (000); outputs f1, f2, f3; Vdd.)




Generally, the size difference between PLAs and ROMs sharply increases as the number of input

variables increases.

A PLA, however, cannot store some functions, such as x1 ⊕ x2 ⊕ … ⊕ xn if n is large, because 2^(n−1)

product lines are required and the number of these lines is excessively large for a PLA. (The horizontal

lines become too long with excessive fan-out and parasitic capacitance.) However, we can store these

functions in a ROM with an appropriate decoding scheme.
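
A minimal sketch of why such a function is costly, assuming the function in question is the parity (exclusive-OR) of all n inputs: no two input combinations that make parity equal to 1 differ in exactly one variable, so no two minterms can be merged into a larger product term, and every minterm needs its own product line. The function names are hypothetical.

```python
from itertools import combinations, product


def parity_minterms(n):
    """All input combinations for which x1 xor x2 xor ... xor xn equals 1."""
    return [bits for bits in product((0, 1), repeat=n) if sum(bits) % 2 == 1]


def hamming_distance(a, b):
    """Number of positions in which two input combinations differ."""
    return sum(x != y for x, y in zip(a, b))


def parity_product_lines(n):
    """Number of product lines needed for n-input parity in a two-level sum."""
    minterms = parity_minterms(n)
    # Two minterms can merge only if they differ in exactly one variable.
    # For parity that never happens, so each minterm stays a separate product.
    assert all(hamming_distance(a, b) >= 2 for a, b in combinations(minterms, 2))
    return len(minterms)


if __name__ == "__main__":
    for n in range(2, 9):
        print(n, parity_product_lines(n), 2 ** (n - 1))
```

The count doubles with each added variable, which is the growth that makes the horizontal lines of the PLA impractically long.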

Of course, in the case of ROMs, storing a truth table without worrying about conversion of given logic

functions into a minimal sum is convenient, although it makes the ROM size bigger than the PLA size.

Minimal two-level networks of AND and OR gates for the absolute minimization of the PLA size

can be derived by the minimization methods discussed in earlier chapters, if a function to be

minimized has either at most several variables, or many more variables but with a simple relationship

among its prime implicants [8]. But otherwise, we have to be content with near-minimal networks

instead of minimal networks. In many cases, efforts to reduce the PLA size, even without reaching

an absolute minimum, result in significant size reduction. Also, CAD programs have been developed

with heuristic minimization methods [12,13], such as the one by Hong et al. [7], which was the

first powerful heuristic procedure drastically different from conventional minimization procedures.

MINI, the PLA minimization program of Hong et al., was later improved into ESPRESSO by Rudell,

Brayton, et al. [1,10,11]. Recently, however, Coudert and Madre [26] developed a new method for

absolute minimization by implicitly expressing prime implicants and minterms using BDDs

described in Chapter 29. By this method, absolute minimization of functions with greater numbers

of variables is more feasible than before, although it is still time-consuming.
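
For small functions, the prime implicants that absolute minimization starts from can be enumerated directly. The sketch below is the textbook tabular (Quine-McCluskey style) merging step, shown only as an illustration; it is not MINI, ESPRESSO, or the BDD-based method mentioned above, and it omits the covering step that selects a minimal subset of prime implicants. Cubes are strings over '0', '1', and '-' with the most significant variable first; all names are hypothetical.

```python
from itertools import combinations


def merge(a, b):
    """Merge two cubes that differ in exactly one specified position, else None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None


def prime_implicants(minterms, n_vars):
    """Return the set of prime implicants of the function with the given minterms."""
    cubes = {format(m, f"0{n_vars}b") for m in minterms}
    primes = set()
    while cubes:
        merged, used = set(), set()
        for a, b in combinations(sorted(cubes), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        primes |= cubes - used   # cubes that merged with nothing are prime
        cubes = merged           # continue with the larger cubes
    return primes


if __name__ == "__main__":
    # f(x, y, z) = 1 whenever z = 1: four minterms collapse into one product.
    print(sorted(prime_implicants([1, 3, 5, 7], 3)))
    # A cyclic function: six prime implicants, of which three suffice as a cover.
    print(sorted(prime_implicants([0, 1, 2, 5, 6, 7], 3)))
```

Pairwise merging grows quickly with the number of variables, which is one reason heuristic programs are used for larger PLAs.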

45.4 Dynamic PLA

If we want to realize a PLA in CMOS, instead of the static nMOS circuits discussed in Chapter 33,

Section 33.3, in order to reduce power consumption, the PLA requires a large area because

we need both pMOS and nMOS subcircuits. Thus, instead of static CMOS, the dynamic CMOS illustrated in

Figure 45.5(a) is usually used. During the absence of a clock pulse of the first- and second-phase clocks,

φ1 and φ2 (i.e., during φ1 = φ2 = 0 (low voltage, using positive logic)) shown in Figure 45.5(b), pMOSFETs

T1, T2, and T3 become conductive and nMOSFETs T4, T5, and T6 become non-conductive, precharging vertical lines P1, P2, and P3. When a clock pulse of the first-phase clock, φ1, appears but a clock pulse of

the second-phase clock, φ2, has not yet appeared, i.e., when φ1 = 1 (high voltage) and φ2 = 0, pMOSFETs

T1, T2, and T3 become non-conductive and nMOSFETs T4, T5, and T6 become conductive. Then,

depending on the values of x, y, and z, some of the vertical lines P1, P2, and P3 are discharged through some

of the nMOSFETs in the AND array. (For example, if y = 0 (low voltage), P1 is discharged through

nMOSFET A.) A clock pulse of the second-phase clock, φ2, is still absent (i.e., φ2 = 0), so pMOSFETs

T7, T8, and T9 become conductive and nMOSFETs T10, T11, and T12 become non-conductive, precharging

horizontal lines f1, f2, and f3. When a clock pulse of the first-phase clock, φ1, is still present and a clock

pulse of the second-phase clock, φ2, appears, i.e., when φ1 = φ2 = 1, pMOSFETs T7, T8, and T9 become

non-conductive and nMOSFETs T10, T11, and T12 become conductive. Then, some of the horizontal lines f1,

f2, and f3 are discharged through some of the nMOSFETs in the OR array, depending on which of the

vertical lines P1, P2, and P3 are still charged.
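
The precharge-and-evaluate sequence can be mimicked with a small behavioral model. It is only a sketch under assumptions: transistors are ideal switches, a product line is discharged whenever any literal connected to it is not satisfied, and an output line is discharged whenever any product line connected to it is still charged, so a low output line indicates that the corresponding OR is true. The array contents and all names are hypothetical and do not reproduce Figure 45.5.

```python
def dynamic_pla_cycle(and_array, or_array, inputs):
    """Model one full clock cycle of a two-phase dynamic PLA.

    Returns (product_lines, output_lines) with 1 = still charged, 0 = discharged.
    """
    # Both clocks low: every vertical (product) line is precharged.
    product_lines = [1] * len(and_array)

    # First-phase clock high, second-phase clock low: the AND array evaluates.
    # A product line discharges if any literal connected to it is violated.
    for i, term in enumerate(and_array):
        if any(inputs[name] != value for name, value in term.items()):
            product_lines[i] = 0
    # Meanwhile the horizontal (output) lines are precharged.
    output_lines = {out: 1 for out in or_array}

    # Both clocks high: the OR array evaluates. An output line discharges if
    # any product line connected to it is still charged.
    for out, lines in or_array.items():
        if any(product_lines[i] for i in lines):
            output_lines[out] = 0
    return product_lines, output_lines


# Hypothetical arrays: P0 = x.y, P1 = x'.z; f1 uses P0 and P1, f2 uses P1.
AND_ARRAY = [{"x": 1, "y": 1}, {"x": 0, "z": 1}]
OR_ARRAY = {"f1": [0, 1], "f2": [1]}

if __name__ == "__main__":
    print(dynamic_pla_cycle(AND_ARRAY, OR_ARRAY, {"x": 1, "y": 1, "z": 0}))
    print(dynamic_pla_cycle(AND_ARRAY, OR_ARRAY, {"x": 0, "y": 0, "z": 0}))
```

Staggering the two clock phases lets the AND array settle before the OR array is allowed to discharge, which is why the output lines are precharged while the product lines evaluate.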

45.5 Advantages and Disadvantages of PLAs

PLAs, like ROMs which are more general, have the following advantages over random-logic gate networks,

where random-logic gate networks are those that are compactly laid out on an IC chip:

1. There is no need for the time-consuming logic design of random-logic gate networks and even

more time-consuming layout.

2. Design checking is easy, and design change is also easy.




PLAs have the following advantage and disadvantage, compared with ROMs:

For storing the same functions or tasks, PLAs can be smaller than ROMs; generally, the size

difference sharply increases as the number of input variables increases.

The small-size advantage of PLAs diminishes as the number of terms in a disjunctive form increases. Thus, PLAs cannot store complex functions, i.e., functions whose disjunctive forms

consist of many product terms.

45.5.1 Applications of PLAs

Considering the above advantages and disadvantages, PLAs have numerous unique applications. A

microprocessor chip uses many PLAs because of ease of design change and checking. In particular, PLAs are used

in its control logic, which is complex and requires many changes, even during its design. Also, PLAs are

used for code conversions, microprogram address conversions, decision tables, bus priority resolvers, and

memory overlay.

When a new product is to be manufactured in small volume or test-marketed, PLAs are a suitable choice. When

the new product is well received in the market and does not need further changes, PLAs can be replaced

by random-logic gate networks for low cost in high-volume production and for high speed. Also, a full-custom design approach is very time-consuming, probably taking months or years, but if PLAs are used

in the control logic, a number of different custom-design chips with high performance can be made

quickly by changing only one connection mask for the PLAs, although these chips cannot have drastically

different performance and functions.

45.6 Programmable Array Logic

A programmable array logic (PAL) is a special type of PLA where the OR array is not programmable.

In other words, in a PAL, the AND array is programmable but the OR array is fixed; whereas in a PLA,

both arrays are programmable. The advantage of PALs is the elimination of fuses in the OR array in

Figure 45.1(a) and special electronic circuits to blow these fuses. Since these special electronic circuits

and programmable OR array occupy a very large area, the area is significantly reduced in a PAL. Since single-output, two-level networks (i.e., many AND gates in the first level and one OR gate as the network

output) are needed most often in design practice, many single-output two-level networks which are

mutually unconnected are placed in some PAL packages.

In digital systems, many non-standard networks are still used because designers want to differentiate

their computers from those of competitors. But the logic functions that designers want are too diverse to be

standardized by semiconductor manufacturers. When off-the-shelf IC packages for standard networks,

including microprocessors and their peripheral networks, are assembled on pc boards, many non-

standard networks are usually required for interfacing them to other key networks or for minor modi-

fications. So, they require many discrete components and IC packages, each of which has a smaller number

of transistors, in addition to a microprocessor package with millions of gates, occupying a significant

share of the areas on pc boards. Now, we can make connections inside PALs, instead of custom-making

pc boards. Custom-made pc boards are expensive and time-consuming because connection patterns on

pc boards need to be designed, these pc boards need to be manufactured, and then the pins of IC packages

have to be soldered into the holes of the pc boards. Replacement by PAL packages can substantially reduce

the area, time, and cost. If we consider related factors such as reductions of cabinet size, power consump-

tion, and fans, the significance of this reduction is further appreciated.

There are mask-programmable PALs and field-programmable PALs (i.e., FPALs). When logic design

is not finalized and needs to be changed often, FPAL packages can reduce expense and time for repeatedly

redesigning and remaking pc boards.





46.2 CMOS Gate Arrays

CMOS gate arrays are commercially available from many manufacturers in slightly different layout

forms. As an example, Figure 46.2 shows a cell of a CMOS gate array, where a pair of pMOSFETs and a pair of nMOSFETs are placed on the left and right, respectively, without connections between them.

The NAND gate shown in Figure 46.3(a) can be realized by connecting the components shown in

Figure 46.2 by two metal layers as shown in Figure 46.3(b). These two metal layers are made by

forming the first metal layer shown in Figure 46.3(c), then the insulation layer (not shown), and then the

second metal layer shown in Figure 46.3(d). The inverter shown in Figure 46.4(a) can be realized by connections

as shown in Figure 46.4(b).
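
How the same four transistors yield different gates can be sketched at switch level. The model assumes a two-input NAND built in the standard CMOS way (the two pMOSFETs in parallel between Vdd and the output, the two nMOSFETs in series between the output and Vss) and an inverter that uses one transistor of each type. The actual wiring of Figures 46.3 and 46.4 is not reproduced here, and the function names are hypothetical.

```python
def cmos_nand(a, b):
    """Switch-level model of a two-input CMOS NAND gate."""
    # pMOSFETs conduct when their gate input is low; two in parallel to Vdd.
    pull_up = (a == 0) or (b == 0)
    # nMOSFETs conduct when their gate input is high; two in series to Vss.
    pull_down = (a == 1) and (b == 1)
    # In static CMOS exactly one of the two networks conducts.
    assert pull_up != pull_down
    return 1 if pull_up else 0


def cmos_inverter(a):
    """Switch-level model of a CMOS inverter: one pMOSFET and one nMOSFET."""
    pull_up = (a == 0)
    pull_down = (a == 1)
    assert pull_up != pull_down
    return 1 if pull_up else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, cmos_nand(a, b))
    print([cmos_inverter(a) for a in (0, 1)])
```

Only the metal connections differ between the two gates, which is why a gate array can be customized with the metal layers alone.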

Many different patterns other than that in Figure 46.2 are available for the components of a cell.

FIGURE 46.1 Gate array. (a) Before making connections. (b) After connections made. (Figure labels: inputs x1 to x8; outputs f1, f2, f3.)

FIGURE 46.2 A cell of CMOS gate array. (Courtesy of Fujitsu Ltd. With permission.) (Figure labels: polysilicon gate for pMOS; polysilicon gate for nMOS; p for source/drain; n for source/drain; n substrate; n for Vdd; p for Vss; p tab.)



Gate Arrays 46-5


logic networks. The cost difference would be greater (the cost is not necessarily linearly propor-

tional to chip size) for the same production volume.

3. It is difficult to keep gate delays uniform. As the number of fan-outs and the length of fan-out

connections increase, delays increase dramatically. (If delay times of gates are not uniform, the network

tends to generate spurious output signals.) In the case of full-custom design, the increase of gate delay

by long or many-output connections of a gate can be reduced by redesigning the transistor circuit

(e.g., increasing transistor size for delivering greater output power and accordingly reducing the delay).

But such a precise adjustment is not possible in the case of gate arrays.

Responding to a variety of different user needs in terms of speed, power consumption, cost, design

time, ease of change, and possibly others, a large number of different gate arrays are commercially available

from semiconductor manufacturers or are used in-house by computer manufacturers. Different numbers

of gates are placed on a chip, with different configuration capabilities. Some gate arrays contain memories,

for example.

References

1. Okabe, M. et al., A 400k-transistor CMOS sea-of-gate array with continuous track allocation, IEEE J. Solid-State Circuits, pp. 1280–1286, Oct. 1989.

2. Muroga, S., VLSI System Design, John Wiley & Sons, 1982.

3. Price, J.E., VLSI chip architecture for large computers, in Hardware and Software Concepts in VLSI, edited by G. Rabbat, Van Nostrand Reinhold Co., pp. 95–115, 1983.





functions by software. Even application programs can be run on FPGAs and perform much faster than

on general-purpose computers in many cases.

As the price of FPGAs goes down with higher speed, FPGAs are replacing other semi-custom design

approaches in many applications.

47.2 Basic Structures of FPGAs

In the case of mask-programmable gate arrays, designers have to wait a few weeks for delivery of finished

gate arrays from semiconductor manufacturers because the semiconductor manufacturers must prepare

custom masks (although the number of custom masks for gate arrays is smaller than in the case of the

standard-cell library approach described in Chapter 48). With FPGAs, designers can realize their design

on FPGA chips by themselves in minutes. Thus, FPGAs are becoming popular [1,2,8–10].

Several different types of structures for FPGAs are available commercially. All of them have a basic

structure that consists of many logic blocks or logic cells, accompanied by a large number of pre-laid

lines for connecting these logic blocks. So, some manufacturers call FPGAs logic block arrays (LBAs).

One has a structure similar to a gate array with routing channels where each logic cell in a gate array is

replaced with a logic block, as shown in Figure 47.1. Another one is similar to a sea-of-gate array, as shown in Figure 47.2, illustrated with 16 logic blocks. Also, there is a structure similar to standard cells (to be

discussed in the next chapter) where there are routing channels between a pair of rows of logic blocks,

as shown in Figure 47.3. There is a structure where outputs of logic blocks are connected to the inputs

of other logic blocks through bus lines, as shown in Figure 47.4.

The internal structure of logic blocks or logic cells differs, depending on the manufacturer. A logic

block may consist of SRAMs (used as look-up tables), PALs, or NAND gates, along with multiplexers, flip-flops,

and others. Lines are pre-laid horizontally and vertically and are connected to the inputs and outputs of

logic blocks by programmable switches. Various programmable switches, such as fuses, anti-fuses, RAMs,

and non-volatile memories, are provided by different manufacturers. Each line actually consists of many

short line segments and only necessary line segments are connected in order not to add unnecessary

delay due to parasitic capacitance by using an excessive number of line segments. Line segments are also

connected by programmable switches.
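
The use of an SRAM as a look-up table can be sketched as follows. The input bits form an address into a small memory whose contents are the truth table of the desired function, so changing the function means rewriting the memory rather than the wiring. This is a generic behavioral model under that assumption, not the logic block of any particular manufacturer; the class and method names are hypothetical.

```python
from itertools import product


class LookUpTable:
    """An n-input logic function stored as 2**n one-bit memory cells."""

    def __init__(self, n_inputs, truth_table):
        if len(truth_table) != 2 ** n_inputs:
            raise ValueError("truth table must have 2**n_inputs entries")
        self.n_inputs = n_inputs
        self.sram = list(truth_table)

    @classmethod
    def from_function(cls, n_inputs, fn):
        """Fill the memory by evaluating fn on every input combination."""
        table = [int(fn(*bits)) for bits in product((0, 1), repeat=n_inputs)]
        return cls(n_inputs, table)

    def evaluate(self, *bits):
        """Read the cell addressed by the inputs (first input is the top bit)."""
        if len(bits) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        address = 0
        for bit in bits:
            address = (address << 1) | bit
        return self.sram[address]


if __name__ == "__main__":
    # Two 3-input tables realize a full adder: sum and carry.
    sum_lut = LookUpTable.from_function(3, lambda a, b, c: a ^ b ^ c)
    carry_lut = LookUpTable.from_function(3, lambda a, b, c: (a & b) | (b & c) | (a & c))
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, carry_lut.evaluate(a, b, c), sum_lut.evaluate(a, b, c))
```

Any function of n inputs costs the same 2^n cells, so a look-up table trades the size sensitivity of gate-level realizations for a fixed, reprogrammable structure.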

FIGURE 47.1 FPGA type of gate array with routing channels. (Figure labels: connection lines; logic block; switch matrix; a marker denotes a connection to be made or to be disconnected.)



Cell-Library Design Approach 48-3


48.3 Hierarchical Design Approach

The cell-library design approach, using cells of different shapes and sizes, can reduce the chip size more

than the polycell design approach. In the polycell design approach, keeping all cells at the same height wastes

a large portion of the area of each cell, and keeping all connections among cells in routing channels may

not minimize the connection area. Moreover, by using a hierarchical approach based on cells of different shapes

and sizes (in other words, by treating many cells as a building block in a higher level, many such

building blocks as a building block in the next higher level, and so on), we can further reduce the chip

area, as illustrated in Figure 48.2, because global area minimization can be treated better, even though

this is done on the monitor. For example, cells A, B, C, and D are assembled into a block R (shown

in a dot-lined rectangle), as shown in Figure 48.2. Then, such blocks, R, S, T, and U, shown in dot-lined

rectangles, are assembled into a bigger block W, which is a block in a higher level than blocks R, S, T, and U. But this is much more time-consuming than the polycell design approach,

and the development of efficient CAD programs is harder. It appears to be difficult to bring the chip area

to within about 20% of that of full-custom designed chips, although the areas of full-custom

designed chips vary greatly with designers and, accordingly, comparison is not simple.

References

1. Lauther, U., Cell based VLSI design system, in Hardware and Software Concepts in VLSI, edited by G. Rabbat, Van Nostrand Reinhold, pp. 480–494, 1983.

2. Kick, B. et al., Standard-cell-based design methodology for high-performance support chips, IBM Jour. Res. Dev., pp. 505–514, July/Sept. 1997.

3. Muroga, S., VLSI System Design, John Wiley & Sons, 1982.

FIGURE 48.2 Hierarchical design approach. (Figure labels: cells A, B, C, D; blocks R, S, T, U; block W.)




has variations and it makes a difference whether or not libraries of cells or macrocells are prepared from

scratch. (Notice that in Figure 49.2, design approaches are shown in thin-line curves for the sake of

simplicity, but actually they should be represented in very broad lines.) The cost per package for the

off-the-shelf package design approach is fairly uniform over the entire range, but it increases for low

production volumes because the development cost becomes significant as initial investment in the overall

package cost. The relationship shown in this figure will change as the integration size of an IC chip

increases, because the dependence on CAD will inevitably increase.

49.4 Comparison of All Different Design Approaches

As discussed so far, we have a very wide spectrum of different design approaches, from full-custom design

approaches to the design approaches with off-the-shelf packages, as illustrated in Table 49.1. Digital

systems can be designed by combining them. Depending upon different criteria imposed by different

design motivations, such as speed, power consumption, size, design time, ease of changes, and reliability,

designers can use the following approaches:

1. Custom design approaches (full-custom and semi-custom)

2. Off-the-shelf discrete components and off-the-shelf IC packages, along with memory packages

3. Off-the-shelf microcomputers along with off-the-shelf IC packages

The full-custom design approaches give us the highest performance and reliability or the smallest

chip size, although they are most time-consuming. (Even in the case of microcomputers, the

full-custom designed microcomputers have better performance and smaller size than off-the-shelf

microcomputers, by being tailored to the user's specific needs.) This is one end of the wide spectrum of

different design approaches. At the other end, the off-the-shelf microcomputers give us a design

approach where the development time is shortest, by programming rather than by chip design

(including logic design), and the design changes are the easiest. The off-the-shelf discrete components

and off-the-shelf IC packages give us logic networks tailored to specific needs with less programming

than the off-the-shelf microcomputers.

Custom design approaches, in particular the full-custom design approaches, are the most economical

for very high production volumes (on the order of a few hundred thousand) but the least economical

for low production volumes.

When the production volume is low, the off-the-shelf discrete components and off-the-shelf IC

packages give us the most economical approaches for simple tasks, but the off-the-shelf microcomputers

are more economical for complex tasks, although performance is usually sacrificed.

TABLE 49.1 Comparison of Different Task-Realization Approaches

                          Full-Custom           Semi-Custom        Off-the-Shelf IC Package   Off-the-Shelf Microcomputer
Speed                     Fastest               Fast               Medium                     Slowest
Size                      Smallest (chip size)  Small (chip size)  Large (many chips)         Medium (many chips)
Development time          Longest (layout)      Long (layout)      Medium (logic design)      Short (programming)
Flexibility               Lowest                Low                Medium                     High
Initial investment        Highest (layout)      High (layout)      Medium (logic design)      Low (programming)
Unit cost (high volume)   Lowest                Low                Medium                     Highest
Unit cost (low volume)    Highest               High               Medium                     Lowest
Reliability               Highest               High               Low                        Medium
