Low-cost, on-line self-testing of processor cores based on embedded software routines☆

Dimitris Gizopoulos*

Department of Informatics, University of Piraeus, Piraeus, Greece

Received 1 August 2003; revised 28 November 2003; accepted 29 December 2003

Abstract

On-line testing for complex system-on-chip architectures requires a synergy of concurrent and non-concurrent fault detection mechanisms.

While concurrent fault detection is mainly achieved by hardware or software redundancy, like duplication, non-concurrent fault detection,

particularly useful for periodic testing, is usually achieved through hardware-based self-test.

Software-based self-test has been recently proposed as an effective alternative to hardware-based self-test allowing at-speed testing while

eliminating area, performance and power consumption overheads.

In this paper, we investigate the applicability of software-based self-test to non-concurrent on-line testing of embedded processor cores and

define, for the first time, the corresponding requirements. Low-cost, in-field testing requirements, particularly small test execution time and

low power consumption guide the development of self-test routines. We show how self-test programs with a limited number of memory

references and based on compact test routines provide an efficient low-cost on-line test strategy for a RISC processor core.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: On-line testing; Processor testing; Self-testing; Software-based self-testing; Low-cost testing

1. Introduction

Manufacturing technologies with feature sizes in the

deep submicron area are already a common practice. This

fact along with the emergence of the system-on-chip (SoC)

design paradigm has resulted in single chip electronic

components with sophisticated functionality and unique

performance. This increased functionality and performance

does not come by itself. The problem of testing deeply

embedded processors and the surrounding cores in a

complex SoC is becoming more and more difficult at any

level of the system life cycle.

Manufacturing testing aims at detecting faults during the

fabrication process before the SoC is delivered to market.

High manufacturing test quality for embedded cores requires at-speed testing to detect defects that manifest themselves only at the actual speed of operation of the system. Hardware self-test techniques like scan-based built-in self-test (BIST) provide excellent test quality as they achieve the at-speed testing goal. However, high test quality comes along with large hardware overhead, complex timing issues to be resolved and increased power consumption during testing. As processor technology strives for high performance, low area and low power consumption, it is apparent that hardware BIST cannot co-exist with these high-end processor features.

Software-based self-test techniques for embedded pro-

cessors have been proposed in Refs. [1–5] as effective

alternatives to hardware self-test. These techniques are non-

intrusive in nature as they use the processor instruction set to

perform self-testing. The key concept of software-based

self-test is the generation of efficient self-test routines that

lead to high structural fault coverage. The processor

executes the self-test software downloaded from a low-

speed, low-cost external ATE at its actual speed (at-speed

testing) and no area, performance or power consumption

penalties are induced. Therefore, software-based self-test is

a very important low-cost test solution, i.e. it does not add

hardware overhead, it does not have any impact on

performance and/or timing and it does not consume

additional power during execution.

0026-2692/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.

doi:10.1016/j.mejo.2003.12.005

Microelectronics Journal 35 (2004) 443–449

www.elsevier.com/locate/mejo

☆ A preliminary version of this paper was presented at the IEEE International On-line Testing Symposium (IOLTS) 2003.

* Tel.: +30-210-414-2372; fax: +30-210-414-2264.

E-mail address: [email protected] (D. Gizopoulos).

After the chip’s manufacturing correctness is verified by

manufacturing testing, it is placed in its actual environment

of operation where many different types of faults may

appear. Cosmic rays, alpha particles, electromagnetic interference, and power glitches are some of the main causes of operational faults. Operational faults are usually classified into permanent faults that exist indefinitely, intermittent faults that appear at regular time intervals, and transient faults that appear irregularly and last for a short time [6]. On-line testing aims at detecting and/or

correcting these operational faults by means of concurrent

and non-concurrent test techniques.

Non-concurrent on-line test strategies are particularly

useful for periodic testing, which assures system reliability.

These techniques are based on hardware BIST and are

capable of detecting permanent faults and intermittent faults

with fairly large duration (when test is applied periodically).

Non-concurrent testing cannot guarantee the detection of transient and intermittent faults of short duration.

Concurrent on-line test strategies are used to detect operational faults within a small time frame (low error detection latency) while keeping the system in normal operation. These strategies utilize hardware redundancy techniques like duplication with compare, watchdog, and self-checking design [6,7]. However, when a large increase in silicon area is not acceptable, time (or software) redundancy techniques provide an alternative for on-line testing. These techniques achieve software-implemented hardware fault tolerance (SIHFT) by duplicating program statements, by executing programs repeatedly [8], or by implementing signature monitoring.

In this paper, the characteristics of software-based self-

test, i.e. self-testing with the execution of embedded

machine code routines, that make it appropriate in the

context of on-line testing during system operation, are

identified. In particular, we show how software-based self-

test can provide an effective, low-cost, non-concurrent on-

line test solution for embedded systems that do not require

immediate detection of errors (such as detection latency of a

single clock cycle) and cannot afford, in terms of hardware

and performance overhead, the well-known hardware

redundancy or time redundancy mechanisms. If immediate

error detection is mandatory, then software-based self-test

can be transparently combined with a concurrent test

scheme to provide a comprehensive on-line test strategy.

This paper is organized as follows. Section 2 presents the

on-line testing framework and elaborates on guidelines for

embedded software development: (a) small, loop-based test

code, (b) quick test code with minimum memory references.

Section 3 presents a summary of software-based self-test

techniques. Suitability of the software-based self-test

techniques for on-line testing is explored in Section 4.

Section 5 presents overall results for the on-line test routines

developed for major components of the Plasma processor.

Finally, Section 6 concludes the paper.

2. Low-cost, on-line testing requirements

On-line testing of embedded systems is performed while

the processor operates in its normal operational environ-

ment where external memory, SDRAM, ROM or flash

memory and other I/O-peripherals may be present. The

processor operates under the control of an operating system

as shown in Fig. 1. Test program execution may be initiated

during system startup or shutdown thus ensuring correct

operation in subsequent startups. Alternatively, the operating system scheduler may identify idle cycles and issue test program execution, or the test program may be executed at regular time intervals with the aid of programmable timers found in the system. When running test software at system start-up/shutdown or at idle intervals, only permanent faults (existing or others that became permanent) or non-permanent faults (intermittent or transient) that happen to be present during test execution will be detected. On the other hand, when running test software at regular intervals, intermittent faults will also be detected (provided that they last long enough). In both cases, transient faults are detected only by chance.
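The condition "provided that they last long enough" can be made concrete with a small sketch. It assumes, conservatively, that a fault is detected only if it is present for one complete test run, and that test runs start at times 0, period, 2*period, and so on. The function name and all numbers are hypothetical and are not taken from the paper.

```python
def fault_seen_by_periodic_test(fault_start, fault_duration, period, test_time):
    """Return True if one complete test run falls inside the fault window.

    Assumption: test runs start at t = 0, period, 2*period, ... and each
    lasts test_time. All times share one arbitrary unit (e.g. milliseconds).
    """
    fault_end = fault_start + fault_duration
    # Index of the first test run that starts at or after the fault appears.
    first_run = -(-fault_start // period)  # ceiling division
    run_start = first_run * period
    # The run must also finish before the fault disappears.
    return run_start + test_time <= fault_end


# A long-lived intermittent fault is caught; a short one slips between runs.
print(fault_seen_by_periodic_test(130, 400, period=100, test_time=20))
print(fault_seen_by_periodic_test(130, 50, period=100, test_time=20))
```

Under this model, any fault lasting at least period + test_time is always caught, which is why a shorter test routine and a shorter test period both lower the minimum fault duration that can be guaranteed to be detected.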

It is apparent that test software is an additional process that competes with user processes for system resources, CPU cycles and memory. Therefore, on-line test program execution is considered an overhead to overall system performance regarding memory area, power consumption and execution cycles. Minimal impact of on-line embedded test routines on the system resources is the key to achieving a low-cost on-line test strategy for processors and processor-based embedded systems.

Fig. 1. On-line software self-testing concept.

An effective low-cost software-based on-line self-test

methodology should aim at the following goals:

• small memory footprint;

• small execution time;

• low power consumption.

These goals are crucial because a low-cost on-line self-test methodology should:

• reserve as few system resources as possible (memory words and CPU clock cycles) for on-line test execution; this can be guaranteed if the self-test routines are fast and small in both program and data size;

• detect the targeted faults with as low a fault detection latency as possible; this again can be guaranteed only if the self-test routines are fast and small and involve as few memory references as possible (in data memory and instruction memory).

In the following subsections we analyze these goals and

elaborate on their contribution to low-cost on-line testing.

2.1. Small memory footprint

A serious problem of large on-line self-test routines is

that they require large parts of the system memory to store

their instructions and data in. Any kind of memory like

ROM, RAM or flash memory can be used for this purpose,

but reduced memory sizes of low-cost embedded appli-

cations set a tight limit on memory usage for testing

purposes. Embedded systems usually have very small

amounts of memory carefully allocated to operating system

and the user programs.

Test programs with a large memory footprint usually take more time to run (larger code/data translates to an increased cache miss rate and thus an increased number of memory stall cycles). Additionally, a test program with a large memory footprint will force user data to be evicted from cache memory. These data have to be moved to the system's external memory. When the user program resumes, it will experience cache misses, which will significantly affect the overall system performance.

Finally, large on-line test routines with many memory references lead to increased power consumption during self-test execution. A significant part of the system power consumption is attributed to the memory subsystem, and thus reducing memory references reduces the power consumed during test. This is particularly important in battery-operated systems where self-test routines are to be executed periodically.

2.2. Small execution time

On-line self-test software routines are regarded as an overhead to the execution time of user processes. To reduce this overhead, test software should run in the minimum possible number of CPU clock cycles. An ideal case is when the test software is able to execute within a single time quantum, assuming an operating system with a round-robin CPU scheduling strategy. Typical values of quantum times used in embedded applications are in the range of a few hundred milliseconds. Although it is possible to have test software execution broken into several quanta, this will lead to further overhead in system operation due to additional context switches. The major problem is that a large execution time of self-test routines, or spanning them over more than one time quantum, lengthens the fault detection latency of both permanent faults and temporary faults existing during self-test program execution.
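Whether a routine fits in one quantum follows directly from its cycle count and the clock frequency. The sketch below is illustrative only: the function name, the 50 MHz clock, the cycle counts and the 100 ms quantum are assumed values, not figures from the paper.

```python
import math


def execution_time_and_quanta(cycles, clock_hz, quantum_s):
    """Return (execution time in seconds, number of time quanta needed)."""
    exec_time_s = cycles / clock_hz
    # At least one quantum is always consumed, even by a very short routine.
    quanta = max(1, math.ceil(exec_time_s / quantum_s))
    return exec_time_s, quanta


# Hypothetical routine: 50,000 cycles on a 50 MHz core, 100 ms quantum.
print(execution_time_and_quanta(50_000, 50_000_000, 0.1))
# A much longer hypothetical routine spills over into several quanta.
print(execution_time_and_quanta(12_500_000, 50_000_000, 0.1))
```

Every quantum beyond the first adds at least one extra context switch and delays the final signature, which is the latency penalty described above.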

To guarantee minimum performance overhead, on-line

self-test software should:

† consist of compact, loop-based routines of limited

iterations;

† minimize interactions with the memory system.

2.3. Low power consumption

A very common domain for on-line testing is mobile applications, where power consumption is of great importance. A study by Intel [10] has shown that 33% of a notebook system's power is consumed in the CPU with a two-level cache hierarchy. In the CPU, about 20–30% of power is consumed in the cache system and about 30% is consumed in clock circuitry. Considering the data transfers from external memory in case of a cache miss, the power consumed in the overall memory system increases further. The processor has to stall when a cache miss occurs. Extra energy has to be consumed in driving clock trees and in pulling up and down the external bus between the on-chip cache and external memory.
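As a rough worked example, and assuming the quoted percentages can simply be multiplied (an assumption made here for illustration, not a result stated in Ref. [10]), the on-chip cache system would account for the following share of total system power:

```latex
\[
0.33 \times 0.20 = 0.066
\qquad\text{to}\qquad
0.33 \times 0.30 = 0.099 ,
\]
\[
\text{i.e. roughly } 7\text{--}10\% \text{ of system power, before any external-memory traffic is counted.}
\]
```

The same arithmetic puts clock circuitry at about 0.33 × 0.30 ≈ 10% of system power, so stall cycles spent waiting on a cache miss are not free in energy terms.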

On-line test software that takes advantage of temporal

and spatial locality reduces the cache miss overhead. A test

program that is loop based takes advantage of temporal

locality while a test program that references a small amount

of data takes advantage of spatial locality. Test programs

should be constructed in a way that takes advantage of these

features.
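The effect of loop-based code on the instruction cache can be illustrated with a toy direct-mapped cache model. This is a sketch under assumed parameters (16 lines of 16 bytes, 4-byte instructions, hypothetical address streams); it is not a model of any cache used in the paper.

```python
def miss_rate(addresses, num_lines=16, line_bytes=16):
    """Miss rate of a direct-mapped cache for a sequence of byte addresses."""
    stored_block = [None] * num_lines  # block currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes
        index = block % num_lines
        if stored_block[index] != block:
            stored_block[index] = block  # fetch the block on a miss
            misses += 1
    return misses / len(addresses)


# Straight-line code: 4096 distinct 4-byte instructions, each fetched once.
straight_line = [4 * i for i in range(4096)]
# Loop-based code: a 16-instruction body executed 256 times.
loop_based = [4 * (i % 16) for i in range(4096)]

print(miss_rate(straight_line))  # one miss per new cache line
print(miss_rate(loop_based))     # only the first pass through the loop misses
```

Both streams fetch the same number of instructions, but the loop body is reused on every iteration, so only its first pass misses. The straight-line stream still benefits from spatial locality within each line, yet it never reuses a fetched instruction.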

3. Software-based self-test methodologies

Software-based self-test methodologies for embedded

processors in SoCs have been presented as an attractive

alternative to classical hardware-based self-test. They

alleviate the problems caused by DFT since they move the

test process to a higher level of abstraction. Therefore, area,

performance and power consumption overheads are elimi-

nated. Different approaches to software-based self-test have

been presented in the literature.

Functional software-based self-test methodologies [1]

use the processor’s instruction set without requiring prior

knowledge of the processor structure. They apply random

instruction sequences combined with random test data in

order to exercise a large portion of processor’s functionality

resulting in self-test software routines that are characterized

by high memory and execution time requirements.

Structural software-based self-test strategies like Ref. [4]

are characterized by a component oriented test development

approach, fine-tuned to the low, gate level details of the

processor core. Pseudorandom patterns or patterns gener-

ated by an automatic test pattern generator (ATPG) are

produced for each of the processor components in an

iterative method which takes into consideration the

constraints imposed by the processor instruction set. For

the targeted components (mainly combinational com-

ponents) where constrained test generation is possible, the

generated tests achieve high fault coverage, and they are

delivered to the component using processor instructions.

A methodology that is neither functional nor low-level structural has been proposed in Ref. [5]. Test development is performed at the high register-transfer (RT) level while targeting structural faults. Components are classified according to their testability properties into functional, control or hidden classes. This classification leads to self-test programs that achieve high fault coverage at very small cost, both in terms of the engineering effort required and in terms of the size and execution time of the generated self-test programs.

In this paper, we focus on the feasibility and suitability of

different software-based self-test methodologies for on-line

testing of embedded systems. We compare four different

self-test code development styles, namely ATPG based

using immediate instructions, ATPG based using memory

loads, pseudorandom based like in Refs. [2–4] and

deterministic based like in Ref. [5]. The suitability of each

of these methods for on-line testing is analyzed based on the

appropriate criteria and requirements for low-cost, on-line

testing.

4. On-line test challenges for software-based self-test:

alternatives analysis

To demonstrate how software-based self-test is suitable

for low-cost, on-line testing we have developed test routines

for a parallel multiplier, a very critical component of the

processor in terms of speed and circuit area, a component

that can be found in all modern processors. We present the

self-test routines in a generic fashion, which can be applied

to other functional components of the processor. The MIPS

instruction set has been used in order to demonstrate

samples of the self-test routines and a model of the MIPS I

processor named Plasma [11] has been used to provide the

experimental fault simulation results.

Test program execution time can be generally described

by the following equation [9]:

\[
\text{CPU-execution-time} = \text{clock-cycle-time} \times (\text{CPU-clock-cycles} + \text{pipeline-stall-cycles} + \text{memory-stall-cycles})
\]
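The execution-time formula can be evaluated directly. The sketch below uses hypothetical cycle counts and a 20 ns clock period purely to show how stall cycles inflate the total; none of the numbers come from the paper.

```python
def cpu_execution_time(clock_cycle_time, cpu_clock_cycles,
                       pipeline_stall_cycles=0, memory_stall_cycles=0):
    """CPU execution time, in the same unit as clock_cycle_time."""
    total_cycles = (cpu_clock_cycles
                    + pipeline_stall_cycles
                    + memory_stall_cycles)
    return clock_cycle_time * total_cycles


# Hypothetical routine on a 50 MHz core (20 ns per cycle), times in ns.
ideal = cpu_execution_time(20, 10_000)
with_stalls = cpu_execution_time(20, 10_000,
                                 pipeline_stall_cycles=500,
                                 memory_stall_cycles=4_000)
print(ideal, with_stalls)
```

In this made-up case the memory stalls alone add 40% to the execution time, which is why the following analysis concentrates on memory interaction.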

It is apparent from the formula that pipeline stalls have a negative impact on test execution speed. Although they do not affect power consumption, they should be avoided by constructing test programs that do not cause unresolved data hazards. Control hazards can be avoided in architectures that implement branch delay slots, like MIPS, by proper instruction placement in the delay slot. However, stall cycles are unavoidable when a prediction unit is used to handle branch conditions.

Moreover, the interaction of the test program with the

instruction and data memories has a very serious impact on

the test program execution (which in turn leads to longer

detection latency, a crucial factor for on-line testing).

Minimization of memory interaction (both for instructions

fetching and for data reads and writes) leads to self-test

programs which are more appropriate for low-cost, on-line

testing, i.e. have smaller memory requirements, smaller test

execution times and smaller detection latencies.

In the following subsections we see how the above formula

for program execution time evaluates in four different self-

test program development styles and we compare them with

respect to their suitability for on-line testing.

4.1. ATPG-based self-test routines

An ATPG tool can be used to generate N test patterns for

the processor component under test (the parallel multiplier

in our example) taking into consideration the constraints

imposed by the instruction set. In the Plasma/MIPS model

we use in our analysis and evaluation, the ATPG-generated

test patterns map to two processor instructions for the

multiplication operation. These instructions are mul and

mulu using the MIPS terminology for signed and unsigned

multiplication, respectively. The ATPG-generated test patterns can be transformed to test routines in two ways: either by using the processor instruction set with immediate addressing mode to generate and apply the patterns, or by loading the patterns from memory and then applying them.

A sample of a routine that uses the immediate instruction format (the load immediate (li) instruction) to generate and apply patterns is illustrated in Fig. 2.

Test patterns are loaded in registers using the li pseudo-instruction, which the assembler decomposes into the instructions lui and ori. After the test patterns are applied, the test responses are compacted by the compaction code. Self-test routines developed using the immediate instruction format avoid transferring data from memory. Thus, no data cache misses will occur. On the other hand, instruction cache misses will occur frequently, as instructions are not re-used in a loop-based scheme.
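To make the li decomposition concrete, the sketch below splits a 32-bit constant into the two 16-bit immediates carried by lui and ori. The helper names, the register name and the constant are hypothetical; the routine of Fig. 2 is not reproduced here.

```python
def split_li(value):
    """Split a 32-bit constant into the 16-bit immediates for lui and ori."""
    value &= 0xFFFFFFFF
    upper = value >> 16      # immediate for: lui reg, upper
    lower = value & 0xFFFF   # immediate for: ori reg, reg, lower
    return upper, lower


def expand_li(reg, value):
    """Return the two-instruction expansion of 'li reg, value' as text."""
    upper, lower = split_li(value)
    return [f"lui {reg}, 0x{upper:04X}",
            f"ori {reg}, {reg}, 0x{lower:04X}"]


# Hypothetical 32-bit operand of one test pattern.
for line in expand_li("$t0", 0x12345678):
    print(line)
```

Each 32-bit operand therefore costs two instruction words, and none of these words is reused by later patterns, which is consistent with the large code size and the instruction cache behaviour described above.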

Alternatively, the ATPG generated test patterns can be

fetched from the memory system as shown in Fig. 3. In this

scheme, the test patterns for testing the signed multipli-

cation are put in the processor’s memory starting at address

mul_patterns. These patterns are fetched, applied to

the component under test (multiplier) and afterwards test

responses are compacted.

The load word (lw) instruction requires two clock cycles in order to fetch data from the SRAM. Therefore, no instruction that depends on the loaded register should be placed immediately after the lw instruction; doing so would cause a pipeline stall, making the test routine run for more cycles. This is an example of careful self-test program development focusing on on-line testing.

Although the test routine fetches test patterns from the memory system and applies them using a loop-based approach, thus minimizing the instruction cache miss rate, it fails to reduce the data cache miss rate. In particular, when the number of test patterns generated by the ATPG is large, the data cache miss rate will dramatically affect test program execution time. Therefore, in the most usual cases (where a significant number of ATPG patterns are necessary for testing the targeted component), this approach is not suitable for low-cost, on-line testing.

4.2. Pseudorandom-LFSR-based self-test routines

A pseudorandom test pattern generation approach can be used to test the parallel multiplier. A routine that applies pseudorandom test patterns to the multiplier is demonstrated in Fig. 4.

This self-test routine takes advantage of temporal locality, since it is loop based. No data cache misses will occur and the number of interactions with the memory system is limited to the instruction references only. Pseudorandom-based self-test routines seem well suited to on-line software self-test requirements when a random-testable processor component is considered, i.e. when high fault coverage can be obtained with a reasonably small number of random test patterns applied without many re-seedings of the software pseudorandom pattern generator. Unfortunately, many of the processor components are random-pattern resistant, which means that a large number of pseudorandom test patterns must be applied to reach acceptable test quality. This leads to test programs with excessive test execution time, which has a very serious impact on the fault detection latency of the applied on-line testing methodology.

Fig. 2. Generating ATPG test patterns using the immediate instruction format.

Fig. 3. Loading ATPG test patterns from memory.

Fig. 4. Pseudorandom test pattern generation.
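A software pseudorandom pattern generator of the kind discussed in this subsection can be sketched as a Galois LFSR. The version below is only an illustration: it is 16 bits wide for brevity, and the tap mask 0xB400 and seed 0xACE1 are a commonly used textbook choice, not the parameters of the routine in Fig. 4.

```python
def lfsr16(seed=0xACE1, taps=0xB400):
    """Yield successive states of a 16-bit Galois LFSR (seed must be non-zero)."""
    state = seed & 0xFFFF
    while True:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps  # apply the feedback polynomial
        yield state


def operand_pairs(count, seed=0xACE1):
    """Return 'count' pseudorandom (multiplicand, multiplier) pairs."""
    gen = lfsr16(seed)
    return [(next(gen), next(gen)) for _ in range(count)]


for a, b in operand_pairs(3):
    print(f"0x{a:04X} * 0x{b:04X}")
```

The generator needs only a shift, a test and an XOR per pattern and keeps its state in a register, so it makes no data memory references; the cost is paid in the number of patterns that must be applied.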

4.3. Deterministic self-test routines

The parallel multiplier is a very critical-to-test component of the group of functional components that possess an inherent regularity (arithmetic components, logic arrays, register files), and thus a high RT level, deterministic test development approach [5] can be applied. Software-based self-test routines generated according to Ref. [5] lead to compact code where deterministic (non-ATPG based) test patterns are generated, applied and compacted in a loop-based manner without data memory interaction. A sample of a deterministic self-test routine is presented in Fig. 5.

Although a few more test patterns are usually necessary compared to the ATPG approach, the routine of Fig. 5 better meets the requirements of on-line testing, as it keeps the instruction and data cache miss rates at very low levels. Deterministic self-test can also be applied to any functional component with inherent regularity. Such regular modules occupy large silicon areas in the processor and thus very high fault coverage can be obtained with small test sets applied by small and fast test routines. Therefore, both low-cost on-line test requirements, small memory footprint and small detection latency, are completely satisfied.

4.4. Suitability for on-line testing

Evaluating the presented self-test routines for on-line

testing is not a trivial task. A test routine that executes for

many cycles, like the LFSR routine, might seem inap-

propriate for on-line testing. However, if we consider an

SoC built around a high-speed processor with a very small memory or a slow memory like SDRAM, it is preferable to let the CPU execute for more cycles using a deterministic or pseudorandom-based approach. On the other hand, if a low-speed processor is used with a high-speed memory like SRAM, it might be preferable to use an ATPG-based approach to avoid excessive CPU cycles and have test patterns coming from the fast memory, provided that the number of patterns is very low.

Table 1 presents test routine characteristics taking into

consideration instruction cache miss rate, data cache miss

rate and number of test patterns.

The ATPG approach leads either to high instruction

cache miss rate or to high data cache miss rate. Test routine

construction is affected by the irregularity of the ATPG

generated test patterns.

The pseudorandom approach keeps the cache miss rate at a low level, but usually requires a large number of test patterns to guarantee high test quality. Therefore, overall test program execution is lengthened.

On the other hand, the deterministic approach achieves high test quality, wherever it can be applied, by generating a small number of non-ATPG test patterns without exercising the memory system too much. As a result, low-cost on-line test goals are achieved to a higher degree.

5. Experimental results

A 32-bit MIPS-like core named Plasma with a three-stage pipeline is used as a vehicle to demonstrate test routine characteristics. It supports interrupts and all MIPS-I user mode instructions, except unaligned load and store (which are patented) [11]. The Plasma core was enhanced with a parallel multiplier with the following characteristics: Booth recoding, Wallace trees for partial products summation and fast carry lookahead addition at the final stage [12]. The Mentor Graphics suite was used for VHDL simulation (Modelsim) and generation of a test set of 168 test patterns (FlexTest) taking into consideration the constraints imposed by the instruction set. Two software LFSR programs were developed to generate 400 and 1200 test patterns for the multiplier, respectively.

We developed self-test routines for the parallel multiplier

according to the approaches presented earlier. Test responses

are compacted in all cases using a software MISR routine and

a final signature is stored in memory for further analysis.
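Response compaction with a software MISR can be sketched as a shift-and-feedback step followed by an XOR with the new response. The 32-bit width and the feedback polynomial 0x04C11DB7 below are assumptions chosen for illustration; the paper does not state which polynomial its MISR routine uses.

```python
def misr_step(signature, response, poly=0x04C11DB7, width=32):
    """Advance a MISR by one step, folding in one test response word."""
    mask = (1 << width) - 1
    msb = (signature >> (width - 1)) & 1
    signature = (signature << 1) & mask
    if msb:
        signature ^= poly  # feedback when a 1 is shifted out
    return signature ^ (response & mask)


def compact(responses, seed=0):
    """Compact a sequence of response words into a single signature."""
    signature = seed
    for response in responses:
        signature = misr_step(signature, response)
    return signature


good = compact([0x10, 0x20, 0x30])    # hypothetical fault-free responses
faulty = compact([0x10, 0x21, 0x30])  # one response word differs
print(hex(good), hex(faulty), good != faulty)
```

Only the final signature has to be written to memory, which keeps the number of data memory references of the test routine independent of the number of patterns applied.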

Table 2 presents statistics for the test programs

developed for the parallel multiplier.

Column 2 presents the memory requirements of each test program in four-byte words. Column 3 presents the data memory references (loads and stores) the test programs are expected to make. Columns 4 and 5 present the clock cycles and the test coverage for stuck-at faults that is achieved by each test strategy.

Fig. 5. Deterministic test pattern generation.

Table 1. Test routine characteristics

Test routine     Instruction cache miss rate    Data cache miss rate    Number of test patterns
ATPGIMM          High                           Low                     Low
ATPGLOOP         Low                            High                    Low
LFSR             Low                            Low                     High
Deterministic    Low                            Low                     Low

The ATPG-based test programs cause significant problems when placed in the on-line test environment because of their quite large memory footprint. The ATPG_LOOP test program requires 336 data memory references to load the 168 patterns, as each pattern is 64 bits wide while the lw instruction loads 32 bits of data, plus 1 memory reference to store the signature. While the ATPG_IMM test program achieves the smallest number of execution cycles, its huge memory footprint makes it prohibitive for the on-line test environment. Optimizing ATPG_IMM by removing the compaction code after each multiplication and jumping to a shared compaction routine instead (ATPG_IMM_JAL) reduces the memory footprint by almost 47%. However, the memory footprint of the ATPG_IMM_JAL routine remains large and fails to meet the on-line test requirements.
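The data reference count quoted for the loop-based ATPG routine follows directly from the pattern width and the load width. A small sketch of that arithmetic is given below; the function name `data_memory_references` and its parameters are illustrative and do not come from the paper.

```python
def data_memory_references(num_patterns, pattern_bits=64, load_bits=32,
                           signature_stores=1):
    """Return (loads, total references) for a routine that loads every pattern."""
    loads_per_pattern = -(-pattern_bits // load_bits)  # ceiling division
    loads = num_patterns * loads_per_pattern
    return loads, loads + signature_stores


loads, total = data_memory_references(168)
print(loads, total)  # 336 loads plus 1 store gives 337 references
```

The total of 337 matches the data references reported for the loop-based ATPG program in Table 2. A routine that needs no pattern loads at all is left with the single signature store, which is consistent with the value of 1 reported for the other programs.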

We should note that the ATPG-based routines are favored as they reside in SRAM. Applying the same test routines on a multi-level cache system without SRAM will degrade their performance, as explained earlier in Section 4.1. On the other hand, the LFSR and deterministic-based routines are much better suited to the in-field test environment. Both the LFSR and the deterministic routines have a very small memory footprint and thus do not affect user space. Achieving very high test quality with pseudorandom patterns is a real challenge, as already mentioned in Section 4.2. Table 2 shows that the pseudorandom-based routine has a great impact on test execution time. Increased test execution time means that fault detection latency is also increased.
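The execution time penalty of the pseudorandom routines can be read off the clock cycle figures of Table 2. The short calculation below is a sketch that only re-derives ratios from those published numbers; the helper names `cycles_per_pattern` and `slowdown` are illustrative.

```python
DETERMINISTIC_CYCLES = 4050           # deterministic program, Table 2
LFSR_RUNS = {400: 9837, 1200: 29437}  # patterns -> clock cycles, Table 2


def cycles_per_pattern(cycles, patterns):
    """Average clock cycles spent per applied test pattern."""
    return cycles / patterns


def slowdown(cycles, baseline=DETERMINISTIC_CYCLES):
    """Execution time relative to the deterministic routine."""
    return cycles / baseline


for patterns, cycles in LFSR_RUNS.items():
    print(f"{patterns} patterns: "
          f"{cycles_per_pattern(cycles, patterns):.2f} cycles/pattern, "
          f"{slowdown(cycles):.2f}x the deterministic routine")
```

Both LFSR programs spend roughly 24.5 cycles per pattern, so execution time grows almost linearly with the number of patterns: tripling the pattern count roughly triples the time, while Table 2 reports a coverage gain of only 0.3 percentage points (97.6% to 97.9%).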

On-line test requirements are met best by the deterministic-based routines. Not only do they have a small memory footprint but they also achieve high test quality while keeping test application time relatively low. The deterministic-based routines can also be applied to a system without on-chip SRAM. Because of their loop-based nature, as shown in Section 4.3, it is expected that after a small number of instruction cache misses they will fully reside in the cache.

We have applied the deterministic test development

strategy to the Plasma CPU core. Self-test program statistics

(memory requirements and CPU clock cycles), along with

achieved test coverage are presented in Table 3.

The attractive features of the deterministic-based routines, namely small memory footprint and small test application time, lead to low power consumption without compromising high test quality; thus the low-cost on-line test objectives are met.

6. Conclusions

We have shown that the recently proposed software-based self-test of processor cores in SoCs can be a very effective, low-cost strategy for on-line testing. Faults can be detected with relatively small embedded test code that executes for short time intervals. Software-based self-test for on-line testing can be applied to improve the reliability of low-cost embedded systems where hardware redundancy and software redundancy cannot be applied due to their excessive cost in terms of silicon area and execution time, respectively. The experimental results presented for a RISC processor core of a classical architecture used in embedded systems show that a deterministic-based self-test routine development strategy leads to high-quality routines of small size, small execution time and low power consumption.

References

[1] J. Shen, J. Abraham, Native mode functional test generation for

processors with applications to self-test and design validation,

Proceedings of IEEE ITC 1998, pp. 990–999.

[2] F. Corno, G. Cumani, M. Sonza Reorda, G. Squillero, Fully automatic test program generation for microprocessor cores, IEEE Design Automation and Test in Europe Conference (DATE), 2003, pp. 1006–1011.

[3] W. Zhao, C. Papachristou, Testing DSP cores based on self-test

programs, Proceedings of the IEEE Design Automation and Test in

Europe Conference (DATE), 1998, pp. 166–172.

[4] L. Chen, S. Dey, Software-based self-testing methodology for

processor cores, IEEE Trans. CAD Integ. Circuits Sys. 20 (3)

(2001) 369–380.

[5] N. Kranitis, G. Xenoulis, A. Paschalis, D. Gizopoulos, Y. Zorian,

Low-cost software-based self-testing of RISC processor cores,

Proceedings of the IEEE Design Automation and Test in Europe

Conference (DATE), 2003, pp. 714–719.

[6] H. Al-Assad, B.T. Murray, J.P. Hayes, Online BIST for embedded systems, IEEE Des. Test Comput. 15 (4) (Oct.–Dec. 1998) 17–24.

[7] M. Nicolaidis, Y. Zorian, On-line testing for VLSI—a compendium of

approaches, J. Electron. Testing: Theory Appl. 12 (1–2) (1998) 7–20.

[8] N. Oh, E.J. McCluskey, Error detection by selective procedure call

duplication for low energy consumption, IEEE Trans. Reliability 51

(4) (2002) 392–402.

[9] J. Hennessy, D. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, 1996.

[10] Intel Corporation, Mobile Power Guidelines 2000, Dec. 11, 1998.

[11] Plasma CPU Model, http://www.opencores.org/projects/mips.

[12] J. Phil, E. Sand, Arithmetic Module Generator for High Performance

VLSI Designs, http://www.fysel.ntnu.no/modgen/.

Table 3
Self-test program statistics for the entire Plasma RISC processor

Statistic
Test program size (words)    930
Clock cycles                 13,077
Test coverage                94.6%

Table 2
Multiplier test program statistics

Test program        Size (words)   Data references   Clock cycles   Test coverage (%)
TP_ATPG_LOOP        397            337               4240           99.5
TP_ATPG_IMM         2539           1                 2890           99.5
TP_ATPG_IMM_JAL     1207           1                 3562           99.5
TP_LFSR(400)        69             1                 9837           97.6
TP_LFSR(1200)       69             1                 29,437         97.9
TP_DETERMINISTIC    57             1                 4050           99.0
