A scalable high-performance graphics processor: GVIP


<ul><li><p>The Visual Computer </p><p>A scalable high-performance graphics processor: GVIP </p><p>Tsuneo Ikedo </p><p>Computer Architecture Laboratory, University of Aizu, Tsuruga, Ikki-machi, Fukushima 965, Japan </p><p>The GVIP (geometric and TV image processor) graphics processor, which creates and synthesizes computer graphics and TV images and meets the requirements of multi-media systems, is described. The hardware modules that make up this graphics processor include: a 32-bit embedded RISC processor, a Phong and Gouraud shading processor, a texture mapping processor, a hidden surface removal processor, an HDTV video image processor, a BitBlt processor, an image-processing module, and an outline font fill generator. These hardware modules, fabricated using 0.8 µm CMOS standard cells, have been placed in three integrated circuit chips. The total number of gates used for one set of chips is approximately 350000. </p><p>Key words: Graphic processor - Multi-media systems - HDTV - Polygon rendering </p><p>Introduction </p><p>Research and development in the graphics processor field is embarking upon a new epoch as demands for ever higher drawing speeds and multi-media system requirements have made past solutions obsolete. Multi-media systems have to manage as well as synthesize various types of inputs such as computer graphics images, camera images, and sound. For example, data captured by a camera or scanner may have to be synthesized in multi-dimensional space and combined with character data and computer graphics images using filtering or pattern recognition techniques; the graphics output may be displayed on such devices as CRT monitors or color laser printers after converting the pixel primitives into appropriate data formats in real time. 
Past research and development efforts in the graphics processing area have focused mainly on wireframe and polygon rendering techniques; they employed special software algorithms which met such graphics standards as PHIGS+ or used massively parallel computer architectures to enhance the speed of ray-tracing methods. Even for such extremely demanding applications as multi-media graphics, the trend has been to use massively parallel general-purpose processing element (PE) architectures or specialized symmetric hardware modules interconnected for parallel processing. For example, geometric images in current rendering systems are often produced by using special data formats (e.g. polynomial equations) for the drawing primitives, symmetric interconnection networks, and, for the processing elements (PEs), one of the following: general-purpose RISC processors [1], digital signal processors (DSPs) [2], or application-specific integrated circuits (ASICs) [3, 4]. Their performance varies considerably depending on the data type being processed or on the function being carried out; in multi-media applications, they would not fare well because of their inefficiency in handling diverse data types concurrently and their poor utilization of computer resources. In fact, when a drawing speed exceeding a few million polygons per second is required, massively parallel general-purpose PE architectures become extremely cost-inefficient because the required number of PEs becomes excessive for practical implementations. Hybrid organizations which combine PEs, used as graphics accelerators, and specialized hardware for display controllers are also used in currently available commercial systems [5]. </p><p>The Visual Computer (1995) 11:121-133 © Springer-Verlag 1995 </p></li><li><p>
From a cost-performance standpoint, the best solution is believed to be a graphics processor built with several highly specialized modules interconnected for parallel processing. In this paper, a new type of graphics processor, the GVIP (geometric and TV image processor), will be described. This processor is built with multiple hardware modules; each hardware module has its own specialized function, such as execution of application-specific software code (RISC processor), rendering, BitBlt, image synthesis with geometric and HDTV primitives, hidden surface removal, and texture mapping. It can handle various data types without performance degradation because hardware resources do not have to be shared. Furthermore, it is scalable and hence drawing speeds exceeding 10 million polygons per second may be easily achieved. </p><p>System overview </p><p>GVIP is a graphics system which can manage, synthesize, and display as single or combined images various kinds of data derived from computer graphics and TV pictures. The system consists of two parts: the graphics accelerator and the drawing primitives generator. In this paper, only the drawing primitives generator will be described; the GVIP graphics accelerator will be described in a forthcoming paper. The GVIP is scalable and hence its performance may be enhanced by interconnecting sets of modules in a MIMD architecture. One set of GVIP chips can produce 3 million three-dimensional polylines/s and 1.2 million polygons/s while carrying out 100-pixel hidden surface removal (24-bit depth), texture mapping, and Phong shading operations. The remarkable feature of GVIP is that it is capable of maintaining this kind of performance even when it receives HDTV images (73 MHz), because it is able to store images in its frame buffer independently. The basic system shown in Fig. 
1 consists of three VLSI chips: the graphics processor GVIP01, the BitBlt (bit block transfer) processor GVIP02, and the video controller GVIP03. The GVIP system provides real-time interfaces with various devices without using special buffers. Conventional graphics processing computer architecture approaches such as pixel arrays, systolic arrays, or special-purpose multi-processor organizations which employ microprogramming are not used in GVIP. This chip integrates several different hardware modules which perform different functions on different data. Parallel processing and pipelining are used extensively. The drawing speed is virtually unaffected by the type of data processed because there are hardware modules optimized for each data type. Furthermore, GVIP has the capability to operate in parallel at very high speeds in such tasks as simultaneous computer image generation (rendering process), HDTV image transmission, and true color image printing without any additional frame grabbers. </p><p>Fig. 1. GVIP system organization </p><p>Fig. 2. GVIP graphics display system </p><p>Fig. 3. GVIP graphics instruction set </p></li><li><p>The GVIP01 chip is connected to the system bus and acts as the master graphics processor for the system. It has an embedded 32-bit RISC processor and several hardware modules with specialized graphics functions. 
Main memory and program RAM/ROM of the RISC processor are available externally for the user's specified program and data processing. The GVIP01 chip analyzes instructions sent from the system processor and then sends the pixel primitives to the following hardware modules: the Gouraud and Phong shading module, the BitBlt module, the hidden surface removal (HSR) module, the TV (HDTV) image module, the texture mapping module, the outline font fill module, the image processing module, and the parallel link channel protocol module. </p><p>The BitBlt processor chip GVIP02 acts as an interface between the GVIP01 and the frame buffer. This architecture, which uses a master graphics processor and several BitBlt processors to manage pixel operations by dividing a frame buffer, has been used in the past [6-8]. In addition to the usual modules found in BitBlt processors, such as the pixel cache, the interior style generator, the boolean operation unit, and the z cache, the GVIP02 has four-way Phong shading generators, a texture mapping processor, a video signal converter, and a parallel link channel. The transmission speed (including the frame buffer writing time) from the external bus to the GVIP, passing through this chip and terminating in the frame buffer, is 8 ns per pixel (32 bits). This transmission speed figure is based on actual experimental data using a 60-ns access time DRAM (e.g. MB814260) as the frame buffer and a 100 MHz video data frequency. </p><p>GVIP03 is a video controller chip which generates and controls cursors and video images. It integrates the video signal sent by the GVIP02 chip with cursors and outputs the combined signals to a digital-to-analog converter (DAC). The input/output signals of this chip are organized with two-phase timing in order to be able to process high-frequency (100 MHz or higher) video signals. The basic organization of the GVIP system with true color capability is shown in Fig. 
2; in this figure, the GVIP04 chip, which carries out the parallel Z component comparisons, is shown. Detailed descriptions of all the modules in the GVIP system will be given in the subsequent sections. As shown in Fig. 3, the GVIP graphics processing instructions are organized in a hierarchical fashion. The approximately 60 instructions in layer 1 are used to control hardware. Instructions in layer 2 are executed by two different modules: (a) the embedded RISC processor in GVIP01: image processing and highlight pre-processing; and (b) the graphics accelerator (attached to the GVIP01): coordinate transformations, window control, and user-specified highlighting. The instructions in the third layer are used by the system processor. In total there are nearly 200 instructions: 30 for hardware control, 19 for window control, 56 for primitives, 38 for attributes, 47 for coordinate transformations, and 16 for sound. </p><p>Graphics processor GVIP01 </p><p>A block diagram of the GVIP01 chip is shown in Fig. 4. Many different types of highly specialized pipelined/parallel processors are integrated within the GVIP01 chip. These processors are interconnected using a route tree topology. The drawing primitives sent to GVIP01 by the system processor undergo several processes such as coordinate transformation, perspective projection, clipping, and highlighting in order to produce device coordinate primitives which are then used to generate pixels. The RISC processor inside the GVIP01 chip could be used to carry out all of these processes using software routines; however, its main role is restricted to the generation of pixels from the device coordinate primitives so that high throughput may be attained. The coordinate transformations are processed by a graphics accelerator; its details are not discussed in this paper. Other processes carried out or preprocessed by the GVIP01 chip include shading, texture mapping, hidden surface removal, and anti-aliasing. 
This chip works as the central graphics system processor, managing and controlling other chips such as GVIP02 and GVIP03. It was designed using standard cells and the chip occupies 14 mm². The GVIP01 chip consists of the following modules: </p></li><li><p>Fig. 4. GVIP01 graphics processor </p><p>Fig. 5. 
Example of rendering processing </p><p>- Thirty-two-bit RISC processor </p><p>- Rendering (Gouraud and Phong shading) preprocessor </p><p>- Texture mapping processor </p><p>- Thirty-two-bit hidden surface and line removal processor </p><p>- Anti-alias processor </p><p>- BitBlt and image copy/move processor </p><p>- Outline font fill processor </p><p>- Parallel link channel controller </p><p>- Image processing module </p><p>- Frame buffer address and timing generator </p><p>- Video timing generator </p><p>Embedded RISC processor </p><p>Currently, a 32-bit, 25-MIPS embedded RISC processor is used; it will be upgraded to a 64-bit 110-MIPS/100-MFLOPS RISC processor (0.6 µm CMOS) during 1994. Its main functions are to interpret instructions/data sent by the system processor and to distribute instructions and data to graphics modules; it seldom performs arithmetic operations. In addition to a general purpose register, it has 32 pixel (32-bit/pixel) image buffers which can directly transmit pixels to/from a frame buffer. The execution code and data reside outside the GVIP01 chip; graphics-specific instructions are used to minimize the machine cycle. Separate instruction and data buses are used. The RISC processor also performs functions such as control of the multiple local bus and special interrupt and fetch handling for hardware graphics modules. In rendering modes such as Gouraud or Phong shading, this processor is used simply to manage data transmission (load and store) to the appropriate hardware modules; other rendering tasks are handled by the rendering processor. The logic gates used for the RISC processor occupy about 30% of the GVIP01 chip area; this implies that most of the GVIP01 functions are carried out by the specialized graphics modules on the GVIP01 chip. </p><p>Rendering processor </p><p>This processor works together with the embedded RISC processor in the GVIP01 chip. 
It consists of the sequence controller, 16 register files, 14 digital differential analyzers (DDAs) grouped into two sets (seven DDAs/set), and a high-speed bus which connects it to the RISC processor. Vertex interpolations are computed for three-dimensional coordinates, normal vectors, and intensity transformation vectors along the outlines of a polygon. The transformation vector interpolations are needed for texture mapping. Two sets of DDAs are used so that it is possible to simultaneously output the coordinates and the attributes of the right and left branches o...</p></li></ul>
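
The two-set DDA arrangement described for the rendering processor can be illustrated with a small software model. This is only a sketch under stated assumptions, not the GVIP hardware design: the names `EdgeDDA` and `walk_edges` are hypothetical, and the seven attributes per edge (x, z, three normal components, two texture coordinates) are an assumed layout chosen to total seven values, since the paper does not list which quantity each DDA carries.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class EdgeDDA:
    """One set of DDAs: one accumulator per attribute of a polygon edge.

    Hypothetical software model; the attribute layout is an assumption.
    """
    values: List[float]  # current attribute values on this edge
    deltas: List[float]  # constant per-scanline increment of each attribute

    @classmethod
    def from_vertices(cls, top: Sequence[float], bottom: Sequence[float],
                      scanlines: int) -> "EdgeDDA":
        if scanlines <= 0:
            raise ValueError("scanlines must be positive")
        # The only division happens once, at setup time.
        deltas = [(b - t) / scanlines for t, b in zip(top, bottom)]
        return cls([float(t) for t in top], deltas)

    def step(self) -> None:
        # A DDA advances by repeated addition only.
        self.values = [v + d for v, d in zip(self.values, self.deltas)]


def walk_edges(left_top: Sequence[float], left_bottom: Sequence[float],
               right_top: Sequence[float], right_bottom: Sequence[float],
               scanlines: int) -> List[Tuple[tuple, tuple]]:
    """Step the left and right edge sets together, one pair per scanline."""
    left = EdgeDDA.from_vertices(left_top, left_bottom, scanlines)
    right = EdgeDDA.from_vertices(right_top, right_bottom, scanlines)
    spans = []
    for _ in range(scanlines + 1):
        spans.append((tuple(left.values), tuple(right.values)))
        left.step()
        right.step()
    return spans


# Assumed attribute order: (x, z, nx, ny, nz, u, v)
apex = (0, 0, 0, 0, 1, 0, 0)
spans = walk_edges(apex, (-4, 8, 0, 0, 1, 0, 1),
                   apex, (4, 8, 0, 0, 1, 1, 1), scanlines=4)
```

Each entry of `spans` holds the left and right attribute tuples for one scanline, which is the pair a span-filling stage would interpolate between. Stepping both edges in the same loop iteration mirrors the reason the text gives for having two DDA sets: both branches are available simultaneously.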

