
A joint newsletter of the Statistical Computing & Statistical Graphics Sections of the American Statistical Association.

April 1993 Vol.4 No.1 COMPUTING GRAPHICS

A WORD FROM OUR CHAIRS

Statistical Computing

One of the perks of the Chair of the Statistical Computing Section is writing a column for this newsletter. Imagine: I can write for several thousand members of my profession, without the benefit of refereeing. I get to tell you what I think, rather than what I know.

Maybe I'm a bit odd, but what I've been thinking about is copyrighting and patenting of statistical software and of computer material in general. My interest was piqued by an advertisement I received for a computer package that would display a high dimensional plot using a patented algorithm. The name of the method was jargon, so I could tell nothing from the advertisement about the program and what it did. Mainly, what the advertisement did was make me wonder just what it meant for an algorithm to be patented. Was I supposed to believe that patenting was a substitute for peer review? Did it imply certification? Did it mean that I could not compute this graph, whatever it is, with my own computer code unless I paid a royalty?

Assuming that most of us non-lawyers are as ignorant of copyrighting and patenting as I was, I thought a short summary of what I've learned might be of interest. Finding the information was easy: one call to my university's patent office brought a few relevant papers, particularly [1], and a copy of the law. A review of my university library's on-line card catalog gave nearly 200 references, including a journal, Software Protection, which has been published since 1982.

Copyright and patent are very different. The basis for copyright is contained in the U.S. Constitution, which gives to Congress the authority "[t]o promote the Progress of Science and the useful Arts by securing for Limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries" [2]. Exactly what can be covered by copyright was spelled

CONTINUED ON PAGE ??

FEATURE ARTICLE

Saxpy, gaxpy, LAPACK, and BLAS

Colin Goodall, The Pennsylvania State University

Measuring Performance

One of the best understood computational tasks is linear algebra. Considerable effort has gone into fast and accurate code for these manipulations, e.g. LINPACK (Dongarra et al. 1979), EISPACK, and most recently LAPACK (Anderson et al. 1992). The speed of these computations is measured in megaflops (MFLOPS), or millions of floating point operations per second. Each floating point operation is a single arithmetic operation, e.g. a multiplication, a divide, an addition or subtraction, performed in full floating point precision arithmetic, usually 64-bit double precision. Two common benchmarks for comparing performance are Dhrystone, which measures the speed of integer arithmetic computations, and Whetstone for floating point computations. These measure not just the speed of the CPU but also compiler performance. Dhrystone performance, measured in millions of instructions per second, can equal and possibly exceed the clock speed of the CPU.

A more exacting benchmark involves a practical linear algebra task, specifically the 100 × 100 LINPACK benchmark (Dongarra 1993). This measures the speed achieved, in MFLOPS, in computing the Cholesky decomposition of an arbitrary 100 × 100 symmetric matrix, using a fixed set of FORTRAN code. The numbers obtained from this benchmark are surprising: for example a SparcStation II, with a processor rated at around 25 MFLOPS, has a 100 × 100 LINPACK speed of 4.0 MFLOPS (Dongarra 1993). Put another way, the theoretical requirement for an n × n Cholesky decomposition is n³/3 + 2n² flops, or 353,333 flops when n = 100.

CONTINUED ON PAGE ??

EDITORIAL

"Three in 93 and four in 94" was proclaimed as the rallying call of the new editors of the joint Statistical Computing and Statistical Graphics newsletter. If we have succeeded, then this first issue of 1993 should be in your hands before the April Interface meeting. We are planning to mail the second issue before the Joint Statistical Meetings in August, and the third sometime around Thanksgiving. More issues mean the newsletter can be more timely, provide announcements and encourage dialogue. Please use your newsletter to communicate with the membership of the two largest sections of the ASA. Our deadlines for the remaining issues in 1993 are the last day of June and October.

Many regular columns will continue, but we solicit your help with new ideas and offers to write columns or once-only pieces. Please keep those e-cards and e-letters coming!

This issue has two feature articles. The first feature describes the availability and discusses the design of public domain matrix and linear algebra routines. Many academic computer installations can make available high quality subroutines of the standard algorithms without the need to purchase or lease commercial software packages, e.g. IMSL or NAG. Algorithms are available, as described by Colin Goodall, to anyone with FTP software and access to the Internet.

The second feature, about graphics and stereoscopic displays, is like the Sunday night mini-movie on network television. It looks like a feature article, but is really the first episode of a new column. Dan Carr has stepped down as editor but couldn't resist the challenge of a regular column. The graphic images which accompany this article should also be seen as the beginning of more innovative graphical material which we would like to print.

The Newsletter is now being typeset in LaTeX. After three years of colorful newsletters we have tried to restrain ourselves to typographic spice and a change in format. Neither of us has any sense of color, so we will take the conservative (and probably boring) tack and use black text on a neutral background. We want to credit Kevin Fox, the design artist from Penn State, for the new masthead and for keeping our link to the past with the intersecting circles punctuating the articles.

Submissions should be sent by email to either of the editors. If you can prepare your article in TeX or LaTeX that will make our lives just a little easier. Otherwise plain old ASCII format is fine.

In preparing this issue we have realized yet again the herculean efforts of the newsletter's founding editors, Sallie Keller-McNulty and Dan Carr. We hope we can live up to the standards they have set for us.

James L. Rosenberger
Editor, Statistical Computing
[email protected]

Mike Meyer
Editor, Statistical Graphics
[email protected]

SECOND FEATURE

Production of Stereoscopic Displays for Data Analysis

Daniel B. Carr, George Mason University

Dedicated to David L. Hall, colleague and friend.

Stereoscopic displays help the analyst escape from the limited domain of 2-D visualization into the natural domain of 3-D visualization. The goal of producing 3-D scatterplots motivates much of the following discussion. The goal has strong implications in terms of selecting a stereo projection. In the everyday world, familiarity with objects and many depth cues facilitate fusion of left and right retinal images into a stereo image. Monocular depth cues include linear perspective (objects of equal size transect areas inversely proportional to their distance), interposition or occlusion (when one object is in front of another it obscures the more distant object), shadows (we generally assume light comes from above), detail perspective (no fine detail appears in distant objects due to limited visual acuity), and aerial perspective (greater optical depth through the air leads to a blue shift). For most elementary 3-D scatterplots prior knowledge about the form to be perceived is limited and monocular cues are restricted, so care must be taken in the selection of a stereo projection.

Two Infinite Families of Reasonable Stereo Projections

Different geometric models lead to different stereo projections. (Geometric models are idealized in that each eye has a blind spot, a region of high resolution, and various imperfections.) A simple model (Newman and Sproull 1979) presumes that the eyes converge on a single focal point and constructs left and right images by


projecting onto left and right projection planes. The projection planes contain the focal point and are orthogonal to the respective lines of sight. This fixed-focal-point model is appropriate for advanced dynamic systems that update immediately as the eyes change their focal point. However, the fixed-focal-point projection does not correspond to the data analyst's typical stereo-viewing scenario. Valyus (1962) states, "it has been shown experimentally that eye movements performed when stereoscopic pictures are viewed are similar to those performed in observing a real object. As the gaze is transferred from one object to another the eyes perform conjugate movements directed to the subjectively most important regions, and at the same time coordinated convergence movements take place." While the fixed-focal-point projection has proved passable for showing non-updated images of familiar scenes, image fusion problems result when looking at points in the corners of the plot. Thus fixed-focal-point projection is inadequate for 3-D scatterplots.

When the eyes have multiple focal points within the same stereo image, a reasonable compromise uses a single common projection plane that is parallel to a frontal view of the face. In a multiple-focal-point model, a data point projected into the projection plane has the same y coordinate for both left and right images. This is a fundamental requirement for 3-D scatterplot projections.

[Figure 1: Stereo projection geometry. The left and right centers of projection, LCOP = (−e/2, 0, d) and RCOP = (e/2, 0, d), view a data point (x, y, z); its left and right images fall at (xl, yb, 0) and (xr, yb, 0) in the screen plane, with axes +X, +Y, and −Z.]

Two classes of projections satisfy the multiple-focal-point (single projective plane) constraint. The first class uses standard projective methods with separate centers of projection for the left and right eyes, denoted LCOP and RCOP respectively. Assume a right-handed coordinate system with positive z toward the viewer. Then the LCOP and RCOP coordinates relative to the center of the workstation screen are (−e/2, 0, d) and (e/2, 0, d), where e is the eye separation (see Figure 1). The projected coordinates can then be found by scaling the vector from the eye to the data point by a constant, s, so that the scaled vector touches the screen (the z coordinate is 0). For the right eye this yields

s [(x, y, z) − (e/2, 0, d)] + (e/2, 0, d) = (xr, yr, 0)    (1)

Solving both right-eye and left-eye equations for s based on the z coordinate and substituting yields:

xr = (dx − ez/2)/(d − z)
xl = (dx + ez/2)/(d − z)    (2)
y = dy/(d − z)

The fact that both left and right images have the same y coordinate is evident from geometric considerations. Consider a data point appearing behind the projection plane (or viewing screen). The two eyes and this data point form a triangle that intersects the screen. If the frontal view of the face is parallel to the screen and the eyes are level, then the twin projected points must also be level.
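To make the arithmetic concrete, here is a minimal FORTRAN sketch of the first projection class; the routine name and argument order are my own, not from any published package. It maps one display-scaled data point to its left- and right-eye screen coordinates using equations (2).

      SUBROUTINE PPROJ(X, Y, Z, E, D, XL, XR, YB)
C     Perspective stereo pair, eq. (2): E = eye separation,
C     D = projection distance, screen in the plane Z = 0.
C     (Illustrative sketch; assumes Z .LT. D.)
      DOUBLE PRECISION X, Y, Z, E, D, XL, XR, YB
      XR = (D*X - E*Z/2.0D0)/(D - Z)
      XL = (D*X + E*Z/2.0D0)/(D - Z)
      YB = D*Y/(D - Z)
      RETURN
      END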

The second class of multiple-focal-point stereo projections may be called a depth-cued orthogonal projection. In this projection, the LCOP and RCOP are shifted for each data point so that the midpoint between the eyes has the same x and y coordinates as the display-scaled data point to be projected. The depth-cued points are then

xr = x − ez/[2(d − z)]
xl = x + ez/[2(d − z)]    (3)
y = y
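The depth-cued orthogonal class differs from the sketch above only in the x equations; a companion sketch (again with invented names):

      SUBROUTINE OPROJ(X, Y, Z, E, D, XL, XR, YB)
C     Depth-cued orthogonal stereo pair, eq. (3); x and y can
C     be read directly from the plot, parallax cues the depth.
      DOUBLE PRECISION X, Y, Z, E, D, XL, XR, YB
      XR = X - E*Z/(2.0D0*(D - Z))
      XL = X + E*Z/(2.0D0*(D - Z))
      YB = Y
      RETURN
      END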

The two projection classes have continuous projection and viewing parameters. Projection parameters include eye separation (e), projection distance (d) and the size of the viewing cube into which we translate and scale the data. For convenience consider the workstation screen as a viewing cube that is 20 centimeters on a side with its front face centered and aligned with the screen surface. Viewing conditions may differ from the projection model, and that introduces magnification (m) and viewing distance (d′) parameters. These parameters can be varied over an interval and still lead to comfortable image fusion of a 3-D point cloud. Consequently both stereo projection classes are infinite.

Projection Parameter Bounds and Parallax

While an infinite number of projections are effective, some parameter bounds should not be violated and some projections are more desirable than others. The following provides some background concerning parameter constraints. The curious reader is referred to Valyus (1962) for additional detail.

For most people, eye separation falls in the interval between 5.2 and 7.4 centimeters. Fortunately, using the exact eye separation for each individual is not crucial, as evidenced by stereo publications that are enjoyed by diverse audiences.

Parallax is a key concept for understanding the projection parameter bounds. The horizontal parallax, p, of a point refers to the distance between the projected coordinates on the screen, xr and xl. Then for our multiple-focal-point stereo projection

p = xr − xl = −ez/(d − z)    (4)

A point appearing in front of the screen will have a positive z, z < d, and the parallax will be negative. Similarly, a point appearing behind the screen will have positive parallax. Parallax must be limited to provide acceptable stereo fusion. Maintaining image focus constrains the amount of acceptable parallax. Suppose the eyes converge on the twin images of a point as if the point were real. The apparent location of the point is the focal point. Eye convergence is usually coupled with accommodation (lens focusing), so that a region in front and in back of the focal point is in focus. If this region includes the workstation screen the image of the point will be clear. While those experienced in stereo viewing often learn to decouple the convergence and accommodation of their eyes, a mismatch can lead to either fusion or focus problems. Constraining the parallax to keep the perceived image depth close to the screen avoids such problems. The horizontal parallax is related to angular parallax on the retina. Studies (Valyus 1962, Yeh and Silverstein 1990) have related the speed of image fusion to angular parallax and provide guidelines. In short, restricting the viewing cube to 23 centimeters on a side or smaller will generally satisfy the less restrictive (but slower fusion) Valyus bound. Lipton (1982) recommends the equivalent of centering the viewing cube depthwise on the screen to control the parallax. In this case the viewing cube can be made substantially larger before the parallax becomes excessive.

Magnification and Viewing Distance

View-related parameters include the viewing distance d′ and the image magnification m. The equation of parallax can be solved for the apparent distance from a point to the screen, z′, in terms of the actual viewing distance, d′:

z′ = d′/(1 − e/p)    (5)

If the parallax is fixed then doubling the viewing distance doubles the apparent distance of the point to the screen. Thus the viewing cube frame can be made to appear squashed or elongated by selecting a different viewing distance, d′, than the actual projection distance, d.
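A small numerical check of equations (4) and (5); the values of e, d, d′, and z in centimeters are illustrative choices of mine, not prescribed by the text.

      PROGRAM DEPTH
C     Parallax (eq. 4) and apparent depth (eq. 5) for one point.
C     All values are illustrative only.
      DOUBLE PRECISION E, D, DP, Z, P, ZP
      E  = 6.3D0
      D  = 60.0D0
      DP = 60.0D0
      Z  = -10.0D0
      P  = -E*Z/(D - Z)
      ZP = DP/(1.0D0 - E/P)
      WRITE (*,*) 'parallax p  =', P
      WRITE (*,*) 'apparent z'' =', ZP
      END

With z = −10 the program prints p = 0.9 and z′ = −10, confirming that when d′ = d the point appears 10 centimeters behind the screen; doubling d′ to 120 doubles z′ to −20, the squashing and elongation described above.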

Often a stereo image is shown to a large audience on a big screen. If the image is not designed for the room, magnification may cause problems. Magnifying the image magnifies the parallax. A Taylor series expansion of the depth equation about p = 0 yields

z′ = −d′ (p/e + (p/e)² + (p/e)³ + ⋯)    (6)

When the parallax is small relative to eye separation, magnifying the image increases the apparent depth almost linearly. As the parallax to eye-separation ratio approaches one-half, the nonlinear terms contribute equally and magnification distorts the image. As magnified parallax approaches eye separation, the depth of image theoretically approaches minus infinity. This causes both depth distortion and fusion problems before the apparent depth reaches infinity.
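A sketch that tabulates the distortion, comparing the exact depth of equation (5) with the three-term series of equation (6) as an integer magnification scales a fixed screen parallax (all values are again illustrative assumptions):

      PROGRAM MAGNIF
C     Exact apparent depth (eq. 5) versus the three-term
C     series (eq. 6) as magnification M scales the parallax.
      DOUBLE PRECISION E, DP, P, PM, R, ZEX, ZSER
      INTEGER M
      E  = 6.3D0
      DP = 60.0D0
      P  = 0.5D0
      DO 10 M = 1, 6
         PM   = DBLE(M)*P
         R    = PM/E
         ZEX  = DP/(1.0D0 - E/PM)
         ZSER = -DP*(R + R**2 + R**3)
         WRITE (*,*) M, ZEX, ZSER
   10 CONTINUE
      END

The two columns agree closely while p/e is small and separate as the magnified parallax grows toward the eye separation, which is the nonlinear distortion described above.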

Control of Perspective Differences

Under natural conditions the left and right eyes have different views of the world. The field of view for a single human eye is about 150° horizontally and 135° vertically. In binocular vision the field of view covered by both eyes is 120°, so 30°, or 1/5 of each eye's field of view, is unique to the eye. For 3-D scatterplots all points must be seen unless hidden by occlusion, so images must be restricted to the shared field of view. If the viewing cube is sufficiently smaller than the available display space on the workstation screen this is not a problem.

Hodges (1992) provides an astute comment on stereo production for workstations. He notes that some hardware systems provide only a single center of projection. Then the standard trick for producing stereo projections is to use the "off-axis" projection. For each eye's view this shifts the data toward the midpoint between the eyes, projects from the midpoint and then shifts the result back. While the projections produce coordinates identical to the LCOP and RCOP projections, the fields of view differ. This provides another reason to constrain the size of the viewing cube. For scatterplots all corresponding left and right points must appear on the screen.

Perspective differences within the shared field of view can cause fusion and interpretation problems. Perspective-induced fusion problems may be evident when showing the viewing-cube frame in small side-by-side plots. If, for example, the left cube face has the x coordinate at −e/2, the left cube face will project as a line in the left eye image but as a trapezoid in the right eye image. This radical perspective-induced discrepancy complicates image fusion. Since the viewing cube frame is simple, avoiding radical differences in perspective is straightforward. For example rotating the viewing cube often suffices. In fact some analysts prefer perspective views of the cube that have two or three infinity points rather than the common head-on view that has only one infinity point. Another trick bases the projection on a distance substantially larger than the actual viewing conditions. This reduces perspective differences.


Figure 2 - Random Points on a Mobius Strip. The structure is apparent with any 3-D viewing approach. In flatland, recognizing the structure is much harder. Piecing scatterplot matrix views together is not so easy without brushing or conditioning. Conditioning or slicing is helpful in moving to higher dimensions. For real data, overplotting is more of a problem and density-based presentations become advantageous.

Figure 3 - Selected Stereo Contours for a 3-D Density Estimate.

Figure 4 - Steepest Ascent Ridge Traces for a 2-D Density Surface. Ridges and their projections in domain space provide another way of viewing density estimates. This example is provided courtesy of Qiang Luo.

Figure 5 - Random Dot Stereogram. This contour completion illusion at different depths takes a while to fuse. This image is provided courtesy of Nathan Carr, who used it in a high school science project that repeated experiments pioneered by Bela Julesz (1971).


The foreshortening (interpoint distances appear smaller as a function of depth) of perspective views complicates the reading of coordinates based on given axes. Equations (3) provide an orthographic stereo projection that removes all perspective differences and allows the x and y coordinates to be interpreted directly. Carr and Littlefield (1983) describe a simple implementation that exploits the scaling in standard statistical graphics packages. Plot the left eye and right eye coordinates of a point using (X − Xp, Y) and (X + Xp, Y) respectively, where the values are in data units. The expression for the half parallax, Xp, in X data units, is

Xp = k · range(X) · (Z − Zmin)/range(Z)    (7)

The constant k is chosen as a conveniently small value such as 0.026.
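A sketch of the data-unit recipe (the routine name and argument list are mine); each point is drawn twice, at (XL(I), Y(I)) for the left eye and (XR(I), Y(I)) for the right:

      SUBROUTINE HPAR(N, X, Z, K, ZMIN, ZRNG, XRNG, XL, XR)
C     Half parallax in data units, eq. (7); plot (XL(I), Y(I))
C     for the left eye and (XR(I), Y(I)) for the right eye,
C     with the y coordinates unchanged.
      INTEGER N, I
      DOUBLE PRECISION X(N), Z(N), K, ZMIN, ZRNG, XRNG
      DOUBLE PRECISION XL(N), XR(N), XP
      DO 10 I = 1, N
         XP    = K*XRNG*(Z(I) - ZMIN)/ZRNG
         XL(I) = X(I) - XP
         XR(I) = X(I) + XP
   10 CONTINUE
      RETURN
      END

Because any plotting package that honors data units can draw the two panels, this works without special stereo hardware, which is the point of the Carr and Littlefield construction.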

Littlefield provided a color-anaglyph implementation in a statistical package over a decade ago by modifying Minitab (remember when source code was available). This modification allowed drawing of arbitrary glyphs and included color table control to handle light mixing. Red and green points overplotted as yellow, as they should, and color control allowed subtle depth-based shading. Color polaroids demonstrating that a third variable helps little in group discrimination were shown at the first ASA Graphics Exposition in 1982, and additional examples were published (Carr, Nicholson, Littlefield, and Hall 1986). While color anaglyph stereo is not the most desirable form of stereo, the anaglyph work demonstrated the simplicity of stereo plot production as long as statistical packages provide for control of color mixing (hint).

Side by Side Stereo Examples

An article on graphics should have some graphics. Stereo production methods are incredibly diverse. For workstations, time-multiplexed methods that alternately route images to the left and right eyes are gaining in popularity. Side-by-side methods are common in non-electronic publications and so will be used here. Note that other disciplines often publish stereo images in color (Editors' note: Maybe someday the newsletter will be able to afford to do this). For example see the ray-traced image in Hodges (1992). The additional color-based depth cues are especially helpful when showing surfaces. The images here are monochrome point and line drawings and for brevity focus on geometry rather than data analysis. The legends provide a brief description of the simple examples.

The examples are designed for parallel fusion. To fuse the images look at the left image with the left eye and the right image with the right eye from a distance of about 50 centimeters. It may be helpful to separate the images with a card or to use an inexpensive magnifying stereopticon. These parallel fusion figures were produced using an S function (stereo.pairs) which is available via anonymous ftp from galaxy.gmu.edu (in subdirectory /submissions/eda).

Future Articles

Separate articles will give background for more advanced examples and provide some discussion of statistical interpretation. The next article will focus on alpha blending and with luck will include a translucent stereo rendering of overlapping contours of a density estimate for 3-D data. While many practicing statisticians do not yet have the requisite hardware, computing environments are changing very rapidly. Soon a high percentage of statisticians will be able to study 3-D structure in the calm of non-rotating images. Future articles will discuss enhancement tools for the representation and study of data.

Additional References

Many additional details are available in the literature concerning stereoscopic resolution, hyperstereoscopy, enhancements for statistical images, common production difficulties, etc. Some useful starting references are Wegman and Carr (1992), Hodges (1992), Carr and Nicholson (1988), Papathomas and Julesz (1988), Huber (1987) and Lipton (1982). Valyus (1962) provides one of the most detailed expositions, and the fascinating work of Julesz (1971) produces considerable insight into visual processing via study of optical illusions encoded in random dot stereograms.

References

Carr, D. B. and W. L. Nicholson (1988), "EXPLOR4: A Program for Exploring Four-dimensional Data." Dynamic Graphics for Statistics, eds. W. S. Cleveland and M. E. McGill, pp. 309-329. Belmont, California: Wadsworth.

Carr, D. B., W. L. Nicholson, R. J. Littlefield, and D. L. Hall (1986), "Interactive Color Display Methods for Multivariate Data." Statistical Image Processing and Graphics, eds. E. J. Wegman and D. J. Depriest, pp. 215-250. New York: Marcel Dekker.

Carr, D. B. and R. J. Littlefield (1983), "Color Anaglyph Stereo Scatterplots – Construction Details." Computer Science and Statistics: Proceedings of the 15th Symposium on the Interface. New York: North Holland Publishing Company.

Hodges, L. R. (1992), "Tutorial: Time-Multiplexed Stereoscopic Computer Graphics." IEEE Computer Graphics & Applications, pp. 20-30.

Huber, P. J. (1987), "Experiences With Three-Dimensional Scatterplots." Journal of the American Statistical Association, 82(398), 448-453.

Julesz, B. (1971), The Foundations of Cyclopean Perception. Chicago: University of Chicago Press.

Lipton, L. (1982), Foundations of the Stereoscopic Cinema: A Study in Depth. New York: Van Nostrand Reinhold.

Newman, W. M. and R. F. Sproull (1979), Principles of Interactive Computer Graphics, Second Edition. New York: McGraw-Hill.

Papathomas, T. V. and B. Julesz (1988), "The Application of Depth Separation to the Display of Large Data Sets." Dynamic Graphics for Statistics, eds. W. S. Cleveland and M. E. McGill, pp. 353-377. Belmont, California: Wadsworth.

Valyus, N. A. (1962), Stereoscopy. New York: The Focal Press.

Wegman, E. J. and D. B. Carr (1992), "Statistical Graphics and Visualization." Technical Report No. 84, Center for Computational Statistics, George Mason University, Fairfax, VA 22030.

Yeh, Y. Y. and Silverstein, L. D. (1990), "Limits of Fusion and Depth Judgement in Stereoscopic Color Displays." Human Factors, 32, 45-60.

*Minitab is a trademark of Minitab, Inc. S is a trademark of AT&T.

Daniel B. Carr
George Mason University
[email protected]

DEPARTMENTAL COMPUTING

Not just hardware and software

A computer system has four major components: hardware, software, communications and people. The hardware and software angles of computing get discussed at length. In this column I will try to focus on issues in computing that have to do with getting a coordinated computing system going and maintained. The issues cut across types of hardware and software, operating systems and organizations. While my personal experience has been in developing a large (100+

computers) UNIX-based environment in academia, the issues surrounding departmental computing appear in business and government and across platforms. My department has networked PCs, Macs and multiple Vaxes in addition to the UNIX network. I'll try to keep the presentations as independent as I can of any particular computing platform. We'll discuss platform specifics when necessary for examples of general principles.

I confess to having several basic premises that guide my thinking about departmental systems.

- Departmental computing is the right level for support of statistical activities. Reliance on university or corporate level computing systems naturally leads to inadequate resources.

- A coordinated system in which users can send and receive mail, share files, share printers and other peripherals, share software installations, share documentation and share user support is a reasonable goal.

- Planning is possible. Despite budget uncertainties and power politics, planning can and should be done.

- Diversity happens. We can't stop it and we have to work to accept it. The best system for a fiscal person may not be the best system for a statistical scientist. Should all the statistical scientists in a department have the same platform? Let's discuss this. Let me know what you think. See the end of this column for my e-mail address.

- A network connection to the Internet is essential for proper access to the information needed to do one's job. Again, let me know what you think.

- Larger systems require full-time systems management. We'll start here (see the next section).

In future columns I hope to cover issues that hinder acceptance and development of departmental computing resources. Here are some of the issues I see as important (in no particular order):

- Acquisition of systems. Choosing platform(s), fund raising, planning, purchasing. Breaking in: how does one get started? What about solo machines?

- Administration. System management, user support.

- Maintenance (hardware, software, network gear) of modern departmental computing networks.

- Resource sharing. Coordinating funds as well as equipment.

- Specifications. What should the system be able to do? Performance, software, licensing, vendors.

- Diversity and innovation. Supporting diversity, planning and implementing innovation.

- The computer life-cycle. Trickle-down computing and opportunities for old equipment.

- Migration to new systems and new ways of working. Acceptance and training.

- Institutional barriers to progress. Technical, physical, financial and bureaucratic impediments.

As a first topic, let's consider the often neglected requirements for system management.

System Management

Providing system management for a new or evolving system can be a real "catch-22". How can you justify system management if you don't yet have a complex running system? How can you build a complex running system without system management?

Most departments recruit a knowledgeable faculty member into the role of system administrator or manager. If things go well, the system grows and the faculty member gets burned out trying to do system management in his or her "spare time". Graduate students may be called in to assist, or a scientific programmer may become involved. If things go very well, the system becomes complex and the case is made to hire full-time help. The faculty member is relieved of system management duties. The start-up phase from no system to a system requiring a system manager may last years.

Just as a department needs secretarial help to support the work of its faculty, a department with a complex modern computing system will need a full-time system manager to support the systems and their use by students, staff and faculty. Twenty systems is more than a part-time job and 100 systems is more than a full-time job. Check with other departments at your institution to see what staffing commitments they have made to system management.

Centralized support organizations may be available at your institution to relieve some system management duties, but these rarely compensate for having someone locally.

System managers are responsible for the day to day operation of the system. They perform backups, install software, install revisions, install new hardware, send hardware out for repair, make minor repairs in-house, prepare strategic plans, keep up-to-date on hardware and software trends, manage the department's software licenses, track down bugs in hardware and software, monitor system usage, provide and maintain basic security for data, interact with the institution's networking organizations, respond to hardware, software and network failures, design and implement solutions to workgroup problems, train users in the operation of the system, write documentation regarding system procedures, use of software, and local conditions, maintain the various operating system software subsystems, respond to changing external network conditions, simplify system use by providing new tools, negotiate with vendors, and generally make sure that everyone can use the computers to do their jobs.

Why would anyone think this is not a full-time, difficult job? It reminds me of the problems we have convincing some non-statisticians that statisticians are necessary. "Why can't I do it myself with a computer program?" they ask. And we stand ready to tell them. We must be equally clear that system management should not be done as a hobby.

In planning for system management, or making the case for system management, there are additional issues. Who will provide backup for the system manager so that there is someone to turn to when the system manager is on vacation and things break? The obvious choice is the faculty member who recently retired from creating the system in the first place. A second issue involves hiring and retention. System managers can be hard to find. Senior system managers have large salaries and may have tired of the daily grind of responding to problems. Junior people may lack either the technical knowledge necessary to do a thorough job, or the maturity and responsibility that comes with experience. In any case, system managers are in very great demand, and rightly so. One needs to expect turnover in the system manager's position.

What's Next?

I would very much like to hear from the readers. We'll all get bored hearing what I did, what I'm doing, and what I think is important. What issues have you faced in the construction of your departmental system? How were problems solved? What problems remain? Which issues that I plan to discuss are the most important to you? Which are irrelevant? What should I discuss that I haven't mentioned?

The best way for me to communicate with you is via electronic mail. My address is at the end of the column. I will collect comments, questions, criticisms and anecdotes and use them in future columns (with your prior permission, of course).

At the ASA meeting in San Francisco this August, I will be hosting a roundtable luncheon on the topic of Departmental Computing. Please join us!

Michael Conlon
University of Florida
[email protected]

FROM OUR CHAIRS ...

CONTINUED FROM PAGE 1

out in the 1976 copyright law: protection is available for "original works of authorship ... fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device" [3]. Originality does not imply that the work must be novel or unique, only that it originate with the author. Thus, for example, a computer program written as a class assignment is presumably a candidate for copyright even though many other students would produce similar programs to perform the same task.

Some things are specifically excluded from copyright, such as "any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied ..." [4]. I would interpret this to mean that one could copyright a program that computes a singular value decomposition, for example, but one could not copyright the singular value decomposition itself.

The owner of a copyright has authority to sell or otherwise distribute copies of the program. If I buy a copy of Systat, for example, I have whatever rights are laid out in the agreement between me and the copyright owner. The license for Systat allows me to use the program on one CPU for its intended purpose of statistical analysis. I do not have the right to reverse engineer the software to find out how it works. I can sell my copy, but the license agreement will apply to the buyer as well. If I were to use Systat with my own data, output would be produced in a format that may be unique to Systat; for example, I might get a graph that could not be produced in any other way. Copyright law specifically excludes forms such as bank checks or date books from copyright, so presumably the graph produced in Systat would be owned by me, not by Systat, Inc.

Whereas a copyright gives its holder exclusive use of a creation, a patent gives its holder the right to claim royalties to an invention. Patents are generally not intended to limit use of an invention, only to require payment for use. For example, a patent could be issued for an invention such as a cassette tape. Many manufacturers can make cassette tapes, and they would pay a royalty to the patent holder.

Under U.S. patent law, it appears that a mathematical algorithm, defined as a "procedure for solving a given type of mathematical problem" [5], is not patentable, while a particular application of an algorithm may be patentable. For example, a patent would not be issued for a factor analysis, but a patent could be issued for a specific implementation in computer code that does factor analysis. Different computer code using the same computational algorithm may not be in violation of the patent.

According to Greg Aharonian, over 9000 software patents have been issued, with over 1300 issued in 1992 alone, mostly to a few large companies. Many of the patents are issued for code that serves a specific purpose. For example, IBM was issued patents for a "Computer user interface with window title bar icons" and for a "Method for deleting a marked portion of a structured document." Again according to Aharonian, patent infringement lawsuits are rare. He believes that most patents are obtained as a defensive measure to protect against future lawsuits, not to prosecute others.

After this bit of research, I guess I now understand a program that uses a patented algorithm for computing a graph: the statement is advertising hyperbole. Patenting is no substitute for peer review, and if the graph is of use, and I understand it, I am free to write my own code to compute it.

References

[1] Wadley, James B. (1986), "An Introduction to copyright protection of computer programs and the semiconductor chip protection act of 1984," Journal of the Kansas Bar Association, July, 185-190.

[2] United States Constitution, Article 1, Section 8.

[3] 17 U.S.C. 102(a).

[4] 17 U.S.C. 102(b).

[5] 1106 OG 6.

Sanford Weisberg
Chair, Statistical Computing
[email protected]

Statistical Graphics

Hello. I know you're busy, so I won't run on too long here at the keyboard. I just wanted to let you know that the Statistical Graphics Section has many activities planned for 1993 and that you are invited to take part. The easiest way to participate is to come to the Joint Statistical Meetings in San Francisco; it will be a prime focus for graphics events. I'd like to tell you about a few of the things we have planned.

Research contributions are always an important component of the meetings, and David Scott has put together a strong set of invited paper sessions; the list appears on page 21. If you would enjoy stimulating conversation on graphical topics, try one of the luncheon roundtables. Sally Morton has been working to catalog, distribute, and display our growing collection of graphics video tapes. The newest and best of these will be shown in a screening room at the meetings. Another feature of this year's graphics program is a continuing education course that we are sponsoring: Bill Cleveland on "Visualizing Data".

Every other year the graphics section sponsors a poster session where any interested person or group is invited to analyze a common dataset. As a participant, it's always stimulating to be able to discuss a project with others who have looked carefully at the same data; as a spectator, it is interesting to compare and contrast the different approaches to the data. This year, David Coleman has provided two datasets for the session: one on the nutritional content of breakfast cereals and another on a time-series with surprises. Make sure you show up to see the posters this year, and make a note to do the analysis and contribute a poster of your own for the 1995 meetings.

Keep on the lookout for the student poster competition that we sponsor with the Statistical Education section. The work of these school children is always refreshing and often displays data in a very creative way.

Under the new ASA constitution, the sections have much more autonomy to provide services for their members. The statistical graphics section can accomplish a lot, but we need to know what activities would be of most interest to you. Let me or any of the other section officers know if you have a good idea that the section can pursue. Also, please get involved; we are always looking for volunteers who would like to play a role in section activities.

Rick Becker
Chair, Statistical Graphics
[email protected]

SAXPY, GAXPY, ...

CONTINUED FROM PAGE 1

Nominally, around 71 Cholesky decompositions of a 100 × 100 matrix can be calculated each second at 25 MFLOPS, but in actuality only 11 per second are possible at 4.0 MFLOPS. The performance of other machines is given in Table 1, selected from 671 entries in Dongarra (1993).
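The arithmetic behind those counts can be checked directly with a throwaway program (mine, not part of the benchmark suite):

      PROGRAM CHOLFL
C     Cholesky flop count n**3/3 + 2*n**2 for n = 100, and the
C     decompositions per second implied by a MFLOPS rate.
      DOUBLE PRECISION N, F
      N = 100.0D0
      F = N**3/3.0D0 + 2.0D0*N**2
      WRITE (*,*) 'flops            =', F
      WRITE (*,*) 'per sec at 25 MF =', 25.0D6/F
      WRITE (*,*) 'per sec at  4 MF =', 4.0D6/F
      END

It prints 353,333.33 flops, about 70.8 decompositions per second at 25 MFLOPS and 11.3 at 4.0 MFLOPS, matching the rounded counts above.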

The 100 × 100 LINPACK benchmark uses standard compiler options. For example, the -dalign option on SUNs, which forces the alignment of each double precision datum on an 8-byte boundary, can yield faster speeds. Significantly faster speeds can be achieved if the code itself is modified. These attempts are recorded against the 1000 × 1000 LINPACK benchmark in Table 1 (col. 2). Considerable ingenuity has gone into optimizing 1000 × 1000 LINPACK performance, and it is apparent from Table 1 that a two- to three-fold increase in performance or better is common, which then approaches nominal performance (col. 3).

Table 1: LINPACK (Cholesky decomposition) performance in MFLOPS
for 100 × 100 and 1000 × 1000 matrices

Computer                        100     1000     CPU
CRAY Y-MP C90 (16 proc.)        479     9715     15238
CRAY Y-MP C90 (1 proc.)         387     874      952
NEC SX-3/14R (1 proc.)          368     5199     6400
DEC 10000-610 (200 MHz)         43      112      200
HP 9000/735 (99 MHz)            41      107      198
IBM 6000-560 (50 MHz)           31      84       100
DEC 3000-500 (150 MHz)          30      80       150
SGI R4000 (50 MHz)              16      32       50
IBM 6000-530 (25 MHz)           15      42       50
SUN SPARC 10/30 (36 MHz)        9.3
SUN SPARC 2                     4.0
Apple MAC Quadra 700            1.4
NeXTCube                        1.4
Compaq Deskpro 486              1.3
nCube 2 (1024 proc.)                    258      2409
nCube 2 (32 proc.)                      46       75
nCube 2 (1 proc.)               0.78    2.02     2.35
Apple Mac IIfx                  0.37
Apple Mac IIsi                  0.19
Compaq 386/20 w/387             0.16
microVAX II                     0.12
IBM AT w/80287                  0.012

The initial goal of this article is to elaborate further on the gap between nominal and actual performance. The key issue is memory management, discussed in the next section. The efficient use of memory is illustrated for the apparently straightforward task of matrix multiplication in the Matrix Multiplication section. The details are illuminating, and include description of the "elementary operations" of vector and matrix arithmetic, saxpy and gaxpy. Two other, very effective techniques for speeding computations are block algorithms and loop unrolling, discussed in the Performance Issues section.

10 Statistical Computing and Statistical Graphics Newsletter April 1993

In programming the LINPACK subroutines for linear algebra, Dongarra et al. (1979) noted that efficient code could be achieved through a modular approach. That is, an efficient and numerically accurate algorithm for, say, the QR decomposition or singular value decomposition, is implemented through a series of calls to a standard library of basic subroutines for vector and matrix manipulation, the BLAS (Basic Linear Algebra Subroutines). The same is true for LAPACK (Demmel et al. 1991), but in a more sophisticated sense, reflecting the memory management issues discussed in this article. The BLAS encompass many operations, from the most simple such as vector copy or vector inner product (BLAS level 1, used in LINPACK), to matrix multiplication and the solution of triangular systems of linear equations (BLAS level 3, used in LAPACK).

The portable FORTRAN public-domain implementation of BLAS has been improved by several hardware vendors, including IBM and Silicon Graphics Inc., and can include some assembly coding. Dyad Software markets an assembly code version of BLAS for DOS. The BLAS routines are useful very generally: while it is quite easy to code a routine to handle such tasks as copying an array to another array, computing the inner product of two vectors, or matrix multiplication, BLAS provides these routines also. A tabular summary of the BLAS routines is given on page ??. They are carefully coded for numerical accuracy and speed, and can often be called from C.
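For example, the level 1 routine daxpy computes y := a*x + y with calling sequence (n, da, dx, incx, dy, incy). A minimal sketch of a call, assuming the program is linked against a BLAS library:

      PROGRAM TRYBLA
C     y := a*x + y through the level 1 BLAS routine DAXPY,
C     with unit increments through both vectors.
      INTEGER I
      DOUBLE PRECISION A, X(5), Y(5)
      EXTERNAL DAXPY
      A = 2.0D0
      DO 10 I = 1, 5
         X(I) = DBLE(I)
         Y(I) = 1.0D0
   10 CONTINUE
      CALL DAXPY(5, A, X, 1, Y, 1)
      WRITE (*,*) Y
      END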

LINPACK, LAPACK, and the portable implementation of BLAS can be obtained by "mail order" from netlib, using e-mail, ftp, or the X-based graphical interface program xnetlib. The xnetlib software is particularly sophisticated, as it includes the library contents hierarchically by name and by subject classification, includes various search options, and follows dependencies to facilitate downloading all relevant files. Calls to the BLAS library can then be substituted for the BLAS routines distributed with individual LINPACK routines; the LAPACK distribution keeps the two separate. The xnetlib software can be obtained by ftp from netlib.

This section concludes with a caveat. In a user application, such as graphics, combinatorial optimization, or data base manipulation, even if the code is built on the BLAS library, the usage of memory may be substantially more non-local, and include much more indirect addressing, than is found in typical linear algebra applications. Then the gap between nominal and actual performance is wider still. For example, the IBM RISC 6000 architecture provides exceptional performance on the LINPACK benchmarks, especially the 1000 × 1000 benchmark, which utilizes IBM's ESSL library implementation of BLAS. On the other hand, on a molecular dynamics benchmark, the price/performance ratio of a DEC, SGI, or HP workstation can be better than that of the IBM.

Memory Management

The fast cycle time of a CPU can only be utilized if the data and instructions can be placed in the CPU quickly enough. The first concern is to configure the system with sufficient RAM, so that paging between RAM and disk is eliminated. As 16MB of RAM on a workstation is now standard, and 32MB is common, this will be taken for granted. The second concern is data transfer between RAM and CPU. Whether the memory and CPU boards connect by the system bus or a dedicated bus, the transfer rate is slow compared to the CPU speed. Thus some memory, so-called primary cache memory, is typically placed on the CPU board and integrated tightly with the CPU.

[Figure 1: Typical Architecture]

[Figure 2: Simplified Architecture]

Typical cache memory may comprise 8KB for data and 8KB for instructions. (A greater amount of secondary cache memory, 1 MB say, may be available, but access speeds to it are not comparable to those for primary cache access.) Figure 1 shows a typical configuration; for our purposes the simplified version, Figure 2, suffices. The size of the primary cache (from now on, simply 'cache') is small. For example, the data for the 100 × 100 LINPACK benchmark greatly exceeds the 1024 double precision number (32 × 32 matrix) capacity of an 8KB cache. Larger primary caches are included in some recently introduced workstations, but the amount is still nowhere near the size of a modest application.

An efficient program is one that minimizes the transfer of data between RAM and cache, through maximizing the number of floating point operations for each copy between RAM and cache. Vector-vector routines in level 1 BLAS and matrix-vector routines in level 2 BLAS perform O(n) and O(n²) operations on O(n) and O(n²) amounts of data respectively, but the matrix-matrix routines in level 3 BLAS perform O(n³) floating point operations on an O(n²) amount of data.

Matrix Multiplication; saxpy and gaxpy

Consider the 3 × 3 by 3 × 3 matrix multiplication

AB = C    (8)

( 1 2 3 )   ( .1 .2 .3 )
( 4 5 6 ) · ( .4 .5 .6 ) = C    (9)
( 7 8 9 )   ( .7 .8 .9 )

The conventional approach is to compute the elements of C one by one, by the dot (inner) products

cij = Σk aik bkj .    (10)

In more detail, first initialize cij = 0 and then sequentially add aik · bkj to the current cij, for k = 1, 2, 3. For c11 this may be depicted

( • • • )   ( • · · )   ( • · · )
( · · · ) · ( • · · ) = ( · · · )    (11)
( · · · )   ( • · · )   ( · · · )

Another approach is, after initializing the elements of C to 0, to multiply the first column of A by b11 and add the result to the current first column of C, then multiply the second column of A by b21 and add the result to the current first column of C, then multiply the third column of A by b31 and add the result to the current first column of C to obtain the final first column of C. The first step is depicted

( • · · )   ( • · · )   ( ⋆ · · )
( • · · ) · ( · · · ) = ( ⋆ · · )    (12)
( • · · )   ( · · · )   ( ⋆ · · )

where ⋆ indicates that the element of C is partially computed after this step, instead of fully computed (•). The three steps may be depicted, with the second step in progress,

( • ⋆ • )   ( • · · )   ( ⋆ · · )
( • ⋆ • ) · ( ⋆ · · ) = ( ⋆ · · )    (13)
( • ⋆ • )   ( • · · )   ( ⋆ · · )

Each of the three steps is called a saxpy, for "scalar times (vector) x plus y." Algebraically, a saxpy operation is

αx + y ↦ y .    (14)

The first column of C is constructed from three saxpy operations. Together they are known as a gaxpy, or generalized saxpy,

Ax + y ↦ y .    (15)

(In the saxpy the vector x is a column of A; in the gaxpy the vector x is the first column of B, which contains the scalars α.) In their Second Edition (1989), Golub and Van Loan strongly emphasize saxpy and gaxpy, and write of saxpy as a fifth elementary operation, along with addition, subtraction, multiplication, and division. They point to the increasing role that saxpy and gaxpy have played in numerical linear algebra in the past 10-15 years, beginning with the use of saxpy in LINPACK (Dongarra et al. 1979).
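In code, the two operations are one and two loops respectively; a sketch (routine names are mine, not the BLAS names):

      SUBROUTINE COLSAX(N, ALPHA, X, Y)
C     Column saxpy:  y := alpha*x + y   (eq. 14).
      INTEGER N, I
      DOUBLE PRECISION ALPHA, X(N), Y(N)
      DO 10 I = 1, N
         Y(I) = Y(I) + ALPHA*X(I)
   10 CONTINUE
      RETURN
      END

      SUBROUTINE COLGAX(M, N, A, LDA, X, Y)
C     Gaxpy:  y := A*x + y   (eq. 15), built from N column
C     saxpy's; the inner loop runs down a column of A, so the
C     accesses have unit stride.
      INTEGER M, N, LDA, I, J
      DOUBLE PRECISION A(LDA,N), X(N), Y(M)
      DO 20 J = 1, N
         DO 10 I = 1, M
            Y(I) = Y(I) + A(I,J)*X(J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END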

A third approach to multiplying two matrices is by an outer product, building the matrix in three steps,

a1b1ᵀ + a2b2ᵀ + a3b3ᵀ = C ,    (16)

where aj is the jth column of A and bjᵀ is the jth row of B. The first outer product, which can be viewed as 3 saxpy's, is depicted

( • · · )   ( • • • )   ( ⋆ ⋆ ⋆ )
( • · · ) · ( · · · ) = ( ⋆ ⋆ ⋆ )    (17)
( • · · )   ( · · · )   ( ⋆ ⋆ ⋆ )

The dot product, saxpy-gaxpy, and outer product algorithms are the principal alternatives from among the 3! = 6 arrangements of the computation, eq. (10), as three nested loops indexed by the subscripts i, j, and k in every order. Each saxpy in eq. (13) and eq. (17) is a column saxpy, because the x is a column vector. The 6 algorithms comprise 2 algorithms each based on dot products, 2 on column saxpy's, and 2 on row saxpy's.
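Two of the six orderings, written out for C := C + AB with n × n matrices (routine names mine): the jki form is the gaxpy algorithm and the ijk form the dot product algorithm.

      SUBROUTINE MMJKI(N, A, B, C)
C     jki ordering: column saxpy's; the inner loop runs down
C     columns of A and C, giving unit stride.
      INTEGER N, I, J, K
      DOUBLE PRECISION A(N,N), B(N,N), C(N,N), T
      DO 30 J = 1, N
         DO 20 K = 1, N
            T = B(K,J)
            DO 10 I = 1, N
               C(I,J) = C(I,J) + A(I,K)*T
   10       CONTINUE
   20    CONTINUE
   30 CONTINUE
      RETURN
      END

      SUBROUTINE MMIJK(N, A, B, C)
C     ijk ordering: dot products; the inner loop walks across
C     a row of A with stride N, so those entries are not
C     contiguous in memory.
      INTEGER N, I, J, K
      DOUBLE PRECISION A(N,N), B(N,N), C(N,N), S
      DO 30 I = 1, N
         DO 20 J = 1, N
            S = C(I,J)
            DO 10 K = 1, N
               S = S + A(I,K)*B(K,J)
   10       CONTINUE
            C(I,J) = S
   20    CONTINUE
   30 CONTINUE
      RETURN
      END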

Performance Issues

Although each of the six algorithms for matrix multiplication involves precisely 3³ multiplications and 2 · 3² additions, they differ markedly in the manner in which data is transferred to and from memory. For the efficient use of cache, and also in vector processing, we prefer algorithms that both minimize data transfer and operate on vectors of elements that are contiguous in memory. Given that matrices are stored by column, each dot product accesses a vector of contiguous entries and a vector of non-contiguous entries. Column saxpy's access two vectors of contiguous entries (x and y, and a scalar), and are preferable to row saxpy's, which access two vectors of non-contiguous entries. When three column saxpy's are arranged as a gaxpy, the x differ but y does not. When three column saxpy's are arranged as an outer product, the y differ, but x does not. The former (gaxpy) is preferable, as the x need only be loaded from memory, whereas in computing the outer product the y must be loaded from memory, incremented, and replaced in memory.

These memory management issues are known as unit stride and vector touches. Two additional topics, block algorithms and loop unrolling, are discussed shortly. In summary:

Unit stride. The stride of a vector is the distance between successive elements of the vector in memory. Unit stride is preferred. A column saxpy has unit stride; a row saxpy does not.

Vector touches. A vector touch is the loading of a vector of data, to or from RAM, cache or a vector processor. A gaxpy requires approximately half the vector touches of an outer product.

Block algorithms. When a computation is divided into blocks, a greater fraction of the computation can be performed using only data transfer between cache and the CPU, without the use of main memory.

Loop unrolling. Utilization of data currently in memory is enhanced when several successive iterations, 4 say, of a loop are written out explicitly. (The number of iterations of the loop is decreased by the same factor.)

Interestingly, very similar issues arise in the design of algorithms for parallel processing. Much of the excitement surrounding coding for parallel processing can be found in efficient coding for single processors. It is widely known that it is essential to structure code carefully if it is to be vectorized for a multiple processor, or vector, computer. But similar concerns apply to single processor architecture, as vector operations may be used to move data into and out of cache.

Block algorithms

In general the full O(n³) operations to multiply two n × n matrices cannot be accomplished with all three matrices, A, B, and C, entirely in cache, at least for large matrices. However, it is possible to substantially reduce the number of memory loads and stores through the use of operations on submatrices, that is, block algorithms. The floating point operations are unchanged in number, but their order is rearranged.

Let AIK, BKJ and CIJ denote the IKth, KJth and IJth blocks of A, B and C respectively. Then

CIJ = ΣK AIK BKJ .    (18)

Apart from the last row and column of submatrices, each submatrix AIK of A, say, will have the same number of rows and of columns, the block size. The block size is often 32, but can be 64, for example for operations on triangular matrices.

As is the case for ordinary matrix multiplication, block matrix multiplication can be arranged in 3! = 6 ways. The block gaxpy approach is

    C_{IJ} + A_{IK} B_{KJ} \mapsto C_{IJ}        (19)

where the subscript I varies fastest, K second, and J slowest. When each matrix multiplication A_{IK} B_{KJ} is computed using a gaxpy, there are six subscripts, which vary in the order i (fastest), k, j, I, K, to J (slowest). The cache hit rate is particularly high if, for each I, J, and K, the entire matrices A_{IK}, B_{KJ}, and C_{IJ} are contained in cache. If each matrix is 16 × 16, 6144 bytes of cache are used. In fact, though, a block size of 32 is feasible with an 8KB cache: for given I, J, and K, and using gaxpy multiplication, the matrix A_{IK} is used repeatedly, once for each column of B_{KJ}. The matrix C_{IJ} is used repeatedly, but one column at a time, and the elements of B_{KJ} are referenced once only. Therefore cache need contain only the entire matrix A_{IK} and a column of C_{IJ}.
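A minimal Fortran sketch of the block gaxpy arrangement follows; again, this is my own illustration rather than code from the article. For simplicity it assumes n is an exact multiple of the block size nb and that the caller has set c to zero.

    subroutine blkmul(a, b, c, n, nb)
      implicit none
      integer, intent(in) :: n, nb
      double precision, intent(in) :: a(n,n), b(n,n)
      double precision, intent(inout) :: c(n,n)   ! caller initializes c = 0
      integer :: ib, jb, kb, i, j, k
      double precision :: t
      do jb = 1, n, nb              ! block subscript J varies slowest
        do kb = 1, n, nb            ! then K
          do ib = 1, n, nb          ! then I
            ! gaxpy multiply of one block pair: C(IJ) = C(IJ) + A(IK) B(KJ)
            do j = jb, jb + nb - 1
              do k = kb, kb + nb - 1
                t = b(k,j)
                do i = ib, ib + nb - 1
                  c(i,j) = c(i,j) + a(i,k) * t
                end do
              end do
            end do
          end do
        end do
      end do
    end subroutine blkmul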

Loop unrolling

Panzeira (1992) describes the use of loop unrolling to further reduce the ratio of memory accesses to floating point operations. Consider the gaxpy:

    do j = 1, 32;  do k = 1, 32
        t = b(k,j)
        do i = 1, 32
            c(i,j) = c(i,j) + a(i,k) * t
        end do
    end do;  end do

The inner loop is unrolled four times as follows:

    do j = 1, 32;  do k = 1, 32
        t = b(k,j)
        do i = 1, 29, 4
            c(i+0,j) = c(i+0,j) + a(i+0,k) * t
            c(i+1,j) = c(i+1,j) + a(i+1,k) * t
            c(i+2,j) = c(i+2,j) + a(i+2,k) * t
            c(i+3,j) = c(i+3,j) + a(i+3,k) * t
        end do
    end do;  end do

Inner loop unrolling reduces the looping overhead but not memory accesses. Thus the performance gains are


not significant. Panzeira reports that for 50 × 50 matrices actual gaxpy performance is 31% of theoretical (CPU) speed. With inner loop unrolling, efficiency is 33% of theoretical speed. For matrix multiplication with matrices of 250 or more rows and columns, efficiency increases from 24% to 25%. With blocking, the increase in efficiency is from 31% to 33% for matrices of all sizes.

Middle and outer loop unrolling is much more successful. Unrolling the inner loop twice, the middle loop four times, and the outer loop twice, the code becomes:

    do j = 1, 31, 2;  do k = 1, 29, 4
        t00 = b(k+0,j);    t10 = b(k+1,j)
        t20 = b(k+2,j);    t30 = b(k+3,j)
        t01 = b(k+0,j+1);  t11 = b(k+1,j+1)
        t21 = b(k+2,j+1);  t31 = b(k+3,j+1)
        do i = 1, 31, 2
            c(i,j)     = c(i,j)     + a(i,k)*t00 + a(i,k+1)*t10
                                    + a(i,k+2)*t20 + a(i,k+3)*t30
            c(i,j+1)   = c(i,j+1)   + a(i,k)*t01 + a(i,k+1)*t11
                                    + a(i,k+2)*t21 + a(i,k+3)*t31
            c(i+1,j)   = c(i+1,j)   + a(i+1,k)*t00 + a(i+1,k+1)*t10
                                    + a(i+1,k+2)*t20 + a(i+1,k+3)*t30
            c(i+1,j+1) = c(i+1,j+1) + a(i+1,k)*t01 + a(i+1,k+1)*t11
                                    + a(i+1,k+2)*t21 + a(i+1,k+3)*t31
        end do
    end do;  end do

Efficiency is now around 70% and 51% for the 50 × 50 and 250 × 250 (or larger) matrices respectively, and 69% and 65% respectively with blocking. Thus using loop unrolling, actual speeds can be at least doubled, e.g. from 31% to 65% efficiency. Automatic innermost loop unrolling has been implemented in compilers, e.g. the Silicon Graphics FORTRAN compiler (IRIX release 3.3 and above). More specialized compiler technology is needed to implement other types of automatic loop unrolling. For example, the SGI Power FORTRAN Analyzer implements automatic outermost loop unrolling.

Software for Linear Algebra: An Overview

Software for linear algebra includes libraries of high level routines, such as LINPACK and LAPACK, and also lower level routines, the BLAS library. CORE-MATH and netlib are repositories for sets of routines, such as EISPACK, LINPACK, LAPACK, the generic version of BLAS, optimizers, and nonlinear equation solvers. Electronic mail requests can be sent to [email protected], with a message such as

    send index
    send index from LAPACK
    send index from BLAS
    send index from benchmark

Alternatively, software can be downloaded interactively using ftp to research.att.com (login as user ftp), or the utility xnetlib.

BLAS routines are in sets of 2 or 4, with suffixes S and D for single and double precision real data, and C and Z for single and double precision complex data. Tables 2-4 show the contents of BLAS at level 1 (vector-vector operations), level 2 (matrix-vector operations), and level 3 (matrix-matrix operations), respectively. These summaries are intended to show what routine programming tasks can be circumvented using the BLAS library.

Table 2: BLAS Level 1, Vector-Vector Operations

  copy              scopy   x ↦ y
  swap              sswap   x ↔ y
  scale             sscal   αx ↦ x
  saxpy             saxpy   αx + y ↦ y
  dot product       sdot    xᵀy
  Euclidean norm    snrm2   √(xᵀx)
  L1 norm           sasum   Σ |xᵢ|
  maximum index     isamax  index of max |xᵢ|
  planar rotations  drot    apply a planar rotation

Table 3: BLAS Level 2, Matrix-Vector Operations

  multiplication (gaxpy)   dgemv  αAx + βy ↦ y
  triangular matrices      dtrmv  Tx ↦ x
  back substitution        dtrsv  solve Tx = b
  rank 1 matrix update     dger   A + αxyᵀ ↦ A
  symmetric rank 1 update  dsyr   A + αxxᵀ ↦ A
  symmetric rank 2 update  dsyr2  A + αxyᵀ + αyxᵀ ↦ A

Table 4: BLAS Level 3, Matrix-Matrix Operations

  Matrix multiplication
    general matrices       dgemm   αAB + βC ↦ C
    one symmetric matrix   dsymm   αAB + βC ↦ C
    one triangular matrix  dtrmm   αTB ↦ B

  Solving triangular systems
    back substitution      dtrsm   solve TX = αB

  Rank k and rank 2k matrix updates
    rank k update          dsyrk   βC + αAAᵀ ↦ C
    rank 2k update         dsyr2k  βC + αABᵀ + αBAᵀ ↦ C
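To illustrate how the level 3 routines are called, the fragment below forms C = AB with dgemm, using its standard calling sequence. The test matrices and dimensions are invented for illustration, and the program must be linked against a BLAS library.

    program dgemm_demo
      implicit none
      integer, parameter :: n = 100
      double precision :: a(n,n), b(n,n), c(n,n)
      integer :: i, j
      ! arbitrary test matrices
      do j = 1, n
        do i = 1, n
          a(i,j) = 1.0d0 / dble(i + j)
          b(i,j) = dble(i - j)
        end do
      end do
      c = 0.0d0
      ! c := 1.0*a*b + 0.0*c; 'n' means "do not transpose"
      call dgemm('n', 'n', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)
      print *, 'c(1,1) =', c(1,1)
    end program dgemm_demo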


The BLAS routines are carefully coded, sometimes in assembly code, and, particularly for BLAS level 3, may utilize loop unrolling, blocking, and parallel processing (as well as saxpy's and gaxpy's) to take advantage of modern machine architecture. (On rare occasions the use of higher level BLAS routines is not optimal; e.g., to invert a p × p triangular matrix, dtrsm from BLAS is inferior to dtrtri from LAPACK.) Original sources for BLAS are Lawson et al. (1979) and Dongarra et al. (1988, 1989). Coleman and Van Loan (1988) give examples of the use of the BLAS, emphasizing level 1. Anderson et al. (1992, Appendix C) give a summary of the BLAS. Other summaries can be found in UNIX man pages distributed with the software. The summary given here includes only some of the routine names, but other routines are closely related and their names can be found from the references, or the man pages (which may combine descriptions of several routines on a single page).

LINPACK uses only BLAS level 1 routines, but can be re-coded quite easily to take advantage of BLAS level 2. However, performance can then diminish as the size of the problem increases. LAPACK uses the more efficient BLAS level 3 routines, although some tuning of LAPACK is possible for optimal performance. The LAPACK auxiliary enquiry function ILAENV provides a lookup table of parameters, such as block size, for each primary function. The generic ILAENV in the LAPACK distribution may be modified for better performance on individual systems, but, as LAPACK itself is new, departures from the generic version are uncommon. This is perhaps an opportunity for an excellent and worthwhile class exercise in experimental design. See Demmel et al. (1991), Section 4.
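For example, the block size LAPACK would use for a particular routine can be inspected with a call such as the following sketch: ispec = 1 requests the block size, unused dimension arguments are passed as -1, and the routine name and dimensions here are merely illustrative.

    program query_nb
      implicit none
      integer :: nb
      integer, external :: ilaenv
      ! ask ILAENV for the block size it would choose for dgetrf
      nb = ilaenv(1, 'DGETRF', ' ', 1000, 1000, -1, -1)
      print *, 'block size for dgetrf:', nb
    end program query_nb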

The necessary BLAS routines are included when LINPACK routines are sent from netlib; LAPACK requires that the BLAS routines have been installed separately. LAPACK is intended to be a replacement for LINPACK. LAPACK, like BLAS and unlike LINPACK and EISPACK, consistently includes routines for all four data types. A little of LINPACK's functionality is absent from LAPACK, for example, the LINPACK routines DCHUD and DCHDD to update the Cholesky factor of a symmetric matrix when a rank 1 matrix is added or subtracted; a more general alternative is TOMS algorithm 686, Reichel and Gragg (1990). The LAPACK release notes (send release notes from lapack) include practical guidance on using LAPACK on a number of machines, e.g. by suggesting compiler options.
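As a small example of the migration, a Cholesky factorization computed in LINPACK by dpofa is computed in LAPACK by dpotrf. The sketch below, with an invented 3 × 3 positive definite test matrix, factors the upper triangle in place.

    program chol_demo
      implicit none
      integer, parameter :: n = 3
      double precision :: a(n,n)
      integer :: info
      ! a small symmetric positive definite test matrix
      a = reshape((/ 4.0d0, 2.0d0, 2.0d0, &
                     2.0d0, 5.0d0, 1.0d0, &
                     2.0d0, 1.0d0, 6.0d0 /), (/ n, n /))
      ! factor the upper triangle: a = u**t * u
      call dpotrf('u', n, a, n, info)
      print *, 'dpotrf info =', info   ! info = 0 indicates success
    end program chol_demo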

Acknowledgements

I wish to thank Jim Rosenberger for suggesting I write this article, Jack Dongarra for advice on the use of LAPACK, and Olivier Schreiber at Silicon Graphics Inc. for information on their work with the LINPACK benchmarks and compilers. This work is based in part on Goodall (1993), and was supported by NSF Grant DMS-9208656.

References

Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., and Sorensen, D. (1992). LAPACK Users' Guide. Philadelphia, PA: SIAM.

Coleman, T.F. and Van Loan, C. (1988). Handbook for Matrix Computations. Philadelphia, PA: SIAM.

Demmel, J.W., Dongarra, J.J., and Kahan, W. (1991). "On designing portable high performance numerical libraries." LAPACK Working Note No. 39, University of Tennessee. (to netlib: send lawn39 from lapack)

Dongarra, J.J. (1993). "Performance of various computers using standard linear equations software." Technical Report CS-89-85, University of Tennessee and Oak Ridge National Laboratory, 1 March 1993. (to netlib: send performance from benchmark)

Dongarra, J.J., Moler, C.B., Bunch, J.R., and Stewart, G.W. (1979). LINPACK Users' Guide. Philadelphia, PA: SIAM.

Dongarra, J.J., Du Croz, J., Hammarling, S., and Hanson, R.J. (1988). "An extended set of Fortran basic linear algebra subprograms." ACM Trans. on Math. Software, 14, 1-17. (See also pp. 18-32.)

Dongarra, J.J., Du Croz, J., Duff, I.S., and Hammarling, S. (1989). "A set of level 3 basic linear algebra subprograms." ACM Trans. on Math. Software.

Golub, G.H. and van Loan, C.F. (1983, 1989). Matrix Computations. Baltimore, MD: Johns Hopkins University Press. Second edition, 1989.

Goodall, C.R. (1993). "Computation Using the QR Decomposition." In: Handbook of Statistics, Volume 9: Statistical Computing (C.R. Rao, ed.). Amsterdam, NL: Elsevier/North-Holland.

Lawson, C.L., Hanson, R.J., Kincaid, D., and Krogh, F. (1979). "Basic linear algebra subprograms for Fortran usage." ACM Trans. Math. Software, 5, 308-325.

Panzeira (1992). "Nested loops optimization." SGI technical report.

Reichel, L. and Gragg, W.B. (1990). "Algorithm 686: Fortran subroutines for updating the QR decomposition." ACM Trans. Math. Software, 16, 369-377.

Colin Goodall
The Pennsylvania State University
[email protected]


COMPUTER COMMUNICATION AND NET SNOOPING

Gopher and other resource discovery tools

The tremendous growth of services and information available over the Internet has prompted the development of several tools that ease the search for these resources. Recently, I described archie in this column. Archie is a tool that maintains and searches an inventory of information available at FTP sites. One disadvantage of archie is that one has to know at least a substring of the file name or directory in which the information is located to find it. Recently, a reader of this column brought gopher to my attention. I used archie to find gopher, installed a gopher-client on my machine (xgopher in my case), and then I used gopher to find and retrieve information about itself and other related tools.

I found that most of the Internet services that I described in previous issues of this newsletter are available through gopher. Some of these are: the statlib archive, the netlib archive, whois servers, archie, and many more. Some new resources that I discovered with gopher include: other white page servers, databases (such as census data, weather data, and geographical data), library catalogs of many universities around the world, electronic books (such as the King James Bible, Moby Dick, and Peter Pan), a wavelet archive, the AMS combined membership list, and many more.

Gopher is really more difficult to describe than it is to use, but I will try anyway. Gopher is a menu driven interface that allows browsing through Internet resources regardless of their type and without having to worry about Internet addresses. By selecting menu items one can obtain programs, documents, pictures, and even sounds, look up addresses, do keyword searches, and use many other quite different Internet resources. Gopher originated as a distributed campus information service at the University of Minnesota, the "Golden Gophers," and because it evolved to "go fer" things on the Internet, the name was coined. A recent book that contains a whole chapter on gopher as well as a wealth of other information about the Internet is Krol (1992). The best way to learn about gopher is to find a gopher-client and start using it. Telnet to consultant.micro.umn.edu, login as gopher, and "go fer" it! This telnet-accessible anonymous gopher-client can help you obtain the software required to install your own gopher-client. There are now versions available for just about any hardware and operating system combination. You only need to be connected to the Internet.

The collection of all information and services available through gopher is called gopherspace. Gopherspace is still growing but is already vast, with hundreds of gopher-servers worldwide. Once you are connected to one gopher-server you can reach other gopher-servers by selecting the menu item "Other Gopher and Information Servers" and then selecting the server you want. Most gopher-servers are very general and have similar menu entries, while others can be focused on particular information available at a given site.

Gopher is a menu driven interface that allows browsing through Internet resources ...

Gopher is great for browsing through information on the Internet, especially if you have some idea which gopher-servers have information that interests you. Because gopherspace is now so vast, browsing is not an efficient way to look for information. A recent answer to this problem, available through gopher on most gopher-servers, is veronica (very easy rodent-oriented net-wide index to computerized archives). It was developed at the University of Nevada and provides a keyword search of most gopher-server menus in the entire gopherspace. Searching the menus with veronica is not a replacement for browsing because the menus are not always very descriptive of the contents and represent only one of many interpretations of the contents. For example, the same information could be listed as "geological databases," "rock data," or even "Jack's favorite data" because Jack is a geologist.

WAIS (Wide Area Information Server) is another tool for finding resources on the Internet. Unlike guided browsing through Internet resources with gopher or veronica searches of gopher menus, WAIS provides a facility for keyword search of documents over the Internet. If the Internet is a collection of libraries of documents, WAIS is an attempt to automate the interaction between a library patron and a reference librarian. WAIS does not search through the actual documents; rather, it searches through indexes built from the documents. Only those libraries that have a WAIS index can be searched. The uniqueness and strength of WAIS is that it can rate the relevance of documents in terms of similarity to other user-specified documents. You can get to a WAIS server with gopher, but there are some advantages to running your own WAIS client program. Many are available from think.com, because much of the development of WAIS is done at Thinking Machines. Of course, you can also use gopher or archie to find them.


Reference

Krol, Ed (1992), The Whole Internet User's Guide and Catalog, O'Reilly & Associates.

The Last Word ...

Let me end this column by saying that it is time for me to sign off and let someone else continue in my place. While writing this column for the past three years, I have learned much about computer communication and various Internet services. Please contact me at [email protected], or the computing editor of this newsletter at [email protected], if you are interested in writing or editing a similar column or simply contributing to it. Also, please send any comments or suggestions regarding this column to me and I will forward them to the next column editor.

George Ostrouchov
Oak Ridge National Laboratory
[email protected]

Getting your Facts from FAQs

This is the first of an occasional series about navigating the Internet. Since there are entire books devoted to this subject (see above), I won't even try to be comprehensive. Instead, I'll provide some clues on Internet snooping for the statistician.

The Internet and bitnet mailing lists and the netnews system are an amazing resource. They contain technical material and discussions about all sorts of other things. At my university the campus electronic bulletin board system contains a wide collection of the Internet and bitnet mailing lists (including things like the S-news mailing list), an electronic newswire, netnews, and a gaggle of local discussion boards. I find it very useful to be able to watch netnews and the Internet mailing lists without getting my personal mailbox even more clogged than it already is. I'll use the generic term "bboard" to refer to both netnews and the various Internet mailing lists. I regularly read a large number of bboards (I won't admit to how many). Four of them are devoted to statistics. The three high volume bboards are netnews.sci.math.stat, the Internet S-news mailing list, and the Bitnet STAT-L mailing list. Many readers will already know about these two mailing lists, but if you do not, here is how to subscribe. To subscribe to S-news send a message to [email protected]. A human will either add you to the list or suggest a local re-distribution point for the list. I strongly suggest that you find a local feed for the S-news mailing list, or set one up yourself. To subscribe to STAT-L, send a one line e-mail message containing just the line

SUBSCRIBE STAT-L

to [email protected]. My fourth statistics-related bboard is the xlisp-stat-news mailing list. There is also a fledgling mailing list for those interested in Bayesian statistics. If you are interested in that, contact me via e-mail.

Finding your way around netnews and the mailing lists can be daunting for novices and experienced users alike. One way of making quick headway is to read the FAQs (Frequently Asked Questions) postings.

Many bboards have a periodic posting (often monthly) containing the FAQs. These posts are often long messages containing answers to many questions which are frequently discussed on the bboard. For example, there are several forums for discussing the TeX typesetting language. One of the more common ones is netnews.comp.text.tex. The FAQ posting contains information about FTP sites that carry collections of TeX macros, various public domain implementations of TeX for different platforms (e.g., EmTeX for DOS/Windows, OzTeX for the Mac), and so on. For new users the FAQ posting will often contain answers to many of the simple (and not-so-simple) questions. For aficionados the FAQ file often contains pointers to sources of detailed information. If you are considering posting a question to a bulletin board or newsgroup (particularly the more technical newsgroups) it is always a good idea to read the FAQ file first. At the least it saves you getting a number of indignant "read the FAQ" replies, and in the best of circumstances it gets you an answer in minutes rather than hours or days.

One way of making quick headway is to read the FAQs (Frequently Asked Questions) postings.

Not every bulletin board or newsgroup contains a FAQ post and, even worse, many campus bulletin board systems generally don't keep a month's worth of back posts. So what should you do when you can't find a FAQ post? Suppose you are interested in the LISP programming language and you can't find the FAQ file. (There is one; it is posted around the middle of every month, in about 6 parts, with a total of over 200Kb of text, to several lisp-related forums.) A good strategy is to start out by subscribing to the appropriate newsgroup (in this case netnews.comp.lang.lisp is a good place to start) and become a passive reader/scanner of the messages. If the


FAQ file is posted every month, and it is not in the old messages available on your bulletin board system, then it is likely that you will see the post in the next week or two. While you are waiting you can sometimes learn a lot by just reading the other discussions.

A final tidbit. The USENET community does quite a good job of collecting FAQ files in one place. There are several USENET newsgroups that consist solely of FAQs. Some that I know of are netnews.comp.answers (many FAQs for newsgroups in the netnews comp hierarchy), netnews.misc.answers, netnews.news.answers, netnews.rec.answers, and netnews.sci.answers.

Of the statistics related bboards, I'm only aware of a FAQ for the S-news mailing list. This is archived in StatLib. To get a copy, send the message

send faq from S

to [email protected], or use gopher or FTP to access StatLib.

By adroitly using your bulletin board system you can quickly learn a lot about many technical areas. Of course, the trick is to limit the amount of time you waste reading pointless material. Again, this is where it is useful to read the FAQs. In a few minutes one can often find answers to most of the interesting and common questions. So go forth and explore the bulletin boards and be a smart reader by sticking to the FAQs.

Mike Meyer
Carnegie Mellon University
[email protected]

BITS FROM THE PITS

Statistical Computing and Graphics in Science and Industry

This column features statistical computing and statistical graphics activities in science and industry. I invite your comments and suggestions for future columns. Please send comments, inquiries, and suggestions to Albert M. Liebetrau, Analytic Sciences Department, Battelle-Northwest, MS K7-34, P.O. Box 999, Richland, WA 99352, [email protected], 509-375-2694.

Uncertainty Analysis for Computationally-Demanding System Codes

Part one of a two-part series

Some degree of anthropogenic environmental change is an inevitable consequence of global-scale human activity. Both policy makers and the public are extremely interested in the possible consequences of environmental changes. The Department of Energy is particularly interested in the effects on energy consumption, both of naturally occurring environmental changes and those resulting from human activity. Large-scale modeling efforts are currently underway to assess the possible effects of potential environmental change scenarios. Global estimates of energy consumption, for example, are obtained by integrating the results from several regional models. The regional models incorporate factors such as regional climatology and hydrology, and include models of terrestrial ecosystems and human activities (agriculture, energy use, adaptation, etc.). Boundary conditions for regional models are derived from general circulation models, which are run for the climate-change scenarios of interest. (We can debate the usefulness of models, but such models are in fact being used to make substantive policy decisions and their use for this purpose will only increase with time.)

One important use of models such as those described above is uncertainty analysis, which refers to methods for estimating the probability distribution of model response resulting from variability in its input variables. However, traditional methods of uncertainty analysis (Monte Carlo simulation) fail for system models such as these because each component may in its own right be a complex code with heavy computing demands. A new approach to uncertainty analysis is required. To address this problem, McKay et al. (1979) developed a sampling method (Latin hypercube sampling) that is more efficient for simulation than simple random sampling. Taking another approach, Downing et al. (1985) used response surface approximations to reduce computing requirements. In Liebetrau and Scott (1991) and Liebetrau et al. (1993), we propose a strategy for complex system codes that employs both response-surface approximations and efficient sampling strategies. The basic idea is to develop a simplified analog to the complex system code, which we shall call a performance assessment (PA) code, by approximating its computationally demanding components. The PA code is abstracted from the underlying code so as to preserve the essential features of component processes and the interactions among them. The PA code can be used for uncertainty analysis because it requires less computing


resources than its underlying analog.
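On the sampling side, here is a minimal Fortran sketch of Latin hypercube sampling in the spirit of McKay et al. (1979); this is my own illustration, not the authors' code. Each of the d input variables is stratified into m equal-probability cells on (0,1), one value is drawn uniformly within each cell, and the cells are randomly permuted across the m runs.

    subroutine lhs(x, m, d)
      implicit none
      integer, intent(in) :: m, d
      double precision, intent(out) :: x(m,d)
      integer :: perm(m), i, j, k, tmp
      double precision :: u
      do j = 1, d
        ! start from the identity permutation of the m strata
        do i = 1, m
          perm(i) = i
        end do
        ! random (Fisher-Yates) shuffle of the strata
        do i = m, 2, -1
          call random_number(u)
          k = 1 + int(u * dble(i))
          tmp = perm(i)
          perm(i) = perm(k)
          perm(k) = tmp
        end do
        ! one uniform draw inside each stratum
        do i = 1, m
          call random_number(u)
          x(i,j) = (dble(perm(i)) - 1.0d0 + u) / dble(m)
        end do
      end do
    end subroutine lhs

Each column of x then contains exactly one value in each of the m cells, so the full range of every input is covered even when m is small.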

Our strategy for developing PA codes has three basic elements. These are (i) efficient selection of an initial set of model inputs (realizations), (ii) development of an approximation to the response surface of the underlying codes, and (iii) an updating algorithm that uses existing information to determine the locations (inputs) for additional runs. The overall idea is to develop a two-tiered system model that consists of a performance assessment model that "sits atop" an underlying model which is made up of the detailed component models. After response surfaces for the component models have been approximated, most computations for uncertainty analysis are done at the PA level. When it is necessary to drop down to the lower level, results of these new runs are used systematically to improve approximations at the upper level.

In previous columns, I have described some of the tools available to implement the approximating strategy outlined here. In the second part of this article, I will describe our experiences in attempting to implement this strategy for two real-world examples.

References

Downing, D. J., R. H. Gardner, and F. O. Hoffman (1985). "An Examination of Response-Surface Methodologies for Uncertainty Analysis in Assessment Models." Technometrics 27(2), 151-163.

Liebetrau, A. M. and M. J. Scott (1991). "Strategies for Modeling the Uncertain Impacts of Climate Change." Journal of Policy Modeling 13(2), 185-204.

Liebetrau, A. M., P. D. Whitney, D. W. Engel, and C. A. LoPresti (1993). "Computational Analogues to Complex Computer-Based Codes." Technical Report.

McKay, M. D., R. J. Beckman, and W. J. Conover (1979). "A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code." Technometrics 21(2), 239-245.

Albert M. Liebetrau
Battelle Pacific Northwest Laboratories
[email protected]

GEOGRAPHIC INFORMATION SYSTEMS

Designing the GIS Interface

In my last column, I promised to describe the use of focus groups if a significant number of readers expressed interest. Focus groups might appear inappropriately distant a topic for a column on geographic information systems (GIS). But anyone who has tried to use a GIS intended for more than pedagogic purposes will appreciate its relevance. Developers of GIS software have paid scant attention to both interface design and the experiences and opinions of actual and potential users, and their products reflect an appalling ignorance of human factors equaled only, one might argue, by the developers of videocassette recorders. That GIS developers still rely almost exclusively on after-the-fact feedback from users and disgruntled letter-writers reflects an immature technology inadequately exploited by specialized developers facing limited competition in a fragmented marketplace. That people struggle to use their products is convincing evidence of human persistence in exploiting a promising technology, not to mention the profitability of vendor-run workshops and short courses. I had no delusions that the comparatively infantile graphic narratives discussed in my last column were any better than the typical GIS. But group interviews seemed a useful way to solicit critical advice in a synergetic setting.

Focus Groups

Although the focus group is not a full-fledged human-factors technique, human-factors experts use group interviews (as they are also called) to pretest product designs as well as to identify issues worth addressing with more exacting testing procedures. Employed principally in marketing and media research, the group interview is a systematic method for identifying the range of attitudes and preferences that buyers, readers, viewers, and voters have about a product, publication, TV program, or political candidate. Social scientists have little use for focus groups, which are wholly unreliable for estimating means, variances, covariances, and other fodder for the variation-explaining paradigm. But marketing experts and product designers have found group interviews an effective strategy for identifying flaws and gaining insight about user needs. A synergy develops in which participants stimulate each other to recognize problems and propose alternatives. It seems obvious but I'll say it anyway: to generate meaningful results, focus-group participants must reflect the target population.

To gather useful opinions about my prototype Atlas Touring scripts, I needed groups of more or less typical users of geographic data or graphic software. So instead of coercing undergraduates or bribing church groups, I wrote some letters, made some calls, and arranged for sessions with groups of six to ten professionals at four sites: Syracuse University's School of


Information Studies (faculty and doctoral students only, please), the Cartographic Division of the National Geographic Society (in Washington, D.C.), the Office of Geographic and Cartographic Research in the U.S. Geological Survey's National Mapping Division (in Reston, VA), and the GIS applications unit at IBM Corporation's research and development center at Kingston, NY. The managers with whom I dealt were cooperative and enthusiastic, and graciously recruited staff with a range of experiences in geographic analysis, map use, and interface design. They bought the argument that their commitment of valuable staff time might be treated as a colloquium or training seminar. At IBM I even sweetened the offer by lecturing later in the day to a somewhat larger group on the theme of How to Lie with Maps (University of Chicago Press, 1991); an opportunity to plug my book proved difficult to resist, as it does now.

Developers of GIS software have paid scant attention to both interface design and the experiences and opinions of actual and potential users ...

I didn't conduct the focus groups myself; to do so would have tainted the results. I wanted honest, frank opinions about the prototype graphic narratives, not nice, polite comments inhibited by the designer's presence. Although I could have tried to avoid defensive responses to criticism, I find body language (the crossed arms, the shocked look, the odd twitch, the higher pitch, the long pause) nearly impossible to control. So early in the study I engaged an experienced focus-group "facilitator," Myke Gluck, then a doctoral student in information studies. (Myke defended his dissertation in December, and moved on to a faculty position in the School of Library and Information Studies at Florida State University, in Tallahassee.) I accompanied Myke only on the visit to IBM, and while he ran the session, I hid out in an office. Myke and I collaborated in developing a participant questionnaire and a focus-group protocol. The questionnaire helped both to verify that the participants did in fact reflect our target population and to reveal backgrounds and viewpoints useful in interpreting individual responses and group dynamics. The protocol promoted uniformity among the four groups and guaranteed efficient use of the hour or so participants and their supervisors could commit to the session.

Group Interviews

We divided the group interview into two parts, the first to address a demonstration of the full "correlation script" (about 11 minutes) and the second to discuss a mixed demonstration (about 10 minutes) consisting of two dynamic spatial-temporal maps from the "historical script" and a new, interactive version of a key introductory graphic phrase from the correlation script. Myke narrated each demonstration (we rehearsed this carefully, for consistency) and then presented a number of issues for discussion. After the first demonstration, the participants were asked to discuss the issues of informativeness and coherence as well as the script's good points and bad points, and after the second demo they addressed issues of user interaction and customization. For each issue the protocol included a sequence of discussion questions. When addressing the issue of coherence, for instance, Myke asked questions such as: Did the progression of graphic phrases seem logical? Did the maps and graphs fit together? Were the text and labels useful or helpful? Our protocol also included several "probes" with which Myke coaxed further criticisms or suggestions from the participants. The probes included: Can you give me a specific example? Why do you feel that way? Does anyone feel differently? How do you see it? Myke taped the entire session with two recorders, both in full view. As part of the university's human-subjects review process, we prepared a statement, read to participants at the outset, agreeing to hide their identity and to destroy the tapes when finished with the analysis. Our statement also noted that they should feel free to leave at any time for any reason. Fortunately for us no one did. Then came the hard part. Interpreting the results required hours of listening to tapes and taking and comparing notes. We found it helpful to listen to the tapes separately, record our own observations, exchange notes, listen to the tapes again, and discuss the results. One of us then wrote an initial draft, which the other modified extensively.

Findings

Although communications and marketing researchers often carry out a more systematic content analysis (after having their tapes transcribed and coded by specially trained assistants), that level of rigor is more appropriate to studies comparing two or more designs or products than to exploratory studies concerned, like ours, with identifying issues and pinpointing flaws. Our intermediate goal was two lists, one registering widely shared complaints or suggestions to be addressed in the next design cycle and the other inventorying less frequent yet nonetheless worthwhile hints and options. In addition to suggestions explicitly stated by group participants, this second list typically includes the investigators' own ideas spontaneously triggered by the taped discussion. More important, though, are shortcomings and insights registered by two or more groups: in addition to blatantly obvious problems, the primary list


commonly includes design deficiencies and strategies affecting only a minority of potential users. How effectively a software designer appreciates and addresses these not-widely-shared concerns can be crucial in developing an efficient, effective, and broadly inclusive system. Findings based on focus groups are much closer to the soft end of the measurement-rigor-reliability continuum, and the risk of bias seems markedly greater than with the more tightly controlled research strategies of confirmatory analysis. Whoever reads our report (still under editorial review) might well question whether Myke and I have carefully listened to participants' complaints about the demonstration software and openly reported their doubts and reservations about narrative graphics. Conscious or unconscious bias is always a problem when researchers rely heavily on interviews, and our study is no exception. But the possibility of bias is cause only for caution and skepticism, not outright rejection. The value of our study can be judged by the apparent reasonableness and thoroughness of the collective and constructive criticisms we report.

What did we learn? Lots; far too much to describe in the current column. Would we use focus groups again? You bet, despite the difficulties of pleading with managers, coping with D.C. traffic to meet a tight schedule, and taking our own Macintosh system into and out of an IBM facility. The exercise yielded valuable insights, and I can't wait for the opportunity this summer (NSF willing) to plunge in, implement and refine some new strategies, and generate vastly improved demos for another round of focus-group evaluation. The next iteration might not yield a satisfactory solution, but I surely have a clearer sense of where the Atlas Touring project is headed.

Mark Monmonier
Syracuse University
[email protected]

BOOK REVIEW BEAT

This column returns, with the same proprietorship, because the new editors invited it, and because I was willing to continue. I had reservations, however. The previous editors always asked for the next column, but there was never any feedback from anyone about whether the column was good or bad, useful or wasted space. I will need to have positive feedback before I will feel that it is worth continuing. Actually, I will need more than that. Cost cutbacks in my corporation have taken away from me all the statistics journals, except the ones that I

buy. So, I am looking for some contributing assistance: correspondents who can report on reviews of statistical computing and graphics books appearing in other journals. Particularly I am interested in the proliferating crop of statistical computing journals, none of which I will see any more, plus Biometrics, Applied Statistics, and Shorter Book Reviews.

There were two other problems that I encountered in preparing this column. First, there have been fewer reviews of statistical computing and graphics books recently. Second, some of the ones that have appeared have been written by me in the report section of Technometrics Book Reviews. It seems rather ridiculous to quote myself, so I shall just remark briefly on those reports. In November I commented on new editions of two books on statistical analysis using SAS, SAS System for Regression, Freund and Littell, and SAS System for Linear Models, Littell, Freund, and Spector. I found each to be significantly improved over its previous edition and an essential personal library item for any statistician who uses SAS for doing statistical analysis. In February I commented on a newcomer to this collection, SAS System for Statistical Graphics, M. Friendly. For me this much larger book was equally valuable, especially for its provision of extensive SAS code and macros, and also for its comprehensive coverage of graphics in all types of statistical analyses.

Other statistical packages were included as central features of books, too. In JASA for December is a review of the 2nd edition of Data Analysis for Managers with Minitab, H. Roberts, Scientific Press. John McKenzie praises the book's use of real data, effort at making statistics interesting, inclusion of statistics as part of business, and use of software to analyze data, stating that "the author has written a book that should be examined by anyone dealing with elementary business statistics." In comments in the Technometrics Editor Reports in February, I opine that the book is better suited for teaching statisticians about business applications than managers about statistics. GENSTAT also gets some textbook support with Applied Statistics: A Handbook of Genstat Analyses, E. Snell and H. Simpson (Chapman and Hall). Mike Driscoll, in Technometrics for November, notes that the "handbook details the use of GENSTAT to analyze the data used for the examples in Snell's book, Applied Statistics: Principles and Examples." For that book, he finds this one a "useful supplement" with "limited value otherwise."

There are some reviews of books not about statistical packages. In JASA for December, Martin David reviews a collection of papers, Statistical and Scientific Data


Bases, Z. Michalewicz, editor (Ellis Horwood). Noting the proliferation of persons who are embedding scientific data in data bases, he says that "this volume will leave them inadequately informed on the conceptual structure to use in their work and on the strategic trade-offs that are possible with powerful off-the-shelf Relational Data Base Management Systems (RDBMS)." In Technometrics for February, Minoo Niknian reviews Randomization and Monte Carlo Methods in Biology, B. Manly (Chapman and Hall), and finds the book "appropriate for master's level students of statistics and practicing statisticians, as well as subject matter investigators with good backgrounds in statistics and experience in computing." He calls the book "welcome in applied statistics." In Biometrics for September, Jim Gentle reviews the same book, saying that "the book is generally well written, and the examples are interesting and easy to understand." He also notes that "the word 'biology' in the title should not limit the readership."

The Editors or this columnist need some feedback to determine whether a look at reviews of statistical computing and graphics books is useful, or whether some other format should be used to deal with the books. If the overview of reviews is to remain reasonably comprehensive, some contributors for the aforementioned journals are needed. Contact Eric Ziegel or the editors.

Eric Ziegel
Technometrics Book Reviews Editor
[email protected]

NEWS CLIPPINGS

Joint Statistical Meetings
Statistical Graphics Invited Program

Section: Statistical Graphics
Session Title: Young Researchers in Statistical Graphics
Session Time: Tuesday, August 10, 8:30 - 10:20
Organizer: David W. Scott
Session Chair: David Scott
Speakers:

"Graphics Keys—Misha's Resource Database Approach to Extensible Graphics," James Hardin, Battelle Lab, [email protected]

"The Mode Tree: Nonparametric Visualization of Density Features," Michael Minnotte, Utah State University, [email protected]

"Assisting Inductive Modeling With Visualization," John Elder, University of Virginia, [email protected]

"Graphical Methods for Finding Structure in Multivariate Data," Qiang Luo, George Mason University, [email protected]

Section: Statistical Graphics (co-sponsors: Stat Ed, Teaching Health Stat)
Session Title: Statistical Graphics and Animation for Instruction/Classroom
Session Time: Wednesday, August 11, 10:30 - 12:20
Organizer: Joseph Newton
Session Chair: Joseph Newton
Speakers:

"Integrating Dynamic Graphics into a Linear Regression Course," Sandy Weisberg/Dennis Cook, University of Minnesota, [email protected]

"Exploratory Dynamic Graphics for Survival Data," Neely Atkinson, Texas Medical Center, [email protected]

"Methods for the Analysis of Data From Designed Experiments," Wei-Yin Loh, University of Wisconsin, [email protected]

Section: ACM SIGGRAPH [subject to final signing] (co-sponsors: Statistical Graphics, Stat Ed)
Session Title: Multimedia: Past, Present, and Future
Session Time: Tuesday, August 10, 2:00 - 3:50
Organizer: Jan Pedersen, Xerox PARC
Chair: Jan Pedersen

TBA [Current Multimedia Environment], Dan Russell, Xerox PARC, [email protected]

"Intelligent Agents as a User-Interface Metaphor," Tim Oren, Kaleida Labs, Inc., [email protected]

"Multimedia Futures," Enrique Godreau, Aldus Corporation, [email protected]

Discussant: William Cleveland, AT&T Bell Labs, [email protected]

Section: Statistical Graphics
Session Title: Novel Applications of Multivariate Statistical Graphics
Session Time: Tuesday, August 10, 4:00 - 5:50
Organizer: David Scott
Chair: TO BE NAMED

"Visualizing Speech Data and Hidden Markov Models," Jim Schimert [speaker], Andreas Buja, Werner Stuetzle, Statistical Sciences, Inc., [email protected]

"Visualizing Multivariate Rank Data," G. Thompson and K. Baggerly, Southern Methodist University

"Variable Resolution Bivariate Plots," Chisheng Huang, Univ. of Washington, [email protected]


Statistical Graphics Roundtable Luncheon Discussions

1. "Software for statistical graphics," James E. Gentle, George Mason University, [email protected]

2. "Software for Multivariate Functional Estimation and Visualization," David W. Scott, Rice University, [email protected]

3. "Issues in implementing graphical functions," Linda Clark, AT&T Bell Labs, [email protected]

4. "Meaningful Statistical Graphics for Manufacturing," Karen Kafadar, National Cancer Institute, [email protected]

5. "Single Frame Video Recording: The Wave of the Future?," William F. Eddy, Carnegie Mellon University, [email protected]

6. "Graphics in Bayesian Statistical Computing," William DuMouchel, New England Biomedical Research Foundation, [email protected]

Statistical Computing Invited Program

The Statistical Computing Section has an exciting collection of eight invited and special contributed sessions for the annual ASA meeting in San Francisco.

The session that received the top number of votes in the ASA competition for extra sessions was organized by Ed Wegman of George Mason University and titled "Virtual Reality for Exploratory Analysis." The session is a tutorial on using virtual reality to explore high dimensional data. Rotating a point cloud on a screen to "see" in three dimensions is an example of the kind of advances in graphical data analysis we saw in the 1980's. Virtual reality appears to offer the next step for the 1990's: it can be used to "immerse" the statistician in the data world.

A second session, entitled "Software Statistics" and organized by David James, AT&T Bell Laboratories, examines the role of statistics in the relatively new field of software engineering. Scott Vander Wiel, AT&T Bell Laboratories, will discuss "Capture-Recapture and Other Statistical Methods for Software Inspection Data," while Wendell Jones, Bell Northern Research, will discuss "Modeling for Software Systems." Also, Bill Curtis, Carnegie-Mellon University, will discuss "Conundrums in Applying Statistics to Software Engineering."

Getting up to speed on high performance computers is the topic of the session "High Performance Computing"

organized by Janis Hardwick, University of Michigan, who will present "Computational Analyses of Sequential Allocation Problems." Phil Spector, University of California, Berkeley, will present "Parallelizing CART," and George Ostrouchov, Oak Ridge National Laboratories, will present "Statistical Modeling Applications on Parallel Computers." Quentin Stout, University of Michigan, will be the discussant.

Sallie Keller-McNulty, Kansas State University, organized a session that looks at nontraditional models, entitled "Statistics Driven by Non-Classical Assumptions: Spatial Sampling, Estimation, and Modeling." She will present "Mean Field Estimation for Probability Distribution Function Methods of Turbulent Reactive Flows." Katherine Campbell, Los Alamos National Laboratory, will discuss spatial considerations in environmental sampling in "Sampling the Continuum: The Spatial Support of Environmental Data." Douglas Nychka, North Carolina State University, will discuss "Stretching and Translating Surfaces to Model the Drain Current in a Semiconductor."

G. Pliego of ITESL, Mexico, has organized a session, "Wavelet Software and Algorithms in Statistics," that includes an introduction to wavelets. "Computational Techniques for Wavelet Applications" is the title of the talk by Wim Sweldens, University of South Carolina and Katholieke Universiteit Leuven, Belgium. Z. Wang, Purdue University, will present "Software Examples for Statistical Wavelet Analysis." Carolyn Carroll, IBM, will discuss "Parallelizing Wavelets."

John Maryak, The Johns Hopkins University, organized "Recursive Methods for Optimization in Time Series," with Larry Goldstein, University of Southern California, as discussant. T. L. Lai, Stanford University, will present "Parallel Recursive Algorithms in Time Series Estimation and Control." James Spall, The Johns Hopkins University, discusses "Accelerated Stochastic Optimization in Multivariate Problems," and John Monahan, North Carolina State University, presents "Applications of Stationary Stochastic Approximation."

The session "Smoothing in Data Analysis: Tutorial on Methods and Applications" is organized by Karen Kafadar, National Cancer Institute, who will present "Choosing among Two-dimensional Smoothers in Practice." Trevor Hastie, AT&T Bell Laboratories, will present "What Smoothers Do," and Colin Goodall, Penn State University, will present "Nonlinear Smoothers for One-Dimensional Data." Katherine Hansen, Sandia National Laboratories, will present "Smoothing Multidimensional Data," and Owen Devine, Centers for Disease Control, will act as discussant.


Tim Hesterberg, Franklin and Marshall College, organized the session "Weighted Samples in Simulation," and he will present "Importance Sampling and Control Variates for the Bootstrap." George Easton, University of Chicago, will discuss "Configural Polysampling and Some Related Research," while Peter Glynn, Stanford University, will discuss "Rare Event Simulation for Queues."

Mary Ellen Bock, Program Chairman 1993
Statistical Computing Section

Statistical Computing Roundtable Luncheon Discussions

Five interesting luncheon discussions are planned in San Francisco. The topics and a brief description of each are given below. Feel free to contact the discussion leaders for more information. To participate in these discussions, you need to sign up for the luncheon when you register for the conference. All of these luncheons will be scheduled for the same day.

1. "Impact of Computing on Industrial Experimental Design." Perry D. Haaland from Becton Dickinson Research Center ([email protected]) will be leading a discussion on a variety of issues including new results in optimal design, inference based on robust methods, graphical analysis, new results on dispersion modeling, Taguchi or "classical" approaches, industrial training, and practical experiences. Perry also plans to leave some time for the sharing of "war stories."

2. "Using the Statistical Computing and Statistical Graphics Newsletter as a Vehicle to Keep Up with Recent Developments in Computing." Newsletter Editor Jim Rosenberger from Penn State University ([email protected]) will lead this discussion. This roundtable will be the perfect vehicle for those interested in providing input to our ever growing and evolving Section Joint Newsletter.

3. "The Design and Analysis of Computer-Based Experiments." Albert M. Liebetrau from Battelle-Northwest ([email protected]) plans to cover this topic by encompassing the work of the "Sandia" school (the Iman and Conover Latin hypercube sampling approach), the "Oak Ridge" school (Worley, Oblow, Pin, and others), and the "Second Oak Ridge" school (Sacks, Welch, Morris, and Mitchell). Discussion will also cover the recent theoretical work by Art Owen on Latin hypercube designs and orthogonal arrays.

4. "Developing Departmental Computing." Mike Conlon from the University of Florida ([email protected]) will discuss the development, administration, and maintenance of modern departmental computing networks. Issues of resource sharing (coordinating funds as well as equipment), specifications (what should the system be able to do), performance, software, vendors, migration (acceptance and training), institutional barriers (technical and bureaucratic), and more will be discussed. Mike will consult on how to get from various starting points (stand alone PCs, mainframes) to operational networked systems.

5. "Future Directions in Statistical Computing." John Chambers from AT&T Bell Laboratories ([email protected]) will lead this discussion. This luncheon parallels an Invited Speaker Session with John, Wayne Oldford, Werner Stuetzle, and Dave Andrews on the same topic. Lively discussion on our future is expected to be found here!

Interface '93 Meeting Highlights

25th Symposium on the Interface: Computing Science and Statistics

Statistical Applications and Expanding Computer Capabilities

April 14-17, 1993
San Diego, California
Pan Pacific Hotel

Keynote Speaker: David Brillinger, "Statistics and Computing in Science"

Sponsor: Interface Foundation of North America

Hosts: Precision Data Group and University of California, Berkeley

Cooperating Societies and Institutions: American Statistical Association (ASA); Institute of Mathematical Statistics (IMS); Society for Industrial & Applied Mathematics (SIAM); Operations Research Society of America (ORSA); The Biometrics Society (WNAR); University of California, Berkeley; San Diego State University, San Diego; Northern and Southern California Chapters of the American Statistical Association.


Invited Sessions include: Data Compression; Computing with Environmental Data; Biopharmaceutical Maps and Graphics; Clinical Trials; Protein Structure; Digital Networks; User Interfaces; Geosciences; Software Engineering and Statistical Methods; The Interface at 25; Likelihood Applications; Library Systems; Medical Applications; Multivariate Function Estimation; Networked Information Systems; Supercomputers; Time Series Analysis; Wavelets; Computers and Statistics in Drug Discovery; Quality Data Bases.

Inquiries should be sent to:

Interface '93
Michael E. Tarter, Program Chair
140 Warren Hall
University of California, Berkeley
Berkeley, CA 94720
(510)
[email protected]

Conference proceedings of invited and contributed papers will be published. Camera-ready copy will be due on June 1, 1993.

Conference Schedule: The conference begins on Wednesday evening, April 14, with a get-acquainted reception. Technical sessions will be held Thursday and Friday, with a banquet Thursday evening. There will be final technical sessions Saturday morning.

Short Courses:

- SYSTAT: Data Analysis. Two days prior to the conference, Monday, April 12 and Tuesday, April 13. SYSTAT produces statistical software designed for data analysis. Registration for this short course will be handled separately.

- BMDP: A New Windows-Based Statistical Analysis Environment will be presented Wednesday afternoon, April 14. There is no fee for this course; however, please note on the registration card if you plan to attend.

- Randomization Tests: Jenny Baglivo, Marcello Pagano, and Cathie Spino will offer a short course on Randomization Tests: Theory and Practice. There is no fee for this course. Please note on the registration card if you plan to attend.

Registration: The registration fee is $165 for members of the cooperating societies (ASA, IMS, SIAM, ORSA, and the Biometrics Society (ENAR and WNAR)) and for persons affiliated with the University of California, Berkeley. The fee is $55 for students; for others it is $185. Registrations received after February 15, 1993 will be

charged a late fee of $20. The registration fee covers the reception, coffee breaks, banquet, and the proceedings. Please make checks payable to Interface '93.

GENERAL INFORMATION

Accommodations: All meetings will be held at The Pan Pacific Hotel in downtown San Diego, California. Conference room rates of $99 per room have been arranged for all attendees. Early registration is strongly recommended to assure a room. Please contact the hotel directly for reservations.

THE PAN PACIFIC HOTEL, 402 West Broadway, San Diego, CA 92102-3580. Telephone (619) 239-4500 or (800) 626-3988.

The conference rate for Interface '93 participants is honored until March 14, 1993. The rate can be extended 3 days prior to and/or following the conference. Rates are $99 for either a single or double room. Reservations are held until 4 p.m. without a deposit or accepted credit card.

Air Fare: At the present time, no airlines are offering discount association fares. Vineyard Travel in San Diego will be happy to assist you with your travel plans. Their telephone number is (619) 741-6669. Rosemary Nigro is our corporate travel consultant.

San Diego International Airport: San Diego International Airport is served by most major airlines. Shuttle bus and limousine service are available to the Pan Pacific Hotel in downtown San Diego.

Program Committee: Lynne Billard, Mary Ellen Bock, Noel Cressie, Arnold Goodman, Sam Greenhouse, Jon Kettenring, Diane Lampert, Michael Lock, Bob Newcomb, Joseph Newton, John Rice, Ernest Scheuer, Bob Shumway, Michael Tarter, Grace Wahba, Edward Wegman

Interface Foundation: The Interface Symposium is an activity of the Interface Foundation of North America, a nonprofit, educational corporation founded in August of 1987. The aim of the IFNA is to promote the Interface Symposium and related activities at the interface of computing science and statistics. The 25th Symposium is the sixth held under the auspices of the Interface Foundation. The next symposium will be hosted by the SAS Institute in North Carolina in 1994.


News from NSF

February 16, 1993

Dear Colleague:

Announcement of Proposal Target Dates

In order to improve the Division's proposal management, and possibly to employ disciplinary panels in the merit review of proposals, the Division of Mathematical Sciences plans to introduce target dates for proposal submission for disciplinary research activities for FY 1994 NSF funds.

Beginning in the fall of 1993, the Division will introduce two target dates for proposals submitted to the following programs:

Oct. 22, 1993:
  Algebra and Number Theory Program
  Classical Analysis Program
  Modern Analysis Program
  Topology and Foundations Program

Nov. 19, 1993:
  Applied Mathematics Program
  Computational Mathematics Program
  Geometric Analysis Program
  Statistics and Probability Program

These dates do not overlap substantially with other known Foundation target dates, mesh reasonably well with academic calendars, and cluster the programs so as to provide a balance with respect to both overlapping scientific content and anticipated program proposal loads.

Proposals which miss the target dates will be handled as time permits. Priority will be given to proposals arriving on or before the above target dates.

The above dates do not apply to the activities of the Division’s Office of Special Projects. These activities already have specified target or deadline dates.

Sincerely,

Frederic Y. M. Wan
Division Director
Division of Mathematical Sciences

News from NSA

ANNOUNCEMENT

The NSA Mathematical Sciences Program continues its efforts at funding high-quality mathematical research in the areas of Algebra, Number Theory, Discrete Mathematics, Probability, Statistics, and Cryptology. The program, in its present form, had its beginning in 1987, when the then director of the National Security Agency, Lieutenant General William E. Odom, announced the expansion and redirection of Program OCREAE, NSA’s grants program for research in Cryptology and related areas. This effort is currently being vigorously supported by the Agency.

The grant proposals submitted to the program are reviewed by the NSA Mathematics Review Panel, which is appointed and administered by the American Mathematical Society. Under the guidance of this panel, the program has been particularly interested in supporting promising young investigators with small summer salary grants. The program has also directed a significant portion of its funds to senior investigators for the support of their graduate students, and to university departments for the support of special conferences and workshops. In an attempt to formalize these objectives, the MSP program now offers funding in four distinct categories: the Young Investigators Grant, the Standard Grant, the Senior Investigators Grant, and the Conferences, Workshops, and Special Situations Grants.

There is a firm deadline of October 15, 1993, for all proposals except conferences. Funding will begin as soon as possible after October 1, 1994.

Conference proposals are accepted at any time; allow eight months for funding.

Further information on grants is available by calling (301) 688-0400 or writing:

Dr. Charles F. Osgood, Director
NSA Mathematical Sciences Program
National Security Agency
Attention: R51A
Ft. George G. Meade, MD 20755-6000
[email protected]

Electronic White House

There has been a lot of recent information and disinformation about the Clinton-Gore electronic White House. We have seen various postulated e-mail addresses for contacting the new president of the United States, but none of the addresses is completely satisfactory. One common outcome of e-mail to the President is a reply via regular paper mail, which is not quite what one would expect! The most common explanation? A lack of funds to buy computers and connect to the outside world. With that cleared up, it is still worth reporting some of the ways of contacting the white cottage.

The new Clinton-Gore Administration has several electronic mail addresses. The MCI Mail box address (see below) and bulletin board have received the most publicity.

GEnie has a PF (Public Forum) section which carries MIT-generated files from White House press briefings and speeches, but that forum offers no e-mail feedback to the Clinton Administration. It serves the public interest by making important text files and other public information widely available.

What if you really want to get a message through to Washington? In the past this meant letters or telegrams, but now you can save paper and attempt direct electronic contact in several ways.

The following mailbox addresses are reported to work for sending e-mail to the White House and entering discussion/file areas related to the new administration:

• CompuServe: 75300,3115 (e-mail); GO: WHITEHOUSE (White House forum)

• America Online: clinton pz (e-mail); KEYWORD: WHITEHOUSE (White House area)

• MCI: WHITE HOUSE (e-mail); VIEW WHITEHOUSE (views bulletin boards)

• Internet e-mail addresses: [email protected]; [email protected]; [email protected]

We haven’t personally tried any of these addresses or discussion forums. Perhaps our readers could give us some feedback!

1994 Joint Statistical Meetings in Toronto, Canada

As Statistical Computing Program Chair for 1994, I am soliciting your help. Now is the time to give me suggestions for topics, speakers, and organizers for our invited sessions. If you have any ideas along these lines, please contact me before or at this year’s Joint Meetings in San Francisco.

Sallie Keller-McNulty
Kansas State University
[email protected]

The Statistical Computing and Statistical Graphics Newsletter is a publication of the Statistical Computing and Statistical Graphics Sections of the ASA. All communications regarding this publication should be addressed to:

James L. Rosenberger
Editor, Statistical Computing Section
Department of Statistics
The Pennsylvania State University
University Park, PA 16802-2111
(814) 865-1348
[email protected]

Michael M. Meyer
Editor, Statistical Graphics Section
Department of Statistics
Carnegie Mellon University
Pittsburgh, PA 15213-1380
(412) 268-3108
[email protected]

All communications regarding membership in the ASA and the Statistical Computing or Statistical Graphics Sections, including change of address, should be sent to:

American Statistical Association
1429 Duke Street, Alexandria, VA 22314-3402

(703) 684-1221

Where to find it

A WORD FROM OUR CHAIRS ... 1
  Statistical Computing ... 1
  Statistical Graphics ... 9

FEATURE ARTICLE ... 1
  Saxpy, gaxpy, LAPACK, and BLAS ... 1

EDITORIAL ... 2

SECOND FEATURE ... 2
  Production of Stereoscopic Displays for Data Analysis ... 2

DEPARTMENTAL COMPUTING ... 7
  Not just hardware and software ... 7

COMPUTER COMMUNICATION AND NET SNOOPING ... 16
  Gopher and other resource discovery tools ... 16
  Getting your Facts from FAQs ... 17

BITS FROM THE PITS ... 18
  Statistical Computing and Graphics in Science and Industry ... 18

GEOGRAPHIC INFORMATION SYSTEMS ... 19
  Designing the GIS Interface ... 19

BOOK REVIEW BEAT ... 21

NEWS CLIPPINGS ... 22
  Joint Statistical Meetings ... 22
  Interface ’93 – Meeting Highlights ... 24
  News from NSF ... 26
  News from NSA ... 26
  Electronic White House ... 26