Structured programming for virtual storage systems

  • Published on
    10-Mar-2017

  • View
    213

  • Download
    1

Transcript

  • The disciplines of structured programming and programmingfor virtual storage are examined to show how they affect eachother.

    Considerations and techniques are proposed that, when appliedduring the process of structured design and coding, produceprograms that place fewer demands on the computer storageresources.

    The techniques are illustrated by example programs.

    Structured programming for virtual storage systemsby J. G. Rogers

    The design and coding of computer programs have been affectedin recent years by two widely differing events. The introductionof virtual storage operating systems has reshaped our thinkingabout storage size and occupancy. At the same time, the disci-pline generally known as structured programming has causedgreat change in the process of design and coding. A body of lit-erature has grown in both these areas, but little has been writtenthat relates the two. This paper compares virtual storage sys-tems and structured programming, and shows the constraintsthat each places on the other.

    The concepts involved in structured programming began emerg-ing in the late 1960s, when it became apparent that the complex-ity of very large programs was causing increasing problems inprogramming and maintenance. Structured programming hasbeen very successful in reducing that complexity. Structuredprogramming, however, has developed largely without regard tovirtual storage. This has been primarily because virtual storageoperating systems were not in general use during much of thedevelopment of the structured programming concept and be-cause the concepts of structured programming transcend partic-ular operating systems.

    NO.4, 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 385

  • structuredprogram

    design

    Implementations of virtual storage have proved not to be trans-parent to programs running in such systems. Because of this,techniques have evolved for coping with this lack of transparen-cy. These techniques have developed without regard to struc-tured programming and some suggestions that are quite alien togood structured programming practices have been made.

    The orientation of this paper is toward preserving the conceptsof structured programming. The paper also shows that, in mostcases, the concepts of structured programming are quite suitableto the virtual storage environment. Presented first are brief de-scriptions of structured programming and virtual storage con-cepts, which are followed by a discussion of ways in which eachaffects the other.

    Structured programming

    The discipline of structured programming has evolved as ameans of increasing program reliability and reducing program-ming and maintenance costs by making programs much lesscomplex and thus more understandable. Some of the work inthis field has been concerned primarily with program design.i"whereas other work is more general.t"

    The primary goal in the design of structured programs is to pro-duce modular program structure through successive functionaldecompositions. An example of this might be for a compiler tocreate a top-level module whose function is to compile the pro-gram, as shown in Figure 1. The first level of decompositionmight result in the following three modules: GET INTERNALTEXT, CONVERT SOURCE TO OBJECT, and WRITE OBJECT PRO-GRAM. The top-level module, COMPILE PROGRAM, calls theother three modules. A further decomposition of the GET INTER-NAL TEXT module might result in the following modules: GETPROGRAM STATEMENT, ANALYZE STATEMENT, and BUILD IN-TERNAL TEXT.

    A module in the context of Figure 1 is a compilable unit ofsource code (e.g., a pLiI external procedure or a FORTRANsubprogram). Modules created in this manner should be madehighly independent of each other, which can be done by mini-mizing the relationships among modules (called module coup-ling), and by maximizing the relationships among parts of eachmodule (called module strengths?

    The measure of the relationships among the parts of a module ismodule strength. Module strength is maximized if the parts ofthe module join forces to perform a single specific well-definedfunction." In this context, if a module calls another module, the

    386 ROGERS IBM SYST J

  • Figure 1 Modular structure by successive decomposition of function

    COMPILEPROGRAM

    II I I

    GET WRITEINTERNAL OBJECT

    TEXT PROGRAM

    II I I

    GET ANALYZE BUILDPROGRAM STATEMENT INTERNALSTATEMENTS TEXT

    function of the called module is considered part of the callingmodule's function. One test of a module's strength is to attemptto write a sentence about the module's function. The module inFigure I called ANALYZE STATEMENT is a strong module be-cause it can be described in a simple sentence with one verb andno words such as "if," "then," "next," "first," or "after."

    The measure of the relationships among modules is module cou-pling. In this sense, a relationship exists between two modules ifthey refer to the same data. Some of the more common tech-niques used for referencing data among modules are externallydeclared data (such as data with an EXTERNAL attribute in PL!Iprograms), global data areas (for example, control blocks ordata in a FORTRAN COMMON area), and parameters or argu-ments.' Externally declared data and global data areas createhigh coupling, which means that the modification of one modulemay require the modification of many more modules. To mini-mize coupling, data should be explicitly passed as parameters orarguments.'

    The nature of the data passed between modules should also beconsidered. The data are either control information - in whichthe calling module tells the called module what to do-or puredata. Control information should be avoided because it increasescoupling." If a module tells another what to do, the second mod-ule is not considered to be a "black box" module. The classifica-tion of data used in this paper depends on how the sending mod-ule sees the data.

    Some other characteristics that should be considered are deci-sion structure, module size, and predictability. Whenever pos-sible, it is desirable to arrange modules in such a way thatmodules directly affected by a decision are located beneath the

    No.4, 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 387

  • structuredprogram

    writing

    module that contains the decision step.' Very small modulestend to create excessive overhead, whereas very large modulesmay be difficult to understand. Experience indicates that modulesof between ten and one hundred statements are the optimumsize." A module is said to be predictable if, for each time it iscalled with the same input, it produces the same output. In otherwords, the module does not "remember" what it has previouslydone and thus perform a slightly different function on subse-quent calls. Such a module is independent of its environment,and is more likely to be usable in several contexts.

    The programming phase of structured programming is con-cerned with reducing proposed modules to source languagestatements. The primary concept involved in structured pro-gramming is the use of structures such as DO-WHILE and IF-THEN-ELSE to control a program's flow rather than using GOTOstatements. Another concept is the possible decomposition ofmodules into segments."

    The process of segmentation is similar in many ways to the de-composition of a program into modules, each of which performsa single function. A module is segmented by decomposing itsfunction into more rudimentary subfunctions, each of whichbecomes a segment. Each segment is a separate entity whichexists in external storage and is included in the module in a pre-processor phase of the compiler. Just as there are-many levels ofmodules, each of which calls others at deeper levels, there aretypically many levels of segments, each of which contains IN-CLUDE statements for other segments.

    Each segment created consists of source language statementsand perhaps INCLUDE statements for other segments. Thesource language statements in a segment define the controlstructure of the segment and perform part of the segment's func-tion. Each INCLUDE statement for a segment represents somedistinct function whose source statements are to be included later.

    The concept of segmentation may be seen in Example 1, a pLiIprogram fragment:

    Example 1IF COMMAND = 'START' THEN

    DO;%INCLUDE STOPCMND:

    END;ELSE

    IF COMMAND = 'STOP' THENDO;

    %INCLUDE STRTCMND;END;

    388 ROGERS IBM SYST J

  • In Example 1, the programmer has written the control structurein the current segment. Because the intervening processing codehas been left to other segments, the control structure is easilyvisible and understandable. The segments STRTCMND andSTOPCMND are to be coded later. These segments are expectedto be more easily understood because they are small indepen-dent pieces of code, each of which performs a single function.

    Mills9 describes the programming process as follows:

    ". . . one can write the first segment which serves as a skeletonfor the whole program, using segment names, where appropriate,to refer to code that will be written later. In fact, by simply tak-ing the precaution of inserting dummy members into a librarywith those segment names, one can compile or assemble, andeven possibly execute this skeleton program while the remainingcoding is continued.

    "Now the segments at the next level can be written in the sameway, referring, as appropriate, to segments to be later writ-ten. . . . As each dummy segment becomes filled in with its codein the library, the recompilation of the segment that includes itwill automatically produce new updated expanded versions ofthe developing program."!'

    There are several characteristics that segments should have in astructured program. The segment should be small enough to belisted on a single page, Le., have no more than fifty statements.Each segment should represent a single function, it should beentered only at the first statement of the segment, and it shouldexit only at the last statement.

    The control logic contains no GOTO statements. In PL/I, thebranching control can be defined entirely in terms of DO loops,IF-THEN-ELSE, and ON statements. The resulting code can beread strictly from top to bottom, with greater understanding thanwould otherwise be possible."

    Programming for virtual storage systems

    Some programs tend to cause excessive paging when run in vir-tual storage systems. As these programs have been observedand the causes of paging analyzed, a set of techniques hasevolved. In contrast to the concepts of structured programming,which are applied systematically during the process of programdesign and coding, the concepts of programming for virtualstorage'

  • virtualstorage

    because the present state of programming for virtual storage sys-tems is much more of an art than a science.

    The virtual storage systems refered to in this paper are thoseimplemented in the IBM System / 370 series of computers. Inthese systems, virtual address space is divided into fixed-lengthpages that map into page-size units in the computer's real stor-age, which is generally smaller than its virtual storage. Thesepage-size units are called page frames. When a program refersto an address in virtual storage that is not currently in real stor-age, that occurrence is called a page fault. The operating systemis designed to bring the referenced page into real storage throughan action called paging.

    Each program that runs on a virtual storage system has, at anyparticular instant, a working set of pages that belong to it. Theworking set is simply the pages in real storage that the programis predicted to use in order to run for a given interval of timewithout incurring a page fault. The actual working set typicallychanges with the time intervals of the program's execution. Ifthe set of programs that is contending for real storage space hascumulative working sets larger than the real storage available,then paging also occurs for that reason. If the rate of pagingbecomes so high that the system resources are used as much ormore for paging than for productive work, a condition calledthrashing is said to prevail.

    When considering means for correcting an existing program thatappears to be causing thrashing, one may hastily conclude thatsuch a program may be incurring too frequent page faults. This,however, may not be the correct deduction from the facts. In thefirst place, not all page faults are incurred by a given program;other programs are incurring page faults as well. Further, if thegiven program were to run alone, it would probably not incurpage faults. Thus a more useful deduction may be that the givenprogram requires too much real storage to run in the prevailingmultiprogramming environment. Rather than seeking ways ofreducing page faults, it may be preferable to reduce the real stor-age necessary for the program by reducing the working-set sizeduring periods in which the working set size may be a problem.

    To determine the measures necessary to reduce a program'sworking set size, a useful approach is to begin with the workingset and proceed backward into the program. This procedure hasthe desirable characteristic of beginning with large elements andconsidering successively smaller elements with each step. Intaking this approach, one assumes that the working set size istoo large, and at each step he looks at what can be done to re-duce the working set size.

    390 ROGERS IBM SYST J

  • Figure 2 The recording ofcontrol sections, (A)Given working set,(B) Reduced workingset.

    If a program's working set consists of too many pages, one'sfirst action is to analyze the contents of the pages. Next in theprogramming hierarchy below the working set is the control sec-tion, which is the smallest element of a program that is identifi-able by the linkage editor. Rearrangement of the most frequentlyreferenced control sections may be possible so as to occupyfewer pages, as shown in Figure 2. Here the heavy black linesrepresent page boundaries, and the shaded portions representcontrol sections currently being referenced. Figure 2A showsthat control sections A, C, E, H, and K are all that are currentlyneeded in the working set. Because of the way in which the pro-gram is packaged, however, the working set consists of fivepages. Figure 2B shows the control sections after rearrangementso that the current working set requires only three pages. Sincethe working set varies from time to time, the overall objective isto optimize the total execution of the program. Myers" gives agood description of the process of reordering control sections.

    With the ability to reorder control sections, another considera-tion is to create control sections that lend themselves better toreordering. The easiest way to do this is to generate small con-trol sections that have desirable characteristics. Such small con-trol sections lend themselves to external arrangement to buildpages that have desirable characteristics. The ease with whichthis can be done varies considerably among the different lan-guages in use today."

    At this point, the paradigm program has been decomposed intomodules. The program is thus at the terminal point of virtualstorage design: a collection of proposed modules ready to becoded.

    It should be kept in mind during the design of program modulesthat are to be executed together with other program modules ina virtual storage environment that we are concerned with con-serving and making the most efficient use of real storage. Weneed not be concerned about virtual storage by itself. Thus,those parts of programs that are executed only rarely need notbe of concern." This indicates that for any program there is aset of modules - perhaps entire branches of trees - for whichstorage is of no concern either during the design process or dur-ing the coding of the modules.

    The modules must contain code, all of which is needed at thesame time. Further, the larger a module that is written, thegreater the distance the data needed by a particular instructionmay be from the instruction itself, a condition that might wellincrease the working set size and possibly cause page faults. Inessence, modularity should be by proximity of usage rather thanby similarity of function."

    o

    c

    (A)

    designingforvirtualstorage

    (8)

    NO.4' 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 391

  • writingprograms

    forvirtual

    storage

    All code (if more than just a few bytes long) that is designed tohandle exceptional conditions such as error recovery, permanentdiagnostic capability, etc., should be apart from mainline code,preferably in separate control sections (CSECTS) .15 This givesthe programmer the ability to remove such code from the pro-gram's working set, which thus becomes smaller."

    At this time, it is necessary to consider data that are referencedby more than one module. In other words, algorithms internal tosome modules must be taken into account. As an example, sup-pose a module must build two 64 by 64 double-precision ma-trices. Each of these matrices requires 32K bytes or eight pagesif the page size is 4K bytes. The algorithmic decision to be madeat this time is whether these matrices are suitable for sparsematrix techniques. If they are, it is possible that a great savingsin real storage can be accomplished. If they are not, then somedecisions must be made. If another module must multiply thesetwo matrices a decision must be made at this level as to whetherone of them should be stored as its transpose." The coding ofmultiplication does not require more complexity by one methodthan by another. There may be, however, a tremendous differ-ence in the storage reference patterns. Figure 3A shows the cor-responding elements to be multiplied for two such matrices, Aand B, and Figure 3B shows the corresponding elements for Aand the transpose of B (written Bt ) .

    Similar cases may exist whenever two modules must pass largemasses of data between them. It is impossible to list all suchpossibilities. However, if one understands the principles wellenough, then he will be aware of such occasions.

    Whereas design for virtual storage is considered here in terms ofgeneral principles, writing programs for virtual storage involvesa somewhat more solid technique. There is still, however, anaura of art about such programming, and it is difficult to stategeneral principles that are always valid and can be applied in"cook-book" fashion. Many of the techniques also depend on aspecific language or a specific compiler implementation.

    Some further rules that are generally applicable to programmingfor virtual storage systems are the following:

    Reference the data in the order in which they are stored,and / or store the data in the order in which they are refer-enced." This is not always possible, of course, and the orderin which an array is referenced is of no consequence if thearray fits into a single page."

    If possible, separate read-only data from areas that are to bechanged." This is of lesser importance than good locality ofreference.

    392 ROGERS IBM SYST J

  • Figure 3 Module design for matrix multiplication, (A) Multiplication AB, (8) Multipli-cation AB'

    A

    B

    D

    D

    D

    D

    (A)

    ) 1 PAGE

    ) 1 PAGE

    Bt

    (8)

    } 1 PAGE

    ) 1 PAGE

    Avoid the use of elaborate search strategies for large dataareas, and avoid the use of large linked lists if these tech-niques cause a wide range of addresses to be referenced.P

    Virtual storage coding techniques vary by language and par-ticular language implementation.

    In PUI, the AREA variable is of particular interest. Basedstorage can be allocated inside an AREA variable. Thus,based variables that are to be used together can be local-ized." The code fragment in Example 2 shows that basedvariables ALPHA and BETA can be made close together.

    Example 2DCL GAMMA AREA;DCL (ALPHA, BETA) BASED;

    ALLOCATE ALPHA IN (GAMMA);

    ALLOCATE BETA IN (GAMMA);

    PUI allows considerable flexibility in declaring data aggre-gates. Also in PUI, arrays of structures and structures of ar-

    No.4, 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 393

  • Figure 4 Example FORTRAN storage areas

    CONTROL SECTiONS

    MAIN

    XYZ

    $BLANKCOM

    N

    IBCOM

    FIOCS

    -------f...---------------------

    {----------------r-------

    {{{{

    CONTENTS

    ENTRYPOINT,BRANCHTOEXECUTABLE CODE

    STORAGE FORFORMATSTATEMENT

    STORAGE FORLOCALVARIABLES I, J, ETC.

    STORAGE FORLOCALARRAYS G, H, ETC.

    EXECUTABLE CODE

    SIMILARTO MAIN PROGRAM

    STORAGE FORVARIABLES AND ARRAYS IN BLANK COMMON

    STORAGE FORVARIABLES AND ARRAYS IN NAMED COMMONAREAN

    I/O, INITIALIZATION,AND ERROR HANDLINGMODULE.

    I/O MODULECALLEDFROMIBCOM

    rays should be declared in the order in which they are to bereferenced most frequently. Multiply dimensioned arrays inPL/I are stored rowwise.v

    FORTRAN stores arrays by column." Avoid implied FORTRAN DO loops in I/O statements because

    they cause repeated return to the calling program." In COBOL, data within the working storage section are allo-

    cated in the order in which they are declared. Therefore, de-clare data that are to be used together in consecutive state-ments.v'

    Also in COBOL, files that are used together should be openedin the same statement. This causes their buffers to be closetogether and improves the locality of references as buffersare processed.P

    Source language and compiler implementations

    During the virtual storage program designing and coding pro-cesses, a choice of source language must be made. It is possiblethat a single language may be used throughout the program, buta mixture of languages may also be used. In any case, the char-acteristics of the language and the implementation must be un-derstood if the program is to make efficient use of virtual stor-age.

    394 ROGERS IBM SYST J

  • The four primary languages used by IBM virtual storage systemsare COBOL, FORTRAN, PLlI, and Assembler Language. For thefirst three, several different compilers are available, of which weare concerned only with the optimizing compilers because theytend to produce less code and reduce data references. The mod-ule structure of Assembler Language programs is under the con-trol of the programmer, and it is not considered here. COBOL,FORTRAN, and PUI and their optimizing compilers are discussedin the following three sections.

    The ANS COBOL version 4 compiler generates one control sec-tion for a main program and one for each separate subprogram.Data referenced within a program module are stored in the samecontrol section. The data and executable statements appear inthe control section in almost exactly the same order in whichthey appear in the source language statements. A large mainprogram may be divided into several control sections by using thesegmentation feature. When using segmentation, all data remainwith the root segment; only executable code appears in the othercontrol sections along with a small amount of control informa-tion.

    The FORTRAN H extended compiler produces one control sec-tion for a main program and one control section for each subpro-gram. Blank COMMON is a control section. Each named COM-MON area is a separate control section. Library subprograms arecontrol sections. Most library subroutines are explicitly called,but some subroutines such as exponentiation routines and I/Oroutines are implicitly included. Figure 4 shows the control sec-tion structure for the code in Example 3.

    Example 3COMMON A, B, C' ..COMMON /N/D, E, F' ..DIMENSION G(10), H(20), ...

    J=[

    DO 20 [= LlOO

    20 CONTINUECALLXYZ(J)WRITE (6,30)J

    30 FORMAT (12)STOPENDSUBROUTINE XYZ(K)

    RETURNEND

    ANS COBOLversion 4

    FORTRANH extendedcompiler

    No.4' 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 395

  • Figure 5 Example PL/l slorage areas

    CONTROLSECTIONS

    , "'." '\ ABC2

    ABC 1

    '" '" ,. XYZ2

    ." ~ .. "XYZ I

    D

    tFIXEDSTORAGEDYNAMICSTORAGE

    ~

    ALLOCATED ONLYWHEN XYZ ISEXECUTING

    1-- - - -- ---- - --

    1-------1------ -

    1-------

    CONTENTS

    CONSTANTS,CONTROLINFORMATIONFORABC

    STORAGE FORE

    SUBROUTINE CALL PARAMETERLISTS, OTHERCONTROLINFORMATION

    EXECUTABLE CODEFORABC

    SIMILAR TO PROCEDURE ABC

    EXECUTABLE CODEFORXYZ

    STORAGE FORVARIABLES,A, B, C

    STORAGE FORARRAY0

    LIBRARYSUBROUTINES

    SAVEAREA,ERRORAND STORAGE CONTROLINFORMATION FORABC

    STORAGE FOR F, P, I, J AND TEMPORARY WORK AREAS FOR ABC

    SAVEAREA, ERRORAND STORAGE CONTROLINFORMATIONFORXYZ

    STORAGE FORVARIABLESAND TEMPORARY WORK AREASFORXYZ

    STORAGE FORG

    DYNAMICALLY LOADED LlBRARY SUBROUIINES WHENNEEDED

    PL/Ioptimizing

    compiler

    The most notable characteristic of the module structure is thatlocal variables and arrays are in the same control sections asexecutable code, and thus they are not separable at linkage edit-ing time. The !BCOM module, which is fairly large, is used duringexecution only for uo statements and PAUSE and STOP state-ments. FlOeS is used for all I/O statements.

    The PL/I optimizing compiler produces two or more control sec-tions for each external procedure. One control section is createdthat contains only executable code, and another contains controlinformation, constants, and internal variables. Other control sec-tions are included as closed subroutines for actions such as char-acter string assignments. Each variable, array, or structure de-clared as STATIC EXTERNAL is a separate control section. pL/ralso uses dynamically acquired storage. Space for such pro-cedures as register save areas, error handling information, AU-TOMATIC variables, and work areas are in a dynamically ac-quired section of storage. These sections are obtained and freeddynamically when a procedure is entered. If procedure A calls

    396 ROGERS IBM SYST J

  • procedure B, then the dynamic storage area for procedure B isgenerally (and can be made to be) appended to the end of thedynamic storage area for procedure A. Variables that are allo-cated dynamically by the user with ALLOCATE statements are ina different dynamically acquired section of storage. Figure 5shows the storage structure for the PL!I program in Example 4.

    Example 4ABC: PROCEDURE OPTIONS(MAIN);

    DECLARE1 S STATIC EXTERNAL,2 (A,B,C) FIXED BINARY(l5);

    DECLARE D(lO) FLOAT DECIMAL(I6) STATICEXTERNAL;

    DECLARE E(20) FLOAT DECIMAL(l6) STATICINTERNAL;

    DECLARE F(lO) FIXED DECIMAL(4,2);DECLARE G(5) FIXED BINARY(3l) BASED(P);DECLARE P POINTER;

    J = 1;DO I = 1 TO 100;

    END;ALLOCATE G;CALL XYZ(J);

    END;XYZ:PROCEDURE(K);

    END;

    Virtual storage constraints on structured programming

    The disciplines involved in structured programming offer, as wasnoted earlier, specific techniques that-when applied in a sys-tematic manner - produce programs that are more easily under-stood than would otherwise be the case. The techniques in-volved in these disciplines take no notice of the virtual storageenvironment. In this section, the process of structured program-ming is analyzed and the principles of programming for virtualstorage are applied against it. Thus a picture should emerge thatshows the proper way to think about virtual storage during thestructured programming process.

    The first and one of the most important of all considerations isthat of whether real storage constraints are important. Manyprograms have very little impact on the user's system. The con-cern becomes considerably less for large storage configurationsand considerably greater for the smaller systems. Small pro-grams, short-running programs, and one-time programs are not

    No.4' 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 397

  • designconstraints

    largedata

    aggregates

    typical subjects for programming for virtual storage. Large, long-running, and frequently-run programs benefit most from virtualstorage programming techniques. Of course, a program that islarge and long-running on a System/370 Model 145 with 512Kbytes of real storage may be a small, short-running program on aSystem /370 Model 168 with 4096 K bytes of real storage. This isone of the factors to be considered.t" It cannot be emphasizedtoo strongly that the designer must determine his degree of con-cern for real storage as early as possible in the design phase.

    Just as there are entire classes of programs that are insensitiveto virtual storage (when measured by system throughput) thereare also parts of many programs that are similarly insensitive. Aflurry of paging that occurs once every few minutes is of littleconcern. Thus there are things such as human interaction, errorcorrection, and so forth that may be designed and coded ascompletely and elaborately as is convenient, simply becausethey occur infrequently.

    The process of structured programming begins in the designphase. The culmination of the design phase is a set of proposedmodules that exhibit certain desirable characteristics. Amongthese are high module strength, low module coupling, predict-ability, size, and decision structure.

    Some of these characteristics are quite easily dealt with. Predict-ability and decision structure are usually insensitive to virtualstorage. Module strength tends to be generally compatible withvirtual storage programming principles because it tends to local-ize code that is used at the same time.

    Informational strength modules (modules that perform morethan one function with an entry point for each function) may besensitive to virtual storage. If not all of the included functionsare needed at the same points in the program, then code may bedragged into the working set at times when it will not be used.

    Module size is a somewhat more complicated factor. Size, inthis context, refers to number of source language statements. Ifthis number is kept small, then the amount of executable codegenerated is also small. On the other hand, the addition of astatement such as

    DECLARE x(lOOO) FLOAT DECIMAL( 16);

    increases the object module size by 8000 bytes, while addingonly one source statement to the module. Although the modulesize - in terms of source statements - is small, the generatedobject module (which depends on the language implementation)

    398 ROGERS IBM SYST J

  • Figure 6 large aggregate storage problem illustrated by matrix multiplication

    I ALPHA

    BETA

    IN

    ALPHA

    ALPHA, BETA

    OUT

    ALPHA

    ALPHAI GAMMA

    c

    becomes very large, Further, in high-level language implementa-tions, there tends to be a separation of data from the code thatreferences it

    The. preceding problem of large data aggregates is merely onemanifestation of an even greater problem of large data aggre-gates. The suggestion usually given is to make large data aggre-gates into independent modules (STATIC EXTERNAL in PL/I orCOMMON in FORTRAN). This procedure, however, in the struc-tured designer's view, causes a potential COMMON coupling thatis undesirable. The purpose of making these data aggregatesexternally known is so that their location can be chosen at link-age editing time, and the user can arrange the aggregates so thatreal storage is conserved.

    In the case of COBOL, all data must be internal. FORTRAN offersCOMMON, which creates an undesirable COMMON coupling con-dition. PL/I offers STATIC EXTERNAL, which also creates COM-MON coupling, but also offers the AREA variable. The AREA vari-able reserves a section of storage, either statically or dynamicallyacquired. BASED variables may then be allocated to space withinthe AREA. Under some circumstances, this mechanism can beused to preserve locality of reference.

    An example of this problem is illustrated by Figure 6. Module Acontains the storage for and creates matrix ALPHA. Module Acalls Module B, and passes ALPHA as a parameter. Module B

    No.4' 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 399

  • contains the storage for and creates matrix BETA. Module Bcalls module C, and passes ALPHA and BETA as parameters.Module C multiplies ALPHA and BETA, and stores the result inGAMMA. Module C then sets ALPHA equal to GAMMA and re-turns. The FORTRAN program fragment in Example 5 accom-plishes these operations.

    Example 5SUBROUTINE ADIMENSION ALPHA(l6,16)

    CALL B(ALPHA)

    STOPENDSUBROUTINE B (ALPHA)DIMENSION ALPHA(16,16), BETA(16,16)

    CALL C(ALPHA,BETA)

    RETURNENDSUBROUTINE C(ALPHA,BETA)DIMENSION ALPHA(16,16), BETA (16.16), GAMMA(16,16)

    DO 101 = 1,16DO 10J = 1.16GAMMA(I.J) = 0.0DO 10 K = 1,16

    10 GAMMA (l.J) = GAMMA(I,J) + ALPHA(I,K)*BETA(K,J)

    DO 20 I = 1.16DO 20J = 1.16

    20 ALPHA(I,J) = GAMMA(I,J)

    RETURNEND

    In this example, the user has little control over the areas thatALPHA, BETA, and GAMMA occupy in storage. Packageability ofA, B, and C are hampered by the 1024 byte arrays inside eachmodule. It is easily possible that the three modules can re-quire three pages of real storage to execute, and it is possiblethat they may require six pages of real storage. If these arrayswere put in COMMON - the only alternative in FORTRAN - theycould all be put into one page, which would be present onlywhen the arrays are being used. What appears to be desirable inthis case in designing for virtual storage is clearly undesirable instructured design.

    400 ROGERS IBM SYST J

  • Data aggregates are not easy to cope with. On the one hand,making arrays internal is not a good practice for virtual storage.On the other hand, making the arrays COMMON opens them upfor use by entirely different parts of the program, thereby creat-ing a potential reliability problem. A closer look at the designprocess may help in finding a solution to the problem.

    In some mathematical applications, the designer may be awareof alternative algorithms before the design process begins, ormay at least become aware of them at some point during the de-sign phase. There are, for example, at least two ways of calculat-ing eigenvectors, one of which is highly inefficient in virtual stor-age and the other quite efficient. The designer should maintainan awareness of cases where large tables of data items are likelyto be needed. Instead of filling a large table sequentially withvalues and then sequentially using the values one by one, itmight be better to use each value as it is generated, and thusavoid the use of a table entirely. A large, sparsely used table thatoccupies several pages might be better utilized by storing tableentries in sequential locations and keeping an index to the tablerather than storing entries in locations of the table that corre-spond to item numbers.

    Choice of implementation language may also be affected bysome of the above design considerations. pLiI offers a better so-lution to the problem illustrated by FORTRAN in Example 5. ThepLiI solution, which is shown later in Example 6, allows the ma-trices to be kept together and separate from the code, while itpreserves DATA coupling between the modules.

    All the items discussed here thus far have been directed towardthe design of structured programs. These principles also apply tocoding within a module. If the designer realizes that a particulartechnique may be applicable within a module, he should pointthat fact out to the programmer, who may not be aware of it.Many techniques of programming for virtual storage are alsoavailable to the practitioner of structured programming.

    Some general considerations in programming for virtual storagethat also apply to structured programming are the following:

    Exceptional-condition code should be in a separate modulethat is called from the module in which the condition (sayanerror) occurs.

    Data should be referenced in the order in which they arestored, if possible, especially for large data aggregates thatoccupy several pages.

    Store data as closely as possible to other data that are to beused at the same time.

    constraintson writingstructuredprograms

    NO.4, 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 401

  • ANS COBOLcompilerversion 4

    PLIIoptimizing

    compiler

    Avoid the use of elaborate search strategies for large dataareas."

    Many virtual storage programming techniques are valid for par-ticular languages and compiler implementations only. Some vir-tual storage programming techniques that are usable with struc-tured programming are now discussed.

    Data within the working storage section are allocated in the or-der in which they are declared. Therefore, declare data that areto be used together in consecutive statements.v' Declare filesthat are to be used together in consecutive declarations.'! Filesthat are used together should be opened in the same statement.This causes their buffers to be closer together and improves thelocality of reference as buffers are processed." Do not use alter-nate areas unless they are needed for a special reason. The neteffect of alternate buffers is to separate data areas.P

    The AREA variable is of particular interest because based stor-age can be allocated within an AREA variable. In this way, basedvariables that are to be used together can be localized." Thelocalization of based variables is illustrated in Example 6, whichaccomplishes the same function as FORTRAN, as illustrated inExample 5.

    Example 6A: PROCEDURE;

    DECLARE OMEGA AREA(4000);DECLARE ALPHA(l6,16) BASED(APTR);

    ALLOCATE ALPHA IN(OMEGA);

    CALL B(OMEGA,ALPHA);

    END A;

    B: PROCEDURE(OMEGA,ALPHA);DECLARE OMEGA AREA(*);DECLARE ALPHA(l6,16);DECLARE BETA06,16) BASED(BPTR);

    ALLOCATE BETA IN(OMEGA);

    CALL C(OMEGA,ALPHA,BETA);

    END B;

    402 ROGERS IBM SYST J

  • C: PROCEDURE(OMEGA,ALPHA,BETA);DECLARE OMEGA AREA(*);DECLARE (ALPHA,BETA)(l6,16);DECLARE GAMMA(l6,16) BASED(GPTR);

    ALLOCATE GAMMA IN(OMEGA);

    DO 1=1 TO 16;DO J = 1 TO 16;

    GAMMA(I,J) = 0.0;DO K = 1 TO 16;

    GAMMA(I,J) = GAMMA(I,J) + ALPHA(I,K)*BETA(K,J);END;

    END;END;ALPHA = GAMMA;

    ENDC;

    In Example 6, the AREA, OMEGA, has been made large enoughto hold ALPHA, BETA, and GAMMA, so that, when they are allo-cated to the area, ALPHA, BETA, and GAMMA are together instorage and create a better locality of reference.

    Avoid, if possible, putting variables with the initial attribute inautomatic storage.>

    Be especially careful of arrays of structures and structures ofarrays. Be sure to declare them in the order in which they areexpected to be referenced most frequently.

    PL/I passes all arguments in CALL statements by location ratherthan by value. Consider the program fragment in Example 7.

    Example 7x. PROCEDURE OPTIONS(MAIN);

    DECLARE I STATIC INTERNAL;

    CALL Y(I);

    END x.Y: PROCEDURE(I);

    DECLAREJ STATIC INTERNAL;

    CALL Z(I,J);

    END Y;

    NO.4, 1975 STRUCTURED PROGRAMMING FOR VS SYSTEMS 403

  • generalguidelines

    z: PROCEDURE(I,J);

    K = 1+ J;

    END Z;

    Whenever modules Y or Z use the variable I, they reference theSTATIC INTERNAL control section of X. Further, wheneverZ references J, it references the STATIC INTERNAL control sec-tion of Y. Thus the working set, when Z is executing, may belarger than need be. To reduce the working set, assign the para-meters to internal variables and then use those variables.

    The purpose of special programming for virtual storage is toreduce a program's real storage requirement, an objective to-ward which the following guides may prove to be helpful:

    Separate unused code and data space from code that is fre-quently used.

    Seek algorithms or techniques that use small data areas. Remember that the modules created during the design and

    coding phases can be reordered during linkage editing to cre-ate better reference patterns.

    Structured programming constraints on programming forvirtual storage

    Because the available literature on programming for virtual stor-age consists mainly of isolated techniques, this section attemptsto bring together such practices and point out ways in whichthey do not fit within the constraints of structured programming.These practices may be well worth avoiding when they conflictwith structured programming for virtual storage systems.

    Grouping high-use buffers and data areas together in com-mon storage" causes a potential COMMON coupling and fur-ther forces an artificial structure on data.

    Making common data areas more productive by using thesame area for different data in different phases of theprogram" creates even more severe COMMON coupling thangrouping high-use buffers, because program changes in onearea may cause errors in unrelated areas.

    Setting up initial conditions at the beginning of the program'

Recommended

View more >