
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-3, NO. 5, SEPTEMBER 1977

The Influence of Structured Programming on PL/I Program Profiles

JAMES L. ELSHOFF, SENIOR MEMBER, IEEE

Abstract-Two sets of commercial PL/I programs are studied. One set represents programming practice before the introduction of structured programming techniques and the other set after their introduction. The use of these new methods is found to make a measurable difference in the quality of the programs. A few minor changes in the use of PL/I are noted. Substantial modifications to the control structure of the programs are measured. Also, some improvements in the qualitative aspects of the two sets are discussed. Although the programs are much improved, further alterations can make them still better. The time and training required to introduce structured programming techniques will begin paying dividends within six months.

Index Terms-Program analysis, program measurement, programming language usage, structured programming.

Manuscript received January 1976; revised March 24, 1977. The author is with the Computer Science Department, General Motors Research Laboratories, Warren, MI 48090.

INTRODUCTION

STRUCTURED programming [1]-[3] has become a generic term for all the various techniques (i.e., top-down design, structured coding, modularization, etc.) that programmers may use to improve their productivity and their programs. In this paper structured programming is treated with the spirit of a generic term and not as the letter of definitive rules. In this regard the global influence of structured programming is considered as opposed to the results of specific rules that may be embodied within it.

The purpose of this work is to compare two sets of programs: one set produced before programmers had been introduced to structured programming and a second set produced some months after its introduction. The specific differences between the programs are identified and measured.

TWO SETS OF PROGRAMS

The 120 programs which were written without using structured programming techniques were collected in January, 1974. They came from several commercial installations and represent programming practices up until that time. The sample of 120 programs was collected with respect to a set of guidelines including factors such as program size, program function, program use, programmer background, frequency of use and modification, and so forth. These programs have been studied in detail previously [4], [5].

The 34 programs in the sample which use structured programming come from two programming installations, both of which are also represented in the first sample. The programs were collected in 1975 and represent six months of effort with structured programming at installation #1 and twelve months of effort at installation #2. Each installation submitted a set of programs which the installation personnel felt were some of the better programs which had been written with the structured programming techniques.

Throughout the remainder of this paper the two sets of programs just described will be compared. The programs not using structured programming will be denoted as NSP programs and those using the techniques will be denoted as SP programs. Generally, the SP programs will be treated as a single set; however, installation #1 and #2 will be used to identify the two major subsets when it is convenient or necessary to do so. Table I displays some data about the two sets of programs.

TABLE I
A COMPARISON OF THE TWO SAMPLES

Sample Attribute                      NSP Programs   SP Programs
Average Number Of Source Records          1216           952
Average Number Of PL/I Statements          853           593
Maximum Number Of Source Records          4651          2898
Maximum Number Of PL/I Statements         3735          2011

All of the programs are written in PL/I. A program is a single compilable external procedure with zero or more internal procedures. Except for a few utility programs in each set, each program is a complete self-contained data processing application. Since the two samples differ significantly with respect to average program size, most data will be presented in terms of percentages within the sample program sets.

CHANGES IN PL/I USAGE

The distribution of statement types is shown in Table II. A large decrease in GOTO statements is offset by increases in CALL, Null, PUT, and PROCEDURE statements. In the NSP programs about twenty percent of the programs came from an installation which used CALL statements to perform all I/O. Also, the number of unresolved preprocessor statements was much larger in the SP programs.

A large increase in the use of the DO WHILE statement has been found. There is also an increase in the number of complex DO statements, those DO statements with both an iteration clause and a WHILE clause. Probably the most significant DO datum shown in Table III is the percent of programs with at least one DO WHILE statement. This indicates that most of the programmers are now aware that the statement exists and can use it.
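As a hedged illustration of the constructs counted in Table III, the fragment below sums an array three ways: with a label and GOTO statements, with a DO WHILE statement, and with a DO statement that carries both an iteration clause and a WHILE clause. The procedure name, variables, and data are invented for this sketch and are not taken from the sampled programs.

```pli
SUMDEMO: PROCEDURE OPTIONS(MAIN);
   DECLARE V(10)  FIXED BINARY(15) INIT(3, 1, 4, 1, 5, 9, 2, 6, 5, 3);
   DECLARE I      FIXED BINARY(15);
   DECLARE TOTAL  FIXED BINARY(31);

   /* NSP style: the loop is built from labels and GOTO statements. */
   TOTAL = 0;
   I = 1;
LOOP:
   IF I > 10 THEN GO TO DONE;
   TOTAL = TOTAL + V(I);
   I = I + 1;
   GO TO LOOP;
DONE:
   PUT SKIP LIST('GOTO LOOP TOTAL:', TOTAL);

   /* SP style: DO WHILE states the loop condition in one place.    */
   TOTAL = 0;
   I = 1;
   DO WHILE (I <= 10);
      TOTAL = TOTAL + V(I);
      I = I + 1;
   END;
   PUT SKIP LIST('DO WHILE TOTAL:', TOTAL);

   /* Complex DO: an iteration clause plus a WHILE clause. The loop */
   /* ends early once the running total is no longer below 20.      */
   TOTAL = 0;
   DO I = 1 TO 10 WHILE (TOTAL < 20);
      TOTAL = TOTAL + V(I);
   END;
   PUT SKIP LIST('COMPLEX DO TOTAL:', TOTAL);
END SUMDEMO;
```

The first two loops compute the same sum; the only difference is whether the reader must trace two labels to recover the loop boundaries.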


TABLE II
DISTRIBUTION OF STATEMENT TYPES

Statement        NSP Programs   SP Programs
                   (% of all statements)
Assignment           41.2          33.7
IF                   17.8          15.6
GOTO                 11.7           2.8
END                   7.5          11.6
DO                    7.2           9.5
DECLARE               6.3           7.1
WRITE                 2.6           1.1
CALL                  2.0           8.2
READ                  0.5           0.4
Null                  0.5           1.8
OPEN                  0.4           0.2
PUT                   0.4           1.2
CLOSE                 0.4           0.2
ON                    0.3           0.4
PROCEDURE             0.2           1.7
BEGIN                 0.1           0.3
DISPLAY               0.1           0.0+
Preprocessor          0.1           2.9
REWRITE               0.1           0.0
RETURN                0.1           0.1
SIGNAL                0.0+          0.1
GET                   0.0+          0.0
LOCATE                0.0+          0.0
STOP                  0.0+          0.0
ALLOCATE              0.0+          0.0+
DELETE                0.0+          0.0
ENTRY                 0.0+          0.0+
FORMAT                0.0+          0.0
FREE                  0.0+          0.0+
DELAY                 0.0+          0.0
DEFAULT               0.0           0.0+
FETCH                 0.0           0.0+
REVERT                0.0           0.0+
All Others            0.0           0.0

TABLE III
DO STATEMENT DATA

Item                                                     NSP Programs   SP Programs

DO WHILE Statements
  Total number                                                11            109
  Percent of all DO statements                                 0.1            5.6
  Percent of all programs with at least one occurrence         5             79

DO Statements With Multiple Clauses
  Total number                                                60             62
  Percent of all DO statements                                 0.8            3.2
  Percent of all programs with at least one occurrence        18             34

The use of the optional ELSE clause on the IF statement has also increased. The data in Table IV indicate that, as with the DO WHILE, more programmers are using this part of the programming language.

TABLE IV
ELSE CLAUSE USAGE

Item                                                     NSP Programs   SP Programs
Percent of all IF statements with an ELSE clause             17.0           36.4
Percent of all programs with at least one ELSE clause        84             91

The data in Table V show how the distribution of GOTO statements has changed. Two factors influence the difference shown between installation #1 and #2. Installation #1 has only been using structured programming for six months as compared to twelve months for installation #2; furthermore, installation #1 is using an interface to a data base management system that requires the passing of labels for error handling. Each error handling segment is then terminated by a GOTO statement to a restart point after the error has been handled. The data indicate that most programmers did not find getting rid of most of the GOTO statements very difficult.

TABLE V
GO TO STATEMENT USAGE

Programs With X Percent GOTO Statements

                   NSP Programs        SP Programs
                                    #1     #2     All
X = 0%                  0%          11%    56%    32%
2% > X > 0%             0%          33%    37%    35%
5% > X > 2%            10%          22%     6%    14%
10% > X > 5%           36%          27%     0%    14%
X > 10%                53%           5%     0%     3%

The use of expressions was about the same for the two sets of programs. The number of bit string operators increased to account for about 14 percent of the total number of operators in the SP programs, whereas they accounted for only 5.2 percent of the operators in the NSP programs. A slight reduction in the number of arithmetic and character string operators accounts for the difference. The number of expressions with two or more operators increased from 1.8 percent of all expressions in the NSP programs to 2.5 percent of all expressions in the SP programs. Most expressions still have zero or one operators.

The constants and identifiers used in the two sets of programs are similar. Generally, only decimal fixed numeric constants are used. About 94 percent of all string constants are character strings with the remainder being bit strings. A slight reduction in identifiers of type label constant is noted going from the NSP programs to the SP programs. Type entry constant increases correspondingly.

In addition to having fewer labels, the labels that do appear are referenced less frequently. Table VI indicates how the number of multiple entry points is being reduced. The average number of labels in an NSP program is 50 while in an SP program it is 14.

TABLE VI
LABEL USAGE IN CONTROL OF FLOW

Type Of Branching To Label            NSP Programs   SP Programs
                                        (Percent of all labels)
No Branching                               6.8           20.4
Single Forward Branch                     44.6           44.4
Single Backward Branch                     9.8            6.4
Multiple Branches, All Forward            23.7           23.6
Multiple Branches, All Backward            6.2            2.8
Multiple Branches, Bi-directional          8.7            2.4

[Fig. 1. Abstract view of program structures. Two panels, NSP Program and SP Program; the legend distinguishes blocks (BEGIN, DO, IF, ON, PROCEDURE) from GO TO statements.]

TABLE VII
PROCEDURE STATEMENTS AND PROCEDURES

Item                                               NSP Programs   SP Programs
External Procedures                                    120             34
Internal Procedures                                     83            318
Total Procedures                                       203            352
Total Statements / Total Procedures                    504             57
Average Number Of Statements Per Procedure
  Statement In An External Procedure                   679             84
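As a rough consistency check, the "Total Statements / Total Procedures" row of Table VII can be approximated from the rounded averages in Table I. This is a reader's reconstruction, not a calculation given in the paper, and it assumes that the total statement count of each sample equals the number of programs times the average number of PL/I statements per program:

```latex
\[
\text{NSP: } \frac{120 \times 853}{203} = \frac{102\,360}{203} \approx 504,
\qquad
\text{SP: } \frac{34 \times 593}{352} = \frac{20\,162}{352} \approx 57 .
\]
```

Both results agree with the tabulated values of 504 and 57 to the nearest statement, so the two tables are mutually consistent under that assumption.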

DIFFERENCES IN PROGRAM STRUCTURE

The structural profile of the NSP programs has been drastically changed by the introduction of the new techniques used to develop the SP programs. On an abstract level, as shown in the hypothetical examples in Fig. 1, the SP programs are much cleaner. The programs read much more in a top to bottom, left to right fashion. There is a series of structural measures which verify the differences between NSP programs and SP programs. This section of the paper will review the various measures.

The language is used to package source code sequences into more and smaller modules. The use of PROCEDURE statements and the effect on the average size of a procedure are displayed in Table VII. The smaller modules enhance the readability of the programs and make them easier to understand, one of the goals of structured programming.

[Fig. 2. Distribution of blocks by level. Vertical axis: percent of all blocks; horizontal axis: level (1 through 9 and beyond); plotted series: NSP programs (o) and SP programs (+).]

A PL/I program consists of nested blocks which are defined to be BEGIN, DO, IF, ON, and PROCEDURE statements for our purposes. The nesting levels of the blocks and the number of statements at each level reveal that more work is being done at lower levels in the SP programs. The following code sequence shows how the counting is performed.

A: PROCEDURE;                  Level 1 block    Level 0 statement
   DECLARE B FIXED;                             Level 1 statement
   B=1;                                         Level 1 statement
   IF B=1                      Level 2 block    Level 1 statement
   THEN DO;                    Level 3 block    Level 2 statement
      B=2;                                      Level 3 statement
   END;                        End level 3      Level 3 statement
   ELSE PUT DATA ('ERR');      End level 2      Level 3 statement
END A;                         End level 1      Level 1 statement

The movement of work to lower levels is clearly demonstrated in Figs. 2 and 3. Both blocks and statements are distributed through lower levels of the SP programs. Instead of stacking a sequence of tests and then branching to level 1 to perform the work when a decision is made, the SP programs perform the work at the level where the decision is made. This change clarifies the program structure and flow of control, thus making the program less complicated to understand.
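The contrast described above can be sketched with a small hypothetical example. The first half stacks tests of the form "IF expression THEN GO TO label" and does the work at labels on level 1; the second half does the same work inside the blocks where the decisions are made. All names and values are invented for illustration and are not drawn from the sampled programs.

```pli
CLASSIFY: PROCEDURE OPTIONS(MAIN);
   DECLARE CODE   FIXED BINARY(15) INIT(2);
   DECLARE RATE   FIXED DECIMAL(5,2);

   /* NSP style: a stack of tests, each branching to a label on     */
   /* level 1 where the work is done, then on to a common join.     */
   IF CODE = 1 THEN GO TO RATE_A;
   IF CODE = 2 THEN GO TO RATE_B;
   GO TO RATE_ERR;
RATE_A:
   RATE = 1.50;
   GO TO RATE_SET;
RATE_B:
   RATE = 2.25;
   GO TO RATE_SET;
RATE_ERR:
   RATE = 0;
RATE_SET:
   PUT SKIP LIST('NSP RATE:', RATE);

   /* SP style: the work is performed in the block where the        */
   /* decision is made, so no labels or GO TO statements remain.    */
   IF CODE = 1
      THEN RATE = 1.50;
      ELSE IF CODE = 2
         THEN RATE = 2.25;
         ELSE RATE = 0;
   PUT SKIP LIST('SP RATE:', RATE);
END CLASSIFY;
```

In the second form the assignments sit one and two levels deeper than in the first, which is the shift toward deeper nesting that the block and statement distributions measure.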

[Fig. 3. Distribution of statements by level. Vertical axis: percent of all statements; horizontal axis: level (1 through 9 and beyond); plotted series: NSP programs (o) and SP programs (+).]

TABLE VIII
AVERAGE BLOCK SIZE AT EACH LEVEL

Level         Statements Per Block
           NSP Programs   SP Programs
1             350.8          109.4
2               1.5            6.3
3               3.4            2.3
4               2.2            2.7
5               2.8            2.5
6               2.3            2.7
7               2.4            2.6
8               2.5            2.8
9               2.4            2.6
More            2.4            2.4

Overall         3.9            3.6

Another view of how the work is being moved from level 1 blocks to lower levels is shown in Table VIII. Two important items of information appear in these data. First, there is a great decrease in the size of level 1 blocks. Perhaps some of the decrease is due to slightly smaller programs, but most of it is due to the movement of work to lower levels. Furthermore, when the declarations are removed from the level 1 blocks, an 80 percent decrease in executable statements is measured for level 1 going from the NSP programs to the SP programs. Second, the block sizes at the lower levels are not being increased significantly. The work is distributed through the lower levels very evenly. The number of blocks increases with respect to the total number of statements and the overall average block size decreases.

The use of deeper nesting levels is common across the SP programs as evidenced in Table IX. Note that every SP program is at least four levels deep and six levels are common. The combination of smaller blocks and deeper nesting levels is a direct result of structured programming. The problems being solved are being broken into small functional units embedded in a clear, concise framework.

By eliminating many statements of the form "IF expression THEN GO TO label" many of the one statement blocks (i.e., GO TO label) have been removed. This change can be seen in the data in Table X where a bottom level block is a block with no other block embedded within it. The decrease in overall average block size is even more significant with this reduction in blocks of just one statement.

TABLE IX
PROGRAM USAGE OF BLOCK LEVELS

Level     Percent Of All Programs With At Least
          One Block At The Specified Level
           NSP Programs   SP Programs
1              100            100
2              100            100
3               96            100
4               89            100
5               81             94
6               70             94
7               56             76
8               40             67
9               30             55
More            23             41

TABLE X
DISTRIBUTION OF BLOCKS BY SIZE

Number Of       Percent Of All Blocks         Percent Of All Bottom Level Blocks
Statements    NSP Programs   SP Programs      NSP Programs   SP Programs
1                31.6           13.9             59.5           34.0
2                 9.7           10.1             11.4           21.9
3                11.1           12.4             15.0           21.4
4                11.2           11.8              6.0            9.1
5                 7.4            7.6              3.2            4.2
6                 5.0            6.3              1.7            3.9
7                 3.6            4.6              1.0            1.6
8                 3.1            4.5              0.6            1.4
9-18             10.6           15.6              1.1            1.8
19-28             2.6            4.9              0.1            0.2
29-38             1.2            2.3              0.0+
39-48             0.6            1.6
49-58             0.3            0.7
59-68             0.3            0.6
69-78             0.2            0.3
79-88             0.1            0.2
89-98             0.0+           0.1
More              0.8            1.8

PROGRAM QUALITY IMPROVEMENTS

The SP programs are much more readable and understandable. The modifications to program control structure make the programs read from top to bottom and from left to right, more like normal English reads. The selection of better identifier names to represent data values also greatly helps the readability.

The SP programmers have also made good use of enhancements to improve the readability of the programs. Indentation is used to help show the control structure of a program through its physical layout on the page. Comments are used differently and in key situations. Very few comments appear in the declaration section of the SP programs, whereas in the NSP programs nearly all of the comments appeared there. Many programs now have a block comment at the start of the program which describes the function and basic approach of the algorithm of the program. Block comments appear throughout a program to identify major sections of the program. Other comments are used to present details about the program. Finally, pagination controls are used to make the SP program listings easier to read. Basic subfunctions of an algorithm are packaged on a single page. Blank lines are inserted to help set off important comments, to identify key decision points, and to spread out complex code sequences.


MORE IMPROVEMENT STILL POSSIBLE

Despite the improvements that have been made in the first year of using structured programming, further refinements could be made. The basic goal to be achieved is more consistency through all the programs. Each programmer has an individual method of indenting and commenting a program. Even though each program is more readable, it is not clear whether the programmers can read each others' programs as easily as they might if they were indented more nearly the same. Some methods of indenting are clearly better than others. Those methods should be used. Also, having a standard set of information at the beginning of every program in a block comment can help the programmer more quickly find what must be known when a program is to be modified.

Most good programming techniques can be found in a programming installation. The problem is 1) to make all of the programmers aware of the techniques and 2) to get the programmers to use them. The programmers must be encouraged to read each others' programs in order that the good techniques can be recognized in the context which makes them good. Then the better techniques will begin to be used by more and more programmers in the installation. The good programming practices will float to the top and more consistent programs will result.

Another area in which programs should change is in the way they are packaged. Internal procedures help break down the complexity of the flow of control of a program; however, the data reference pattern is still nearly as error prone as before. The data should not all be declared in one global heap which can be referenced from anywhere. Internal procedures must only have access to that data which they really need. Parameters should be used to pass information from one procedure to another instead of the global heap. As much concern must now be shown for data flow in the program as has been shown for control flow. Furthermore, program design is as much of a culprit in this area as the writing of the source code.
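A minimal sketch of the packaging recommended above, assuming a hypothetical payroll calculation: the internal procedure receives its inputs and returns its result through parameters, and keeps its working storage local. The names are invented for illustration. Note that PL/I scope rules would still let the internal procedure reference the outer declarations, so the discipline shown here rests on convention rather than on the compiler.

```pli
PAYDEMO: PROCEDURE OPTIONS(MAIN);
   DECLARE HOURS  FIXED DECIMAL(5,2) INIT(40.00);
   DECLARE RATE   FIXED DECIMAL(5,2) INIT(12.50);
   DECLARE GROSS  FIXED DECIMAL(9,2);

   /* The CALL shows exactly which data items take part in the      */
   /* computation; nothing is passed through shared global names.   */
   CALL COMPUTE_GROSS(HOURS, RATE, GROSS);
   PUT SKIP LIST('GROSS PAY:', GROSS);

COMPUTE_GROSS: PROCEDURE(P_HOURS, P_RATE, P_GROSS);
   DECLARE P_HOURS FIXED DECIMAL(5,2);
   DECLARE P_RATE  FIXED DECIMAL(5,2);
   DECLARE P_GROSS FIXED DECIMAL(9,2);
   DECLARE WORK    FIXED DECIMAL(9,2);   /* local working storage   */

   WORK = P_HOURS * P_RATE;
   P_GROSS = WORK;
END COMPUTE_GROSS;

END PAYDEMO;
```

Because the parameter attributes match the argument attributes, the arguments are passed by reference and the assignment to P_GROSS sets GROSS in the caller.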

SUMMARY

Two sets of programs, one set written before the introduction of structured programming techniques, and the other set written after their introduction, have been compared in this paper. A special program was written to measure all of the programs in the two sets. There was no direct measurement of the program development process; only programs produced as an end product were studied. The programs have been examined from both a quantitative and a qualitative point of view.

Only a few changes in the way the programming language PL/I was used were noted. More PROCEDURE statements, more CALL statements, more ELSE clauses, and more DO WHILE statements were used; on the other hand, fewer GO TO statements and labels were used in the structured programs.

The basic control structure of the programs changed significantly. The structured programs basically read from top to bottom and from left to right, which is not true of the unstructured programs. The structured programs perform their data manipulation at the point where a decision is made instead of branching to another point to do it. As a result the structured programs have deeper nesting levels but a more restrained flow of control.

Other factors have helped make the structured programs more readable and understandable. The programmers have made better use of comments to help explain the algorithm being used in the program. Furthermore, indentation and pagination are used so that the physical appearance of the written program enhances readability. Also, programmers are grouping declarations by type, using various alignment schemes, and in general making programs look like objects to be read by human beings as well as by computers.

Structured programming has certainly improved the end product of the programmers. The two installations from which the structured programs were gathered have been using structured programming for six months and twelve months, respectively. The time and training necessary to introduce the structured programming methods are nearly recovered after just six months. The new programs being produced are expected to have a much longer lifetime with a much reduced maintenance bill because of the ease with which they can be read and understood.

REFERENCES

[1] O. J. Dahl, E. W. Dijkstra, and C. A. R. Hoare, Structured Programming. New York: Academic, 1972.
[2] Datamation (Special Issue on Structured Programming), vol. 19, no. 6, Dec. 1973.
[3] Comput. Surveys (Special Issue on Programming), vol. 6, no. 4, Dec. 1974.
[4] J. L. Elshoff, "A numerical profile of commercial PL/I programs," Software-Practice and Experience, vol. 6, no. 4, pp. 505-526, Oct.-Nov. 1976.
[5] J. L. Elshoff, "An analysis of some commercial PL/I programs," IEEE Trans. Software Eng., vol. SE-2, pp. 113-120, June 1976.

James L. Elshoff (S'68-M'71-SM'77) received the B.A. degree in mathematics from Miami University, Oxford, OH, in 1966, and the M.S. and Ph.D. degrees in computer science from The Pennsylvania State University, University Park, in 1969 and 1970, respectively.

He has taught at the Pennsylvania State University, Oakland University, Rochester, MI, and the University of Detroit, Detroit, MI. He is currently a Senior Research Computer Scientist at the General Motors Research Laboratories, Warren, MI. His work encompasses the study of software, its development and use. He has published papers on topics including digital differential analyzers, asynchronous interrupt handling, and programming. His paper on processing matrices in a paging environment was judged Best Technical Paper at the 1974 National Computer Conference.

Dr. Elshoff is a member of the Association for Computing Machinery.
