
Automatic generation of software systems


By Noah S. Prywes

Despite the continuing improvements in hardware, system software and programming languages, they are not considered sufficient to compensate for the dramatic projected increases in demand for software. One approach in forecasting the needs of businessmen, industrialists and scientists is as follows:

The ratio of cost of software to hardware, which in 1970 was 2:1, is projected to increase by 1985 to 10:1. The 1970 U.S. Bureau of Labor Statistics estimates show a work force of 360,000 engaged in software development in the United States. Of these, 137,000 were classified as business programmers, 97,000 as business systems analysts and 126,000 were employed in systems and scientific software. Approximately, in 1970, the cost of software was $10 billion, while the cost of hardware was $5 billion. These estimates are on the low side as they exclude some segments of the U.S. economy and were registered at a low economic point.

In spite of the decreasing costs of computer equipment, total U.S. revenues from computing hardware have been increasing since 1970 at a real rate that may be averaged at 10 percent annually. Considering the large numbers of potential new users and new areas of industrial applications, this rate is projected to continue to 1985. It will amount to quadrupling the real annual cost of new computer equipment by 1985.

When this is coupled with the 1985 projection of a 10:1 ratio of programming cost, there could be a 20-fold increase in programming manpower by 1985 (a 22 percent annual growth rate). These rates are illustrated in Figure 1. Obviously such a growth requirement would constitute a financial obstacle to advances in the use of computers. Nor could the labor force be recruited, trained or controlled effectively. To keep software manpower increases at a pace with the 10 percent for hardware costs, the manpower requirements will have to be reduced 80 percent by 1985, as compared with present (mostly manual) methods.

The author is a professor in the Department of Computer and Information Sciences, Moore School of Electrical Engineering, University of Pennsylvania. The survey from which this article was extracted was supported by the ONR Information Systems Program.

FIG. 1. SUMMARY OF SOFTWARE/HARDWARE COSTS AND POTENTIAL CONTRIBUTION TO COST REDUCTIONS
[Chart, 1970 to 1985. Upper curve: current state of the art software manpower needs, 22% annual increase. Estimated potential contributions to cost reduction: automatic program generation; automatic file and module design; automatic function specification; automatic problem requirements. 1970 hardware (U.S. domestic revenues): $5 billion. 1970 software work force: business programmers 137,000; business analysts 97,000; system and scientific 126,000; total 360,000.]

FIG. 2. TRANSLATION/DEVELOPMENT LAYERS IN DATA PROCESSING SYSTEMS AND IN THEIR APPLICATIONS
Overall Problem Requirements: Information Demand; Evaluation of Alternative Automation Plans; Functional Specifications of Computer System.
Program Design and Implementation: Evaluation of File Structure Alternatives; Program Module Identification; Program Module Input/Output Logical Specifications; Program Design; High Level (Compiler, JCL and DM) Language Programming.
Program Translation and Optimization: High Level to Assembly Language; Assembly Language to Machine Code.
Execution: Problem and Supervisor Macros; Problem Instruction Sequences; Micro Instruction Processing of Data.
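The growth projections in the opening paragraphs can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the article (10 percent annual hardware growth, a cost ratio moving from 2:1 to 10:1, a 1970 to 1985 horizon); the variable names are illustrative and the compounding model is an assumption about how the article's rounded figures were derived.

```python
# Figures quoted in the article; the variable names are illustrative.
YEARS = 1985 - 1970          # projection horizon in years
HARDWARE_GROWTH = 0.10       # real annual growth of hardware revenues
RATIO_1970 = 2.0             # software-to-hardware cost ratio in 1970
RATIO_1985 = 10.0            # projected ratio for 1985

# Hardware cost multiple over the period (the article rounds this to 4).
hardware_multiple = (1 + HARDWARE_GROWTH) ** YEARS
# Software cost grows with hardware and with the widening cost ratio.
software_multiple = hardware_multiple * (RATIO_1985 / RATIO_1970)
# Equivalent compound annual growth rate for software manpower.
software_annual_rate = software_multiple ** (1 / YEARS) - 1
# Cut needed for software to grow no faster than hardware.
required_reduction = 1 - hardware_multiple / software_multiple

print(f"hardware multiple:    {hardware_multiple:.1f}")
print(f"software multiple:    {software_multiple:.1f}")
print(f"software annual rate: {software_annual_rate:.1%}")
print(f"required reduction:   {required_reduction:.0%}")
```

Under these assumptions the results land close to the article's rounded values: roughly a quadrupling of hardware cost, roughly a 20-fold increase in software effort at about 22 percent a year, and an 80 percent reduction needed to hold software growth to the hardware rate.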

The total process of development and use of data processing systems has been pictured as consisting of many layers. Figure 2 shows thirteen. The top three are associated with determining automation requirements, which sometimes requires definition of the problem. The next five layers concern the design and implementation of software, in a language convenient for the problem. The two layers below are concerned with the translation and optimization of the programs. The three lowest levels are concerned with the execution of the programs to perform the desired automation functions.

One way of regarding these layers is as a bottom-up sequence, in the sense that the automation of a bottom layer is utilized in building the layer above it.

This has been the historic approach. The first layers to be automated were those associated with program execution. This was followed by the layers associated with language translation and program optimization. To date, a great emphasis has also been placed on the efficiency and utility of programming languages. Improved facilities were quickly put to a test in a variety of applications. Relatively little automation has been applied to the top seven layers in Figure 2.

A top-down view of Figure 2 pictures the layers as a series of transducers-translators, where the flow is downward so that the output of a top layer is used as input to the layer below it. A feedback flow is used for corrections and modifications.

SOFTWARE DEVELOPMENT: ANALYSIS GOALS

Advances in automation of programming are needed in large-scale software development projects, particularly in the business data processing area, which employs the majority of programmers and where there is likely to be the most growth in demand for such services. Large-scale projects are not the creation of a single manager or a small group of managers. Such systems have an impact on top management, business specialists in areas such as accounting or finance, operational staffs of the organizations and, finally, the public that interacts with the organization, such as customers and vendors. Figure 3 gives an overview of the software development process, indicating the respective phases, by whom they are performed, the end products and the languages used in the documentation.

FIG. 3. OVERVIEW OF SOFTWARE SYSTEMS DEVELOPMENT PROJECTS
[Chart with four columns; entries are listed in the order in which they appear.
Performed by: Top Management and Analysis Staff; Business (Accounting, Finance, etc.) Specialists; Computer Specialists: Chief Programmer; Programmers/Analysts; Computer/Business Team; Programmers/Business Specialists.
Software system development phases: Overall Problem Requirements; System Functional Specifications; System Design and Implementation (Identification of Required Program Modules and File Structures; Design, Code, Debug and Document Program Module; System Integration and Installation); Maintenance Modifications and Additions.
End products: 1. Overall Description of System, 2. Cost/Benefit Analysis; 1. Information Flow, Routing and Sequencing, 2. Description of Computer Inputs and Outputs, 3. Description of Data Manipulation; Description of Input-Output and Data Manipulation of Each Module; Completed Programs and Documentation; 1. Operational Software, 2. User Documentation.
Languages: Ad Hoc; Formal System Functional Specification; Formal Program Functional Specification; Programming; Ad Hoc.]

Data on distribution of labor and costs in software development phases is difficult to obtain. Phase classification in accounting of development costs is rarely adhered to systematically, and project costs vary greatly, depending on the quality of planning and execution. Costs frequently vary in the ratio of 2:1 for similar projects; in some instances, the ratio is as high as 20:1. Therefore, the estimates that are made below indicate widely accepted typical values, based on published reports and discussions with colleagues. They are considered as averages for all business data processing projects.

Overall Problem Requirements: This is a project kick-off concerned primarily with determining what information management needs and the resources needed for the development. The phase includes preparing cost/benefit analyses, which requires expertise in the application area, as well as in sizing computer requirements. To give a relative weight, this phase is estimated at 8 percent of total project cost.

System Functional Specifications: These describe the information flow involving humans and communications, but computers are addressed only implicitly. The specifications describe all the relevant business transactions and management reports used in an organization. The group that prepares the specifications need not include computer specialists. It has been found useful to employ a formal functional specification language in documenting the system, which can be processed automatically to indicate incompleteness or inconsistency. Ineffectiveness and waste result where the functional specifications are incomplete or ambiguous, thus necessitating redesign and reprogramming. Twenty-two percent of the total costs of software development projects is estimated for this phase.

The remaining 70 percent of the cost of a software development project is estimated to be for program modules that perform in accordance with the functional specification. If the functional specifications are complete, the work in this phase requires only computer-oriented skills. The software development in this phase may be subdivided into three steps.

File Structure and Module Design: The functional specifications written by the business specialists are broken down into distinct logical specifications for each program module and for the data structures on which the modules act. A program module is typically associated with a transaction, an updating or a reporting function. In a typical $1 million project, there would be 50 to 100 program modules, each consisting of 1,000 to 2,000 lines of high-level language code. The selection of modules is based on considerations of minimizing the cost of data processing. Fifteen percent of project cost is estimated to go into creating this specification. (In many projects this step is integrated with program design, described below, and it is not then possible to estimate costs separately.)

Program Design, Code and Debug: The difficulties encountered in this step result from having to supervise large numbers of programmers and the low management visibility of progress or lack thereof. Therefore, problems are discovered late during the debugging of individual program modules, or even later during installation, resulting in delays and costs beyond original estimates. This step is estimated to require currently 25 percent of the total project cost.

System Integration and Installation: This consists of integration of individual program modules into the total system, testing and sometimes parallel operation of the system. It is during this phase that problems that have not been previously visible to management come to light. They may appear as malfunctions that must be corrected. If difficulties are discovered in user operation and control of the system, the indicated changes must be communicated to modify the functional specification or either the entire system or specific program modules. Only then can modifications be made. Depending on the quality of testing prior to installation, this last subphase may account for 30 percent of the total costs of the project.
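The phase estimates above can be tabulated to confirm that they account for the whole project and to see what they imply for the typical $1 million project mentioned in the text. The percentages and module sizes are the article's; the dictionary layout and the function name `phase_costs` are illustrative only.

```python
# Share of total project cost per phase, as estimated in the article.
PHASE_SHARE = {
    "overall problem requirements": 0.08,
    "system functional specifications": 0.22,
    "file structure and module design": 0.15,
    "program design, code and debug": 0.25,
    "system integration and installation": 0.30,
}

# The three steps the article groups as the remaining 70 percent.
IMPLEMENTATION_STEPS = (
    "file structure and module design",
    "program design, code and debug",
    "system integration and installation",
)


def phase_costs(total_cost):
    """Split a total project cost across phases using the shares above."""
    return {phase: total_cost * share for phase, share in PHASE_SHARE.items()}


costs = phase_costs(1_000_000)
implementation_share = sum(PHASE_SHARE[step] for step in IMPLEMENTATION_STEPS)

# Range of total code size implied by 50 to 100 modules of 1,000 to 2,000 lines.
min_lines, max_lines = 50 * 1_000, 100 * 2_000

for phase, cost in costs.items():
    print(f"{phase:38s} ${cost:>10,.0f}")
print(f"implementation share: {implementation_share:.0%}")
print(f"code size: {min_lines:,} to {max_lines:,} lines")
```

The five shares sum to 100 percent, and the last three add up to the 70 percent the article assigns to producing the program modules.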

Maintenance: After the system has been in operation, it will require maintenance to correct errors not determined during the installation process. But even more so, it will require maintenance to modify and add facilities. This is a post-software-development phase which must be facilitated during the development itself. The software development should provide a method by which modifications can be entered in an orderly way in the specifications, the program modules, the documentation and the operational system itself.

Considerable efforts are under way on the complete automation of the five layers in Figure 2 that constitute program design and implementation. The outlook is for automatic generation of programs, based on program module logical specifications; namely, the automatic production of ad hoc programs in compiler languages such as COBOL or PL/1. This will reduce, particularly, the requirements for business programmers, 60 percent of the total employed. The estimated savings in this area are the largest: 32 percent, plus or minus 15 percent.

FIG. 4. INTERACTIONS IN AUTOMATIC GENERATION OF A PROGRAM MODULE
[Block diagram. Box (3), program generators: a Language Analysis Program Generator, driven by language syntax and semantics, and a Design and Code Generation Program Generator, driven by a design method and a code generation method. Box (2), Business Program Module Generator: a Language Analysis Program and a Design and Code Generation Program, which take the module nonprocedural functional specification (data description and assertions) and produce the program listing and documentation and the business program module. Box (1), Business Program Module Execution: n source files in, m target files out.]

The top three layers in Figure 2, the determination of overall problem requirements, currently account for a relatively small amount of manpower, which is primarily trained in the respective applications rather than in computer systems. Most probably the bottom-up sequence of developments will continue and will gradually affect the top layers in Figure 2. However, the automation of determining overall problem requirements may be delayed until automatic program design and generation methodology (in lower layers) has been well established.

AUTOMATIC DESIGN AND IMPLEMENTATION

Automation of the five layers in Figure 2 of program design and implementation requires only computer-related knowledge. The generation of high-level language programs constitutes a logical-abstract model of the process. The translations to physical devices are performed in the control programs in the lower five layers of Figure 2 (e.g., data and teleprocessing access methods, and so on).

An alternative, more application-dependent approach is to automate software development by use of prefabricated program components oriented to a specific function, business or industry. The components are appropriately parametrized so they can be adapted for use by many organizations. Typically a potential user selects functional components that are required for his operation.

Also an editor and a report generator are provided to enter user parameters and report formats. Such systems are used, for instance, by service organizations and computer manufacturers to install small business systems for wholesale and distribution industries. There are two shortcomings with this approach. First, there is a continuing requirement for programmer skills for selecting components and determining the values of the parameters. Also, necessary functions frequently cannot be performed by previously prefabricated components and additional special purpose programs are required. Second, use of such packages requires molding the needs for automation of a business into the structure of the prefabricated software packages. Inadequate provisions are made to determine management's critical decisions regarding efficient operation and whether data for decision making is indeed made available to management.

Following the bottom-up sequence (Figure 2), automatic program module generation is based on the use of processors capable of generating broad classes of ad hoc programs. Figure 4 illustrates this methodology. Box (1) at the bottom of Figure 4 illustrates a broad class of programs, characterized as intended for business applications where they have to process a number of input data (or message) files and to produce a number of output files (or reports). Box (2) illustrates a processor which can generate the program module for (1). The Program Module Generator (2) consists of two parts, for processing the input formal, nonprocedural functional specification of the desired program module and for performing design and code generation.

The design and code generation programs in box (2) embody a mathematical model of a program design process. They check the consistency and the completeness of specifications by tracing each value in the target data to its sources. From these traces, as well as from requirements imposed by file structures of the respective files (for example, sequential or indexed), they determine the sequencing of the input commands, computation and output commands to attain program module efficiency.
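The tracing and sequencing just described can be sketched as a dependency analysis. The example below is not the method of any system cited in the article; it assumes, for illustration, that a specification is a set of source field names plus a list of assertions, each naming one target value and the values it depends on. The function name `check_and_sequence` and the payroll fields are hypothetical, and file-structure constraints are left out.

```python
from graphlib import CycleError, TopologicalSorter


def check_and_sequence(sources, assertions):
    """Trace each target to its sources; return (problems, computation order).

    sources    -- set of field names available in the source data
    assertions -- list of (target, [operand names]) pairs, in any order
    """
    defined = {}
    problems = []
    # Consistency: each value may be defined only once.
    for target, operands in assertions:
        if target in defined or target in sources:
            problems.append(f"inconsistent: {target} is defined more than once")
        defined[target] = list(operands)
    # Completeness: every operand must be a source field or a defined target.
    for target, operands in defined.items():
        for name in operands:
            if name not in sources and name not in defined:
                problems.append(f"incomplete: {target} depends on undefined {name}")
    if problems:
        return problems, []
    # Sequencing: a target can be computed only after the targets it uses.
    graph = {t: [n for n in ops if n in defined] for t, ops in defined.items()}
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError:
        return ["inconsistent: circular definition among targets"], []
    return [], order


sources = {"hours", "rate", "tax_rate"}
assertions = [
    ("net_pay", ["gross_pay", "tax"]),
    ("tax", ["gross_pay", "tax_rate"]),
    ("gross_pay", ["hours", "rate"]),
]
problems, order = check_and_sequence(sources, assertions)
print(problems, order)
```

Because the assertions carry no sequencing of their own, they can be written in any order; the computation order is derived from the traces, which is the property the nonprocedural approach relies on.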

What is a nonprocedural specification? Ideally it should be entirely a descriptive-declarative language with no imperatives. It is to be used to describe source and target data and the logical or arithmetical interrelating dependencies, without references to any processes, steps, sequences, registers or computers. Because state or sequencing information is not explicitly expressed, the language cannot be considered complete. Namely, some data interrelationships cannot be stated with only primitive logical and arithmetic operators without resorting to specifying a sequence. Therefore nonprocedural languages include problem-oriented facilities to describe relationships unique to classes of applications. Because of this aspect, nonprocedural languages have great affinity to problem-oriented or very high-level programming languages. In practice, some of the above enumerated characteristics of an ideal nonprocedural language have been compromised.

Several formal languages for stating nonprocedural functional specifications have been developed and some have been in use. Because of the emphasis in these languages on facilities to describe data, they are similar to data description languages. They can have a combination of several formats, such as a formal language, a table format or a question-answer format. Still, only limited usage experience is available for their evaluation.

To have a capability to accept several selected languages or formats and to modify them, it is desirable to generate automatically the language analysis program. This capability is indicated in Figure 4 by a higher level language analysis program generator processor, box (3), which automatically generates analysis programs based on specifications of language syntax and semantics.

Another desirable aspect of a nonprocedural language is that individual statements stand alone, which lends itself to structured design and to top-down structured programming performed by the module program generator. To illustrate this, two program levels are related below to the two parts of the module specification shown in Figure 4. The top program level could be based largely on the module source and target data descriptions. It would consist of all the data definitions, input/output commands and control logic statements. The bottom level of the program would be based on the assertions that specify interrelationships between source and target data elements. There could be a submodule for each assertion.
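The two-level structure described above can be illustrated with a toy generator. This is a sketch under stated assumptions, not the generator of any cited system: assertions are taken to be (target, expression) pairs already placed in a valid computation order, Python stands in for the PL/1 or COBOL that such a generator would emit, and the names `generate_module`, `compute_...` and `process` are invented for the example.

```python
import re


def generate_module(source_fields, assertions):
    """Return program text with one submodule per assertion and a top level.

    source_fields -- names of the fields in each source record
    assertions    -- ordered list of (target, expression text) pairs
    """
    known = set(source_fields) | {target for target, _ in assertions}

    def qualify(expression):
        # Rewrite each known field name as a lookup in the current record.
        return re.sub(
            r"[A-Za-z_]\w*",
            lambda m: f'rec["{m.group(0)}"]' if m.group(0) in known else m.group(0),
            expression,
        )

    lines = []
    # Bottom level: one submodule for each assertion.
    for target, expression in assertions:
        lines += [f"def compute_{target}(rec):",
                  f"    return {qualify(expression)}",
                  ""]
    # Top level: input, control logic and output, driven by the data description.
    lines += ["def process(source_records):",
              "    target_records = []",
              "    for source in source_records:",
              "        rec = dict(source)"]
    for target, _ in assertions:
        lines.append(f'        rec["{target}"] = compute_{target}(rec)')
    lines += ["        target_records.append(rec)",
              "    return target_records"]
    return "\n".join(lines) + "\n"


program_text = generate_module(
    ["hours", "rate"],
    [("gross_pay", "hours * rate"),
     ("tax", "gross_pay * 0.25"),
     ("net_pay", "gross_pay - tax")],
)
print(program_text)

# Load the generated text and run it on one source record.
namespace = {}
exec(program_text, namespace)
result = namespace["process"]([{"hours": 40, "rate": 10}])
print(result)
```

Since each assertion stands alone, adding or changing one affects only its own submodule and a single line of the top level.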

A system of the type described in Figure 4, with a restriction of only one input and one output file for a program module (n = m = 1), has been developed at the University of Wisconsin and at the University of Pennsylvania. The Ramirez system includes automatic capability for generating language analysis programs (box 3) based on an extended BNF syntax specification and subroutine calls that express some of the semantics (other semantics are in hand-coded code generation programs). The generated program module code is in PL/1, requiring a subsequent compilation to produce a load module. The Ramirez system also produces the necessary JCL statements and a variety of documentation, in addition to the PL/1 program module listing.

The design process in the Program Module Generator (box 2 in Figure 4) requires a painstaking analysis to specify an acceptable program design process and to state it in terms of a mathematical model. While there exists extensive work on automatically generating language analysis programs, there has been very limited research on automatic generation of program design and code generation programs. Until automatic methodology is developed and applied, it is possible only to proceed slowly and laboriously by manually producing design and code generation programs. This inability to easily "teach" a computer how to employ a method that a human can easily learn is an extremely difficult obstacle to surmount on the way toward all-automatic programming.

File structure and program module definitions are based on functional specification of the total computer system, consisting of a nonprocedural functional specification of all the source and target data and a preliminary determination of the system hardware and software that is required. The outcomes of this process are the nonprocedural functional specifications of the respective modules. This is a more global software design activity than the module design. Presently a human designer relies on automatic simulation for evaluation of variously defined designs. Developments of manual design methods were reviewed recently.

The first step is to analyze the overall system nonprocedural functional specifications to determine completeness and consistency. An example of this capability is an analyzer developed for functional specifications expressed in the ADS (Accurately Defined Systems) language. Analysis reports include indices of data elements and cross referencing of data manipulation routines and the respective source (input) and target (output) data elements. A network can be generated where hierarchically related groups of data elements that interact in computations would be connected.

Closely connected groups of data elements constitute candidate files. The related processing functions represent candidate program modules. Attempts are then made to consolidate or partition modules and files to increase efficiency. For instance, if two processing functions occur in the same processing cycle or if the preliminarily selected data files overlap greatly in having common data elements, the respective processing functions and files may be consolidated. To make such decisions it is necessary to refer not only to the logical structure of the data and the data manipulation rules but also to the frequencies and cycles of the processing functions.
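The consolidation rule in the paragraph above can be sketched as a pairwise comparison of candidate files. The article gives no numeric criterion for "overlap greatly", so the overlap measure (shared elements divided by all elements) and the 0.5 threshold below are assumptions, as are the function names and the sample files.

```python
def overlap(first, second):
    """Fraction of data elements that two candidate files have in common."""
    union = first | second
    if not union:
        return 0.0
    return len(first & second) / len(union)


def consolidation_candidates(files, cycles, threshold=0.5):
    """List pairs of candidate files that may be worth consolidating.

    files     -- mapping of file name to its set of data elements
    cycles    -- mapping of file name to the cycle of its processing function
    threshold -- assumed cut-off for overlapping "greatly"
    """
    names = sorted(files)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            share = overlap(files[first], files[second])
            same_cycle = cycles[first] == cycles[second]
            # Either condition named in the text makes the pair a candidate.
            if same_cycle or share >= threshold:
                pairs.append((first, second, round(share, 2), same_cycle))
    return pairs


files = {
    "billing": {"cust_no", "name", "balance", "invoice_no"},
    "credit": {"cust_no", "name", "balance", "credit_limit"},
    "payroll": {"emp_no", "hours", "rate"},
}
cycles = {"billing": "daily", "credit": "weekly", "payroll": "weekly"}

candidates = consolidation_candidates(files, cycles)
for pair in candidates:
    print(pair)
```

A real design procedure would also weigh the frequencies of the processing functions and the cost of each alternative, which the article suggests evaluating through simulation.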

Partitioning of the processing files, as well as use of intermediate files, sometimes improves efficiency. For instance, the effect of pre- or post-sorting to order the data may be an important consideration for efficiency. A third type of consideration is to evaluate the impact of alternative file organizations. For example, whether the data is accessible sequentially or on a random access basis has an impact on efficiency of processing. Such alternatives may be evaluated through simulation.

An all-automatic process will require a data base of global computer design knowledge accessible to a computer that integrates the above enumerated design steps. It will be necessary to research first how to formalize such knowledge or how to input it to a computer in a natural language ad hoc manner. Until such capability is developed, it will be very costly to enter such information through manual modelling and programming. Therefore, the achievement of an all-automatic process with no human participation will be necessarily delayed for some time.

OVERALL PROBLEM REQUIREMENTS

The top-level system design activity discussed in this section corresponds to the top three layers in Figure 2. The top layer, "Information Demand," is concerned with determining what information management would need to evaluate economic and business alternatives and to make decisions for its overall organizational progress and effectiveness.

The second and third layers are concerned with developing an automatic system that would collect and make the indicated information available to management. The second layer consists of generating candidate operational concepts and computer configurations and the evaluation of these computer configurations to establish cost/benefits of alternative proposed information systems. The third layer consists of specifying the selected system in a formal manner to be directly applicable in lower layers.

Performing the top layer activity in an automatic fashion presents the greatest difficulty. It is the most complex and imaginative part of the total process. It has little to do, if anything, with computer methodology, which becomes important only in lower layers. Traditionally this process involves interacting with many participants, with expertise in different disciplines: with top management, staff specialists, operational staff and sometimes with outsiders to the organization, such as customers, vendors and financial and government organizations.

Presumably, a future computer system that could perform this activity automatically would need access to the combined knowledge of the present participants in this process. Current techniques for entering into a computer knowledge of how to evaluate complex economic or business situations consist of coding the knowledge in program form, which requires an enormous amount of manual analysis and structuring. This is the major problem in automating this activity.

The features of the Hax and Martin approach are to specialize in a relatively narrow application field, reportedly in inventory control; to incorporate simulation models for evaluating economics and business questions or operational methods; to use prefabricated program components for the operational side of a business; and to use artificial intelligence techniques for communicating with the computer interactively in English for the entry of additionally needed information into the computer.

Since the area of application is highly specialized and relatively narrow, the cost of the software generation system development may have to be justified by potential utilization opportunities. The usefulness and end value of such a generalized inventory control system could have been tested, for instance, during the situation of major shortages in essential materials that arose at the end of 1973. Could the system, for instance, have generated an inventory control system for oil products that would be effective for conserving oil and for planning utilization of oil products to minimize impact of shortages on the economy? The almost instantaneous availability of such a system would have demonstrated its value as surely being sufficiently great to justify the costs of development.

Balzer proposes the construction of a computer system which will have a generalized learning capability that could be utilized to enter applications-related knowledge into the computer. The quotation below summarizes the functions of the computer system in his proposed concept of its operation:

"1. Problem statement in natural language in terms of the problem domain.

"2. Knowledge about the domain acquired interactively in natural language in terms of the complete model of the problem domain.

"3. Resulting programs which are optimized with respect to data representations, control structure and code.

"This approach requires significant advances in artificial intelligence techniques, in such areas as knowledge representation, inference systems, learning and problem solving and in the codification of programming knowledge in the areas of data representations, algorithm selection and optimization techniques."

Items 1 and 2 in the above quotation refer to the top level design activities addressed in this section. Item 3 refers to program generation and optimization activities similar to those discussed in the previous section. The reference to "domain" is similar to the use of the word "application" above. The starting knowledge in the computer is assumed to be confined to that of computer system analysis and design. As stated in items 1 and 2, the description of the problem that needs to be solved, together with the knowledge that is needed to solve the problem, would be imparted to the computer system through interactive user-computer question-answer sessions conducted in English. The computer would then "learn" from the user enough about his business to be able to determine the information needed for business decisions and system requirements. The dependence of this type of activity on artificial intelligence methodology is stated in the second paragraph in the above quotation.


Balzer admits that his proposed system concept i sconjectural . There are questions in regard to bot heffectiveness of the process and feasibility . In regardto effectiveness, the approach implies, for instance ,that a president of a company would find it beneficia lto teach his problem and his business to a compute rthat is equipped with only computer-oriented knowl-edge, so that the computer could integrate the tw otypes of knowledge to provide an effective solutio nto the president's problems, Additionally, some of th einformation would have to be obtained from the vic epresidents for operations, marketing and finance .Consider the problem of all these participants enter-ing information in a consistent manner so that it al lcan be integrated . If the artificial intelligence tech-niques cited by Balzer are to be used, the human sinteracting with the computer would be required t ohave knowledge about how the computer acquire sknowledge .

A basic assumption of this approach is that interac-tive communication in English is an effective way t oteach knowledge to a computer, although this is arelatively slow process in teaching humans . Note also ,that when humans are taught by interactive com-munication, they already have a laboriously acquire dbasic knowledge of word meanings and relationships ,which would not be true for the computer system .This problem might have been less conjectural if i thad been proposed that the computer could accep ttextbooks as input material .

Such textbook input would then also constitute a base line of the knowledge in the computer on which further knowledge incorporation could be based. Also, such a system could be continuously tested as the information was entered. For instance, after entry of a chapter from an introductory economics textbook, the problems at the end of the chapter could be submitted as well, and the computer could be required to use the knowledge in the chapter to produce answers for the problems.

Another problem area is the inadequacy of the present artificial intelligence techniques cited by Balzer for entering into the computer the large vocabularies needed for handling voluminous English language communications with the computer. The systems cited by Balzer to demonstrate that the technology is available have vocabularies of a few hundred words. To communicate applications knowledge would require vocabularies of tens of thousands of words.

The problem is not simply in the larger number of words needed for English language communication of application knowledge, but in the amount of work that it takes to enter multiple word meanings and relationships. Development of a methodology to tabulate multiple meanings and relationships of words has been reported. It estimates that there are 8,000 high-usage words which have acquired many meanings and usages.

These words, as well as many others of lesser usage, would have to be integrated within detailed models of particular applications. To enter all this information, word by word, in an interactive process would require a long time. Other techniques, by which a computer may acquire knowledge of words purely from the usage in text, must be developed to enable rapid entry of an application oriented knowledge base into a computer system. This would eliminate the need to tabulate meanings and relationships in vocabularies. Processes of this type have been considered and developed in the areas of information storage and retrieval, content analysis and text processing.
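
As a rough illustration of what acquiring word knowledge "purely from the usage in text" might look like, the sketch below counts which words occur near each other in a tiny corpus and compares two words by the context they share. This is only an assumption-laden toy, not the method of the work cited above: the function names, the window size and the three-sentence corpus are all hypothetical.

```python
from collections import Counter, defaultdict
import re


def cooccurrence_profiles(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    profiles = defaultdict(Counter)
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    profiles[word][words[j]] += 1
    return profiles


def shared_context(profiles, a, b):
    """Count the context-word occurrences that words `a` and `b` have in common."""
    return sum((profiles[a] & profiles[b]).values())


# Hypothetical corpus, chosen only to make the effect visible.
corpus = [
    "the bank approved the loan",
    "the bank rejected the loan",
    "the river bank was muddy",
]
profiles = cooccurrence_profiles(corpus)

print(shared_context(profiles, "approved", "rejected"))  # words used alike
print(shared_context(profiles, "approved", "muddy"))     # words used differently
```

In this toy corpus, "approved" and "rejected" share more context than "approved" and "muddy", without any meanings having been tabulated by hand. A realistic system would need far more text and a better similarity measure; the sketch only shows the direction of the idea.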

All the questions that have been raised above seem to be open research questions. It appears, therefore, that the adequacy of artificial intelligence techniques for implementing the type of system envisaged by Balzer is far from proven, and the feasibility of such a system will still need to be demonstrated. Therefore, an effective automation of this generalized high level analysis and design process also cannot be foreseen with confidence.

CONCLUSIONS

Predictions or projections of technological developments are always hazardous, especially when made for a prolonged period such as to the year 1985. Therefore, it is in order to comment on reasons for confidence, or lack of it, in the projections.

The projections of growth of software manpower requirements are not reliable. They are based on a study of future U.S. Air Force software requirements which have been applied to the total U.S. economy. New industrial uses of computers through 1985 that should figure prominently in such projections have not been considered. The objective in making the growth projection was to estimate a target for manpower savings from advances in automatic programming. Even if the growth projections are 50 percent too high, there would still be need for advances.
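
The arithmetic behind this remark can be checked with a short calculation. The sketch below compounds the 22 percent annual growth rate quoted in the introduction and then applies one possible reading of "50 percent too high". Two assumptions are made here and are not stated in the article: that the baseline year is 1970, and that "50 percent too high" means the projected 20-fold increase is 1.5 times the true increase.

```python
def compound_growth(rate, years):
    """Total growth multiple after compounding `rate` annually for `years` years."""
    return (1.0 + rate) ** years


def implied_annual_rate(multiple, years):
    """Annual growth rate that yields the given total multiple over `years` years."""
    return multiple ** (1.0 / years) - 1.0


# Assumption: the projection runs from a 1970 baseline to 1985.
YEARS = 1985 - 1970

# 22 percent annual growth, as quoted in the introduction.
projected = compound_growth(0.22, YEARS)

# Assumption: "50 percent too high" means projection = 1.5 x true increase.
adjusted = 20.0 / 1.5
adjusted_rate = implied_annual_rate(adjusted, YEARS)

print(f"projected multiple: {projected:.1f}")
print(f"adjusted multiple:  {adjusted:.1f}")
print(f"adjusted rate:      {adjusted_rate:.1%}")
```

Under these assumptions the 22 percent rate compounds to roughly the 20-fold increase cited, and the reduced projection still amounts to about a 13-fold increase in manpower, or close to 19 percent annual growth.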

One assumption was that software development will continue in the bottom-up fashion indicated by the layers in Figure 2. The alternative would be some major breakthroughs which would upset the past order of progress. For instance, breakthroughs in the field of artificial intelligence, by which computers could quickly "learn" methods, could have such an impact. But there are important research questions that need to be answered before such breakthroughs can be predicted.

REFERENCES

1. Barry W. Boehm, "Software and Its Impact: A Quantitative Assessment," Datamation, May 1973, pp. 48-59.

2. B. Gilchrist and R.E. Weber, "Employment of Trained Computer Personnel—A Quantitative Survey," AFIPS Conference Proceedings, Vol. 40, May 1972, pp. 641-648.

3. AFIPS, "The State Of The Computer Industry In The U.S.," Montvale, New Jersey, 1973.

4. Jean E. Sammet, "Programming Languages: History and Future," CACM, July 1972, Vol. 15, No. 7, pp. 607-610.

5. E.A. Nelson, "Management Handbook For The Estimation of Computer Programming Costs," SDC TM 3224, 1966.

6. F.T. Baker, "Chief Programmer Team Management of Production Programming," IBM Systems Journal, Vol. 11, 1, 1972, pp. 57-73.

7. R.W. Wolverton, "The Cost of Developing Large Scale Software," IEEE Trans. on Computers, Vol. C-23, No. 6, June 1974, pp. 615-636.

8. J. Daniel Couger, "Evolution of Business Systems Analysis Techniques," Computing Surveys, Vol. 5, No. 3, September 1973, pp. 167-198.

9. Examples are IBM Customizing Service for users of IBM System 3 and the proprietary services of Keydata and others for the distribution industry.

10. D. Teichroew, "Survey of Languages For Stating Requirements for Computer Based Information Systems," AFIPS Conference Proc., Vol. 42, Fall 1972, pp. 1203-1244.

11. Burt M. Leavenworth and Jean E. Sammet, "An Overview of Nonprocedural Languages," Proc. of the Symposium on Very High Level Languages, SIGPLAN Notices, Vol. 3,4, April 1974, ACM, N.Y., 1974, pp. 1-12.

12. CODASYL Data Base Task Group Report, Report to the CODASYL Programming Languages Committee, ACM, New York, 1971.

13. N.S. Prywes and D. Pirog Smith, "Organization of Information," in Annual Review of Information Science and Technology, Vol. 7, C.A. Cuadra, Ed., ASIS, Washington, D.C., 1972, pp. 103-158.

14. W.P. Stevens, G.J. Myers and L.L. Constantine, "Structured Design," IBM Systems Journal, 13-2, 1974, pp. 115-139.

15. Harlan Mills, "Top Down Programming In Large Systems," Courant Computer Science Symposium I, July 1970. Debugging Techniques In Large Systems, Randall Rustin Ed., Prentice Hall, 1971, pp. 41-55.

16. M.E. Ellis, W. Katke, J. Olson and S.C. Yang, "SINS—An Integrated User Oriented Information System," Proc. of The Fall Joint Computer Conference 1972, pp. 1117-1131.

17. J. Ramirez, "Automatic Generation Of Data Conversion Programs Using A Data Description Language," Ph.D. dissertation, U. of Pennsylvania, 1973.

18. W.M. McKeeman, et al., "A Compiler Generator," Prentice Hall, 1970.

19. J.F. Nunamaker, et al., "A Nonprocedural High-Level Language For Automated Design of Application Systems," Computer Science Department, Purdue University.

20. J.F. Nunamaker, Jr., "A Methodology for the Design and Optimization of Information Processing Systems," AFIPS Proc., 1971, SJCC, pp. 283-294.

21. A.F. Cardenas, "Evaluation and Selection of File Organization—A Model and System," CACM 16, 9, September 1973, pp. 540-548.

22. J. Yeh and J. Minker, "Key Word in Context Index and Bibliography on Computer Systems Evaluation Techniques," University of Maryland, Computer Science Center, June 1973.

23. A.D. Hax and W.A. Martin, "Automatic Generation of Customized, Model Based Information Systems For Operations Management," Proc. of the Wharton Conference on Research on Computers in Organizations, October 1973, H.L. Morgan, Editor, pp. 117-121.

24. R.M. Balzer, "A Global View of Automatic Programming," also memorandum on "Automatic Programming," September 1972, USC Information Sciences Institute.

25. G.J. Sussman and D. McDermott, "Why Conniving is Better Than Planning," AFIPS Conference Proceedings, Fall 1972.

26. C. Hewitt, "PLANNER: A Language For Proving Theorems In Robots," Proc. of Intl. Joint Conference on Artificial Intelligence, Mitre Corp., 1969, pp. 245-301.

27. T. Winograd, Understanding Natural Language, Academic Press, 1972.

28. Louis L. Earl, "Use of Word Government in Resolving Syntactic and Semantic Ambiguities," Conference Proc. Computer Text Processing and Scientific Research, Office of Naval Research, March 1973, pp. 55-96.

29. N. Prywes, A. Lang and S. Zagorsky, "A Posteriori Indexing Classification and Retrieval of Textual Data," to be published in Information Storage and Retrieval.
