THE CONCOR (CONSISTENCY AND CORRECTION)
EDIT AND IMPUTATION SYSTEM
ITS ADEQUACY AND STATE OF COMPLETION
A Report Prepared By FREDERIC J GRANT
During The Period JANUARY 7-31 1980
Under The Auspices Of The AMERICAN PUBLIC HEALTH ASSOCIATION
Supported By The US AGENCY FOR INTERNATIONAL DEVELOPMENT OFFICE OF POPULATION AIDDSPEC-C-0053
AUTHORIZATION Ltr POPFPS 12379 Assgn No 582-012
PREFACE
Since June 1979 a major redesign of the COBOL CONCOR edit and imputation system has been undertaken by the International Statistical Programs Center (ISPC) of the US Department of Commerce, Bureau of the Census. A one-day program held October 10, 1979 previewed enhancements which were planned for the system. Based upon the information furnished at that workshop, I undertook an interim review of the state of completion of the COBOL CONCOR package. The result of that review was a working document entitled Report on the Developing COBOL CONCOR Edit and Imputation System. At the time of that writing the system was not sufficiently complete to gauge definitively its adequacy for exportation to developing countries. This current publication, COBOL CONCOR 1980 Its Adequacy and State of Completion, while substantial in its own regard, can best be understood in light of that previous report.
On January 7-18, 1980 I attended a workshop designed to provide participants with an in-depth explanation of the full range of capabilities the new COBOL CONCOR supports. During this time I was able to learn the new CONCOR language and conduct tests bearing on the adequacy and completeness of the system. Results from these test programs comprise parts of many of the Appendices.
On January 18, 1980 I was debriefed at the Office of Population, Agency for International Development, Rosslyn, Virginia, on the specific areas which compose the body of this report. In this instance any comments of a critical nature about CONCOR must be preceded by a statement attesting to the competence and dedication of the ISPC staff, who have done an extraordinary job in redesigning and rewriting many of the programs comprising the system since October 10, 1979.
Though my experience with systems analysis and design utilizing the COBOL programming language encompasses three years, local circumstances and specialization are important considerations. The discussions of this report are based on my overall experience in the data processing field and how I think it applies to the development of CONCOR. At the time of the writing of this report I am the Senior Systems Analyst and the Director of Data Base Administration for the Georgia World Congress Institute, a state-operated nonprofit research organization located in Atlanta, Georgia.
CONTENTS
PREFACE
EXECUTIVE SUMMARY
I BACKGROUND
II THE ADEQUACY OF CONCOR
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
V CONCLUSION
APPENDICES
Appendix A - Bucen Enforcement Proposal
Appendix B - Evaluative Criteria
Appendix C - Workshop Itinerary
Appendix D - Participants
Appendix E - CONCOR Evaluation Form
Appendix F - New/Old Command Comparisons
Appendix G - ISPC Future Enhancement List
Appendix H - CONCOR System Internal Variables
Appendix I - CONCOR-EDITOR Execution Statistics
Appendix J - Diagnostic Message Guide Example
EXECUTIVE SUMMARY
The December 1979 COBOL CONCOR (Version 2) is a much improved software package. All commands appear to be functional; however, the system should be exhaustively tested by an independent agency prior to its general release. This agency should also precisely determine the system's relative speed and core processor requirements. While the system (exclusive of documentation) could immediately be utilized in a situation of extreme need, some CONCOR language coding inconsistencies detract from the learnability and exportability of the package and should be corrected. Additionally, there are other modifications or adjustments which would enhance the overall utility and productivity of the language in census and survey applications.
System documentation continues to be a problem. There is no User's Guide. The Systems Manual, though well-constituted informationally, should be thoroughly reorganized in accordance with the guidelines set forth in this report.
The staff of ISPC exhibited competence and professionalism in the conduct of the two-week workshop of January 7-18, 1980. ISPC generally is aware of both the potential and the shortcomings of the CONCOR project. The current CONCOR version represents a significant redesign of the overall system. As a package it is in a state where its completion is within reach.
I BACKGROUND
CONCOR (an acronym of Consistency and Correction) is best characterized as a software tool designed to expedite the processing of data files during the edit and imputation phase of population censuses and surveys. As a metacompiler written in the COBOL language, the system reads and verifies CONCOR language statements to produce an executable EDITOR program. The objective of this process is the creation of an error-free file which can be used at a later time for tabulation purposes.
Since its release as Version 1 on December 30, 1978, numerous elements of the COBOL CONCOR system have undergone continual change and redefinition. In fact, the system has not been permitted to stand still for any period of time, nor has it been exhaustively tested. In June of 1979 ISPC suspended the further distribution of the COBOL CONCOR system. This decision was based principally upon reports of the package's unsatisfactory performance at workshops held in Panama and Thailand. ISPC, upon its own initiative, developed a proposal to overhaul CONCOR and its accompanying documentation. This proposal, contained in Appendix A, represents an ambitious undertaking. While not all of the desired changes and capabilities could be implemented, Version 2 of December 1979 represents a significant managerial effort. The questions now are whether COBOL CONCOR Version 2 will be a demonstrably adequate software package -- a package capable of exportation to developing countries -- a package requiring no further modification. The purpose of this report is to address these critical issues. In connection with this, Appendix B sets forth the specific criteria around which such a discussion must evolve. As this is not intended to be a compendium, some of these broader issues will be treated immediately; following chapters and appendices will qualify the exact nature of system alterations already undertaken as well as further adjustments believed to be essential in realizing the goals of the system's philosophy.
Workshop
During the period of January 7-18, 1980 a workshop was held under the sponsorship of the ISPC to demonstrate the capabilities of the latest revision of the COBOL CONCOR software package. A schedule of events of this workshop is contained in Appendix C. This workshop was intended to provide participants with the opportunity to program in the CONCOR language and thereby to test aspects of the system as individually appropriate. A listing of the participants and the international organizations they represented is contained in Appendix D. During the concluding days of the workshop each participant was asked by ISPC to provide a written evaluation of the now-called December 1979 version of CONCOR. This evaluation form, Appendix E, also includes space for comments concerning the competence of the system documentation as well as any additional comments, including those regarding the organization and clarity of workshop presentations. It is assumed that in the near future summaries of these comments will be available to interested agencies.
Note: EDITOR is the new name of the EXECUTOR module of previous language versions.
Note: A complete history of the development of COBOL CONCOR can be found in both the December 1978 Version 1 and December 1979 Version 2 systems manuals as well as in previous consulting reports.
While virtually all instructional aspects of this two-week workshop were conducted in a highly professional manner -- a manner which revealed a high degree of coordination among staff members in their efforts -- there are several areas which future workshops may improve upon:
1. All publications should be assembled in their entirety and proof-read prior to distribution.
2. A complete CONCOR language program example and accompanying I/O documents should be provided at the onset of the workshop for reference.
3. Numerous short application programming problems involving all CONCOR language divisions should be utilized in place of a single lengthy problem.
It is noted that this workshop was not intended to teach the CONCOR language; had it been, the organization and presentation of materials probably would have been different. It is believed that the two-week period was sufficient to familiarize participants with the use of the new CONCOR features, especially in light of the fact that workshop participants were permitted to work weekends and beyond normal working hours at their discretion. Though funding was not generally available, it is known that several workshop members chose to extend their stay in Washington to continue testing the CONCOR package or to work on projects which they could attempt to install immediately on their home computers. At the conclusion of the workshop, participants were permitted to take with them an installation tape of CONCOR as well as all the other materials they had acquired during the course of the project.
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidity with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that of all the data-cleaning tools available for exportation CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1. Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers.
a. Implementation of the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION. De-emphasis of section headings.
b. Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of a length equal to that of NEW-DATA identifiers. Permit the coding of single-dimension row and column vectors in the same manner as multi-dimensional arrays.
2. Implementation of selected commands and internal variables to facilitate the production environment use of CONCOR in census applications. These include:
a. LOAD/UNLOAD ARRAYS commands, which would automatically save and restore hot-decked values from batch to batch.
b. TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA-CONTROL.
3. Other modifications.
a. Default values for the max-storage parameter set in a realistic range.
b. Allowance of more variables for survey applications.
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements as outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply to a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language as well as observations and discussions with other workshop participants. While it is probably impossible ever to be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes it obvious that while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and must therefore be preceded by a period so that they are treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution.
The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization does not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section name implementation, while seemingly cosmetic, can have real impact on the language's perceived ease of learning and use.
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4: This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2: This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force a recode of the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
Figure 1 illustrates common mistakes programmers make in coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item, and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, user variables defined in NEW-DATA with a length of 18 could be moved inadvertently to output record fields defined by a data definition statement such as N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify comparison values (a maximum of three), as shown. There are a number of production instances when it never would be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is no current way to preserve these values if desired.
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e., A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
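The item formats and limits discussed above can be illustrated with a short Python sketch. It is not part of CONCOR: the function names, the `ItemSpec` type, and the recode values (position of the matching comparison string, and -1 for no match) are assumptions made for illustration only. The report states only that unmatched values receive a unique negative value.

```python
import re
from dataclasses import dataclass
from typing import Sequence

# Limits as stated in this report for the December 1979 version.
MAX_LENGTH = {"N": 9, "A": 4}    # data definition statements
NEW_DATA_MAX_LENGTH = 18         # numeric items declared in NEW-DATA
MAX_COMPARISON_STRINGS = 3
NO_MATCH = -1                    # hypothetical stand-in for EDITOR's negative value

_SPEC = re.compile(r"^([NA])(\d+)-(\d+)$")


@dataclass(frozen=True)
class ItemSpec:
    kind: str    # "N" numeric, "A" alphanumeric
    length: int  # field length in bytes
    start: int   # starting column in the record


def parse_item_spec(text: str) -> ItemSpec:
    """Parse a specification such as 'N9-23' and enforce the length limits."""
    match = _SPEC.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid item specification: {text!r}")
    kind = match.group(1)
    length = int(match.group(2))
    start = int(match.group(3))
    if not 1 <= length <= MAX_LENGTH[kind]:
        raise ValueError(f"{text!r}: length must be 1..{MAX_LENGTH[kind]}")
    if start < 1:
        raise ValueError(f"{text!r}: starting column must be at least 1")
    return ItemSpec(kind, length, start)


def fits_output_field(new_data_length: int, target: ItemSpec) -> bool:
    """True if a NEW-DATA value of this length fits the output field."""
    if not 1 <= new_data_length <= NEW_DATA_MAX_LENGTH:
        raise ValueError("NEW-DATA length must be 1..18")
    return new_data_length <= target.length


def recode_alpha(value: str, comparison_strings: Sequence[str]) -> int:
    """Recode an alphanumeric value to a number via its comparison strings."""
    if not 1 <= len(comparison_strings) <= MAX_COMPARISON_STRINGS:
        raise ValueError("between 1 and 3 comparison strings are required")
    for code, candidate in enumerate(comparison_strings, start=1):
        if value == candidate:
            return code
    return NO_MATCH
```

Under these assumptions, `fits_output_field(18, parse_item_spec("N9-23"))` returns False, which is the inadvertent-move error described above, and `parse_item_spec("A9-23")` is rejected, which is the restriction the proposed uniform format would lift.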
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e., allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
Columns (AGE OF HUSBAND): 12-17 18-24 25-35 36+
Rows (RELATION), with cold-deck values as read from the source (partly illegible):
HEAD 2 1 3 4
CHILD 2 -1 3 ?
OTHER 1 31 -2 -4
NONRELATIVE 1 2 2 2
Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows: user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start-up values. In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9 and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error message below.)
WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e., A06 must be coded as follows.
(Example of how vectors must currently be coded to be correct)
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
Parameters: user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values.
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
A major requirement of COBOL CONCOR file processing is that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of discrete data files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e., to prevent the program from starting with cold values on each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing between batches would be reduced to a minimum and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Note: While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion in the package prior to its general distribution.
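The intent of the proposed LOAD/UNLOAD ARRAYS command can be sketched in Python. This is only an illustration of carrying hot-deck values from one batch to the next: the function names, the JSON file format, and the LAST-AGE array are all hypothetical and do not describe how CONCOR or EDITOR would store the values.

```python
import json
from pathlib import Path


def load_arrays(path, cold_values):
    """Return arrays saved by a previous batch, or cold-deck values if none exist."""
    saved = {}
    file = Path(path)
    if file.exists():
        with file.open() as handle:
            saved = json.load(handle)
    # Any array missing from the saved file starts from its cold-deck values.
    return {name: saved.get(name, list(values))
            for name, values in cold_values.items()}


def unload_arrays(path, arrays):
    """Save the current hot-deck arrays for the next batch."""
    with Path(path).open("w") as handle:
        json.dump(arrays, handle)


def run_batch(ages, hot_deck):
    """Impute missing ages (None) from the most recent valid age seen."""
    cleaned = []
    for age in ages:
        if age is None:
            age = hot_deck["LAST-AGE"][0]    # impute from the hot deck
        else:
            hot_deck["LAST-AGE"][0] = age    # refresh the hot deck
        cleaned.append(age)
    return cleaned
```

With a cold value of 30, a first batch `[None, 41, 25]` is cleaned to `[30, 41, 25]`. If the arrays are unloaded and then loaded before the second batch, a leading missing value in that batch is imputed as 25 rather than restarting from the cold value of 30.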
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g., debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
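The distinction between area-level and run-level counters can be sketched as follows. This Python fragment is illustrative only: the record layout (area key, questionnaire identifier, payload), the `Counters` type, and the `max_records` option are assumptions, not CONCOR internal variables.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    questionnaires: int = 0
    records: int = 0


def count_by_area(records, max_records=None):
    """Tally records that arrive contiguous by area and questionnaire.

    Area-level counters start fresh at each control-area break, while
    run-level counters keep accumulating, so a run can be stopped after
    max_records records regardless of the area being processed.
    """
    run_total = Counters()
    per_area = {}
    current_area = None
    current_questionnaire = None
    area = None
    for area_key, questionnaire_id, _payload in records:
        if max_records is not None and run_total.records >= max_records:
            break
        if area_key != current_area:
            # Control-area break: switch to that area's own counters.
            area = per_area.setdefault(area_key, Counters())
            current_area = area_key
            current_questionnaire = None
        if questionnaire_id != current_questionnaire:
            area.questionnaires += 1
            run_total.questionnaires += 1
            current_questionnaire = questionnaire_id
        area.records += 1
        run_total.records += 1
    return per_area, run_total
```

A debugging run could then call `count_by_area(records, max_records=100)` and stop after the first hundred records even when those records span several control areas.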
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in Figure 3, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics at the total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
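What a labelled report might provide can be sketched in a few lines of Python. The layout, the function name, and the statistic names are hypothetical. The sketch only shows a user-supplied label heading the report, with statistics given both for each control area and for the total run.

```python
def format_report(label, stats_by_area):
    """Build a labelled report with per-area and total-run statistics.

    stats_by_area maps a control-area key to a dict of statistic name -> count.
    """
    lines = [f"REPORT: {label}"]
    totals = {}
    for area, stats in stats_by_area.items():
        lines.append(f"AREA {area}")
        for name, value in stats.items():
            lines.append(f"  {name}: {value}")
            totals[name] = totals.get(name, 0) + value
    # The total-run summary that is unavailable once a control area is defined.
    lines.append("TOTAL RUN")
    for name, value in totals.items():
        lines.append(f"  {name}: {value}")
    return "\n".join(lines)
```

A call such as `format_report("VOLUME 3 OF 12", stats)` would make printouts from different volumes distinguishable at a glance.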
FIGURE 3
CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING
LINE NUMBER
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
74 RECORD-TYPE= [illegible]
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
270 RECORD-TYPE= [illegible]
Job step statistics (system messages partly illegible in the source)
IEF373I STEP LOAD START 80018.1001
IEF374I STEP LOAD STOP 80018.1002 CPU 0MIN 01.65SEC SRB 0MIN 00.25SEC VIRT 1000K
STEP START TIME - 100155 STEP END TIME - 100257 STEP CPU TIME - 1.65 SECONDS STEP COMPLETION CODE - 0
TAPES - 0 DISKS - 0 REGION (REQUESTED - 1000K USED - 1000K) I/O COUNT 211
CARDS READ - 0 CARDS PUNCHED - 0 LINES PRINTED - 22
STEPNAME - LOAD
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. Overall, tests of the various commands revealed the generally high performance of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message number more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a User's Guide as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that with some reorganization and editing it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document, a document which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a user's guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the User's Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instruction and considerations should compose a chapter of the Systems Manual rather than a separate document (Hand-out materials covering installation were not adequate) Further a documentation scheme setting forth how CONCOR was installed on each computer and any modifications required during installation could be serially added to and become a permanent part of the installations systems manual If CONCOR is going to be used and if agencies are going to actively support it as a package installation limitations must be known and well-documented The prescribed format for this documentation should be set forth and any modifishycations to CONCOR should be continuously updated on these shcets so that a history of changes to the language will be readily accessible to supporting agencies If CONCOR is to survive as a standardized system editing results must be capable of being replicated among installations Such a form could contain other information as well concerning the tyce of computer amount of care and nature of utilities supporting
13
users Upon installation a copy of this form could be
sent to the US agency which will ultimately be responsible
for supporting the CONCOR package
2 A complete COBOL CONCOR program should appear in an appendix
for reference
3 The development of the Users Guide should include an intensive
review of the editing concepts involved in processing census
data files beyond the POPSTAN materials
4 An explanation of the CONCOR benchmark program should appear
in the Users Guide and the Systems Manual The running of a
supplied benchmark program should be a standard installation
protocol used to test all operational aspects of a new
installation
5 The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then to exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
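Items 3C and 3D of the proposal describe tolerance checking of error rates by area and the capture of the IDs of failing areas. A minimal sketch of that idea in Python follows. It is not CONCOR code: the function name, the record layout (one area ID and one error flag per questionnaire) and the example tolerance are assumptions made purely for illustration.

```python
from collections import defaultdict

def failing_areas(records, tolerance):
    """Return the IDs of areas whose error rate exceeds `tolerance`.

    `records` is an iterable of (area_id, had_error) pairs, one per
    questionnaire (hypothetical layout); `tolerance` is a fraction
    between 0 and 1 chosen by the user.
    """
    totals = defaultdict(int)   # questionnaires seen per area
    errors = defaultdict(int)   # questionnaires with at least one error
    for area_id, had_error in records:
        totals[area_id] += 1
        if had_error:
            errors[area_id] += 1
    # An area fails when its observed error rate is above the tolerance.
    return sorted(a for a in totals if errors[a] / totals[a] > tolerance)

sample = [("001", False), ("001", True), ("001", False), ("001", False),
          ("002", True), ("002", True), ("002", False)]
print(failing_areas(sample, 0.30))
```

Here area 001 has an error rate of 1 in 4 and passes a 30 percent tolerance, while area 002 (2 in 3) is captured as failing.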
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be ...ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness and efficiency of the system and ease of use by developing country programmers and other census personnel
cc Suzanne Olds APHA
Julio Ortuzar Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
  Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
  - Purpose and function
  - History of development
  - General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
  - Ways to interrogate data
  - Ways to correct data
  - Editing housing and population data
  - POPSTAN
  - Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
  - Constraints in design of CONCOR
  - Basic subsystems of CONCOR
  - User interactions with system
  - Examples of outputs produced
2:00 - 2:30 User Program Organization
  - Divisions
  - Sections
  - Routines
  - Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
  - Types of statements
  - Format
  - Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
  - Punctuation
  - Input data referencing
  - Working data declarations
  - Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
  - Dictionary-Name
  File-Section
  - Input-File
  - Output-File
  - Write-File
  - Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
  - Define-Record
  Working-Data-Section
  - New-Data
  - Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
  - Minimum dictionary structure
  - Maximum dictionary structure
  - Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants Dictionary problems
  Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
  - Punctuation
  - Subscripting
  - Internal Identifiers
  - Report-Control-Section
    - Generate-Edit-Statistics
    - Count-Imputes
    - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
  - Routines of Edit-Specifications-Section
    - Prolog-Routine
    - Filter-Routine
    - Epilog-Routine
  - Types and functions of edit specification commands
  - Range
  - Assert
2:15 - 2:30 Break
2:30 - 3:25
  - PassFail clauses
  - List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
  Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
  - Allocate
  - Update
  - Let
12:00 - 1:15 pm Break
1:15 pm - 2:15
  - If
  - UntilExit
  - Stop
2:15 - 2:30 Break
2:30 - 3:25
  - Drecode
  - Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
  - Output
  - Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
  - Display-Control-Section
    - Display-Edit-Statistics
  - Tolerance-Control-Section
    - Error-Rate-Check
    - Reject-File
  - Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
  Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
  - for census processing
  - for survey processing
  - manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
  - Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
  - submission of written evaluations
  Distribution of IBM installation tapes
  Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2 If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet The position notation of Version 2 permits the command to begin in columns 1-72
The period preceding a statement signifies that it is but a comment card Accordingly it should be noted that while the END-DIVISION command must be present the division name identifier DICTIONARY-DIVISION has not been implemented
Note Some parameters for each command have been altered even where general correspondence exists
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists Individual commands are discussed on the following pages
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
ALLOCATE (ALL) | ALLOCATE (ALLOC) | Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) | ASSERT (ASRT) | The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) | DRECODE (DRCD) | Provides a method for recoding data items which have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
- | EXIT | Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END | END-IF END-THEN END-ELSE | A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR | not implemented | No longer supported in language.
FILTER TYPE ( ) WITH | not implemented | Option statements are no longer supported in the language. This command once specified essentially a GOTO depending on construction.
(see XRECODE) | GRECODE (GRCD) | This new command provides a method for the group recoding of input values.
IF | IF | The IF statement has properties found in virtually all programming languages, ie it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET | LET | Virtually unchanged.
- | LIST | New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
- | OUTPUT | New command which provides the means by which records will be written to the output-file.
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword | STOP keyword | Unchanged.
- | UNTIL DO END-DO | One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
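The comments on ALLOCATE and UPDATE describe an imputation mechanism in which UPDATE maintains arrays and ALLOCATE assigns values. A minimal Python sketch of one common arrangement of this kind (often called hot-deck imputation) follows. The report does not spell out CONCOR's own procedure, so the function name, the record layout, the valid range and the starting value are all assumptions made for illustration.

```python
def impute_age(records, low=0, high=99, initial=30):
    """Replace invalid ages with the last valid age seen for the same sex.

    `records` is a list of (sex, age) pairs (hypothetical layout).
    Returns the edited records and the number of imputations made.
    """
    hot_deck = {}   # array of donor values, keyed by sex
    imputes = 0
    edited = []
    for sex, age in records:
        if age is not None and low <= age <= high:
            # "Update" step: remember the latest valid value as a donor.
            hot_deck[sex] = age
        else:
            # "Allocate" step: assign the stored donor value, falling back
            # to an assumed starting value when no donor exists yet.
            age = hot_deck.get(sex, initial)
            imputes += 1
        edited.append((sex, age))
    return edited, imputes

data = [("M", 25), ("F", 40), ("M", 150), ("F", None), ("X", -1)]
print(impute_age(data))
```

Counting the imputations alongside the edit mirrors the role the report assigns to the GENERATE-EDIT-STATISTICS option, though the counting scheme shown here is only a stand-in.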
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated Generally all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field as CONCOR is incapable of merging noncontiguous records
There is currently no command to enable users to specify unique report headings on the listing from this division
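The presorting requirement can be illustrated with a short Python sketch. It assumes, for illustration only, that an area break is detected whenever the area code changes between consecutive records; the function name and record layout are hypothetical and are not taken from CONCOR.

```python
from itertools import groupby
from operator import itemgetter

def area_breaks(records):
    """Count records per contiguous run of area codes.

    Each run corresponds to one area break; records for the same area
    that are not adjacent end up in separate runs.
    """
    return [(area, sum(1 for _ in run))
            for area, run in groupby(records, key=itemgetter(0))]

unsorted_file = [("A", 1), ("B", 2), ("A", 3)]
print(area_breaks(unsorted_file))                              # area A reported twice
print(area_breaks(sorted(unsorted_file, key=itemgetter(0))))   # one break per area
```

With the unsorted file, area A produces two separate breaks and its statistics are split; sorting on the area field first restores one report per area.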
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques As a result CONCOR should not be construed to be the ultimate software package for generalized data editing The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive Some of the possible modifications and considerations are outlined below
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added This would allow the creation of edited output files in a different format and length than the file read into CONCOR This implementation would require other enhancements to provide a simple means of moving values from the input to the output file Unique naming of user-identifiers would also need to be addressed Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination This would require modification to the system level data dictionary
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of different data item types in the CONTROL
commands This would allow the greatest flexibility in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value is to be moved to the output
record when the record is to be written out using the OUTPUT
command
(9) Implement a COMMON-DATA concept to handle items common to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This would require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IMPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential when dealing with data files generated by
FORTRAN programs
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values is now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
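Point (13) above proposes an automatic range check driven by valid values declared in the dictionary. The following Python sketch shows the general idea only. The dictionary layout (start column, length, low, high), the field names and the function name are hypothetical and do not reproduce DEFINE-RECORD syntax.

```python
# Hypothetical data dictionary: field name -> (start column, length, low, high)
DICTIONARY = {
    "SEX": (0, 1, 1, 2),
    "AGE": (1, 2, 0, 99),
}

def range_check(record, dictionary=DICTIONARY):
    """Return the names of fields that are non-numeric or out of range.

    The check runs before any user edit logic sees the record, which is
    the ordering suggested in point (13).
    """
    failures = []
    for name, (start, length, low, high) in dictionary.items():
        text = record[start:start + length]
        # Leading blanks are tolerated; anything else non-numeric fails.
        if not text.strip().isdigit() or not low <= int(text) <= high:
            failures.append(name)
    return failures

print(range_check("125"))   # valid record
print(range_check("3 7"))   # SEX outside 1-2
print(range_check("1XX"))   # AGE not numeric
```

Because the valid ranges live in the dictionary rather than in comment statements, the same declarations could in principle be read by tabulation software, which is the second benefit the point mentions.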
1.1.2 Execution Division
(1) Implement the Run Control Section This would allow the user to increase execution speed of the EDITOR program This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking In this section the user could also specify new maximum values for the EDITOR error limits
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered The only problem to resolve is how to treat this identifier outside of the Filter Routines
(3) Implement a CREATE command This command would allow the user to create a record that was not found on the input file This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command that would change
CONCOR's default occurrence pointer for the duration of the SET
command This command would be very helpful as the validation
of the existence of a record would only have to be done once
when the SET command was encountered The major drawback to
this command is in the users ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed Another program would then be
required to format the information gathered into a readable
form
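Point (4) above proposes a lookup table for the DRECODE command. Since Appendix F describes DRECODE as recoding items whose values start at zero and ascend, a table indexed directly by the source value is one natural reading. The Python sketch below contrasts that with a linear search. It does not reproduce the code generated by the EDITOR program; all names are hypothetical.

```python
def drecode_linear(pairs, value):
    """Linear search: compare the value against each (source, target) pair."""
    for source, target in pairs:
        if source == value:
            return target
    return None

def build_table(pairs):
    """Build a table indexed by source code; assumes codes run 0..n-1."""
    table = [None] * len(pairs)
    for source, target in pairs:
        table[source] = target
    return table

def drecode_lookup(table, value):
    """Lookup table: one bounds check and one index, whatever the table size."""
    return table[value] if 0 <= value < len(table) else None

# Hypothetical recode: detailed codes 0-3 collapsed into broader groups.
PAIRS = [(0, 9), (1, 1), (2, 1), (3, 2)]
TABLE = build_table(PAIRS)
print([drecode_lookup(TABLE, v) for v in range(-1, 5)])
```

The table has to be built once, ahead of execution, which is consistent with the additional communication between the generating programs that the point anticipates.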
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began This
extra procedure is required because of program size
considerations
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently Also to examine the possibility of making the section dealing with the message text and report generation a separate file Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user This does not mean to imply that the editing process itself should be made interactive but rather only the development of the user code should be put online
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected This new input file would contain the master list of those observations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
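The check-in procedure of point (2) above amounts to comparing the questionnaires received against a master list of sample cases. A minimal Python sketch, assuming questionnaire IDs are simple strings and using a hypothetical function name, might look like this.

```python
def check_in(master_list, received):
    """Compare received questionnaire IDs against the expected sample.

    Returns the IDs expected but not received, and the IDs received
    that are not on the master list.
    """
    expected = set(master_list)
    seen = set(received)
    return {
        "missing": sorted(expected - seen),
        "unexpected": sorted(seen - expected),
    }

master = ["Q001", "Q002", "Q003", "Q004"]
arrived = ["Q001", "Q003", "Q009"]
print(check_in(master, arrived))
```

A production version would also need to report duplicate questionnaires, which the set comparison shown here silently ignores.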
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible ommission from manual
Not implemented
Note Current documentation equivocates when some variables
are reset--most are initialized with each control area break
Though implemented as reserved identifiers due to poor documentation of (old) CONCOR exact values and utility were unknown
These apparently new counters permit access to valuable totals within control areas
Note These are not cumulative counters Also outputshyquest-count violates naming conversion eg in out
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout, first test: the message DIVISION BY ZERO LINE NUMBER 118 is repeated, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. Superimposed program text, lines 111-125: registers R001 and R003 are set to zero, and line 118 then computes R003 from R002 and R001.]

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
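The behavior shown on this page, a diagnostic that names the offending source line followed by an orderly abort, can be sketched as follows. This is an illustrative Python model, not the EDITOR implementation: the abort threshold of six errors, the tuple layout of a statement, and the use of integer division are all assumptions.

```python
MAX_ERRORS = 6  # assumed abort threshold; the real limit is not documented here


def run_divisions(program, registers, passes=1):
    """Run (line_no, target, numerator, denominator) statements.

    Returns the diagnostics produced. A zero denominator yields a message
    carrying the source line number instead of an unexplained abend.
    """
    diagnostics = []
    errors = 0
    for _ in range(passes):
        for line_no, target, num, den in program:
            if registers[den] == 0:
                errors += 1
                diagnostics.append(f"DIVISION BY ZERO LINE NUMBER {line_no}")
                if errors >= MAX_ERRORS:
                    diagnostics.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return diagnostics
                continue
            registers[target] = registers[num] // registers[den]
    return diagnostics


registers = {"R001": 0, "R002": 7, "R003": 0}
program = [(118, "R003", "R002", "R001")]
messages = run_divisions(program, registers, passes=10)
for message in messages:
    print(message)
```

The point of the design is that each error is reported where it occurs and the run stops only after a bounded number of them, so the user always receives at least one line reference.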
[Execution Statistics printout, second test: diagnostics for nonexistent array element references. Superimposed program text, lines 173-187: two nested DO loops vary R001 and R002 from 1 to 10 by 1; line 180 allocates an element of TA1 (R001 R002), and line 181 updates TA2 (R001 R002) with R003.]

Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
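The array coordinate test can be modeled in the same spirit. The sketch below is Python, not CONCOR; the class, the message wording, and the use of line numbers 180 and 181 as labels are assumptions chosen to mirror the program text superimposed on the printout.

```python
class SubscriptError(Exception):
    """Raised when a table reference falls outside the declared bounds."""


class Table:
    def __init__(self, name, rows, cols, fill=0):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[fill] * cols for _ in range(rows)]

    def _locate(self, row, col, line_no):
        # Subscripts are 1-based, as in the CONCOR examples.
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise SubscriptError(
                f"{self.name} ({row} {col}) OUT OF RANGE LINE NUMBER {line_no}")
        return row - 1, col - 1

    def get(self, row, col, line_no):
        r, c = self._locate(row, col, line_no)
        return self.cells[r][c]

    def update(self, row, col, value, line_no):
        r, c = self._locate(row, col, line_no)
        self.cells[r][c] = value


def copy_table(src, dst, get_line, update_line):
    """Copy every element of src into dst, collecting subscript diagnostics."""
    errors = []
    for row in range(1, src.rows + 1):
        for col in range(1, src.cols + 1):
            try:
                dst.update(row, col, src.get(row, col, get_line), update_line)
            except SubscriptError as exc:
                errors.append(str(exc))
    return errors


ta1 = Table("TA1", 10, 10, fill=1)
ta2 = Table("TA2", 5, 5)
errors = copy_table(ta1, ta2, get_line=180, update_line=181)
print(len(errors), "subscript errors; first:", errors[0])
```

Only the 25 references that exist in the smaller table succeed; each of the remaining references produces a message that identifies both the element and the source line.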
[Execution Statistics printout, third test: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. Superimposed program text, lines 214-218: IF IN-RECORD-COUNT is greater than 500 THEN STOP-RUN END-THEN.]

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN Parsing began again with the next string.
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string.
PREFACE
Since June 1979 a major redesign of the COBOL CONCOR edit and imputation system has been undertaken by the International Statistical Programs Center (ISPC) of the US Department of Commerce, Bureau of the Census. A one-day program held October 10, 1979 previewed enhancements which were planned for the system. Based upon the information furnished at that workshop, I undertook an interim review of the state of completion of the COBOL CONCOR package. The result of that review was a working document entitled Report on the Developing COBOL CONCOR Edit and Imputation System. At the time of that writing the system was not sufficiently complete to gauge definitively its adequacy for exportation to developing countries. This current publication, COBOL CONCOR 1980: Its Adequacy and State of Completion, while substantial in its own regard, can best be understood in light of that previous report.
On January 7-19, 1980 I attended a workshop designed to provide participants with an in-depth explanation of the full range of capabilities the new COBOL CONCOR supports. During this time I was able to learn the new CONCOR language and conduct tests bearing on the adequacy and completeness of the system. Results from these test programs comprise parts of many of the Appendices.
On January 18, 1980 I was debriefed at the Office of Population, Agency for International Development, Rosslyn, Virginia, on the specific areas which compose the body of this report. In this instance, any comments of a critical nature about CONCOR must be preceded by a statement attesting to the competence and dedication of the ISPC staff, who have done an extraordinary job in redesigning and rewriting many of the programs comprising the system since October 10, 1979.
Though my experience with systems analysis and design utilizing the COBOL programming language encompasses three years, local circumstances and specialization are important considerations. The discussions of this report are based on my overall experience in the data processing field and how I think it applies to the development of CONCOR. At the time of the writing of this report I am the Senior Systems Analyst and the Director of Data Base Administration for the Georgia World Congress Institute, a state-operated nonprofit research organization located in Atlanta, Georgia.
CONTENTS
PREFACE
EXECUTIVE SUMMARY
I BACKGROUND
II THE ADEQUACY OF CONCOR
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
V CONCLUSION
APPENDICES
Appendix A - Bucen Enforcement Proposal
Appendix B - Evaluative Criteria
Appendix C - Workshop Itinerary
Appendix D - Participants
Appendix E - CONCOR Evaluation Form
Appendix F - New/Old Command Comparisons
Appendix G - ISPC Future Enhancement List
Appendix H - CONCOR System Internal Variables
Appendix I - CONCOR-EDITOR Execution Statistics
Appendix J - Diagnostic Message Guide Example
EXECUTIVE SUMMARY
The December 1979 COBOL CONCOR (Version 2) is a much improved software package. All commands appear to be functional; however, the system should be exhaustively tested by an independent agency prior to its general release. This agency should also precisely determine the system's relative speed and core processor requirements. While the system (exclusive of documentation) could immediately be utilized in a situation of extreme need, some CONCOR language coding inconsistencies detract from the learnability and exportability of the package and should be corrected. Additionally, there are other modifications or adjustments which would enhance the overall utility and productivity of the language in census and survey applications.
System documentation continues to be a problem. There is no Users Guide. The Systems Manual, though well-constituted informationally, should be thoroughly reorganized in accordance with the guidelines set forth in this report.
The staff of ISPC exhibited competence and professionalism in the conduct of the two-week workshop, January 7-18, 1980. ISPC generally is aware of both the potential and the shortcomings of the CONCOR project. The current CONCOR version represents a significant redesign of the overall system. As a package, it is in a state where its completion is within reach.
I BACKGROUND
CONCOR (an acronym of Consistency and Correction) is best characterized as a software tool designed to expedite the processing of data files during the edit and imputation phase of population censuses and surveys. As a metacompiler written in the COBOL language, the system reads and verifies CONCOR language statements to produce an executable EDITOR program. The objective of this process is the creation of an error-free file which can be used at a later time for tabulation purposes.
Since its release as Version 1 on December 30, 1978, numerous elements of the COBOL CONCOR system have undergone continual change and redefinition. In fact, the system has not been permitted to stand still for any period of time, nor has it been exhaustively tested. In June of 1979 ISPC suspended the further distribution of the COBOL CONCOR system. This decision was based principally upon reports of the package's unsatisfactory performance at workshops held in Panama and Thailand. ISPC, upon its own initiative, developed a proposal to overhaul CONCOR and its accompanying documentation. This proposal, contained in Appendix A, represents an ambitious undertaking. While not all of the desired changes and capabilities could be implemented, Version 2 of December 1979 represents a significant managerial effort. The questions now are whether COBOL CONCOR Version 2 will be a demonstrably adequate software package -- a package capable of exportation to developing countries -- a package requiring no further modification. The purpose of this report is to address these critical issues. In connection with this, Appendix B sets forth the specific criteria around which such a discussion must revolve. As this is not intended to be a compendium, some of these broader issues will be treated immediately; the following chapters and appendices will qualify the exact nature of system alterations already undertaken, as well as further adjustments believed to be essential in realizing the goals of the system's philosophy.
Workshop
During the period of January 7-18, 1980 a workshop was held under the sponsorship of the ISPC to demonstrate the capabilities of the latest revision of the COBOL CONCOR software package. A schedule of events of this workshop is contained in Appendix C. This workshop was intended to provide participants with the opportunity to program in the CONCOR language and thereby to test aspects of the system as individually appropriate. A listing of the participants and the international organizations they represented is contained in Appendix D. During the concluding days of the workshop each participant was asked by ISPC to provide a written evaluation of the now-called December 1979 version of CONCOR. This evaluation form, Appendix E, also includes space for comments concerning the competence of the system documentation, as well as any additional comments, including those regarding the organization and clarity of workshop presentations. It is assumed that in the near future summaries of these comments will be available to interested agencies.
EDITOR is the new name of the EXECUTOR module of previous language versions.
A complete history of the development of COBOL CONCOR can be found in both the December 1978 Version 1 and December 1979 Version 2 systems manuals, as well as in previous consulting reports.
While virtually all instructional aspects of this two-week workshop were conducted in a highly professional manner -- a manner which revealed a high degree of coordination among staff members in their efforts -- there are several areas which future workshops may improve upon:
1. All publications should be assembled in their entirety and proof-read prior to distribution.
2. A complete CONCOR language program example and accompanying I/O documents should be provided at the onset of the workshop for reference.
3. Numerous short application programming problems involving all CONCOR language divisions should be utilized in place of a single lengthy problem.
It is noted that this workshop was not intended to teach the CONCOR language; had it been, the organization and presentation of materials probably would have been different. It is believed that the two-week time period was sufficient to provide participants a familiarity with the use of the new CONCOR features, especially in light of the fact that workshop participants were permitted to work weekends and beyond normal working hours at their discretion. Though funding was not generally available, it is known that several workshop members chose to extend their stay in Washington to continue testing the CONCOR package or to work on projects which they could attempt to install immediately on their home computers. At the conclusion of the workshop participants were permitted to take with them an installation tape of CONCOR as well as all the other materials they had acquired during the course of the project.
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidity with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that, of all the data-cleaning tools available for exportation, CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1. Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers
a. Implementation of the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION, and REPORT-DIVISION. De-emphasis of section headings
b. Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of an equal length to NEW-DATA identifiers. Permit the coding of single dimension row and column vectors in the same manner as multi-dimensional arrays
2. Implementation of selective commands and internal variables to facilitate the production environment use of CONCOR in census applications. These include:
a. LOAD/UNLOAD arrays: commands which would automatically save and replace hot-decked values from batch to batch
b. TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA-CONTROL
3. Other modifications
a. Default values for the MAX-STORAGE parameter set in a realistic range
b. Allowance of more variables for survey applications
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply in a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language, as well as observations and discussions with other workshop participants. While it is probably impossible ever to be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes it obvious that, while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION, and REPORT-DIVISION were not implemented and are therefore preceded by a period so as to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization do not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section name implementation, while seemingly cosmetic, can have a real impact on the package's perceived ease of learning and use.
Figure 1 illustrates common mistakes programmers make coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item, and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, user variables defined in NEW-DATA with a length of 18 could be moved inadvertently to output record fields defined by a data definition statement N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it never would be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is no current way to preserve these values if desired.

FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UIIII-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION

Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes, starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes, starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force recode the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
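The length rules described above (9 for numeric record fields, 4 for alphanumeric fields, 18 digits in NEW-DATA) and the resulting overflow hazard can be made concrete with a small sketch. It is written in Python for illustration; the function names are hypothetical and the parsing of a specification such as N9-23 follows only the description given in the text.

```python
import re

RECORD_FIELD = re.compile(r"^([NA])(\d+)-(\d+)$")
MAX_RECORD_LENGTH = {"N": 9, "A": 4}   # limits stated in the text
MAX_NEW_DATA_DIGITS = 18               # limit stated for NEW-DATA items


def parse_field(spec):
    """Split a record field specification such as N9-23 into its parts."""
    match = RECORD_FIELD.match(spec)
    if match is None:
        raise ValueError(f"malformed field specification: {spec!r}")
    kind = match.group(1)
    length = int(match.group(2))
    start = int(match.group(3))
    if not 1 <= length <= MAX_RECORD_LENGTH[kind]:
        raise ValueError(f"{spec}: length exceeds the limit for type {kind}")
    if start < 1:
        raise ValueError(f"{spec}: starting position must be 1 or greater")
    return kind, length, start


def fits_output_field(value, target_spec):
    """Report whether a NEW-DATA integer can be moved to a record field intact."""
    _, length, _ = parse_field(target_spec)
    digits = len(str(abs(value)))
    if digits > MAX_NEW_DATA_DIGITS:
        raise ValueError("value exceeds the NEW-DATA maximum of 18 digits")
    return digits <= length


print(parse_field("N9-23"))
print(fits_output_field(1234567890, "N9-23"))
```

A check of this kind, applied when an 18-digit NEW-DATA item is moved to a 9-digit output field, would surface the data error described in the text at the point where it is introduced.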
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e. A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
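The forced recoding of alphanumeric fields can be sketched as follows. This Python fragment is an illustration only: the report states that at most three comparison strings are allowed and that an unmatched value receives a unique negative value, but the specific codes used here (1, 2, 3 in order of declaration, and -1 for no match) are assumptions.

```python
MAX_COMPARISON_STRINGS = 3  # limit stated in the text


def make_recoder(comparison_strings, unmatched_code=-1):
    """Build a function that recodes an alphanumeric value to a number.

    Codes 1..n follow the order of the comparison strings; that numbering
    and the value -1 for unmatched input are assumed for illustration.
    """
    if not 1 <= len(comparison_strings) <= MAX_COMPARISON_STRINGS:
        raise ValueError("between 1 and 3 comparison strings are required")
    codes = {text: i for i, text in enumerate(comparison_strings, start=1)}

    def recode(raw):
        return codes.get(raw.strip(), unmatched_code)

    return recode


recode_sex = make_recoder(["M", "F"])
print([recode_sex(v) for v in ["M", "F", "X"]])
```

The sketch also shows why the three-string limit is restrictive: a field that legitimately uses more than three letters cannot be declared at all, and every undeclared value collapses to the same negative code, so the original character is lost on output.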
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single dimension arrays, i.e. allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
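A uniform treatment of vectors and multi-dimensional arrays, as suggested above, might look like the following sketch. It is Python, not CONCOR, and several points are assumptions: the function names, the reading of magnitude as a maximum number of digits per element, and row-major storage of the initial values.

```python
from math import prod


def declare_array(name, shape, magnitude, values):
    """Declare an array of 1 to 5 dimensions with initial (cold deck) values.

    A row vector is simply shape (1, n) and a column vector shape (n, 1),
    so vectors need no special coding rule.
    """
    if not 1 <= len(shape) <= 5:
        raise ValueError("arrays may have from 1 to 5 dimensions")
    if any(size < 1 for size in shape):
        raise ValueError("every dimension must have at least one element")
    if len(values) != prod(shape):
        raise ValueError("number of initial values does not match the shape")
    limit = 10 ** magnitude  # assumed meaning of magnitude: digits per element
    if any(abs(v) >= limit for v in values):
        raise ValueError("an initial value exceeds the declared magnitude")
    return {"name": name, "shape": tuple(shape), "values": list(values)}


def element(array, *subscripts):
    """Fetch one element using 1-based subscripts in row-major order."""
    if len(subscripts) != len(array["shape"]):
        raise IndexError("wrong number of subscripts")
    offset = 0
    for sub, size in zip(subscripts, array["shape"]):
        if not 1 <= sub <= size:
            raise IndexError("subscript out of range")
        offset = offset * size + (sub - 1)
    return array["values"][offset]


row = declare_array("A06", (1, 4), 2, [16, 18, 21, 23])
col = declare_array("A06", (4, 1), 2, [16, 18, 21, 23])
print(element(row, 1, 3), element(col, 3, 1))
```

Under this scheme the same declaration rule covers every case, which is the consistency the text argues for.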
A major requirement of COBOL CONCOR file processing concerns the fact that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of discrete data files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e. preventing the program from starting with cold values each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing would be reduced to a minimum between batches and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked, and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion in the package prior to its general distribution.

FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
[Cold deck table for A05: columns AGE OF HUSBAND 12-17, 18-24, 25-35, 36+; rows RELATION HEAD, CHILD, OTHER, NONRELATIVE]
Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows: user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start up values. In the example, A05 is a two dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error message below.)
WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-9lI) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e. A06 must be coded as follows. (Example of how vectors must currently be coded to be correct.)
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
The parameters are: user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start up values.
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
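The proposed LOAD/UNLOAD ARRAYS behavior for hot deck values can be sketched as below. This is a Python illustration under stated assumptions: the class and method names are hypothetical, the hot deck is keyed by a single category, and a JSON file stands in for whatever storage a real implementation would use between volumes.

```python
import json
import os
import tempfile


class HotDeck:
    """Hot deck keyed by category; names and file format are assumptions."""

    def __init__(self, cold_values):
        self.values = dict(cold_values)

    def impute(self, key, reported):
        """Return the reported value, or the latest good value if it is missing."""
        if reported is None:
            return self.values[key]
        self.values[key] = reported  # a good value refreshes the deck
        return reported

    def unload(self, path):
        """Save the current hot values at the end of a batch."""
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(self.values, handle)

    @classmethod
    def load(cls, path):
        """Start the next batch from saved hot values instead of cold ones."""
        with open(path, encoding="utf-8") as handle:
            return cls(json.load(handle))


with tempfile.TemporaryDirectory() as workdir:
    saved = os.path.join(workdir, "hot_deck.json")
    deck = HotDeck({"HEAD": 30, "CHILD": 8})
    first_batch = [deck.impute(k, v) for k, v in [("HEAD", 41), ("CHILD", None)]]
    deck.unload(saved)
    resumed = HotDeck.load(saved)
    second_batch = [resumed.impute(k, v) for k, v in [("HEAD", None), ("CHILD", 5)]]

print(first_batch, second_batch)
```

In the second batch the missing HEAD value is imputed from the value observed in the first batch rather than from the cold value, which is the benefit the text attributes to automatic transfer between volumes.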
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when termination of processing based on run counts would be advantageous although a CONTROL-AREA has been specified, e.g. debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
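The distinction between counters that reset at each control area break and counters that accumulate over the whole run can be sketched as follows. The Python names mirror IN-RECORD-COUNT for the resetting counter; the cumulative counter is hypothetical and stands for the additional pointers recommended in the text.

```python
class RunCounters:
    def __init__(self):
        self.current_area = None
        self.in_record_count = 0      # resets at each control area break
        self.total_record_count = 0   # hypothetical run-level counter

    def read(self, area_key):
        """Count one input record belonging to the given control area."""
        if area_key != self.current_area:
            self.current_area = area_key
            self.in_record_count = 0
        self.in_record_count += 1
        self.total_record_count += 1


def process(area_keys, stop_after=None):
    """Read records in order, optionally stopping on the run-level count."""
    counters = RunCounters()
    for key in area_keys:
        counters.read(key)
        if stop_after is not None and counters.total_record_count >= stop_after:
            break
    return counters


keys = ["01", "01", "02", "02", "02", "03"]
full = process(keys)
partial = process(keys, stop_after=4)
print(full.in_record_count, full.total_record_count)
print(partial.in_record_count, partial.total_record_count)
```

With only the resetting counter, a test such as stopping after a fixed number of records cannot be expressed once a control area is defined, because the count never exceeds the size of one area.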
One clearly disturbing development which needs to be pursued during inshydepth testing of the system concerns the MAX-STORAGE parameters of the DEFINE-RECORD statement As shown in the figure on the following page when MAX-STORAGE was set equal to the maximum value a COBOL program was generated whichrequired 1O00K of core to run The MAX-STORAGE value of 999 is clearly notrealistic under most processing circumstances This example drives home severalimportant points about CONCOR The core requiremenis of CONCOR generated proshygrams can be influenced significantly by the amount or nature of programmerspecified I0 operations In fact it is possible to generate a program of a size most foreign country machines could not process It is recommended that tests determine a realistic max-value restriction for implementation to prevent problems in this area
The final area of recommended modification concerns the newly implementedREPORT-DIVISION The purpose of the REPORT-DIVISION is to enable a user todescribe or specify certain CONCOR language statements which will generatestatistical reports These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION4-DIVISIONAll of the reports produced are organized according to the data fields definedby the AREA-CONTROL command of the DATA-DICTIONARY If the AREA-CONTROL command is not defined in the DATA-DICTIONARY then all the statistics aresummarized at the total run level If a control area field is defined then allstatistics will be summarized for each unique CONTROL AREA as encountered bythe EDITCR program on the input file Statistics by total run level will notbe available This in part relates back to previous discussions citing theneed for new internal identifiers Report listings may contain the values ofentire records or entire questionnaires depending upon the keyword used inthe report generation commands The problem centers upon the homogeneity of CONCOR printouts during a production run
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3
CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING
LINE NUMBER
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
74 RECORD-TYPE= (value illegible)
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
270 RECORD-TYPE= (value illegible)
JOB LOG FOR STEP LOAD
CPU 0MIN 01.65SEC VIRT 1000K
STEP COMPLETION CODE - 0
REGION (REQUESTED - 1000K USED - 1000K) IO COUNT 211
CARDS READ - 0 CARDS PUNCHED - 0 LINES PRINTED - 22
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. Tests of the various commands also revealed the generally high performance of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message number more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but is aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation form setting forth how CONCOR was installed on each computer and any modifications required during installation could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well concerning the type of computer, amount of core and nature of utilities supporting users. Upon installation a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix
for reference
3 The development of the Users Guide should include an intensive
review of the editing concepts involved in processing census
data files beyond the POPSTAN materials
4 An explanation of the CONCOR benchmark program should appear
in the Users Guide and the Systems Manual The running of a
supplied benchmark program should be a standard installation
protocol used to test all operational aspects of a new
installation
5 The development of a CONCOR pocket card. This tool, which has
been proven useful in teaching and assisting programmers in
utilizing programming languages, lays out all command options on
a single small card. An example of such a pocket card is the
IBM Assembler Card. Such a small document is useful in clarifying
the language and setting forth structural rules without
continual reference to full-size manuals
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of the system would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall COBOL CONCOR remains an intermediate product yet to realize its potential which is thought considerable Software development of this nature is characteristically of a long-term evolutionary design In light of this reality agencies should be prepared to evaluate such projects success or failure in terms other than that of immediate utility It is the philosophy underlying the system not the system itself which will ultimately be exported
APPENDIX A
Bucen Enhancement Proposal
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
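Items 3C and 3D of the proposal (tolerance checking of error rates by area, and capture of the IDs of areas failing the check) can be illustrated with a minimal sketch. It is written in Python rather than in the CONCOR language, and the function name, the data layout and the definition of the error rate as changed items divided by examined items are all assumptions made for illustration; they are not taken from the CONCOR implementation.

```python
def areas_failing_tolerance(area_stats, tolerance):
    """Return the IDs of areas whose error rate exceeds `tolerance`.

    `area_stats` maps a hypothetical area ID to a pair
    (items_changed, items_examined). The error rate is assumed to be
    items_changed / items_examined; areas with nothing examined pass.
    """
    failing = []
    for area_id, (changed, examined) in area_stats.items():
        rate = changed / examined if examined else 0.0
        if rate > tolerance:
            failing.append(area_id)
    return failing


# Hypothetical statistics for three areas and a 10 percent tolerance.
stats = {"001": (5, 100), "002": (12, 100), "003": (0, 0)}
failed = areas_failing_tolerance(stats, 0.10)
```

Under these assumptions only area "002" (12 percent) exceeds the tolerance, so it alone would be captured for follow-up.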
APPENDIX B
EVALUATIVE CRITERIA
APPENDIX B
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
FROM DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [word illegible] debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on the completeness, usefulness and efficiency of the system and on its ease of use by developing country programmers and other census personnel
cc [name illegible] Olds APHA; [illegible]; Julio Ortuzar Delta Systems
APPENDIX C
WORKSHOP ITINERARY
APPENDIX C
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
Overview of Workshop
1000 - 1030 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
200 - 230 User Program Organization
- Divisions
- Sections
- Routines
- Commands
230 - 245 Break
245 - 325 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
930 am - 1030 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
1200 - 115 pm Break
115 pm - 215 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
930 am - 1030 Discussion of Participants Dictionary problems
Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
1200 - 115 pm Break
115 pm - 215 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
215 - 230 Break
230 - 325 - PassFail clauses
- List
Thursday January 10
930 am - 1030 Discussion of Problems
Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
1200 - 115 pm Break
115 pm - 215 - If
- UntilExit
- Stop
215 - 230 Break
230 - 325 - Drecode
- Grecode
Friday January 11
930 am - 1030 Edit Specification Command Statements (continued)
- Output
- Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
- Report Examples
1200 - 115 pm Break
115 pm - 325 Hand out Edit Specifications as problem
Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am - 1030 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR
- Submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
1200 - 115 pm Break
115 - 325 Free work time
APPENDIX D
PARTICIPANTS
APPENDIX D
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer andor in training users on the packages applicability to their work
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software andor its documentation
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, the user has the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items which have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command
EXIT Provides the means to terminate execution (escape) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode
ERROR not implemented No longer supported in language
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction permitting the specification of logical command paths depending upon the outcome of the range test
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system
STOP keyword STOP keyword Unchanged
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements
UPDATE (UPD) UPDATE (UPD) Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
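The difference between the two recoding commands described in the table can be sketched as follows. This is an illustration in Python, not CONCOR syntax, and the function names, argument layouts and error handling are assumptions: DRECODE is modelled as a direct table lookup for codes that start at zero and ascend, and GRECODE as a group recode that maps ranges of input values onto a single output code.

```python
from bisect import bisect_left


def drecode(value, table):
    """Direct recode: codes 0, 1, 2, ... index straight into `table`."""
    if not 0 <= value < len(table):
        raise ValueError(f"value {value} is outside the recode table")
    return table[value]


def grecode(value, upper_bounds, codes):
    """Group recode: return the code of the first group whose inclusive
    upper bound is >= value. `upper_bounds` must be sorted ascending and
    the first group is assumed to start at zero."""
    if value < 0:
        raise ValueError(f"value {value} is below the first group")
    index = bisect_left(upper_bounds, value)
    if index == len(codes):
        raise ValueError(f"value {value} is above the last group")
    return codes[index]


# Hypothetical tables: relationship codes 0-2, and ages grouped
# as 0-4, 5-14 and 15-64.
relationship = drecode(2, [10, 20, 30])
age_group = grecode(5, [4, 14, 64], [1, 2, 3])
```

With these hypothetical tables, code 2 is recoded to 30 and age 5 falls in the second group. The direct form needs no searching at all, which is the reason a lookup-table implementation is attractive for items whose codes are dense and start at zero.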
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
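The presorting requirement follows from the way statistics are summarized for each control area "as encountered" on the input file. A minimal sketch of that behaviour, assuming a simple control-break loop, is given below in Python. The function name and the (area, error count) record layout are hypothetical, and the run-level totals it also returns are not produced by the current system; they correspond to the additional identifiers recommended elsewhere in this report.

```python
from itertools import groupby


def summarize_by_area(records):
    """Summarize (area, error_count) records in file order.

    Returns a list of (area, questionnaires, errors) tuples, one per
    contiguous run of the same area, and a (questionnaires, errors)
    total for the whole run.
    """
    summaries = []
    total_questionnaires = total_errors = 0
    for area, group in groupby(records, key=lambda record: record[0]):
        counts = [errors for _, errors in group]
        summaries.append((area, len(counts), sum(counts)))
        total_questionnaires += len(counts)
        total_errors += sum(counts)
    return summaries, (total_questionnaires, total_errors)


# Hypothetical input in which area "A" is not contiguous.
unsorted_file = [("A", 1), ("A", 0), ("B", 2), ("A", 3)]
split_summaries, run_total = summarize_by_area(unsorted_file)
merged_summaries, _ = summarize_by_area(sorted(unsorted_file, key=lambda r: r[0]))
```

On the unsorted file area "A" is reported twice, once for each contiguous run, while the presorted file yields a single summary per area. The run total is the same in both cases.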
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
G-1
Appendix G
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
11 Software
111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement the dictionary and the generated COBOL program (EDITOR) would require modification
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is written out using the OUTPUT command
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
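Item (13) above, an automatic range check driven by valid values declared in the dictionary, might work along the following lines. The sketch is in Python rather than CONCOR, and the field names, the ranges and the dictionary layout are hypothetical; the proposal does not specify how the declared ranges would be stored or reported.

```python
# Hypothetical dictionary entries: field name -> (lowest, highest) valid value.
FIELD_RANGES = {"AGE": (0, 98), "SEX": (1, 2)}


def range_failures(record, field_ranges):
    """Return the names of fields that are missing or out of range.

    `record` maps field names to numeric values. Fields are checked in
    the order in which they are declared in `field_ranges`.
    """
    failures = []
    for name, (low, high) in field_ranges.items():
        value = record.get(name)
        if value is None or not low <= value <= high:
            failures.append(name)
    return failures


# A record with an impossible age fails only on AGE.
failed_fields = range_failures({"AGE": 130, "SEX": 1}, FIELD_RANGES)
```

Because the ranges live with the field declarations, the same table could serve both the automatic check and any tabulation software reading the dictionary, which is the second benefit the item mentions.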
112 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes
(4) The code generated in the EDITOR program for the RECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implement a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
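The lookup table approach suggested in item (4) can be contrasted with a linear search in a few lines. The sketch below is in Python and is not the code EDITOR generates; the function names and the recode pairs are assumptions chosen only to show that a table built once gives the same result as comparing against each pair in turn.

```python
def recode_linear(value, pairs, default):
    """Linear search: compare the value against each (old, new) pair in turn."""
    for old, new in pairs:
        if value == old:
            return new
    return default


def build_lookup(pairs):
    """Build the table once; the first pair for a given old value wins,
    which matches the behaviour of the linear search above."""
    table = {}
    for old, new in pairs:
        table.setdefault(old, new)
    return table


def recode_lookup(value, table, default):
    """Single table access instead of one comparison per pair."""
    return table.get(value, default)


# Hypothetical recode pairs; the duplicate entry for 2 is deliberate.
PAIRS = [(1, 10), (2, 20), (3, 30), (2, 99)]
TABLE = build_lookup(PAIRS)

print(recode_linear(2, PAIRS, -1), recode_lookup(2, TABLE, -1))
```

The table has to be built from the dictionary contents before execution begins, which is presumably why the item anticipates extra communication between the GENED and GENDD programs.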
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system, because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
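The weighting scheme of item (5) amounts to multiplying each edited value by the sampling weight of its questionnaire before totals and means are formed. The following Python sketch shows the arithmetic only; the field names ("persons", "weight") and the sample figures are invented for illustration and do not come from any CONCOR file layout.

```python
def weighted_totals(records, value_key, weight_key="weight"):
    """Return (weighted total, weighted mean) of value_key over the records.

    The mean is None when there are no records or the weights sum to zero.
    """
    total_weight = sum(r[weight_key] for r in records)
    weighted_sum = sum(r[value_key] * r[weight_key] for r in records)
    mean = weighted_sum / total_weight if total_weight else None
    return weighted_sum, mean


# Two sampled housing units; each weight is the number of units it represents.
sample = [
    {"persons": 4, "weight": 10},
    {"persons": 2, "weight": 30},
]

print(weighted_totals(sample, "persons"))
```

In this sample the weighted total is 4 x 10 + 2 x 30 = 100 persons over 40 represented units, a mean of 2.5, whereas the unweighted mean of the two sampled units would be 3.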
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
Old CONCOR internal variables
 1 EOF-FLAG
 2 ERROR-IN-QUESTIONNAIRE
 3 CURRENT-POINTER-VALUE
 4 TYPE-COUNT ( )
 5 RECORD-COUNT
 6 QUESTIONNAIRE-COUNT
 7 RECORDS-IN-STORE
 8 INVALID-RECORD-TYPE-COUNT
 9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

New CONCOR internal variables
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT

Notes
Change in spelling of variable.
Not mentioned -- possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset -- most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g. IN-, OUT-.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page, with the test program text superimposed. Column headings on the page: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND.]

Messages printed on the page:
DIVISION BY ZERO LINE NUMBER 118 (printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS

Program text superimposed on the page (legible lines only):
114 LIST DIVISION BY ZERO ERROR FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS
118 LET R003 = R002 / R001
122 END TEST OF OPERANDS AND MSG

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason -- ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
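The behaviour demonstrated on this page, namely reporting each data exception with its source line number and aborting only after a limit is reached, can be sketched as follows. The sketch is in Python, not in the generated COBOL; the statement format, the RunAborted exception and the max_errors limit are assumptions made for illustration, and integer division is used because the report elsewhere lists floating point calculation as a future enhancement.

```python
class RunAborted(Exception):
    """Raised when the number of division-by-zero errors reaches the limit."""


def run(statements, registers, max_errors=5):
    """Execute (line_number, target, numerator, denominator) statements.

    A division by zero is logged with its source line number instead of
    crashing the program; the run is aborted once max_errors such
    messages have been issued. Returns the message log on normal completion.
    """
    log = []
    errors = 0
    for line_no, target, num, den in statements:
        if registers[den] == 0:
            errors += 1
            log.append(f"DIVISION BY ZERO LINE NUMBER {line_no}")
            if errors >= max_errors:
                log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted("\n".join(log))
            continue  # skip the faulty statement and keep processing
        registers[target] = registers[num] // registers[den]
    return log


# Line 118 of the test program divides R002 by R001, which was set to zero.
registers = {"R001": 0, "R002": 6, "R003": 0}
try:
    run([(118, "R003", "R002", "R001")] * 6, registers, max_errors=3)
except RunAborted as exc:
    print(exc)
```

The error limit is made a parameter here because the Run Control Section proposed in Appendix G would let the user set new maximum values for the EDITOR error limits.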
[Printout: second CONCOR-EDITOR EXECUTION STATISTICS page, with the test program text superimposed. The error messages printed on this page are illegible in this copy.]

Program text superimposed on the page (legible lines only):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2= TA2 (R001 R002)
186 END-DO
187 END-DO

Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Non-existent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Printout: third CONCOR-EDITOR EXECUTION STATISTICS page, with the test program text superimposed. Column headings on the page: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND.]

Messages printed on the page:
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Program text superimposed on the page (legible lines only):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
CONTENTS
PREFACE
EXECUTIVE SUMMARY
I BACKGROUND
II THE ADEQUACY OF CONCOR
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
V CONCLUSION
APPENDICES
Appendix A - Bucen Enforcement Proposal
Appendix B - Evaluative Criteria
Appendix C - Workshop Itinerary
Appendix D - Participants
Appendix E - CONCOR Evaluation Form
Appendix F - New/Old Command Comparisons
Appendix G - ISPC Future Enhancement List
Appendix H - CONCOR System Internal Variables
Appendix I - CONCOR-EDITOR Execution Statistics
Appendix J - Diagnostic Message Guide Example
EXECUTIVE SUMMARY
The December 1979 COBOL CONCOR (Version 2) is a much improved software package. All commands appear to be functional; however, the system should be exhaustively tested by an independent agency prior to its general release. This agency should also precisely determine the system's relative speed and core processor requirements. While the system (exclusive of documentation) could immediately be utilized in a situation of extreme need, some CONCOR language coding inconsistencies detract from the learnability and exportability of the package and should be corrected. Additionally, there are other modifications or adjustments which would enhance the overall utility and productivity of the language in census and survey applications.
System documentation continues to be a problem. There is no User's Guide. The Systems Manual, though well-constituted informationally, should be thoroughly reorganized in accordance with the guidelines set forth in this report.
The staff of ISPC exhibited competence and professionalism in the conduct of the two-week workshop of January 7-18, 1980. ISPC generally is aware of both the potential and the shortcomings of the CONCOR project. The current CONCOR version marks a significant redesign of the overall system. As a package, it is in a state where its completion is within reach.
I BACKGROUND
CONCOR (an acronym of Consistency and Correction) is best characterized as a software tool designed to expedite the processing of data files during the edit and imputation phase of population censuses and surveys. As a metacompiler written in the COBOL language, the system reads and verifies CONCOR language statements to produce an executable EDITOR program. The objective of this process is the creation of an error-free file which can be used at a later time for tabulation purposes.
Since its release as Version 1 on December 30, 1978, numerous elements of the COBOL CONCOR system have undergone continual change and redefinition. In fact, the system has not been permitted to stand still for any period of time, nor has it been exhaustively tested. In June of 1979 ISPC suspended the further distribution of the COBOL CONCOR system. This decision was based principally upon reports of the package's unsatisfactory performance at workshops held in Panama and Thailand. ISPC, upon its own initiative, developed a proposal to overhaul CONCOR and its accompanying documentation. This proposal, contained in Appendix A, represents an ambitious undertaking. While not all of the desired changes and capabilities could be implemented, Version 2 of December 1979 represents a significant managerial effort. The questions now are whether COBOL CONCOR Version 2 will be a demonstrably adequate software package -- a package capable of exportation to developing countries -- a package requiring no further modification. The purpose of this report is to address these critical issues. In connection with this, Appendix B sets forth the specific criteria around which such a discussion must evolve. As this is not intended to be a compendium, some of these broader issues will be treated immediately; following chapters and appendices will qualify the exact nature of system alterations already undertaken, as well as further adjustments believed to be essential in realizing the goals of the system's philosophy.
Workshop
During the period of January 7-18, 1980 a workshop was held under the sponsorship of the ISPC to demonstrate the capabilities of the latest revision of the COBOL CONCOR software package. A schedule of events of this workshop is contained in Appendix C. This workshop was intended to provide participants with the opportunity to program in the CONCOR language and thereby to test aspects of the system as individually appropriate. A listing of the participants and the international organizations they represented is contained in Appendix D. During the concluding days of the workshop each participant was asked by ISPC to provide a written evaluation of what is now called the December 1979 version of CONCOR. This evaluation form, Appendix E, also includes space for comments concerning the competence of the system documentation, as well as any additional comments, including those regarding the organization and clarity of workshop presentations. It is assumed that in the near future summaries of these comments will be available to interested agencies.
EDITOR is the new name of the EXECUTOR module of previous language versions.
A complete history of the development of COBOL CONCOR can be found in both the December 1978 Version 1 and December 1979 Version 2 systems manuals, as well as in previous consulting reports.
While virtually all instructional aspects of this two-week workshop were conducted in a highly professional manner -- a manner which revealed a high degree of coordination among staff members in their efforts -- there are several areas which future workshops may improve upon:
1. All publications should be assembled in their entirety and proof-read prior to distribution.
2. A complete CONCOR language program example and accompanying I/O documents should be provided at the onset of the workshop for reference.
3. Numerous short application programming problems involving all CONCOR language divisions should be utilized in place of a single lengthy problem.
It is noted that this workshop was not intended to teach the CONCOR language; had it been, the organization and presentation of materials probably would have been different. It is believed that the two-week time period was sufficient to provide participants a familiarity with the use of the new CONCOR features, especially in light of the fact that workshop participants were permitted to work weekends and beyond normal working hours at their discretion. Though funding was not generally available, it is known that several workshop members chose to extend their stay in Washington to continue testing the CONCOR package or to work on projects which they could attempt to install immediately on their home computers. At the conclusion of the workshop participants were permitted to take with them an installation tape of CONCOR as well as all the other materials they had acquired during the course of the project.
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidity with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that of all the data-cleaning tools available for exportation CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1. Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers.
a. Implementation of the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION. De-emphasis of section headings.
b. Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of a length equal to NEW-DATA identifiers. Permit the coding of single dimension row and column vectors in the same manner as multi-dimensional arrays.
2. Implementation of selective commands and internal variables to facilitate the production environment use of CONCOR in census applications. These include:
a. LOAD/UNLOAD arrays: commands which would automatically save and replace hot-decked values from batch to batch.
b. TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA CONTROL.
3. Other modifications.
a. Default values for the max-storage parameter set in a realistic range.
b. Allowance of more variables for survey applications.
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply in a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language, as well as observations and discussions with other workshop participants. While it is probably impossible ever to be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes it obvious that, while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of the editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period so as to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization does not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section name implementation, while seemingly cosmetic, can have a real impact on the perceived ease of learning and using the language.
Figure 1 on the following page illustrates common mistakes programmers make in coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, user variables defined in NEW-DATA as N18 could be moved inadvertently to output record fields defined by a data definition statement such as N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of a numeric item is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force recode the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
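The coding rules set out in these explanations can be restated as a small validation sketch. It is written in Python and is not part of CONCOR; parse_item and recode_alpha are hypothetical helpers, the recoding of a match to its position in the list is an assumption (the report says only that the data are forced to a numeric value), and the value -1 stands in for the unspecified negative value that EDITOR assigns when no comparison string matches.

```python
import re

# Maximum lengths outside NEW-DATA, as quoted in the Figure 1 explanations.
MAX_LEN = {"N": 9, "A": 4}


def parse_item(spec):
    """Split a data item such as 'N2-4' into (type, length, start column)."""
    m = re.fullmatch(r"([NA])(\d+)-(\d+)", spec)
    if m is None:
        raise ValueError(f"malformed data item: {spec!r}")
    kind, length, start = m.group(1), int(m.group(2)), int(m.group(3))
    if not 1 <= length <= MAX_LEN[kind]:
        raise ValueError(f"{spec!r}: length must be 1 to {MAX_LEN[kind]}")
    return kind, length, start


def recode_alpha(value, comparison_strings):
    """Recode an alphanumeric value to the 1-based position of its match.

    At most 3 comparison strings are accepted, mirroring the stated limit.
    An unmatched value is given a negative code (-1 as a stand-in).
    """
    if len(comparison_strings) > 3:
        raise ValueError("at most 3 comparison strings are permitted")
    for code, text in enumerate(comparison_strings, start=1):
        if value == text:
            return code
    return -1


print(parse_item("N2-4"), parse_item("A4-2"))
print(recode_alpha("F", ["M", "F"]), recode_alpha("X", ["M", "F"]))
```

A length of 18, which is legal in NEW-DATA, is rejected here for an N item, which is the inconsistency between the two contexts that the text describes.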
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it would never be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is currently no way to preserve these values if desired.
An unwieldy alternative, which may be acceptable under some circumstances, would be to expand the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e. A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single dimension arrays, i.e. to allow a single row vector as well as a single column vector, so as to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing is that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of discrete data files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recently imputed values, i.e. to prevent the program from starting with cold values in each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing between batches would be reduced to a minimum and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion in the package prior to its general distribution.
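The effect of a LOAD/UNLOAD ARRAYS facility on batch processing can be sketched as follows. The code is Python, not CONCOR; the function names merely echo the proposed command, the JSON file is an arbitrary stand-in for whatever storage such a command would use, and the cold-deck values are invented. It assumes the cold deck holds a starting value for every key that can occur.

```python
import json
import os
import tempfile


def impute(records, deck, key_field, value_field):
    """Hot-deck pass: valid values refresh the deck, missing ones are filled."""
    for rec in records:
        key = str(rec[key_field])
        if rec[value_field] is None:
            rec[value_field] = deck[key]   # take the most recent donor value
        else:
            deck[key] = rec[value_field]   # this record becomes the donor
    return records


def unload_arrays(deck, path):
    """Save the hot values at the end of a batch."""
    with open(path, "w") as fh:
        json.dump(deck, fh)


def load_arrays(path, cold_deck):
    """Return the saved hot values, or a copy of the cold deck on the first batch."""
    if not os.path.exists(path):
        return dict(cold_deck)
    with open(path) as fh:
        return json.load(fh)


def demo():
    cold = {"1": 30, "2": 28}            # illustrative cold-deck ages by sex code
    batch1 = [{"sex": 1, "age": 45}, {"sex": 1, "age": None}]
    batch2 = [{"sex": 1, "age": None}]   # a later volume of the same file
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hotdeck.json")
        for batch in (batch1, batch2):
            deck = load_arrays(path, cold)
            impute(batch, deck, "sex", "age")
            unload_arrays(deck, path)
    return batch1[1]["age"], batch2[0]["age"]


print(demo())
```

Without the load step the second batch would start from the cold value (30 in this sketch) instead of the value carried forward from the first batch, which is the loss of hot-deck benefit the text describes.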
FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
[Cold deck values: a 4 X 4 table with columns AGE OF HUSBAND 12-17, 18-24, 25-35, 36+ and rows RELATION HEAD, CHILD, OTHER, NONRELATIVE. The values are only partly legible in this copy.]
Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:
user-identifier D R C M
where D is the number of dimensions, R the number of rows, C the number of columns and M the magnitude of an element, followed by the initial start up values.
In the example, A05 is a two dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error message below.)
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4 (listing line 587)
WARNING (DD-207) COMMAND TERMINATOR ( ) NOT FOUND ( ) ASSUMED PRESENT
(2) ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e. A06 must be coded as follows. (Example of how vectors must currently be coded to be correct.)
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
The parameters are: user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start up values.
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
9
and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g., debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in the figure on the following page, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum-value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file, and statistics at the total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
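The two reporting recommendations above (statistics at both the control-area level and the total run level, and a user-supplied report label) can be illustrated with a short Python sketch. It is an illustration only, not a description of how EDITOR accumulates statistics; the function names, the record layout, and the label text are assumptions.

```python
from collections import Counter

def summarize(records):
    """Accumulate counts per control area and for the whole run in one pass.

    `records` is an iterable of (area_id, error_count) pairs, one per
    questionnaire, in the order they appear on the input file.
    """
    by_area = {}
    run_total = Counter()
    for area, errors in records:
        counts = by_area.setdefault(area, Counter())
        counts["questionnaires"] += 1
        counts["errors"] += errors
        run_total["questionnaires"] += 1
        run_total["errors"] += errors
    return by_area, run_total

def format_report(label, by_area, run_total):
    """Render a listing that begins with a user-specified label."""
    lines = [f"REPORT: {label}"]
    for area, c in by_area.items():   # areas appear in the order encountered
        lines.append(f"AREA {area}: {c['questionnaires']} questionnaires, {c['errors']} errors")
    lines.append(f"TOTAL RUN: {run_total['questionnaires']} questionnaires, {run_total['errors']} errors")
    return "\n".join(lines)

# Illustrative input: three questionnaires in two control areas.
sample = [("001", 2), ("001", 0), ("002", 5)]
by_area, run_total = summarize(sample)
report = format_report("VOLUME 3 OF 12", by_area, run_total)
```

Keeping a second set of counters that is never reset at a control-area break is the same idea as the additional internal pointers recommended earlier in this section.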
FIGURE 3

CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING

LINE NUMBER
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999

(Job accounting output for the resulting load step, abridged)
STEPNAME - LOAD
VIRT 1000K
REGION (REQUESTED - 1000K USED - 1000K)
STEP CPU TIME - 1.65 SECONDS
STEP COMPLETION CODE - 0
IO COUNT 211
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the generally high performance of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will allow CONCOR to realize the potential of its design, enhance its utility, and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document, one which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, the amount of core, and the nature of the utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix
for reference
3 The development of the Users Guide should include an intensive
review of the editing concepts involved in processing census
data files beyond the POPSTAN materials
4 An explanation of the CONCOR benchmark program should appear
in the Users Guide and the Systems Manual The running of a
supplied benchmark program should be a standard installation
protocol used to test all operational aspects of a new
installation
5 The development of a CONCOR pocket card. This tool, which has
been proven useful in teaching and assisting programmers in
utilizing programming languages, lays out all command options on
a single small card. An example of such a pocket card is the
IBM Assembler Card. Such a small document is useful in clarifying
the language and setting forth structural rules without
continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of the system would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent, and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall COBOL CONCOR remains an intermediate product yet to realize its potential which is thought considerable Software development of this nature is characteristically of a long-term evolutionary design In light of this reality agencies should be prepared to evaluate such projects success or failure in terms other than that of immediate utility It is the philosophy underlying the system not the system itself which will ultimately be exported
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
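Items 3C and 3D of the proposal (tolerance checking of error rates by area, and capture of the IDs of areas failing the check) can be sketched in Python as follows. This is a hedged illustration only: the data layout, the function name `failing_areas`, and the 10 percent tolerance are assumptions, not the behavior of the ERROR-RATE-CHECK or REJECT-FILE commands.

```python
def failing_areas(area_stats, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    `area_stats` maps an area ID to a pair (errors, items_checked).
    An area with no items checked is treated as having a zero error rate.
    """
    failed = []
    for area_id, (errors, items) in area_stats.items():
        rate = errors / items if items else 0.0
        if rate > tolerance:
            failed.append(area_id)
    return failed

# Illustrative statistics for three areas and a user-specified tolerance.
stats = {"001": (5, 100), "002": (30, 200), "003": (0, 50)}
rejected = failing_areas(stats, tolerance=0.10)
```

Here only area 002, with an error rate of 15 percent, exceeds the tolerance, so its ID is the one that would be captured for follow-up.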
APPENDIX B
EVALUATIVE CRITERIA

UNITED STATES GOVERNMENT
Memorandum

TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
FROM DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop

The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:

1 Are the Installation Manual, the System Manual, and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be completely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness, efficiency, and ease of use of the system by developing country programmers and other census personnel.

cc [name illegible] Olds APHA
Julio Ortuzar Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7

9:30 am - 10:00 Welcoming Remarks
Overview of Workshop

10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements

10:30 - 10:45 Break

10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR

12:00 - 1:15 pm Break

1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced

2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands

2:30 - 2:45 Break

2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8

Dictionary Division Command Statements

9:30 am - 10:30
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage

10:30 - 10:45 Break

10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File

12:00 - 1:15 pm Break

1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data

2:15 - 2:30 Break

2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9

9:30 am - 10:30 Discussion of Participants Dictionary problems
Free work time

10:30 - 10:45 Break

10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
  - Generate-Edit-Statistics
  - Count-Imputes
  - Examples

12:00 - 1:15 pm Break

1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
  - Prolog-Routine
  - Filter-Routine
  - Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert

2:15 - 2:30 Break

2:30 - 3:25
- Pass/Fail clauses
- List
Thursday January 10

9:30 am - 10:30 Discussion of Problems
Free work time

10:30 - 10:45 Break

10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let

12:00 - 1:15 pm Break

1:15 pm - 2:15
- If
- Until/Exit
- Stop

2:15 - 2:30 Break

2:30 - 3:25
- Drecode
- Grecode
Friday January 11

9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write

10:30 - 10:45 Break

10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
  - Display-Edit-Statistics
- Tolerance-Control-Section
  - Error-Rate-Check
  - Reject-File
- Report Examples

12:00 - 1:15 pm Break

1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14

9:30 am - 10:30 Discuss procedures for running problems on computer

10:30 - 10:45 Break

10:45 - 12:00 Component Programs of the CONCOR system

12:00 - 1:15 pm Break

Tuesday January 15

9:30 am - 3:25 pm Free work time

Wednesday January 16

9:30 am - 12:00 Free work time

12:00 - 1:15 pm Break

1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS

2:15 - 2:30 Break

2:30 - 3:25 Free work time

Thursday January 17

9:30 am - 3:25 Free work time

1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system

2:15 - 2:30 Break

2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms

2:45 - 3:25 Free work time

Friday January 18

9:30 am - 10:30 Free work time

10:30 - 10:45 Break

10:45 - 12:00 Group Discussion on CONCOR
- submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants

12:00 - 1:15 pm Break

1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd
Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public
Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.

The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.

Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous XRECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
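The comparison above describes GRECODE only as a method for the group recoding of input values; its syntax is not shown here. As a hedged illustration of what group recoding means, the Python sketch below maps an age to one of the four age groups that label the array in Figure 2 (12-17, 18-24, 25-35, 36+). The function name, the group codes 1 to 4, and the treatment of ages below 12 are assumptions made for the example.

```python
import bisect

# Lower bounds of the age groups 12-17, 18-24, 25-35, and 36+.
LOWER_BOUNDS = [12, 18, 25, 36]

def recode_age_group(age):
    """Return the group code (1 to 4) for an age, or None if below the first group."""
    index = bisect.bisect_right(LOWER_BOUNDS, age)
    return index if index > 0 else None

codes = [recode_age_group(a) for a in (11, 12, 17, 18, 35, 36, 70)]
```

A recoded value of this kind is what would typically serve as a row or column subscript into an imputation array.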
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
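Because the input file must be presorted so that all records of a control area are contiguous, a simple pre-run check can catch a file that was not sorted. The Python sketch below is an illustration under stated assumptions: it receives only the sequence of control-area key values in file order, and the function name is hypothetical. It is not part of CONCOR.

```python
def find_noncontiguous_keys(keys):
    """Return control-area keys that reappear after a different key intervened.

    An empty result means every key occupies one contiguous run, which is
    the condition the input file must satisfy before editing.
    """
    seen = set()
    previous = object()   # sentinel that never equals a real key
    offenders = []
    for key in keys:
        if key != previous:
            # A new run starts here; the key must not have had an earlier run.
            if key in seen and key not in offenders:
                offenders.append(key)
            seen.add(key)
            previous = key
    return offenders

unsorted_keys = ["A", "A", "B", "A", "C"]
problems = find_noncontiguous_keys(unsorted_keys)
after_sort = find_noncontiguous_keys(sorted(unsorted_keys))
```

Sorting on the control-area key, as the text recommends, always satisfies the check, although any ordering that keeps each area's records together would also pass.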
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields.
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division. Another problem would be the formatting of the extra
data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL
commands. This would allow the greatest flexibility in choosing
control fields. In order to implement this enhancement, the
dictionary and the generated COBOL program (EDITOR) would
require modification.
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement. This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command. A possible problem here
is the determination of which value is to be moved to the output
record when the record is to be written out using the OUTPUT
command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
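The automatic range check proposed in item (13) can be pictured with a minimal sketch. It is written in Python rather than in the CONCOR language, and everything in it is an assumption made for illustration: the `FieldSpec` structure, its `valid_range` attribute, the `range_check` function and the identifier P02-AGE are hypothetical and do not describe how the dictionary actually stores a declaration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class FieldSpec:
    """Hypothetical dictionary entry: a user-identifier and an optional range."""
    name: str
    valid_range: Optional[Tuple[int, int]] = None  # inclusive (low, high)


def range_check(record: Dict[str, int], specs: List[FieldSpec]) -> List[str]:
    """Return the names of the fields whose value lies outside the declared range."""
    failures = []
    for spec in specs:
        if spec.valid_range is None:
            continue  # no range declared, so nothing to check
        low, high = spec.valid_range
        value = record.get(spec.name)
        if value is None or not (low <= value <= high):
            failures.append(spec.name)
    return failures


# Assumed declarations: sex coded 1-2, age 0-98, one field with no range.
specs = [
    FieldSpec("P01-SEX", (1, 2)),
    FieldSpec("P02-AGE", (0, 98)),
    FieldSpec("P03-NOTE"),
]
record = {"P01-SEX": 3, "P02-AGE": 41, "P03-NOTE": 7}
failed = range_check(record, specs)
```

Because the ranges would live in the dictionary rather than in comment statements, the same declarations could also be read by tabulation software, which is the second benefit the item mentions.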
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
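The lookup table approach suggested in item (4) for the recode command can be contrasted with a linear search in a short sketch. This is Python, not generated COBOL, and the function names and the list of (source, target) pairs are hypothetical; the sketch only assumes that a recode maps a source value to a target value and falls back to a default when no pair matches.

```python
def recode_linear(value, pairs, default):
    """Linear search: compare against every pair in turn until one matches."""
    for source, target in pairs:
        if source == value:
            return target
    return default


def build_lookup(pairs):
    """Build the table once; the first pair for a source value wins,
    which keeps the result identical to the linear search."""
    table = {}
    for source, target in pairs:
        table.setdefault(source, target)
    return table


def recode_lookup(value, table, default):
    """Table lookup: a single access per record instead of a scan."""
    return table.get(value, default)


# Assumed recode pairs; the duplicate source value 2 is deliberate.
pairs = [(1, 10), (2, 20), (3, 30), (2, 99)]
table = build_lookup(pairs)
```

The table is built once, before any record is read, which is presumably why the item anticipates extra communication between the generating programs: the pairs have to be available when the table is laid out.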
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR User's Guide
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of the questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
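The check-in procedure described in item (2) amounts to comparing the master list of selected cases with the questionnaire identifications actually found on the data file. The sketch below shows that comparison in Python under stated assumptions: the `check_in` function and the identifiers Q001, Q002, Q003 and Q009 are hypothetical, and the two lists stand in for the master file and the input file.

```python
def check_in(expected_ids, received_ids):
    """Compare the master list of sample cases with the cases received.

    Returns the cases that were selected but not found on the data file
    and the cases found on the data file that were never selected.
    """
    expected = set(expected_ids)
    received = set(received_ids)
    return {
        "missing": sorted(expected - received),
        "unexpected": sorted(received - expected),
    }


# Assumed questionnaire identifications, for illustration only.
report = check_in(
    expected_ids=["Q001", "Q002", "Q003"],
    received_ids=["Q001", "Q003", "Q009"],
)
```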
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though these were implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page for control area 0. The message DIVISION BY ZERO LINE NUMBER 118 is printed repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. Source program lines 112-125 are superimposed on the page; line 118 is a LET statement computing R003 from R002 and R001. The remaining statistics are not legible in the source.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
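The behaviour shown on this page (one message per offending statement, each naming the source line, then an abort once too many errors have accumulated) can be sketched as follows. The sketch is Python and is not the EDITOR implementation: the `run` function, the `RunAborted` exception and the `max_errors` parameter are hypothetical, and the limit of 3 used in the example is an assumption, since the report does not state the actual error limit.

```python
class RunAborted(Exception):
    """Raised when the assumed error limit is reached; carries the message log."""


def run(statements, max_errors=3):
    """Evaluate (line_number, numerator, denominator) triples.

    Each division by zero is reported with the line number of the statement
    that caused it. When max_errors such errors have occurred, the run is
    aborted with a final message instead of ending without explanation.
    """
    messages = []
    results = []
    errors = 0
    for line_number, numerator, denominator in statements:
        if denominator == 0:
            errors += 1
            messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if errors >= max_errors:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted("\n".join(messages))
            continue
        results.append((line_number, numerator // denominator))
    return results, messages


# The same statement, at line 118, executed for three records in a row.
try:
    run([(118, 5, 0)] * 3, max_errors=3)
    log = ""
except RunAborted as exc:
    log = str(exc)
```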
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page reporting array coordinate errors. Source program lines 173-187 are superimposed on the page: two nested DO loops (UNTIL ... VARY ... FROM 1 BY 1, each closed by END-DO) over R001 and R002, containing ALLOCATE, UPDATE and LIST statements that reference TA1 (R001 R002) and TA2 (R001 R002). The error messages and statistics are not legible in the source.]
Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page ending with the messages JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215 and EDITOR -- NORMAL END OF JOB. Source program lines 214-218 are superimposed on the page: an IF IN-RECORD-COUNT test at line 214 whose THEN clause at line 215 is STOP-RUN END-THEN.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
EXECUTIVE SUMMARY
The December 1979 COBOL CONCOR (Version 2) is a much improved software package. All commands appear to be functional; however, the system should be exhaustively tested by an independent agency prior to its general release. This agency should also precisely determine the system's relative speed and core processor requirements. While the system (exclusive of documentation) could immediately be utilized in a situation of extreme need, some CONCOR language coding inconsistencies detract from the learnability and exportability of the package and should be corrected. Additionally, there are other modifications or adjustments which would enhance the overall utility and productivity of the language in census and survey applications.
System documentation continues to be a problem. There is no User's Guide. The Systems Manual, though well-constituted informationally, should be thoroughly reorganized in accordance with the guidelines set forth in this report.
The staff of ISPC exhibited competence and professionalism in the conduct of the two-week workshop of January 7-18, 1980. ISPC generally is aware of both the potential and the shortcomings of the CONCOR project. The current CONCOR version represents a significant redesign of the overall system. As a package, it is in a state where its completion is within reach.
I BACKGROUND
CONCOR (an acronym of Consistency and Correction) is best characterized as a software tool designed to expedite the processing of data files during the edit and imputation phase of population censuses and surveys. As a metacompiler written in the COBOL language, the system reads and verifies CONCOR language statements to produce an executable EDITOR program. The objective of this process is the creation of an error-free file which can be used at a later time for tabulation purposes.
Since its release as Version 1 on December 30, 1978, numerous elements of the COBOL CONCOR system have undergone continual change and redefinition. In fact, the system has not been permitted to stand still for any period of time, nor has it been exhaustively tested. In June of 1979 ISPC suspended the further distribution of the COBOL CONCOR system. This decision was based principally upon reports of the package's unsatisfactory performance at workshops held in Panama and Thailand. ISPC, upon their own initiative, developed a proposal to overhaul CONCOR and its accompanying documentation. This proposal, contained in Appendix A, represents an ambitious undertaking. While not all of the desired changes and capabilities could be implemented, Version 2 of December 1979 represents a significant managerial effort. The questions are now whether COBOL CONCOR Version 2 will be a demonstrably adequate software package -- a package capable of exportation to developing countries -- a package requiring no further modification. The purpose of this report is to address these critical issues. In connection with this, Appendix B sets forth the specific criteria around which such a discussion must evolve. As this is not intended to be a compendium, some of these broader issues will be treated immediately; the following chapters and appendices will qualify the exact nature of system alterations already undertaken, as well as further adjustments believed to be essential in realizing the goals of the system's philosophy.
Workshop
During the period of January 7-18, 1980, a workshop was held under the sponsorship of the ISPC to demonstrate the capabilities of the latest revision of the COBOL CONCOR software package. A schedule of events of this workshop is contained in Appendix C. This workshop was intended to provide participants with the opportunity to program in the CONCOR language and to thereby test aspects of the system as individually appropriate. A listing of the participants and the international organizations they represented is contained in Appendix D. During the concluding days of the workshop each participant was asked by ISPC to provide a written evaluation of what is now called the December 1979 version of CONCOR. This evaluation form (Appendix E) also includes space for comments concerning the competence of the system documentation, as well as any additional comments, including those regarding the organization and clarity of workshop presentations. It is assumed that in the near future summaries of these comments will be available to interested agencies.
EDITOR is the new name of the EXECUTOR module of previous language versions.
A complete history of the development of COBOL CONCOR can be found in both the December 1978 Version 1 and December 1979 Version 2 systems manuals, as well as in previous consulting reports.
While virtually all instructional aspects of this two-week workshop were conducted in a highly professional manner -- a manner which revealed a high degree of coordination among staff members in their efforts -- there are several areas which future workshops may improve upon
1 All publications should be assembled in their entirety and proof-read prior to distribution
2 A complete CONCOR language program example and accompanying I/O documents should be provided at the onset of the workshop for reference
3 Numerous short application programming problems involving all CONCOR language divisions should be utilized in place of a single lengthy problem
It is noted that this workshop was not intended to teach the CONCOR language; had it been, the organization and presentation of materials probably would have been different. It is believed that the two-week time period was sufficient to provide participants a familiarity with the use of the new CONCOR features, especially in light of the fact that workshop participants were permitted to work weekends and beyond normal working hours at their discretion. Though funding was not generally available, it is known that several workshop members chose to extend their stay in Washington to continue testing the CONCOR package or to work on projects which they could attempt to immediately install on their home computers. At the conclusion of the workshop participants were permitted to take with them an installation tape of CONCOR as well as all the other materials they had acquired during the course of the project.
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidity with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that, of all the data-cleaning tools available for exportation, CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1 Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers
a Implementation of the division headings DICTIONARY-DIVISION EXECUTION-DIVISION and REPORT-DIVISION De-emphasis of section headings
b Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of an equal length to NEW-DATA identifiers. Permit the coding of single dimension row and column vectors in the same manner as multi-dimensional arrays
2 Implementation of selective commands and internal variables to facilitate the production environment use of CONCOR in census applications These include
a LOAD/UNLOAD arrays: commands which would automatically save and replace hot-decked values from batch to batch
b TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA CONTROL
3 Other modifications
a Default values for max-storage parameter set in realistic range
b Allowance of more variables for survey applications
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements as outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply to a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language as well as observations and discussions with other workshop participants. While it is probably impossible to ever be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes it obvious that, while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization do not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section names implementation, while seemingly cosmetic, can have real impact on its perceived easiness of learning and use.
Figure 1 on the following page illustrates common mistakes programmers make coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, some user variables defined in NEW-DATA as N18 could be moved inadvertently to output record fields defined by a data definition statement N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23 QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force recode the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
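The format codes explained above (type letter, length, starting column) and the forced recode of alphanumeric fields can be made concrete with a small sketch. It is written in Python and is an illustration only, not the CONCOR parser: `FieldFormat`, `parse_format`, `extract` and `recode_alpha` are hypothetical names. The length limits of 9 and 4 are the ones stated in the explanations; the numeric codes returned by `recode_alpha` (the 1-based position of the matching comparison string, and -1 when nothing matches) are assumptions standing in for whatever values EDITOR actually assigns.

```python
import re
from typing import NamedTuple, Sequence


class FieldFormat(NamedTuple):
    kind: str    # "N" for numeric, "A" for alphanumeric
    length: int  # field length in bytes
    start: int   # starting column in the record, counted from 1


_FORMAT = re.compile(r"^([NA])(\d+)-(\d+)$")


def parse_format(spec: str) -> FieldFormat:
    """Parse a code such as N2-4 and enforce the documented length limits."""
    match = _FORMAT.match(spec)
    if match is None:
        raise ValueError(f"not a valid data format: {spec!r}")
    kind = match.group(1)
    length = int(match.group(2))
    start = int(match.group(3))
    limit = 9 if kind == "N" else 4  # limits outside NEW-DATA, per the text
    if not 1 <= length <= limit:
        raise ValueError(f"{spec!r}: length must be between 1 and {limit}")
    if start < 1:
        raise ValueError(f"{spec!r}: starting column must be 1 or greater")
    return FieldFormat(kind, length, start)


def extract(record: str, fmt: FieldFormat) -> str:
    """Cut the field out of a fixed-format record (columns count from 1)."""
    return record[fmt.start - 1 : fmt.start - 1 + fmt.length]


def recode_alpha(text: str, comparison_strings: Sequence[str]) -> int:
    """Force an alphanumeric value to a number via the comparison strings."""
    for code, candidate in enumerate(comparison_strings, start=1):
        if text == candidate:
            return code  # assumption: code is the position of the match
    return -1  # assumption: placeholder for the "unique negative value"


record = "A071234567"
household_size = extract(record, parse_format("N2-2"))
record_type = extract(record, parse_format("A1-1"))
```

The sketch also shows why the three-string limit is restrictive: any value outside the comparison list collapses to the same negative code, so the original characters cannot be recovered on output.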
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements where each input data item must be of the same data type as shown in the example When alphanumeric data variables are used in these control stateshyments their construction is identical to that of numeric items However when used elsewhere in the DATA-DIVISION alphanumeric variables are required to specify one of three possible comparison values as shown There are number of production instances when it never would be necessary or even desirable to reshycode alphanumeric data However as CONCOR attempts to force data into a totally numeric format upon output there is no current way to preserve these values if desired
An unwieldy alternative to this situation which may be acceptable under some circumstances would be the expansion of the number of comparison stringsfrom three to a more realistic number The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet A better solutionhowever would be to make the general format of the alphanumeric variables identical to that of numeric identifiers ie A9-23 and to permit alphashynumeric values so defined to pass unaltered through the CONCOR system
Anocher data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements While arraysof two and more dimensions are handled in a superior manner by the CONCOR proshygram single-dimension arrays pose a problem in coding as shown in the Figure 2 It is suggested that the command imperatives be changed to permit the codingof both rows and columns in single dimension arrays ie allow a single row vector as well as a single column vector to maintain the consistel -yof the array data definitional statements
A major requirement of COBOL CONCOR file processing concerns the fact that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e., preventing the program from starting with cold values each batch run. If a command such as LOAD/UNLOAD ARRAYS was incorporated into the language (an enhancement not believed to be difficult to implement), manual processing would be reduced to a minimum between batches and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, right now it is not possible to automatically load those values back to an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion into the package prior to its general distribution.
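The benefit of carrying hot values between volumes can be sketched in Python as follows. This is an illustration only, not the proposed CONCOR command: the file format (JSON), the function names, and the simple deck keyed by relationship are all assumptions chosen to keep the example short.

```python
import json
import os
from typing import Dict, List, Optional

Record = Dict[str, Optional[object]]


def load_arrays(path: str, cold_deck: Dict[str, int]) -> Dict[str, int]:
    """Return the saved hot values if a previous batch left any, else cold values."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return dict(cold_deck)  # copy so the cold deck itself is never modified


def save_arrays(path: str, deck: Dict[str, int]) -> None:
    """Write the current hot values so the next batch can resume from them."""
    with open(path, "w") as handle:
        json.dump(deck, handle)


def edit_batch(records: List[Record], deck: Dict[str, int]) -> List[Record]:
    """Impute missing ages from the deck; valid ages refresh the deck."""
    edited = []
    for record in records:
        relation, age = record["relation"], record["age"]
        if age is None:
            age = deck[relation]   # impute from the most recent donor
        else:
            deck[relation] = age   # a valid value becomes the new donor
        edited.append({"relation": relation, "age": age})
    return edited
```

Calling `load_arrays` at the start of each volume and `save_arrays` at the end means the second volume imputes from the last valid values seen in the first, rather than restarting from the cold deck.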
FIGURE 2

[Figure 2: annotated listing of two ARRAY-DATA declarations. The individual cell values of the first array are not legible in this copy.]

A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION (parameters 2 4 4)
Columns, AGE OF HUSBAND: 12-17, 18-24, 25-35, 36+
Rows, RELATION: HEAD, CHILD, OTHER, NONRELATIVE

Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows: user-identifier, number of dimensions, number of rows, number of columns, magnitude of element, initial start-up values. In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER (parameters 1 1 4), start-up values 16 18 21 23
(This coding generates the error message below.)

WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-9xx, number partly illegible) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563

As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e., A06 must be coded as follows (example of how vectors must currently be coded to be correct):

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER (parameters 1 4 2), start-up values 16 18 21 23
Parameters: user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values.

A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required for questionnaire processing files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers, a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when the termination in the processing mode would be advantageous based on run counts although a CONTROL-AREA has been specified, e.g., debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameters of the DEFINE-RECORD statement. As shown in the figure on the following page, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic max-value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics by total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the whole name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3

[Figure 3: excerpt of a CONCOR (Consistency and Correction) Edit and Imputation System listing, USER DICTIONARY DIVISION - SOURCE LISTING, followed by the job-step accounting output for the LOAD step. Only the following is legible in this copy.]

Source listing (line numbers as printed):
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
74 RECORD-TYPE= (value illegible)
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
270 RECORD-TYPE= (value illegible)

Job-step accounting, STEPNAME - LOAD:
VIRT 1000K
REGION (REQUESTED - 1000K, USED - 1000K)
STEP CPU TIME - 1.65 SECONDS
STEP COMPLETION CODE - 0
I/O COUNT 211
CARDS READ - 0, CARDS PUNCHED - 0, LINES PRINTED - 22
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the general high performance character of the language in its newest version. Under these circumstances, the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and could bear no changes except to provide message number tabs to enable a programmer to more quickly locate the text of a message number. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that with some reorganization and editing it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can be best described as a working document, a document which was not intended to teach the CONCOR language but centered upon an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1. Installation instruction and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer and any modifications required during installation could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well-documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well concerning the type of computer, amount of core and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming language, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the herein outlined highly desirable changes to the CONCOR language and documentation and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of the nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BuCen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
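Items 3C and 3D of the proposal, tolerance checking of error rates by area and capture of the IDs of failing areas, can be illustrated with a short Python sketch. The function name, the input layout (error count and record count per area), and the "rate greater than tolerance" rule are assumptions for illustration; the proposal does not specify how CONCOR computes or compares the rate.

```python
from typing import Dict, List, Tuple


def areas_failing_tolerance(counts: Dict[str, Tuple[int, int]], tolerance: float) -> List[str]:
    """Return IDs of areas whose error rate exceeds the user-specified tolerance.

    counts maps an area ID to (records_in_error, records_processed).
    """
    failing = []
    for area_id, (errors, records) in counts.items():
        rate = errors / records if records else 0.0  # empty areas cannot fail
        if rate > tolerance:
            failing.append(area_id)
    return failing
```

A caller would pass the per-area counts accumulated during editing together with the chosen tolerance, for example 0.10 for ten percent, and route the returned IDs to whatever reject handling is in use.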
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop.
1. Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be [illegible]ely debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on the completeness, usefulness, efficiency and ease of use of the system by developing country programmers and other census personnel.
cc: S[illegible] Olds, APHA; Julio Ortuzar, Delta Systems (other names illegible)
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
- Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2 If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. Major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. Command now provides for a NOERROR and MESSAGE option. Additionally, utilizing the DUMPR and DUMPO keywords provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional construction. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
FILTER TYPE ( ) WITH not implemented Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values.
IF IF The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET LET Virtually unchanged.
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
OUTPUT New command which provides the means by which records will be written to the output-file.
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as well as the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
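The reason the input must be presorted on the AREA-CONTROL field can be seen in a small Python sketch of control-break reporting. This is an illustration, not CONCOR's implementation: the function name and the record layout (area ID plus an error flag) are assumptions, and raising an error on a repeated area is one possible way to surface a file that was not sorted.

```python
from typing import Iterable, List, Tuple


def summarize_by_area(records: Iterable[Tuple[str, bool]]) -> List[Tuple[str, int, int]]:
    """Summarize (area, records, errors) at each break in the area key.

    Records are read once in file order. If an area reappears after its
    break, the file was not contiguous and a ValueError is raised.
    """
    summaries: List[Tuple[str, int, int]] = []
    seen = set()
    current = None
    total = errors = 0
    for area, has_error in records:
        if area != current:
            if current is not None:
                summaries.append((current, total, errors))  # control break
            if area in seen:
                raise ValueError(f"records for area {area} are not contiguous")
            seen.add(area)
            current, total, errors = area, 0, 0
        total += 1
        errors += int(has_error)
    if current is not None:
        summaries.append((current, total, errors))  # final area
    return summaries
```

Because the counters are reset at every change of key, an unsorted file would otherwise yield several partial summaries for the same area, which is why a sort step external to the editor is needed first.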
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
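Item (11) above describes a signed numeric format that tolerates leading blanks and negative values, as written by FORTRAN programs. As a rough illustration of what such a conversion involves, here is a Python sketch; the function name and the choice to return None for unparseable fields are assumptions, not CONCOR behavior.

```python
import re

# Leading blanks, an optional sign, then one or more digits filling the rest of the field.
_SIGNED_FIELD = re.compile(r" *([+-]?[0-9]+)")


def parse_signed_field(record, start, length):
    """Read a fixed-width signed integer field; start is a 1-based column.

    Returns None when the field is blank or is not a right-justified
    signed number (an illustrative convention, not CONCOR's).
    """
    text = record[start - 1:start - 1 + length]
    match = _SIGNED_FIELD.fullmatch(text)
    if match is None:
        return None
    return int(match.group(1))
```

Under this sketch a field such as two blanks followed by -42 converts to the integer -42, while a field with blanks embedded between digits is rejected rather than silently repaired.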
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
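The lookup table approach proposed in item (4) can be contrasted with a rule-by-rule search in a few lines of Python. This is a generic sketch only: the report does not give the DRECODE syntax or semantics, so the (low, high, code) rule shape, the first-match convention, and all function names here are assumptions.

```python
def recode_linear(value, rules, default=None):
    """Scan the rules in order; the first (low, high, code) range containing value wins."""
    for low, high, code in rules:
        if low <= value <= high:
            return code
    return default


def build_lookup(rules, lowest, highest, default=None):
    """Precompute one table slot per possible value in [lowest, highest]."""
    table = [default] * (highest - lowest + 1)
    # Fill in reverse order so that earlier rules overwrite later ones,
    # reproducing the first-match behavior of recode_linear.
    for low, high, code in reversed(rules):
        for value in range(max(low, lowest), min(high, highest) + 1):
            table[value - lowest] = code
    return table


def recode_lookup(value, table, lowest, default=None):
    """Recode with a single index operation instead of a scan."""
    index = value - lowest
    if 0 <= index < len(table):
        return table[index]
    return default
```

The table costs memory proportional to the value range and must be built once per recode, which is consistent with the report's remark that the approach would need additional communication between the generating programs.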
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
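The weighting scheme mentioned in item (5) is not specified further in the list. As one hedged illustration of the arithmetic it implies, the Python sketch below computes a weighted total and weighted mean, assuming each edited record carries a sampling weight; the function names and the per-record weight are assumptions for illustration.

```python
def weighted_total(values, weights):
    """Estimate a population total as the sum of value * weight over sample records."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(value * weight for value, weight in zip(values, weights))


def weighted_mean(values, weights):
    """Weighted total divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return weighted_total(values, weights) / total_weight
```

For example, household sizes of 3, 5 and 2 with weights of 10, 20 and 10 give a weighted total of 150 persons and a weighted mean of 3.75 persons per household.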
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution statistics printout, largely illegible in reproduction: a CONCOR-EDITOR EXECUTION STATISTICS page on which the diagnostic DIVISION BY ZERO, LINE NUMBER 118 is reported repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program text (source lines 112-125) consists of LIST and LET statements that set registers to zero; the LET statement at line 118 computes R003 from R002 and R001.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Execution statistics printout, largely illegible in reproduction. The superimposed program text (source lines 173-187) shows two nested DO loops, UNTIL R001 > 10 VARY R001 FROM 1 BY 1 and a corresponding inner loop on R002, containing an ALLOCATE from TA1 (R001 R002), an UPDATE of TA2 (R001 R002), and several LIST statements, closed by two END-DO statements.]
Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Execution statistics printout, largely illegible in reproduction. Legible messages: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, and EDITOR -- NORMAL END OF JOB. The superimposed program text (source lines 214-218) shows an IF IN-RECORD-COUNT test followed by THEN STOP-RUN END-THEN on line 215, and a second IF IN-RECORD-COUNT test followed by a LIST statement.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
I BACKGROUND
CONCOR (an acronym of Consistency and Correction) is best characterized as a software tool designed to expedite the processing of data files during the edit and imputation phase of population censuses and surveys. As a metacompiler written in the COBOL language, the system reads and verifies CONCOR language statements to produce an executable EDITOR program. The objective of this process is the creation of an error-free file which can be used at a later time for tabulation purposes.
Since its release as Version 1 on December 30, 1978, numerous elements of the COBOL CONCOR system have undergone continual change and redefinition. In fact, the system has not been permitted to stand still for any period of time, nor has it been exhaustively tested. In June of 1979 ISPC suspended the further distribution of the COBOL CONCOR system. This decision was based principally upon reports of the package's unsatisfactory performance at workshops held in Panama and Thailand. ISPC, upon their own initiative, developed a proposal to overhaul CONCOR and its accompanying documentation. This proposal, contained in Appendix A, represents an ambitious undertaking. While not all of the desired changes and capabilities could be implemented, Version 2 of December 1979 represents a significant managerial effort. The questions are now whether COBOL CONCOR Version 2 will be a demonstrably adequate software package -- a package capable of exportation to developing countries -- a package requiring no further modification. The purpose of this report is to address these critical issues. In connection with this, Appendix B sets forth the specific criteria around which such a discussion must evolve. As this is not intended to be a compendium, some of these broader issues will be treated immediately; following chapters and appendices will qualify the exact nature of system alterations already undertaken, as well as further adjustments believed to be essential in realizing the goals of the system's philosophy.
Note: EDITOR is the new name of the EXECUTOR module of previous language versions.
Note: A complete history of the development of COBOL CONCOR can be found in both the December 1978 Version 1 and December 1979 Version 2 systems manuals, as well as in previous consulting reports.
Workshop
During the period of January 7-18, 1980, a workshop was held under the sponsorship of the ISPC to demonstrate the capabilities of the latest revision of the COBOL CONCOR software package. A schedule of events of this workshop is contained in Appendix C. This workshop was intended to provide participants with the opportunity to program in the CONCOR language and thereby to test aspects of the system as individually appropriate. A listing of the participants and the international organizations they represented is contained in Appendix D. During the concluding days of the workshop, each participant was asked by ISPC to provide a written evaluation of the now-called December 1979 version of CONCOR. This evaluation form (Appendix E) also includes space for comments concerning the competence of the system documentation, as well as any additional comments, including those regarding the organization and clarity of workshop presentations. It is assumed that in the near future summaries of these comments will be available to interested agencies.
While virtually all instructional aspects of this two-week workshop were conducted in a highly professional manner -- a manner which revealed a high degree of coordination among staff members in their efforts -- there are several areas which future workshops may improve upon:
1. All publications should be assembled in their entirety and proofread prior to distribution.
2. A complete CONCOR language program example and accompanying I/O documents should be provided at the onset of the workshop for reference.
3. Numerous short application programming problems involving all CONCOR language divisions should be utilized in place of a single lengthy problem.
It is noted that this workshop was not intended to teach the CONCOR language, as otherwise the organization and presentation of materials probably would have been different. It is believed that the two-week time period was sufficient to provide participants a familiarity with the use of the new CONCOR features, especially in light of the fact that workshop participants were permitted to work weekends and beyond normal working hours at their discretion. Though funding was not generally available, it is known that several workshop members chose to extend their stay in Washington to continue testing the CONCOR package or to work on projects which they could attempt to install immediately on their home computers. At the conclusion of the workshop, participants were permitted to take with them an installation tape of CONCOR as well as all the other materials they had acquired during the course of the project.
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidity with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that, of all the data-cleaning tools available for exportation, CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1. Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers.
a. Implementation of the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION. De-emphasis of section headings.
b. Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of an equal length to NEW-DATA identifiers. Permit the coding of single dimension row and column vectors in the same manner as multi-dimensional arrays.
2. Implementation of selective commands and internal variables to facilitate the production environment use of CONCOR in census applications. These include:
a. LOAD/UNLOAD arrays. Commands which would automatically save and replace hot-decked values from batch to batch.
b. TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA CONTROL.
3. Other modifications.
a. Default values for the max-storage parameter set in a realistic range.
b. Allowance of more variables for survey applications.
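The LOAD/UNLOAD arrays idea in item 2a above, carrying hot-deck values from one batch run to the next, can be sketched in Python as follows. This is an illustration under stated assumptions and not the proposed CONCOR command: the JSON file format, the function names, and the representation of arrays as nested lists keyed by name are all hypothetical.

```python
import json


def unload_arrays(arrays, path):
    """Save the current hot-deck arrays at the end of a batch run."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(arrays, handle)


def load_arrays(path, cold_values):
    """Restore hot-deck arrays saved by the previous batch.

    If no saved file exists (the first batch), fall back to a copy of
    the cold starting values so the caller can modify it freely.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {name: [row[:] for row in rows] for name, rows in cold_values.items()}
```

In this sketch a batch job would call `load_arrays` before editing and `unload_arrays` after, so that only the very first batch starts from cold values.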
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply in a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language, as well as observations of and discussions with other workshop participants. While it is probably impossible ever to be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes it obvious that, while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period so as to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization does not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section name implementation, while seemingly cosmetic, can have real impact on the language's perceived ease of learning and use.
Figure 1 illustrates common mistakes programmers make coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item, and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, some user variables defined in NEW-DATA as N18 could inadvertently be moved to output record fields defined by a data definition statement such as N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force recode the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it never would be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is no current way to preserve these values if desired.
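The field notation and the alphanumeric recoding described above can be made concrete with a short Python sketch. It is an illustration only, not CONCOR's implementation: the function names are hypothetical, the numeric codes assigned to matches (1, 2, 3 by position) are an assumption since the report does not state them, and a single value of -1 stands in for the "unique negative value" EDITOR is said to assign.

```python
import re

# Limits as described in the text for record definitions
# (numeric items may be up to 18 long in NEW-DATA, not modeled here).
MAX_LENGTH = {"N": 9, "A": 4}
MAX_COMPARISON_STRINGS = 3


def parse_field_spec(spec):
    """Split a specification such as 'N2-4' into (type, length, start column)."""
    match = re.fullmatch(r"([NA])([0-9]+)-([0-9]+)", spec)
    if match is None:
        raise ValueError(f"malformed field specification: {spec!r}")
    kind, length, start = match.group(1), int(match.group(2)), int(match.group(3))
    if not 1 <= length <= MAX_LENGTH[kind]:
        raise ValueError(f"length {length} not permitted for type {kind}")
    if start < 1:
        raise ValueError("starting column must be 1 or greater")
    return kind, length, start


def extract(record, spec):
    """Return the raw text of the field that spec describes (columns are 1-based)."""
    _, length, start = parse_field_spec(spec)
    return record[start - 1:start - 1 + length]


def recode_alphanumeric(text, comparison_strings, no_match=-1):
    """Force an alphanumeric value to a number via its comparison strings."""
    if len(comparison_strings) > MAX_COMPARISON_STRINGS:
        raise ValueError("at most 3 comparison strings are permitted")
    for position, candidate in enumerate(comparison_strings, start=1):
        if text == candidate:
            return position  # assumed coding: position in the list
    return no_match  # stand-in for the negative value assigned on no match
```

The sketch makes the report's complaint visible: once `recode_alphanumeric` has run, the original character value is gone, and any value outside the three comparison strings collapses to the no-match code.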
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e., A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e., allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing is that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recently imputed values, i.e., preventing the program from starting with cold values on each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing between batches would be reduced to a minimum and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion in the package prior to its general distribution.
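The benefit of carrying hot values from one batch to the next can be sketched outside CONCOR. The Python below is only an illustration of the idea under stated assumptions: the functions load_hot_deck, save_hot_deck and impute_batch are hypothetical, the JSON file stands in for whatever storage a LOAD/UNLOAD ARRAYS command might use, and the deck is a simple one-dimensional table keyed by a class value rather than a CONCOR array.

```python
import json
from pathlib import Path


def load_hot_deck(path, cold_deck):
    """Return the values saved by a previous batch, or a copy of the cold deck."""
    saved = Path(path)
    if saved.exists():
        return json.loads(saved.read_text())
    return dict(cold_deck)


def save_hot_deck(path, deck):
    """Write the current hot values so the next batch can resume from them."""
    Path(path).write_text(json.dumps(deck))


def impute_batch(records, deck, key_field, value_field):
    """One hot-deck pass over a batch of records (dicts).

    A valid value refreshes the deck entry for its class; a missing value
    (None) is replaced by the most recent valid value for that class.
    The deck must hold a starting value for every class that can occur.
    """
    edited = []
    for record in records:
        record = dict(record)          # leave the caller's record untouched
        key = record[key_field]
        if record.get(value_field) is None:
            record[value_field] = deck[key]   # impute from the deck
        else:
            deck[key] = record[value_field]   # update the hot value
        edited.append(record)
    return edited
```

Calling load_hot_deck at the start of each volume and save_hot_deck at the end replaces the manual step described above: the second volume begins with the last valid values seen in the first instead of the cold deck.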
FIGURE 2

A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4

[Table of cold deck values: columns AGE OF HUSBAND 12-17, 18-24, 25-35, 36+; rows RELATION HEAD, CHILD, OTHER, NONRELATIVE]

Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:

user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start-up values

In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error message below.)

587 A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563

As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e., A06 must be coded as follows (example of how vectors must currently be coded to be correct):

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23

user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values

A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers, a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when termination of processing based on run counts would be advantageous even though a CONTROL-AREA has been specified, e.g., debugging CONCOR programs or comparing input files. Therefore, another set of pointers should be implemented for this purpose and made available for programmer reference.
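The distinction between pointers that reset at each control break and pointers that run for the whole file can be sketched as follows. This Python fragment is an illustration only; count_by_area and its arguments are hypothetical names and do not correspond to CONCOR internal identifiers. It assumes, as CONCOR does, that the records of one area are contiguous.

```python
from itertools import groupby


def count_by_area(records, area_key):
    """Yield (area, records_in_area, records_in_run_so_far) at each control break.

    The area-level count starts over for every new area; the run-level
    count is never reset, so a program could stop after a fixed number of
    records regardless of how many areas have been processed.
    """
    run_count = 0
    for area, group in groupby(records, key=area_key):
        area_count = sum(1 for _ in group)  # reset at each break in the area key
        run_count += area_count             # accumulates across the whole run
        yield area, area_count, run_count
```

With a run-level counter of this kind, a debugging run could be halted after, say, the first few hundred records even though a control area has been defined.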
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in Figure 3, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics by total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3

CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING

LINE NUMBER
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999

IEF373I STEP LOAD START 80018.1001
IEF374I STEP LOAD STOP 80018.1002 CPU 0MIN 01.65SEC SRB 0MIN 00.25SEC VIRT 1000K
STEP START TIME - 10:01:55 STEP END TIME - 10:02:57 STEP CPU TIME - 1.65 SECONDS STEP COMPLETION CODE - 0
TAPES - 0 DISKS - 0 REGION (REQUESTED - 1000K USED - 1000K) I/O COUNT - 211
CARDS READ - 0 CARDS PUNCHED - 0 LINES PRINTED - 22
STEPNAME - LOAD CHARGE (MACHINE UNITS - 1198)
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the general high-performance character of the language in its newest version. Under these circumstances, the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a User's Guide, as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document, one which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a user's guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the User's Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language.) Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy. However, some guidelines stand out:
1. Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the User's Guide should include an intensive review of the editing concepts involved in processing census data files beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the User's Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be overemphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent, and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable User's Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BuCen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
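Items 3C and 3D of the proposal (tolerance checking of error rates by area, and capture of the IDs of areas that fail) can be illustrated with a short sketch. The Python below is hypothetical and is not the LISTERR implementation; the function name, the shape of area_stats, and the choice to treat an area with no records as passing are all assumptions made for the example.

```python
def areas_failing_tolerance(area_stats, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    area_stats maps an area ID to a pair
    (records_in_error, records_processed); tolerance is a fraction,
    e.g. 0.10 for ten percent.
    """
    failing = []
    for area_id, (errors, total) in area_stats.items():
        # An area with no records cannot exceed the tolerance.
        rate = errors / total if total else 0.0
        if rate > tolerance:
            failing.append(area_id)
    return failing
```

The returned list plays the role of the automatically captured area IDs, which could then be routed to a reject file for review.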
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum

TO: DSPOPFPSD, Robert H. Haladay
DATE: December 3, 1979
FROM: DSPOPDEIO, Liliane Floge
SUBJECT: Criteria for Evaluation of BuCen COBOL CONCOR, September 30th Version, as presented at January 1980 Workshop

The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:

1. Are the Installation Manual, the System Manual, and the User's Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be completely debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on the completeness, usefulness, and efficiency of the system and on its ease of use by developing country programmers and other census personnel.

cc: Suzanne Olds, APHA; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule, January 7-18, 1980
U.S. Bureau of the Census, International Statistical Programs Center
Scuderi Building, Room 209, 4235 28th Avenue, Marlow Heights, Maryland
Monday, January 7

9:30 am - 10:00    Welcoming Remarks
                   Overview of Workshop
10:00 - 10:30      Introduction to CONCOR
                   - Purpose and function
                   - History of development
                   - General computer requirements
10:30 - 10:45      Break
10:45 - 12:00      Editing Concepts
                   - Ways to interrogate data
                   - Ways to correct data
                   - Editing housing and population data - POPSTAN
                   - Advantages of CONCOR
12:00 - 1:15 pm    Break
1:15 - 2:00        System Description
                   - Constraints in design of CONCOR
                   - Basic subsystems of CONCOR
                   - User interactions with system
                   - Examples of outputs produced
2:00 - 2:30        User Program Organization
                   - Divisions
                   - Sections
                   - Routines
                   - Commands
2:30 - 2:45        Break
2:45 - 3:25        Command Language Description
                   - Types of statements
                   - Format
                   - Syntax
Tuesday, January 8

Dictionary Division Command Statements

9:30 am - 10:30    - Punctuation
                   - Input data referencing
                   - Working data declarations
                   - Internal data representation and storage
10:30 - 10:45      Break
10:45 - 12:00      Dictionary-Attributes-Section
                   - Dictionary-Name
                   File-Section
                   - Input-File
                   - Output-File
                   - Write-File
                   - Error-File
12:00 - 1:15 pm    Break
1:15 pm - 2:15     Input-Record-Section
                   - Define-Record
                   Working-Data-Section
                   - New-Data
                   - Array-Data
2:15 - 2:30        Break
2:30 - 3:25        Dictionary Examples
                   - Minimum dictionary structure
                   - Maximum dictionary structure
                   - Hand out dictionary problem
Wednesday, January 9

9:30 am - 10:30    Discussion of Participants' Dictionary problems
                   Free work time
10:30 - 10:45      Break
10:45 - 12:00      Execution Division Command Statements
                   - Punctuation
                   - Subscripting
                   - Internal Identifiers
                   - Report-Control-Section
                     - Generate-Edit-Statistics
                     - Count-Imputes
                     - Examples
12:00 - 1:15 pm    Break
1:15 pm - 2:15     Execution Division Command Statements (continued)
                   - Routines of Edit-Specifications-Section
                     - Prolog-Routine
                     - Filter-Routine
                     - Epilog-Routine
                   - Types and functions of edit specification commands
                   - Range
                   - Assert
2:15 - 2:30        Break
2:30 - 3:25        - Pass/Fail clauses
                   - List
Thursday, January 10

9:30 am - 10:30    Discussion of Problems
                   Free work time
10:30 - 10:45      Break
10:45 - 12:00      Edit Specification Command Statements (continued)
                   - Allocate
                   - Update
                   - Let
12:00 - 1:15 pm    Break
1:15 pm - 2:15     - If
                   - Until/Exit
                   - Stop
2:15 - 2:30        Break
2:30 - 3:25        - Drecode
                   - Grecode
Friday, January 11

9:30 am - 10:30    Edit Specification Command Statements (continued)
                   - Output
                   - Write
10:30 - 10:45      Break
10:45 - 12:00      Report Division Command Statements
                   - Display-Control-Section
                     - Display-Edit-Statistics
                   - Tolerance-Control-Section
                     - Error-Rate-Check
                     - Reject-File
                   - Report Examples
12:00 - 1:15 pm    Break
1:15 pm - 3:25     Hand out Edit Specifications as problem
                   Free work time
Monday, January 14

9:30 am - 10:30    Discuss procedures for running problems on computer
10:30 - 10:45      Break
10:45 - 12:00      Component Programs of the CONCOR system
12:00 - 1:15 pm    Break

Tuesday, January 15

9:30 am - 3:25 pm  Free work time

Wednesday, January 16

9:30 am - 12:00    Free work time
12:00 - 1:15 pm    Break
1:15 pm - 2:15     How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30        Break
2:30 - 3:25        Free work time

Thursday, January 17

9:30 am - 3:25     Free work time
1:15 pm - 2:15     Future CONCOR Design Considerations
                   - for census processing
                   - for survey processing
                   - manual correction system
2:15 - 2:30        Break
2:30 - 2:45        Evaluation Guidelines
                   - Hand out evaluation forms
2:45 - 3:25        Free work time

Friday, January 18

9:30 am - 10:30    Free work time
10:30 - 10:45      Break
10:45 - 12:00      Group Discussion on CONCOR
                   - Submission of written evaluations
                   Distribution of IBM installation tapes
                   Presentation of Certificates to Participants
12:00 - 1:15 pm    Break
1:15 - 3:25        Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA

James Midianga, Government Computer Centre, Central Bureau of Statistics, PO Box 30266, Nairobi, Kenya

PANAMA

Omar Rivera Rios, Data Processing Specialist, Contraloria General de la Republica de Panama, Apartado Postal 5213, Panama 5, Panama

PHILIPPINES

Guida T Capellan, OIC, Computer Programming Division, National Census and Statistics Office, Solicarel Bldg 1, Magsaysay Blvd, Sta Mesa, Manila, Philippines

THAILAND

Angsumal Sunalai, National Statistical Office, Bangkok 1, Thailand

EGYPT

Farag Sedky Mourad Ghaleb, CAPMAS (Central Agency for Public Mobilization and Statistics), Nasr City, Cairo, Egypt
SAUDI ARABIA

Shargi R Al-Shargi, National Computer Center, PO Box 2534, Riyadh, Saudi Arabia

Ahmed S Al-Ghamdi, Dept Director of National Computer Center, PO Box 2534, Riyadh, Saudi Arabia

OTHER

Robert W O'Neal, USREPJECOR, APO New York, NY 09038

John N Adams, Data Processing Advisor, DACCA, Department of State, Washington, DC 20520, or c/o UNDP, PO Box 224, Dacca, Bangladesh

Joe Quasney, US Census Bureau, Washington, DC 20233

Howard Brunsman, 5715 N Ninth St, Arlington, VA 22205

Larry Shiller and Julio Ortuzar, Delta Systems Consultants Inc, 264 Alhambra Circle, Coral Gables, FL 33134

Nicholas Ourusoff, Burpee Hill Road, New London, NH 03257 (United Nations)

Frederic J Grant IV, Systems Development, Georgia World Congress Institute, 580 North Omni International, Atlanta, Georgia 30303

Rafael Samper, Mary Jo Keenan, Catherine B Gleason, LACDR, Room 2246 NS, Washington, DC 20523

John Marshall, AIDSERDMPSE, Rm 706C SA-12, Washington, DC 20523

Susanne Bacon and Mary K Friday, ISPC Data Processing, US Census Bureau, Washington, DC 20233

Leo Dougherty, ISPC World Census Staff, US Bureau of the Census, Washington, DC 20233
CONCOR WORKSHOP
January 7-18, 1980, Washington, DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, it provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items which have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to controlthe type of edit statistics reports to be generated Generally all reports produced by the commands ofthis division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION Notethat this means the input data file must be preproshycessed (presorted external to CONCOR) on the basisof the AREA-CONTROL data field as CONCOR is incapableof merging uncontiguous records
There is currently no command to enable users to specify unique report headings on the listingfrom this division
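The presorting requirement noted above can be checked before a file is submitted for editing. The following Python sketch is an assumption-laden illustration, not part of CONCOR: the record layout (a tuple whose first element is the area-control key) and the function names are hypothetical.

```python
from typing import Hashable, Iterable, List, Optional, Tuple


def first_noncontiguous_key(keys: Iterable[Hashable]) -> Optional[Hashable]:
    """Return the first key that reappears after a different key, else None."""
    seen = set()
    previous = object()  # sentinel that is unequal to every real key
    for key in keys:
        if key != previous:
            if key in seen:
                return key  # this control area was already closed
            seen.add(key)
            previous = key
    return None


def area_key(record: Tuple[str, str]) -> str:
    """Hypothetical layout: the area-control field is the first element."""
    return record[0]


records: List[Tuple[str, str]] = [("02", "a"), ("01", "b"), ("02", "c")]
# sorted() is stable, so records keep their relative order within an area.
presorted = sorted(records, key=area_key)
```

Running the check on `records` reports area "02" as split, while `presorted` passes.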
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings.
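Item (11) above proposes a signed numeric format that tolerates leading blanks and negative values, as written by FORTRAN programs. A minimal Python sketch of reading such a fixed-width field follows. The function name, the 1-based column convention and the choice to return `None` for blank or non-numeric fields are assumptions made for illustration only.

```python
from typing import Optional


def parse_signed_field(record: str, start: int, length: int) -> Optional[int]:
    """Read a fixed-width signed integer; start is a 1-based column number.

    Returns None when the field is blank or is not a valid signed number.
    """
    raw = record[start - 1:start - 1 + length]
    text = raw.strip()            # leading (and trailing) blanks are allowed
    if not text:
        return None               # all-blank field
    sign = 1
    if text[0] in "+-":
        if text[0] == "-":
            sign = -1
        text = text[1:]
    if not (text.isascii() and text.isdigit()):
        return None               # not numeric
    return sign * int(text)


# Hypothetical record: columns 3-8 hold "   -42", columns 9-12 hold "  17".
sample_record = "AB   -42  17"
```

With this record, columns 3-8 yield -42 and columns 9-12 yield 17, while the alphabetic field in columns 1-2 is reported as not numeric.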
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command, a command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
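Item (4) of the list above contrasts a lookup table with a search through recode pairs. The difference can be sketched in Python. The recode pairs, the default value and both function names are hypothetical and do not represent the code that EDITOR actually generates.

```python
from typing import Dict, List, Tuple

# Hypothetical recode pairs: (original value, recoded value).
RECODE_PAIRS: List[Tuple[int, int]] = [(1, 10), (2, 10), (3, 20), (9, 99)]


def recode_linear(value: int, pairs: List[Tuple[int, int]], default: int) -> int:
    """Scan the pairs one by one; cost grows with the number of pairs."""
    for old, new in pairs:
        if old == value:
            return new
    return default


# Build the table once, ahead of the editing run.
RECODE_TABLE: Dict[int, int] = dict(RECODE_PAIRS)


def recode_lookup(value: int, table: Dict[int, int], default: int) -> int:
    """Single table access per value, independent of the table size."""
    return table.get(value, default)
```

Both functions return the same result for every input; only the cost per record differs, which matters when the command is executed once for every record on a census file.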
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also, to examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather only that the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
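The check-in procedure proposed in item (2) of this list amounts to comparing the questionnaires actually received against a master list of selected cases. A minimal Python sketch follows; the function name, the identifier format and the shape of the returned report are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def check_in(expected_ids: Iterable[str],
             received_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Compare received questionnaire ids against the master sample list."""
    expected = set(expected_ids)
    received = set(received_ids)
    return {
        "missing": sorted(expected - received),      # selected but not received
        "unexpected": sorted(received - expected),   # received but not selected
    }


# Hypothetical master list and input file contents.
master_list = ["Q001", "Q002", "Q003", "Q004"]
on_input_file = ["Q001", "Q003", "Q003", "Q009"]
report = check_in(master_list, on_input_file)
```

Duplicate identifiers on the input file (such as "Q003" here) are collapsed by the set conversion; detecting them would need a separate count.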
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
DECEMBER 1978 VERSION 1
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

DECEMBER 1979 VERSION 2
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT

COMMENTS
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR, exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout for the division by zero test. The message DIVISION BY ZERO, LINE NUMBER 118 is printed repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program text (source lines 111-125) is only partly legible.]

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Execution Statistics printout for the array coordinate test. The error messages are not legible in this copy. Superimposed program text, source lines 175-187:]

175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2= TA2 (R001 R002)
186 END-DO LIST
187 END-DO

Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Execution Statistics printout for the user termination test. Legible messages: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. Superimposed program text, source lines 214-218, of which the following is legible:]

214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE

WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.

ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.

ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()).

ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
contained in Appendix D. During the concluding days of the workshop, each participant was asked by ISPC to provide a written evaluation of the now-called December 1979 version of CONCOR. This evaluation form, Appendix E, also includes space for comments concerning the competence of the system documentation as well as any additional comments, including those regarding the organization and clarity of workshop presentations. It is assumed that in the near future summaries of these comments will be available to interested agencies.
While virtually all instructional aspects of this two-week workshop were conducted in a highly professional manner -- a manner which revealed a high degree of coordination among staff members in their efforts -- there are several areas which future workshops may improve upon:
1. All publications should be assembled in their entirety and proofread prior to distribution.
2. A complete CONCOR language program example and accompanying I/O documents should be provided at the onset of the workshop for reference.
3. Numerous short application programming problems involving all CONCOR language divisions should be utilized in place of a single lengthy problem.
It is noted that this workshop was not intended to teach the CONCOR language, as the organization and presentation of materials probably would have been different. It is believed that the two-week time period was sufficient to provide participants a familiarity with the use of the new CONCOR features, especially in light of the fact that workshop participants were permitted to work weekends and beyond normal working hours at their discretion. Though funding was not generally available, it is known that several workshop members chose to extend their stay in Washington to continue testing the CONCOR package or to work on projects which they could attempt to immediately install on their home computers. At the conclusion of the workshop, participants were permitted to take with them an installation tape of CONCOR as well as all the other materials they had acquired during the course of the project.
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidness with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that of all the data-cleaning tools available for exportation, CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1. Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers.
a. Implementation of the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION. De-emphasis of section headings.
b. Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of an equal length to NEW-DATA identifiers. Permit the coding of single-dimension row and column vectors in the same manner as multi-dimensional arrays.
2. Implementation of selective commands and internal variables to facilitate the production environment use of CONCOR in census applications. These include:
a. LOAD/UNLOAD arrays. Commands which would save and replace automatically hot-decked values from batch to batch.
b. TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA-CONTROL.
3. Other modifications.
a. Default values for the max-storage parameter set in a realistic range.
b. Allowance of more variables for survey applications.
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements as outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply to a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language as well as observations and discussions with other workshop participants. While it is probably impossible to ever be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes obvious the fact that while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization do not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section names implementation, while seemingly cosmetic, can have real impact on its perceived easiness of learning and use.
Figure 1 on the following page illustrates common mistakes programmers make coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, user variables defined in NEW-DATA with a length of 18 could be moved inadvertently to output record fields defined by a data definition statement such as N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force recode the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
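The field notation explained above (type letter, length, starting column) and its length limits can be expressed as a small validator. The Python sketch below is illustrative only: the `FieldSpec` type and `parse_field_spec` function are hypothetical, and the limits encoded are the ones stated in the text for items outside NEW-DATA (9 for numeric, 4 for alphanumeric). The NEW-DATA limit of 18 is not modeled.

```python
import re
from typing import NamedTuple


class FieldSpec(NamedTuple):
    kind: str     # "N" numeric or "A" alphanumeric
    length: int   # field length in bytes
    start: int    # 1-based starting column in the record


_SPEC_PATTERN = re.compile(r"([NA])(\d+)-(\d+)")

# Limits stated in the text for data definitions outside NEW-DATA.
MAX_LENGTH = {"N": 9, "A": 4}


def parse_field_spec(spec: str) -> FieldSpec:
    """Parse a specification such as N9-23 and enforce the length limits."""
    match = _SPEC_PATTERN.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a field specification: {spec!r}")
    kind = match.group(1)
    length = int(match.group(2))
    start = int(match.group(3))
    if length < 1 or length > MAX_LENGTH[kind]:
        raise ValueError(f"length {length} not allowed for type {kind}")
    if start < 1:
        raise ValueError("starting column must be 1 or greater")
    return FieldSpec(kind, length, start)
```

For example, `parse_field_spec("N9-23")` is accepted, while `"N18-23"` and `"A5-2"` are rejected, which mirrors the inconsistency between the two limits that the text criticizes.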
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values as shown. There are a number of production instances when it never would be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is no current way to preserve these values if desired.
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e., A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e., allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing is that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e., to prevent the program from starting with cold values on each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing between batches would be reduced to a minimum and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down manual processing time significantly enough to warrant its inclusion in the package prior to general distribution.
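The proposed LOAD/UNLOAD ARRAYS behaviour can be sketched in Python as follows. No such command exists in CONCOR; the function names, the JSON file format, and the representation of the hot deck as a mapping keyed on relationship are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def load_arrays(path: str, cold_values: Dict[str, int]) -> Dict[str, int]:
    """Return saved hot values if a previous batch left any, else cold values."""
    saved = Path(path)
    if saved.exists():
        return json.loads(saved.read_text())
    return dict(cold_values)


def unload_arrays(arrays: Dict[str, int], path: str) -> None:
    """Save the current hot values for the next batch run."""
    Path(path).write_text(json.dumps(arrays))


def edit_batch(records: List[dict], deck: Dict[str, int]) -> List[dict]:
    """Impute a missing age from the deck; otherwise refresh the deck."""
    edited = []
    for rec in records:
        age: Optional[int] = rec["age"]
        if age is None:
            rec = {**rec, "age": deck[rec["relation"]]}  # impute hot value
        else:
            deck[rec["relation"]] = age  # most recent valid value becomes hot
        edited.append(rec)
    return edited
```

Run volume by volume, each batch would call `load_arrays` before editing and `unload_arrays` afterwards, so that the second volume starts from the values left by the first rather than from the cold deck.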
FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
Columns (AGE OF HUSBAND): 12-17, 18-24, 25-35, 36+
Rows (RELATION): HEAD, CHILD, OTHER, NONRELATIVE
Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:
user-identifier D R C M, where D = number of dimensions, R = number of rows, C = number of columns, M = magnitude of element, followed by the initial start-up values
In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error messages below.)
WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e., A06 must be coded as follows (example of how vectors must currently be coded to be correct):
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required to process questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g., when debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
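The distinction between pointers that reset at each control-area break and pointers that persist for the whole run can be sketched in Python. This is an illustration only; the names `Counters`, `process` and `stop_after` are hypothetical and do not correspond to CONCOR internal identifiers.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Counters:
    records: int = 0
    errors: int = 0


def process(records: Iterable[Tuple[str, bool]],
            stop_after: Optional[int] = None
            ) -> Tuple[Counters, Dict[str, Counters]]:
    """Count records and errors per control area and for the run as a whole.

    Each record is (area_key, has_error). Area counters restart at every
    break in the area key; run counters never reset, so processing can be
    stopped after a fixed number of records regardless of area breaks.
    """
    run = Counters()
    area = Counters()
    per_area: Dict[str, Counters] = {}
    current_area = None
    for area_key, has_error in records:
        if stop_after is not None and run.records >= stop_after:
            break
        if area_key != current_area:
            current_area = area_key
            area = Counters()  # reset on control-area break
            per_area[area_key] = area
        for counter in (area, run):
            counter.records += 1
            counter.errors += int(has_error)
    return run, per_area
```

A run-level count of this kind is what would allow a debugging run to stop after, say, the first few hundred records even when a control area has been defined.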
One clearly disturbing development, which needs to be pursued during in-depth testing of the system, concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in Figure 3, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size that most foreign country machines could not process. It is recommended that tests determine a realistic maximum value restriction for implementation, to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file; statistics at the total run level will not be available. This in part relates back to the previous discussion citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3
CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING
LINE NUMBER
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
Job step statistics for the generated program:
STEPNAME - LOAD; STEP COMPLETION CODE - 0
STEP CPU TIME - 1.65 SECONDS; VIRT 1000K
REGION (REQUESTED - 1000K, USED - 1000K); IO COUNT - 211
TAPES - 0; DISKS - 0; CARDS READ - 0; CARDS PUNCHED - 0; LINES PRINTED - 22
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the generally high performance character of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate, and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a User's Guide, as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document, one which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a user's guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the User's Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix for reference.
3 The development of the User's Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4 An explanation of the CONCOR benchmark program should appear in the User's Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5 The development of a CONCOR pocket card. This tool, which has proven useful in teaching and assisting programmers in utilizing a programming language, lays out all command options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then to exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable User's Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BuCen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches input file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
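Items 3C and 3D of the proposal above (tolerance checking of error rates by area, and capture of the IDs of areas failing the check) can be sketched in Python. The function name, the input layout, and the use of a simple errors-to-records ratio are assumptions made for illustration; the proposal does not state how the error rate is computed.

```python
from typing import Dict, List, Tuple


def failing_areas(area_stats: Dict[str, Tuple[int, int]],
                  tolerance: float) -> List[str]:
    """Return the IDs of areas whose error rate exceeds the tolerance.

    area_stats maps an area ID to (errors, records). Areas with no records
    are skipped, since no rate can be computed for them.
    """
    failed = []
    for area_id, (errors, records) in area_stats.items():
        if records == 0:
            continue
        if errors / records > tolerance:
            failed.append(area_id)
    return failed
```

The list returned would correspond to the captured IDs of item 3D, which could then be written to a reject file or listed for manual review.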
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR, September 30th Version, as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on the completeness, usefulness and efficiency of the system and on its ease of use by developing country programmers and other census personnel.
cc S Olds APHA; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks; Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
Dictionary Division Command Statements
9:30 am - 10:30
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems; Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25
- Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems; Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15
- If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25
- Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
- Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem; Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- Submission of written evaluations
- Distribution of IBM installation tapes
- Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2 If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords provide the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items which have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
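The reason the input file must be presorted on the AREA-CONTROL field can be illustrated with a short Python sketch. It is not CONCOR code; the record layout and the function name `area_breaks` are hypothetical. A processor that detects area breaks only by comparing adjacent records treats every return to an earlier area as a new area.

```python
from itertools import groupby
from operator import itemgetter
from typing import List, Tuple


def area_breaks(records: List[dict]) -> List[Tuple[str, int]]:
    """Return (area, record count) for each run of contiguous records."""
    return [(area, len(list(group)))
            for area, group in groupby(records, key=itemgetter("area"))]


unsorted_file = [{"area": "02"}, {"area": "01"}, {"area": "02"}]
presorted_file = sorted(unsorted_file, key=itemgetter("area"))

print(area_breaks(unsorted_file))   # area 02 is reported twice
print(area_breaks(presorted_file))  # one summary per area
```

On the unsorted file area 02 is summarized in two separate pieces; after the external sort each area yields a single summary, which is the behaviour the reports of this division depend on.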
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
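The automatic range check proposed in item (13) can be sketched as follows. This is an illustration in Python under stated assumptions, not CONCOR syntax: the `FieldSpec` structure, the `range_check` function, the identifier P02-AGE, and the ranges themselves are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FieldSpec:
    """A user-identifier declared together with its range of valid values."""
    name: str
    low: int
    high: int

def range_check(questionnaire: dict, specs: list) -> list:
    """Return the names of items that are missing or outside the declared range."""
    failures = []
    for spec in specs:
        value = questionnaire.get(spec.name)
        if value is None or not (spec.low <= value <= spec.high):
            failures.append(spec.name)
    return failures

# Hypothetical declarations: the ranges would come from the data dictionary.
specs = [FieldSpec("P01-SEX", 1, 2), FieldSpec("P02-AGE", 0, 99)]
failures = range_check({"P01-SEX": 3, "P02-AGE": 41}, specs)
```

Keeping the ranges in the dictionary rather than in comment statements would let both the edit program and tabulation software read the same declarations.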
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
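The lookup table approach suggested in item (4) can be contrasted with a linear search in a few lines. This Python sketch is illustrative only: the report does not describe the code that DRECODE generates, so the recode pairs, the default value, and the function names are assumptions.

```python
def recode_linear(value, pairs, default):
    """Linear search: compare the value against each (old, new) pair in turn."""
    for old, new in pairs:
        if value == old:
            return new
    return default

def build_lookup(pairs):
    """Build the table once; for duplicate codes the first pair wins, as in the search."""
    table = {}
    for old, new in pairs:
        table.setdefault(old, new)
    return table

def recode_lookup(value, table, default):
    """Table lookup: one probe per value regardless of the number of pairs."""
    return table.get(value, default)

# Hypothetical recode pairs (old code, new code).
pairs = [(1, 10), (2, 20), (3, 30)]
table = build_lookup(pairs)
```

The table is built once before processing begins, which mirrors the extra communication between generating programs that the item anticipates; each record then costs a single probe instead of a scan of the pair list.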
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
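The weighting scheme of item (5) amounts to multiplying each sample observation by its sampling weight before totals are formed. A minimal Python sketch follows; the data, the weights, and the function names are hypothetical, and the report does not specify how CONCOR would store or apply weights.

```python
def weighted_total(cases):
    """Sum of value times weight over (value, weight) pairs."""
    return sum(value * weight for value, weight in cases)

def weighted_mean(cases):
    """Weighted total divided by the sum of the weights."""
    total_weight = sum(weight for _, weight in cases)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return weighted_total(cases) / total_weight

# Hypothetical sample: (persons in household, sampling weight).
cases = [(4, 10.0), (2, 30.0)]
estimated_persons = weighted_total(cases)
mean_household_size = weighted_mean(cases)
```

With these two cases the unweighted mean household size would be 3, while the weighted mean is 2.5, which shows why summary statistics from a sample are not meaningful without the weights.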
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned -- possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout: CONCOR-EDITOR EXECUTION STATISTICS page for the division-by-zero test. The listing repeats the diagnostic DIVISION BY ZERO, LINE NUMBER 118 several times and ends with RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. Superimposed program text (source lines 111-125): registers R001 and R003 are set to zero, and line 118 reads LET R003 = R002 / R001.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason -- ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Execution Statistics printout: CONCOR-EDITOR EXECUTION STATISTICS page for the array coordinate test. Superimposed program text (source lines 173-187): two nested DO loops vary R001 and R002 from 1 by 1 until each exceeds 10; inside them, line 180 is an ALLOCATE from TA1 (R001 R002), line 181 is an UPDATE of TA2 (R001 R002), and lines 182-185 are LIST statements; the loops are closed by END-DO at lines 186 and 187.]
Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. References to nonexistent elements caused the error. The program text has been superimposed on this page for illustration purposes.
[Execution Statistics printout: the listing reports JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. Superimposed program text (source lines 214-218): IF IN-RECORD-COUNT > 500 THEN STOP-RUN END-THEN, followed by a second IF on IN-RECORD-COUNT that issues a LIST message.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
II THE ADEQUACY OF CONCOR
CONCOR has been described by its designers as an adequate package. Adequacy as an evaluative criterion is often relative to need and should not be confused with readiness as an issue. The CONCOR system, exclusive of documentation, is sufficiently complete that in a situation of extreme need it could be used as a data-cleaning tool in the editing and imputation phase of census processing. Less extreme circumstances would impose reticence on such an endorsement. Though non-exhaustive tests indicate that CONCOR appears to be capable of performing all of the commands as implemented, because of the rapidity with which the system was rewritten it is thought that there has not been enough time to fully test all aspects of the project. Therefore, prior to its general dissemination, it is recommended that an independent agency conduct exhaustive tests to certify the integrity of the system programs. The importance of this certification cannot be overstated in light of previous workshop experiences. Concurrent with this testing process, the same agency should determine the relative speed and size of the system under actual production circumstances and further determine CONCOR's ease of installation. Later sections of this discussion set forth additional testing recommendations.
It is generally recognized that of all the data-cleaning tools available for exportation, CONCOR is potentially the most powerful, especially with the addition of its new commands as outlined in Appendix F. While its utility is not in doubt, one must ask how much more useful CONCOR could be if modified, and whether this additional utility would be worth the costs involved. The modifications (excluding documentation) to COBOL CONCOR appropriate for consideration at this time are threefold:
1 Adjustments to the elements of the system which are internally inconsistent or awkward, to facilitate its learnability and usability among developing country programmers
a Implementation of the division headings DICTIONARY-DIVISION EXECUTION-DIVISION and REPORT-DIVISION De-emphasis of section headings
b Improvement of the consistency among data identifiers: allow alphanumeric variables to be coded without mandatory comparison strings throughout the DATA-DIVISION and to be of the same length as numeric variables. Permit numeric identifiers to be of an equal length to NEW-DATA identifiers. Permit the coding of single-dimension row and column vectors in the same manner as multi-dimensional arrays
2 Implementation of selective commands and internal variables to facilitate the production environment use of CONCOR in census applications These include
a LOAD/UNLOAD arrays: commands which would automatically save and replace hot-decked values from batch to batch
b TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal
variables independent of AREA CONTROL
3 Other modifications
a Default values for max-storage parameter set in realistic range
b Allowance of more variables for survey applications
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply in a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language as well as observations and discussions with other workshop participants. While it is probably impossible ever to be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent from all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old December 1978 and the new December 1979 editions of CONCOR. Studying this appendix makes obvious the fact that while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant conceptual enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of the editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period so as to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization do not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section name implementation, while seemingly cosmetic, can have a real impact on the perceived ease of learning and using the language.
Figure 1 on the following page illustrates common mistakes programmers make coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item, and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, some user variables defined in NEW-DATA with a length of 18 could be moved inadvertently to output record fields defined by a data definition statement such as N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23 QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force a recode of the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
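The field notation explained above (type letter, length, starting column) and the forced recode of alphanumeric data can be made concrete with a short Python sketch. This is an illustration, not CONCOR code. Two points are assumptions because the report does not state them: a matched comparison string is recoded here to its 1-based position in the list, and -1 stands in for the "unique negative value" assigned when no match is found. All function names are hypothetical.

```python
import re
from typing import NamedTuple

class FieldSpec(NamedTuple):
    kind: str    # "N" for numeric, "A" for alphanumeric
    length: int  # field length in bytes
    start: int   # 1-based starting column in the record

def parse_spec(spec: str) -> FieldSpec:
    """Parse notation such as N2-4 (numeric, length 2, starting in column 4)."""
    match = re.fullmatch(r"([NA])(\d+)-(\d+)", spec)
    if match is None:
        raise ValueError(f"not a field specification: {spec!r}")
    return FieldSpec(match.group(1), int(match.group(2)), int(match.group(3)))

def extract(record: str, spec: FieldSpec) -> str:
    """Cut the field out of a fixed-width record."""
    return record[spec.start - 1:spec.start - 1 + spec.length]

def recode_alpha(text: str, comparison_strings: list) -> int:
    """Assumed recode rule: position of the matching string, else -1."""
    for position, candidate in enumerate(comparison_strings, start=1):
        if text == candidate:
            return position
    return -1

record = "X12345678"
field = extract(record, parse_spec("N2-4"))
```

For example, `recode_alpha("F", ["M", "F"])` yields 2 under the assumed rule, while any unlisted value yields -1, which is why a field using the whole alphabet cannot be represented with only three comparison strings.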
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it would never be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is currently no way to preserve these values if desired.
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e. A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e. allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing concerns the fact that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e. preventing the program from starting with cold values each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing would be reduced to a minimum between batches and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion in the package prior to its general distribution.
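The effect of a LOAD/UNLOAD ARRAYS capability can be sketched in Python. This is a hypothetical illustration of the idea only: the file format (JSON), the function names, and the array contents are assumptions and do not describe any CONCOR command.

```python
import json
import os
import tempfile

def unload_arrays(arrays: dict, path: str) -> None:
    """Save the current hot deck values at the end of a batch."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(arrays, handle)

def load_arrays(path: str, cold_deck: dict) -> dict:
    """Resume from saved hot values, or start from a copy of the cold deck."""
    if not os.path.exists(path):
        return {name: [row[:] for row in rows] for name, rows in cold_deck.items()}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

# Hypothetical 2 x 2 imputation array with its cold deck start-up values.
cold_deck = {"A05": [[2, 1], [3, 4]]}
path = os.path.join(tempfile.mkdtemp(), "hot_deck.json")

batch1 = load_arrays(path, cold_deck)  # first volume starts with cold values
batch1["A05"][0][0] = 7                # editing updates the hot deck
unload_arrays(batch1, path)

batch2 = load_arrays(path, cold_deck)  # next volume resumes with hot values
```

The second batch begins with the value 7 left by the first rather than the cold value 2, which is the manual step between volumes that the proposed command would remove.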
FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
[Cold deck values for A05: a table of 4 rows by 4 columns. Columns: AGE OF HUSBAND 12-17, 18-24, 25-35, 36+. Rows: RELATION HEAD, CHILD, OTHER, NONRELATIVE.]
Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:
user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start-up values
In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error messages below.)
WARNING (DD-207) COMMAND TERMINATOR () NOT FOUND () ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e. A06 must be coded as follows (example of how vectors must currently be coded to be correct):
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g. debugging CONCOR programs or comparing input files. Therefore, another set of pointers should be implemented for this purpose and made available for programmer reference.
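The distinction between counters that reset at each control area break and counters kept for the whole run can be sketched as follows. This Python illustration is hypothetical: the class and attribute names are not CONCOR identifiers, and the area keys are invented.

```python
class Counters:
    """Keeps one record count per control area and one for the entire run."""

    def __init__(self):
        self.area_records = 0      # reset at each control area break
        self.run_records = 0       # never reset during the run
        self._current_area = None

    def read(self, area_key):
        """Register one input record belonging to the given control area."""
        if area_key != self._current_area:
            self._current_area = area_key
            self.area_records = 0
        self.area_records += 1
        self.run_records += 1

counters = Counters()
for key in ["01", "01", "02"]:
    counters.read(key)
```

After the three records above, the area counter holds 1 (the break to area 02 reset it) while the run counter holds 3. A test such as stopping after 500 records read in total needs the second kind of counter whenever a control area is defined.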
One clearly disturbing development, which needs to be pursued during in-depth testing of the system, concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in Figure 3, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum-value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL-AREA as encountered by the EDITOR program on the input file; statistics at the total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
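The two recommendations above (run-level totals that remain available when a control area is defined, and a user-supplied report label) can be illustrated with a short sketch. It is written in Python rather than CONCOR, and every name in it (`summarize_edit_statistics`, `format_report`, the record layout, the label text) is hypothetical and chosen only for illustration; it does not reproduce how EDITOR or the REPORT-DIVISION is actually implemented.

```python
from collections import Counter


def summarize_edit_statistics(error_records, report_label):
    """Tally error counts per control area and for the whole run.

    error_records: iterable of (control_area, error_code) pairs
                   (an assumed layout, not CONCOR's error-file format).
    report_label:  free text identifying the volume or file processed.
    """
    by_area = {}
    run_total = Counter()
    for area, code in error_records:
        by_area.setdefault(area, Counter())[code] += 1
        run_total[code] += 1  # run-level totals are kept alongside area totals
    return {"label": report_label, "by_area": by_area, "run_total": run_total}


def format_report(summary):
    """Render the summary as text lines, with the label as the first line."""
    lines = [f"REPORT: {summary['label']}"]
    for area in sorted(summary["by_area"]):
        for code, count in sorted(summary["by_area"][area].items()):
            lines.append(f"AREA {area}  {code}  {count}")
    for code, count in sorted(summary["run_total"].items()):
        lines.append(f"TOTAL RUN  {code}  {count}")
    return "\n".join(lines)
```

Because the label is carried in the summary itself, two listings produced from different volumes can be told apart from their first line, while the area-level and run-level counts are both printed.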
FIGURE 3

CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING

LINE NUMBER
  72  DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
  73      MAX-STORAGE= 999
  74      RECORD-TYPE= [value illegible]
 268  DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
 269      MAX-STORAGE= 999
 270      RECORD-TYPE= [value illegible]

Job log for step LOAD:
STEP COMPLETION CODE - 0
REGION (REQUESTED - 1000K  USED - 1000K)  IO COUNT 211
TAPES - 0  DISKS - 0  CARDS READ - 0  CARDS PUNCHED - 0  LINES PRINTED - 22
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance, the system provided messages and source program line references which enabled immediate identification of the processing difficulties. In addition, tests of the various commands revealed the general high-performance character of the language in its newest version. Under these circumstances, the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility, and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message number more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form, the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1. Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has proven useful in teaching and assisting programmers in utilizing a programming language, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need, CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent, and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
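Items 3C and 3D of the proposal (tolerance checking of error rates by area, and capturing the IDs of areas that fail) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name, the input layout, and the choice of a simple errors-per-item rate are all hypothetical and are not taken from the CONCOR implementation.

```python
def areas_failing_tolerance(area_counts, tolerance):
    """Return (area_id, error_rate) for every area whose rate exceeds tolerance.

    area_counts: dict mapping area_id -> (errors, items_examined)
                 (an assumed layout for illustration only).
    tolerance:   maximum acceptable error rate, as a fraction (0.10 = 10 percent).
    """
    failing = []
    for area_id, (errors, examined) in sorted(area_counts.items()):
        # An area with nothing examined has no measurable rate; treat it as 0.
        rate = errors / examined if examined else 0.0
        if rate > tolerance:
            failing.append((area_id, rate))
    return failing
```

For example, with counts of 5 errors in 100 items for one area and 12 in 100 for another, a tolerance of 0.10 would flag only the second area, whose ID could then be written to a reject list.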
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual, and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be completely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on the completeness, usefulness, efficiency, and ease of use of the system by developing country programmers and other census personnel.
cc Su-a-- e Olds APiA SDi OrWJulio Ortuizar luotolaCelta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
  Overview of Workshop
1000 - 1030 Introduction to CONCOR
  - Purpose and function
  - History of development
  - General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
  - Ways to interrogate data
  - Ways to correct data
  - Editing housing and population data - POPSTAN
  - Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
  - Constraints in design of CONCOR
  - Basic subsystems of CONCOR
  - User interactions with system
  - Examples of outputs produced
200 - 230 User Program Organization
  - Divisions
  - Sections
  - Routines
  - Commands
230 - 245 Break
245 - 325 Command Language Description
  - Types of statements
  - Format
  - Syntax
Tuesday January 8
Dictionary Division Command Statements
930 am - 1030
  - Punctuation
  - Input data referencing
  - Working data declarations
  - Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
  - Dictionary-Name
  File-Section
  - Input-File
  - Output-File
  - Write-File
  - Error-File
1200 - 115 pm Break
115 pm - 215 Input-Record-Section
  - Define-Record
  Working-Data-Section
  - New-Data
  - Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
  - Minimum dictionary structure
  - Maximum dictionary structure
  - Hand out dictionary problem
Wednesday January 9
930 am - 1030 Discussion of Participants Dictionary problems
  Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
  - Punctuation
  - Subscripting
  - Internal Identifiers
  - Report-Control-Section
    - Generate-Edit-Statistics
    - Count-Imputes
    - Examples
1200 - 115 pm Break
115 pm - 215 Execution Division Command Statements (continued)
  - Routines of Edit-Specifications-Section
    - Prolog-Routine
    - Filter-Routine
    - Epilog-Routine
  - Types and functions of edit specification commands
  - Range
  - Assert
215 - 230 Break
230 - 325
  - PassFail clauses
  - List
Thursday January 10
930 am - 1030 Discussion of Problems
  Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
  - Allocate
  - Update
  - Let
1200 - 115 pm Break
115 pm - 215
  - If
  - UntilExit
  - Stop
215 - 230 Break
230 - 325
  - Drecode
  - Grecode
Friday January 11
930 am - 1030 Edit Specification Command Statements (continued)
  - Output
  - Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
  - Display-Control-Section
    - Display-Edit-Statistics
  - Tolerance-Control-Section
    - Error-Rate-Check
    - Reject-File
  - Report Examples
1200 - 115 pm Break
115 pm - 325 Hand out Edit Specifications as problem
  Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am-1030 Free work time 115-325 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR - submission of written evaluations Distribution of IBM installation tapes Presentation of Certificates to Participants
1200-115 pm Break
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) Nasr City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2 If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1, all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, it provides the user with the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
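The presorting requirement noted above follows from detecting area breaks by comparing each record's control field with that of the record before it. The Python sketch below illustrates that behavior with the standard library's `itertools.groupby`, which likewise groups only adjacent items; the function names and the two-character area prefix are hypothetical and stand in for whatever AREA-CONTROL field a real dictionary would define.

```python
from itertools import groupby


def area_of(record):
    """Extract the control-area field (assumed here to be the first 2 characters)."""
    return record[:2]


def area_breaks(records):
    """Return (area, record_count) for each contiguous run of records."""
    return [(area, sum(1 for _ in run)) for area, run in groupby(records, key=area_of)]


unsorted_file = ["02-A", "01-B", "02-C", "01-D"]
presorted_file = sorted(unsorted_file, key=area_of)
```

On the unsorted file, `area_breaks` reports four separate breaks (areas 02, 01, 02, 01 with one record each), so statistics for the same area would be split across several report sections. On the presorted file it reports two breaks of two records each.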
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
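The automatic range check proposed in item (13) can be illustrated with a minimal Python sketch. It is not CONCOR code: the `ItemSpec` structure, the `range_check` function and the 1 to 9 ranges are hypothetical, chosen only to show how declared ranges in a dictionary could be applied before the user's edit logic sees the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemSpec:
    """Hypothetical dictionary entry: an item name plus its declared valid range."""
    name: str
    low: int
    high: int


def range_check(record, specs):
    """Return (name, value) pairs that are missing or fall outside the declared range."""
    failures = []
    for spec in specs:
        value = record.get(spec.name)
        if value is None or not (spec.low <= value <= spec.high):
            failures.append((spec.name, value))
    return failures


# Item names are borrowed from Figure 1; the ranges are assumptions.
specs = [
    ItemSpec("H01-TYPE-OF-HOUSING-UNIT", 1, 9),
    ItemSpec("H02-MATERIAL-OF-ROOF", 1, 9),
]
record = {"H01-TYPE-OF-HOUSING-UNIT": 3, "H02-MATERIAL-OF-ROOF": 12}
print(range_check(record, specs))
```

Because the ranges live in the dictionary rather than in comment statements, the same declarations could in principle be read by tabulation software, which is the secondary benefit the item mentions.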
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach.
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs.
(5) Permit the user to continue the message text field over
successive lines of code.
(6) Implementation of a SET command, a command that would change
CONCOR's default occurrence pointer for the duration of the SET
command. This command would be very helpful, as the validation
of the existence of a record would only have to be done once,
when the SET command was encountered. The major drawback to
this command is in the user's ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement. This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed. The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers.
(9) Implement automatic array declarations for capturing allocation
frequency distributions. This could be done by the system for
discrete values (especially if point number 13 above is
implemented), but a mechanism for handling continuous values
would need to be designed. Another program would then be
required to format the information gathered into a readable
form.
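The lookup table approach suggested in item (4) for the recode command can be contrasted with a linear search in a short Python sketch. The function names and the sample recode pairs are hypothetical and do not reflect the code that the CONCOR generator actually emits; the sketch only shows that both techniques give the same answers while the table is searched in a single step.

```python
def recode_linear(value, pairs, default):
    """Linear search: compare the value against each (old, new) pair in turn."""
    for old, new in pairs:
        if value == old:
            return new
    return default


def build_lookup(pairs):
    """Build the table once, ahead of execution; the first pair wins on duplicates."""
    table = {}
    for old, new in pairs:
        table.setdefault(old, new)
    return table


def recode_lookup(value, table, default):
    """Table lookup: one hashed access regardless of how many pairs were declared."""
    return table.get(value, default)


# Hypothetical recode pairs used only for illustration.
pairs = [("M", 1), ("F", 2), ("M", 9)]
table = build_lookup(pairs)
for value in ("M", "F", "X"):
    print(value, recode_linear(value, pairs, -1), recode_lookup(value, table, -1))
```

The cost of the table is that it must be built before the edit run starts, which corresponds to the extra communication between the generating programs that the item anticipates.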
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means
of saving paper.
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided. This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began. This
extra procedure is required because of program size
considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather only that the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
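The check-in procedure described in item (2) amounts to comparing the questionnaire identifiers found on the data file with a master list of selected cases. The following Python sketch shows the idea; the `check_in` function and the sample identifiers are hypothetical and are not part of CONCOR.

```python
def check_in(master_ids, observed_ids):
    """Compare expected sample cases against those actually found on the input file."""
    master = set(master_ids)
    observed = set(observed_ids)
    return {
        "missing": sorted(master - observed),      # selected but never received
        "unexpected": sorted(observed - master),   # received but not on the master list
        "checked_in": sorted(master & observed),
    }


# Hypothetical questionnaire identifiers.
master_list = ["0001", "0002", "0003", "0004"]
found_on_file = ["0001", "0003", "0009"]
print(check_in(master_list, found_on_file))
```

A production version would presumably read both lists sequentially from sorted files rather than hold them in memory, but the comparison performed is the same.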
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
Old CONCOR                          New CONCOR
 1 EOF-FLAG                         EOF-FLAG
 2 ERROR-IN-QUESTIONNAIRE           ERRORS-IN-QUESTIONNAIRE
 3 CURRENT-POINTER-VALUE
 4 TYPE-COUNT ( )                   TYPE-COUNT ( )
 5 RECORD-COUNT                     IN-RECORD-COUNT
 6 QUESTIONNAIRE-COUNT              IN-QUESTIONNAIRE-COUNT
 7 RECORDS-IN-STORE                 RECORDS-IN-STORE
 8 INVALID-RECORD-TYPE-COUNT        INVALID-RECORD-TYPE-COUNT
 9 INCOMPLETE-FLAG                  INCOMPLETE-FLAG
10 CONTINUATION-FLAG                CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
                                    OUT-OF-RANGE
                                    NOT-NUMBER
                                    BLANK
                                    (new internal counters)
                                    OUT-RECORD-COUNT
                                    OUTPUT-QUEST-COUNT
                                    WRITE-RECORD-COUNT

Notes
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas. These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout, first test. The message DIVISION BY ZERO, LINE NUMBER 118 is printed repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. Source lines 112-125 of the test program are superimposed on the page: registers R001 and R003 are set to zero on line 115, and the LET statement on line 118 then divides R002 by R001.]

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
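The behaviour shown in this first test (a diagnostic naming the offending source line, and an abort once an error limit is reached) can be mimicked in a few lines of Python. This is only an illustrative sketch: the `run_statements` function, the error limit of 3 and the register values are assumptions, and the message wording is copied from the printout rather than taken from the EDITOR source.

```python
def run_statements(statements, error_limit=5):
    """Execute (line_number, action) pairs, trapping division by zero.

    Each trapped error produces a message naming the source line; the run is
    abandoned once error_limit errors have been recorded.
    """
    messages = []
    errors = 0
    for line_number, action in statements:
        try:
            action()
        except ZeroDivisionError:
            errors += 1
            messages.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
            if errors >= error_limit:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                break
    return messages


# Registers named after those in the test program; the values are assumptions.
registers = {"R001": 0, "R002": 7, "R003": 0}


def divide():
    registers["R003"] = registers["R002"] / registers["R001"]


program = [(118, divide)] * 3   # the same statement reached on three questionnaires
for message in run_statements(program, error_limit=3):
    print(message)
```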
[Execution Statistics printout, second test: array coordinate error diagnostics, each giving the questionnaire, coordinate, value and source line number. Source lines 173-187 of the test program are superimposed on the page: two nested loops vary R001 and R002 from 1 by 1 up to 10; line 180 is an ALLOCATE from TA1 (R001 R002), line 181 is UPDATE TA2 (R001 R002) R003, and the remaining lines are LIST statements and the two END-DO commands.]

Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Execution Statistics printout, third test. Messages printed: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. Source lines 214-218 of the test program are superimposed on the page: an IF on IN-RECORD-COUNT with THEN STOP-RUN END-THEN, followed by a second IF on IN-RECORD-COUNT with a LIST command.]

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string
a. LOAD/UNLOAD arrays: commands which would save and replace automatically hot-decked values from batch to batch
b. TOTAL-QUESTIONNAIRE-COUNT-RECORD-COUNT internal variables independent of AREA CONTROL
3. Other modifications
a. Default values for max-storage parameter set in realistic range
b. Allowance of more variables for survey applications
Some of these modifications are part of what ISPC calls its wish list for the future development of CONCOR. This document has been included in this report as Appendix G. It is arguable that these features are essential to the completion of the CONCOR package. While it is beyond the scope of this report to draw a conclusion in this area, the enhancements as outlined above are ones that would make the language more internally consistent and thereby easier to learn and apply to a census data production environment. These modifications are not arbitrary or cosmetic but are a direct result of hands-on programming experience in the language as well as observations and discussions with other workshop participants. While it is probably impossible to ever be satisfied with the overall structure of any programming language, the resolution of this issue of completeness must be made relative to the objectives for developing the COBOL CONCOR system in the first place. An explicit statement of these objectives has been absent in all systems documentation to date.
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner differences between the old (December 1978) and the new (December 1979) editions of CONCOR. Studying this appendix makes obvious the fact that, while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant ideological enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization do not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of division and section names implementation, while seemingly cosmetic, can have real impact on its perceived easiness of learning and use.
Figure 1 on the following page illustrates common mistakes programmers make coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of the inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item, and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, typically some user variables defined in NEW-DATA as N18 could be moved inadvertently to output record fields defined by a data definition statement N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of the numeric is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4: This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2: This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force recode the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it never would be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is no current way to preserve these values if desired.
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e. A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
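The field notation discussed above (a type letter, a length, a hyphen and a starting column, as in N2-4 or A9-23) can be made concrete with a small Python sketch. It is an illustration only, not the CONCOR parser: the function names are hypothetical, columns are assumed to be counted from 1, and the two limit tables simply encode the current maximum lengths quoted in the text (9 for numeric, 4 for alphanumeric) and the uniform format this report proposes.

```python
import re
from typing import NamedTuple


class FieldSpec(NamedTuple):
    kind: str     # "N" numeric or "A" alphanumeric
    length: int   # length of the item in bytes
    start: int    # starting column, counted from 1 (an assumption)


_SPEC = re.compile(r"^([NA])(\d+)-(\d+)$")

CURRENT_LIMITS = {"N": 9, "A": 4}    # maximum lengths quoted in the text
PROPOSED_LIMITS = {"N": 9, "A": 9}   # alphanumeric treated like numeric


def parse_spec(text, limits=CURRENT_LIMITS):
    """Parse a definition such as 'N2-4' and enforce the length limits."""
    match = _SPEC.match(text)
    if match is None:
        raise ValueError(f"not a field definition: {text!r}")
    kind, length, start = match.group(1), int(match.group(2)), int(match.group(3))
    if length < 1 or start < 1:
        raise ValueError(f"length and start must be positive: {text!r}")
    if length > limits[kind]:
        raise ValueError(f"{text!r} exceeds maximum length {limits[kind]}")
    return FieldSpec(kind, length, start)


def extract(record, spec):
    """Return the characters the definition selects from an input record."""
    return record[spec.start - 1 : spec.start - 1 + spec.length]


print(parse_spec("N2-4"))
print(extract("ABCDEFGH", parse_spec("A4-2")))
print(parse_spec("A9-23", limits=PROPOSED_LIMITS))
```

Under the current limits the definition A9-23 is rejected, while under the proposed limits it parses exactly like N9-23, which is the consistency the paragraph above argues for.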
Another data-naming convention which caused several errors, and which could be corrected, concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e. allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing concerns the fact that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e. preventing the program from starting with cold values each batch run. If a command such as LOAD/UNLOAD ARRAYS was incorporated into the language (an enhancement not believed to be difficult to implement), manual processing would be reduced to a minimum between batches and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, right now it is not possible to automatically load those values back to an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion into the package prior to its general distribution.
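The save-and-reload cycle envisioned for hot-deck arrays can be sketched in Python as follows. This is an illustration of the idea only, not a proposed CONCOR implementation: the function names `unload_arrays` and `load_arrays`, the JSON file format and the sample array are all assumptions.

```python
import copy
import json
import os
import tempfile


def unload_arrays(arrays, path):
    """Save the current (hot) imputation arrays at the end of a batch."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(arrays, handle)


def load_arrays(path, cold_values):
    """Start a batch with saved hot values, or with cold values on the first run."""
    if not os.path.exists(path):
        return copy.deepcopy(cold_values)
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical cold deck for a 2 x 2 array; the values are illustrative.
cold = {"A05": [[2, 1], [3, 4]]}

with tempfile.TemporaryDirectory() as folder:
    save_file = os.path.join(folder, "hotdeck.json")
    arrays = load_arrays(save_file, cold)    # batch 1 starts cold
    arrays["A05"][0][0] = 5                  # value updated during editing
    unload_arrays(arrays, save_file)
    arrays = load_arrays(save_file, cold)    # batch 2 resumes with hot values
    print(arrays["A05"])
```

The point of the sketch is that no manual step is needed between volumes: the second batch begins with the values the first batch ended with.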
FIGURE 2

A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
[Cold deck value table for A05: rows by RELATION (HEAD, CHILD, OTHER, NONRELATIVE), columns by AGE OF HUSBAND (12-17, 18-24, 25-35, 36+)]

Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:
user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start-up values
In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error messages below.)
WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563

As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e. A06 must be coded as follows (example of how vectors must currently be coded to be correct):
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values

A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when the termination in the processing mode would be advantageous based on run counts although a CONTROL-AREA has been specified, e.g. debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
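The difference between counters that reset at each control area break and counters that accumulate over the whole run can be shown with a short Python sketch. The `count_by_area` function and the sample area keys are hypothetical; the sketch only illustrates why a second, run-level set of identifiers is useful when a control area has been defined.

```python
def count_by_area(records):
    """Return (area, records_in_area, records_in_run) at each control area break.

    records is an iterable of (area_key, record) pairs in file order, with the
    records of each area physically contiguous.
    """
    run_total = 0
    area_total = 0
    current_area = None
    snapshots = []
    for area_key, _record in records:
        if area_key != current_area:
            if current_area is not None:
                snapshots.append((current_area, area_total, run_total))
            current_area = area_key
            area_total = 0        # area-level counter resets on each break
        area_total += 1
        run_total += 1            # run-level counter is never reset
    if current_area is not None:
        snapshots.append((current_area, area_total, run_total))
    return snapshots


# Hypothetical input: two control areas with three and two records.
sample = [("01", "a"), ("01", "b"), ("01", "c"), ("02", "d"), ("02", "e")]
print(count_by_area(sample))
```

A test such as "stop after the first 500 records of the run" can only be expressed against the run-level counter, since the area-level counter returns to zero at every break.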
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameters of the DEFINE-RECORD statement. As shown in the figure on the following page, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR generated programs can be influenced significantly by the amount or nature of programmer specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic max-value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics by total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the whole name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3

CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING

LINE NUMBER
  72  DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
  73  MAX-STORAGE= 999
 268  DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
 269  MAX-STORAGE= 999

[Job accounting output for step LOAD: STEP CPU TIME 1.65 SECONDS, VIRT 1000K, REGION (REQUESTED - 1000K, USED - 1000K), I/O COUNT 211, LINES PRINTED 22. Lines 74 and 270 of the listing carry the RECORD-TYPE clauses of the two records.]
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. Moreover, tests of the various commands revealed the general high performance character of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate and could bear no changes except to provide message number tabs to enable a programmer to more quickly locate the text of a message number. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that with some reorganization and editing it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can be best described as a working document, a document which was not intended to teach the CONCOR language but centered upon an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63 -- these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1. Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the User's Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the User's Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable User's Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
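Items 3C and 3D of the proposal above (tolerance checking of error rates by area, and capture of the IDs of areas failing the check) can be illustrated with a minimal sketch. This is Python rather than CONCOR code, and it is not taken from the ISPC implementation: the function name, the record layout (area ID, error count, items checked) and the sample tolerance of 5 percent are all assumptions made for illustration.

```python
from collections import defaultdict

def areas_failing_tolerance(records, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    records   -- iterable of (area_id, errors, items_checked) tuples
                 (a hypothetical layout, not a CONCOR file format)
    tolerance -- maximum acceptable error rate as a fraction, e.g. 0.05
    """
    errors = defaultdict(int)
    checked = defaultdict(int)
    for area_id, n_errors, n_checked in records:
        errors[area_id] += n_errors
        checked[area_id] += n_checked

    failing = []
    for area_id in sorted(checked):
        if checked[area_id] == 0:
            continue  # nothing was checked, so no rate can be computed
        rate = errors[area_id] / checked[area_id]
        if rate > tolerance:
            failing.append(area_id)
    return failing

sample = [("A01", 2, 100), ("A01", 1, 100), ("A02", 12, 100), ("A03", 0, 50)]
print(areas_failing_tolerance(sample, 0.05))
```

In this sketch the error rate is accumulated per area before it is compared with the tolerance, so a single bad questionnaire does not by itself fail an otherwise clean area.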
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR, September 30th Version, as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [...]ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness and efficiency of the system and ease of use by developing country programmers and other census personnel
cc Su-a--e Olds APHA, Julio Ortuzar Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
    Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
    - Purpose and function
    - History of development
    - General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
    - Ways to interrogate data
    - Ways to correct data
    - Editing housing and population data - POPSTAN
    - Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
    - Constraints in design of CONCOR
    - Basic subsystems of CONCOR
    - User interactions with system
    - Examples of outputs produced
2:00 - 2:30 User Program Organization
    - Divisions
    - Sections
    - Routines
    - Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
    - Types of statements
    - Format
    - Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
    - Punctuation
    - Input data referencing
    - Working data declarations
    - Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
    - Dictionary-Name
    File-Section
    - Input-File
    - Output-File
    - Write-File
    - Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
    - Define-Record
    Working-Data-Section
    - New-Data
    - Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
    - Minimum dictionary structure
    - Maximum dictionary structure
    - Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants Dictionary problems
    Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
    - Punctuation
    - Subscripting
    - Internal Identifiers
    - Report-Control-Section
        - Generate-Edit-Statistics
        - Count-Imputes
        - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
    - Routines of Edit-Specifications-Section
        - Prolog-Routine
        - Filter-Routine
        - Epilog-Routine
    - Types and functions of edit specification commands
    - Range
    - Assert
2:15 - 2:30 Break
2:30 - 3:25
    - Pass/Fail clauses
    - List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
    Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
    - Allocate
    - Update
    - Let
12:00 - 1:15 pm Break
1:15 pm - 2:15
    - If
    - Until/Exit
    - Stop
2:15 - 2:30 Break
2:30 - 3:25
    - Drecode
    - Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
    - Output
    - Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
    - Display-Control-Section
        - Display-Edit-Statistics
    - Tolerance-Control-Section
        - Error-Rate-Check
        - Reject-File
    - Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
    Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
    - for census processing
    - for survey processing
    - manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
    - Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
    - Submission of written evaluations
    Distribution of IBM installation tapes
    Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer and/or in training users on the packages applicability to their work
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
    DEFINE INPUT-FILE
    DEFINE ERROR-FILE
    DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
    DEFINE RECORD-TYPE=
    DEFINE NEW-DATA
    DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
    DICTIONARY-ATTRIBUTES-SECTION
        DICTIONARY-NAME
    FILE-SECTION
        INPUT-FILE
        OUTPUT-FILE
        WRITE-FILE
        ERROR-FILE
    IDENTIFICATION-CONTROL-SECTION
        AREA-CONTROL
        QUESTIONNAIRE-CONTROL
        RECORD-CONTROL
    INPUT-RECORD-SECTION
        DEFINE-RECORD
    WORKING-DATA-SECTION
        NEW-DATA
        ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS

Version 1 (December 1978): ALLOCATE (ALL)
Version 2 (December 1979): ALLOCATE (ALLOC)
Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

Version 1 (December 1978): ASSERT (AST)
Version 2 (December 1979): ASSERT (ASRT)
Comments: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of dumping the current value of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

Version 1 (December 1978): (see RECODE)
Version 2 (December 1979): DRECODE (DRCD)
Comments: Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous RECODE command.

Version 1 (December 1978): (none)
Version 2 (December 1979): EXIT
Comments: Provides the means to terminate execution (escape) of command statements within the UNTIL command.

Version 1 (December 1978): END
Version 2 (December 1979): END-IF END-THEN END-ELSE
Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

Version 1 (December 1978): ERROR
Version 2 (December 1979): not implemented
Comments: No longer supported in language.

Version 1 (December 1978): FILTER TYPE ( ) WITH
Version 2 (December 1979): not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.

Version 1 (December 1978): (see XRECODE)
Version 2 (December 1979): GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1 (December 1978): IF
Version 2 (December 1979): IF
Comments: The IF statement has properties found in virtually all programming languages, ie it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1 (December 1978): LET
Version 2 (December 1979): LET
Comments: Virtually unchanged.

Version 1 (December 1978): (none)
Version 2 (December 1979): LIST
Comments: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.

Version 1 (December 1978): (none)
Version 2 (December 1979): OUTPUT
Comments: New command which provides the means by which records will be written to the output-file.

Version 1 (December 1978): RANGE (RNG)
Version 2 (December 1979): RANGE (RNG)
Comments: RANGE is a basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

Version 1 (December 1978): RECODE (REC)
Version 2 (December 1979): name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1 (December 1978): STOP keyword
Version 2 (December 1979): STOP keyword
Comments: Unchanged.

Version 1 (December 1978): (none)
Version 2 (December 1979): UNTIL DO END-DO
Comments: DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.

Version 1 (December 1978): UPDATE (UPD)
Version 2 (December 1979): UPDATE (UPD)
Comments: Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.

Version 1 (December 1978): WRITE (WRT)
Version 2 (December 1979): WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

Version 1 (December 1978): XRECODE (XREC)
Version 2 (December 1979): name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
    DISPLAY-CONTROL-SECTION
        DISPLAY-EDIT-STATISTICS
    TOLERANCE-CONTROL-SECTION
        ERROR-RATE-CHECK
        REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
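The presorting requirement noted in the comments above can be made concrete with a short sketch. It is written in Python, not in CONCOR, and only imitates a sequential area-break report; the function name and the record layout (area code, error count) are assumptions for illustration.

```python
from itertools import groupby

def area_breaks(records):
    """Summarize each contiguous run of records that share an area code.

    records -- list of (area_code, error_count) tuples (hypothetical layout).
    A new report line starts whenever the area code changes, which is how a
    sequential area break behaves when records cannot be merged.
    """
    return [
        (area, sum(count for _, count in run))
        for area, run in groupby(records, key=lambda rec: rec[0])
    ]

unsorted_file = [("001", 1), ("002", 4), ("001", 2)]

# Without presorting, area 001 appears as two separate report lines.
print(area_breaks(unsorted_file))

# After sorting on the area code, each area is reported once.
print(area_breaks(sorted(unsorted_file, key=lambda rec: rec[0])))
```

Sorting outside the edit run, as the report describes, is what guarantees one set of statistics per area.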
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
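The automatic range check proposed in item (13) above can be sketched briefly. The following is Python, not CONCOR dictionary syntax, and is not an ISPC design: the dictionary layout (start column, length, low, high), the field names and the fixed-column record are all hypothetical.

```python
# Hypothetical dictionary: field name -> (start column, length, low, high).
# In the proposal the valid range would be declared with the user-identifier.
DICTIONARY = {
    "AGE": (0, 2, 0, 98),
    "SEX": (2, 1, 1, 2),
}

def range_check(record, dictionary):
    """Return the names of fields that are non-numeric or out of range.

    The check runs before any user edit logic sees the data, which is the
    point of declaring the valid range in the dictionary itself.
    """
    failures = []
    for name, (start, length, low, high) in dictionary.items():
        text = record[start:start + length]
        if not text.isdigit() or not (low <= int(text) <= high):
            failures.append(name)
    return failures

print(range_check("251", DICTIONARY))  # age 25, sex 1
print(range_check("993", DICTIONARY))  # age 99 and sex 3 are both invalid
```

Because the ranges live in one table rather than in comment statements, other programs (for example tabulation software, as the item suggests) could read the same declarations.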
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
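The lookup table approach suggested for DRECODE in item (4) above can be illustrated in general terms. The sketch below is Python and says nothing about the code the EDITOR program actually generates; the age grouping, the function names and the table size are assumptions chosen only to contrast a chain of comparisons with a single indexed fetch.

```python
def recode_by_comparison(value):
    """Recode single years of age into four groups with a chain of tests."""
    if value <= 4:
        return 1
    if value <= 14:
        return 2
    if value <= 64:
        return 3
    return 4

def build_table(max_value):
    """Precompute the recoded value for every input from 0 to max_value."""
    return [recode_by_comparison(v) for v in range(max_value + 1)]

AGE_GROUP = build_table(98)  # built once, before any records are edited

def recode_by_lookup(value, table=AGE_GROUP):
    """Recode with one indexed fetch instead of repeated comparisons."""
    if not 0 <= value < len(table):
        raise ValueError("value outside the recode table")
    return table[value]

print(recode_by_lookup(3), recode_by_lookup(30), recode_by_lookup(80))
```

The table suits the case the report describes for DRECODE, where input values start at zero and ascend, because the value itself can serve as the index.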
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR User's Guide
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
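The check-in procedure proposed in item (2) amounts to comparing a master list of expected cases against the questionnaire identifiers actually found on the input file. The following is a minimal sketch in Python, not CONCOR syntax; the function name, the identifier format and the three result categories are assumptions made only for illustration.

```python
def check_in(master_ids, received_ids):
    """Compare the expected sample cases with those actually received.

    master_ids   -- identifiers on the (hypothetical) master list file
    received_ids -- identifiers read from the input file, in file order
    """
    expected = set(master_ids)
    seen = set()
    duplicates = []
    for qid in received_ids:
        if qid in seen:
            duplicates.append(qid)  # same case found more than once
        seen.add(qid)
    return {
        "missing": sorted(expected - seen),     # selected but never received
        "unexpected": sorted(seen - expected),  # received but not selected
        "duplicates": duplicates,
    }


result = check_in(["A1", "A2", "A3"], ["A2", "A4", "A2"])
```

A report of this kind could be produced before editing begins, so that missing cases are known before any imputation is attempted.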
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
OLD CONCOR                        NEW CONCOR
 1 EOF-FLAG                       EOF-FLAG
 2 ERROR-IN-QUESTIONNAIRE         ERRORS-IN-QUESTIONNAIRE
 3 CURRENT-POINTER-VALUE
 4 TYPE-COUNT ( )                 TYPE-COUNT ( )
 5 RECORD-COUNT                   IN-RECORD-COUNT
 6 QUESTIONNAIRE-COUNT            IN-QUESTIONNAIRE-COUNT
 7 RECORDS-IN-STORE               RECORDS-IN-STORE
 8 INVALID-RECORD-TYPE-COUNT      INVALID-RECORD-TYPE-COUNT
 9 INCOMPLETE-FLAG                INCOMPLETE-FLAG
10 CONTINUATION-FLAG              CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
                                  OUT-OF-RANGE
                                  NOT-NUMBER
                                  BLANK
                                  (new internal counters)
                                  OUT-RECORD-COUNT
                                  OUTPUT-QUEST-COUNT
                                  WRITE-RECORD-COUNT

Annotations:
Change in spelling of variable
Not mentioned -- possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR - EDITOR EXECUTION STATISTICS
(Printout page with the headings CONTROL AREA, SEQUENCE NUMBER, INPUTS READ, OUTPUTS BY OUTPUT COMMAND and OUTPUTS BY WRITE COMMAND; the run date and counts are not legible in this copy.)
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason -- ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.

Program text superimposed on the page (partly illegible):
114 LIST DIVISION BY ZERO ERROR NOW FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS R001 R003
118 LET R003 = R002 / R001
122 END TEST OF OPERANDS AND MSG
125 LET R001 R002 R003 = 0 (REGISTERS RESET TO ZERO)
CONCOR - EDITOR EXECUTION STATISTICS
(Printout page. The diagnostic messages produced by the array coordinate errors, which appear to cite the questionnaire, coordinate, value and line number, are not legible in this copy.)

Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Non-existent element references caused the error. The program text has been superimposed on this page for illustration purposes.

Program text superimposed on the page (partly illegible):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO
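The behaviour tested here, reporting a non-existent array element together with the offending source line instead of abending, can be sketched in Python as follows. This is an illustration only and not the EDITOR implementation: the exception class, the message wording and the line number passed in are all assumptions.

```python
class CoordinateError(Exception):
    """Raised when a subscript falls outside the declared array bounds."""


def update(array, row, col, value, line_number):
    """Store value at 1-based (row, col); report the source line on failure."""
    rows, cols = len(array), len(array[0])
    if not (1 <= row <= rows and 1 <= col <= cols):
        raise CoordinateError(
            f"ARRAY COORDINATE ERROR ({row},{col}) OUTSIDE {rows} X {cols} "
            f"LINE NUMBER {line_number}"
        )
    array[row - 1][col - 1] = value


# A 10 x 10 subscript range is applied to a 5 x 5 array, as in the test above.
ta2 = [[0] * 5 for _ in range(5)]
messages = []
for r in range(1, 11):
    for c in range(1, 11):
        try:
            update(ta2, r, c, r * c, line_number=181)  # 181 is illustrative
        except CoordinateError as exc:
            messages.append(str(exc))  # keep a diagnostic instead of abending
```

Only the 25 valid references are stored; each of the remaining 75 produces a diagnostic naming the statement that caused it.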
(Printout page with the headings CONTROL AREA, SEQUENCE NUMBER, INPUTS READ, OUTPUTS BY OUTPUT COMMAND and OUTPUTS BY WRITE COMMAND; the counts are not legible in this copy.)
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.

Program text superimposed on the page (partly illegible):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
III PROPOSED CHANGES TO THE LANGUAGE STRUCTURE
Based upon the assumption that it is the intent of sponsoring agencies to optimize the COBOL CONCOR package -- a goal which is believed currently obtainable -- an understanding of the nature of these changes and how they would impact users is essential. Appendix F sets forth in a comparative manner the differences between the old (December 1978) and the new (December 1979) editions of CONCOR. Studying this appendix makes it obvious that, while the new version of the language is clearly superior to the old in nearly every aspect, the basic and overall structure of the language is essentially unchanged. Compartmentalization of aspects of the language into divisions represents a significant conceptual enhancement to the language. Indeed, development of programs by divisions proved to be an extremely useful way of understanding the nature of the editing work to be performed. However, note that while the END-DIVISION command is essential to the language, the division headings DICTIONARY-DIVISION, EXECUTION-DIVISION and REPORT-DIVISION were not implemented and are therefore preceded by a period so as to be treated as comment lines in the program listing. It is inconsistent to implement END-DIVISION commands while not implementing the division headings. It is believed that this division structuring is important enough to the overall organizational structure of a CONCOR source language program that it should be implemented prior to general distribution. The section headers shown on the figures in Appendix F, however, are another matter. They are cumbersome and were generally not coded by workshop participants, and they could be deleted from this version of the language altogether with little loss in organizational understanding. The CONCOR language is sufficiently powerful to stand on its own as a distinct product and is not meant to be a COBOL imitation; its present degree of development and specialization does not warrant the structural drag of additional section identifiers. The probable intent of the original CONCOR project was to develop a package which was uncomplicated and not unwieldy to use. The question of implementing division and section names, while seemingly cosmetic, can have a real impact on the perceived ease of learning and using the language.
Figure 1 on the following page illustrates common mistakes programmers make in coding numeric and alphanumeric variables in the DATA-DIVISION. These mistakes are the result of inconsistent variable formats. For instance, in the numeric data definition statement it is permissible to specify N9-23, where N signifies numeric, 9 signifies the length of the item and 23 specifies the starting position in the record. In NEW-DATA, however, it is possible to code an item with a maximum length of 18. While on the surface this inconsistency would seem harmless, some user variables defined in NEW-DATA as N18 could be moved inadvertently to output record fields defined by a data definition statement such as N9-23; such an action would result in a data error. Under certain circumstances it would be highly desirable to output these larger length values. A similar circumstance exists between the numeric and alphanumeric data coding conventions. While the maximum length of a numeric item is permitted to be 9 in the data definition statement (18 in NEW-DATA), the maximum alphanumeric variable is permitted to be only 4 characters in length. In the current systems manual it is recommended that
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations

N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.

A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes starting in column 2 of the input record. This construction for an alphanumeric variable is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force a recode of the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it would never be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is currently no way to preserve these values if desired.
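The length inconsistency described above can be made concrete with a short sketch. The Python below is not CONCOR code: it only parses field specifications of the form N9-23 or A4-2 and applies the three limits quoted in this chapter (9 for numeric data definitions, 18 for numeric items in NEW-DATA, 4 for alphanumeric items). The function names and the context labels "record" and "new-data" are hypothetical.

```python
import re

# Maximum lengths quoted in the text; the context labels are illustrative.
MAX_LEN = {("N", "record"): 9, ("N", "new-data"): 18, ("A", "record"): 4}


def parse_field(spec):
    """Split a specification such as 'N9-23' into (type, length, start)."""
    match = re.fullmatch(r"([NA])(\d+)-(\d+)", spec)
    if match is None:
        raise ValueError(f"malformed field specification: {spec!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def check_length(kind, length, context):
    """True if the declared length is within the limit for that context."""
    return length <= MAX_LEN[(kind, context)]


def fits(value, target_length):
    """True if an integer value can be stored without losing digits."""
    return len(str(abs(value))) <= target_length


kind, length, start = parse_field("N9-23")
```

With these definitions, a ten-digit total accumulated in an 18-digit NEW-DATA item fails `fits(value, 9)`, which is the data error the text warns about when such an item is moved to an N9 output field.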
An unwieldy alternative, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e. A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
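The loss of information caused by the forced recode can be illustrated as follows. This is a sketch in Python under stated assumptions, not the EDITOR algorithm: the source says only that matched strings are recoded to a numeric value and that unmatched data receive a unique negative value, so the positional codes 1, 2, 3 and the value -1 used here are hypothetical.

```python
def recode(value, comparison_strings, unmatched_code=-1):
    """Recode an alphanumeric value to a number by position in the list.

    Assumption: the first comparison string maps to 1, the second to 2,
    and so on; anything else maps to a single negative code.
    """
    for code, candidate in enumerate(comparison_strings, start=1):
        if value == candidate:
            return code
    return unmatched_code


# With only three comparison strings, every other letter collapses
# to the same negative code and can no longer be told apart.
codes = [recode(letter, ["A", "B", "C"]) for letter in "ABCDE"]
```

Here "D" and "E" both become -1, which is why passing alphanumeric values through unaltered is the preferable remedy.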
Another data-naming convention which caused several errors and which could be corrected concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e. allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing is that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of discrete data files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recently imputed values, i.e. preventing the program from starting with cold values on each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing between batches would be reduced to a minimum and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back into an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion in the package prior to its general distribution.
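The intent of the proposed LOAD/UNLOAD ARRAYS command can be sketched as a pair of save and restore steps around each batch. The Python below is an illustration only; the function names, the JSON file format and the file name are assumptions and do not describe any existing CONCOR facility.

```python
import json
import os
import tempfile


def unload_arrays(arrays, path):
    """Save the current (hot) imputation arrays at the end of a volume."""
    with open(path, "w") as handle:
        json.dump(arrays, handle)


def load_arrays(path, cold_deck):
    """Restore hot values if a saved file exists, else start from cold values."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        # First volume of the run: copy the cold deck so it is not modified.
        return {name: [row[:] for row in table] for name, table in cold_deck.items()}


cold = {"A05": [[2, 1, 3, 4], [1, 2, 2, 2]]}
path = os.path.join(tempfile.mkdtemp(), "hotdeck.json")  # hypothetical file

deck = load_arrays(path, cold)     # volume 1 starts with cold values
deck["A05"][0][0] = 7              # an imputation pass updates a hot value
unload_arrays(deck, path)
resumed = load_arrays(path, cold)  # volume 2 resumes with the hot values
```

The second volume begins with the value left by the first, and no manual transfer of arrays between batches is needed.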
FIGURE 2

A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4

AGE OF HUSBAND (columns): 12-17 18-24 25-35 36+
RELATION (rows), with cold deck values as they appear in this copy:
2 1 3 4 HEAD
2 -1 3 CHILD
1 3 -2 -4 OTHER
1 2 2 2 NONRELATIVE

Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:

user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start up values

In the example, A05 is a two dimensional array with 4 rows, 4 columns, a default magnitude of 9 and cold deck values as labeled.

As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme. The following coding generates the error message shown below it:

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23

587 A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
WARNING (DD-207) COMMAND TERMINATOR () NOT FOUND () ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563

To be correct, vectors must currently be coded as follows:

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23

user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start up values

A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required for processing questionnaire files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g. when debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
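The recommendation amounts to keeping two sets of counters side by side: one reset at each control area break, as at present, and one accumulated over the whole run. A minimal sketch in Python follows; the class and counter names are hypothetical and are not CONCOR internal identifiers.

```python
class Counters:
    """Area-level counters that reset on a break, plus run-level totals."""

    def __init__(self):
        self.area = {"records": 0, "questionnaires": 0}
        self.run = {"records": 0, "questionnaires": 0}
        self.current_area = None

    def read_record(self, area_key, starts_questionnaire=False):
        if area_key != self.current_area:
            # Control area break: only the area-level counters are reset.
            self.area = dict.fromkeys(self.area, 0)
            self.current_area = area_key
        for totals in (self.area, self.run):
            totals["records"] += 1
            if starts_questionnaire:
                totals["questionnaires"] += 1


counters = Counters()
for area, first in [("01", True), ("01", False), ("02", True)]:
    counters.read_record(area, first)
```

After the three records above, the area-level record count is 1 (area "02" has just begun) while the run-level count is 3, so a test such as "stop after 500 records" could be written against the run-level value regardless of control area breaks.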
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in Figure 3, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size that most foreign country machines could not process. It is recommended that tests determine a realistic maximum value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file; statistics at the total run level will not be available. This in part relates back to the previous discussion citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3

CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING

LINE NUMBER
 72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
 73 MAX-STORAGE= 999
 74 RECORD-TYPE= (value illegible) NOTE AN LITERAL
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
270 RECORD-TYPE= F1

(System job log for the LOAD step follows; the data set disposition messages, data set names and volume serials are not legible in this copy.)
IEF373I STEP LOAD START 800181001
IEF374I STEP LOAD STOP 800181002 CPU 0MIN 01.65SEC SRB 0MIN 0.25SEC VIRT 1000K SYS 3
STEP START TIME - 100155 STEP END TIME - 100257 STEP CPU TIME - 1.65 SECONDS STEP COMPLETION CODE - C 0
TAPES - 0 DISKS - 0 REGION (REQUESTED - 1000K USED - 1000K) I/O COUNT 211
CARDS READ - 0 CARDS PUNCHED - 0 LINES PRINTED - 22
STEPNAME - LOAD CHARGE (MACHINE UNITS - 1198, dollar amount illegible)
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. Further, tests of the various commands revealed the generally high performance of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message number more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide, as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document, one which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1. Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing a programming language, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate the success or failure of such projects in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
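Items 3C and 3D of the proposal, tolerance checking of error rates by area and the capture of the IDs of areas failing the check, can be sketched as follows. This Python fragment is an illustration under assumptions, not the CONCOR report logic: the function name, the input layout (errors and records per area) and the definition of the error rate as errors divided by records are all hypothetical.

```python
def failing_areas(stats, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    stats     -- mapping of area ID to (error count, record count)
    tolerance -- maximum acceptable error rate, e.g. 0.10 for 10 percent
    """
    failed = []
    for area_id, (errors, records) in stats.items():
        rate = errors / records if records else 0.0  # empty areas pass
        if rate > tolerance:
            failed.append(area_id)
    return failed


rejected = failing_areas(
    {"001": (5, 100), "002": (12, 100), "003": (0, 0)},
    tolerance=0.10,
)
```

The list of rejected area IDs could then be written to a file so that only those areas are returned for review.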
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop

The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:

1. Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be [...]ely debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on completeness, usefulness and efficiency of the system and ease of use by developing country programmers and other census personnel.

cc Su-a-- e Olds APiA SDi OrWJulio Ortuizar luotolaCelta Systems
APPENDIX C
WORKSHOP ITINERARY
APPENDIX C
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25
- Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15
- If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25
- Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
-Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd
Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, by utilizing the DUMPR DUMPO keywords it provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT (ESCAPE) Provides the means to terminate execution of command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in a conditional construction. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in the language.
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as well as the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
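The presorting requirement noted above can be illustrated with a minimal Python sketch. This is not CONCOR code: it assumes a control-break style of reporting in which a new report group starts whenever the area key changes, and the record layout and the names `area_breaks` and `records` are hypothetical.

```python
from itertools import groupby

def area_breaks(records):
    """Return (area, record_count) for each run of contiguous records sharing an area key."""
    return [(area, len(list(group)))
            for area, group in groupby(records, key=lambda r: r[0])]

# Hypothetical records: (area code, questionnaire id)
records = [("01", 1), ("02", 2), ("01", 3)]

# Unsorted input: area 01 is reported twice because its records are not contiguous.
unsorted_groups = area_breaks(records)

# Presorted input: each area is reported once.
sorted_groups = area_breaks(sorted(records, key=lambda r: r[0]))
```

A system that only detects changes in the control field, without merging, would behave like the unsorted case, which is why the sort must happen before the file reaches the editor.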
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of different data item types in the CONTROL
commands This would allow the greatest flexibility in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value to be moved to the output
record when the record is to be written out using the OUTPUT
command
(9) Implement a COMMON-DATA concept to handle items common to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This would require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IMPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential when dealing with data files generated by
FORTRAN programs
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
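As an illustration of item (11), the following Python sketch shows what reading a signed numeric field with leading blanks might involve. It is only a sketch under stated assumptions: the function name `parse_signed_field`, the 1-based column convention, and the choice to return `None` for an all-blank field are hypothetical and are not taken from the CONCOR design.

```python
def parse_signed_field(record, start, length):
    """Parse a right-justified signed integer field.

    start is a 1-based column number; length is the field width.
    Returns None for an all-blank field and raises ValueError
    if the field holds anything other than an optionally signed integer.
    """
    text = record[start - 1:start - 1 + length]
    stripped = text.strip()
    if not stripped:
        return None          # blank field (assumed convention)
    return int(stripped)     # accepts a leading '-' or '+'

# A FORTRAN-style I5 field holding -42 in columns 3-7
value = parse_signed_field("AB  -42XY", 3, 5)
blank = parse_signed_field("AB     XY", 3, 5)
```

An unsigned external numeric format would reject both the leading blanks and the minus sign, which is the limitation the item describes.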
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command that would change
CONCOR's default occurrence pointer for the duration of the SET
command This command would be very helpful as the validation
of the existence of a record would only have to be done once
when the SET command was encountered The major drawback to
this command is in the user's ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed Another program would then be
required to format the information gathered into a readable
form
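The lookup table approach suggested in item (4) can be sketched in Python as follows. The sketch assumes, as the command comparison in Appendix F states, that DRECODE input values start at zero and continue in ascending sequence, so the input value itself can serve as the table index. The function name `drecode`, the `default` argument, and the age-group table are hypothetical illustrations, not the code that GENED actually generates.

```python
def drecode(value, table, default=None):
    """Direct recode: use the input value (0, 1, 2, ...) as an index into table."""
    if isinstance(value, int) and 0 <= value < len(table):
        return table[value]      # one indexed access, no chain of comparisons
    return default               # value outside the table

# Hypothetical table: single years of age 0-9 recoded to five-year groups 0 and 1
AGE_GROUP = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

in_table = drecode(7, AGE_GROUP)
out_of_table = drecode(12, AGE_GROUP, default=-1)
```

The cost of a lookup of this kind does not grow with the number of recode values, whereas a sequence of generated comparisons does.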
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided. This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began. This
extra procedure is required because of program size
considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1)The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected This new input file would contain the master list of those oberservations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
CONCOR SYSTEM INTERNAL VARIABLES
OLD CONCOR                      NEW CONCOR                   COMMENTS
1 EOF-FLAG                      EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE        ERRORS-IN-QUESTIONNAIRE      Change in spelling of variable
3 CURRENT-POINTER-VALUE                                      Not mentioned--possible omission from manual
4 TYPE-COUNT ( )                TYPE-COUNT ( )
5 RECORD-COUNT                  IN-RECORD-COUNT
6 QUESTIONNAIRE-COUNT           IN-QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE              RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT     INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG               INCOMPLETE-FLAG
10 CONTINUATION-FLAG            CONTINUATION-FLAG
11 INVALID-RECORD-FLAG                                       Not implemented
                                OUT-OF-RANGE
                                NOT-NUMBER
                                BLANK
                                (new internal counters)
                                OUT-RECORD-COUNT
                                OUTPUT-QUEST-COUNT
                                WRITE-RECORD-COUNT
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
OUT-OF-RANGE, NOT-NUMBER and BLANK: Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
New internal counters: These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g. IN, OUT.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR-EDITOR EXECUTION STATISTICS
[Run date and column headings of the statistics page are illegible in the source printout]
DIVISION BY ZERO LINE NUMBER 118 (message repeated for each occurrence)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
REGISTERS RESET TO ZERO (line number illegible)

Program text (partly illegible)
111 LIST R003 = R003
112 LIST
114 LIST DIVISION BY ZERO ERROR FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS R001 R003
117 LIST R001 =
118 LET R003 = R002 / R001
119 LIST R003 = R002/R001 = R003
122 END TEST OF OPERANDS AND MSG
125 LET R001 R002 R003 = 0

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
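The behaviour described in the comments above, reporting the source line number of each division by zero and aborting after repeated errors, can be sketched in Python. This is an illustration only and not how the generated EDITOR program works internally: the function `run_checked`, the error limit of 6, and the use of repeated passes to stand in for successive questionnaires are all assumptions.

```python
def run_checked(statements, registers, passes, max_errors=6):
    """Run (line_number, function) pairs repeatedly, logging division by zero."""
    log = []
    errors = 0
    for _ in range(passes):
        for line_number, statement in statements:
            try:
                statement(registers)
            except ZeroDivisionError:
                errors += 1
                log.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
                if errors >= max_errors:   # hypothetical error limit
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    return log

def set_zero(r):
    r["R001"] = 0                          # mirrors line 115 of the test program

def divide(r):
    r["R003"] = r["R002"] // r["R001"]     # mirrors line 118 of the test program

statements = [(115, set_zero), (118, divide)]
log = run_checked(statements, {"R001": 1, "R002": 10, "R003": 0}, passes=10)
```

Tagging each generated statement with its source line number is what allows the message to point back to the user's program rather than to the generated code.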
CONCOR-EDITOR EXECUTION STATISTICS
[Array coordinate error listing; the printout is illegible in the source]

Program text (partly illegible)
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO

Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
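The array coordinate error in this example can be reproduced in outline with a short Python sketch. It assumes 1-based subscripts running from 1 to 10 in both loops, as in the program text, against a 5 X 5 array. The class `CheckedArray` and the wording of the message are hypothetical, since the actual printout is not legible.

```python
class CheckedArray:
    """Two-dimensional array with 1-based, bounds-checked updates."""

    def __init__(self, name, rows, cols):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[0] * cols for _ in range(rows)]

    def update(self, row, col, value, line_number):
        """Store value at (row, col); return a message instead of failing if out of bounds."""
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            return (f"ARRAY COORDINATE ERROR {self.name}({row},{col}) "
                    f"LINE NUMBER {line_number}")
        self.cells[row - 1][col - 1] = value
        return None

ta2 = CheckedArray("TA2", 5, 5)
errors = []
for r001 in range(1, 11):          # UNTIL R001 > 10 VARY R001 FROM 1 BY 1
    for r002 in range(1, 11):      # UNTIL R002 > 10 VARY R002 FROM 1 BY 1
        message = ta2.update(r001, r002, r001 * r002, line_number=181)
        if message:
            errors.append(message)
```

Of the 100 subscript pairs generated by the loops, only the 25 inside the 5 X 5 bounds are stored; the rest are reported with the line number of the UPDATE statement.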
CONCOR-EDITOR EXECUTION STATISTICS
[Column headings of the statistics page are illegible in the source printout]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Program text (partly illegible)
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < 20
218 THEN LIST LESS THAN 20 RECS IN

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
FIGURE 1
DICTIONARY-DIVISION
DICTIONARY-NAME DATA-CODING-EXAMPLE
INPUT-FILE
OUTPUT-FILE
AREA-CONTROL N2-2 N2-4 N3-6 N2-9 N9-23
QUESTIONNAIRE-CONTROL A4-2 A3-6 A2-9 A3-11 A3-14
RECORD-CONTROL A1-1
DEFINE-RECORD
H01-TYPE-OF-HOUSING-UNIT N1-17
H02-MATERIAL-OF-ROOF N1-19 10 9
H03-TOTAL-PERSONS-IN-UNIT N8-40 NOT-NUMERIC BLANK
H04-STATE-OF-UNIT-CODE A4-50 0 U 1 D
DEFINE-RECORD
P01-SEX 1-13 W F
NEW-DATA
N01-SAVE-TYPE-OF-HOUSING-UNIT
N02-SAVE-TYPE-OF-ROOF 1
N03-COUNT-TOTAL-IN-UNITS 10 0
N04-AGGREGATE-INCOME 18 0
END-DIVISION
Explanations
N2-4 This is an example of an external numeric input data item (N) with a length of 2 bytes, starting in column 4 of the input record. The maximum length of this type of variable outside of NEW-DATA is 9. When coded in NEW-DATA, 18 is permitted.
A4-2 This is an example of an external alphanumeric input data item (A) with a length of 4 bytes, starting in column 2 of the input record. This construction for alphanumeric variables is valid only in the control statements. Additionally, it can never be over 4 bytes in length. When alphanumeric data fields are defined within record types, the EDITOR program requires that the comparison strings always be specified. A maximum of 3 is permitted. The purpose of these strings is to force a recode of the data to a numeric value. If no match is found, EDITOR automatically assigns a unique negative value to the field.
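The item notation explained above (type letter, length, starting column) can be illustrated with a short sketch. This is written in Python purely for illustration and is not part of CONCOR; the names `ItemDescriptor`, `parse_descriptor` and `extract_field` are hypothetical, and the length limits are simply those stated in the explanations (9 for numeric items outside NEW-DATA, 4 for alphanumeric control items).

```python
import re
from typing import NamedTuple

# Limits taken from the explanations above; everything else is illustrative.
MAX_NUMERIC_LENGTH = 9        # numeric items outside NEW-DATA
MAX_ALPHA_CONTROL_LENGTH = 4  # alphanumeric items in control statements


class ItemDescriptor(NamedTuple):
    data_type: str  # "N" (numeric) or "A" (alphanumeric)
    length: int     # field length in bytes
    start: int      # 1-based starting column in the input record


_PATTERN = re.compile(r"([NA])(\d+)-(\d+)")


def parse_descriptor(text: str) -> ItemDescriptor:
    """Parse a descriptor such as 'N2-4' into type, length and start column."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid item descriptor: {text!r}")
    item = ItemDescriptor(match.group(1), int(match.group(2)), int(match.group(3)))
    limit = MAX_NUMERIC_LENGTH if item.data_type == "N" else MAX_ALPHA_CONTROL_LENGTH
    if not 1 <= item.length <= limit:
        raise ValueError(f"length {item.length} exceeds the limit of {limit}")
    if item.start < 1:
        raise ValueError("columns are numbered from 1")
    return item


def extract_field(record: str, item: ItemDescriptor) -> str:
    """Return the characters of the record that the descriptor points at."""
    begin = item.start - 1  # convert the 1-based column to a 0-based index
    return record[begin:begin + item.length]
```

Under these assumptions, `parse_descriptor("N2-4")` describes a 2-byte numeric field beginning in column 4, and `extract_field` returns the two characters found in columns 4 and 5 of a record.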
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it would never be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is currently no way to preserve these values if desired.
An unwieldy alternative, which may be acceptable under some circumstances, would be to expand the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers (i.e., A9-23) and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
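The forced recode of alphanumeric fields discussed above can be sketched as follows. This is an illustration in Python, not CONCOR code. The function name `recode_alpha`, the choice of 1, 2, 3 as the numeric codes, and the use of -1 as the no-match value are all assumptions; the report states only that at most three comparison strings are allowed and that an unmatched value receives a unique negative value.

```python
from typing import Sequence

MAX_COMPARISON_STRINGS = 3  # limit stated in the report


def recode_alpha(value: str, comparison_strings: Sequence[str],
                 no_match_code: int = -1) -> int:
    """Recode an alphanumeric field to a number by position of the match.

    The codes 1..3 and the default no-match code of -1 are illustrative
    assumptions, not documented CONCOR behavior.
    """
    if not 1 <= len(comparison_strings) <= MAX_COMPARISON_STRINGS:
        raise ValueError("between 1 and 3 comparison strings are required")
    for position, candidate in enumerate(comparison_strings, start=1):
        if value == candidate:
            return position
    return no_match_code
```

The sketch makes the limitation visible: any field that legitimately takes a fourth letter cannot be matched and falls through to the negative code, so its original value is lost on output.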
Another data-naming convention which caused several errors, and which could be corrected, concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e., to allow a single row vector as well as a single column vector, in order to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing is that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e., to prevent the program from starting with cold values on each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing between batches would be reduced to a minimum and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked
While it is currently possible to save the arrays that are used in the imputation processes on a separate write-file, it is not possible to automatically load those values back into an object program and immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough to warrant its inclusion in the package prior to general distribution.
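The proposed LOAD/UNLOAD ARRAYS behavior can be sketched in a few lines. The following Python is an illustration of the idea only; CONCOR has no such command at the time of writing, and the function names, the JSON file format, and the dictionary layout of the arrays are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Dict, List

Arrays = Dict[str, List[List[int]]]  # array name -> rows of imputation values


def unload_arrays(arrays: Arrays, path: Path) -> None:
    """Save the current hot deck values at the end of a batch run."""
    path.write_text(json.dumps(arrays))


def load_arrays(path: Path, cold_values: Arrays) -> Arrays:
    """Resume with the saved hot values, or fall back to the cold deck.

    The cold deck is copied so that later updates never alter the
    start-up values declared in the dictionary.
    """
    if path.exists():
        return json.loads(path.read_text())
    return {name: [list(row) for row in rows] for name, rows in cold_values.items()}
```

In such a scheme the first volume of a census would start from the cold values, each later volume would call `load_arrays` before editing and `unload_arrays` after it, and no manual transfer of values between batches would be needed.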
FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
AGE OF HUSBAND RELATION
12-17 18-24 25-35 36+
2 1 3 4 HEAD
2 -1 3 CHILD
1 31 -2 -4 OTHER
1 2 2 2 NONRELATIVE
Comments
The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:
user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start up values
In the example, A05 is a two dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the below error message)
587 A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
WARNING (DD-207) COMMAND TERMINATOR () NOT FOUND () ASSUMED PRESENT
ERROR (DD-9lI) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCORs treatment of vectors is not consistent with the above multidimensional array scheme, i.e., A06 must be coded as follows:
(Example of how vectors must be currently coded to be correct)
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start up values
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
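The consistent parameter scheme recommended in Figure 2 can be sketched as a single declaration rule that treats vectors and multidimensional arrays alike. The Python below is illustrative only and does not reproduce CONCOR's parser. The function name `declare_array` and the returned dictionary are hypothetical, and "magnitude" is assumed here to mean the maximum number of digits in an element, which the report does not state explicitly.

```python
from math import prod
from typing import Dict, Optional, Sequence

MAX_DIMENSIONS = 5      # stated in Figure 2
DEFAULT_MAGNITUDE = 9   # default noted for array A05


def declare_array(name: str, sizes: Sequence[int],
                  magnitude: int = DEFAULT_MAGNITUDE,
                  initial: Optional[Sequence[int]] = None) -> Dict[str, object]:
    """Validate an array declaration under one rule for every shape."""
    if not 1 <= len(sizes) <= MAX_DIMENSIONS:
        raise ValueError("an array has between 1 and 5 dimensions")
    if any(size < 1 for size in sizes):
        raise ValueError("every dimension needs at least one element")
    count = prod(sizes)
    values = list(initial) if initial is not None else [0] * count
    if len(values) != count:
        raise ValueError(f"expected {count} start up values, got {len(values)}")
    # Assumption: magnitude is the number of digits an element may hold.
    if any(abs(value) >= 10 ** magnitude for value in values):
        raise ValueError("a start up value exceeds the declared magnitude")
    return {"name": name, "sizes": tuple(sizes),
            "magnitude": magnitude, "values": values}
```

With a rule of this kind the vector A06 could be declared as a plain vector of 4 elements, as a 1 by 4 row, or as a 4 by 1 column, and all three forms would carry the same four start up values.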
and stored as an object module on the system, no other compilations should be required for questionnaire processing files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances, e.g., debugging CONCOR programs or comparing input files, when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
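The distinction drawn above, between pointers that are reset at each CONTROL-AREA break and a second set that would persist for the whole run, can be sketched as follows. This Python is an illustration of the recommendation, not a description of EDITOR internals; the function name `count_records` and the idea of deriving the area from a key function are assumptions.

```python
from typing import Callable, Iterable, List, Tuple


def count_records(records: Iterable[str],
                  area_of: Callable[[str], str]) -> Tuple[List[Tuple[str, int]], int]:
    """Keep an area-level counter (reset on each break) and a run-level counter."""
    area_totals: List[Tuple[str, int]] = []
    current_area = None
    area_count = 0  # reset at every control area break
    run_count = 0   # never reset; the additional pointer recommended above
    for record in records:
        area = area_of(record)
        if area != current_area:
            if current_area is not None:
                area_totals.append((current_area, area_count))
            current_area = area
            area_count = 0
        area_count += 1
        run_count += 1
    if current_area is not None:
        area_totals.append((current_area, area_count))
    return area_totals, run_count
```

A run-level counter of this kind is what would allow a test such as "stop after the first 500 records" to work even when a CONTROL-AREA has been defined. The sketch relies on the same contiguity requirement noted earlier: records of one area must be adjacent on the input file.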
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in Figure 3, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR generated programs can be influenced significantly by the amount or nature of programmer specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file, and statistics at the total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3
CONCOR (CONSISTENCY AND CORRECTION) EDIT AND IMPUTATION SYSTEM
USER DICTIONARY DIVISION - SOURCE LISTING
LINE NUMBER
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
74 RECORD-TYPE (NOTE AN LITERAL)
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
270 RECORD-TYPE= F1
(Job step accounting for the generated program; data set messages omitted)
IEF373I STEP LOAD START 800181001
IEF374I STEP LOAD STOP 800181002 CPU 0MIN 0165SEC SRB 0MIN 025SEC VIRT 1000K
STEP START TIME - 100155 STEP END TIME - 100257 STEP CPU TIME - 165 SECONDS STEP COMPLETION CODE - C 0
TAPES - 0 DISKS - 0 REGION (REQUESTED - 1000K USED - 1000K) IO COUNT 211
CARDS READ - 0 CARDS PUNCHED - 0 LINES PRINTED - 22
STEPNAME - LOAD
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. Further, tests of the various commands revealed the generally high performance character of the language in its newest version. Under these circumstances, the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility, and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: one which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installations systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix for reference.
3 The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4 An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5 The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all command options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need, CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then to exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent, and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate the success or failure of such projects in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BuCen Enhancement Proposal
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
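Items 3C and 3D of the proposal (tolerance checking of error rates by area, and capture of the IDs of areas failing the check) can be illustrated with a short sketch. The Python below is an assumption-laden example and not the ISPC design: the function name `areas_failing_tolerance`, the input layout, and the definition of the error rate as errors divided by records processed are all hypothetical.

```python
from typing import Dict, List, Tuple


def areas_failing_tolerance(area_stats: Dict[str, Tuple[int, int]],
                            tolerance: float) -> List[str]:
    """Return the IDs of areas whose error rate exceeds the tolerance.

    area_stats maps an area ID to (errors found, records processed).
    The rate definition is an assumption made for this illustration.
    """
    failing = []
    for area_id, (errors, records) in area_stats.items():
        rate = errors / records if records else 0.0
        if rate > tolerance:
            failing.append(area_id)
    return failing
```

The list returned by such a check is what a reject file of area IDs would contain, allowing failing areas to be sent back for manual review while the rest of the run proceeds.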
APPENDIX B
EVALUATIVE CRITERIA
APPENDIX B
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [illegible]ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness and efficiency of [illegible] use by developing country programmers and other census personnel.
cc [illegible] Olds APHA; [illegible]; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
APPENDIX C
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
Overview of Workshop
1000 - 1030 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
200 - 230 User Program Organization
- Divisions
- Sections
- Routines
- Commands
230 - 245 Break
245 - 325 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
930 am - 1030 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
1200 - 115 pm Break
115 pm - 215 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
930 am - 1030 Discussion of Participants Dictionary problems
Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
1200 - 115 pm Break
115 pm - 215 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
215 - 230 Break
230 - 325 - PassFail clauses
- List
Thursday January 10
930 am - 1030 Discussion of Problems
Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
1200 - 115 pm Break
115 pm - 215 - If
- UntilExit
- Stop
215 - 230 Break
230 - 325 - Drecode
- Grecode
Friday January 11
930 am - 1030 Edit Specification Command Statements (continued)
- Output
- Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
-Report Examples
1200 - 115 pm Break
115 pm - 325 Hand out Edit Specifications as problem
Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am - 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am-1030 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR
- Submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
1200-115 pm Break
115-325 Free work time
APPENDIX D
PARTICIPANTS
APPENDIX D
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organizations computer andor in training users on the packages applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software andor its documentation
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
APPENDIX F
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is only a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items which have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is the PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP- keyword STOP- keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
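The way UPDATE maintains the imputation arrays and ALLOCATE draws on them can be illustrated outside CONCOR. The Python sketch below is not CONCOR code: the cell layout, field names, valid range, and cold-deck values are all hypothetical, chosen only to show the hot-deck idea the two commands support.

```python
# Hypothetical hot deck: age imputed by (sex, relation) cell.
# The cold-deck starting values below are invented for illustration.
hot_deck = {("F", "HEAD"): 40, ("F", "CHILD"): 10,
            ("M", "HEAD"): 42, ("M", "CHILD"): 11}

def edit_age(record, deck):
    """Range-check age; return (record, imputed_flag)."""
    cell = (record["sex"], record["relation"])
    age = record.get("age")
    if age is not None and 0 <= age <= 98:
        deck[cell] = age                   # analogous to UPDATE: keep last valid value
        return record, False
    fixed = dict(record, age=deck[cell])   # analogous to ALLOCATE: assign stored value
    return fixed, True

records = [
    {"sex": "F", "relation": "HEAD", "age": 35},
    {"sex": "F", "relation": "HEAD", "age": None},
    {"sex": "M", "relation": "CHILD", "age": 150},
]
results = [edit_age(r, hot_deck) for r in records]
imputed_ages = [rec["age"] for rec, flag in results if flag]
```

The second record receives the age of the valid record that preceded it in the same cell, while the third falls back on the cold-deck value because no valid donor has yet been seen.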
DECEMBER 1978 VERSION 1: Not implemented
DECEMBER 1979 VERSION 2:
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS:
This new division (note: division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
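The presorting requirement follows from summarizing at each control-area break. The Python sketch below is an illustration only, with hypothetical (area, value) records; it shows that a break-driven summary splits an area into two entries when its records are not contiguous, and that sorting on the control field beforehand avoids this.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (area code, value) records; area "A" is not contiguous.
records = [("A", 1), ("B", 2), ("A", 3)]

def area_breaks(recs):
    """Summarize each run of contiguous records sharing an area code."""
    return [(area, sum(value for _, value in group))
            for area, group in groupby(recs, key=itemgetter(0))]

unsorted_summary = area_breaks(records)
sorted_summary = area_breaks(sorted(records, key=itemgetter(0)))
```

Without the sort, area "A" is reported twice with partial totals; with it, each area appears once.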
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
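The automatic range check proposed in item (13) of the list above can be pictured with a small Python sketch. It is an assumption-laden illustration, not CONCOR behavior: the dictionary layout, identifier names, and ranges are hypothetical.

```python
# Hypothetical dictionary entries: identifier -> (low, high) valid range,
# standing in for ranges declared alongside each user-identifier.
dictionary = {"AGE": (0, 98), "SEX": (1, 2)}

def range_failures(questionnaire, ranges):
    """Return the identifiers whose values fall outside their declared range."""
    return sorted(name for name, (low, high) in ranges.items()
                  if not low <= questionnaire[name] <= high)

failures = range_failures({"AGE": 130, "SEX": 2}, dictionary)
```

Because the ranges live with the declarations, the same table could in principle serve both the editor and tabulation software, which is the benefit the item points to.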
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
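Item (2) above, like the DRECODE proposal in the Execution Division list, turns on replacing a linear search with a lookup table. The Python sketch below illustrates the difference in general terms; the recode pairs are hypothetical, and nothing here reflects how GENSRC or the generated EDITOR code is actually written.

```python
# Hypothetical recode pairs (input code -> output code).
pairs = [(0, 10), (1, 11), (2, 12), (3, 13)]

def recode_linear(value, table):
    """Scan the pairs in order; cost grows with the length of the table."""
    for source, target in table:
        if source == value:
            return target
    return None

# Built once; each later lookup is a single keyed access.
lookup = dict(pairs)

def recode_lookup(value, table):
    """Return the recoded value, or None when the code is not in the table."""
    return table.get(value)
```

Both functions return the same answers; the gain from the table is that the cost of each lookup no longer depends on how many codes are defined.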
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
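The weighting scheme mentioned in item (5) of the list above amounts to multiplying each sampled value by the weight of its stratum before totals are formed. A minimal Python sketch follows; the strata, weights, and values are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical sample records: (stratum, household size), with stratum weights
# standing in for the inverse sampling fractions of the sampling frame.
sample = [("URBAN", 4), ("URBAN", 6), ("RURAL", 5)]
weights = {"URBAN": 10.0, "RURAL": 25.0}

def weighted_total(records, stratum_weights):
    """Sum each value multiplied by the weight of its stratum."""
    return sum(stratum_weights[stratum] * value for stratum, value in records)

unweighted = sum(value for _, value in sample)
total = weighted_total(sample, weights)
```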
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR
These apparently new counters permit access to valuable totals within control areas
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g., IN-, OUT-
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics page, division-by-zero test. Legible printout lines:]
DIVISION BY ZERO LINE NUMBER 118 (message printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
[Legible lines of the superimposed program text:]
114 LIST DIVISION BY ZERO ERROR NOW FOLLOWS
115 LET R001 R003 = 0
118 LET R003 = R002 / R001
122 END TEST OF OPERANDS AND MSG
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
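The behavior tested here, reporting the offending source line for each division by zero and aborting once an error limit is reached, can be sketched in Python. This is an illustration under stated assumptions, not the EDITOR implementation: the statement format, register names, and the error limit of six are hypothetical.

```python
def run(statements, registers, error_limit=6):
    """Execute (line, target, numerator, denominator) integer divisions.

    Each division by zero is logged with its source line number; the run
    stops once error_limit such errors have been recorded.
    """
    messages = []
    for line, target, num, den in statements:
        try:
            registers[target] = registers[num] // registers[den]
        except ZeroDivisionError:
            messages.append(f"DIVISION BY ZERO LINE NUMBER {line}")
            if len(messages) >= error_limit:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                break
    return messages

# Hypothetical registers: R001 is zero, so every division at line 118 fails.
regs = {"R001": 0, "R002": 7, "R003": 0}
log = run([(118, "R003", "R002", "R001")] * 10, regs, error_limit=6)
```

The point of the design is that the program keeps enough context (the line number) to make each failure traceable, and that termination is a deliberate, reported decision rather than an unexplained ABEND.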
[Execution Statistics page, array subscript test. The printout messages are not legible in this copy. Legible lines of the superimposed program text:]
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
186 END-DO
187 END-DO
Comments: In this example R001 and R002 index a two-dimensional array with 10 rows and 10 columns (TA1) in an attempt to update the smaller (5 X 5) TA2. References to nonexistent elements caused the error. The program text has been superimposed on this page for illustration purposes.
[Execution Statistics page, user-specified termination test. Legible printout lines:]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
[Legible lines of the superimposed program text:]
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
alphanumeric coding be utilized in the QUESTIONNAIRE-CONTROL and RECORD-CONTROL statements, where each input data item must be of the same data type, as shown in the example. When alphanumeric data variables are used in these control statements, their construction is identical to that of numeric items. However, when used elsewhere in the DATA-DIVISION, alphanumeric variables are required to specify one of three possible comparison values, as shown. There are a number of production instances when it never would be necessary or even desirable to recode alphanumeric data. However, as CONCOR attempts to force data into a totally numeric format upon output, there is no current way to preserve these values if desired.
An unwieldy alternative to this situation, which may be acceptable under some circumstances, would be the expansion of the number of comparison strings from three to a more realistic number. The limitation of this compromise is that a full twenty-six comparison identifiers would be required in order to accommodate data which utilized the entire alphabet. A better solution, however, would be to make the general format of the alphanumeric variables identical to that of numeric identifiers, i.e., A9-23, and to permit alphanumeric values so defined to pass unaltered through the CONCOR system.
Another data-naming convention which caused several errors, and which could be corrected, concerns the array data definitional statements. While arrays of two and more dimensions are handled in a superior manner by the CONCOR program, single-dimension arrays pose a problem in coding, as shown in Figure 2. It is suggested that the command imperatives be changed to permit the coding of both rows and columns in single-dimension arrays, i.e., allow a single row vector as well as a single column vector, to maintain the consistency of the array data definitional statements.
A major requirement of COBOL CONCOR file processing concerns the fact that all related data records must be physically contiguous on the input file. The implication of this requirement is that files may require preprocessing prior to actual data editing. (This preprocessing is usually a sort routine upon a selected CONTROL-AREA key.) While this type of processing merely introduces a new step in file processing, a major limitation becomes apparent when a large number of DISCRETE DATA files of the same census or survey questionnaire are to be processed. This limitation is the introduction of manual steps to save the most recent imputed values, i.e., preventing the program from starting with cold values each batch run. If a command such as LOAD/UNLOAD ARRAYS were incorporated into the language (an enhancement not believed to be difficult to implement), manual processing would be reduced to a minimum between batches and the maximum benefits of the hot deck methodology would be realized. It is envisioned that such a command would automatically ensure the transfer of the appropriately designated hot values. Automatic processing of this nature, if done correctly, can greatly reduce the time required to clean multivolume files, for once CONCOR language statements have been compiled, linked,
While it is possible at this time to save the arrays that are used in the imputation processes on a separate write-file, it is not now possible to automatically load those values back to an object program and to immediately resume processing on another volume. It is believed that such an automatic feature of the language would cut down the manual processing time significantly enough that it warrants inclusion into the package prior to its general distribution.
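The suggested LOAD/UNLOAD ARRAYS facility can be pictured as saving the imputation arrays at the end of one volume and restoring them at the start of the next. The Python sketch below is a hypothetical illustration only: the function names, the JSON file format, and the array TA1 with its values are assumptions, not features of CONCOR.

```python
import copy
import json
import os
import tempfile

def unload_arrays(arrays, path):
    """Save the current (hot) array values at the end of a batch."""
    with open(path, "w") as handle:
        json.dump(arrays, handle)

def load_arrays(path, cold_values):
    """Restore saved hot values, or fall back on cold values for a first run."""
    if not os.path.exists(path):
        return copy.deepcopy(cold_values)
    with open(path) as handle:
        return json.load(handle)

cold = {"TA1": [[20, 25], [30, 35]]}          # hypothetical cold-deck array
path = os.path.join(tempfile.mkdtemp(), "hot_arrays.json")

batch1 = load_arrays(path, cold)              # first volume starts cold
batch1["TA1"][0][0] = 22                      # value updated during editing
unload_arrays(batch1, path)

batch2 = load_arrays(path, cold)              # next volume resumes hot
```

With such a step run automatically between volumes, the second batch begins with the values left by the first instead of the cold deck, and no manual handling is needed.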
FIGURE 2
[ARRAY-DATA coding example. Legible content follows; the cold deck values of A05 are only partly legible in this copy and are not reproduced.]
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
Columns (AGE OF HUSBAND): 12-17, 18-24, 25-35, 36+
Rows (RELATION): HEAD, CHILD, OTHER, NONRELATIVE
Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows: user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start-up values. In the example, A05 is a two-dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the error messages below.)
WARNING (DD-207) COMMAND TERMINATOR NOT FOUND, ASSUMED PRESENT
ERROR (DD-9lI) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563
As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, i.e., A06 must be coded as follows (example of how vectors must currently be coded to be correct):
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23
Parameters: user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start-up values.
A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
and stored as an object module on the system, no other compilations should be required for questionnaire processing files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers, a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g., when debugging CONCOR programs or comparing input files. Therefore, another set of pointers should be implemented for this purpose and made available for programmer reference.
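The distinction between counters that reset at each control-area break and counters that accumulate over the whole run can be shown with a short Python sketch. It is illustrative only; the area codes are hypothetical and the function does not correspond to any CONCOR identifier.

```python
def count_records(area_codes):
    """Return per-area record counts plus a cumulative run-level count."""
    per_area = []
    area_count = 0
    run_count = 0
    current = None
    for code in area_codes:
        if current is not None and code != current:
            per_area.append((current, area_count))
            area_count = 0          # reset on control-area break
        current = code
        area_count += 1
        run_count += 1              # never reset during the run
    if current is not None:
        per_area.append((current, area_count))
    return per_area, run_count

per_area, run_total = count_records(["01", "01", "02", "02", "02"])
```

A test such as "stop after the first 500 records of the run" needs the second kind of counter, which is the gap the paragraph above identifies.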
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameters of the DEFINE-RECORD statement. As shown in the figure on the following page, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics by total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3
[CONCOR (Consistency and Correction) Edit and Imputation System, User Dictionary Division source listing. Legible lines:]
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
[Job log for step LOAD. Legible entries:]
VIRT 1000K
REGION (REQUESTED - 1000K USED - 1000K) IO COUNT 211
STEP COMPLETION CODE - 0
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the general high-performance character of the language in its newest version. Under these circumstances, the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility, and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except the addition of message number tabs to enable a programmer to more quickly locate the text of a message number. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide, as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1. Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
Bucen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
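Items 3C and 3D above describe checking each area's error rate against a user-specified tolerance and capturing the IDs of the areas that fail. The Python sketch below only illustrates that idea; it is not CONCOR code. The function name, the record layout (area ID paired with an error flag) and the definition of error rate as errors divided by records in the area are all assumptions made for illustration.

```python
from collections import defaultdict

def areas_failing_tolerance(records, tolerance):
    """Return the sorted IDs of areas whose error rate exceeds `tolerance`.

    `records` is an iterable of (area_id, has_error) pairs. This layout is
    hypothetical; the error rate is taken as errors / records in the area.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for area_id, has_error in records:
        totals[area_id] += 1
        if has_error:
            errors[area_id] += 1
    return sorted(a for a in totals if errors[a] / totals[a] > tolerance)

# Hypothetical sample: area A01 has 1 error in 4 records, B02 has 2 in 4.
sample = [("A01", False), ("A01", True), ("A01", False), ("A01", False),
          ("B02", True), ("B02", True), ("B02", False), ("B02", False)]
failing = areas_failing_tolerance(sample, tolerance=0.30)
```

With a tolerance of 0.30 only area B02 is captured; lowering the tolerance to 0.20 would capture both areas.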
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [illegible] debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness and efficiency of the system, and on ease of use by developing country programmers and other census personnel.
cc Su[illegible] Olds APHA; [illegible]; Julio Ortuzar Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
  Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
  - Purpose and function
  - History of development
  - General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
  - Ways to interrogate data
  - Ways to correct data
  - Editing housing and population data - POPSTAN
  - Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
  - Constraints in design of CONCOR
  - Basic subsystems of CONCOR
  - User interactions with system
  - Examples of outputs produced
2:00 - 2:30 User Program Organization
  - Divisions
  - Sections
  - Routines
  - Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
  - Types of statements
  - Format
  - Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
  - Punctuation
  - Input data referencing
  - Working data declarations
  - Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
  - Dictionary-Name
  File-Section
  - Input-File
  - Output-File
  - Write-File
  - Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
  - Define-Record
  Working-Data-Section
  - New-Data
  - Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
  - Minimum dictionary structure
  - Maximum dictionary structure
  - Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants Dictionary problems
  Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
  - Punctuation
  - Subscripting
  - Internal Identifiers
  - Report-Control-Section
    - Generate-Edit-Statistics
    - Count-Imputes
    - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
  - Routines of Edit-Specifications-Section
    - Prolog-Routine
    - Filter-Routine
    - Epilog-Routine
  - Types and functions of edit specification commands
  - Range
  - Assert
2:15 - 2:30 Break
2:30 - 3:25
  - Pass/Fail clauses
  - List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
  Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
  - Allocate
  - Update
  - Let
12:00 - 1:15 pm Break
1:15 pm - 2:15
  - If
  - Until/Exit
  - Stop
2:15 - 2:30 Break
2:30 - 3:25
  - Drecode
  - Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
  - Output
  - Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
  - Display-Control-Section
    - Display-Edit-Statistics
  - Tolerance-Control-Section
    - Error-Rate-Check
    - Reject-File
  - Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
  Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
  - for census processing
  - for survey processing
  - manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
  - Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
  - submission of written evaluations
  Distribution of IBM installation tapes
  Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer and/or in training users on the packages applicability to their work
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-IDand record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978), VERSION 2 (DECEMBER 1979), COMMENTS

Version 1: ALLOCATE (ALL)
Version 2: ALLOCATE (ALLOC)
Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

Version 1: ASSERT (AST)
Version 2: ASSERT (ASRT)
Comments: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, it provides the user with the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

Version 1: (see RECODE)
Version 2: DRECODE (DRCD)
Comments: Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous RECODE command.

Version 1: (none)
Version 2: EXIT
Comments: Provides the means to terminate execution (ESCAPE) of command statements within the UNTIL command.

Version 1: END
Version 2: END-IF, END-THEN, END-ELSE
Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

Version 1: ERROR
Version 2: not implemented
Comments: No longer supported in the language.

(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
VERSION 1 (DECEMBER 1978), VERSION 2 (DECEMBER 1979), COMMENTS

Version 1: FILTER TYPE ( ) WITH
Version 2: not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO depending on construction.

Version 1: (see XRECODE)
Version 2: GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1: IF
Version 2: IF
Comments: The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1: LET
Version 2: LET
Comments: Virtually unchanged.

Version 1: (none)
Version 2: LIST
Comments: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.

Version 1: (none)
Version 2: OUTPUT
Comments: New command which provides the means by which records will be written to the output-file.

(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
VERSION 1 (DECEMBER 1978), VERSION 2 (DECEMBER 1979), COMMENTS

Version 1: RANGE (RNG)
Version 2: RANGE (RNG)
Comments: As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

Version 1: RECODE (REC)
Version 2: name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1: STOP --keyword
Version 2: STOP --keyword
Comments: Unchanged.

Version 1: (none)
Version 2: UNTIL DO END-DO
Comments: One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.

Version 1: UPDATE (UPD)
Version 2: UPDATE (UPD)
Comments: Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.

(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
VERSION 1 (DECEMBER 1978), VERSION 2 (DECEMBER 1979), COMMENTS

Version 1: WRITE (WRT)
Version 2: WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

Version 1: XRECODE (XREC)
Version 2: name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.
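The comments for DRECODE and GRECODE describe two styles of recoding: a direct recode for codes that start at zero and ascend, and a group recode that maps groups of input values to a single output. The Python sketch below illustrates that distinction only. The function names, the table layouts and the treatment of unmatched values are hypothetical and do not reproduce CONCOR syntax or behavior.

```python
def direct_recode(value, table):
    """Direct recode: the input code (0, 1, 2, ...) indexes the table."""
    if 0 <= value < len(table):
        return table[value]
    return None  # assumption: values outside the table are left unrecoded

def group_recode(value, groups):
    """Group recode: `groups` is a list of (low, high, new_code) ranges."""
    for low, high, new_code in groups:
        if low <= value <= high:
            return new_code
    return None  # assumption: values outside every group are left unrecoded

# Hypothetical tables, for illustration only.
marital_table = ["N", "M", "W", "D"]                  # input codes 0..3
age_groups = [(0, 14, 1), (15, 64, 2), (65, 120, 3)]  # age ranges to groups
```

For example, `direct_recode(2, marital_table)` yields `"W"` and `group_recode(15, age_groups)` yields `2`. The direct form needs no search because the input code is itself the position in the table.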
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
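The requirement that the input file be presorted on the AREA-CONTROL field can be illustrated with a control-break sketch. The Python below is an illustration under stated assumptions, not CONCOR's implementation; the record layout (area paired with an error count) and the function name are hypothetical. Because a control break groups only adjacent records, an unsorted file produces a separate report line each time an area reappears.

```python
from itertools import groupby

def area_break_totals(records):
    """Sum error counts per run of adjacent records sharing the same area."""
    return [(area, sum(count for _, count in run))
            for area, run in groupby(records, key=lambda rec: rec[0])]

# Hypothetical (area, error_count) records; area "01" is not contiguous.
unsorted_file = [("01", 2), ("02", 1), ("01", 3)]

split_report = area_break_totals(unsorted_file)
merged_report = area_break_totals(sorted(unsorted_file, key=lambda rec: rec[0]))
```

Here `split_report` contains two separate lines for area "01", while `merged_report`, computed after an external sort, contains one line per area.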
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
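Item (14) above proposes accepting both the space and the comma as delimiters while keeping two consecutive commas as the way to request a system default. A minimal Python sketch of one way such a rule could be tokenized follows. It is an illustration only: the function name, the `DEFAULT` marker, the sample parameter strings and the choice to absorb spaces around a comma are assumptions, not the CONCOR parser.

```python
import re

DEFAULT = None  # hypothetical marker meaning "use the system default"

def split_parameters(text):
    """Split a parameter list on commas or on runs of spaces.

    Spaces around a comma are absorbed into the delimiter, so an empty
    field can only arise between two commas; it is returned as DEFAULT.
    """
    fields = re.split(r"\s*,\s*|\s+", text.strip())
    return [field if field else DEFAULT for field in fields]

space_form = split_parameters("AGE 2 N")
comma_form = split_parameters("AGE,2,N")
default_form = split_parameters("AGE,,N")
```

Under these assumptions the space and comma forms give the same three fields, and the doubled comma yields `DEFAULT` in the second position.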
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
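Item (9) above distinguishes allocation frequency distributions for discrete values from those for continuous values, and notes that a separate step would format the tallies for reading. The Python sketch below illustrates those three pieces. It is not part of CONCOR: the function names are hypothetical, and fixed-width intervals are just one assumed way of handling continuous values, which the source leaves undesigned.

```python
from collections import Counter

def tally_discrete(imputed_values):
    """Count how often each discrete value was allocated."""
    return Counter(imputed_values)

def tally_continuous(imputed_values, width):
    """Count allocations per interval of `width`, keyed by interval lower bound.

    Fixed-width intervals are an assumption made for this illustration.
    """
    return Counter((value // width) * width for value in imputed_values)

def format_distribution(counts):
    """Render the tallies as readable 'value: count' lines in key order."""
    return "\n".join(f"{key}: {counts[key]}" for key in sorted(counts))

# Hypothetical allocated values.
sex_codes = tally_discrete([1, 2, 2, 1, 2])
age_bands = tally_continuous([3, 7, 12, 18, 19], width=10)
listing = format_distribution(sex_codes)
```

The discrete tally needs no design decision beyond counting, whereas the continuous tally depends on the interval width chosen, which is the kind of mechanism the item says would have to be designed.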
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also to examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive but rather only the development of the user code should be put online.
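The change proposed in point (2) can be sketched in Python as follows. The internals of GENSRC are not described in this report, so the table contents and the numeric codes below are hypothetical; the sketch only contrasts a linear scan of a keyword table with a keyed lookup of the same table.

```python
# Hypothetical keyword table; the numeric codes are illustrative only.
KEYWORDS = [("ALLOCATE", 1), ("ASSERT", 2), ("RANGE", 3), ("UPDATE", 4), ("WRITE", 5)]


def linear_search(table, key):
    """Scan the table entry by entry; cost grows with the table length."""
    for name, code in table:
        if name == key:
            return code
    return None


# The same table held as a hash-based lookup table.
LOOKUP = dict(KEYWORDS)


def table_lookup(key):
    """Fetch the code directly by key, without scanning the table."""
    return LOOKUP.get(key)
```

Both functions return the same answers; only the cost of finding an entry differs.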
12 Documentation
(1)The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
21 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of the questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program but it has not been finished nor tested.
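The weighting scheme of point (5) can be illustrated with a short Python sketch. The record layout and the field name "weight" are assumptions, not part of CONCOR; each sample record is taken to carry a weight derived from the sampling frame.

```python
def weighted_total(records, field, weight_field="weight"):
    """Sum a field over sample records, scaling each by its sampling weight."""
    return sum(record[field] * record[weight_field] for record in records)


def weighted_mean(records, field, weight_field="weight"):
    """Weighted mean of a field; fails if the weights sum to zero."""
    total_weight = sum(record[weight_field] for record in records)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return weighted_total(records, field, weight_field) / total_weight
```

For two sampled households with 4 and 2 persons and weights 10 and 30, the weighted total is 4 x 10 + 2 x 30 = 100 persons, and the weighted mean is 100 / 40 = 2.5 persons per household.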
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
OLD CONCOR VERSION 1
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

NEW CONCOR VERSION 2
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters)
OUT-RECORD-COUNT
OUTPUT-QUEST-COUNT
WRITE-RECORD-COUNT

Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also OUTPUT-QUEST-COUNT violates the naming convention, eg IN, OUT.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[CONCOR-EDITOR Execution Statistics printout with superimposed program text: division by zero test. Legible diagnostics follow.]
DIVISION BY ZERO  LINE NUMBER 118 (message repeated)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.

[CONCOR-EDITOR Execution Statistics printout with superimposed program text: array coordinate error test.]

Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.

[CONCOR-EDITOR Execution Statistics printout with superimposed program text: user specified termination test. Legible diagnostics follow.]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
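The behavior tested in this appendix, reporting the source line number of a data exception instead of ending silently, can be sketched in Python. This is a minimal illustration and not CONCOR's implementation: the abort threshold, the wording of the array message, and the representation of a program as numbered callables are all assumptions.

```python
def run_with_diagnostics(program, max_errors=6):
    """Run (line_number, statement) pairs and report where exceptions occur.

    Each statement is a zero-argument callable. The max_errors threshold
    is a hypothetical setting, not a documented CONCOR parameter.
    """
    messages = []
    errors = 0
    for line_number, statement in program:
        try:
            statement()
        except ZeroDivisionError:
            messages.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
            errors += 1
        except IndexError:
            # Hypothetical wording for an array coordinate error.
            messages.append(f"ARRAY REFERENCE OUT OF BOUNDS  LINE NUMBER {line_number}")
            errors += 1
        if errors >= max_errors:
            messages.append("RUN ABORTED DUE TO EXCESSIVE ERRORS")
            break
    return messages
```

The point of the sketch is only that every exception is tied back to a line number the programmer can look up in the source listing.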
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
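The checks behind messages DD-001 and DD-002 can be sketched in Python as follows. This is an illustration only, not the parser's code. The legal set below lists just the characters that are legible in the guide text above (A-Z, 0-9, -, +, =); the remaining legal characters named in the guide are not reproduced here and are deliberately left out rather than guessed.

```python
import string

# Only the characters legible in the guide text; the actual CONCOR set
# contains further characters that are not listed here.
LEGAL_CHARACTERS = set(string.ascii_uppercase + string.digits + "-+=")


def check_line(line):
    """Return the DD-001 warning for a blank command statement line."""
    if line.strip() == "":
        return "WARNING (DD-001) BLANK COMMAND STATEMENT LINE"
    return None


def check_first_character(token):
    """Return the DD-002 error if the first character is not in the set."""
    if token and token[0] not in LEGAL_CHARACTERS:
        return "ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING"
    return None
```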
FIGURE 2
A05-DIFF-BETWEEN-AGE-OF-FEMALE-BY-RELATION 2 4 4
[4 x 4 table of cold deck values: columns AGE OF HUSBAND 12-17, 18-24, 25-35, 36+; rows RELATION HEAD, CHILD, OTHER, NONRELATIVE]

Comments: The ARRAY-DATA command statement provides the means to declare array identifiers with up to five dimensions. Current documentation is not as explicit about the rules of this command as is desirable. The parameters of the command should function as follows:

user-identifier, number of dimensions (D), number of rows (R), number of columns (C), magnitude of element (M), initial start up values

In the example A05 is a two dimensional array with 4 rows, 4 columns, a default magnitude of 9, and cold deck values as labeled.

A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 1 4
16 18 21 23
(This coding generates the below error message)

WARNING (DD-207) COMMAND TERMINATOR () NOT FOUND () ASSUMED PRESENT
ERROR (DD-911) DIMENSION OF USER-SPECIFIED ARRAY IS LESS THAN THE MINIMUM VALUE PERMITTED (2)
PREVIOUS DIAGNOSTIC AT LINE 563

As shown by the array variable A06, CONCOR's treatment of vectors is not consistent with the above multidimensional array scheme, ie A06 must be coded as follows:

(Example of how vectors must be currently coded to be correct)
A06-DIFF-BETWEEN-AGE-OF-PERSON-AND-MOTHER 1 4 2
16 18 21 23

user-identifier, 1 dimension, number of elements in vector, magnitude of element, initial start up values

A simple modification to this command would permit the coding of both row and column vectors and make this command less error prone.
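The uniform treatment suggested here, where a vector is declared with the same dimensions, rows, columns order as a two dimensional array, can be sketched in Python. This illustrates the proposed modification under stated assumptions and is not CONCOR's current behavior: the function name is hypothetical, the magnitude parameter is left out, and only one and two dimensional declarations are handled.

```python
def interpret_array(dimensions, rows, columns, values):
    """Lay out initial values as a rows x columns grid.

    A one dimensional declaration is accepted as either a row vector
    (rows == 1) or a column vector (columns == 1).
    """
    if dimensions not in (1, 2):
        raise ValueError("this sketch handles 1 or 2 dimensions only")
    if dimensions == 1 and 1 not in (rows, columns):
        raise ValueError("a vector needs exactly one row or one column")
    if len(values) != rows * columns:
        raise ValueError("number of initial values does not match the shape")
    return [values[r * columns:(r + 1) * columns] for r in range(rows)]
```

Under this scheme the coding 1 1 4 that was rejected above would describe a row vector of four elements, and 1 4 1 a column vector.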
and stored as an object module on the system, no other compilations should be required for questionnaire processing files of the same type. Theoretically a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when the termination in the processing mode would be advantageous based on run counts although a CONTROL-AREA has been specified, eg debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
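The recommendation for a second set of pointers can be illustrated with a minimal Python sketch. The class and attribute names are hypothetical; the sketch only shows one counter that is reset at each control area break alongside a second counter that accumulates over the whole run.

```python
class RecordCounters:
    """Keep an area-level count and a run-level count side by side."""

    def __init__(self):
        self.current_area = None
        self.area_count = 0  # reset on every control area break
        self.run_count = 0   # accumulated over the whole run

    def read(self, area):
        """Register one input record belonging to the given control area."""
        if area != self.current_area:
            self.current_area = area
            self.area_count = 0
        self.area_count += 1
        self.run_count += 1
```

A program could then stop after a fixed number of records in the run, for example while debugging, even though a control area has been defined.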
One clearly disturbing development which needs to be pursued during in-depth testing of the system concerns the MAX-STORAGE parameters of the DEFINE-RECORD statement. As shown in the figure on the following page, when MAX-STORAGE was set equal to the maximum value a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR generated programs can be influenced significantly by the amount or nature of programmer specified I/O operations. In fact it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic max-value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY then all the statistics are summarized at the total run level. If a control area field is defined then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics by total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed the whole name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3
[CONCOR (Consistency and Correction) Edit and Imputation System printout: User Dictionary Division source listing, followed by job step statistics. Legible entries follow.]
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
STEP LOAD: VIRT 1000K
REGION (REQUESTED - 1000K USED - 1000K)
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However tests of the various commands revealed the general high performance character of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate and could bear no changes except to provide message number tabs to enable a programmer to more quickly locate the text of a message number. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that with some reorganization and editing it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can be best described as a working document, a document which was not intended to teach the CONCOR language but centered upon an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically the contents of pages 90-92 should be moved to page 63--these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However some guidelines stand out:
1 Installation instruction and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer and any modifications required during installation could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used and if agencies are going to actively support it as a package, installation limitations must be known and well-documented. The prescribed format for this documentation should be set forth and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix for reference.
3 The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files beyond the POPSTAN materials.
4 An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5 The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all command options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the herein outlined highly desirable changes to the CONCOR language and documentation and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of the nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
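Items 3C and 3D of the proposal, tolerance checking of error rates by area and capture of the IDs of failing areas, can be sketched in Python as follows. The data layout and function name are assumptions for illustration and do not describe the LISTERR implementation.

```python
def areas_failing_tolerance(statistics, tolerance):
    """Return IDs of areas whose error rate exceeds the tolerance.

    statistics maps an area ID to (error_count, record_count); tolerance
    is the highest acceptable error rate, expressed as a fraction.
    """
    failing = []
    for area_id, (errors, records) in statistics.items():
        # An area with no records has no measurable error rate.
        rate = errors / records if records else 0.0
        if rate > tolerance:
            failing.append(area_id)
    return failing
```

With a tolerance of 0.10, an area with 20 errors in 100 records fails the check while an area with 5 errors in 100 records passes.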
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
FROM DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness and efficiency of the system, and on ease of use by developing country programmers and other census personnel.
cc Su-a-- e Olds APiA SDi OrWJulio Ortuizar luotolaCelta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
Overview of Workshop
1000 - 1030 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
200 - 230 User Program Organization
- Divisions
- Sections
- Routines
- Commands
230 - 245 Break
245 - 325 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
Dictionary Division Command Statements
930 am - 1030
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
1200 - 115 pm Break
115 pm - 215 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
930 am - 1030 Discussion of Participants Dictionary problems
Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
1200 - 115 pm Break
115 pm - 215 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
215 - 230 Break
230 - 325
- Pass/Fail clauses
- List
Thursday January 10
930 am - 1030 Discussion of Problems
Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
1200 - 115 pm Break
115 pm - 215
- If
- Until/Exit
- Stop
215 - 230 Break
230 - 325
- Drecode
- Grecode
Friday January 11
930 am - 1030 Edit Specification Command Statements (continued)
- Output
- Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
-Report Examples
1200 - 115 pm Break
115 pm - 325 Hand out Edit Specifications as problem
Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360/370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am-1030 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR
- Submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
1200-115 pm Break
115-325 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
ALLOCATE (ALL) | ALLOCATE (ALLOC) | Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) | ASSERT (ASRT) | The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) | DRECODE (DRCD) | Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) | EXIT | Provides the means to terminate execution of (escape from) command statements within the UNTIL command.
END | END-IF, END-THEN, END-ELSE | A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR | not implemented | No longer supported in the language.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
FILTER TYPE ( ) WITH | not implemented | Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE) | GRECODE (GRCD) | This new command provides a method for the group recoding of input values.
IF | IF | The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET | LET | Virtually unchanged.
(none) | LIST | New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
(none) | OUTPUT | New command which provides the means by which records will be written to the output-file.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword | STOP keyword | Unchanged.
(none) | UNTIL DO END-DO | One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
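The presorting requirement noted above can be illustrated with a minimal Python sketch. It is not CONCOR code: the record layout (a dictionary with an "area" key) and the function name are assumptions made for illustration only. It shows how a report that summarizes at each area break produces a split summary when records for one area are not contiguous.

```python
from itertools import groupby


def area_summaries(records):
    """Return (area, record_count) for each area break met while reading in order."""
    summaries = []
    for area, group in groupby(records, key=lambda record: record["area"]):
        summaries.append((area, sum(1 for _ in group)))
    return summaries


# Hypothetical input: area "01" appears in two separate runs.
unsorted_input = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
sorted_input = sorted(unsorted_input, key=lambda record: record["area"])

unsorted_report = area_summaries(unsorted_input)  # area "01" is reported twice
sorted_report = area_summaries(sorted_input)      # one summary per area
```

Sorting on the control field before the run, as the comment above requires, is what guarantees one summary per area.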
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
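The automatic range check proposed in item (13) above can be sketched in Python as follows. This is an illustration only and not CONCOR syntax: the field names, the ranges, and the function name are hypothetical, and the report does not say how such declarations would actually be coded.

```python
# Hypothetical dictionary declarations: identifier name -> (lowest, highest) valid value.
FIELD_RANGES = {"AGE": (0, 99), "SEX": (1, 2)}


def out_of_range_fields(record, field_ranges=FIELD_RANGES):
    """Return the names of declared fields whose value is missing or outside its range."""
    failures = []
    for name, (low, high) in field_ranges.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return failures


flagged = out_of_range_fields({"AGE": 120, "SEX": 1})  # AGE exceeds its declared range
clean = out_of_range_fields({"AGE": 30, "SEX": 2})     # every value is in range
```

Keeping the ranges with the declarations, rather than in comment statements, is also what would let tabulation software read them from the same dictionary.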
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run-total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
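The lookup table approach suggested in item (4) above can be sketched in Python. The report does not show the code EDITOR currently generates for DRECODE, so the chain of comparisons below is only an assumed stand-in for a non-table approach, and the recode values are hypothetical. The sketch relies on the property stated elsewhere in this report that DRECODE input values start at zero and continue in ascending sequence, which is what allows the input value to serve directly as a table index.

```python
# Hypothetical recode list: input code i is recoded to RECODE_VALUES[i].
RECODE_VALUES = [9, 1, 1, 2, 3]


def recode_by_comparison(code):
    """Assumed stand-in: test each candidate input code in turn."""
    for candidate, new_value in enumerate(RECODE_VALUES):
        if code == candidate:
            return new_value
    raise ValueError(f"code {code} has no recode value")


def recode_by_lookup(code):
    """Lookup table: use the input code directly as an index."""
    if 0 <= code < len(RECODE_VALUES):
        return RECODE_VALUES[code]
    raise ValueError(f"code {code} has no recode value")


same_results = all(
    recode_by_comparison(code) == recode_by_lookup(code)
    for code in range(len(RECODE_VALUES))
)
```

Both functions return the same values; the lookup version does a single bounds check and one index operation regardless of how long the recode list is.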
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
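The weighting scheme proposed in item (5) above amounts to multiplying each sampled value by its sampling weight before totalling. A minimal Python sketch follows; the field names, the weights, and the function name are hypothetical, and how weights would be declared in CONCOR is not specified in this report.

```python
def weighted_total(records, field, weight_field="weight"):
    """Sum a field over sample records, scaling each value by its sampling weight."""
    return sum(record[field] * record[weight_field] for record in records)


# Hypothetical sample: each record stands for `weight` households in the population.
sample = [
    {"persons": 4, "weight": 10.0},
    {"persons": 2, "weight": 25.0},
]

estimated_persons = weighted_total(sample, "persons")  # 4 * 10.0 + 2 * 25.0
unweighted_persons = sum(record["persons"] for record in sample)
```

The unweighted sum describes only the sample itself, which is why summary statistics for survey data are not meaningful without the weights.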
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR exact values and utility were unknown
These apparently new counters permit access to valuable totals within control areas
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g. IN, OUT
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page for the division-by-zero test. The message DIVISION BY ZERO LINE NUMBER 118 is printed repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program text (source lines 111-125) consists of LET and LIST statements; it sets R001 and R003 to zero and, at line 118, appears to divide R002 by R001. The remaining detail is not legible in this copy.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page for the array coordinate test. The error messages are not legible in this copy. The superimposed program text (source lines 173-187) consists of two nested UNTIL ... VARY ... DO loops stepping R001 and R002 from 1 by 1, the outer loop ending when R001 exceeds 10. The loops enclose an ALLOCATE from TA1 (R001 R002), an UPDATE of TA2 (R001 R002), and several LIST statements, and each loop is closed by END-DO.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page for the user-specified termination test. The messages JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215 and EDITOR -- NORMAL END OF JOB are printed. The superimposed program text (source lines 214-218) shows an IF on IN-RECORD-COUNT with THEN STOP-RUN END-THEN, followed by a second IF on IN-RECORD-COUNT with a LIST statement. The remaining detail is not legible in this copy.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001): BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002): ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003): END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()).
ERROR (DD-004): COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string
and stored as an object module on the system, no other compilations should be required for questionnaire processing files of the same type. Theoretically, a single well-written CONCOR program is all that would be required to process an entire census run.
Appendix H contrasts the internal identifiers of the old and new language versions. Without such identifiers a user would have little information about the status of input as it is processed by EDITOR. As noted in the appendix, most internal pointers are reset upon each break in the CONTROL-AREA, provided a CONTROL-AREA has been defined. The limitation here is that there are obvious instances when terminating processing on the basis of run counts would be advantageous even though a CONTROL-AREA has been specified, e.g. debugging CONCOR programs or comparing input files. Therefore another set of pointers should be implemented for this purpose and made available for programmer reference.
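The distinction between counters that reset at each control-area break and the recommended run-level counters can be shown with a short Python sketch. The class and attribute names are hypothetical and do not correspond to CONCOR internal identifiers.

```python
class RecordCounters:
    """Keep one count that resets at each area break and one that never resets."""

    def __init__(self):
        self.current_area = None
        self.area_count = 0  # reset whenever the control area changes
        self.run_count = 0   # accumulated over the whole run

    def read(self, area):
        if area != self.current_area:
            self.current_area = area
            self.area_count = 0
        self.area_count += 1
        self.run_count += 1


counters = RecordCounters()
for area in ["01", "01", "02"]:
    counters.read(area)
```

After the third record the area-level count has fallen back to 1 while the run-level count is 3. A test such as stopping after a fixed number of input records therefore needs the run-level counter whenever a control area is defined.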
One clearly disturbing development, which needs to be pursued during in-depth testing of the system, concerns the MAX-STORAGE parameter of the DEFINE-RECORD statement. As shown in the figure on the following page, when MAX-STORAGE was set equal to the maximum value, a COBOL program was generated which required 1000K of core to run. The MAX-STORAGE value of 999 is clearly not realistic under most processing circumstances. This example drives home several important points about CONCOR. The core requirements of CONCOR-generated programs can be influenced significantly by the amount or nature of programmer-specified I/O operations. In fact, it is possible to generate a program of a size most foreign country machines could not process. It is recommended that tests determine a realistic maximum-value restriction for implementation to prevent problems in this area.
The final area of recommended modification concerns the newly implemented REPORT-DIVISION. The purpose of the REPORT-DIVISION is to enable a user to describe or specify certain CONCOR language statements which will generate statistical reports. These reports contain statistics generated by EDITOR as specified by the GENERATE-EDIT-STATISTICS command of the EXECUTION-DIVISION. All of the reports produced are organized according to the data fields defined by the AREA-CONTROL command of the DATA-DICTIONARY. If the AREA-CONTROL command is not defined in the DATA-DICTIONARY, then all the statistics are summarized at the total run level. If a control area field is defined, then all statistics will be summarized for each unique CONTROL AREA as encountered by the EDITOR program on the input file. Statistics by total run level will not be available. This in part relates back to previous discussions citing the need for new internal identifiers. Report listings may contain the values of entire records or entire questionnaires, depending upon the keyword used in the report generation commands. The problem centers upon the homogeneity of CONCOR printouts during a production run.
It is virtually impossible to distinguish reports on the basis of the volumes they were run against. Some means should be provided to allow users to uniquely and purposefully label the reports generated in this division. Indeed, the very name REPORT-DIVISION suggests that such a command is implicit and appropriate. Such a LABEL-REPORT or REPORT-FILE command, along with file information from the system, should not be difficult to implement.
FIGURE 3
[Figure 3: CONCOR (Consistency and Correction) Edit and Imputation System, User Dictionary Division source listing, followed by the job step log. Lines 72-74 of the listing show DEFINE-RECORD RECORD-NAME=HOUSING-RECORD with MAX-STORAGE=999, and lines 268-270 show DEFINE-RECORD RECORD-NAME=POPULATION-RECORD with MAX-STORAGE=999. The log for step LOAD reports VIRT 1000K and REGION (REQUESTED - 1000K USED - 1000K). The remaining detail is not legible in this copy.]
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the general high-performance character of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility and, moreover, endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents, the Diagnostic Message Guide is clearly complete and adequate and needs no changes except message number tabs to enable a programmer to locate the text of a message number more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide, as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that with some reorganization and editing it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but was aimed at an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core and nature of utilities supporting users. Upon installation a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix for reference.
3 The development of the Users Guide should include an intensive
review of the editing concepts involved in processing census
data files beyond the POPSTAN materials
4 An explanation of the CONCOR benchmark program should appear
in the Users Guide and the Systems Manual The running of a
supplied benchmark program should be a standard installation
protocol used to test all operational aspects of a new
installation
5 The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing a programming language, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then to exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought to be considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate the success or failure of such projects in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
Bucen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
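The tolerance checking proposed in items 3C and 3D can be illustrated with a short sketch. The Python below is an illustration only and is not CONCOR code: the (area, error flag) record layout, the function name failing_areas and the use of a single run-wide tolerance are all assumptions made for the example.

```python
from collections import defaultdict


def failing_areas(records, tolerance):
    """Return the IDs of areas whose error rate exceeds `tolerance`.

    `records` is an iterable of (area_id, has_error) pairs. Both the
    record layout and the single tolerance value are assumptions.
    """
    totals = defaultdict(int)   # records seen per area
    errors = defaultdict(int)   # records with at least one error per area
    for area_id, has_error in records:
        totals[area_id] += 1
        if has_error:
            errors[area_id] += 1
    # An area fails when its error rate is strictly above the tolerance.
    return sorted(a for a in totals if errors[a] / totals[a] > tolerance)


sample = [
    ("A01", False), ("A01", True), ("A01", False), ("A01", False),
    ("B02", True), ("B02", True), ("B02", False), ("B02", False),
]
print(failing_areas(sample, tolerance=0.30))
```

With the sample data, area A01 has an error rate of 0.25 and area B02 a rate of 0.50, so only B02 is captured at a tolerance of 0.30.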
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop.
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [word illegible] debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on the completeness, usefulness and efficiency of the system and on its ease of use by developing country programmers and other census personnel.
cc [name partly illegible] Olds; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
Overview of Workshop
1000 - 1030 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
200 - 230 User Program Organization
- Divisions
- Sections
- Routines
- Commands
230 - 245 Break
245 - 325 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
930 am - 1030 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
1200 - 115 pm Break
115 pm - 215 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
930 am - 1030 Discussion of Participants' Dictionary problems
Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
1200 - 115 pm Break
115 pm - 215 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
215 - 230 Break
230 - 325
- Pass/Fail clauses
- List
Thursday January 10
930 am - 1030 Discussion of Problems
Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
1200 - 115 pm Break
115 pm - 215
- If
- Until/Exit
- Stop
215 - 230 Break
230 - 325
- Drecode
- Grecode
Friday January 11
930 am - 1030 Edit Specification Command Statements (continued)
- Output
- Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
-Report Examples
1200 - 115 pm Break
115 pm - 325 Hand out Edit Specifications as problem
Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am - 1030 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR
- Submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
1200 - 115 pm Break
115 - 325 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords provide the user with the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items which have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional construction. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
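The two recoding styles described for DRECODE and GRECODE can be sketched briefly. The Python below is an illustration under stated assumptions and does not reproduce CONCOR syntax: the function names, the MARITAL table and the AGE_GROUPS ranges are hypothetical. It assumes that a direct recode uses the input value (0, 1, 2 ...) as a position in a table, and that a group recode maps ranges of input values to a single code.

```python
def drecode(value, table):
    """Direct recode: use `value` (0, 1, 2, ...) as an index into `table`.

    Returns None when the value falls outside the table, leaving the
    handling of such values to the caller.
    """
    if 0 <= value < len(table):
        return table[value]
    return None


def grecode(value, groups):
    """Group recode: return the code of the first (low, high, code) group
    whose inclusive bounds contain `value`, or None if no group matches."""
    for low, high, code in groups:
        if low <= value <= high:
            return code
    return None


# Hypothetical tables, invented for the example.
MARITAL = [9, 1, 2, 2, 3]                            # input codes 0..4
AGE_GROUPS = [(0, 14, 1), (15, 64, 2), (65, 98, 3)]  # single years to broad groups

print(drecode(3, MARITAL), grecode(40, AGE_GROUPS))
```

The direct form costs one table access per value, whereas the group form scans the list of ranges; for a short list of groups the difference is unlikely to matter.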
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
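The interplay of a range test, an array kept current with valid values, and the assignment of a stored value to a failing item can be sketched as follows. This is a generic hot-deck illustration in Python, offered on the assumption that the RANGE, UPDATE and ALLOCATE commands are combined in roughly this way; it is not CONCOR code, and the field names (sex, age), the limits and the starting value are hypothetical.

```python
def edit_ages(records, low=0, high=98, start_value=30):
    """Range-check `age`; impute failures from the last valid age by sex.

    Returns the edited records and the number of imputations made.
    """
    hot_deck = {}   # last valid age seen, keyed by sex (the "array")
    imputes = 0
    edited = []
    for rec in records:
        rec = dict(rec)                     # leave the input untouched
        age = rec.get("age")
        if age is not None and low <= age <= high:
            hot_deck[rec["sex"]] = age      # pass path: update the array
        else:
            # fail path: allocate the stored value (or the starting value)
            rec["age"] = hot_deck.get(rec["sex"], start_value)
            imputes += 1
        edited.append(rec)
    return edited, imputes


sample = [
    {"sex": 1, "age": 40},
    {"sex": 1, "age": None},
    {"sex": 2, "age": 120},
    {"sex": 1, "age": 25},
    {"sex": 1, "age": -3},
]
result, count = edit_ages(sample)
print([r["age"] for r in result], count)
```

The count of imputations is kept separately so that it can be compared with the number of records examined, which is the kind of figure a tolerance check would need.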
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
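The presorting requirement can be seen in a small sketch. The Python below assumes, for illustration only, that an area break is detected by comparing each record's area code with that of the preceding record, which is how itertools.groupby behaves; the report does not describe the internal logic of CONCOR, and the record layout is hypothetical.

```python
from itertools import groupby


def area_breaks(records):
    """Count records in each contiguous run of the same area code.

    `records` is a sequence of (area_code, payload) pairs.
    """
    return [(area, sum(1 for _ in run))
            for area, run in groupby(records, key=lambda r: r[0])]


unsorted_file = [("A", 1), ("B", 2), ("A", 3)]
print(area_breaks(unsorted_file))          # area A is reported twice
print(area_breaks(sorted(unsorted_file)))  # one report line per area
```

On the unsorted file the two records for area A fall into separate runs, so the area would appear twice in any report organized by area break; sorting on the area code beforehand brings them together.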
There is currently no command to enable users to specify unique report headings on the listing from this division.
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
11 Software
111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
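The signed numeric format mentioned in item (11) can be illustrated with a short sketch. The Python below shows one way a fixed-width field containing leading blanks and an optional sign, as a FORTRAN program might write it, could be converted to a number. The function name and the decision to treat an all-blank field as missing are assumptions made for the example, not a description of CONCOR.

```python
def parse_signed_field(field):
    """Convert a fixed-width text field such as '   -12' to an integer.

    An all-blank field is returned as None (treated here as missing).
    Anything other than blanks, one optional sign and digits is rejected.
    """
    text = field.strip()
    if not text:
        return None
    sign = 1
    if text[0] in "+-":
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError(f"not a signed numeric field: {field!r}")
    return sign * int(text)


for raw in ["   -12", "  007", "+5", "      "]:
    print(repr(raw), parse_signed_field(raw))
```

Embedded blanks and a sign with no digits are rejected rather than silently converted, so that such fields can be counted as errors by whatever edit follows.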
112 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
113 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360370 series type computers
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also to examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
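The difference between a linear search and a lookup table, as raised for the GENSRC program in item (2), can be sketched as follows. The Python below is an illustration only: the internal structure of GENSRC is not described in this report, the keyword list is merely borrowed from Appendix F, and the function names are hypothetical.

```python
# Keywords borrowed from Appendix F; the list GENSRC actually searches is unknown.
KEYWORDS = ["ALLOCATE", "ASSERT", "DRECODE", "EXIT", "GRECODE", "IF",
            "LET", "LIST", "OUTPUT", "RANGE", "STOP", "UNTIL", "UPDATE", "WRITE"]


def find_linear(word):
    """Linear search: compare against each keyword in turn."""
    for position, keyword in enumerate(KEYWORDS):
        if keyword == word:
            return position
    return -1


# Lookup table built once; each later search is a single table access.
KEYWORD_INDEX = {keyword: position for position, keyword in enumerate(KEYWORDS)}


def find_lookup(word):
    """Table lookup: return the same position without scanning the list."""
    return KEYWORD_INDEX.get(word, -1)


print(find_linear("UPDATE"), find_lookup("UPDATE"))
```

Both functions return the same position for every keyword; the table simply avoids the repeated comparisons, which matters most for words near the end of the list and for words that are not keywords at all.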
12 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
21 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifer name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
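The check-in procedure proposed in item (2) amounts to comparing the master list of selected cases with the identifiers actually found on the input file. A minimal Python sketch follows; the function name and the identifier format are assumptions made for illustration, not part of CONCOR.

```python
def check_in(master_ids, received_ids):
    """Compare the master list of expected cases with the cases received.

    Returns the expected cases that never arrived and the received
    cases that were not on the master list.
    """
    expected = set(master_ids)
    received = set(received_ids)
    return {
        "missing": sorted(expected - received),
        "unexpected": sorted(received - expected),
    }
```

For example, `check_in(["A1", "A2", "A3"], ["A1", "A3", "B9"])` reports `A2` as missing and `B9` as unexpected.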
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown because of the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics page with the test program text superimposed. Diagnostics printed by the system: "DIVISION BY ZERO LINE NUMBER 118" (repeated), followed by "RUN ABORTED DUE TO DIVISION BY ZERO ERRORS". In the superimposed program text (source lines 111-125), R001 and R003 are first set to zero, and line 118 is a LET statement computing R003 from R002 and R001.]

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
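The behavior tested here, a diagnostic that names the offending source line instead of a silent abend, can be sketched in a few lines of Python. This is an illustration only and not the EDITOR implementation: the statements are modeled as hypothetical Python functions paired with line numbers, and the message wording is copied from the Execution Statistics page.

```python
def run(statements, registers):
    """Execute (line_number, action) pairs; report the line that divides by zero."""
    messages = []
    for line_number, action in statements:
        try:
            action(registers)
        except ZeroDivisionError:
            messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            break
    return messages


def set_to_zero(r):
    # Stand-in for the statement that sets R001 and R003 to zero.
    r["R001"] = 0
    r["R003"] = 0


def divide(r):
    # Stand-in for the LET statement on line 118.
    r["R003"] = r["R002"] / r["R001"]


PROGRAM = [(115, set_to_zero), (118, divide)]
```

Calling `run(PROGRAM, {"R001": 1, "R002": 7, "R003": 0})` returns the two messages, the first of which identifies line 118.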
[Execution Statistics page with the test program text superimposed; the printed diagnostics are not legible. The superimposed program text (source lines 173-187) shows two nested UNTIL ... VARY ... DO loops stepping R001 and R002 from 1 by 1 until each exceeds 10, an ALLOCATE from TA1 (R001 R002), an UPDATE of TA2 (R001 R002), several LIST commands, and two closing END-DO commands.]

Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
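The array coordinate test can be pictured with a short Python sketch, given here under stated assumptions: the message wording and the `update_cell` helper are hypothetical, and only the table sizes (a 10 by 10 loop updating a 5 X 5 table) come from the comments above.

```python
def update_cell(table, row, col, value, line_number):
    """Store value at 1-based (row, col); return a diagnostic if out of bounds."""
    rows, cols = len(table), len(table[0])
    if not (1 <= row <= rows and 1 <= col <= cols):
        return (f"SUBSCRIPT ({row},{col}) OUTSIDE {rows} X {cols} TABLE "
                f"AT LINE {line_number}")
    table[row - 1][col - 1] = value
    return None


ta2 = [[0] * 5 for _ in range(5)]   # the smaller 5 X 5 table
errors = []
for r in range(1, 11):              # loop bounds sized for a 10 X 10 table
    for c in range(1, 11):
        message = update_cell(ta2, r, c, r * c, 181)
        if message:
            errors.append(message)
```

Of the 100 references made by the loops, 25 fall inside the 5 X 5 table and 75 are reported, each with the line number of the offending statement.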
[Execution Statistics page with the test program text superimposed. Messages printed by the system: "JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215" and "EDITOR -- NORMAL END OF JOB". The superimposed program text shows line 214, an IF test on IN-RECORD-COUNT, and line 215, THEN STOP-RUN END-THEN.]

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN Parsing began again with the next string.
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string.
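The first two checks in the excerpt can be sketched as follows. This is a rough Python illustration, not the CONCOR parser. It assumes the legal first characters are only those the excerpt names legibly (A-Z, 0-9, -, +, the comma and =); the remaining delimiter characters are not legible in the excerpt, so they are left to a hypothetical `extra_legal` parameter. For simplicity the sketch looks only at the first string on a line.

```python
import string

# Assumption: only the characters legible in the guide excerpt are listed here.
LEGAL_FIRST = set(string.ascii_uppercase + string.digits + "-+,=")


def check_line(line, extra_legal=""):
    """Return (code, text) for DD-001 or DD-002, or None if neither applies.

    extra_legal is a hypothetical parameter for the delimiter characters
    that cannot be read in the excerpt.
    """
    if line.strip() == "":
        return ("DD-001", "BLANK COMMAND STATEMENT LINE")
    first = line.lstrip()[0]
    if first not in LEGAL_FIRST | set(extra_legal):
        return ("DD-002", "ILLEGAL FIRST CHARACTER IN STRING")
    return None
```

A caller would log the returned code and, as the guide describes, continue parsing with the next line or string rather than stop.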
FIGURE 3
C O N C O R                                              PAGE
( C O N S I S T E N C Y  A N D  C O R R E C T I O N )    RUN DATE
E D I T  A N D  I M P U T A T I O N  S Y S T E M
USER DICTIONARY DIVISION-SOURCE LISTING
LINE NUMBER
70
71
72 DEFINE-RECORD RECORD-NAME= HOUSING-RECORD
73 MAX-STORAGE= 999
74 RECORD-TYPE 11t NOTE AN LITERAL
267
268 DEFINE-RECORD RECORD-NAME= POPULATION-RECORD
269 MAX-STORAGE= 999
270 RECORD-TYPE= F1
IEF285I JES2JOB04966SO0114 SYSOUT
IEF285I SYS8nO18T095553RAOOOAMCP12ARO000005 PASSED
IEF285I VOL SER NOS= TMP002
IEF285I wHKHDGMCONCORP12AERRORRECS KEPT
IEF285I VOL SER NOS= TMP003
IEF373I STEP LOAD START 800181001
IEF374I STEP LOAD STOP 800181002 CPU 0MIN 0165SEC SRB 0MIN 025SEC VIRT 1000K SYS 3
STEP START TIME - 100155 STEP END TIME - 100257 STEP CPU TIME - 165 SECONDS STEP COMPLETION CODE - C 0
0 TAPES - 0 DISKS - 0 REGION (REQUESTED - 1000K USED - 1000K) IO COUNT 211
CARDS READ - 0 CARDS PUNCHED - 0 LINES PRINTED - 22
STEPNAME - LOAD CHARGE (MACHINE UNITS - 1198 DOLLARS- S-77 5TONUM9E 1
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the generally high performance of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility, and moreover endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate, and needs no changes except the addition of message number tabs to enable a programmer to locate the text of a message number more quickly. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide, as originally specified in the ISPC enhancement proposal.
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but centered upon an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1. Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the Users Guide should include an intensive review of the editing concepts involved in processing census data files, beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the Users Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing a programming language, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent, and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BuCen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
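Items 3C and 3D of the proposal (tolerance checking of error rates by area, and capture of the IDs of failing areas) can be illustrated with a brief Python sketch. The data layout and function name are assumptions made for illustration; the proposal does not specify how rates are computed.

```python
def failing_areas(area_counts, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    area_counts maps an area ID to (records_in_error, records_read);
    tolerance is the highest acceptable error rate, between 0 and 1.
    """
    failed = []
    for area_id, (in_error, read) in sorted(area_counts.items()):
        rate = in_error / read if read else 0.0
        if rate > tolerance:
            failed.append(area_id)
    return failed
```

With a 10 percent tolerance, an area with 20 errors in 100 records is captured while an area with 5 errors in 100 records passes.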
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO: DSPOPFPSD Robert H Haladay
DATE: December 3 1979
FROM: DSPOPDEIO Liliane Floge
SUBJECT: Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual, and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [...]ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness, and efficiency of the system and on ease of use by developing country programmers and other census personnel.
cc: Suzanne Olds, APHA; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7

9:30 am - 10:00   Welcoming Remarks
                  Overview of Workshop
10:00 - 10:30     Introduction to CONCOR
                  - Purpose and function
                  - History of development
                  - General computer requirements
10:30 - 10:45     Break
10:45 - 12:00     Editing Concepts
                  - Ways to interrogate data
                  - Ways to correct data
                  - Editing housing and population data - POPSTAN
                  - Advantages of CONCOR
12:00 - 1:15 pm   Break
1:15 - 2:00       System Description
                  - Constraints in design of CONCOR
                  - Basic subsystems of CONCOR
                  - User interactions with system
                  - Examples of outputs produced
2:00 - 2:30       User Program Organization
                  - Divisions
                  - Sections
                  - Routines
                  - Commands
2:30 - 2:45       Break
2:45 - 3:25       Command Language Description
                  - Types of statements
                  - Format
                  - Syntax
Tuesday January 8

9:30 am - 10:30   Dictionary Division Command Statements
                  - Punctuation
                  - Input data referencing
                  - Working data declarations
                  - Internal data representation and storage
10:30 - 10:45     Break
10:45 - 12:00     Dictionary-Attributes-Section
                  - Dictionary-Name
                  File-Section
                  - Input-File
                  - Output-File
                  - Write-File
                  - Error-File
12:00 - 1:15 pm   Break
1:15 pm - 2:15    Input-Record-Section
                  - Define-Record
                  Working-Data-Section
                  - New-Data
                  - Array-Data
2:15 - 2:30       Break
2:30 - 3:25       Dictionary Examples
                  - Minimum dictionary structure
                  - Maximum dictionary structure
                  - Hand out dictionary problem
Wednesday January 9

9:30 am - 10:30   Discussion of Participants' Dictionary problems
                  Free work time
10:30 - 10:45     Break
10:45 - 12:00     Execution Division Command Statements
                  - Punctuation
                  - Subscripting
                  - Internal Identifiers
                  - Report-Control-Section
                    - Generate-Edit-Statistics
                    - Count-Imputes
                    - Examples
12:00 - 1:15 pm   Break
1:15 pm - 2:15    Execution Division Command Statements (continued)
                  - Routines of Edit-Specifications-Section
                    - Prolog-Routine
                    - Filter-Routine
                    - Epilog-Routine
                  - Types and functions of edit specification commands
                  - Range
                  - Assert
2:15 - 2:30       Break
2:30 - 3:25       - Pass/Fail clauses
                  - List
Thursday January 10

9:30 am - 10:30   Discussion of Problems
                  Free work time
10:30 - 10:45     Break
10:45 - 12:00     Edit Specification Command Statements (continued)
                  - Allocate
                  - Update
                  - Let
12:00 - 1:15 pm   Break
1:15 pm - 2:15    - If
                  - Until/Exit
                  - Stop
2:15 - 2:30       Break
2:30 - 3:25       - Drecode
                  - Grecode
Friday January 11

9:30 am - 10:30   Edit Specification Command Statements (continued)
                  - Output
                  - Write
10:30 - 10:45     Break
10:45 - 12:00     Report Division Command Statements
                  - Display-Control-Section
                    - Display-Edit-Statistics
                  - Tolerance-Control-Section
                    - Error-Rate-Check
                    - Reject-File
                  - Report Examples
12:00 - 1:15 pm   Break
1:15 pm - 3:25    Hand out Edit Specifications as problem
                  Free work time
Monday January 14

9:30 am - 10:30   Discuss procedures for running problems on computer
10:30 - 10:45     Break
10:45 - 12:00     Component Programs of the CONCOR system
12:00 - 1:15 pm   Break

Tuesday January 15

9:30 am - 3:25 pm Free work time

Wednesday January 16

9:30 am - 12:00   Free work time
12:00 - 1:15 pm   Break
1:15 pm - 2:15    How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30       Break
2:30 - 3:25       Free work time

Thursday January 17

9:30 am - 3:25    Free work time
1:15 pm - 2:15    Future CONCOR Design Considerations
                  - for census processing
                  - for survey processing
                  - manual correction system
2:15 - 2:30       Break
2:30 - 2:45       Evaluation Guidelines
                  - Hand out evaluation forms
2:45 - 3:25       Free work time

Friday January 18

9:30 am - 10:30   Free work time
10:30 - 10:45     Break
10:45 - 12:00     Group Discussion on CONCOR
                  - submission of written evaluations
                  Distribution of IBM installation tapes
                  Presentation of Certificates to Participants
12:00 - 1:15 pm   Break
1:15 - 3:25       Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in the language.
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
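The table says only that ALLOCATE, together with UPDATE, provides the imputation mechanism and that UPDATE maintains the arrays involved. One common way such a pair is used is the general "hot deck" technique, sketched below in Python. The technique name, the field names, and the function are assumptions introduced for illustration and are not taken from the CONCOR documentation.

```python
def edit_ages(records, hot_deck):
    """Impute missing ages from the most recent valid record of the same sex.

    records is a list of dicts with 'sex' and 'age' (None when missing);
    hot_deck maps each sex code to a starting value and is modified in place.
    Returns the number of imputations made.
    """
    imputes = 0
    for record in records:
        if record["age"] is None:
            # ALLOCATE-like step: take the stored value for this group.
            record["age"] = hot_deck[record["sex"]]
            imputes += 1
        else:
            # UPDATE-like step: remember the latest valid value.
            hot_deck[record["sex"]] = record["age"]
    return imputes
```

The count returned here plays the role that an impute counter would play in the edit statistics.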
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIViSION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
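The presorting requirement can be seen in a small Python sketch of control-break processing. It is an illustration under assumptions: the record layout and function name are hypothetical, and `itertools.groupby` stands in for any logic that starts a new area whenever the area code changes.

```python
from itertools import groupby


def area_breaks(records):
    """Count records per run of identical area codes, as a control break sees them."""
    return [(area, len(list(group)))
            for area, group in groupby(records, key=lambda r: r["area"])]


unsorted_file = [{"area": "02"}, {"area": "01"}, {"area": "02"}]
sorted_file = sorted(unsorted_file, key=lambda r: r["area"])
```

On the unsorted file, area 02 is reported twice as two separate one-record areas; after sorting it is reported once with both records.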
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields.
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division. Another problem would be the formatting of the extra
data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL
commands. This would allow the greatest flexibility in choosing
control fields. In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification.
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement. This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command. A possible problem here
is in the determination of which value is to be moved to the output
record when the record is to be written out using the OUTPUT
command.
(9) Implement a COMMON-DATA concept to handle items common to all
record types. This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type. Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record.
This would require the CONCOR program to perform many more
internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now
permitted. External numeric data (type N) could be expanded to
handle 18 digit numbers. Also, signed numeric and packed
decimal data formats could be added. The signed numeric format
would allow both leading blanks and negative data values. This
format is essential when dealing with data files generated by
FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
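The automatic range check proposed in item (13) can be sketched as follows. This is a minimal Python illustration, not CONCOR syntax; the dictionary contents, identifier names, and ranges are hypothetical. The point is only that, once valid ranges live in the dictionary, every declared item can be checked before any user edit statement runs.

```python
# Hypothetical dictionary: each user-identifier carries its valid range,
# as enhancement (13) proposes. Names and ranges are illustrative only.
DICTIONARY = {
    "AGE": (0, 99),
    "SEX": (1, 2),
    "RELATIONSHIP": (1, 9),
}

def automatic_range_check(record, dictionary=DICTIONARY):
    """Return (identifier, value) pairs that fall outside the declared range.

    A missing item (None) is also reported as a failure.
    """
    failures = []
    for name, (low, high) in dictionary.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append((name, value))
    return failures

record = {"AGE": 134, "SEX": 2, "RELATIONSHIP": 0}
print(automatic_range_check(record))
```

The same dictionary could be read by tabulation software, which is the secondary benefit the item mentions.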
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach.
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs.
(5) Permit the user to continue the message text field over
successive lines of code.
(6) Implementation of a SET command that would change
CONCOR's default occurrence pointer for the duration of the SET
command. This command would be very helpful, as the validation
of the existence of a record would only have to be done once,
when the SET command was encountered. The major drawback to
this command is in the user's ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers. The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed. The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers.
(9) Implement automatic array declarations for capturing allocation
frequency distributions. This could be done by the system for
discrete values (especially if point number 13 above is
implemented), but a mechanism for handling continuous values
would need to be designed. Another program would then be
required to format the information gathered into a readable
form.
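The lookup table approach suggested for DRECODE in item (4) can be contrasted with a chain of range tests in a short Python sketch. The recode rules and function names below are hypothetical and do not represent the code CONCOR generates; the sketch only shows why a precomputed table trades a little storage for a single index operation per value.

```python
def recode_linear(value, rules):
    """Recode by testing each (low, high, new_value) rule in turn."""
    for low, high, new_value in rules:
        if low <= value <= high:
            return new_value
    return value  # no rule matched: leave the value unchanged

def build_lookup_table(rules, max_value):
    """Precompute the recoded result for every code from 0 to max_value."""
    return [recode_linear(v, rules) for v in range(max_value + 1)]

# Hypothetical rules grouping single years of age into coded bands.
AGE_RULES = [(0, 14, 1), (15, 64, 2), (65, 99, 3)]
AGE_TABLE = build_lookup_table(AGE_RULES, max_value=99)

def recode_lookup(value, table=AGE_TABLE):
    """Recode with a single index operation instead of a chain of tests."""
    return table[value] if 0 <= value < len(table) else value

# Both approaches agree on every value, including those outside the table.
assert all(recode_lookup(v) == recode_linear(v, AGE_RULES) for v in range(120))
print(recode_lookup(37))
```

The table must be built once from the recode specification, which parallels the additional communication between generator programs that the item anticipates.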
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided. This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began. This
extra procedure is required because of program size
considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also, to examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and
questionnaire identification. CELADE has a COBOL version of
this program, but it has been neither finished nor tested.
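The weighting scheme of item (5), which in turn relies on the floating point calculations of item (3), can be sketched in a few lines of Python. The record layout and the field name for the weight are assumptions for illustration only; each weight is taken to be the inverse of the unit's selection probability.

```python
def weighted_total(records, field, weight_field="weight"):
    """Estimate a population total as the sum of value * sampling weight."""
    return sum(rec[field] * rec[weight_field] for rec in records)

def weighted_mean(records, field, weight_field="weight"):
    """Weighted mean: weighted total divided by the sum of the weights."""
    total_weight = sum(rec[weight_field] for rec in records)
    return weighted_total(records, field, weight_field) / total_weight

# Hypothetical edited survey records; 'weight' is the inverse of each
# household's selection probability under the sampling frame.
households = [
    {"persons": 4, "weight": 100.0},
    {"persons": 2, "weight": 250.0},
    {"persons": 6, "weight": 50.0},
]
print(weighted_total(households, "persons"))
print(weighted_mean(households, "persons"))
```

Unweighted counts from the same three records would suggest a mean household size of 4, whereas the weighted estimate is 3, which is why summary statistics from a sample are not meaningful without the weights.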
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates about when some variables
are reset--most are initialized with each control area break
Though implemented as reserved identifiers due to poor documentation of (old) CONCOR exact values and utility were unknown
These apparently new counters permit access to valuable totals within control areas
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g. IN, OUT
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR-EDITOR EXECUTION STATISTICS

[Execution Statistics printout, largely illegible in this copy. The legible messages read DIVISION BY ZERO, LINE NUMBER 118 (repeated several times), followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program text (lines 112-125) sets registers R001 and R003 to zero with a LET statement, lists them, and then at line 118 executes a LET statement assigning R003 from R002 and R001.]

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason (ABEND) without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
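The behavior exercised by this test, trapping a division by zero, reporting the offending source line, and abandoning the run once errors accumulate, can be sketched in Python. This is an illustration under stated assumptions rather than a description of the EDITOR program: the error limit, the choice to return zero after a trapped division, and the function names are all hypothetical.

```python
def protected_divide(numerator, denominator, line_number, log):
    """Divide, but log a diagnostic with the source line instead of crashing.

    Returning 0 after a trapped division is an assumption of this sketch.
    """
    if denominator == 0:
        log.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
        return 0
    return numerator // denominator

def run_edit(questionnaires, error_limit=5):
    """Apply a toy edit statement to each (R001, R002) pair.

    The run is abandoned once the number of division errors reaches
    error_limit (a hypothetical limit, chosen for illustration).
    """
    log = []
    for r001, r002 in questionnaires:
        protected_divide(r002, r001, line_number=118, log=log)
        if len(log) >= error_limit:
            log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            break
    return log

for message in run_edit([(0, 7)] * 10, error_limit=3):
    print(message)
```

The value of the diagnostic lies in the line number: the programmer can go directly to the statement at fault instead of inferring it from an unexplained abnormal end.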
CONCOR-EDITOR EXECUTION STATISTICS

[Execution Statistics printout of array coordinate error messages, illegible in this copy. The superimposed program text (lines 173-187) consists of two nested DO loops, the outer beginning UNTIL R001 > 10 VARY R001 FROM 1 BY 1 and the inner varying R002 FROM 1 BY 1, which enclose an ALLOCATE from TA1 (R001 R002), an UPDATE of TA2 (R001 R002) with R003, and several LIST statements, each loop closed by END-DO.]

Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
CONCOR-EDITOR EXECUTION STATISTICS

[Execution Statistics printout, largely illegible in this copy. The legible messages read JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215 and EDITOR -- NORMAL END OF JOB. The superimposed program text (lines 214-218) contains an IF test on IN-RECORD-COUNT followed at line 215 by THEN STOP-RUN END-THEN, and a second IF test on IN-RECORD-COUNT whose THEN clause issues a LIST message.]

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
Concluding Remarks of System Modifications
Version 1 of the COBOL CONCOR language was notorious for abending without the generation of diagnostic messages. Appendix I sets forth a test of some of these historical problems. In each instance the system provided messages and source program line references which enabled immediate identification of the processing difficulties. However, tests of the various commands revealed the general high performance character of the language in its newest version. Under these circumstances the recommended modifications should be considered as ones which will uniformly allow CONCOR to realize the potential of its design, enhance its utility, and moreover endure as a software product.
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate, and could bear no changes except to provide message number tabs to enable a programmer to more quickly locate the text of a message number. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal
A thorough assessment of the Systems Manual has been undertaken and indicates that, with some reorganization and editing, it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but centered upon an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the systems manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time, he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting
users. Upon installation a copy of this form could be
sent to the US agency which will ultimately be responsible
for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix
for reference
3 The development of the Users Guide should include an intensive
review of the editing concepts involved in processing census
data files beyond the POPSTAN materials
4 An explanation of the CONCOR benchmark program should appear
in the Users Guide and the Systems Manual. The running of a
supplied benchmark program should be a standard installation
protocol used to test all operational aspects of a new
installation
5 The development of a CONCOR pocket card. This tool, which has
been proven useful in teaching and assisting programmers in
utilizing a programming language, lays out all command options on
a single small card. An example of such a pocket card is the
IBM Assembler Card. Such a small document is useful in clarifying
the language and setting forth structural rules without
continual reference to full-size manuals
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent, and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
Bucen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
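Items 3C and 3D of the proposal, tolerance checking of error rates by area and capture of the IDs of failing areas, can be sketched in Python. The data layout, the area IDs, and the tolerance value are hypothetical, and the sketch does not claim to reproduce how the ERROR-RATE-CHECK command computes its rate.

```python
def areas_failing_tolerance(area_stats, tolerance):
    """Return IDs of areas whose error rate exceeds the tolerance.

    area_stats maps an area ID to (items_changed, items_examined).
    An area with nothing examined is treated as having a rate of zero.
    """
    failing = []
    for area_id, (changed, examined) in area_stats.items():
        rate = changed / examined if examined else 0.0
        if rate > tolerance:
            failing.append(area_id)
    return failing

# Hypothetical per-area edit statistics and a 5 percent tolerance.
stats = {"0101": (12, 400), "0102": (45, 500), "0103": (0, 0)}
print(areas_failing_tolerance(stats, tolerance=0.05))
```

Here area 0101 has a rate of 3 percent and passes, while area 0102 has a rate of 9 percent and would be captured for review.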
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO: DSPOPFPSD Robert H Haladay
DATE: December 3 1979
FROM: DSPOPDEIO Liliane Floge
SUBJECT: Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop
1 Are the Installation Manual, the System Manual, and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [illegible]ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness, and efficiency of the system, and on ease of use by developing country programmers and other census personnel
cc [name partly illegible] Olds, APHA; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
  Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
  - Purpose and function
  - History of development
  - General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
  - Ways to interrogate data
  - Ways to correct data
  - Editing housing and population data
  - POPSTAN
  - Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
  - Constraints in design of CONCOR
  - Basic subsystems of CONCOR
  - User interactions with system
  - Examples of outputs produced
2:00 - 2:30 User Program Organization
  - Divisions
  - Sections
  - Routines
  - Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
  - Types of statements
  - Format
  - Syntax
Tuesday January 8
Dictionary Division Command Statements
9:30 am - 10:30 Punctuation
  Input data referencing
  Working data declarations
  Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
  - Dictionary-Name
  File-Section
  - Input-File
  - Output-File
  - Write-File
  - Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
  - Define-Record
  Working-Data-Section
  - New-Data
  - Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
  - Minimum dictionary structure
  - Maximum dictionary structure
  - Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants Dictionary problems
  Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
  - Punctuation
  - Subscripting
  - Internal Identifiers
  - Report-Control-Section
    - Generate-Edit-Statistics
    - Count-Imputes
    - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
  - Routines of Edit-Specifications-Section
    - Prolog-Routine
    - Filter-Routine
    - Epilog-Routine
  - Types and functions of edit specification commands
  - Range
  - Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses
  - List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
  Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
  - Allocate
  - Update
  - Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If
  - Until/Exit
  - Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode
  - Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
  - Output
  - Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
  - Display-Control-Section
    - Display-Edit-Statistics
  - Tolerance-Control-Section
    - Error-Rate-Check
    - Reject-File
  - Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
  Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
  - for census processing
  - for survey processing
  - manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
  - Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
  - submission of written evaluations
  Distribution of IBM installation tapes
  Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2 If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly it should be noted that, while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. Major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, utilizing the DUMPR DUMPO keywords provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous XRECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional construction. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as well as the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
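The interplay of RANGE with its PASS and FAIL paths, and of ALLOCATE with UPDATE as the imputation mechanism, may be easier to follow in a small sketch. The following is written in Python, not in the CONCOR language, and is only an analogue of the logic described in the table above. The field names, the 0-98 range, the choice of sex as the imputation cell, and the names edit_age, hot_deck and imputes are all assumptions made for illustration.

```python
# Sketch only: a Python analogue of a RANGE test with PASS / FAIL paths.
# Names and the 0-98 range are illustrative; this is not CONCOR syntax.

def edit_age(record, hot_deck, imputes):
    """Range-check record['age'] and impute from the hot deck on failure."""
    key = record["sex"]                  # imputation cell chosen by sex
    if 0 <= record["age"] <= 98:         # PASS path: value is in range
        hot_deck[key] = record["age"]    # like UPDATE: store the latest good value
    else:                                # FAIL path: value is out of range
        record["age"] = hot_deck[key]    # like ALLOCATE: assign the stored value
        imputes["age"] += 1              # count the imputation for the statistics
    return record


hot_deck = {1: 30, 2: 28}                # starting values for each cell
imputes = {"age": 0}
records = [
    {"sex": 1, "age": 41},
    {"sex": 1, "age": 999},              # out of range, will be imputed
    {"sex": 2, "age": 17},
]
edited = [edit_age(r, hot_deck, imputes) for r in records]
```

In this sketch the second record receives the age of the most recent valid record in the same cell, which is the usual hot-deck idea; whether a given CONCOR program arranges its arrays this way is up to the edit specification.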
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
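Because reports break on the area control field and noncontiguous records are not merged, an input file that is not presorted would produce a separate report block each time an area code reappears. A minimal Python sketch of a pre-run check follows. It is not part of CONCOR; the record layout (a dictionary with an "area" key) and the function names are assumptions for illustration.

```python
from itertools import groupby


def area_breaks(records, key="area"):
    """Return the area code seen at each control break, in file order."""
    return [area for area, _ in groupby(records, key=lambda r: r[key])]


def is_contiguous(records, key="area"):
    """True when every area code forms a single unbroken run in the file."""
    breaks = area_breaks(records, key)
    return len(breaks) == len(set(breaks))


# Area "01" appears in two separate runs, so this file needs presorting.
unsorted_file = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
sorted_file = sorted(unsorted_file, key=lambda r: r["area"])
```

Running such a check before the edit run is cheaper than discovering split area totals in the printed reports.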
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
11 Software
111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of different data item types in the CONTROL
commands This would allow the greatest flexibility in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value is to be moved to the output
record when the record is to be written out using the OUTPUT
command
(9) Implement a COMMON-DATA concept to handle items common to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This would require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IMPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential when dealing with data files generated by
FORTRAN programs
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
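Item (11) above asks for a signed numeric format that tolerates leading blanks and negative values, as written by FORTRAN programs. A minimal Python sketch of how such a field might be converted is given below. It is an illustration only: the function name and the decision to return None for blank or non-numeric fields are assumptions, not CONCOR behaviour.

```python
def parse_signed_field(text):
    """Convert a fixed-width field such as '  -12' to an int.

    Returns None for a blank or non-numeric field instead of raising,
    so the caller can count it as a NOT-NUMBER or BLANK condition.
    """
    stripped = text.lstrip(" ")           # leading blanks are only padding
    if stripped == "":
        return None                       # wholly blank field
    sign = 1
    if stripped[0] in "+-":               # optional leading sign
        sign = -1 if stripped[0] == "-" else 1
        stripped = stripped[1:]
    if not (stripped.isascii() and stripped.isdigit()):
        return None                       # embedded blanks, letters, lone sign
    return sign * int(stripped)
```

A field of five characters holding two blanks, a minus sign and the digits 12 converts to the negative value, while a lone sign or a field with embedded letters is reported as not numeric rather than aborting the run.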
112 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command that would change
CONCORs default occurrence pointer for the duration of the SET
command This command would be very helpful as the validation
of the existence of a record would only have to be done once
when the SET command was encountered The major drawback to
this command is in the users ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed Another program would then be
required to format the information gathered into a readable
form
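Item (4) above proposes that DRECODE be generated as a table lookup. Since DRECODE source values start at zero and ascend, the old value can serve directly as an index. The Python sketch below contrasts the two approaches; it says nothing about the COBOL the EDITOR generator actually emits, and the recode values and function names are invented for illustration.

```python
def recode_chain(value):
    """Recode by testing each source value in turn (cost grows with the list)."""
    if value == 0:
        return 9
    if value == 1:
        return 1
    if value == 2:
        return 1
    if value == 3:
        return 2
    return None                      # value not covered by the recode


# Position i holds the new code for old value i (values start at zero).
RECODE_TABLE = [9, 1, 1, 2]


def recode_lookup(value):
    """Recode with a single bounds check and one indexed fetch."""
    if 0 <= value < len(RECODE_TABLE):
        return RECODE_TABLE[value]
    return None                      # out of table, same result as the chain
```

Both functions give the same answers; the table version does constant work per record regardless of how many codes are listed, which is the saving the item describes.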
113 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began This
extra procedure is required because of program size
considerations
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360370 series type computers
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also to examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
12 Documentation
(1)The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
21 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected This new input file would contain the master list of those observations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
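The check-in procedure of item (2) amounts to comparing the questionnaire identifiers received against a master list of the cases selected. A minimal Python sketch follows, assuming (hypothetically) that both lists are available as sequences of identifier strings; the function name and identifier format are illustrative only.

```python
def check_in(master_ids, received_ids):
    """Compare received questionnaires against the master sample list."""
    expected = set(master_ids)
    received = set(received_ids)
    return {
        "missing": sorted(expected - received),      # selected but not received
        "unexpected": sorted(received - expected),   # received but not selected
    }


result = check_in(
    master_ids=["Q001", "Q002", "Q003"],
    received_ids=["Q001", "Q003", "Q007"],
)
```

Here Q002 would be reported as missing and Q007 as unexpected, the two conditions a survey check-in normally needs to surface before editing begins.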
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note Current documentation equivocates when some variables
are reset--most are initialized with each control area break
Though implemented as reserved identifiers due to poor documentation of (old) CONCOR exact values and utility were unknown
These apparently new counters permit access to valuable totals within control areas
Note These are not cumulative counters Also OUTPUT-QUEST-COUNT violates the naming convention (eg IN- and OUT- prefixes)
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page with the test program text superimposed. The run log repeats the message DIVISION BY ZERO LINE NUMBER 118 and ends with RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The legible program text (lines 112-125) consists of LIST and LET statements: R001 and R003 are set to zero, and line 118 is a LET statement assigning R003 from R002 and R001.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page with the test program text superimposed. The run log consists of repeated error lines that are largely illegible in this copy. The legible program text (lines 175-187) shows two nested UNTIL ... VARY ... DO loops stepping R001 and R002 from 1 by 1, containing an ALLOCATE from TA1 (R001 R002), an UPDATE of TA2 (R001 R002) and several LIST statements, each loop closed by END-DO.]
Comments: In this example R001 a 2 dimensional array with 10 rows and 10 columns is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Printout: execution statistics page with the test program text superimposed. The run log reads JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215 followed by EDITOR -- NORMAL END OF JOB. The legible program text (lines 214-218) contains IF IN-RECORD-COUNT ... THEN STOP-RUN END-THEN, followed by a second IF on IN-RECORD-COUNT whose THEN branch is a LIST statement.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
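The behaviour these three tests exercise, trapping a data exception, reporting the source line number, and aborting once too many errors have accumulated, can be sketched in a few lines of Python. This is an analogue for illustration only and does not reproduce the EDITOR program. The out-of-bounds message wording, the error limit values and every name in the sketch are assumptions; only the division by zero message follows the printout.

```python
class RunAborted(Exception):
    """Raised when the number of trapped errors reaches the limit."""


def run_guarded(statements, error_limit=3):
    """Run (line_number, action) pairs, logging trapped errors by line number."""
    log = []
    errors = 0
    for line_number, action in statements:
        try:
            action()
        except ZeroDivisionError:
            errors += 1
            log.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
        except IndexError:
            errors += 1
            # Message wording is hypothetical, not taken from CONCOR.
            log.append(f"ARRAY REFERENCE OUT OF BOUNDS LINE NUMBER {line_number}")
        if errors >= error_limit:
            log.append("RUN ABORTED")
            raise RunAborted(log)
    return log


regs = {"R001": 0, "R002": 5, "R003": 0}
table = [[0] * 5 for _ in range(5)]          # a 5 x 5 array, like TA2
statements = [
    (118, lambda: regs.update(R003=regs["R002"] // regs["R001"])),
    (181, lambda: table[9][9]),              # row 9 does not exist
]
log = run_guarded(statements, error_limit=10)
```

The point of the design is that a bad record costs one logged line rather than an unexplained ABEND, while the error limit still stops a run whose logic is plainly wrong.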
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
IV THE ADEQUACY OF COBOL CONCOR DOCUMENTATION
The current documentation of CONCOR consists of a newly developed, relatively voluminous systems manual, a Diagnostic Message Guide, and some loose-leaf instructions concerning the installation of the system on IBM computers. Of these three documents the Diagnostic Message Guide is clearly complete and adequate, and could bear no changes except to provide message number tabs to enable a programmer to more quickly locate the text of a message number. An example of the superior format of this guide appears as Appendix J.
Conspicuously absent from the list of systems documents is a Users Guide as originally specified in the ISPC enhancement proposal
A thorough assessment of the Systems Manual has been undertaken and indicates that with some reorganization and editing it could be made a more useful and usable document, as it is well constituted informationally. In its current form the Systems Manual can best be described as a working document: a document which was not intended to teach the CONCOR language but centered upon an audience already familiar with previous CONCOR releases. Further, it is not a systems manual per se, as some of its components and appendixes could more appropriately compose a users guide. For example, excerpts from the POPSTAN publications which review the basic editing concepts and contain the POPSTAN example problem should be deleted from the Systems Manual and would be more contextually effective in the Users Guide. The EXECUTION-DIVISION chapter of the Systems Manual would especially benefit from general editing and reorganization. (Specifically, the contents of pages 90-92 should be moved to page 63; these pages explain the placement and purpose of the three permissible routines of the CONCOR language. Had the author been granted more time he could have outlined additional specific modifications to bring existing documentation to a level of adequacy.) However, some guidelines stand out:
1 Installation instructions and considerations should compose a chapter of the Systems Manual rather than a separate document. (Hand-out materials covering installation were not adequate.) Further, a documentation scheme setting forth how CONCOR was installed on each computer, and any modifications required during installation, could be serially added to and become a permanent part of the installation's systems manual. If CONCOR is going to be used, and if agencies are going to actively support it as a package, installation limitations must be known and well-documented. The prescribed format for this documentation should be set forth, and any modifications to CONCOR should be continuously updated on these sheets so that a history of changes to the language will be readily accessible to supporting agencies. If CONCOR is to survive as a standardized system, editing results must be capable of being replicated among installations. Such a form could contain other information as well, concerning the type of computer, amount of core, and nature of utilities supporting users. Upon installation a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2 A complete COBOL CONCOR program should appear in an appendix
for reference
3 The development of the Users Guide should include an intensive
review of the editing concepts involved in processing census
data files beyond the POPSTAN materials
4 An explanation of the CONCOR benchmark program should appear
in the Users Guide and the Systems Manual The running of a
supplied benchmark program should be a standard installation
protocol used to test all operational aspects of a new
installation
5 The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing a programming language, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G presents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
BUCEN Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
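Items 3C and 3D above (tolerance checking of error rates by area, and capture of the IDs of areas failing the check) can be sketched as follows. The data layout (dictionaries keyed by area ID) and the function name are assumptions made for illustration only; the proposal does not specify how the check is computed.

```python
def areas_failing_tolerance(error_counts, record_counts, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    error_counts, record_counts: dicts keyed by area ID (assumed layout).
    tolerance: maximum acceptable errors per record, e.g. 0.05 for 5 percent.
    """
    failing = []
    for area in sorted(record_counts):
        records = record_counts[area]
        if records == 0:
            continue  # no records processed, so no rate can be computed
        rate = error_counts.get(area, 0) / records
        if rate > tolerance:
            failing.append(area)
    return failing
```

With 5 errors in 50 records for area 001 and 1 error in 100 records for area 002, a 5 percent tolerance would flag only area 001.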
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
FROM DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [illegible]ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness and efficiency of the system and on ease of use by developing country programmers and other census personnel.
cc [illegible] Olds APHA, [illegible], Julio Ortuzar Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
Overview of Workshop
1000 - 1030 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
200 - 230 User Program Organization
- Divisions
- Sections
- Routines
- Commands
230 - 245 Break
245 - 325 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
930 am-1030 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
1200 - 115 pm Break
115 pm-215 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
930 am-1030 Discussion of Participants Dictionary problems
Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
1200 - 115 pm Break
115 pm-215 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
215 - 230 Break
230 - 325 - PassFail clauses
- List
Thursday January 10
930 am-1030 Discussion of Problems
Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
1200 - 115 pm Break
115 pm-215 - If
- UntilExit
- Stop
215 - 230 Break
230 - 325 - Drecode
- Grecode
Friday January 11
930 am-1030 Edit Specification Command Statements (continued)
- Output
- Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
- Report Examples
1200 - 115 pm Break
115 pm-325 Hand out Edit Specifications as problem
Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am-1030 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR
- Submission of written evaluations
- Distribution of IBM installation tapes
- Presentation of Certificates to Participants
1200-115 pm Break
115-325 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP- keyword STOP- keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
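The interplay of UPDATE and ALLOCATE described in the comparison above can be illustrated by a minimal sketch of a hot-deck style imputation. This is not CONCOR code: the record layout, the valid age range and the array keyed by sex are assumptions chosen for illustration, and the actual semantics of the two commands may differ.

```python
def edit_ages(records, low=0, high=98, initial=None):
    """Impute invalid ages from the last valid age seen for the same sex.

    records: list of (sex, age) pairs, with age None when blank (assumed layout).
    initial: optional starting values for the imputation array.
    Returns the edited records and the number of imputations made.
    """
    hot_deck = dict(initial or {})  # the array maintained during the edit
    imputes = 0
    edited = []
    for sex, age in records:
        if age is not None and low <= age <= high:
            hot_deck[sex] = age          # UPDATE-like step: store a valid value
        elif sex in hot_deck:
            age = hot_deck[sex]          # ALLOCATE-like step: assign a stored value
            imputes += 1
        edited.append((sex, age))
    return edited, imputes
```

The returned count corresponds in spirit to the impute statistics the report describes; a record that fails the range test before any valid value has been stored is left unchanged in this sketch.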
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
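The need to presort on the AREA-CONTROL field can be seen in a small sketch. It assumes, purely for illustration, that the area code occupies the first two characters of each record and that area breaks are detected by comparing consecutive records, which is how the report's remark about noncontiguous records is read here.

```python
from itertools import groupby

def area_breaks(records, key):
    """Count records per contiguous run of the same area key."""
    return [(area, len(list(group))) for area, group in groupby(records, key=key)]

def area_of(record):
    # Hypothetical layout: area code in columns 1-2.
    return record[:2]

unsorted_file = ["02A", "01B", "02C"]
# Without presorting, area 02 is reported as two separate control areas.
print(area_breaks(unsorted_file, area_of))
# After an external sort on the area field, each area appears once.
print(area_breaks(sorted(unsorted_file, key=area_of), area_of))
```

Python's sort is stable, so records keep their original order within each area, which matters when questionnaires consist of several consecutive records.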
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
11 Software
111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
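The signed numeric format proposed in item (11) above can be sketched as a field-conversion routine. The routine below is an illustrative assumption about how such a field might be read (leading blanks permitted, an optional sign directly before the digits); it is not a description of any CONCOR conversion code.

```python
import re

# Optional sign followed by one or more digits, nothing else.
_SIGNED = re.compile(r"[+-]?[0-9]+")

def parse_signed_field(record, start, length):
    """Convert a fixed-width field to an integer, or None if blank or invalid.

    start is a zero-based offset into the record (assumed convention).
    """
    text = record[start:start + length].strip()
    if _SIGNED.fullmatch(text) is None:
        return None  # blank field, embedded blanks, or non-numeric characters
    return int(text)
```

A field written by a FORTRAN program with a right-justified integer format, such as two blanks followed by -12, would convert to the value -12, while an all-blank field is reported as missing rather than as zero.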
112 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
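The lookup table approach suggested in item (4) above can be sketched briefly. Because DRECODE is described as recoding values that start at zero and continue in ascending sequence, the original value can serve directly as an index into a table of new codes, replacing a chain of comparisons with a single access. The table contents and the handling of out-of-table values below are illustrative assumptions.

```python
def drecode(value, table, default=None):
    """Recode value by direct table lookup; value is assumed to be an integer.

    table[i] holds the new code for original value i (i = 0, 1, 2, ...).
    Values outside the table return the default rather than raising an error.
    """
    if 0 <= value < len(table):
        return table[value]
    return default

# Hypothetical recode: original codes 0-3 collapsed into new codes 9, 1, 1, 2.
recode_table = [9, 1, 1, 2]
```

The cost of the lookup does not grow with the number of codes, which is the source of the reduction in execution time that the item anticipates.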
113 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360370 series type computers
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, to examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
12 Documentation
(1)The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
21 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
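The check-in procedure of item (2) above amounts to comparing the questionnaire identifiers received against the master list of selected cases. A minimal sketch follows; the identifier format and the shape of the returned report are assumptions made for illustration.

```python
def check_in(master_ids, received_ids):
    """Compare received questionnaire IDs against the master sample list."""
    master = set(master_ids)
    received = set(received_ids)
    return {
        "missing": sorted(master - received),     # selected but not received
        "unexpected": sorted(received - master),  # received but not selected
    }
```

Cases listed as missing would be followed up in the field, while unexpected cases point to identification errors that should be resolved before editing begins.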
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR, exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[CONCOR-EDITOR execution statistics printout, largely illegible in reproduction. Recoverable content: the message DIVISION BY ZERO, LINE NUMBER 118 is reported repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Superimposed program text, lines 111-125, partly legible: registers R001 and R003 are set to zero at line 115, and line 118 is a LET command assigning to R003 a value computed from R002 and R001.]
[Second CONCOR-EDITOR execution statistics printout, illegible in reproduction.]
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Superimposed program text, lines 173-187, partly legible: two nested UNTIL ... DO loops vary R001 and R002 from 1 by 1; within them an ALLOCATE command reads TA1 (R001 R002), an UPDATE command writes TA2 (R001 R002), and LIST commands display the values.]
[Third CONCOR-EDITOR execution statistics printout, largely illegible in reproduction. Recoverable content: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
[Superimposed program text, lines 214-218, partly legible: line 214 is an IF command testing IN-RECORD-COUNT, and line 215 reads THEN STOP-RUN END-THEN.]
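The behaviour tested in this appendix (report each data exception with the source line number that caused it, and abort the run after repeated errors) can be sketched as follows. This is an illustration in Python, not the generated COBOL EDITOR program; the function names, the message wording for array errors, and the error limit of 6 are all assumptions, since the printouts do not state the actual limit.

```python
def run_with_diagnostics(statements, registers, max_errors=6):
    """Execute (line_number, action) pairs, reporting data exceptions by line number.

    max_errors is a hypothetical limit, not a documented CONCOR value.
    """
    messages = []
    errors = 0
    for line_number, action in statements:
        try:
            action(registers)
        except ZeroDivisionError:
            errors += 1
            messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
        except IndexError:
            errors += 1
            messages.append(f"ARRAY COORDINATE ERROR LINE NUMBER {line_number}")
        if errors >= max_errors:
            messages.append("RUN ABORTED DUE TO DATA EXCEPTIONS")
            break
    return messages


def set_zero(registers):
    # Stands in for a command that sets register R001 to zero.
    registers["R001"] = 0


def divide(registers):
    # Stands in for a LET command that divides R002 by R001.
    registers["R003"] = registers["R002"] / registers["R001"]


if __name__ == "__main__":
    program = [(115, set_zero), (118, divide)]
    for message in run_with_diagnostics(program, {"R001": 1, "R002": 5, "R003": 0}):
        print(message)
```

The point of the design is that the exception is caught at the statement level, so the message can carry the line number instead of the run ending silently.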
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
users. Upon installation, a copy of this form could be sent to the US agency which will ultimately be responsible for supporting the CONCOR package.
2. A complete COBOL CONCOR program should appear in an appendix for reference.
3. The development of the User's Guide should include an intensive review of the editing concepts involved in processing census data files beyond the POPSTAN materials.
4. An explanation of the CONCOR benchmark program should appear in the User's Guide and the Systems Manual. The running of a supplied benchmark program should be a standard installation protocol used to test all operational aspects of a new installation.
5. The development of a CONCOR pocket card. This tool, which has been proven useful in teaching and assisting programmers in utilizing programming languages, lays out all commands and options on a single small card. An example of such a pocket card is the IBM Assembler Card. Such a small document is useful in clarifying the language and setting forth structural rules without continual reference to full-size manuals.
V CONCLUSION
In a situation of extreme need CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no systematic structural redefinitions would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable User's Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
Bucen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO: DSPOPFPSD, Robert H Haladay
DATE: December 3 1979
FROM: DSPOPDEIO, Liliane Floge
SUBJECT: Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1. Are the Installation Manual, the System Manual and the User's Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be [...]ely debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on completeness, usefulness and efficiency of the system, and ease of use by developing country programmers and other census personnel.
cc Su-a-- e Olds APiA SDi OrWJulio Ortuizar luotolaCelta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
  - Generate-Edit-Statistics
  - Count-Imputes
  - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
  - Prolog-Routine
  - Filter-Routine
  - Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
  - Display-Edit-Statistics
- Tolerance-Control-Section
  - Error-Rate-Check
  - Reject-File
- Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, it provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous XRECODE command.
EXIT Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values.
IF IF The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET LET Virtually unchanged.
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
OUTPUT New command which provides the means by which records will be written to the output-file.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP-keyword STOP-keyword Unchanged.
UNTIL DO END-DO DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
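The command table describes ALLOCATE and UPDATE as jointly providing the imputation mechanism, with UPDATE maintaining the arrays involved. A minimal hot-deck sketch of that idea follows, written in Python rather than CONCOR. The field names `sex` and `age`, the valid range 0-98, and the one-dimensional deck keyed by sex are all hypothetical choices made for illustration.

```python
def impute_age(records, initial_deck, valid=range(0, 99)):
    """Hot-deck imputation: valid values refresh the deck, invalid ones are replaced from it.

    Returns the number of imputed values. Records are modified in place.
    """
    deck = dict(initial_deck)  # last valid age seen, keyed by sex
    imputes = 0
    for rec in records:
        if rec["age"] in valid:
            deck[rec["sex"]] = rec["age"]   # analogous to UPDATE
        else:
            rec["age"] = deck[rec["sex"]]   # analogous to ALLOCATE
            imputes += 1
    return imputes


if __name__ == "__main__":
    people = [
        {"sex": 1, "age": 30},
        {"sex": 1, "age": None},   # missing value
        {"sex": 2, "age": 120},    # out of range
    ]
    count = impute_age(people, {1: 25, 2: 27})
    print(count, people)
```

The initial deck plays the role of the starting array values; the count of imputes corresponds loosely to the statistics that COUNT-IMPUTES is described as gathering.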
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
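The presorting requirement and the tolerance checking of error rates by area can be illustrated together with a control-break sketch. This is Python, not CONCOR; the record fields `area` and `in_error` and the tolerance value are assumptions. The sketch groups only contiguous records, so an unsorted file yields the same area more than once, which is the situation the comment above warns about.

```python
from itertools import groupby


def error_rates_by_area(records, tolerance):
    """Return (area, error_rate, passed) for each contiguous run of an area code."""
    results = []
    for area, group in groupby(records, key=lambda r: r["area"]):
        group = list(group)
        errors = sum(1 for r in group if r["in_error"])
        rate = errors / len(group)
        results.append((area, rate, rate <= tolerance))
    return results


if __name__ == "__main__":
    presorted = [
        {"area": "A", "in_error": True},
        {"area": "A", "in_error": False},
        {"area": "B", "in_error": False},
        {"area": "B", "in_error": False},
    ]
    for area, rate, passed in error_rates_by_area(presorted, tolerance=0.25):
        print(area, rate, "PASS" if passed else "FAIL")
```

Areas failing the check would be the ones whose identifiers a reject file captures.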
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
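Item (13) proposes that valid ranges declared in the dictionary drive an automatic range check before the user's edit logic sees the data. A minimal sketch of that idea in Python follows; the dictionary layout, the identifiers SEX and AGE, and their ranges are hypothetical and do not reflect actual DEFINE-RECORD syntax.

```python
# Hypothetical dictionary: user-identifier name -> (lowest valid value, highest valid value)
DICTIONARY = {
    "SEX": (1, 2),
    "AGE": (0, 98),
}


def range_check(record, dictionary=DICTIONARY):
    """Return the names of items that are missing, non-numeric, or outside the declared range."""
    failures = []
    for name, (low, high) in dictionary.items():
        value = record.get(name)
        if not isinstance(value, int) or not low <= value <= high:
            failures.append(name)
    return failures


if __name__ == "__main__":
    print(range_check({"SEX": 1, "AGE": 130}))
    print(range_check({"SEX": 3}))
```

Keeping the ranges in the dictionary, rather than in comment statements, is also what would make them available to tabulation software.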
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of how the new record's inclusion should be treated in the calculation of tolerance, both in the universe of observations and in the number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach.
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs.
(5) Permit the user to continue the message text field over
successive lines of code.
(6) Implement a SET command that would change
CONCOR's default occurrence pointer for the duration of the SET
command. This command would be very helpful, as the validation
of the existence of a record would only have to be done once,
when the SET command was encountered. The major drawback to
this command is in the user's ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement. This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed. The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers.
(9) Implement automatic array declarations for capturing allocation
frequency distributions. This could be done by the system for
discrete values (especially if item (13) of the Dictionary
Division list is implemented), but a mechanism for handling
continuous values would need to be designed. Another program
would then be required to format the information gathered into
a readable form.
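The lookup table approach suggested in item (4) can be sketched as follows. The sketch assumes, as the command comparison in Appendix F describes, that DRECODE recodes data items whose values start at zero and continue in ascending sequence, so the input value can index a table directly instead of being compared against each candidate in turn. The recode pairs and function names are hypothetical, and the sketch is in Python rather than the generated COBOL.

```python
# Hypothetical recode pairs (input code, output code); input codes start at 0.
RECODE_PAIRS = [(0, 9), (1, 1), (2, 1), (3, 2), (4, 2)]


def recode_linear(value, pairs=RECODE_PAIRS):
    """Linear search: compare the value against each pair until one matches."""
    for source, target in pairs:
        if source == value:
            return target
    return None


# Build the table once; position i holds the recoded value for input code i.
RECODE_TABLE = [target for _, target in sorted(RECODE_PAIRS)]


def recode_lookup(value, table=RECODE_TABLE):
    """Direct indexing: one bounds check followed by a single table access."""
    if 0 <= value < len(table):
        return table[value]
    return None
```

Both functions return the same result for every input; the table version replaces a chain of comparisons with a single indexed access, which is where the saving in execution time would come from.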
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means
of saving paper.
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided. This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began. This
extra procedure is required because of program size
considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
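The weighting scheme proposed in item (5) can be illustrated with a short Python sketch. It assumes each sampled record carries a hypothetical expansion weight (the number of population units the record represents); how such weights would be derived from the sampling frame is outside the scope of the sketch, and the function names are illustrative.

```python
def weighted_total(values, weights):
    """Estimate a population total as the sum of each value times its weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimate a population mean as the weighted total over the sum of weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must be non-zero")
    return weighted_total(values, weights) / total_weight
```

Without weights, totals accumulated during editing describe only the sample; with them, the same counters could describe the population the sample was drawn from.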
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables
are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR-EDITOR EXECUTION STATISTICS
[Printout not recoverable in full. Legible content: the message DIVISION BY ZERO, LINE NUMBER 118, repeated several times, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.]
[Superimposed program text, lines 111-125, partly legible; operators and quotation marks were lost in reproduction:]
114 LIST DIVISION BY ZERO ERROR FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS
118 LET R003 = R002 [operator illegible] R001
122 END TEST OF OPERANDS AND MSG
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
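The behavior described in the comments, reporting the source line number of each division by zero and aborting after repeated errors, can be sketched in Python. This is an illustration only, not the EDITOR program's actual logic: the statement tuple layout, the error limit, and the function name are assumptions, and the message wording merely echoes the printout.

```python
def run_statements(statements, registers, max_errors=5):
    """Execute (line_number, target, numerator, denominator) division statements.

    A zero denominator is reported with its line number instead of crashing;
    the run is aborted once max_errors such errors have been recorded.
    """
    diagnostics = []
    for line_number, target, numerator, denominator in statements:
        if registers[denominator] == 0:
            diagnostics.append(f"DIVISION BY ZERO, LINE NUMBER {line_number}")
            if len(diagnostics) >= max_errors:
                diagnostics.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                break
            continue
        # Integer division is an assumption made for this sketch.
        registers[target] = registers[numerator] // registers[denominator]
    return diagnostics
```

The point of the design is that every abnormal condition leaves a message naming the offending source line, which is what the old package failed to do.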
CONCOR-EDITOR EXECUTION STATISTICS
[Printout not recoverable. The surviving fragments appear to show a series of error lines, each giving a questionnaire, a coordinate, and a value.]
[Superimposed program text, lines 173-187, partly legible; identifiers normalized and some operators inferred:]
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
186 END-DO
187 END-DO
Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
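The array coordinate error in this example can be reproduced with a short Python sketch: a 10 by 10 pair of loop counters is used to update a 5 by 5 table, and each reference to a nonexistent element is reported rather than allowed to corrupt storage. The function name and message wording are hypothetical; only the array sizes and the identifier names R001, R002, and TA2 come from the example.

```python
def update_table(table, row, col, value):
    """Store value at 1-based (row, col); return an error message if out of bounds."""
    rows, cols = len(table), len(table[0])
    if not (1 <= row <= rows and 1 <= col <= cols):
        return f"ARRAY COORDINATE ERROR: ({row}, {col}) outside {rows} X {cols}"
    table[row - 1][col - 1] = value
    return None


ta2 = [[0] * 5 for _ in range(5)]  # the smaller 5 X 5 table
errors = []
for r001 in range(1, 11):          # counters run over 10 rows ...
    for r002 in range(1, 11):      # ... and 10 columns
        message = update_table(ta2, r001, r002, r001 * r002)
        if message:
            errors.append(message)
```

Only the 25 in-range coordinate pairs update the table; the remaining pairs each produce one diagnostic.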
CONCOR-EDITOR EXECUTION STATISTICS
[Printout not recoverable in full. Legible messages:]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
[Superimposed program text, partly legible:]
214 IF IN-RECORD-COUNT > [value illegible]
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < [value illegible]
218 THEN LIST [message text illegible]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
V. CONCLUSION
In a situation of extreme need, CONCOR could probably be utilized, as it seems capable of performing its editing functions. Its usefulness as a data cleaning tool is generally undisputed. The importance of having CONCOR exhaustively tested by an independent agency cannot be over-emphasized in light of prior failures of the system to perform adequately. Additionally, core requirements and processing speed should be precisely determined. One must be encouraged by the amount of progress which has been made in a relatively short amount of time towards the completion of the CONCOR package. While it is clearly beyond the scope of this report to estimate the cost and the amount of time required to implement the highly desirable changes to the CONCOR language and documentation outlined herein, and then exhaustively test the system, it is believed that this process could be completed within a single quarter. (The changes are of such a nature that no structural redefinitions of a systematic nature would be required.) These enhancements would make the COBOL CONCOR package more complete, consistent and reliable in the field. ISPC has competently moved the system to the point that its completion is within reach.
Given the probable revision of the current Systems Reference Manual and the development of a suitable Users Guide document, it is likely that this version of CONCOR could be taught to experienced programmers (unfamiliar with previous versions of the language) in as little as four weeks. And while one is impressed by the potential power and overall simplicity of the COBOL CONCOR language, it is difficult to envision individuals with no real computer programming experience utilizing its full capabilities. Hence the importance of including a review of basic editing concepts in the documentation is underlined.
Further, as documentation is often critical to the perceived ease of use of computer-based systems, it will be essential to have these various manuals examined independently prior to their general use.
As previously mentioned, Appendix G represents features which the ISPC feels could have a positive impact on CONCOR. Depending upon the outcome of the testing recommended in previous chapters and the availability of funding, supporting agencies may desire to look again at an assembler (ALC) version of the language.
Overall, COBOL CONCOR remains an intermediate product yet to realize its potential, which is thought considerable. Software development of this nature is characteristically of a long-term, evolutionary design. In light of this reality, agencies should be prepared to evaluate such projects' success or failure in terms other than that of immediate utility. It is the philosophy underlying the system, not the system itself, which will ultimately be exported.
APPENDIX A
Bucen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1. Easy to use interrecord referencing
2. Improved output file capabilities
   A. Provide overflow protection on WRITE command
   B. Provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3. Improved edit statistics reported (LISTERR)
   A. Provide automatic (user-specified) area break
   B. Provide options for compilation and displaying edit statistics at various levels
   C. Provide automatic (user-specified) tolerance checking of error rates by area
   D. Automatically capture IDs of areas failing tolerance check
4. Clean up known bugs in code
5. Comprehensive testing
6. Clean up and enhance documentation
   A. Reference manual: more examples, error message guide
   B. Installation guide
   C. Systems manual
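Items 3C and 3D, tolerance checking of error rates by area and capture of the IDs of failing areas, can be sketched in Python. The sketch assumes the error rate is simply the number of errors divided by the number of observations in the area; the data layout, the function name, and the treatment of empty areas are hypothetical and are not taken from the CONCOR reference material.

```python
def failing_areas(area_stats, tolerance):
    """Return the IDs of areas whose error rate exceeds the tolerance.

    area_stats maps an area ID to a pair (errors, observations).
    An area with no observations is treated as having a zero error rate.
    """
    failing = []
    for area_id, (errors, observations) in area_stats.items():
        rate = errors / observations if observations else 0.0
        if rate > tolerance:
            failing.append(area_id)
    return failing
```

For example, with a tolerance of 0.03, an area with 5 errors in 100 observations fails the check while an area with 1 error in 100 passes.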
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
TO: DSPOPFPSD, Robert H. Haladay
DATE: December 3, 1979
FROM: DSPOPDEIO, Liliane Floge
SUBJECT: Criteria for Evaluation of BuCen COBOL CONCOR, September 30th Version, as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1. Are the Installation Manual, the System Manual and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be [word illegible] debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on completeness, usefulness and efficiency of [illegible], and ease of use by developing country programmers and other census personnel.
cc: [names partly illegible] Olds, APHA; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule, January 7-18, 1980
U.S. Bureau of the Census, International Statistical Programs Center
Scuderi Building, Room 209, 4235 28th Avenue, Marlow Heights, Maryland
Monday, January 7
9:30 am - 10:00 Welcoming Remarks
Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday, January 8
Dictionary Division Command Statements
9:30 am - 10:30
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday, January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
  - Generate-Edit-Statistics
  - Count-Imputes
  - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
  - Prolog-Routine
  - Filter-Routine
  - Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25
- Pass/Fail clauses
- List
Thursday, January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15
- If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25
- Drecode
- Grecode
Friday, January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
  - Display-Edit-Statistics
- Tolerance-Control-Section
  - Error-Rate-Check
  - Reject-File
- Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday, January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday, January 15
9:30 am - 3:25 pm Free work time
Wednesday, January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday, January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday, January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- Submission of written evaluations
- Distribution of IBM installation tapes
- Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, it provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.
END END-IF, END-THEN, END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in the language.
FILTER TYPE ( ) WITH not implemented Option statements are no longer supported in the language. This command once specified essentially a GOTO DEPENDING ON construction.
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values.
IF IF The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET LET Virtually unchanged.
LIST New command which allows for the generation of specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.
OUTPUT New command which provides the means by which records will be written to the output-file.
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword STOP keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
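The presorting requirement can be illustrated with a short Python sketch. It assumes, as the comment above states, that an area break occurs whenever the area field changes between adjacent records, so records for the same area that are not contiguous are reported as separate areas. The record layout and function name are hypothetical.

```python
from itertools import groupby


def area_breaks(records, key="area"):
    """Count records per area break; only adjacent records are grouped together."""
    return [
        (area, sum(1 for _ in group))
        for area, group in groupby(records, key=lambda r: r[key])
    ]


unsorted_records = [{"area": "02"}, {"area": "01"}, {"area": "02"}]
# Presorting on the area field, done outside the editing step, makes each area contiguous.
presorted_records = sorted(unsorted_records, key=lambda r: r["area"])
```

On the unsorted input, area "02" is reported twice with one record each; after presorting it is reported once with both records.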
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census da a using automatic correction techniquns As a result CONCOR should not be construed to be the ultimate software package for generalized data editing The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process A second group of changes or enhancements also exist that could facilitate the editing of survey or other types of census data It should be noted that some of the modifications to meet these two objectives at times are mutually exclusive Some of the possible modifications and considerations are outlined below
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
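The automatic range check proposed in item (13) might work roughly as in the following Python sketch. The dictionary contents, identifier names, and function name are hypothetical; CONCOR itself stores no such ranges in the version reviewed.

```python
# Hypothetical dictionary entries: identifier -> (low, high), inclusive.
VALID_RANGES = {"AGE": (0, 98), "SEX": (1, 2)}

def range_failures(record, valid_ranges=VALID_RANGES):
    """Return the identifiers whose value is missing or outside the declared range."""
    failures = []
    for name, (low, high) in valid_ranges.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return sorted(failures)

one_bad = range_failures({"AGE": 34, "SEX": 3})
missing_field = range_failures({"AGE": 120})
all_good = range_failures({"AGE": 0, "SEX": 2})
```

Because the ranges live in the dictionary rather than in comment statements, the same table could also be read by tabulation software, which is the second benefit the item mentions.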
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if item 13 of the Dictionary Division list above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
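The lookup table approach suggested in item (4) can be contrasted with a sequential search in a few lines of Python. This is only a sketch under the assumption that DRECODE source values are small non-negative integers, as the command description elsewhere in this report indicates; the function names and the sample code pairs are hypothetical and do not describe the generated COBOL.

```python
def recode_linear(value, pairs):
    """Sequential search: compare the value against each (old, new) pair in turn."""
    for old, new in pairs:
        if value == old:
            return new
    return None

def build_table(pairs):
    """Build a table indexed directly by the old code (codes start at zero)."""
    table = [None] * (max(old for old, _ in pairs) + 1)
    for old, new in pairs:
        table[old] = new
    return table

def recode_lookup(value, table):
    """Direct index into the table; one step regardless of table size."""
    if 0 <= value < len(table):
        return table[value]
    return None

# Hypothetical recode of codes 0..4 into three groups
pairs = [(0, 1), (1, 1), (2, 2), (3, 2), (4, 3)]
table = build_table(pairs)
```

The table is built once, at generation time, which is why the item notes that the generating programs would need to exchange more information.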
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
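The check-in procedure of item (2) amounts to comparing the questionnaire identifiers received against a master list of selected cases. A minimal Python sketch follows; the identifiers, the function name, and the three report categories are assumptions for illustration only.

```python
def check_in(master_ids, received_ids):
    """Compare received questionnaire IDs against the master list of sample cases."""
    master = set(master_ids)
    counts = {}
    for qid in received_ids:
        counts[qid] = counts.get(qid, 0) + 1
    received = set(counts)
    return {
        "missing": sorted(master - received),        # selected but never received
        "unexpected": sorted(received - master),     # received but not selected
        "duplicated": sorted(q for q, n in counts.items() if n > 1),
    }

# Hypothetical sample: four cases selected, four questionnaires keyed
report = check_in(["Q001", "Q002", "Q003", "Q004"],
                  ["Q001", "Q003", "Q003", "Q009"])
```

A survey office would typically resolve all three lists before editing begins, since a missing or duplicated case distorts any weighted total.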
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g., IN-, OUT-.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR-EDITOR EXECUTION STATISTICS (division by zero example)
[Execution Statistics page with program text superimposed. The run date and the counts under the column headings (CONTROL AREA, SEQUENCE NUMBER, INPUTS READ, OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND) are not legible in this copy.]
Diagnostics printed:
DIVISION BY ZERO LINE NUMBER 118 (repeated several times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Program text (source lines 111-125, partly legible): a LIST statement announces that a division by zero error follows (line 114), a LET sets R001 and R003 to zero (line 115), and the LET at line 118 assigns R003 from R002 and R001.
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
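The behaviour reported on this page, a diagnostic naming the source line for each division by zero and an abort once too many have occurred, can be sketched in Python as follows. This is an illustration only, not the EDITOR program's logic: the error limit of 3, the record layout, and the function names are assumptions, and only the message wording is taken from the report page.

```python
def run_edit(records, statements, max_errors=3):
    """Run each (line_number, statement) on every record, logging divide errors."""
    log = []
    errors = 0
    for record in records:
        for line_no, statement in statements:
            try:
                statement(record)
            except ZeroDivisionError:
                errors += 1
                log.append(f"DIVISION BY ZERO LINE NUMBER {line_no}")
                if errors >= max_errors:      # assumed error limit
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    log.append("NORMAL END OF JOB")
    return log

def line_118(registers):
    """Stand-in for the statement at source line 118."""
    registers["R003"] = registers["R002"] / registers["R001"]

bad_records = [{"R001": 0, "R002": 5, "R003": 0} for _ in range(4)]
good_records = [{"R001": 1, "R002": 5, "R003": 0}]
aborted_log = run_edit(bad_records, [(118, line_118)])
normal_log = run_edit(good_records, [(118, line_118)])
```

Carrying the source line number alongside each generated statement is what lets the message point back at the user's program rather than at the generated code.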
CONCOR-EDITOR EXECUTION STATISTICS (array subscript example)
[Execution Statistics page with program text superimposed. The column values and the diagnostic messages on this page are not legible in this copy.]
Program text (source lines 173-187, partly legible): two nested UNTIL ... VARY loops, closed by END-DO at lines 186 and 187, step R001 and R002 upward from 1 and use them as subscripts in an ALLOCATE from TA1 (line 180) and an UPDATE of TA2 (line 181), with LIST statements displaying the values of TA1 and TA2.
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
CONCOR-EDITOR EXECUTION STATISTICS (user termination example)
[Execution Statistics page with program text superimposed. The counts under the column headings (CONTROL AREA, SEQUENCE NUMBER, INPUTS READ, OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND) are not legible in this copy.]
Diagnostics printed:
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Program text (source lines 214-218, partly legible): an IF on IN-RECORD-COUNT at line 214 is followed by THEN STOP-RUN END-THEN at line 215; a second IF on IN-RECORD-COUNT issues a LIST message.
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
APPENDIX A
BuCen Enhancement Proposal
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
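Items 3C and 3D of the proposal, tolerance checking of error rates by area and capture of the IDs of failing areas, can be sketched in Python as below. The definition of the error rate (items imputed divided by items examined), the 10 percent tolerance, and the area IDs are assumptions chosen for illustration; the proposal does not state how the rate is computed.

```python
def failing_areas(area_stats, tolerance):
    """Return IDs of areas whose error rate exceeds the tolerance.

    area_stats maps area ID -> (items_imputed, items_examined).
    """
    failed = []
    for area, (imputed, examined) in area_stats.items():
        rate = imputed / examined if examined else 0.0   # empty areas pass
        if rate > tolerance:
            failed.append(area)
    return sorted(failed)

# Hypothetical per-area counts accumulated during an edit run
stats = {"A01": (5, 100), "A02": (20, 100), "A03": (0, 0)}
rejected = failing_areas(stats, tolerance=0.10)
```

The captured list corresponds to what a reject file would hold: the areas to be sent back for review rather than accepted with their imputations.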
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO: DSPOPFPSD, Robert H Haladay
DATE: December 3, 1979
DSPOPDEIO, Liliane Floge
SUBJECT: Criteria for Evaluation of BuCen COBOL CONCOR, September 30th Version, as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual, and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [...]ely debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness, and efficiency of the system and ease of use by developing country programmers and other census personnel.
cc: Su-a--e Olds, APHA; SDi OrW; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
- Generate-Edit-Statistics
- Count-Imputes
- Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
- Prolog-Routine
- Filter-Routine
- Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
- Display-Edit-Statistics
- Tolerance-Control-Section
- Error-Rate-Check
- Reject-File
- Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd
Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public
Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. Major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR. Command now provides for a NOERROR and MESSAGE option. Additionally, by utilizing the DUMPR DUMPO keywords, it provides the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous RECODE command.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language.
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP --keyword STOP --keyword Unchanged.
UNTIL DO END-DO DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as well as the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
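The comments in this comparison describe ALLOCATE, together with UPDATE, as the imputation mechanism, with UPDATE maintaining the arrays involved. The following is only a minimal Python sketch of that general hot-deck idea. It is not CONCOR syntax and not the system's actual algorithm; the array layout (indexed by sex and age group), the field names, and the valid range are all assumptions made for illustration.

```python
# Hypothetical hot-deck sketch; not CONCOR syntax or its actual algorithm.
from typing import Optional

# Imputation array indexed by (sex, age_group), seeded with default values.
hot_deck = {(s, g): 0 for s in (1, 2) for g in range(3)}

def update(sex: int, age_group: int, value: int) -> None:
    """Analogue of UPDATE: store the latest valid value in the array."""
    hot_deck[(sex, age_group)] = value

def allocate(sex: int, age_group: int) -> int:
    """Analogue of ALLOCATE: return the stored donor value for imputation."""
    return hot_deck[(sex, age_group)]

def edit(record: dict, low: int = 0, high: int = 9) -> dict:
    """Keep a valid 'grade' and remember it; otherwise impute from the array."""
    key = (record["sex"], record["age_group"])
    grade: Optional[int] = record.get("grade")
    if grade is not None and low <= grade <= high:
        update(*key, grade)
        return dict(record, imputed=False)
    return dict(record, grade=allocate(*key), imputed=True)
```

In this sketch a valid record refreshes the array cell for its group, and a later invalid record in the same group receives that stored value.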
DECEMBER 1978 VERSION 1: Not implemented
DECEMBER 1979 VERSION 2:
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note: division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
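Because the reports depend on records for one control area arriving together, a user might want to verify the presort before running an edit. The sketch below is a hypothetical pre-check written in Python, outside CONCOR; the record layout and the key function that extracts the area code are assumptions.

```python
# Hypothetical helper; the record layout and key function are assumptions.
from typing import Callable, Hashable, Iterable, List

def noncontiguous_areas(records: Iterable,
                        area_key: Callable[[object], Hashable]) -> List[Hashable]:
    """Return area codes that reappear after a different area has started."""
    seen = set()
    previous = object()  # sentinel that equals no real area code
    offenders = []
    for rec in records:
        area = area_key(rec)
        if area != previous:
            if area in seen and area not in offenders:
                offenders.append(area)
            seen.add(area)
            previous = area
    return offenders
```

An empty result means every area's records are contiguous; any code returned would need to be brought together by an external sort before the file is passed to the edit.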
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
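Point (13) above proposes that valid ranges declared in the dictionary could drive an automatic range check before the user's edit logic sees the data. A minimal Python sketch of that idea follows. The dictionary layout, identifier names, and ranges are hypothetical; the DEFINE-RECORD syntax itself is not reproduced here.

```python
# Hypothetical dictionary layout; CONCOR's DEFINE-RECORD syntax is not reproduced.
VALID_RANGES = {
    "SEX": (1, 2),
    "AGE": (0, 98),
    "RELATIONSHIP": (1, 9),
}

def range_failures(record: dict) -> list:
    """Return (identifier, value) pairs that are missing, non-numeric, or out of range."""
    failures = []
    for name, (low, high) in VALID_RANGES.items():
        value = record.get(name)
        if not isinstance(value, int) or not (low <= value <= high):
            failures.append((name, value))
    return failures
```

Keeping the ranges in one declared structure, rather than in comment statements, is also what would let other software such as a tabulation package read them.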
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
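Point (4) above contrasts a lookup table with the comparisons presumably generated at present for DRECODE. The generated COBOL is not shown in the report, so the following Python sketch only illustrates the general trade-off; the recode pairs and the default value are invented for the example.

```python
# Hypothetical recode pairs; the generated COBOL is not shown in the report.
RECODE_PAIRS = [(10, 1), (11, 1), (20, 2), (21, 2), (30, 3)]

def recode_linear(value, default=9):
    """Compare against each pair in turn: cost grows with the table length."""
    for old, new in RECODE_PAIRS:
        if value == old:
            return new
    return default

# Build the table once; each later recode is then a single table access.
RECODE_TABLE = dict(RECODE_PAIRS)

def recode_lookup(value, default=9):
    """Recode through the prebuilt table, falling back to the default."""
    return RECODE_TABLE.get(value, default)
```

Both functions return the same results; the table version pays a one-time construction cost, which corresponds loosely to the extra communication between generator programs that the list mentions.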
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
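The check-in procedure in point (2) of the survey list amounts to comparing a master list of selected cases with the questionnaires actually present on the input file. A minimal Python sketch is given below. The identifiers and the result layout are hypothetical, since the report does not specify file formats.

```python
# Hypothetical identifiers; file formats are not specified in the report.
def check_in(master_ids, received_ids):
    """Compare expected sample cases against the questionnaires actually received."""
    expected, received = set(master_ids), set(received_ids)
    return {
        "missing": sorted(expected - received),      # selected but not received
        "unexpected": sorted(received - expected),   # received but never selected
    }
```

Either list being non-empty would signal a follow-up action before editing begins, for example tracing a missing questionnaire.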
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Exhibit 1. CONCOR-EDITOR EXECUTION STATISTICS page from the division by zero test. The printout is largely illegible in this copy.]
Report lines recoverable from the page:
DIVISION BY ZERO LINE NUMBER 118 (message repeated several times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Program text recoverable from the page (source lines 111-125, mostly LIST and LET statements):
118 LET R003 = R002 / R001
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, i.e., ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.

[Exhibit 2. CONCOR-EDITOR EXECUTION STATISTICS page from the array coordinate test. The report lines are illegible in this copy.]
Program text recoverable from the page (source lines 173-187):
175 UNTIL R001 gt 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 [operator illegible] 10 VARY R002 FROM 1 BY 1
180 ALLOCATE [identifier illegible] = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
186 END-DO
187 END-DO
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.

[Exhibit 3. CONCOR-EDITOR EXECUTION STATISTICS page from the user specified termination test. The printout is largely illegible in this copy.]
Report lines recoverable from the page:
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Program text recoverable from the page:
214 IF IN-RECORD-COUNT gt [value illegible]
215 THEN STOP-RUN END-THEN
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
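The behavior praised in the Appendix I comments, namely reporting the source line that caused a division by zero and aborting the run after repeated errors rather than stopping silently, can be illustrated with a small Python sketch. This is a hypothetical mini-interpreter, not the EDITOR program. The statement format, the integer division, and the error limit of six are assumptions chosen only for the illustration.

```python
# Hypothetical mini-interpreter; statement format and error limit are assumptions.
MAX_ERRORS = 6  # illustrative limit, not CONCOR's documented value

def run(statements, registers, questionnaires):
    """Evaluate (line_no, target, numerator, denominator) divisions per questionnaire."""
    log = []
    errors = 0
    for _ in range(questionnaires):
        for line_no, target, num, den in statements:
            try:
                registers[target] = registers[num] // registers[den]
            except ZeroDivisionError:
                errors += 1
                log.append(f"DIVISION BY ZERO  LINE NUMBER {line_no}")
                if errors >= MAX_ERRORS:
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    log.append("NORMAL END OF JOB")
    return log
```

The design point is that each exception is trapped, tagged with the line number of the offending statement, and counted against a limit, so the user always receives a message explaining why processing stopped.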
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
APPENDIX A
BUCEN IN-HOUSE REVIEW ENHANCEMENT PROPOSAL
1 Easy to use interrecord referencing
2 Improved output file capabilities
A provide overflow protection on WRITE command
B provide OUTPUT command to create corrected questionnaire file that matches IP file data dictionary
3 Improved edit statistics reported (LISTERR)
A provide automatic (user-specified) area break
B provide options for compilation and displaying edit statistics at various levels
C provide automatic (user-specified) tolerance checking of error rates by area
D automatically capture IDs of areas failing tolerance check
4 Clean up known bugs in code
5 Comprehensive testing
6 Clean up and enhance documentation
A reference manual more examples error message guide
B installation guide
C systems manual
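Items 3C and 3D of the proposal above call for tolerance checking of error rates by area and for capturing the IDs of areas that fail. A minimal Python sketch of such a check follows. The counts, the percentage threshold, and the function name are hypothetical and are not the parameters of CONCOR's ERROR-RATE-CHECK command.

```python
# Hypothetical counts and threshold; ERROR-RATE-CHECK parameters are not reproduced.
def areas_failing_tolerance(counts, tolerance_pct):
    """counts maps area id -> (items_changed, items_examined)."""
    failing = []
    for area, (changed, examined) in counts.items():
        # An area with nothing examined is treated as having a zero error rate.
        rate = 100.0 * changed / examined if examined else 0.0
        if rate > tolerance_pct:
            failing.append(area)
    return failing
```

The returned list plays the role of the captured area IDs, which could then be routed to a reject file or reviewed by subject-matter staff.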
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
Memorandum
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
FROM DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1. Are the Installation Manual, the System Manual, and the Users Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be [word illegible] debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on the completeness, usefulness, and efficiency of the system and its ease of use by developing country programmers and other census personnel.
cc: [name partly illegible] Olds, APHA; [illegible]; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks; Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data
- POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
- Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- Submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and with the UPDATE command providesthe imputation mechanism Major difference is this statement now providesfor a receiving-identifier list and statistics are generated as coded inthe GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION
ASSERT fAST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR Command now provides for a NOERROR and MESSAGE option Additionallyby utilizinq the DUMPR DUMPO keywords provides the user with the option of using all the current value of all user-identifiers defined for the record type being processed Other changes to this command include the pass end-pass fail end-fail constructions for the P purpose of maintaining structured logic
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continue in ascending sequence Replaces t previous XRECODE command
EXIT Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented Option statements are no longer supported in the language. This command once specified what was essentially a GOTO, depending on construction.
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP- keyword STOP- keyword Unchanged.
UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
(continued)
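The looping behavior described for UNTIL DO END-DO can be illustrated with a short Python sketch. This is not CONCOR code: the function name until_vary and its parameters are hypothetical, and the sketch assumes that the terminating condition is tested before each pass, which this report does not state.

```python
def until_vary(start, step, stop_when, body):
    """Emulate a counter-controlled UNTIL loop (assumed: test before each pass)."""
    counter = start                   # initial value chosen by the user
    passes = 0
    while not stop_when(counter):     # assumption: condition checked before each pass
        body(counter)                 # stands in for the statements between DO and END-DO
        counter += step               # counter incremented after each pass
        passes += 1
    return counter, passes


visited = []
final_value, passes = until_vary(1, 1, lambda r: r > 10, visited.append)
print(visited, final_value, passes)
```

Under these assumptions the body runs once for each counter value from 1 through 10, and the counter holds 11 when the loop ends.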
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
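The presorting requirement can be illustrated with a small Python sketch. It is an analogy only, not a description of CONCOR internals: the record layout, the AREA field name, and the function control_area_breaks are hypothetical. The sketch starts a new control area whenever the area value changes between consecutive records, so an unsorted file splits one area into several separate breaks.

```python
from itertools import groupby


def area_of(record):
    """Return the (hypothetical) area control field of a record."""
    return record["AREA"]


def control_area_breaks(records):
    """Count consecutive records per area, breaking whenever the area value changes."""
    return [(area, sum(1 for _ in group)) for area, group in groupby(records, key=area_of)]


records = [{"AREA": "02"}, {"AREA": "01"}, {"AREA": "02"}]

unsorted_breaks = control_area_breaks(records)
sorted_breaks = control_area_breaks(sorted(records, key=area_of))
print(unsorted_breaks)   # area 02 appears as two separate breaks
print(sorted_breaks)     # each area appears exactly once
```

Sorting on the control field before the run is what lets each area produce a single set of statistics.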
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
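The automatic range check proposed in item (13) of the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not a design taken from CONCOR: the identifier names AGE and SEX, their ranges, and the function range_check are all hypothetical, and valid ranges are assumed to be stored alongside each identifier declaration.

```python
def range_check(record, valid_ranges):
    """Return (identifier, value) pairs whose value is missing or outside its declared range."""
    failures = []
    for name, (low, high) in valid_ranges.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append((name, value))
    return failures


# Hypothetical declarations: identifier name -> (lowest valid value, highest valid value)
VALID_RANGES = {"AGE": (0, 99), "SEX": (1, 2)}

print(range_check({"AGE": 130, "SEX": 2}, VALID_RANGES))
print(range_check({"AGE": 30, "SEX": 1}, VALID_RANGES))
```

Keeping the ranges in the dictionary rather than in comment statements is also what would make them available to tabulation software, as the item notes.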
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to use a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
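The lookup table approach suggested in item (4) of this list can be contrasted with a linear search in a short Python sketch. The recode pairs and the function names are hypothetical, and the sketch does not reproduce the COBOL that CONCOR actually generates. It relies on the property stated elsewhere in this report that DRECODE input values start at zero and ascend, so the input value can serve directly as a table index.

```python
def recode_linear(value, pairs):
    """Scan the (old, new) pairs one by one until the input value is found."""
    for old, new in pairs:
        if old == value:
            return new
    return None


def build_lookup(pairs):
    """Build a table indexed by the old value; unused slots hold None."""
    table = [None] * (max(old for old, _ in pairs) + 1)
    for old, new in pairs:
        table[old] = new
    return table


def recode_lookup(value, table):
    """Recode with a single index operation instead of a scan."""
    return table[value] if 0 <= value < len(table) else None


# Hypothetical recode: old codes 0-3 mapped to new codes
PAIRS = [(0, 9), (1, 1), (2, 1), (3, 2)]
TABLE = build_lookup(PAIRS)
print([recode_lookup(v, TABLE) for v in range(5)])
```

The table is built once, after which each recode costs one index operation regardless of how many pairs were declared. That is the source of the execution-time saving the item anticipates.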
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive; rather, only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of the questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
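The weighting scheme mentioned in item (5) above can be illustrated with a minimal Python sketch. The field names PERSONS and WEIGHT and the sample values are hypothetical; the sketch assumes each sampled record carries an expansion weight equal to the number of population units it represents, which is one common convention and not something this report specifies.

```python
def weighted_total(records, field, weight_field="WEIGHT"):
    """Estimate a population total by summing each value times its expansion weight."""
    return sum(record[field] * record[weight_field] for record in records)


# Two sampled households: the first represents 10 households, the second 25.
sample = [
    {"PERSONS": 4, "WEIGHT": 10.0},
    {"PERSONS": 2, "WEIGHT": 25.0},
]

estimated_persons = weighted_total(sample, "PERSONS")           # 4*10 + 2*25
estimated_households = sum(record["WEIGHT"] for record in sample)
print(estimated_persons, estimated_households)
```

An unweighted count of the same records would report 6 persons in 2 households, which is why summary statistics for sample surveys are not meaningful without the weights.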
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset; most are initialized at each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown owing to the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution statistics page: CONCOR-EDITOR EXECUTION STATISTICS. The run date and the summary columns (control area, sequence number, inputs read, outputs by OUTPUT command, outputs by WRITE command, questionnaires, records) are illegible in this reproduction.]
DIVISION BY ZERO LINE NUMBER 118 (message printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Superimposed program text, source lines 111-125, is largely illegible. The legible fragments show LIST statements that print registers R001 through R003, a LET statement that sets registers to zero, and the LET statement on line 118 that assigns R003 from R002 and R001.]
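The diagnostic behavior described in the comments above, reporting the source line of each division by zero and then aborting the run, can be sketched in Python. This is an illustration only: the function run_with_diagnostics, the error limit of three, and the two stand-in statements are hypothetical, and the sketch makes no claim about how the EDITOR program implements its checks or what limit it applies.

```python
def run_with_diagnostics(statements, records, max_errors=3):
    """Run (line_number, statement) pairs on each record, logging division errors by line."""
    log = []
    errors = 0
    for record in records:
        for line_number, statement in statements:
            try:
                statement(record)
            except ZeroDivisionError:
                errors += 1
                log.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
                if errors >= max_errors:      # hypothetical error limit
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    return log


def set_divisor_to_zero(record):              # stand-in for an earlier LET statement
    record["R001"] = 0


def divide_registers(record):                 # stand-in for the LET statement on line 118
    record["R003"] = record["R002"] / record["R001"]


records = [{"R001": 1, "R002": 8, "R003": 0} for _ in range(5)]
log = run_with_diagnostics([(115, set_divisor_to_zero), (118, divide_registers)], records)
print("\n".join(log))
```

Each message names the offending source line, so the user can locate the statement without a memory dump, which is the improvement the comments credit to the new version.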
[Execution statistics page: CONCOR-EDITOR EXECUTION STATISTICS. The summary columns and the repeated error messages, each of which appears to report a questionnaire, a coordinate, a value, and a line number, are illegible in this reproduction.]
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
175 UNTIL R001 gt 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 gt 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182-185 LIST statements printing N002, R003, TA1 (R001 R002) and TA2 (R001 R002); the message text is partly illegible
186 END-DO
187 END-DO
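The array coordinate error in this example can be reproduced with a short Python sketch. The helper update_checked and the wording of its message are hypothetical, since the actual CONCOR message is illegible in this reproduction. The sketch walks a 10 by 10 index space and attempts to update a 5 by 5 table, logging every reference to a nonexistent element instead of failing silently.

```python
def update_checked(table, row, col, value, line_number, log):
    """Update table[row][col] (1-based); log and skip references outside the table."""
    rows, cols = len(table), len(table[0])
    if 1 <= row <= rows and 1 <= col <= cols:
        table[row - 1][col - 1] = value
        return True
    log.append(f"SUBSCRIPT OUT OF BOUNDS ({row},{col}) LINE NUMBER {line_number}")
    return False


ta1 = [[r * 10 + c for c in range(1, 11)] for r in range(1, 11)]   # 10 x 10 source array
ta2 = [[0] * 5 for _ in range(5)]                                   # smaller 5 x 5 target
log = []

for r in range(1, 11):
    for c in range(1, 11):
        update_checked(ta2, r, c, ta1[r - 1][c - 1], 181, log)

print(len(log), log[0])
```

Of the 100 attempted updates, 25 fall inside the smaller table and 75 are reported, each with the source line of the UPDATE statement.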
[Execution statistics page: CONCOR-EDITOR EXECUTION STATISTICS. The summary columns are illegible in this reproduction.]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
214 IF IN-RECORD-COUNT gt (threshold illegible)
215 THEN STOP-RUN END-THEN
[A following IF IN-RECORD-COUNT test with a LIST statement is illegible.]
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
APPENDIX B
EVALUATIVE CRITERIA
UNITED STATES GOVERNMENT
TO DSPOPFPSD Robert H Haladay
DATE December 3 1979
DSPOPDEIO Liliane Floge
SUBJECT Criteria for Evaluation of BuCen COBOL CONCOR September 30th Version as presented at January 1980 Workshop
The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1 Are the Installation Manual, the System Manual, and the User's Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2 Is the editing system capable of implementing all the features as described in the documentation?
3 Does the system appear to be [word illegible] debugged and tested and ready for release to developing countries?
4 Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5 What size core does the system require?
6 Comment in general on completeness, usefulness, and efficiency of [partly illegible] use of the system by developing country programmers and other census personnel.
cc Su-a-- e Olds APiA SDi OrWJulio Ortuizar luotolaCelta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks
Overview of Workshop
10:00 - 10:30 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
2:00 - 2:30 User Program Organization
- Divisions
- Sections
- Routines
- Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
  -Generate-Edit-Statistics
  -Count-Imputes
  -Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
  -Prolog-Routine
  -Filter-Routine
  -Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses
- List
Thursday January 10
9:30 am - 10:30 Discussion of Problems
Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If
- Until/Exit
- Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode
- Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued)
- Output
- Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements
- Display-Control-Section
  -Display-Edit-Statistics
- Tolerance-Control-Section
  -Error-Rate-Check
  -Reject-File
  -Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem
Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR
- submission of written evaluations
Distribution of IBM installation tapes
Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1980 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEWOLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.

OLD CONCOR VERSION 1 (DECEMBER 1978)
EDIT-SPECIFICATION
  PROLOG
  FILTER TYPE ( )
  EPILOG
    ALLOCATE (ALL) ASSERT (AST)
    END ERROR
    FILTER TYPE ( ) WITH
    IF LET
    RANGE (RNG) RECODE (REC) STOP --keyword
    UPDATE (UPD) WRITE (WRT) XRECODE (XREC)

NEW CONCOR VERSION 2 (DECEMBER 1979)
EXECUTION-DIVISION
  REPORT-CONTROL-SECTION
    COUNT-IMPUTES GENERATE-EDIT-STATISTICS
  EDIT-SPECIFICATION-SECTION
    PROLOG-ROUTINE
    FILTER-ROUTINE (name-of-record-type)
    EPILOG-ROUTINE
      ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
      EXIT
      GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
      STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION

COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.

CONCOR LANGUAGE COMMAND STATEMENTS
(Version 1 = December 1978; Version 2 = December 1979)

Version 1: ALLOCATE (ALL)
Version 2: ALLOCATE (ALLOC)
Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

Version 1: ASSERT (AST)
Version 2: ASSERT (ASRT)
Comments: The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

Version 1: (see RECODE)
Version 2: DRECODE (DRCD)
Comments: Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.

Version 1: (none)
Version 2: EXIT
Comments: Provides the means to terminate execution of (escape from) command statements within the UNTIL command.

Version 1: END
Version 2: END-IF, END-THEN, END-ELSE
Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

Version 1: ERROR
Version 2: not implemented
Comments: No longer supported in the language.

Version 1: FILTER TYPE ( ) WITH
Version 2: not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.

Version 1: (see XRECODE)
Version 2: GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1: IF
Version 2: IF
Comments: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1: LET
Version 2: LET
Comments: Virtually unchanged.

Version 1: (none)
Version 2: LIST
Comments: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.

Version 1: (none)
Version 2: OUTPUT
Comments: New command which provides the means by which records will be written to the output file.

Version 1: RANGE (RNG)
Version 2: RANGE (RNG)
Comments: A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.

Version 1: RECODE (REC)
Version 2: name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1: STOP --keyword
Version 2: STOP --keyword
Comments: Unchanged.

Version 1: (none)
Version 2: UNTIL DO END-DO
Comments: DO-END-DO is one of the most significant and powerful enhancements to the CONCOR language. This command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

Version 1: UPDATE (UPD)
Version 2: UPDATE (UPD)
Comments: Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.

Version 1: WRITE (WRT)
Version 2: WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

Version 1: XRECODE (XREC)
Version 2: name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.
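
The pairing of ALLOCATE and UPDATE described in the table can be pictured with a short sketch. It is written in Python rather than CONCOR, and it assumes a hot-deck style scheme in which an array holds the most recent valid value for each combination of characteristics; the table itself says only that UPDATE maintains the arrays and ALLOCATE assigns values. All function names, field names, and the 0-98 age range are hypothetical.

```python
# Illustrative sketch only: not CONCOR syntax. Names and ranges are assumptions.

def make_deck(rows, cols, initial):
    """Build a 2-D array of starting values for the imputation array."""
    return [[initial for _ in range(cols)] for _ in range(rows)]


def update(deck, row, col, value):
    """Analogue of UPDATE: store the latest valid value for this cell."""
    deck[row][col] = value


def allocate(deck, row, col):
    """Analogue of ALLOCATE: return the stored value to use as the imputed value."""
    return deck[row][col]


def edit_ages(records, deck, imputes):
    """Range-check age; impute from the array on failure, refresh it on success."""
    for rec in records:
        row, col = rec["sex"], rec["relationship"]
        if rec["age"] is not None and 0 <= rec["age"] <= 98:
            update(deck, row, col, rec["age"])
        else:
            rec["age"] = allocate(deck, row, col)
            # Count each imputation, loosely mirroring edit statistics.
            imputes["age"] = imputes.get("age", 0) + 1
    return records
```

A record that fails the range test receives the value most recently stored for its cell, or the starting value if no valid record for that cell has been seen yet.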

OLD CONCOR VERSION 1 (DECEMBER 1978)
Not implemented

NEW CONCOR VERSION 2 (DECEMBER 1979)
REPORT-DIVISION
  DISPLAY-CONTROL-SECTION
    DISPLAY-EDIT-STATISTICS
  TOLERANCE-CONTROL-SECTION
    ERROR-RATE-CHECK REJECT-FILE
END-DIVISION

COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
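
The presorting requirement can be illustrated with a small Python sketch. It is not CONCOR code; the record layout and the field names `area` and `errors` are hypothetical. The function reports one total per run of contiguous records with the same area code, which is the behavior the comment above implies, so unsorted input produces a split report for an area.

```python
from itertools import groupby
from operator import itemgetter


def area_break_totals(records, key="area"):
    """Sum error counts for each run of contiguous records sharing an area code."""
    return [
        (area, sum(r["errors"] for r in run))
        for area, run in groupby(records, key=itemgetter(key))
    ]


# Hypothetical input in which area "02" appears in two separate runs.
records = [
    {"area": "02", "errors": 1},
    {"area": "01", "errors": 3},
    {"area": "02", "errors": 4},
]

# Without presorting, area "02" is reported twice.
unsorted_report = area_break_totals(records)
# -> [('02', 1), ('01', 3), ('02', 4)]

# Presorting on the area field yields one line per area.
sorted_report = area_break_totals(sorted(records, key=itemgetter("area")))
# -> [('01', 3), ('02', 5)]
```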
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
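
The lookup table approach suggested in item (4) can be contrasted with a sequential scan in a short Python sketch. This is not the code CONCOR generates; the function names are hypothetical, and the sketch assumes, as the DRECODE description does, that the old codes start at zero and ascend, so the old code can serve directly as an index.

```python
def recode_linear(pairs, value, default=None):
    """Scan (old, new) pairs in order until the old code matches."""
    for old, new in pairs:
        if old == value:
            return new
    return default


def build_table(pairs, default=None):
    """Build a list indexed directly by old code (codes assumed to be 0..n)."""
    table = [default] * (max(old for old, _ in pairs) + 1)
    for old, new in pairs:
        table[old] = new
    return table


def recode_lookup(table, value, default=None):
    """A single index operation replaces the scan."""
    return table[value] if 0 <= value < len(table) else default
```

The table is built once, after which each recode costs one index operation regardless of how many codes are defined; the trade-off named in the item is the extra information that must be passed between programs to build it.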
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
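
What the weighting scheme in item (5) would compute can be sketched briefly in Python. This is an illustration under stated assumptions, not a proposed CONCOR design: each record is assumed to carry an expansion weight derived from the sampling frame, and the field names `persons` and `weight` are hypothetical.

```python
def weighted_total(records, field, weight_field="weight"):
    """Estimate a population total: each sampled value times its expansion weight."""
    return sum(r[field] * r[weight_field] for r in records)


def weighted_mean(records, field, weight_field="weight"):
    """Weighted mean of a field; returns None when the weights sum to zero."""
    total_weight = sum(r[weight_field] for r in records)
    if total_weight == 0:
        return None
    return weighted_total(records, field, weight_field) / total_weight
```

Without such weights, totals accumulated during editing describe only the sample itself, not the population the sample represents.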
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES

VERSION 1 (DECEMBER 1978)
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

VERSION 2 (DECEMBER 1979)
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT

COMMENTS
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR, exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS

[Execution Statistics listing: division by zero test. Report headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND. The run date and the counts are not legible in this reproduction.]

Messages printed on the listing:
DIVISION BY ZERO LINE NUMBER 118 (repeated)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS

Superimposed program text (partially legible; punctuation and operators were lost in reproduction):
114 LIST DIVISION BY ZERO ERROR [illegible] FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS [illegible]
118 LET R003 = R002 R001
122 END TEST OF OPERANDS AND MSG

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
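
The behavior observed in this test (a message naming the offending source line for each exception, then an orderly abort) can be sketched in Python. This is a minimal illustration, not the EDITOR program's logic: the statement format, the register names as dictionary keys, and the error limit of six are assumptions made for the example.

```python
class RunAborted(Exception):
    """Raised when the assumed error limit is reached."""


def run(statements, registers, max_errors=6):
    """Execute divisions tagged with source line numbers, collecting diagnostics.

    Each statement is (line_number, target, numerator, denominator), where the
    last three are register names. The format is hypothetical.
    """
    messages = []
    errors = 0
    for line_no, target, num, den in statements:
        try:
            registers[target] = registers[num] // registers[den]
        except ZeroDivisionError:
            errors += 1
            # Report the source line instead of failing silently.
            messages.append(f"DIVISION BY ZERO LINE NUMBER {line_no}")
            if errors >= max_errors:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted("\n".join(messages))
    return messages
```

The point of the design is that every data exception is trapped and tied back to a line of user code, so a run never ends without an explanation.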

[Execution Statistics listing: array subscript test. The report body is largely illegible in this reproduction. The legible fragments appear to be repeated error lines giving a questionnaire, coordinate, value, and line number, followed by a message that the run was aborted.]

Superimposed program text (partially legible; punctuation was lost in reproduction):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY [illegible]
177 DO
178 UNTIL R002 [illegible] 10 VARY [illegible] FROM 1 BY 1
180 ALLOCATE [illegible] = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 [illegible]
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO

Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.

[Execution Statistics listing: user-specified termination test. Report headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND. The counts are not legible in this reproduction.]

Messages printed on the listing:
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Superimposed program text (partially legible):
214 IF IN-RECORD-COUNT > [value illegible]
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < [value illegible]
218 THEN LIST [message text illegible]

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE

WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator (.) in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.

ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.

ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).

ERROR (DD-004) COMMENT INDICATOR (.) IN STRING
EXPLANATION: The comment indicator character (.) was found embedded in a string.
APPENDIX B
UNITED STATES GOVERNMENT
Memorandum
TO: DSPOPFPSD, Robert H. Haladay
DATE: December 3, 1979
FROM: DSPOPDEIO, Liliane Floge
SUBJECT: Criteria for Evaluation of SuCen COBOL CONCOR, September 30th Version, as presented at January 1980 Workshop

The following points should be considered when evaluating CONCOR, the COBOL editing system, during participation at the January 1980 workshop:
1. Are the Installation Manual, the System Manual, and the User's Manual ready for use? Are these manuals adequate for their purposes? Are the manuals clear and consistent? Can they be understood by subject-matter personnel as well as programmers?
2. Is the editing system capable of implementing all the features as described in the documentation?
3. Does the system appear to be [word illegible] debugged and tested and ready for release to developing countries?
4. Does the system appear to be fast enough to edit a census in a reasonable amount of time?
5. What size core does the system require?
6. Comment in general on completeness, usefulness, and efficiency of the system, and on ease of use by developing country programmers and other census personnel.

cc: [name partly illegible] Olds, APHA; [illegible]; Julio Ortuzar, Delta Systems
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule, January 7-18, 1980
U.S. Bureau of the Census, International Statistical Programs Center
Scuderi Building, Room 209, 4235 28th Avenue, Marlow Heights, Maryland

Monday, January 7
9:30 am - 10:00    Welcoming Remarks
                   Overview of Workshop
10:00 - 10:30      Introduction to CONCOR
                   - Purpose and function
                   - History of development
                   - General computer requirements
10:30 - 10:45      Break
10:45 - 12:00      Editing Concepts
                   - Ways to interrogate data
                   - Ways to correct data
                   - Editing housing and population data
                   - POPSTAN
                   - Advantages of CONCOR
12:00 - 1:15 pm    Break
1:15 - 2:00        System Description
                   - Constraints in design of CONCOR
                   - Basic subsystems of CONCOR
                   - User interactions with system
                   - Examples of outputs produced
2:00 - 2:30        User Program Organization
                   - Divisions
                   - Sections
                   - Routines
                   - Commands
2:30 - 2:45        Break
2:45 - 3:25        Command Language Description
                   - Types of statements
                   - Format
                   - Syntax

Tuesday, January 8
9:30 am - 10:30    Dictionary Division Command Statements
                   - Punctuation
                   - Input data referencing
                   - Working data declarations
                   - Internal data representation and storage
10:30 - 10:45      Break
10:45 - 12:00      Dictionary-Attributes-Section
                   - Dictionary-Name
                   File-Section
                   - Input-File
                   - Output-File
                   - Write-File
                   - Error-File
12:00 - 1:15 pm    Break
1:15 pm - 2:15     Input-Record-Section
                   - Define-Record
                   Working-Data-Section
                   - New-Data
                   - Array-Data
2:15 - 2:30        Break
2:30 - 3:25        Dictionary Examples
                   - Minimum dictionary structure
                   - Maximum dictionary structure
                   - Hand out dictionary problem

Wednesday, January 9
9:30 am - 10:30    Discussion of Participants' Dictionary problems
                   Free work time
10:30 - 10:45      Break
10:45 - 12:00      Execution Division Command Statements
                   - Punctuation
                   - Subscripting
                   - Internal Identifiers
                   - Report-Control-Section
                     - Generate-Edit-Statistics
                     - Count-Imputes
                     - Examples
12:00 - 1:15 pm    Break
1:15 pm - 2:15     Execution Division Command Statements (continued)
                   - Routines of Edit-Specifications-Section
                     - Prolog-Routine
                     - Filter-Routine
                     - Epilog-Routine
                   - Types and functions of edit specification commands
                   - Range
                   - Assert
2:15 - 2:30        Break
2:30 - 3:25        - Pass/Fail clauses
                   - List

Thursday, January 10
9:30 am - 10:30    Discussion of Problems
                   Free work time
10:30 - 10:45      Break
10:45 - 12:00      Edit Specification Command Statements (continued)
                   - Allocate
                   - Update
                   - Let
12:00 - 1:15 pm    Break
1:15 pm - 2:15     - If
                   - Until/Exit
                   - Stop
2:15 - 2:30        Break
2:30 - 3:25        - Drecode
                   - Grecode

Friday, January 11
9:30 am - 10:30    Edit Specification Command Statements (continued)
                   - Output
                   - Write
10:30 - 10:45      Break
10:45 - 12:00      Report Division Command Statements
                   - Display-Control-Section
                     - Display-Edit-Statistics
                   - Tolerance-Control-Section
                     - Error-Rate-Check
                     - Reject-File
                   - Report Examples
12:00 - 1:15 pm    Break
1:15 pm - 3:25     Hand out Edit Specifications as problem
                   Free work time

Monday, January 14
9:30 am - 10:30    Discuss procedures for running problems on computer
10:30 - 10:45      Break
10:45 - 12:00      Component Programs of the CONCOR system
12:00 - 1:15 pm    Break

Tuesday, January 15
9:30 am - 3:25 pm  Free work time

Wednesday, January 16
9:30 am - 12:00    Free work time
12:00 - 1:15 pm    Break
1:15 pm - 2:15     How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30        Break
2:30 - 3:25        Free work time

Thursday, January 17
9:30 am - 3:25     Free work time
1:15 pm - 2:15     Future CONCOR Design Considerations
                   - for census processing
                   - for survey processing
                   - manual correction system
2:15 - 2:30        Break
2:30 - 2:45        Evaluation Guidelines
                   - Hand out evaluation forms
2:45 - 3:25        Free work time

Friday, January 18
9:30 am - 10:30    Free work time
10:30 - 10:45      Break
10:45 - 12:00      Group Discussion on CONCOR
                   - Submission of written evaluations
                   Distribution of IBM installation tapes
                   Presentation of Certificates to Participants
12:00 - 1:15 pm    Break
1:15 - 3:25        Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP, January 7-18, 1980, Marlow Heights, MD

KENYA
James Midianga, Government Computer Centre, Central Bureau of Statistics, PO Box 30266, Nairobi, Kenya

PANAMA
Omar Rivera Rios, Data Processing Specialist, Contraloria General de la Republica de Panama, Apartado Postal 5213, Panama 5, Panama

PHILIPPINES
Guida T. Capellan, OIC, Computer Programming Division, National Census and Statistics Office, Solicarel Bldg 1, Magsaysay Blvd, Sta. Mesa, Manila, Philippines

THAILAND
Angsumal Sunalai, National Statistical Office, Bangkok 1, Thailand

EGYPT
Farag Sedky Mourad Ghaleb, CAPMAS (Central Agency for Public Mobilization and Statistics), NASR City, Cairo, Egypt

SAUDI ARABIA
Shargi R. Al-Shargi, National Computer Center, PO Box 2534, Riyadh, Saudi Arabia
Ahmed S. Al-Ghamdi, Dept Director of National Computer Center, PO Box 2534, Riyadh, Saudi Arabia

OTHER
Robert W. O'Neal, USREPJECOR, APO New York, NY 09038
John N. Adams, Data Processing Advisor, DACCA, Department of State, Washington, DC 20520; or c/o UNDP, PO Box 224, Dacca, Bangladesh
Joe Quasney, US Census Bureau, Washington, DC 20233
Howard Brunsman, 5715 N. Ninth St., Arlington, VA 22205
Larry Shiller and Julio Ortuzar, Delta Systems Consultants Inc., 264 Alhambra Circle, Coral Gables, FL 33134
Nicholas Ourusoff, Burpee Hill Road, New London, NH 03257 (United Nations)
Frederic J. Grant IV, Systems Development, Georgia World Congress Institute, 580 North Omni International, Atlanta, Georgia 30303
Rafael Samper, Mary Jo Keenan, Catherine B. Gleason, LACDR, Room 2246 NS, Washington, DC 20523
John Marshall, AIDSERDMPSE, Rm 706C, SA-12, Washington, DC 20523
Susanne Bacon and Mary K. Friday, ISPC/Data Processing, US Census Bureau, Washington, DC 20233
Leo Dougherty, ISPC/World Census Staff, US Bureau of the Census, Washington, DC 20233

CONCOR WORKSHOP
January 7-18, 1980, Washington, DC

Staff
Robert R. Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP, January 16, 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer andor in training userR on the packages applicability to their work
37
7 In what way(s) could improvements or enhancements be made to this version
of the CONCOR software andor its documentation
8 Other comments
1shy
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-IDand record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1, all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is only a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 VERSION 1, DECEMBER 1979 VERSION 2, COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions, for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous RECODE command.
(none) EXIT Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in the language.
FILTER TYPE ( ) WITH not implemented Option statements are no longer supported in the language. This command once specified what was essentially a GOTO, depending on construction.
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values.
IF IF The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET LET Virtually unchanged.
(none) LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
(none) OUTPUT New command which provides the means by which records will be written to the output-file.
RANGE (RNG) RANGE (RNG) As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP-keyword STOP-keyword Unchanged.
(none) UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) WRITE (WRT) The WRITE language statement, while once used as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version.
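The looping behaviour described for the UNTIL DO END-DO command in the table above can be pictured with a short Python sketch. It is an illustration only, not CONCOR code: the function name until_vary is hypothetical, and it assumes that the UNTIL condition is tested before each pass, which the report does not state.

```python
def until_vary(start, step, stop_condition, body):
    """Run body(counter) repeatedly until stop_condition(counter) is true.

    Assumption: the condition is tested before each pass, so the body
    may run zero times. Returns the counter value that ended the loop.
    """
    counter = start  # value initially assigned to the user identifier
    while not stop_condition(counter):
        body(counter)
        counter += step  # incremented after each execution of the body
    return counter


# Loosely modelled on "UNTIL R001 > 10 VARY R001 FROM 1 BY 1".
visited = []
final_value = until_vary(1, 1, lambda r001: r001 > 10, visited.append)
print(visited, final_value)
```

Under these assumptions the body runs for counter values 1 through 10, and the counter holds 11 when the loop ends.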
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
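The presorting requirement noted in the comments above can be illustrated with a small Python sketch. It is not CONCOR code; the record layout and the area field name are hypothetical. It shows how a report that breaks on consecutive area values treats records of the same area as separate groups unless the file has been sorted first.

```python
from itertools import groupby


def area_breaks(records):
    """Return (area, record_count) for each run of consecutive equal areas."""
    return [(area, len(list(group)))
            for area, group in groupby(records, key=lambda r: r["area"])]


# Hypothetical input: area "01" appears in two separate places in the file.
unsorted_file = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
presorted_file = sorted(unsorted_file, key=lambda r: r["area"])

print(area_breaks(unsorted_file))
print(area_breaks(presorted_file))
```

In the unsorted case area "01" is reported twice, once for each run; after sorting, each area produces a single break.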
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma characters as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
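The signed numeric format proposed in item (11) above, with leading blanks and negative values as FORTRAN programs typically write them, might be read along the following lines. This Python sketch is an assumption about what such a conversion could look like; the function name and the choice to return None for an all-blank field are hypothetical and are not taken from CONCOR.

```python
def parse_signed_field(field):
    """Convert a fixed-width text field such as '   -42' to an int.

    Returns None for an all-blank field; raises ValueError for anything
    other than leading blanks, an optional sign, then digits.
    """
    text = field.lstrip(" ")  # drop leading blanks only
    if text == "":
        return None
    sign = 1
    if text[0] in "+-":
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a signed number: {field!r}")
    return sign * int(text)


print(parse_signed_field("   -42"), parse_signed_field("  007"))
```

A field with embedded blanks, such as '4 2', is rejected rather than silently repaired, which leaves the correction decision to the edit specifications.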
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
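The lookup table approach suggested for DRECODE in item (4) above can be contrasted with a linear search in a brief Python sketch. The range values and function names below are hypothetical and say nothing about how the generated EDITOR code is actually organized; the sketch only shows why a precomputed table removes the per-value search.

```python
# Hypothetical recode specification: (low, high, recoded value).
RANGES = [(0, 4, 0), (5, 9, 1), (10, 14, 2)]


def recode_linear(value):
    """Scan the ranges one by one until a match is found."""
    for low, high, code in RANGES:
        if low <= value <= high:
            return code
    return None


def build_table(ranges):
    """Precompute one table entry per possible input value."""
    table = [None] * (max(high for _, high, _ in ranges) + 1)
    for low, high, code in ranges:
        for value in range(low, high + 1):
            table[value] = code
    return table


TABLE = build_table(RANGES)


def recode_lookup(value):
    """Recode with a single index operation instead of a search."""
    return TABLE[value] if 0 <= value < len(TABLE) else None


print([recode_lookup(v) for v in (3, 7, 12, 20)])
```

The trade-off is storage: the table needs one entry per possible input value, which is why this approach suits small discrete code ranges.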
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
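As a rough illustration of the weighting scheme mentioned in item (5) above, the Python sketch below multiplies each sampled record by a weight before totalling. The field names and weights are invented for the example, and the report does not say how weights would be supplied to CONCOR.

```python
def weighted_total(records, field):
    """Sum a field over sample records, scaling each by its sampling weight."""
    return sum(r["weight"] * r[field] for r in records)


# Hypothetical sample: each record stands for `weight` units in the frame.
sample = [
    {"persons": 4, "weight": 10},
    {"persons": 2, "weight": 10},
    {"persons": 6, "weight": 25},
]

print(weighted_total(sample, "persons"))
```

Without the weights the total would simply be the 12 persons observed; with them it becomes an estimate for the sampling frame.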
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though these were implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[CONCOR-EDITOR execution statistics page, division by zero test. Legible report lines: DIVISION BY ZERO LINE NUMBER 118, repeated several times, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The remainder of the printout is illegible in this reproduction.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
Legible lines of the superimposed program text (lines 111 to 125) include line 115, a LET statement that sets R001 and R003 to zero, and line 118, a LET statement that computes R003 from R002 and R001 and is the line cited in each diagnostic.
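The diagnostic behaviour described above, where each division by zero is reported with its source line number and the run is aborted after repeated errors, can be mimicked in a short Python sketch. This is not the EDITOR program's logic: the function names, the register values and the limit of six errors are assumptions made for illustration.

```python
def run_statements(statements, registers, passes, max_errors=6):
    """Execute (line_number, action) pairs repeatedly, logging zero divides.

    max_errors is an assumed limit, not a documented CONCOR value.
    """
    messages = []
    for _ in range(passes):
        for line_number, action in statements:
            try:
                action(registers)
            except ZeroDivisionError:
                messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
                if len(messages) >= max_errors:
                    messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return messages
    return messages


def clear_registers(reg):
    reg["R001"] = 0  # stands in for the LET at line 115
    reg["R003"] = 0


def divide(reg):
    reg["R003"] = reg["R002"] / reg["R001"]  # stands in for line 118


program = [(115, clear_registers), (118, divide)]
report = run_statements(program, {"R001": 1, "R002": 8, "R003": 0}, passes=10)
print("\n".join(report))
```

The point of the design is that the failing statement is identified by line number and processing continues until the error limit is reached, rather than the run ending without any message.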
[CONCOR-EDITOR execution statistics page, array subscript test. The report lines are illegible in this reproduction.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
The legible program text (lines 173 to 187) shows two nested UNTIL ... VARY ... DO ... END-DO loops that vary R001 and R002 from 1 in steps of 1, an ALLOCATE from TA1 (R001 R002) at line 180, an UPDATE of TA2 (R001 R002) at line 181, and several LIST statements.
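The array coordinate error described above can be reproduced in miniature with a Python sketch: a pair of loops running from 1 to 10 tries to update a 5 x 5 table, and every reference outside the table is recorded instead of being applied. The names ta1 and ta2 follow the program text only loosely, and the bounds-checking function is a hypothetical stand-in for the protection CONCOR provides.

```python
def update_checked(table, row, col, value, errors):
    """Store value at 1-based (row, col) if it exists; otherwise log the pair."""
    if 1 <= row <= len(table) and 1 <= col <= len(table[0]):
        table[row - 1][col - 1] = value
        return True
    errors.append((row, col))
    return False


ta1 = [[10 * r + c for c in range(10)] for r in range(10)]  # 10 x 10 source
ta2 = [[0] * 5 for _ in range(5)]                           # 5 x 5 target
errors = []

for r001 in range(1, 11):
    for r002 in range(1, 11):
        update_checked(ta2, r001, r002, ta1[r001 - 1][r002 - 1], errors)

print(len(errors), errors[0])
```

Of the 100 attempted updates, only the 25 that fall inside the smaller table succeed; the rest are the nonexistent element references the comments refer to.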
[CONCOR-EDITOR execution statistics page, user-requested stop test. Legible report lines: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
Lines 214 and 215 of the superimposed program text test IN-RECORD-COUNT in an IF statement and issue STOP-RUN within a THEN END-THEN construction. A second IF on IN-RECORD-COUNT, with a LIST message, follows at lines 217 and 218.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
APPENDIX C
WORKSHOP ITINERARY
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
9:30 am - 10:00 Welcoming Remarks; Overview of Workshop
10:00 - 10:30 Introduction to CONCOR - Purpose and function - History of development - General computer requirements
10:30 - 10:45 Break
10:45 - 12:00 Editing Concepts - Ways to interrogate data - Ways to correct data - Editing housing and population data - POPSTAN - Advantages of CONCOR
12:00 - 1:15 pm Break
1:15 - 2:00 System Description - Constraints in design of CONCOR - Basic subsystems of CONCOR - User interactions with system - Examples of outputs produced
2:00 - 2:30 User Program Organization - Divisions - Sections - Routines - Commands
2:30 - 2:45 Break
2:45 - 3:25 Command Language Description - Types of statements - Format - Syntax
Tuesday January 8
9:30 am - 10:30 Dictionary Division Command Statements - Punctuation - Input data referencing - Working data declarations - Internal data representation and storage
10:30 - 10:45 Break
10:45 - 12:00 Dictionary-Attributes-Section - Dictionary-Name; File-Section - Input-File - Output-File - Write-File - Error-File
12:00 - 1:15 pm Break
1:15 pm - 2:15 Input-Record-Section - Define-Record; Working-Data-Section - New-Data - Array-Data
2:15 - 2:30 Break
2:30 - 3:25 Dictionary Examples - Minimum dictionary structure - Maximum dictionary structure - Hand out dictionary problem
Wednesday January 9
9:30 am - 10:30 Discussion of Participants' Dictionary problems; Free work time
10:30 - 10:45 Break
10:45 - 12:00 Execution Division Command Statements - Punctuation - Subscripting - Internal Identifiers - Report-Control-Section - Generate-Edit-Statistics - Count-Imputes - Examples
12:00 - 1:15 pm Break
1:15 pm - 2:15 Execution Division Command Statements (continued) - Routines of Edit-Specifications-Section - Prolog-Routine - Filter-Routine - Epilog-Routine - Types and functions of edit specification commands - Range - Assert
2:15 - 2:30 Break
2:30 - 3:25 - Pass/Fail clauses - List
Thursday January 10
9:30 am - 10:30 Discussion of Problems; Free work time
10:30 - 10:45 Break
10:45 - 12:00 Edit Specification Command Statements (continued) - Allocate - Update - Let
12:00 - 1:15 pm Break
1:15 pm - 2:15 - If - Until/Exit - Stop
2:15 - 2:30 Break
2:30 - 3:25 - Drecode - Grecode
Friday January 11
9:30 am - 10:30 Edit Specification Command Statements (continued) - Output - Write
10:30 - 10:45 Break
10:45 - 12:00 Report Division Command Statements - Display-Control-Section - Display-Edit-Statistics - Tolerance-Control-Section - Error-Rate-Check - Reject-File - Report Examples
12:00 - 1:15 pm Break
1:15 pm - 3:25 Hand out Edit Specifications as problem; Free work time
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations - for census processing - for survey processing - manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines - Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR - submission of written evaluations; Distribution of IBM installation tapes; Presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520 or c/o UNDP PO Box 224 Dacca Bangladesh
APPENDIX E
CONCOR EVALUATION FORM
0
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
2D
35
2
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7 In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8 Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is only a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
ALLOCATE (ALL) | ALLOCATE (ALLOC) | Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) | ASSERT (ASRT) | The ASSERT command is the basic consistency editing command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) | DRECODE (DRCD) | Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) | EXIT | Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END | END-IF END-THEN END-ELSE | A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR | not implemented | No longer supported in the language.
FILTER TYPE ( ) WITH | not implemented | Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE) | GRECODE (GRCD) | This new command provides a method for the group recoding of input values.
IF | IF | The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET | LET | Virtually unchanged.
(none) | LIST | New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
(none) | OUTPUT | New command which provides the means by which records will be written to the output file.
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword | STOP keyword | Unchanged.
(none) | UNTIL DO END-DO | One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
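The name changes in the table above can be summarized as a small lookup. The following Python sketch is only an illustration of the correspondence described in this appendix; the function name translate_command and the use of None for dropped commands are assumptions made for the example and are not part of CONCOR.

```python
# Version 1 -> Version 2 command names, transcribed from the comparison table.
# None marks a Version 1 command with no single Version 2 counterpart.
V1_TO_V2 = {
    "ALLOCATE": "ALLOCATE",
    "ASSERT": "ASSERT",
    "END": None,        # a solitary END is no longer permitted
    "ERROR": None,      # not implemented in Version 2
    "IF": "IF",
    "LET": "LET",
    "RANGE": "RANGE",
    "RECODE": "DRECODE",
    "STOP": "STOP",
    "UPDATE": "UPDATE",
    "WRITE": "WRITE",
    "XRECODE": "GRECODE",
}

# Commands listed in the table as new in Version 2.
NEW_IN_V2 = ("EXIT", "LIST", "OUTPUT", "UNTIL")


def translate_command(name):
    """Return the Version 2 name for a Version 1 command keyword."""
    key = name.upper()
    if key not in V1_TO_V2:
        raise KeyError(f"{name} is not a Version 1 command in the table")
    new_name = V1_TO_V2[key]
    if new_name is None:
        raise ValueError(f"{key} has no direct Version 2 equivalent")
    return new_name


print(translate_command("RECODE"))   # DRECODE
print(translate_command("XRECODE"))  # GRECODE
```

A lookup of this kind covers only the keyword names. As the comments column notes, keyword options and abbreviations also changed, so a Version 1 program still needs to be reviewed statement by statement.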
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
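The presorting requirement can be illustrated with a short Python sketch. It is not CONCOR code: it only assumes that a report is produced at each change of the area control value in the input stream, which is the behavior the comment above implies. The record layout (area code first) and the function name area_breaks are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def area_breaks(records, area_key=itemgetter(0)):
    """Yield (area, record_count) for each contiguous run of one area code."""
    for area, run in groupby(records, key=area_key):
        yield area, sum(1 for _ in run)


# Hypothetical records: (area code, payload). Area "01" is not contiguous.
unsorted_records = [("01", "a"), ("02", "b"), ("01", "c")]

# Without presorting, area 01 is reported twice, once per contiguous run.
print(list(area_breaks(unsorted_records)))
# [('01', 1), ('02', 1), ('01', 1)]

# Presorting on the area field gives one report per area.
presorted = sorted(unsorted_records, key=itemgetter(0))
print(list(area_breaks(presorted)))
# [('01', 2), ('02', 1)]
```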
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of different data item types in the CONTROL
commands This would allow the greatest flexibility in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value is to be moved to the output
record when the record is to be written out using the OUTPUT
command
(9) Implement a COMMON-DATA concept to handle items common to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This would require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IMPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential when dealing with data files generated by
FORTRAN programs
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
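The signed numeric format proposed in item (11) can be sketched in Python as follows. This is an illustration under stated assumptions, not the CONCOR implementation: the existing external numeric type N is assumed here to accept digits only, and both function names are hypothetical.

```python
DIGITS = "0123456789"


def parse_unsigned(field):
    """Digits-only field (an assumed reading of external numeric type N)."""
    if not field or any(ch not in DIGITS for ch in field):
        raise ValueError(f"not an unsigned numeric field: {field!r}")
    return int(field)


def parse_signed(field):
    """Field that may carry leading blanks and a minus sign before the digits."""
    text = field.lstrip(" ")
    sign = 1
    if text.startswith("-"):
        sign = -1
        text = text[1:]
    if not text or any(ch not in DIGITS for ch in text):
        raise ValueError(f"not a signed numeric field: {field!r}")
    return sign * int(text)


# A right-justified, blank-padded field of the kind a FORTRAN program writes.
print(parse_signed("  -42"))   # -42
print(parse_signed("    7"))   # 7
print(parse_unsigned("00042")) # 42
```

The digits-only parser rejects the blank-padded, signed fields shown above, which is why item (11) calls the signed format essential for FORTRAN-generated files.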
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command that would change
CONCOR's default occurrence pointer for the duration of the SET
command This command would be very helpful as the validation
of the existence of a record would only have to be done once
when the SET command was encountered The major drawback to
this command is in the user's ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed Another program would then be
required to format the information gathered into a readable
form
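The lookup table approach proposed for DRECODE in item (4) can be sketched as follows. The sketch is in Python rather than generated COBOL, and it assumes that the present code tests the input value against each possible value in turn; the source only states that a lookup table would be faster, so that reading is an assumption. Both function names are hypothetical.

```python
def recode_by_chain(value, new_codes):
    """Compare the value with each old code in turn (0, 1, 2, ...)."""
    for old_code, new_code in enumerate(new_codes):
        if value == old_code:
            return new_code
    raise ValueError(f"value {value} has no recode entry")


def recode_by_table(value, new_codes):
    """Use the value itself as a subscript into the table of new codes."""
    if not 0 <= value < len(new_codes):
        raise ValueError(f"value {value} has no recode entry")
    return new_codes[value]


# Old codes 0..4 map to the new codes below; the values are illustrative only.
new_codes = [9, 1, 1, 2, 3]
print(recode_by_chain(3, new_codes))  # 2
print(recode_by_table(3, new_codes))  # 2
```

Because DRECODE input values start at zero and ascend, the value can serve directly as the table subscript, so the table version does one bounds test and one fetch regardless of how many codes are defined.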
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a means by which the statistics gathered may be
presented in aggregations other than the area-break and
total levels now provided This would require another
procedure to aggregate the statistics and pass them on to the
Report Division before the printing of the reports began This
extra procedure is required because of program size
considerations
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected This new input file would contain the master list of those observations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
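The check-in procedure suggested in item (2) amounts to comparing the questionnaire identifiers found on the data file with a master list of the cases selected for the sample. A minimal Python sketch follows; the identifiers and the function name check_in are hypothetical, and the three result categories are one reasonable reading of what a check-in report would contain.

```python
def check_in(expected_ids, observed_ids):
    """Compare the master list of sample cases with the cases on the data file."""
    expected = set(expected_ids)
    observed = set(observed_ids)
    return {
        "received": sorted(expected & observed),    # selected and present
        "missing": sorted(expected - observed),     # selected but not on the file
        "unexpected": sorted(observed - expected),  # on the file but never selected
    }


master_list = ["0001", "0002", "0003", "0004"]
data_file = ["0002", "0004", "0009"]
report = check_in(master_list, data_file)
print(report["missing"])     # ['0001', '0003']
print(report["unexpected"])  # ['0009']
```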
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned; possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables
are reset; most are initialized with each control area break
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR
These apparently new counters permit access to valuable totals within control areas
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-)
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout, first example. Column headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND. The run date and the counts are illegible in this copy.]
Messages printed: DIVISION BY ZERO LINE NUMBER 118 (six times), followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, i.e. ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
Superimposed program text (lines 111-125, only partly legible):
115 LET R001 R003 = 0
118 LET R003 = R002 / R001
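The diagnostic behavior described in the comments (trap the exception, report the source line number, and abort after repeated failures) can be sketched in Python. This is not how the EDITOR program is written. The error limit of six is simply read off the printout and is exposed as a parameter, the register names follow the superimposed program text, and integer division is used because the report notes elsewhere that CONCOR does not support floating point calculations.

```python
def run_edit(registers_per_record, statements, error_limit=6):
    """Run each (line_number, action) statement against each record's registers.

    Division by zero is trapped and reported with its line number. Once
    error_limit such errors have occurred, the run is aborted with a message.
    """
    messages = []
    errors = 0
    for registers in registers_per_record:
        for line_number, action in statements:
            try:
                action(registers)
            except ZeroDivisionError:
                errors += 1
                messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
                if errors >= error_limit:
                    messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return messages
    return messages


def line_115(reg):
    # LET R001 R003 = 0
    reg["R001"] = reg["R003"] = 0


def line_118(reg):
    # LET R003 = R002 / R001 (integer arithmetic)
    reg["R003"] = reg["R002"] // reg["R001"]


program = [(115, line_115), (118, line_118)]
records = [{"R001": 1, "R002": 8, "R003": 0} for _ in range(10)]
log = run_edit(records, program)
print(log[0])   # DIVISION BY ZERO LINE NUMBER 118
print(log[-1])  # RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
```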
[Execution Statistics printout, second example. The column headings are as in the first example. The error messages report nonexistent array occurrences, each with a questionnaire, coordinate, value and line number; the individual values are illegible in this copy.]
Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
Superimposed program text (lines 173-187, only partly legible):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO
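The array coordinate check exercised by this example can be sketched in Python. The sketch assumes that UNTIL ... VARY behaves as a counted loop from 1 to 10 and that subscripts are 1-based. The class name CheckedArray and the message wording are hypothetical, and the 10 x 10 source array is called TA1 after the ALLOCATE statement in the program text.

```python
class CheckedArray:
    """A 1-based two-dimensional array that rejects nonexistent elements."""

    def __init__(self, name, rows, cols):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[0] * cols for _ in range(rows)]

    def _locate(self, row, col, line_number):
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise IndexError(
                f"{self.name}({row},{col}) DOES NOT EXIST LINE NUMBER {line_number}"
            )
        return row - 1, col - 1

    def get(self, row, col, line_number):
        r, c = self._locate(row, col, line_number)
        return self.cells[r][c]

    def put(self, row, col, value, line_number):
        r, c = self._locate(row, col, line_number)
        self.cells[r][c] = value


ta1 = CheckedArray("TA1", 10, 10)
ta2 = CheckedArray("TA2", 5, 5)
errors = []
for r001 in range(1, 11):          # UNTIL R001 > 10 VARY R001 FROM 1 BY 1
    for r002 in range(1, 11):      # UNTIL R002 > 10 VARY R002 FROM 1 BY 1
        value = ta1.get(r001, r002, line_number=180)
        try:
            ta2.put(r001, r002, value, line_number=181)
        except IndexError as exc:
            errors.append(str(exc))

print(len(errors))  # 75
print(errors[0])    # TA2(1,6) DOES NOT EXIST LINE NUMBER 181
```

Of the 100 coordinate pairs visited, only the 25 with both subscripts at or below 5 exist in TA2, so the remaining 75 references are reported, each with the line number of the offending UPDATE.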
[Execution Statistics printout, third example. The column headings are as in the first example; the counts are illegible in this copy.]
Messages printed: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB.
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
Superimposed program text (lines 214-218, only partly legible):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
APPENDIX C
CONCOR Workshop Schedule January 7-18 1980
U S Bureau of the Census International Statistical Programs Center
Scuderi Building Room 209 4235 28th Avenue Marlow Heights Maryland
Monday January 7
930 am - 1000 Welcoming Remarks
Overview of Workshop
1000 - 1030 Introduction to CONCOR
- Purpose and function
- History of development
- General computer requirements
1030 - 1045 Break
1045 - 1200 Editing Concepts
- Ways to interrogate data
- Ways to correct data
- Editing housing and population data - POPSTAN
- Advantages of CONCOR
1200 - 115 pm Break
115 - 200 System Description
- Constraints in design of CONCOR
- Basic subsystems of CONCOR
- User interactions with system
- Examples of outputs produced
200 - 230 User Program Organization
- Divisions
- Sections
- Routines
- Commands
230 - 245 Break
245 - 325 Command Language Description
- Types of statements
- Format
- Syntax
Tuesday January 8
930 am - 1030 Dictionary Division Command Statements
- Punctuation
- Input data referencing
- Working data declarations
- Internal data representation and storage
1030 - 1045 Break
1045 - 1200 Dictionary-Attributes-Section
- Dictionary-Name
File-Section
- Input-File
- Output-File
- Write-File
- Error-File
1200 - 115 pm Break
115 pm - 215 Input-Record-Section
- Define-Record
Working-Data-Section
- New-Data
- Array-Data
215 - 230 Break
230 - 325 Dictionary Examples
- Minimum dictionary structure
- Maximum dictionary structure
- Hand out dictionary problem
Wednesday January 9
930 am - 1030 Discussion of Participants Dictionary problems
Free work time
1030 - 1045 Break
1045 - 1200 Execution Division Command Statements
- Punctuation
- Subscripting
- Internal Identifiers
- Report-Control-Section
-Generate-Edit-Statistics
-Count-Imputes
-Examples
1200 - 115 pm Break
115 pm - 215 Execution Division Command Statements (continued)
- Routines of Edit-Specifications-Section
-Prolog-Routine
-Filter-Routine
-Epilog-Routine
- Types and functions of edit specification commands
- Range
- Assert
215 - 230 Break
230 - 325 - Pass/Fail clauses
- List
Thursday January 10
930 am - 1030 Discussion of Problems
Free work time
1030 - 1045 Break
1045 - 1200 Edit Specification Command Statements (continued)
- Allocate
- Update
- Let
1200 - 115 pm Break
115 pm - 215 - If
- Until/Exit
- Stop
215 - 230 Break
230 - 325 - Drecode
- Grecode
Friday January 11
930 am - 1030 Edit Specification Command Statements (continued)
- Output
- Write
1030 - 1045 Break
1045 - 1200 Report Division Command Statements
- Display-Control-Section
-Display-Edit-Statistics
- Tolerance-Control-Section
-Error-Rate-Check
-Reject-File
- Report Examples
1200 - 115 pm Break
115 pm - 325 Hand out Edit Specifications as problem
Free work time
Monday January 14
930 am-1030 Discuss procedures for running problems on computer
1030-1045 Break
1045-1200 Component Programs of the CONCOR system
1200- 115 pm Break
Tuesday January 15
930 am - 325 pm Free work time
Wednesday January 16
930 am 1200 Free work time
1200- 115 pm Break
115 pm-215 How to Install CONCOR on IBM 360/370 OS
215- 230 Break
230-325 Free work time
Thursday January 17
930 am-325 Free work time
115 pm-215 Future CONCOR Design Considerations - for census processing - for survey processing
- manual correction system
215- 230 Break
230 - 245 Evaluation Guidelines
- Hand out evaluation forms
245 - 325 Free work time
Friday January 18
930 am-1030 Free work time
1030 - 1045 Break
1045 - 1200 Group Discussion on CONCOR - submission of written evaluations Distribution of IBM installation tapes Presentation of Certificates to Participants
1200-115 pm Break
115-325 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Centeral Agency for Public
Mobilization and Statistics)NASR City Cairo Egypt
L
2 31
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Cables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development GeorgiaWorld Congress Institute 580 North Omni International Atlanta Georgia 30303
3 32
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
33
CON C OR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP, January 16, 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS

OLD CONCOR VERSION 1 (DECEMBER 1978)
DATA-DICTIONARY
  DEFINE INPUT-FILE
  DEFINE ERROR-FILE
  DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
  DEFINE RECORD-TYPE=
  DEFINE NEW-DATA
  DEFINE ARRAY-DATA

NEW CONCOR VERSION 2 (DECEMBER 1979)
DICTIONARY-DIVISION
  DICTIONARY-ATTRIBUTES-SECTION
    DICTIONARY-NAME
  FILE-SECTION
    INPUT-FILE, OUTPUT-FILE, WRITE-FILE, ERROR-FILE
  IDENTIFICATION-CONTROL-SECTION
    AREA-CONTROL
    QUESTIONNAIRE-CONTROL
    RECORD-CONTROL
  INPUT-RECORD-SECTION
    DEFINE-RECORD
  WORKING-DATA-SECTION
    NEW-DATA, ARRAY-DATA
END-DIVISION

COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is only a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.

OLD CONCOR VERSION 1 (DECEMBER 1978)
EDIT-SPECIFICATION
  PROLOG
  FILTER TYPE ( )
  EPILOG
  Commands: ALLOCATE (ALL), ASSERT (AST), END, ERROR, FILTER TYPE ( ) WITH, IF, LET, RANGE (RNG), RECODE (REC), STOP-keyword, UPDATE (UPD), WRITE (WRT), XRECODE (XREC)

NEW CONCOR VERSION 2 (DECEMBER 1979)
EXECUTION-DIVISION
  REPORT-CONTROL-SECTION
    COUNT-IMPUTES, GENERATE-EDIT-STATISTICS
  EDIT-SPECIFICATION-SECTION
    PROLOG-ROUTINE
    FILTER-ROUTINE (name-of-record-type)
    EPILOG-ROUTINE
  Commands: ALLOCATE (ALLOC), ASSERT (ASRT), DRECODE (DRCD), END-DIVISION, EXIT, GRECODE (GRCD), IF, LET, LIST, OUTPUT, RANGE (RNG), STOP, UNTIL, UPDATE (UPD), WRITE (WRT)
END-DIVISION

COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.

CONCOR LANGUAGE COMMAND STATEMENTS
(Version 1 = December 1978; Version 2 = December 1979)

ALLOCATE
  Version 1: ALLOCATE (ALL)
  Version 2: ALLOCATE (ALLOC)
  Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

ASSERT
  Version 1: ASSERT (AST)
  Version 2: ASSERT (ASRT)
  Comments: The ASSERT command is the basic consistency editing command of CONCOR. The command now provides NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS ... END-PASS and FAIL ... END-FAIL constructions, for the purpose of maintaining structured logic.

DRECODE
  Version 1: (see RECODE)
  Version 2: DRECODE (DRCD)
  Comments: Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.

EXIT
  Version 1: (none)
  Version 2: EXIT
  Comments: Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.

END
  Version 1: END
  Version 2: END-IF, END-THEN, END-ELSE
  Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

ERROR
  Version 1: ERROR
  Version 2: not implemented
  Comments: No longer supported in the language.

FILTER TYPE ( ) WITH
  Version 1: FILTER TYPE ( ) WITH
  Version 2: not implemented
  Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO depending on construction.

GRECODE
  Version 1: (see XRECODE)
  Version 2: GRECODE (GRCD)
  Comments: This new command provides a method for the group recoding of input values.

IF
  Version 1: IF
  Version 2: IF
  Comments: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF ... THEN ... END-THEN ... ELSE ... END-ELSE construction in place of the IF ... THEN ... END ... ELSE ... END statements acceptable to the 1978 version.

LET
  Version 1: LET
  Version 2: LET
  Comments: Virtually unchanged.

LIST
  Version 1: (none)
  Version 2: LIST
  Comments: New command which allows specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.

OUTPUT
  Version 1: (none)
  Version 2: OUTPUT
  Comments: New command which provides the means by which records are written to the output file.

RANGE
  Version 1: RANGE (RNG)
  Version 2: RANGE (RNG)
  Comments: A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option is a PASS ... END-PASS, FAIL ... END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

RECODE
  Version 1: RECODE (REC)
  Version 2: name changed
  Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

STOP
  Version 1: STOP-keyword
  Version 2: STOP-keyword
  Comments: Unchanged.

UNTIL
  Version 1: (none)
  Version 2: UNTIL ... DO ... END-DO
  Comments: One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

UPDATE
  Version 1: UPDATE (UPD)
  Version 2: UPDATE (UPD)
  Comments: Virtually unchanged; it is the basic mechanism for maintaining the arrays involved in imputation processes.

WRITE
  Version 1: WRITE (WRT)
  Version 2: WRITE (WRT)
  Comments: The WRITE language statement, once the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated by specifying the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)

XRECODE
  Version 1: XRECODE (XREC)
  Version 2: name changed
  Comments: Old command which has become the GRECODE command in the December 1979 version.
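
The pairing of ALLOCATE and UPDATE described above resembles what is commonly called hot-deck imputation: valid values seen earlier in the file are stored in an array and later handed to records whose value fails an edit. The sketch below is in Python rather than CONCOR and illustrates only that general technique; it is not a description of CONCOR's internals. The class name HotDeck, the functions, and the (sex, age group) array layout are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # hypothetical array coordinates: (sex, age_group)


class HotDeck:
    """Imputation array; layout and names are illustrative assumptions."""

    def __init__(self, initial: Dict[Cell, int]) -> None:
        # Starting values, used until real records have been seen.
        self.cells = dict(initial)

    def update(self, key: Cell, value: int) -> None:
        # Analogue of UPDATE: remember the latest valid value for this cell.
        if key not in self.cells:
            raise KeyError(f"no such array cell: {key}")
        self.cells[key] = value

    def allocate(self, key: Cell) -> int:
        # Analogue of ALLOCATE: hand back the stored value as the imputed one.
        return self.cells[key]


def edit_marital_status(record: dict, deck: HotDeck, imputes: List[Cell]) -> dict:
    """Keep a valid marital-status code (1-5), otherwise impute from the deck."""
    key = (record["sex"], record["age_group"])
    status: Optional[int] = record.get("marital_status")
    if status is not None and 1 <= status <= 5:
        deck.update(key, status)
        return record
    imputes.append(key)  # crude stand-in for imputation statistics
    return {**record, "marital_status": deck.allocate(key)}
```

A valid record refreshes its cell; an invalid one receives whatever value that cell currently holds, so imputed values follow the distribution of nearby valid records.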

REPORT-DIVISION

DECEMBER 1978 VERSION 1
Not implemented

DECEMBER 1979 VERSION 2
REPORT-DIVISION
  DISPLAY-CONTROL-SECTION
    DISPLAY-EDIT-STATISTICS
  TOLERANCE-CONTROL-SECTION
    ERROR-RATE-CHECK, REJECT-FILE
END-DIVISION

COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
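
Because the reports depend on the input file being grouped by the AREA-CONTROL field, a pre-check outside CONCOR can catch an unsorted file before a run. The following Python sketch shows one way to detect noncontiguous area records; the function name and the idea of passing in a sequence of already-extracted area keys are assumptions made for illustration, not part of CONCOR.

```python
from typing import Hashable, Iterable, Optional, Tuple


def first_noncontiguous_area(area_keys: Iterable[Hashable]) -> Optional[Tuple[int, Hashable]]:
    """Return (position, key) of the first record whose area key reappears
    after a different area has intervened, or None if the file is grouped."""
    finished = set()              # areas whose run of records has already ended
    current: object = object()    # sentinel meaning "no area seen yet"
    for position, key in enumerate(area_keys):
        if key != current:
            if key in finished:
                return position, key
            if position > 0:
                finished.add(current)
            current = key
    return None
```

The check only requires that records for one area be adjacent; it does not require the areas themselves to be in ascending order.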
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Signed numeric and packed decimal data formats could also be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. Many users now include such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma characters as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
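
Item (11) above asks for a signed numeric format that tolerates leading blanks and negative values, as found in fixed-width fields written by FORTRAN programs. As a rough illustration of what such a field reader has to accept and reject, here is a minimal Python sketch. The function name, the 1-based column convention, and the choice to return None for a non-numeric field are assumptions for this example only.

```python
import re
from typing import Optional

# Optional leading blanks, an optional sign, then digits; nothing else allowed.
_SIGNED_FIELD = re.compile(r" *([+-]?)(\d+)")


def parse_signed_field(record: str, start: int, length: int) -> Optional[int]:
    """Read a right-justified signed integer from columns start..start+length-1
    (1-based, as on a coding sheet). Returns None when the field is not numeric."""
    field = record[start - 1 : start - 1 + length]
    if len(field) != length:
        return None  # record is too short for the declared field
    match = _SIGNED_FIELD.fullmatch(field)
    if match is None:
        return None  # blanks only, embedded blanks, or non-digit characters
    sign, digits = match.groups()
    return -int(digits) if sign == "-" else int(digits)
```

Returning None rather than raising keeps the decision about what to do with a bad field (flag it, impute it) with the caller, which is closer to how an edit step would treat it.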
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of the new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
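
Item (4) above proposes a lookup table for DRECODE. Since DRECODE inputs start at zero and ascend, the input code can serve directly as an index. The report does not say what the generated COBOL currently does; the Python sketch below simply assumes a chain of comparisons as the baseline and contrasts it with direct indexing. Both function names are hypothetical.

```python
from typing import Optional, Sequence


def recode_linear(value: int, table: Sequence[int]) -> Optional[int]:
    """Assumed baseline: test the input against 0, 1, 2, ... in turn."""
    for code, replacement in enumerate(table):
        if value == code:
            return replacement
    return None  # input is outside the recode list


def recode_lookup(value: int, table: Sequence[int]) -> Optional[int]:
    """Lookup-table approach: the input code is itself the table index."""
    if 0 <= value < len(table):
        return table[value]
    return None  # input is outside the recode list
```

The two functions return the same results; the lookup version does a bounds check and one index operation regardless of how long the recode list is, whereas the comparison chain grows with the list.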
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler language (ALC) code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; rather, only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system, because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
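
Item (5) above concerns weighting. In sample surveys each record typically carries an expansion factor, and totals are formed by multiplying each value by that factor before summing. The Python sketch below shows that arithmetic only; the field names ("persons", "weight") and both function names are hypothetical, and nothing here reflects a planned CONCOR design.

```python
from typing import Mapping, Sequence


def weighted_total(records: Sequence[Mapping[str, float]], item: str,
                   weight: str = "weight") -> float:
    """Sum of item * weight, treating the weight as an expansion factor."""
    return float(sum(r[item] * r[weight] for r in records))


def weighted_mean(records: Sequence[Mapping[str, float]], item: str,
                  weight: str = "weight") -> float:
    """Weighted total divided by the sum of the weights."""
    total_weight = sum(r[weight] for r in records)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return weighted_total(records, item, weight) / total_weight
```

For two sampled households with 4 and 2 persons and weights 10 and 25, the weighted total is 4 x 10 + 2 x 25 = 90 persons, whereas the unweighted total of 6 would describe only the sample.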
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES

OLD CONCOR (VERSION 1)
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

NEW CONCOR (VERSION 2)
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT

COMMENTS
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown, due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS

[Report page: CONCOR-EDITOR EXECUTION STATISTICS, with columns for CONTROL AREA and SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND, and NO VALID RECTYPES FOR QUEST. The program source, lines 111-125, is superimposed on the page: LIST and LET statements that set R001 and R003 to zero, followed at line 118 by a LET statement computing R003 from R002 and R001.]

Messages printed on the report:
DIVISION BY ZERO LINE NUMBER 118 (printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, ABENDing without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program was written to deliberately test CONCOR's ability to provide diagnostics. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
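
The behaviour reported here (a message naming the offending source line for each division by zero, then an orderly abort once errors accumulate) can be illustrated with a small Python sketch. This is not how the generated EDITOR program is written; the function names, the statement representation, and the `max_errors` parameter are assumptions, and the limit of six used in the usage note is taken only from the number of messages visible on the report page.

```python
from typing import Callable, Dict, List, Tuple

Variables = Dict[str, float]
Statement = Tuple[int, Callable[[Variables], None]]  # (source line, action)


class RunAborted(Exception):
    """Raised when the error limit is reached."""


def run_edit(statements: List[Statement], questionnaires: List[Variables],
             max_errors: int, log: List[str]) -> None:
    """Execute every statement for every questionnaire, logging the source
    line of each division by zero and aborting at the error limit."""
    errors = 0
    for variables in questionnaires:
        for line_number, action in statements:
            try:
                action(variables)
            except ZeroDivisionError:
                errors += 1
                log.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
                if errors >= max_errors:
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    raise RunAborted(line_number)


def divide(v: Variables) -> None:
    # Stand-in for the statement at source line 118.
    v["R003"] = v["R002"] / v["R001"]
```

Calling `run_edit([(118, divide)], data, max_errors=6, log=log)` on questionnaires where R001 is zero produces six line-numbered messages followed by the abort message, instead of an unexplained termination.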

[Report page: CONCOR-EDITOR EXECUTION STATISTICS. The printed error messages are largely illegible; each appears to give a questionnaire, coordinate, value and line number for the failing array reference, and the run ends with an abort message.]

Superimposed program listing (legible lines only):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182-185 LIST statements displaying N002, R003, TA1 (R001 R002) and TA2 (R001 R002)
186 END-DO
187 END-DO

Comments: In this example TA1, a two-dimensional array with 10 rows and 10 columns, is used in an attempt to update the smaller (5 x 5) TA2. References to nonexistent elements caused the error. The program text has been superimposed on this page for illustration purposes.

[Report page: CONCOR-EDITOR EXECUTION STATISTICS, with the same column headings as above.]

Messages printed on the report:
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Superimposed program listing (legible lines only):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
217-218 a further IF test on IN-RECORD-COUNT with a LIST message

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE

WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.

ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.

ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()).

ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string
1045 - 1200 Group Discussion on CONCOR - submission of written evaluations Distribution of IBM installation tapes Presentation of Certificates to Participants
1200-115 pm Break
APPENDIX D
PARTICIPANTS
APPENDIX D
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd
Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Centeral Agency for Public
Mobilization and Statistics)NASR City Cairo Egypt
L
2 31
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Cables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development GeorgiaWorld Congress Institute 580 North Omni International Atlanta Georgia 30303
3 32
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
33
CON C OR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
0
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 VERSION 1 | DECEMBER 1979 VERSION 2 | COMMENTS
ALLOCATE (ALL) | ALLOCATE (ALLOC) | Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) | ASSERT (ASRT) | The ASSERT command is the basic consistency editing command of CONCOR. The command now provides a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) | DRECODE (DRCD) | Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) | EXIT | Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END | END-IF END-THEN END-ELSE | A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR | not implemented | No longer supported in the language.
FILTER TYPE ( ) WITH | not implemented | Option statements are no longer supported in the language. This command once specified what was essentially a GOTO, depending on its construction.
(see XRECODE) | GRECODE (GRCD) | This new command provides a method for the group recoding of input values.
IF | IF | The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET | LET | Virtually unchanged.
(none) | LIST | New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
(none) | OUTPUT | New command which provides the means by which records will be written to the output file.
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP --keyword | STOP --keyword | Unchanged.
(none) | UNTIL DO END-DO | One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, once the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated by specifying the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
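The way a range test, UPDATE and ALLOCATE can work together as an imputation mechanism may be easier to follow in a small sketch. The Python below is an illustration only and is not CONCOR code or CONCOR syntax. It assumes a simple hot-deck scheme in which an array holds the most recent valid value for each group. The field names (sex, age), the valid range of 0 to 98, and the starting deck values are all hypothetical.

```python
# Hypothetical hot-deck imputation sketch (Python, not CONCOR syntax).
VALID_AGE = range(0, 99)          # assumed valid range for the illustration

def edit_ages(records, deck):
    """Range-check each record's age; impute failures from the hot deck.

    deck maps a sex code to the most recent valid age seen for that sex.
    Returns (edited_records, imputation_count).
    """
    edited, imputes = [], 0
    for rec in records:
        rec = dict(rec)                      # leave the input untouched
        if rec["age"] in VALID_AGE:          # range test passes
            deck[rec["sex"]] = rec["age"]    # like UPDATE: refresh the array
        else:                                # range test fails
            rec["age"] = deck[rec["sex"]]    # like ALLOCATE: assign donor value
            imputes += 1
        edited.append(rec)
    return edited, imputes

records = [
    {"sex": 1, "age": 34},
    {"sex": 2, "age": 29},
    {"sex": 1, "age": 150},   # out of range, imputed from last valid sex-1 age
]
edited, imputes = edit_ages(records, deck={1: 30, 2: 30})
print(edited[2]["age"], imputes)   # prints: 34 1
```

The imputation count returned here stands in for the kind of statistic the comments associate with GENERATE-EDIT-STATISTICS; how CONCOR itself accumulates such counts is not described in this appendix.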
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listings from this division.
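The presorting requirement follows from the nature of sequential control-break processing, which can be shown with a short Python sketch. This is an illustration under assumptions, not a description of CONCOR internals: the record layout and the field name area are hypothetical, and Python's itertools.groupby is used only because it likewise groups adjacent records and nothing else.

```python
from itertools import groupby

def area_totals(records):
    """Return (area, record_count) pairs, one per change of the control field.

    Like a sequential control-break report, this only merges records
    that are adjacent in the input.
    """
    return [
        (area, sum(1 for _ in group))
        for area, group in groupby(records, key=lambda rec: rec["area"])
    ]

unsorted_input = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
presorted = sorted(unsorted_input, key=lambda rec: rec["area"])

print(area_totals(unsorted_input))  # [('01', 1), ('02', 1), ('01', 1)]
print(area_totals(presorted))       # [('01', 2), ('02', 1)]
```

With unsorted input, area 01 produces two separate report lines. Sorting on the control field beforehand yields one line per area.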
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
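Item (11) above concerns fields written right-justified with leading blanks and an optional sign, as FORTRAN integer output commonly is. As a hedged illustration of what accepting such a format involves, the following Python sketch reads one fixed-width field. The function name, the 1-based column convention, and the sample line are assumptions made for the example and are not part of CONCOR.

```python
def parse_signed_field(line, start, length):
    """Return the integer in a fixed-width field, or None if it is not numeric.

    start is a 1-based column position, as on a coding sheet.
    Leading blanks and an optional leading sign are accepted.
    """
    text = line[start - 1:start - 1 + length].strip()
    if text[:1] in ("+", "-"):
        sign, digits = text[0], text[1:]
    else:
        sign, digits = "+", text
    if not digits.isdecimal():        # rejects blank fields and stray characters
        return None
    value = int(digits)
    return -value if sign == "-" else value

line = "0012   -45  7"
print(parse_signed_field(line, 5, 6))    # field "   -45" gives -45
print(parse_signed_field(line, 11, 3))   # field "  7" gives 7
print(parse_signed_field(line, 1, 4))    # field "0012" gives 12
```

A wholly blank field returns None in this sketch, so an edit program could treat it separately from a zero value.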
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
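The lookup table approach proposed in item (4) can be contrasted with sequential testing in a short Python sketch. This is an illustration only: the appendix does not say what code the EDITOR program currently generates for DRECODE, so the comparison chain below is an assumption, and the six-value recode is hypothetical.

```python
# Hypothetical recode of codes 0..5 into three groups.
RECODE_TABLE = [1, 1, 2, 2, 3, 3]   # position = input code, entry = new code

def recode_by_comparison(code):
    """Sequential tests: cost grows with the number of codes."""
    if code == 0 or code == 1:
        return 1
    if code == 2 or code == 3:
        return 2
    if code == 4 or code == 5:
        return 3
    return None                      # code outside the recode list

def recode_by_table(code):
    """Single indexed fetch: cost is the same for every code."""
    if 0 <= code < len(RECODE_TABLE):
        return RECODE_TABLE[code]
    return None

print([recode_by_table(c) for c in range(6)])   # [1, 1, 2, 2, 3, 3]
```

Because a recode of this kind starts at zero and ascends, the input value can serve directly as the table index, which is what makes the table form attractive.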
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
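The need for a weighting scheme in item (5) can be seen in a small worked example. The Python sketch below is illustrative only: the field names and the weights are hypothetical, and it assumes the simplest case in which each sample record carries an expansion weight stating how many units in the population it represents.

```python
def weighted_total(records, field, weight_field="weight"):
    """Sum field * weight over all records (an expansion estimate)."""
    return sum(rec[field] * rec[weight_field] for rec in records)

sample = [
    {"persons": 4, "weight": 10},   # this household stands for 10 households
    {"persons": 2, "weight": 25},
    {"persons": 6, "weight": 10},
]
unweighted = sum(rec["persons"] for rec in sample)
print(unweighted, weighted_total(sample, "persons"))   # prints: 12 150
```

The unweighted total describes only the sampled households, while the weighted total estimates the population the sample was drawn from. Summary statistics in an edit report for survey data would have the same two readings.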
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
DECEMBER 1978 VERSION 1
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
DECEMBER 1979 VERSION 2
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
COMMENTS
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset. Most are initialized with each control area break.
Though these were implemented as reserved identifiers, their exact values and utility were unknown due to the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution statistics page; run date illegible. Column headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND. The body of the report repeats the diagnostic DIVISION BY ZERO LINE NUMBER 118 and closes with RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason (ABEND) without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Superimposed program text, lines 111-125, largely illegible. Recoverable statements follow; the operator on line 118 is lost in the scan and is shown as a division on the evidence of the diagnostic.]
115 LET R001 R003 = 0
118 LET R003 = R002 / R001
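The behavior shown on this page, a diagnostic naming the source line for each division by zero followed by an orderly abort, can be sketched in Python. This is an illustration under assumptions and not the EDITOR program's logic: the error limit of six is invented for the example, the message wording is approximate, and the two statement functions merely stand in for source lines 115 and 118.

```python
MAX_ZERO_DIVIDES = 6   # assumed error limit; the real EDITOR limit is not stated

def run_edit(questionnaires, statements):
    """Apply numbered statements to each questionnaire's registers.

    statements is a list of (line_number, function) pairs. A division by
    zero is reported with its line number; the run aborts at the limit.
    """
    messages, zero_divides = [], 0
    for regs in questionnaires:
        for line_no, statement in statements:
            try:
                statement(regs)
            except ZeroDivisionError:
                zero_divides += 1
                messages.append(f"DIVISION BY ZERO - LINE NUMBER {line_no}")
                if zero_divides >= MAX_ZERO_DIVIDES:
                    messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return messages
    messages.append("NORMAL END OF JOB")
    return messages

def set_zero(regs):       # stands in for line 115: set R001 to zero
    regs["R001"] = 0

def divide(regs):         # stands in for line 118: R003 = R002 / R001
    regs["R003"] = regs["R002"] // regs["R001"]

statements = [(115, set_zero), (118, divide)]
questionnaires = [{"R001": 1, "R002": 8, "R003": 0} for _ in range(10)]
for message in run_edit(questionnaires, statements):
    print(message)
```

Reporting the line number with each exception, and stopping only after a stated limit, is what distinguishes this behavior from the silent terminations criticized in the old package.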
[Second execution statistics page, same layout. The body of the report lists repeated array error diagnostics, each apparently giving the questionnaire, coordinate, value, and line number; the details are illegible in the source.]
Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is used in an attempt to update the smaller (5 x 5) TA2. References to nonexistent elements caused the error. The program text has been superimposed on this page for illustration purposes.
[Superimposed program text, lines 173-187, partly illegible. Recoverable statements follow; the comparison operators are reconstructed.]
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
186 END-DO
187 END-DO
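The array coordinate error in this example arises because both loop counters run from 1 to 10 while the table being updated has only 5 rows and 5 columns. The Python sketch below reproduces that situation for illustration. It is not CONCOR code: the message wording and the helper function are hypothetical, and unlike the run shown on the page, the sketch records every bad reference rather than aborting.

```python
def update(table, row, col, value, line_no, errors):
    """Store value at 1-based (row, col) if it exists; otherwise log an error."""
    if 1 <= row <= len(table) and 1 <= col <= len(table[0]):
        table[row - 1][col - 1] = value
        return True
    errors.append(f"ARRAY COORDINATE ERROR ({row},{col}) - LINE NUMBER {line_no}")
    return False

ta1 = [[10 * r + c for c in range(1, 11)] for r in range(1, 11)]  # 10 x 10
ta2 = [[0] * 5 for _ in range(5)]                                 # 5 x 5
errors = []
for r001 in range(1, 11):            # UNTIL R001 > 10 VARY R001 FROM 1 BY 1
    for r002 in range(1, 11):        # UNTIL R002 > 10 VARY R002 FROM 1 BY 1
        update(ta2, r001, r002, ta1[r001 - 1][r002 - 1], 181, errors)

print(len(errors))   # 75 of the 100 references fall outside the 5 x 5 table
print(errors[0])     # ARRAY COORDINATE ERROR (1,6) - LINE NUMBER 181
```

Checking the coordinates before every store is the kind of system protection feature that the Run Control Section proposed in Appendix G would allow a user to switch off for speed.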
[Third execution statistics page, same layout. The body of the report reads JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
[Superimposed program text, lines 214-218, partly illegible. Recoverable statements follow; the comparison value on line 214 is illegible and is 500 according to the comments.]
214 IF IN-RECORD-COUNT >
215 THEN STOP-RUN END-THEN
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd
Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Centeral Agency for Public
Mobilization and Statistics)NASR City Cairo Egypt
L
2 31
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W ONeal USREPJECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
OR co UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Cables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development GeorgiaWorld Congress Institute 580 North Omni International Atlanta Georgia 30303
3 32
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
33
CON C OR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
0
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
2D
35
2
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
36
3
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer andor in training userR on the packages applicability to their work
37
7 In what way(s) could improvements or enhancements be made to this version
of the CONCOR software andor its documentation
8 Other comments
1shy
APPENDIX F
NEWOLD COMMAND COMPARISONS
0 0
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-IDand record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
APPENDIX F
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is only a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS

VERSION 1 (DECEMBER 1978): ALLOCATE (ALL)
VERSION 2 (DECEMBER 1979): ALLOCATE (ALLOC)
COMMENTS: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

VERSION 1 (DECEMBER 1978): ASSERT (AST)
VERSION 2 (DECEMBER 1979): ASSERT (ASRT)
COMMENTS: The ASSERT command is the basic consistency editing command of CONCOR. The command now provides NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

VERSION 1 (DECEMBER 1978): (see RECODE)
VERSION 2 (DECEMBER 1979): DRECODE (DRCD)
COMMENTS: Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.

VERSION 1 (DECEMBER 1978): (none)
VERSION 2 (DECEMBER 1979): EXIT
COMMENTS: Provides the means to terminate execution (escape) of command statements within the UNTIL command.

VERSION 1 (DECEMBER 1978): END
VERSION 2 (DECEMBER 1979): END-IF END-THEN END-ELSE
COMMENTS: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

VERSION 1 (DECEMBER 1978): ERROR
VERSION 2 (DECEMBER 1979): not implemented
COMMENTS: No longer supported in the language.

VERSION 1 (DECEMBER 1978): FILTER TYPE ( ) WITH
VERSION 2 (DECEMBER 1979): not implemented
COMMENTS: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.

VERSION 1 (DECEMBER 1978): (see XRECODE)
VERSION 2 (DECEMBER 1979): GRECODE (GRCD)
COMMENTS: This new command provides a method for the group recoding of input values.

VERSION 1 (DECEMBER 1978): IF
VERSION 2 (DECEMBER 1979): IF
COMMENTS: The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

VERSION 1 (DECEMBER 1978): LET
VERSION 2 (DECEMBER 1979): LET
COMMENTS: Virtually unchanged.

VERSION 1 (DECEMBER 1978): (none)
VERSION 2 (DECEMBER 1979): LIST
COMMENTS: New command which allows for the generation of specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.

VERSION 1 (DECEMBER 1978): (none)
VERSION 2 (DECEMBER 1979): OUTPUT
COMMENTS: New command which provides the means by which records are written to the output-file.

VERSION 1 (DECEMBER 1978): RANGE (RNG)
VERSION 2 (DECEMBER 1979): RANGE (RNG)
COMMENTS: RANGE is a basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is the PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.

VERSION 1 (DECEMBER 1978): RECODE (REC)
VERSION 2 (DECEMBER 1979): name changed
COMMENTS: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

VERSION 1 (DECEMBER 1978): STOP keyword
VERSION 2 (DECEMBER 1979): STOP keyword
COMMENTS: Unchanged.

VERSION 1 (DECEMBER 1978): (none)
VERSION 2 (DECEMBER 1979): UNTIL DO END-DO
COMMENTS: DO END-DO is one of the most significant and powerful enhancements to the CONCOR language. This command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

VERSION 1 (DECEMBER 1978): UPDATE (UPD)
VERSION 2 (DECEMBER 1979): UPDATE (UPD)
COMMENTS: Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.

VERSION 1 (DECEMBER 1978): WRITE (WRT)
VERSION 2 (DECEMBER 1979): WRITE (WRT)
COMMENTS: The WRITE language statement, once the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

VERSION 1 (DECEMBER 1978): XRECODE (XREC)
VERSION 2 (DECEMBER 1979): name changed
COMMENTS: Old command which has become the GRECODE command in the December 1979 version.
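
The way UPDATE and ALLOCATE are said to work together for imputation can be pictured with a short Python sketch. It is not CONCOR syntax and makes no claim about how the generated COBOL works. It assumes a hot-deck arrangement in which an array cell keeps the most recent valid value seen for a group. The names `update`, `allocate`, `edit_ages`, the age-by-sex array and the sample records are all hypothetical.

```python
# Illustrative sketch only: hot-deck style imputation in Python, not CONCOR syntax.
# The array layout, function names and sample records are assumptions.

def update(array, key, value):
    """Store a valid observed value so later records can borrow it (UPDATE-like)."""
    array[key] = value

def allocate(array, key):
    """Return the stored value for this group to fill a bad field (ALLOCATE-like)."""
    return array[key]

def edit_ages(records, deck, low=0, high=99):
    """Impute missing or out-of-range ages; return the number of imputations."""
    imputed = 0
    for rec in records:
        age = rec["age"]
        if age is not None and low <= age <= high:
            update(deck, rec["sex"], age)             # valid: refresh the array
        else:
            rec["age"] = allocate(deck, rec["sex"])   # invalid: borrow a value
            imputed += 1
    return imputed

records = [
    {"sex": 0, "age": 41},
    {"sex": 0, "age": None},   # missing
    {"sex": 1, "age": 150},    # out of range
    {"sex": 1, "age": 17},
]
deck = {0: 30, 1: 28}          # starting values chosen arbitrarily for the example
count = edit_ages(records, deck)
print(count, [r["age"] for r in records])
```

The count returned by `edit_ages` stands in for the kind of imputation statistic that the report says is accumulated when GENERATE-EDIT-STATISTICS is coded.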
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
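
The presorting requirement can be illustrated with a small Python sketch. It assumes, as the comment above implies, that reports are produced at each break in the area control field as the file is read sequentially. The function `area_totals` and the two-digit area codes are hypothetical and are not part of CONCOR.

```python
# Illustrative sketch only: why a break-driven report needs presorted input.
from itertools import groupby

def area_totals(records):
    """Return one (area, record_count) pair per run of consecutive equal area codes."""
    return [(area, sum(1 for _ in run))
            for area, run in groupby(records, key=lambda r: r["area"])]

unsorted_file = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
sorted_file = sorted(unsorted_file, key=lambda r: r["area"])

print(area_totals(unsorted_file))  # area 01 is reported twice, as two separate breaks
print(area_totals(sorted_file))    # area 01 is reported once, with both records
```

In the unsorted case the second run of area 01 is treated as a new control area, which is the behavior the presort is meant to prevent.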
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Signed numeric and packed decimal data formats could also be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. Many users now include such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
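
The automatic range check proposed in item (13) might behave along the lines of the following Python sketch. It is a minimal illustration under stated assumptions: the dictionary layout, the identifier names `SEX`, `AGE` and `RELATIONSHIP`, their ranges, and the function `automatic_range_check` are all hypothetical and do not come from the CONCOR manuals.

```python
# Illustrative sketch only: a range check driven by ranges declared in the dictionary.
# Hypothetical dictionary: user-identifier -> (lowest valid value, highest valid value).
DICTIONARY = {"SEX": (1, 2), "AGE": (0, 99), "RELATIONSHIP": (1, 9)}

def automatic_range_check(record, dictionary=DICTIONARY):
    """Return the sorted names of items that are missing or outside the declared range."""
    failures = []
    for name, (low, high) in dictionary.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return sorted(failures)

record = {"SEX": 3, "AGE": 45, "RELATIONSHIP": 1}
print(automatic_range_check(record))
```

Because the ranges live with the item declarations rather than in comment statements, the same table could in principle be read by tabulation software, which is the secondary benefit the item mentions.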
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of the new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
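
The lookup table idea in item (4) can be sketched in Python as follows. The report does not show the COBOL that the EDITOR currently generates for DRECODE, so the comparison-chain version below is only an assumption about what a non-table approach looks like. The function names and the `MARITAL_RECODE` table are hypothetical.

```python
# Illustrative sketch only: two ways to recode values that start at zero and ascend.

def recode_by_comparison(value, new_codes):
    """Chain-of-tests style: compare against 0, 1, 2, ... until a match is found."""
    for old_code, new_code in enumerate(new_codes):
        if value == old_code:
            return new_code
    raise ValueError(f"value {value} has no recode")

def recode_by_lookup(value, new_codes):
    """Lookup-table style: the input value is itself the index into the table."""
    if not 0 <= value < len(new_codes):
        raise ValueError(f"value {value} has no recode")
    return new_codes[value]

# Hypothetical recode: old codes 0..4 map to the new codes listed by position.
MARITAL_RECODE = [9, 1, 1, 2, 3]

print([recode_by_lookup(v, MARITAL_RECODE) for v in range(5)])
```

Both functions return the same result; the table version does a single indexed access regardless of how many codes exist, which is the execution-time saving the item anticipates.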
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive; rather, only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of the questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
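
The weighting scheme suggested in item (5) of this list can be illustrated with a brief Python sketch. The report does not say how weights would be stored or applied, so the per-record `weight` field, the `household_size` item and both function names are assumptions made for the example.

```python
# Illustrative sketch only: weighted totals and means for sample survey records.

def weighted_total(records, item, weight="weight"):
    """Sum of item values, each multiplied by the record's sampling weight."""
    return sum(rec[item] * rec[weight] for rec in records)

def weighted_mean(records, item, weight="weight"):
    """Weighted total divided by the sum of the weights."""
    total_weight = sum(rec[weight] for rec in records)
    return weighted_total(records, item, weight) / total_weight

sample = [
    {"household_size": 4, "weight": 10.0},   # represents 10 households
    {"household_size": 2, "weight": 30.0},   # represents 30 households
]
print(weighted_total(sample, "household_size"), weighted_mean(sample, "household_size"))
```

An unweighted mean of this sample would be 3.0, while the weighted mean is 2.5, which shows why summary statistics from a sample are not meaningful without the weights.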
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR
These apparently new counters permit access to valuable totals within control areas
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-)
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution statistics printout, largely illegible in this copy. Legible column headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND.]
DIVISION BY ZERO LINE NUMBER 118 (message printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
Legible fragments of the superimposed program text:
115 LET R001 R003 = 0
118 LET R003 = R002 / R001
122 END TEST OF OPERANDS AND MSG
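
The diagnostic behavior shown on this page, in which each division by zero is reported with its source line number and the run is then aborted, can be mimicked with a short Python sketch. It is not how the EDITOR program is implemented. The limit of six errors is an assumption taken from the six messages on the printout, integer division is assumed because floating point is listed only as a future enhancement, and returning zero after an error is likewise an assumption. `RunAborted` and `guarded_divide` are hypothetical names.

```python
# Illustrative sketch only: report each division by zero with its line number,
# then abort once an (assumed) error limit is reached.

class RunAborted(Exception):
    """Raised when the number of division-by-zero errors reaches the limit."""

def guarded_divide(numerator, denominator, line_number, log, limit=6):
    """Divide; on a zero divisor, log the offending source line instead of failing silently."""
    try:
        return numerator // denominator
    except ZeroDivisionError:
        log.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
        if len(log) >= limit:
            log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            raise RunAborted(line_number)
        return 0  # assumed result so that processing can continue

log = []
try:
    for _ in range(10):            # ten records, each hitting line 118
        guarded_divide(5, 0, 118, log)
except RunAborted:
    pass
print(len(log), log[-1])
```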
[Second execution statistics printout. The report heading and the error messages are illegible in this copy.]
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO
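
The array coordinate error in this example can be reproduced in outline with the Python sketch below. It assumes 1-based subscripts and a bounds check on every update, and the error message wording is invented for the illustration since the actual messages are not legible. `make_array` and `checked_update` are hypothetical names.

```python
# Illustrative sketch only: a 10 x 10 loop updating a 5 x 5 array with bounds checking.

def make_array(rows, cols, fill=0):
    """Build a rows x cols array as a list of lists."""
    return [[fill] * cols for _ in range(rows)]

def checked_update(array, row, col, value, name, errors):
    """Store value at 1-based (row, col); record an error if the cell does not exist."""
    if not (1 <= row <= len(array) and 1 <= col <= len(array[0])):
        errors.append(f"{name} ({row} {col}) OUT OF BOUNDS")
        return False
    array[row - 1][col - 1] = value
    return True

ta1 = make_array(10, 10, fill=7)
ta2 = make_array(5, 5)
errors = []
for r in range(1, 11):             # outer loop, rows 1 to 10
    for c in range(1, 11):         # inner loop, columns 1 to 10
        checked_update(ta2, r, c, ta1[r - 1][c - 1], "TA2", errors)
print(len(errors), errors[0])
```

Of the 100 attempted updates only the 25 that fall inside the 5 x 5 array succeed; the remaining 75 refer to nonexistent elements.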
[Third execution statistics printout. The report heading is illegible in this copy.]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
[Lines 217-218, a second IF test on IN-RECORD-COUNT followed by a LIST command, are only partly legible.]
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
Monday January 14
9:30 am - 10:30 Discuss procedures for running problems on computer
10:30 - 10:45 Break
10:45 - 12:00 Component Programs of the CONCOR system
12:00 - 1:15 pm Break
Tuesday January 15
9:30 am - 3:25 pm Free work time
Wednesday January 16
9:30 am - 12:00 Free work time
12:00 - 1:15 pm Break
1:15 pm - 2:15 How to Install CONCOR on IBM 360/370 OS
2:15 - 2:30 Break
2:30 - 3:25 Free work time
Thursday January 17
9:30 am - 3:25 Free work time
1:15 pm - 2:15 Future CONCOR Design Considerations
- for census processing
- for survey processing
- manual correction system
2:15 - 2:30 Break
2:30 - 2:45 Evaluation Guidelines
- Hand out evaluation forms
2:45 - 3:25 Free work time
Friday January 18
9:30 am - 10:30 Free work time
10:30 - 10:45 Break
10:45 - 12:00 Group Discussion on CONCOR; submission of written evaluations; distribution of IBM installation tapes; presentation of Certificates to Participants
12:00 - 1:15 pm Break
1:15 - 3:25 Free work time
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) NASR City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W O'Neal USREP/JECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520
or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
0
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
2D
35
2
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
36
3
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer andor in training userR on the packages applicability to their work
37
7 In what way(s) could improvements or enhancements be made to this version
of the CONCOR software andor its documentation
8 Other comments
1shy
APPENDIX F
NEWOLD COMMAND COMPARISONS
0 0
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-IDand record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
APPENDIX F
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
DOENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DiVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 ofthe coding sheet The position notation of Version 2 permits thecommand to begin in columns 1-72
The period preceding a statement
signifies that it is but a commentcard Accordingly it should be noted that while the END-DIVISION command must be present the division name identifier DICTIONARY-DIVISION
has not been implemented
Note Some parameters for each commandhave been altered even where general correspondence exists
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECI-FICATION commands of both CONCOR language versions Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general corresponshydence exists Individual commands are discussed on the following pages
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
ALLOCATE (ALL) | ALLOCATE (ALLOC) | Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) | ASSERT (ASRT) | The ASSERT command is the basic consistency editing command of CONCOR. The command now provides NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) | DRECODE (DRCD) | Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) | EXIT | Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END | END-IF, END-THEN, END-ELSE | A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR | not implemented | No longer supported in the language.
FILTER TYPE ( ) WITH | not implemented | Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE) | GRECODE (GRCD) | This new command provides a method for the group recoding of input values.
IF | IF | The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET | LET | Virtually unchanged.
(none) | LIST | New command which allows specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.
(none) | OUTPUT | New command which provides the means by which records will be written to the output-file.
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP-keyword | STOP-keyword | Unchanged.
(none) | UNTIL DO END-DO | DO END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
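The pairing of UPDATE (maintaining arrays) with ALLOCATE (assigning values) described in the table corresponds to what is commonly called hot-deck imputation. The following is a minimal Python sketch of that idea, not CONCOR syntax; the field names (sex, age), the valid range of 0 to 98, and the starting array values are all assumptions made for illustration.

```python
# Hot-deck imputation sketch (illustrative only; not CONCOR syntax).
# The "array" is indexed by sex; the starting cells are assumed values.
hot_deck = {"M": 25, "F": 27}
imputes = 0  # number of values imputed so far


def edit_age(record):
    """Return the record with a valid age, imputing from the hot deck if needed."""
    global imputes
    age = record["age"]
    if age is not None and 0 <= age <= 98:      # assumed valid range
        hot_deck[record["sex"]] = age            # UPDATE step: keep last good value
        return record
    imputes += 1
    return {**record, "age": hot_deck[record["sex"]]}  # ALLOCATE step: impute


records = [
    {"sex": "M", "age": 40},
    {"sex": "M", "age": None},
    {"sex": "F", "age": 130},
    {"sex": "F", "age": 31},
]
edited = [edit_age(r) for r in records]
```

The second record receives the age stored by the first, while the third falls back on the assumed starting value because no valid female record has been seen yet.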
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
  DISPLAY-CONTROL-SECTION
    DISPLAY-EDIT-STATISTICS
  TOLERANCE-CONTROL-SECTION
    ERROR-RATE-CHECK
    REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
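The presorting requirement can be illustrated with a short Python sketch. It assumes, purely for illustration, that each questionnaire reduces to an (area code, error count) pair and that report totals are taken over each run of contiguous records with the same area code, which is how the comment above describes CONCOR's behavior.

```python
from itertools import groupby

# Hypothetical (area_code, error_count) pairs standing in for edited questionnaires.
records = [("A", 1), ("B", 2), ("A", 3), ("B", 4)]


def area_totals(rows):
    """Sum error counts over each run of contiguous identical area codes."""
    return [
        (area, sum(count for _, count in group))
        for area, group in groupby(rows, key=lambda row: row[0])
    ]


# Unsorted input: every change of area code starts a new report block.
unsorted_report = area_totals(records)

# Presorted input: one report block per area.
sorted_report = area_totals(sorted(records, key=lambda row: row[0]))
```

With unsorted input, area A is reported twice with partial totals; only the presorted file yields one total per area.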
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
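Item (11) above asks for a signed numeric format that tolerates leading blanks and negative values, as commonly written by FORTRAN programs. A minimal Python sketch of such a field conversion follows; the function name and the choice to return None for blank or non-numeric fields are assumptions for illustration and say nothing about how CONCOR itself would implement the format.

```python
def parse_signed_field(raw):
    """Convert a fixed-width field with optional leading blanks and sign.

    Returns an int, or None when the field is blank or not numeric.
    """
    text = raw.strip()
    if not text:
        return None                      # wholly blank field
    sign = 1
    if text[0] in "+-":
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not (text.isascii() and text.isdigit()):
        return None                      # not a plain digit string
    return sign * int(text)
```

A field such as `"   -12"` converts to -12, while `"     "` and `"12A"` are reported as missing rather than raising an error, leaving the decision on blank and non-numeric data to the edit logic.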
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
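Item (4) above contrasts a lookup table with the code currently generated for DRECODE. The Python sketch below shows the general difference between testing each code in turn and indexing a table directly; the recode values are made up, and treating the current generated code as a chain of comparisons is an assumption, since the report does not show it.

```python
# Hypothetical recode: input codes 0, 1, 2, ... map to new codes by position.
RECODE_TABLE = [9, 1, 1, 2, 3]   # made-up values: input 0 -> 9, input 1 -> 1, ...


def recode_linear(value):
    """Comparison-chain style: test each input code in turn."""
    for code, new_value in enumerate(RECODE_TABLE):
        if value == code:
            return new_value
    return None                   # input outside the table


def recode_lookup(value):
    """Lookup-table style: the input code is used directly as the index."""
    if 0 <= value < len(RECODE_TABLE):
        return RECODE_TABLE[value]
    return None
```

Both functions return the same results; the lookup version does a constant amount of work per value, whereas the comparison chain grows with the number of codes.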
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
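The check-in procedure proposed in item (2) of the survey list amounts to comparing a master list of selected cases with the cases actually present on the data file. A minimal Python sketch, with hypothetical questionnaire identifiers and a made-up function name:

```python
def check_in(master_ids, received_ids):
    """Compare the expected sample cases with those actually present."""
    master, received = set(master_ids), set(received_ids)
    return {
        "missing": sorted(master - received),      # selected but not received
        "unexpected": sorted(received - master),   # received but not selected
    }


result = check_in(["Q001", "Q002", "Q003"], ["Q003", "Q001", "Q007"])
```

Here `result` reports Q002 as missing and Q007 as unexpected; how such cases would then be followed up is outside the scope of the sketch.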
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES

1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT

Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g. IN-, OUT-.
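The notes above state that most counters are initialized at each control area break and are not cumulative. The Python sketch below illustrates that behavior for a single record counter; the data layout and function name are assumptions, and the run total is computed separately to show what a cumulative counter would have to add.

```python
def count_by_area(records):
    """Yield (area, records_in_area) at each control area break, then reset."""
    current_area, in_record_count = None, 0
    for area, _payload in records:
        if current_area is not None and area != current_area:
            yield current_area, in_record_count
            in_record_count = 0            # counter reset at the area break
        current_area = area
        in_record_count += 1
    if current_area is not None:
        yield current_area, in_record_count   # final area


breaks = list(count_by_area([("01", "a"), ("01", "b"), ("02", "c")]))
run_total = sum(count for _, count in breaks)   # cumulative figure, kept separately
```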
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout: the message "DIVISION BY ZERO, LINE NUMBER 118" is repeated, followed by "RUN ABORTED DUE TO DIVISION BY ZERO ERRORS". Superimposed program text, lines 111-125: LIST and LET statements on R001, R002 and R003, including the LET at line 118 that computes R003 from R002 and R001 after R001 and R003 have been set to zero.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
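The behavior shown on this page, a diagnostic naming the source line for each division by zero and an eventual abort, can be sketched in Python as follows. This is an illustration under stated assumptions rather than a description of the EDITOR program: the error limit `max_errors`, the statement representation, and the use of integer division are all made up; only the message wording follows the printout.

```python
def run_with_diagnostics(statements, records, max_errors=5):
    """Execute (line_number, func) statements per record, logging zero divisions.

    max_errors is an assumed limit; the run is abandoned once it is reached.
    """
    log = []
    errors = 0
    for record in records:
        for line_number, func in statements:
            try:
                func(record)
            except ZeroDivisionError:
                errors += 1
                log.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
                if errors >= max_errors:
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    return log


def line_118(registers):
    """Stand-in for the LET statement at line 118 of the test program."""
    registers["R003"] = registers["R002"] // registers["R001"]


log = run_with_diagnostics(
    [(118, line_118)], [{"R001": 0, "R002": 7}] * 3, max_errors=3
)
```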
[CONCOR-EDITOR Execution Statistics printout: a series of array occurrence error messages. Superimposed program text, lines 175-187: nested UNTIL ... VARY ... DO loops stepping R001 and R002 from 1 by 1 until each exceeds 10, an ALLOCATE from TA1 (R001 R002) at line 180, an UPDATE of TA2 (R001 R002) at line 181, LIST statements at lines 182-185, and END-DO at lines 186 and 187.]
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
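The situation in this example, loop counters running to 10 while the target array has only 5 rows and 5 columns, can be sketched in Python. The function name and the message format are assumptions for illustration; the printout's own message wording is not legible enough to reproduce.

```python
def update_cell(table, row, col, value, line_number, log):
    """Store value at 1-based (row, col); log and skip nonexistent elements."""
    rows, cols = len(table), len(table[0])
    if 1 <= row <= rows and 1 <= col <= cols:
        table[row - 1][col - 1] = value
        return True
    log.append(
        f"LINE {line_number}: NO ELEMENT ({row},{col}) IN {rows} X {cols} ARRAY"
    )
    return False


ta2 = [[0] * 5 for _ in range(5)]      # the smaller 5 x 5 array
log = []
for r in range(1, 11):                 # outer loop, counter 1..10
    for c in range(1, 11):             # inner loop, counter 1..10
        update_cell(ta2, r, c, r * c, 181, log)
```

Of the 100 attempted updates, 25 fall inside the 5 x 5 array and the remaining 75 each produce a diagnostic naming the offending line.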
[CONCOR-EDITOR Execution Statistics printout: "JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215", followed by "EDITOR -- NORMAL END OF JOB". Superimposed program text, lines 214-218: an IF on IN-RECORD-COUNT at line 214 whose THEN branch at line 215 issues STOP-RUN, followed by a second IF on IN-RECORD-COUNT with a LIST message.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
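The first two diagnostics in this sample can be pictured with a small Python scanner. It is a sketch under loud assumptions: the punctuation characters of the legal set are not legible in the guide excerpt, so the symbols after "-+" in `LEGAL` and the period used as comment indicator are placeholders, and splitting strings on whitespace is a simplification of whatever delimiters CONCOR actually uses.

```python
import string

# Assumed legal first characters; everything after "-+" is a placeholder,
# not CONCOR's documented character set.
LEGAL = set(string.ascii_uppercase + string.digits + "-+,=;'")
COMMENT = "."   # assumed comment indicator in column 1


def scan(lines):
    """Return (code, line_no) diagnostics for blank lines and bad first characters."""
    found = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            found.append(("DD-001", line_no))       # warning: blank line
            continue
        if line.startswith(COMMENT):
            continue                                # comment card, skipped
        for token in line.split():
            if token[0] not in LEGAL:
                found.append(("DD-002", line_no))   # error: illegal first character
    return found


diagnostics = scan(["INPUT-FILE NAME=CENSUS", "", ".a comment", "#BAD TOKEN"])
```

As in the guide, the scanner continues after each diagnostic: a blank line moves it to the next line, and an illegal first character moves it to the next string.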
APPENDIX D
PARTICIPANTS
CONCOR WORKSHOP, January 7-18, 1980, Marlow Heights, MD
KENYA
James Midianga, Government Computer Centre, Central Bureau of Statistics, PO Box 30266, Nairobi, Kenya
PANAMA
Omar Rivera Rios, Data Processing Specialist, Contraloria General de la Republica de Panama, Apartado Postal 5213, Panama 5, Panama
PHILIPPINES
Guida T Capellan, OIC, Computer Programming Division, National Census and Statistics Office, Solicarel Bldg 1, Magsaysay Blvd, Sta Mesa, Manila, Philippines
THAILAND
Angsumal Sunalai, National Statistical Office, Bangkok 1, Thailand
EGYPT
Farag Sedky Mourad Ghaleb, CAPMAS (Central Agency for Public Mobilization and Statistics), NASR City, Cairo, Egypt
SAUDI ARABIA
Shargi R Al-Shargi, National Computer Center, PO Box 2534, Riyadh, Saudi Arabia
Ahmed S Al-Ghamdi, Dept Director of National Computer Center, PO Box 2534, Riyadh, Saudi Arabia
OTHER
Robert W O'Neal, USREPJECOR, APO New York, NY 09038
John N Adams, Data Processing Advisor, DACCA, Department of State, Washington DC 20520, or c/o UNDP, PO Box 224, Dacca, Bangladesh
Joe Quasney, US Census Bureau, Washington DC 20233
Howard Brunsman, 5715 N Ninth St, Arlington, VA 22205
Larry Shiller and Julio Ortuzar, Delta Systems Consultants Inc, 264 Alhambra Circle, Coral Gables, FL 33134
Nicholas Ourusoff, Burpee Hill Road, New London, NH 03257 (United Nations)
Frederic J Grant IV, Systems Development, Georgia World Congress Institute, 580 North Omni International, Atlanta, Georgia 30303
Rafael Samper, Mary Jo Keenan, Catherine B Gleason, LACDR, Room 2246 NS, Washington DC 20523
John Marshall, AIDSERDMPSE, Rm 706C SA-12, Washington DC 20523
Susanne Bacon and Mary K Friday, ISPC/Data Processing, US Census Bureau, Washington DC 20233
Leo Dougherty, ISPC/World Census Staff, US Bureau of the Census, Washington DC 20233
CONCOR WORKSHOP
January 7-18, 1980, Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP, January 16, 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1, DECEMBER 1978
DATA-DICTIONARY
  DEFINE INPUT-FILE
  DEFINE ERROR-FILE
  DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
  DEFINE RECORD-TYPE=
  DEFINE NEW-DATA
  DEFINE ARRAY-DATA
NEW CONCOR VERSION 2, DECEMBER 1979
DICTIONARY-DIVISION
  DICTIONARY-ATTRIBUTES-SECTION
    DICTIONARY-NAME
  FILE-SECTION
    INPUT-FILE
    OUTPUT-FILE
    WRITE-FILE
    ERROR-FILE
  IDENTIFICATION-CONTROL-SECTION
    AREA-CONTROL
    QUESTIONNAIRE-CONTROL
    RECORD-CONTROL
  INPUT-RECORD-SECTION
    DEFINE-RECORD
  WORKING-DATA-SECTION
    NEW-DATA
    ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 ofthe coding sheet The position notation of Version 2 permits thecommand to begin in columns 1-72
The period preceding a statement
signifies that it is but a commentcard Accordingly it should be noted that while the END-DIVISION command must be present the division name identifier DICTIONARY-DIVISION
has not been implemented
Note Some parameters for each commandhave been altered even where general correspondence exists
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECI-FICATION commands of both CONCOR language versions Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general corresponshydence exists Individual commands are discussed on the following pages
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and with the UPDATE command providesthe imputation mechanism Major difference is this statement now providesfor a receiving-identifier list and statistics are generated as coded inthe GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION
ASSERT fAST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR Command now provides for a NOERROR and MESSAGE option Additionallyby utilizinq the DUMPR DUMPO keywords provides the user with the option of using all the current value of all user-identifiers defined for the record type being processed Other changes to this command include the pass end-pass fail end-fail constructions for the P purpose of maintaining structured logic
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continue in ascending sequence Replaces t previous XRECODE command
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN
A solitary END command is no lonqer permitted in conditional construction Note the similarity to COBOL pseudocode
END-ELSE
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continuej)
4)bull
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as of the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical to the DRC command in the COCENTS system.
STOP-keyword STOP-keyword Unchanged.
(none) UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining the arrays involved in imputation processes.
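The looping behaviour described for the UNTIL command can be sketched in Python. This is only an illustration under stated assumptions: the function name until_vary is hypothetical, and the sketch assumes that the UNTIL condition is tested before each pass through the command statements, which the comparison table does not confirm.

```python
def until_vary(start, step, stop_condition, body):
    """Run body(counter) repeatedly until stop_condition(counter) is true.

    Assumption: the condition is tested BEFORE each pass. The source
    report does not say whether CONCOR tests before or after the body.
    """
    counter = start                    # value initially assigned by the user
    passes = 0
    while not stop_condition(counter):
        body(counter)                  # the statements between DO and END-DO
        counter += step                # incremented after each execution
        passes += 1
    return passes


# Rough analogue of: UNTIL R001 > 10 VARY R001 FROM 1 BY 1 DO ... END-DO
visited = []
passes = until_vary(1, 1, lambda r001: r001 > 10, visited.append)
```

Here the counter takes the values 1 through 10 and the loop stops as soon as the condition becomes true.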
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE statement, once the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated by specifying the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)
XRECODE (XREC) name changed Old command, which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. This means that the input data file must be preprocessed (presorted external to CONCOR) on the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command that enables users to specify unique report headings on the listings from this division.
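The presorting requirement can be illustrated with a short Python sketch. It is not CONCOR code: the function area_breaks and the sample records are hypothetical, and the sketch only assumes that a report section is started whenever the area-control value changes from one record to the next.

```python
from itertools import groupby


def area_breaks(records, area_key):
    """Return (area, record_count) for each run of contiguous records.

    Like a control-break report, groupby only combines ADJACENT records
    that share the same area value; it never merges separated runs.
    """
    return [(area, len(list(group)))
            for area, group in groupby(records, key=area_key)]


# Hypothetical records: (area code, questionnaire number)
presorted = [("A", 1), ("A", 2), ("B", 3)]
unsorted = [("A", 1), ("B", 3), ("A", 2)]

sorted_report = area_breaks(presorted, lambda rec: rec[0])
broken_report = area_breaks(unsorted, lambda rec: rec[0])
```

With the presorted file each area yields one report section. With the unsorted file area A is reported twice, which is why the sort has to be done before the data reach the editor.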
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications needed to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and to control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save the values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output locations in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to the fields used in controlling summary statistics and in determining unique questionnaires. This would require modification of the system-level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification of the dictionary as well as of the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would have to be prohibited from changing these values (as is currently done with CONCOR internal identifiers), to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR works internally in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is determining which value is to be moved to the output record when the record is written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure that the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement in the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data formats that may be referenced. External numeric data (type N) could be expanded to handle 18-digit numbers. Signed numeric and packed decimal data formats could also be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed before CONCOR gives the user access to the data items on the questionnaire. Many users now include such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
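Item (11) above can be made concrete with a short Python sketch of reading a signed, blank-padded numeric field of the kind a FORTRAN program typically writes. The function name parse_signed_field and the decision to return None for an all-blank field are assumptions made for illustration; the report does not say how CONCOR would treat such fields.

```python
DIGITS = "0123456789"


def parse_signed_field(field):
    """Convert a fixed-width field such as '   -42' to an integer.

    Leading and trailing blanks are ignored and one leading sign is
    accepted. An all-blank field yields None (an assumption of this
    sketch); anything else that is not a signed integer is rejected.
    """
    text = field.strip()
    if not text:
        return None
    sign = 1
    if text[0] in "+-":
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not text or any(ch not in DIGITS for ch in text):
        raise ValueError(f"not a signed numeric field: {field!r}")
    return sign * int(text)


negative = parse_signed_field("   -42")
positive = parse_signed_field("    17")
blank = parse_signed_field("      ")
```

An unsigned external numeric format would reject the first field outright, which is the difficulty the item describes.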
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program by eliminating some of the system protection features, such as division-by-zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and to modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance with respect to the new record's inclusion in the universe of observations and in the number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision of the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run-total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
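The lookup table approach suggested in item (4) can be sketched as follows. This is an illustration in Python, not the generated COBOL: the function names and the sample code pairs are hypothetical, and the sketch relies only on the stated property that DRECODE input values start at zero and continue in ascending sequence.

```python
def recode_linear(value, pairs):
    """Search the (old, new) pairs one by one until a match is found."""
    for old, new in pairs:
        if value == old:
            return new
    return None  # value not covered by the recode list


def build_table(pairs):
    """Build a list indexed directly by the old value (0, 1, 2, ...)."""
    table = [None] * (max(old for old, _ in pairs) + 1)
    for old, new in pairs:
        table[old] = new
    return table


def recode_table(value, table):
    """Recode with a single subscript operation instead of a search."""
    if 0 <= value < len(table):
        return table[value]
    return None


# Hypothetical recode list: old code -> new code
pairs = [(0, 9), (1, 1), (2, 1), (3, 2)]
table = build_table(pairs)
```

Both functions return the same result for every input; the table version simply replaces the search with one subscript reference, at the cost of building the table in advance.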
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make the specification of the record type and questionnaire identification information optional, as some files are not hierarchical in nature.
(2) Provide another input file as a means of carrying out a check-in procedure for the sample cases expected. This new input file would contain the master list of the observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that with this approach two versions of the software would be required. One version would be CONCOR as now implemented, and the second would accept flat data files in which the subscript indicates a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system, because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
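The weighting scheme in item (5) amounts to multiplying each sampled value by the number of population units its record represents before totals are formed. A minimal Python sketch follows; the field names persons and weight, the function weighted_total, and the sample figures are all hypothetical and are not part of CONCOR.

```python
def weighted_total(records, value_key, weight_key):
    """Sum value * weight over all records to estimate a population total."""
    return sum(rec[value_key] * rec[weight_key] for rec in records)


# Hypothetical sample: each household stands for `weight` households
# in the sampling frame.
sample = [
    {"persons": 2, "weight": 10},
    {"persons": 3, "weight": 25},
]

unweighted = sum(rec["persons"] for rec in sample)
estimated = weighted_total(sample, "persons", "weight")
```

The unweighted count describes only the sample, while the weighted figure estimates the population total. Integer weights are used here because item (3) notes that floating point calculations are not yet available.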
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
OLD CONCOR / NEW CONCOR / COMMENTS
1 EOF-FLAG / EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE / ERRORS-IN-QUESTIONNAIRE / Change in spelling of variable.
3 CURRENT-POINTER-VALUE / (none) / Not mentioned; possible omission from manual.
4 TYPE-COUNT ( ) / TYPE-COUNT ( )
5 RECORD-COUNT / IN-RECORD-COUNT
6 QUESTIONNAIRE-COUNT / IN-QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE / RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT / INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG / INCOMPLETE-FLAG
10 CONTINUATION-FLAG / CONTINUATION-FLAG
11 INVALID-RECORD-FLAG / (none) / Not implemented.
(none) / OUT-OF-RANGE, NOT-NUMBER, BLANK / Though these were implemented as reserved identifiers, their exact values and utility were unknown because of the poor documentation of (old) CONCOR.
(none) / OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT (new internal counters) / These apparently new counters permit access to valuable totals within control areas. Note: these are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[CONCOR-EDITOR execution statistics printout; most of the page is illegible in this copy. The legible diagnostics are the message DIVISION BY ZERO LINE NUMBER 118, repeated several times, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.]
Superimposed program text (source lines 111-125, partly legible):
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS
118 LET R003 = R002 [operator illegible] R001
122 END TEST OF OPERANDS AND MSG
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, that is, ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program was written to deliberately test CONCOR's ability to provide diagnostics. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
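The behaviour reported above (a message naming the offending source line for each division by zero, and an abort once too many have occurred) can be sketched in Python. This is not the EDITOR's code: the class names, the error limit of 3, and the choice to return zero after a failed division are assumptions made for illustration.

```python
class RunAborted(Exception):
    """Raised when the division-by-zero error limit is reached."""


class Diagnostics:
    def __init__(self, error_limit):
        self.error_limit = error_limit   # assumed limit; not from the report
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Integer division that reports the source line instead of crashing."""
        if denominator == 0:
            self.messages.append(
                f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if len(self.messages) >= self.error_limit:
                raise RunAborted("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            return 0                     # assumed result after a bad division
        return numerator // denominator


diag = Diagnostics(error_limit=3)
first = diag.divide(6, 3, line_number=118)    # normal division
second = diag.divide(6, 0, line_number=118)   # logged, run continues
```

The point of the sketch is that every failure leaves a message tied to a line number, so the user can locate the faulty statement even when the run is eventually stopped.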
[CONCOR-EDITOR execution statistics printout; the error messages on this page are illegible in this copy.]
Superimposed program text (source lines 173-187, partly legible):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 [operator illegible] 10 VARY [identifier illegible] FROM 1 BY 1
180 ALLOCATE [identifier illegible] = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182-185 LIST [arguments illegible]
186 END-DO
187 END-DO
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
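The array coordinate error in this test can be reproduced in outline with a Python sketch. The class EditArray and its error list are hypothetical; the sketch assumes 1-based subscripts and assumes that an out-of-bounds reference is reported with its source line number rather than allowed to corrupt storage.

```python
class EditArray:
    """A rows x cols table addressed with 1-based subscripts."""

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols
        self.cells = [[0] * cols for _ in range(rows)]
        self.errors = []                 # (line_number, row, col) per failure

    def update(self, row, col, value, line_number):
        """Store value, or record a coordinate error if out of bounds."""
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            self.errors.append((line_number, row, col))
            return False
        self.cells[row - 1][col - 1] = value
        return True


# A 10 x 10 subscript range driving updates of a 5 x 5 array,
# as in the test program above (line 181 is the UPDATE statement).
ta2 = EditArray(5, 5)
stored = 0
for r001 in range(1, 11):
    for r002 in range(1, 11):
        if ta2.update(r001, r002, r001 * r002, line_number=181):
            stored += 1
```

Only the 25 references that fall inside the 5 x 5 array succeed; the other 75 are recorded as errors against line 181.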
[CONCOR-EDITOR execution statistics printout; most of the page is illegible in this copy. The legible messages are JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215 and EDITOR -- NORMAL END OF JOB.]
Superimposed program text (source lines 214-218, partly legible):
214 IF IN-RECORD-COUNT > [value illegible]
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < [value illegible]
218 THEN LIST [message text illegible]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE Put a comment indicator ( ) in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character ( ), the keyword separator character (=), the command terminator character ( ), and the literal string delimiter ( )) was found as the first character in the string.
ACTION TAKEN Parsing began again with the next string.
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character ( ), the keyword separator character (=), the command terminator character ( ), and the literal string delimiter ( )) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the valid CONCOR character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character ( ), the keyword separator character (=), the command terminator character ( ), and the literal string delimiter ( )).
ERROR (DD-004) COMMENT INDICATOR ( ) IN STRING
EXPLANATION The comment indicator character ( ) was found embedded in a string.
APPENDIX D
CONCOR WORKSHOP January 7 - 18 1980 Marlow Heights MD
PARTICIPANTS
KENYA
James Midianga Government Computer Centre Central Bureau of Statistics PO Box 30266 Nairobi Kenya
PANAMA
Omar Rivera Rios Data Processing Specialist Contraloria General de la Republica de Panama Apartado Postal 5213 Panama 5 Panama
PHILIPPINES
Guida T Capellan OIC Computer Programming Division National Census and Statistics Office Solicarel Bldg 1 Magsaysay Blvd Sta Mesa Manila Philippines
THAILAND
Angsumal Sunalai National Statistical Office Bangkok 1 Thailand
EGYPT
Farag Sedky Mourad Ghaleb CAPMAS (Central Agency for Public Mobilization and Statistics) Nasr City Cairo Egypt
SAUDI ARABIA
Shargi R Al-Shargi National Computer Center PO Box 2534 Riyadh Saudi Arabia
Ahmed S Al-Ghamdi Dept Director of National Computer Center PO Box 2534 Riyadh Saudi Arabia
OTHER
Robert W O'Neal USREP/JECOR APO New York NY 09038
John N Adams Data Processing Advisor DACCA Department of State Washington DC 20520 or c/o UNDP PO Box 224 Dacca Bangladesh
Joe Quasney US Census Bureau Washington DC 20233
Howard Brunsman 5715 N Ninth St Arlington VA 22205
Larry Shiller and Julio Ortuzar Delta Systems Consultants Inc 264 Alhambra Circle Coral Gables FL 33134
Nicholas Ourusoff Burpee Hill Road New London NH 03257 (United Nations)
Frederic J Grant IV Systems Development Georgia World Congress Institute 580 North Omni International Atlanta Georgia 30303
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with that of other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin anywhere in columns 1-72.
The period preceding a statement signifies that it is merely a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP-keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of a CONCOR division and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and note that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) ASSERT (ASRT) The ASSERT command is the basic consistency editing command of CONCOR. The command now provides NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of displaying the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions, for the purpose of maintaining structured logic.
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) EXIT Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.
END END-IF END-THEN END-ELSE A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR not implemented No longer supported in the language.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented Option statements are no longer supported in the language. This command once specified what was essentially a GOTO DEPENDING ON construction.
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values.
IF IF The IF statement has the properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable in the 1978 version.
LET LET Virtually unchanged.
(none) LIST New command that generates specific messages, as well as the values of specific individual identifiers, for display in the report of edit statistics by questionnaire.
(none) OUTPUT New command that provides the means by which records are written to the output file.
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
RANGE (RNG) RANGE (RNG) A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as of the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) name changed Name changed to DRECODE. Virtually identical to the DRC command in the COCENTS system.
STOP-keyword STOP-keyword Unchanged.
(none) UNTIL DO END-DO One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) UPDATE (UPD) Virtually unchanged; it is the basic mechanism for maintaining the arrays involved in imputation processes.
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE statement, once the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated by specifying the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)
XRECODE (XREC) name changed Old command, which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. This means that the input data file must be preprocessed (presorted external to CONCOR) on the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command that enables users to specify unique report headings on the listings from this division.
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications needed to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and to control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save the values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output locations in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to the fields used in controlling summary statistics and in determining unique questionnaires. This would require modification of the system-level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma characters as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
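The delimiter rule proposed in item (14) can be made concrete with a short sketch. The Python fragment below is not part of CONCOR; the function name split_parameters and the use of None to stand for a system default are assumptions made purely for illustration. It treats spaces and commas alike as delimiters, while an empty slot between two commas still requests the default value.

```python
def split_parameters(text, default=None):
    """Split a parameter string on spaces and commas.

    An empty slot between two commas yields `default`, mirroring the
    rule that two commas must still be coded to request a default.
    (Hypothetical helper; not CONCOR's actual parser.)
    """
    params = []
    pieces = text.split(",")
    for index, piece in enumerate(pieces):
        tokens = piece.split()          # whitespace also separates values
        if tokens:
            params.extend(tokens)
        elif 0 < index < len(pieces) - 1:
            params.append(default)      # nothing coded between two commas
    return params


print(split_parameters("AGE 1 2"))
print(split_parameters("AGE,1,2"))
print(split_parameters("AGE,,2"))
```

In this sketch a leading or trailing comma is simply ignored; whether such a comma should also request a default is a design decision the item leaves open.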
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if item (13) of the Dictionary Division list above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
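The lookup table approach suggested in item (4) can be contrasted with a chain of comparisons in a few lines. The Python sketch below is illustrative only and is not the code that GENED emits; the function names and the sample target codes are hypothetical. It relies on the fact that DRECODE source values start at zero and ascend, so the source value can serve directly as the table index.

```python
# Hypothetical recode: source codes 0..4 map to these target codes.
TARGETS = [9, 1, 1, 2, 3]


def recode_by_comparison(value):
    """Chain of tests: one comparison per source code, tried in order."""
    for source, target in enumerate(TARGETS):
        if value == source:
            return target
    return None                      # value outside the recode list


def recode_by_table(value):
    """Lookup table: the source value itself is the table index."""
    if 0 <= value < len(TARGETS):
        return TARGETS[value]
    return None                      # value outside the recode list


# Both approaches agree; only the amount of work per value differs.
for code in range(-1, 6):
    assert recode_by_comparison(code) == recode_by_table(code)
print(recode_by_table(3))
```

The table version does a fixed amount of work per value, whereas the comparison chain grows with the number of codes, which is the execution-time saving the item anticipates.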
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished or tested.
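The check-in procedure proposed in item (2) amounts to comparing the master list of selected cases with the questionnaires actually received. The Python sketch below is one way to picture it under stated assumptions: the function name check_in, the report keys and the sample identifiers are all hypothetical, and nothing here reflects an existing CONCOR feature.

```python
def check_in(expected_ids, received_ids):
    """Compare a master list of sample cases with the cases received.

    Returns the identifiers that were expected but never arrived and
    those that arrived without being on the master list.
    (Hypothetical helper for illustration only.)
    """
    expected = set(expected_ids)
    received = set(received_ids)
    return {
        "missing": sorted(expected - received),
        "unexpected": sorted(received - expected),
    }


report = check_in(["Q001", "Q002", "Q003"], ["Q001", "Q003", "Q007"])
print(report)
```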
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR execution statistics page; the run date and counts are illegible. Legible column headings include CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND and OUTPUTS BY WRITE COMMAND.]
DIVISION BY ZERO LINE NUMBER 118 (message printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Superimposed program text, lines 111-125, is largely illegible. The legible portions show LIST statements announcing the test, R001 and R003 being set to zero, a LET statement at line 118 involving R003, R002 and R001 (the line cited by every DIVISION BY ZERO message), an END TEST OF OPERANDS AND MSG line, and the registers being reset to zero at line 125.]
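The diagnostic behaviour described in the comments, reporting the offending source line for each division by zero and aborting once too many have occurred, can be sketched as follows. This Python fragment is an illustration and not the generated EDITOR code; the class names, the error limit of six (chosen only to match the six messages on the printout) and the choice to return zero after a failed division are all assumptions.

```python
class RunAborted(Exception):
    """Raised when the division-by-zero error limit is reached."""


class ProtectedArithmetic:
    def __init__(self, error_limit=6):
        self.error_limit = error_limit   # assumed limit, not CONCOR's
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Divide, reporting the source line instead of crashing."""
        if denominator == 0:
            self.messages.append(
                f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if len(self.messages) >= self.error_limit:
                raise RunAborted(
                    "RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            return 0                     # assumed result after an error
        return numerator / denominator


editor = ProtectedArithmetic()
try:
    for _ in range(10):
        editor.divide(5, 0, line_number=118)
except RunAborted as stop:
    print(len(editor.messages), stop)
```

The point of the design is that every failure leaves a message naming the source line, so a run never ends without an explanation.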
[Printout: a second CONCOR-EDITOR execution statistics page, almost entirely illegible. It carries a series of error messages, each of which appears to give a questionnaire, a coordinate, a value and a line number.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
Program text (as far as legible):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO LIST
187 END-DO
[Printout: a third CONCOR-EDITOR execution statistics page with the same column headings (CONTROL AREA, SEQUENCE NUMBER, INPUTS READ, OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND); the counts are illegible.]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
Program text (as far as legible):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < (value illegible)
218 THEN LIST (message text illegible)
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
SAUDI ARABIA
Shargi R. Al-Shargi, National Computer Center, P.O. Box 2534, Riyadh, Saudi Arabia
Ahmed S. Al-Ghamdi, Dept. Director of National Computer Center, P.O. Box 2534, Riyadh, Saudi Arabia
OTHER
Robert W. O'Neal, USREP/JECOR, APO New York, NY 09038
John N. Adams, Data Processing Advisor, DACCA, Department of State, Washington, DC 20520
OR c/o UNDP, P.O. Box 224, Dacca, Bangladesh
Joe Quasney, US Census Bureau, Washington, DC 20233
Howard Brunsman, 5715 N. Ninth St., Arlington, VA 22205
Larry Shiller and Julio Ortuzar, Delta Systems Consultants, Inc., 264 Alhambra Circle, Coral Gables, FL 33134
Nicholas Ourusoff, Burpee Hill Road, New London, NH 03257 (United Nations)
Frederic J. Grant IV, Systems Development, Georgia World Congress Institute, 580 North Omni International, Atlanta, Georgia 30303
Rafael Samper, Mary Jo Keenan, Catherine B. Gleason, LAC/DR, Room 2246 NS, Washington, DC 20523
John Marshall, AIDSERDMPSE, Rm. 706C, SA-12, Washington, DC 20523
Susanne Bacon and Mary K. Friday, ISPC/Data Processing, US Census Bureau, Washington, DC 20233
Leo Dougherty, ISPC/World Census Staff, US Bureau of the Census, Washington, DC 20233
CONCOR WORKSHOP
January 7-18, 1979, Washington, DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP    January 16, 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978)    VERSION 2 (DECEMBER 1979)    COMMENTS
ALLOCATE (ALL)    ALLOCATE (ALLOC)    Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST)    ASSERT (ASRT)    The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, by utilizing the DUMPR and DUMPO keywords, it provides the user with the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE)    DRECODE (DRCD)    Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous XRECODE command.
(none)    EXIT    Provides the means to terminate execution (escape) of command statements within the UNTIL command.
END    END-IF, END-THEN, END-ELSE    A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR    not implemented    No longer supported in the language.
FILTER TYPE ( ) WITH    not implemented    Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE)    GRECODE (GRCD)    This new command provides a method for the group recoding of input values.
IF    IF    The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET    LET    Virtually unchanged.
(none)    LIST    New command which allows for the generation of specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.
(none)    OUTPUT    New command which provides the means by which records will be written to the output file.
RANGE (RNG)    RANGE (RNG)    As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC)    name changed    Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP --keyword    STOP --keyword    Unchanged.
(none)    UNTIL DO END-DO    DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD)    UPDATE (UPD)    Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT)    WRITE (WRT)    The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC)    name changed    Old command which has become the GRECODE command in the December 1979 version.
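The table describes GRECODE only as a method for the group recoding of input values, without giving its syntax. One plausible reading, mapping ranges of input values onto group codes, can be sketched in Python as follows. Everything in this fragment is an assumption made for illustration: the age bands, the group codes and the function name group_recode are hypothetical and are not taken from CONCOR.

```python
from bisect import bisect_right

# Hypothetical bands: each lower bound opens a group that runs up to
# (but not including) the next lower bound.
LOWER_BOUNDS = [0, 5, 15, 65]
GROUP_CODES = [1, 2, 3, 4]


def group_recode(value):
    """Return the group code whose band contains `value`."""
    if value < LOWER_BOUNDS[0]:
        return None                          # below every band
    position = bisect_right(LOWER_BOUNDS, value) - 1
    return GROUP_CODES[position]


print([group_recode(age) for age in (0, 4, 5, 40, 65, 99)])
```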
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications needed to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma characters as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program by eliminating some of the system protection features, such as division-by-zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance with respect to the new record's inclusion in the universe of observations and the number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implement a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is that the user must remember the action taken during the last execution of the command, when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 of the Dictionary Division list above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
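Item (4) above contrasts a lookup table with a search through the recode pairs. The difference can be sketched in Python as below. This is not the code GENED generates; the function names and the sample recode pairs are hypothetical, and the sketch assumes source values that start at zero and ascend, as described for DRECODE elsewhere in this report.

```python
def recode_linear(value, pairs):
    """Scan the (old, new) pairs one by one until the old value matches."""
    for old, new in pairs:
        if old == value:
            return new
    return None


def build_lookup(pairs):
    """Build a table indexed directly by the old value (values start at 0)."""
    table = [None] * (max(old for old, _ in pairs) + 1)
    for old, new in pairs:
        table[old] = new
    return table


def recode_lookup(value, table):
    """One bounds check and one index operation, whatever the table size."""
    if 0 <= value < len(table):
        return table[value]
    return None


if __name__ == "__main__":
    pairs = [(0, 9), (1, 1), (2, 1), (3, 2)]  # invented recode pairs
    table = build_lookup(pairs)
    print([recode_lookup(v, table) for v in range(4)])
```

The table is built once, so the saving grows with the number of records processed; the cost is that the table contents must be known when the program is generated, which is presumably why the item mentions extra communication between the generator programs.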
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports begins. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2 Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
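The weighting scheme in item (5) amounts to multiplying each sampled value by the number of units it represents before summing. A minimal Python sketch follows; the field names PERSONS and WEIGHT and the sample values are hypothetical, and the floating point weights assume that item (3) has also been adopted.

```python
def weighted_total(records, field, weight_field="WEIGHT"):
    """Estimate a population total from sample records and their weights."""
    return sum(record[field] * record[weight_field] for record in records)


if __name__ == "__main__":
    # Two sampled households standing for 10 and 25 households respectively.
    sample = [
        {"PERSONS": 4, "WEIGHT": 10.0},
        {"PERSONS": 2, "WEIGHT": 25.0},
    ]
    print(weighted_total(sample, "PERSONS"))
```

Unweighted counts from a sample describe only the sample itself, which is why the item ties meaningful totals to the sampling frame.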
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though these were implemented as reserved identifiers, their exact values and utility were unknown because of the poor documentation of the (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout. The run date and the counts under the column headings (control area, inputs read, outputs by OUTPUT command, outputs by WRITE command, sequence number, questionnaires, records) are not legible in the source.]
Diagnostic messages printed:
DIVISION BY ZERO LINE NUMBER 118 (printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
Superimposed program text (legible fragments only; damaged operands are marked):
111 LIST R003 = R003
112 LIST
114 LIST DIVISION BY ZERO ERROR [illegible] FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS R001 R003
118 LET R003 = R002 [operator illegible] R001
122 END TEST OF OPERANDS AND MSG
REGISTERS RESET TO ZERO
125 LET R001 [illegible] R003 = [illegible]
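The behavior described in the comments (report the source line of each division by zero, then abort the run) can be imitated with the short Python sketch below. It is not CONCOR code. The statement table, the register names, and above all the abort threshold of six are assumptions: the threshold is simply read off the six messages on the printout and is not taken from CONCOR documentation.

```python
def run_with_diagnostics(statements, questionnaires, max_zero_div=6):
    """Run (line_number, action) statements on each questionnaire's registers.

    Division by zero is logged with its line number instead of crashing;
    the run is aborted once max_zero_div such errors have occurred.
    """
    log = []
    zero_div_count = 0
    for registers in questionnaires:
        for line_number, action in statements:
            try:
                action(registers)
            except ZeroDivisionError:
                zero_div_count += 1
                log.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
                if zero_div_count >= max_zero_div:
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    return log


def set_zero(registers):
    # Stands in for a statement that sets R001 and R003 to zero.
    registers["R001"] = 0
    registers["R003"] = 0


def divide(registers):
    # Stands in for the failing statement on line 118.
    registers["R003"] = registers["R002"] // registers["R001"]


STATEMENTS = [(115, set_zero), (118, divide)]

if __name__ == "__main__":
    data = [{"R001": 1, "R002": 5, "R003": 0} for _ in range(10)]
    for message in run_with_diagnostics(STATEMENTS, data):
        print(message)
```

Keeping the line number with each statement is what makes the diagnostic useful: the user is pointed at the offending source line rather than at an unexplained abnormal end.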
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout for the array test. The statistics and diagnostic lines on this page are not legible in the source.]
Comments: In this example R001, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
Superimposed program text (reconstructed where the scan is damaged):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 [partly illegible]
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO
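The array coordinate error in this test can be reproduced in a few lines of Python: a loop over a 10 x 10 range of subscripts tries to update a 5 x 5 table. This is a sketch only. The message wording is invented because the actual CONCOR diagnostic is illegible in the source, and unlike the CONCOR run this sketch keeps going after each error so that the failures can be counted.

```python
def checked_update(table, row, col, value, line_number, log):
    """Store value at 1-based (row, col); log the reference if it is invalid."""
    rows, cols = len(table), len(table[0])
    if 1 <= row <= rows and 1 <= col <= cols:
        table[row - 1][col - 1] = value
        return True
    # Hypothetical message text; the real diagnostic is not legible.
    log.append(f"SUBSCRIPT OUT OF RANGE ({row},{col})  LINE NUMBER {line_number}")
    return False


def run_array_test():
    ta1 = [[r * 10 + c for c in range(1, 11)] for r in range(1, 11)]  # 10 x 10
    ta2 = [[0] * 5 for _ in range(5)]                                  # 5 x 5
    log = []
    for r001 in range(1, 11):
        for r002 in range(1, 11):
            checked_update(ta2, r001, r002, ta1[r001 - 1][r002 - 1], 181, log)
    return ta2, log


if __name__ == "__main__":
    ta2, log = run_array_test()
    print(len(log), "invalid references")
```

Of the 100 subscript pairs generated by the two loops, only the 25 that fall inside the smaller table are valid, so every pass with a row or column above 5 is a nonexistent element reference.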
[Execution Statistics printout. The column headings (control area, inputs read, outputs by OUTPUT command, outputs by WRITE command, sequence number, questionnaires, records) are legible in the source but the counts are not.]
Messages printed:
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
Superimposed program text (legible fragments only):
214 IF IN-RECORD-COUNT > [value illegible]
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < [value illegible]
218 THEN LIST [message partly illegible]
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
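The first two diagnostics can be imitated with a small Python check. This is a simplified sketch, not the CONCOR parser: it treats whitespace-separated tokens as strings, and only the characters that are legible in the guide (A-Z, 0-9, hyphen, plus, comma and the equals sign) are certain. The remaining punctuation characters did not survive in this copy of the guide, so the two placeholder characters in ASSUMED_LEGAL are assumptions.

```python
import string

# Characters that are legible in the guide's description of the legal set.
KNOWN_LEGAL = set(string.ascii_uppercase + string.digits + "-+,=")
# Placeholders for the punctuation that is illegible in the source (assumed).
ASSUMED_LEGAL = set(".'")


def diagnose(line):
    """Return diagnostic codes for one Dictionary Division command line."""
    if not line.strip():
        return ["DD-001"]  # blank command statement line
    legal = KNOWN_LEGAL | ASSUMED_LEGAL
    # One DD-002 for every token whose first character is not legal.
    return ["DD-002" for token in line.split() if token[0] not in legal]


if __name__ == "__main__":
    for sample in ["", "DEFINE-RECORD NAME=HOUSING", "DEFINE #BAD"]:
        print(repr(sample), diagnose(sample))
```

As in the guide, a blank line produces only a warning code and an illegal first character is reported per string, so one bad token does not hide problems later on the same line.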
Rafael Samper Mary Jo Keenan Catherine B Gleason LACDR Room 2246 NS Washington DC 20523
John Marshall AIDSERDMPSE Rm 706C SA-12 Washington DC 20523
Susanne Bacon and Mary K Friday ISPCData Processing US Census Bureau Washington DC 20233
Leo Dougherty ISPCWorld Census Staff US Bureau of the Census Washington DC 20233
CONCOR WORKSHOP
January 7-18 1979 Washington DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is only a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 (DECEMBER 1978) | VERSION 2 (DECEMBER 1979) | COMMENTS
ALLOCATE (ALL) | ALLOCATE (ALLOC) | Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) | ASSERT (ASRT) | The ASSERT command is the basic consistency edit command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) | DRECODE (DRCD) | Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) | EXIT | Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.
END | END-IF END-THEN END-ELSE | A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR | not implemented | No longer supported in the language.
FILTER TYPE ( ) WITH | not implemented | Option statements are no longer supported in the language. This command once specified essentially a GOTO depending on construction.
(see XRECODE) | GRECODE (GRCD) | This new command provides a method for the group recoding of input values.
IF | IF | The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET | LET | Virtually unchanged.
(none) | LIST | New command which allows for the generation of specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.
(none) | OUTPUT | New command which provides the means by which records will be written to the output file.
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP keyword | STOP keyword | Unchanged.
(none) | UNTIL DO END-DO | One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
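The table describes ALLOCATE and UPDATE as the two halves of the imputation mechanism: UPDATE maintains an array, and ALLOCATE assigns values from it. The report does not spell out the algorithm, so the Python sketch below assumes a conventional hot-deck arrangement in which the array holds the last valid value seen for each category. The field names, the age limits and the starting deck values are all hypothetical.

```python
def edit_ages(records, deck, low=0, high=98):
    """Impute out-of-range AGE values from a deck indexed by SEX.

    Valid ages refresh the deck (the UPDATE-like step); invalid ages are
    replaced from it (the ALLOCATE-like step). Returns the impute count.
    """
    imputes = 0
    for record in records:
        key = record["SEX"]
        if low <= record["AGE"] <= high:
            deck[key] = record["AGE"]      # keep the most recent valid donor
        else:
            record["AGE"] = deck[key]      # assign the stored donor value
            imputes += 1
    return imputes


if __name__ == "__main__":
    deck = {1: 30, 2: 28}                  # invented cold-deck starting values
    people = [
        {"SEX": 1, "AGE": 40},
        {"SEX": 1, "AGE": -1},
        {"SEX": 2, "AGE": 999},
    ]
    print(edit_ages(people, deck), people)
```

Counting the imputes as they happen is what makes a later tolerance check possible, since the number of changes can be compared with the number of items examined.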
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
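Because records for one control area must arrive together, a quick check before the run can save a wasted edit pass. The Python sketch below is an assumed preprocessing aid, not part of CONCOR: it scans the area codes in file order and reports the position of the first record that reopens an area already closed.

```python
def first_noncontiguous_area(area_codes):
    """Return the index where a previously closed area reappears, else None."""
    seen = set()
    previous = object()  # sentinel that compares unequal to every real code
    for position, code in enumerate(area_codes):
        if code != previous:
            if code in seen:
                return position
            seen.add(code)
            previous = code
    return None


if __name__ == "__main__":
    print(first_noncontiguous_area([1, 1, 2, 2, 3]))
    print(first_noncontiguous_area([1, 2, 1]))
```

The check does not require the file to be in ascending order, only that each area's records be contiguous, which is the condition the comment above actually states.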
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increase the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. The user would have to be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires; otherwise the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed before CONCOR gives the user access to the data items on the questionnaire. Many users now record such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program by eliminating some of the system protection features, such as division-by-zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance with respect to the new record's inclusion in the universe of observations and the number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implement a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is that the user must remember the action taken during the last execution of the command, when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 of the Dictionary Division list above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports begins. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
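The weighting scheme proposed in item (5) amounts to multiplying each sampled value by the number of population units the case represents. A minimal Python sketch follows; the field names `HOUSEHOLD_SIZE` and `WEIGHT` and the function `weighted_summary` are assumptions made for illustration, not part of CONCOR.

```python
def weighted_summary(records, field, weight_field="WEIGHT"):
    """Return (weighted total, weighted mean) of `field` over the records."""
    total_weight = sum(r[weight_field] for r in records)
    weighted_total = sum(r[field] * r[weight_field] for r in records)
    return weighted_total, weighted_total / total_weight

sample = [
    {"HOUSEHOLD_SIZE": 4, "WEIGHT": 100},  # this case stands for 100 households
    {"HOUSEHOLD_SIZE": 2, "WEIGHT": 300},  # this case stands for 300 households
]
total, mean = weighted_summary(sample, "HOUSEHOLD_SIZE")
print(total, mean)
```

An unweighted mean of the two sample cases would be 3.0; the weighted mean reflects that the smaller household type is three times as common in the frame.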
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
Old CONCOR
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
New CONCOR
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Notes
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown owing to the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
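The distinction drawn in the notes, between counters that are re-initialized at each control area break and cumulative run totals, can be illustrated with a short Python sketch. The function `count_records` and the field name `AREA` are hypothetical; the sketch assumes the input is already ordered by control area.

```python
def count_records(records, area_key="AREA"):
    """Return (area, records_in_area, run_total) at each control-area break."""
    summaries = []
    current_area, in_area, run_total = None, 0, 0
    for rec in records:
        if current_area is not None and rec[area_key] != current_area:
            summaries.append((current_area, in_area, run_total))
            in_area = 0                  # per-area counter is re-initialized
        current_area = rec[area_key]
        in_area += 1
        run_total += 1                   # cumulative counter keeps growing
    if current_area is not None:
        summaries.append((current_area, in_area, run_total))
    return summaries

print(count_records([{"AREA": "A"}, {"AREA": "A"}, {"AREA": "B"}]))
```

The second element of each tuple behaves like the per-area counters described above; the third shows what a cumulative counter would add.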
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page with lines 111-125 of the test program superimposed. Legible diagnostics: DIVISION BY ZERO, LINE NUMBER 118 (repeated for each occurrence), followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason (ABEND) without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
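The behaviour described in the comments, a diagnostic naming the offending source line and an abort once such errors accumulate, can be sketched in Python as follows. This is an illustration only, not the EDITOR's code: the error limit `max_errors`, the zero placeholder result, and the use of integer division are assumptions, and the message wording merely imitates the printout.

```python
class RunAborted(Exception):
    """Raised when too many division-by-zero errors have accumulated."""

def checked_divide(numerator, denominator, line_number, log, max_errors=5):
    """Integer-divide, logging the source line number on a zero divisor."""
    if denominator == 0:
        log.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
        if len(log) >= max_errors:       # hypothetical error limit
            log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            raise RunAborted(log[-1])
        return 0                         # assumed placeholder so the run can continue
    return numerator // denominator

log = []
try:
    for _ in range(10):                  # the same bad statement, run repeatedly
        checked_divide(7, 0, line_number=118, log=log)
except RunAborted:
    pass
print("\n".join(log))
```

Passing the line number into the check is what lets the diagnostic point back at the user's source statement rather than at the generated program.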
[Printout: second CONCOR-EDITOR EXECUTION STATISTICS page, largely illegible, with lines 173-187 of the test program superimposed. The legible program text shows nested UNTIL ... VARY ... DO loops (lines 175-187) using R001 and R002 as subscripts, an ALLOCATE from TA1 (line 180), an UPDATE of TA2 (line 181), and several LIST statements.]
Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
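The array coordinate error in this example can be reproduced in miniature. The Python sketch below is an analogy, not CONCOR code: the `Table` class, the exception name, and the message format are hypothetical, and the line numbers 180 and 181 simply mirror the listing.

```python
class ArrayCoordinateError(Exception):
    """Raised when a subscript falls outside the declared table size."""

class Table:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.cells = [[0] * cols for _ in range(rows)]

    def _locate(self, row, col, line_number):
        # Subscripts are 1-based, matching loops that vary from 1.
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise ArrayCoordinateError(
                f"SUBSCRIPT ({row},{col}) OUTSIDE {self.rows} X {self.cols} "
                f"TABLE  LINE NUMBER {line_number}")
        return row - 1, col - 1

    def get(self, row, col, line_number):
        r, c = self._locate(row, col, line_number)
        return self.cells[r][c]

    def update(self, row, col, value, line_number):
        r, c = self._locate(row, col, line_number)
        self.cells[r][c] = value

ta1, ta2 = Table(10, 10), Table(5, 5)
message = None
try:
    for r001 in range(1, 11):
        for r002 in range(1, 11):
            value = ta1.get(r001, r002, line_number=180)
            ta2.update(r001, r002, value, line_number=181)
except ArrayCoordinateError as err:
    message = str(err)
print(message)
```

The first five updates in row 1 succeed; the sixth refers to a column that the 5 x 5 table does not have, which is the kind of nonexistent element reference the comments describe.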
[Printout: third CONCOR-EDITOR EXECUTION STATISTICS page with lines 214-218 of the test program superimposed. Legible messages: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string
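The first two diagnostics above can be mimicked with a small check. The Python sketch below is a simplification under stated assumptions: it tests only the first non-blank character of a line rather than of each string, the function name `check_line` is hypothetical, and the legal set contains only the characters that are legible in this excerpt (the delimiter, command terminator, comment indicator, and literal string delimiter are missing from the text and would have to be supplied by the caller).

```python
import string

# Only the characters legible in the excerpt: A-Z, 0-9, -, +, comma, and =.
LEGIBLE_LEGAL_CHARS = set(string.ascii_uppercase + string.digits + "-+,=")

def check_line(line, legal=LEGIBLE_LEGAL_CHARS):
    """Return a diagnostic code for one command statement line, or None."""
    if line.strip() == "":
        return "DD-001"              # warning: blank command statement line
    first = line.lstrip()[0]
    if first not in legal:
        return "DD-002"              # error: illegal first character in string
    return None

for text in ("   ", "?ABC", "DEFINE-RECORD"):
    print(repr(text), check_line(text))
```

Taking the legal set as a parameter keeps the sketch usable once the full character set is known from the Diagnostic Messages Guide itself.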
CONCOR WORKSHOP
January 7-18, 1979, Washington, DC
Staff
Robert R Bair
Luis Garcia
David Malkovsky
Sandra Mansfield
Selma Sawaya
Vivian Toro
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP, January 16, 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
    DEFINE INPUT-FILE
    DEFINE ERROR-FILE
    DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
    DEFINE RECORD-TYPE=
    DEFINE NEW-DATA
    DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
    DICTIONARY-ATTRIBUTES-SECTION
        DICTIONARY-NAME
    FILE-SECTION
        INPUT-FILE
        OUTPUT-FILE
        WRITE-FILE
        ERROR-FILE
    IDENTIFICATION-CONTROL-SECTION
        AREA-CONTROL
        QUESTIONNAIRE-CONTROL
        RECORD-CONTROL
    INPUT-RECORD-SECTION
        DEFINE-RECORD
    WORKING-DATA-SECTION
        NEW-DATA
        ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
    PROLOG
    FILTER TYPE ( )
    EPILOG
        ALLOCATE (ALL), ASSERT (AST)
        END, ERROR
        FILTER TYPE ( ) WITH
        IF, LET
        RANGE (RNG), RECODE (REC), STOP-keyword
        UPDATE (UPD), WRITE (WRT), XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
    REPORT-CONTROL-SECTION
        COUNT-IMPUTES
        GENERATE-EDIT-STATISTICS
    EDIT-SPECIFICATION-SECTION
        PROLOG-ROUTINE
        FILTER-ROUTINE (name-of-record-type)
        EPILOG-ROUTINE
            ALLOCATE (ALLOC), ASSERT (ASRT), DRECODE (DRCD), END-DIVISION
            EXIT
            GRECODE (GRCD), IF, LET, LIST, OUTPUT, RANGE (RNG)
            STOP, UNTIL, UPDATE (UPD), WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
(Version 1 = December 1978; Version 2 = December 1979)

Version 1: ALLOCATE (ALL)
Version 2: ALLOCATE (ALLOC)
Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

Version 1: ASSERT (AST)
Version 2: ASSERT (ASRT)
Comments: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

Version 1: (see RECODE)
Version 2: DRECODE (DRCD)
Comments: Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.

Version 1: (none)
Version 2: EXIT
Comments: Provides the means to terminate execution of (escape from) command statements within the UNTIL command.

Version 1: END
Version 2: END-IF, END-THEN, END-ELSE
Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

Version 1: ERROR
Version 2: not implemented
Comments: No longer supported in the language.

Version 1: FILTER TYPE ( ) WITH
Version 2: not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO depending on construction.

Version 1: (see XRECODE)
Version 2: GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1: IF
Version 2: IF
Comments: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1: LET
Version 2: LET
Comments: Virtually unchanged.

Version 1: (none)
Version 2: LIST
Comments: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.

Version 1: (none)
Version 2: OUTPUT
Comments: New command which provides the means by which records will be written to the output file.

Version 1: RANGE (RNG)
Version 2: RANGE (RNG)
Comments: A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

Version 1: RECODE (REC)
Version 2: name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1: STOP-keyword
Version 2: STOP-keyword
Comments: Unchanged.

Version 1: (none)
Version 2: UNTIL DO END-DO
Comments: One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

Version 1: UPDATE (UPD)
Version 2: UPDATE (UPD)
Comments: Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.

Version 1: WRITE (WRT)
Version 2: WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)

Version 1: XRECODE (XREC)
Version 2: name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.
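The comments on ALLOCATE and UPDATE describe them as jointly providing the imputation mechanism: one maintains arrays, the other assigns values from them. A common way to realize such a pairing is a hot-deck scheme, sketched below in Python. This is an assumed interpretation for illustration, not CONCOR syntax or the documented algorithm; the field names, the 0-99 age range, and the starting ("cold deck") values are hypothetical.

```python
def edit_age(records, hot_deck):
    """Impute a missing or out-of-range AGE from the last valid donor of the same SEX."""
    imputes = 0
    for rec in records:
        age = rec.get("AGE")
        if age is not None and 0 <= age <= 99:
            hot_deck[rec["SEX"]] = age           # plays the role of UPDATE
        else:
            rec["AGE"] = hot_deck[rec["SEX"]]    # plays the role of ALLOCATE
            imputes += 1
    return imputes

deck = {1: 30, 2: 28}                            # hypothetical starting values
people = [
    {"SEX": 1, "AGE": 41},
    {"SEX": 2, "AGE": None},                     # no donor yet: starting value used
    {"SEX": 1, "AGE": 140},                      # out of range: donor value 41 used
]
print(edit_age(people, deck), people)
```

Counting the imputations as they happen is what makes statistics such as those requested through GENERATE-EDIT-STATISTICS possible.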
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
    DISPLAY-CONTROL-SECTION
        DISPLAY-EDIT-STATISTICS
    TOLERANCE-CONTROL-SECTION
        ERROR-RATE-CHECK
        REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
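Why the input must be presorted on the AREA-CONTROL field can be seen with any break-driven grouping. The Python sketch below uses `itertools.groupby`, which, like the reporting described above, starts a new group at every change of key and cannot merge noncontiguous records. The function name and the field `AREA` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def area_record_counts(records, key="AREA"):
    """Count records per control area; a new group starts at every key change."""
    return [(area, sum(1 for _ in group))
            for area, group in groupby(records, key=itemgetter(key))]

unsorted_file = [{"AREA": "01"}, {"AREA": "02"}, {"AREA": "01"}]
print(area_record_counts(unsorted_file))                       # area 01 reported twice
print(area_record_counts(sorted(unsorted_file, key=itemgetter("AREA"))))
```

With unsorted input the same area produces two separate report groups; sorting first restores one group per area.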
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
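Item (13) proposes attaching valid ranges to the dictionary declarations so that a range check can run before any user edit logic. A minimal Python sketch of that idea follows; the dictionary layout, the identifiers `AGE` and `SEX`, and the function name are hypothetical and do not reflect the DEFINE-RECORD syntax.

```python
# Hypothetical dictionary entries: user-identifier -> (lowest, highest) valid value.
DICTIONARY_RANGES = {"AGE": (0, 99), "SEX": (1, 2)}

def automatic_range_check(record, ranges=DICTIONARY_RANGES):
    """Return the identifiers whose values fall outside their declared range."""
    return [name for name, (low, high) in ranges.items()
            if not low <= record[name] <= high]

print(automatic_range_check({"AGE": 34, "SEX": 2}))    # nothing out of range
print(automatic_range_check({"AGE": 140, "SEX": 3}))   # both identifiers flagged
```

Because the ranges live in the dictionary rather than in comment statements, other software, such as a tabulation package, could read the same declarations.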
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
C 0 N C o R - E D I T O R EXECUT ION S-TA-T1 ST I CS RUN DATEt 517I
APPENDIX I
CONTROL AREA 0 INPUTS RELD OUTPUTS BY OUTPUTCOHMAND OUTPUTS BY W0ITECOMP4JD NO VALI 0 SEQUENCE NUMBER 00000 00 0 RECTYPES 0
OUESTIONNAIRES RECORDS 0 QUEST KEYWORD- 0 ECORDS- RECORDS FORQUFSTshy
bullDIVISION BY ZERO LINE NUMBER i18 to DIVISION BY ZERO LINE NUMBER 118 bo DIVISION BY ZERO 00 LINE NUMBER 118
Oboo DIVISION BY ZERO 00 LINE NUMBER 118 LTNE |UJMP 000 DIVISION BY ZERO 000 LINE NUMBER 118
bo DIVISION BY ZERO 0 LINE NUMBER 118 00 RUN ABORTED DUE TO DIVISION BY ZERO ERRORS 00 1it LIST IR003 = Ro3t
1 6 42
112 LIST
Comments One of the most sever criticisms 114 LIST IDIVISIION BY ZErO ERROR Cw FOLLOWS of the old CONCOR package was that ]l LET RO01 003 = O it would cease processing for no
apparent reason--ABEND without generating any messages The most )16 LIST iRO01 R003 SET TO ZEROS Pou19 Ron3 conmon types of cessations are division by zero array coordinate 117 LIST RO0 0 = errors and user specified terminations A program to deliberately 118 LET P003 = R002 RO01 test CONCORS ability to providediagnostics was written The system 119 LIST LIST R003 R002RO01 =1 R0031 perfoned exceptionally well and correctly provided the line numbers in the 120 source program which caused the data exceptions The program text has been 121 superimposed upon the Execution Statistics page for illustration purposes 122 END TEST OF OPERANDS AND MSG
123
124
1 5 PEGISTERS RESET TO ZERO
125 LET R001 RO00 R003 = nt
[Execution Statistics page showing array coordinate error messages. Each message gives the QUESTIONNAIRE, COORDINATE, VALUE and LIMIT of the invalid array reference; the message text is illegible in the source.]
Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Non-existent element references caused the error. The program text has been superimposed on this page for illustration purposes.
Program text superimposed on the page (partly illegible in the source):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE R003 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST [illegible] R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO
187 END-DO
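The coordinate error in this example can be illustrated with a small bounds-checked table in Python. This is a sketch under stated assumptions: the `Table` class and the exact message layout are hypothetical, and only the COORDINATE, VALUE and LIMIT fields are taken from the report page.

```python
# Hypothetical sketch: a bounds-checked table that records a diagnostic
# instead of failing silently when a coordinate is out of range.

class Table:
    def __init__(self, name, rows, cols):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[0] * cols for _ in range(rows)]
        self.errors = []

    def update(self, row, col, value):
        """Store value at 1-based (row, col); log and skip invalid coordinates."""
        for axis, index, limit in ((1, row, self.rows), (2, col, self.cols)):
            if not 1 <= index <= limit:
                self.errors.append(
                    f"{self.name} COORDINATE={axis} VALUE={index} LIMIT={limit}")
                return False
        self.cells[row - 1][col - 1] = value
        return True


ta1 = Table("TA1", 10, 10)   # the larger source table
ta2 = Table("TA2", 5, 5)     # the smaller table being updated
for r in range(1, 11):
    for c in range(1, 11):
        ta2.update(r, c, ta1.cells[r - 1][c - 1])

print(len(ta2.errors), ta2.errors[0])
```

Looping over the 10 x 10 index space against a 5 x 5 table produces one diagnostic for every reference that falls outside the smaller table.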
[Execution Statistics page. Column headings: CONTROL AREA, INPUTS READ, OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND, SEQUENCE NUMBER, QUESTIONNAIRES, RECORDS.]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
Program text superimposed on the page (partly illegible in the source):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < [illegible]
218 THEN LIST [illegible]
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character () and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
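The first-character check behind message DD-002 can be sketched as a simple set-membership test. This is an illustration only, not the parser's actual code: the function name `first_char_diagnostic` is hypothetical, and because the delimiter, command terminator and literal string delimiter characters do not survive in the guide text above, they are left to an `extra_legal` parameter rather than guessed.

```python
import string

# Only the legal characters that are readable in the guide text are listed;
# the remaining special characters must be supplied by the caller.
KNOWN_LEGAL = set(string.ascii_uppercase + string.digits + "-+,=")

def first_char_diagnostic(token, extra_legal=""):
    """Return 'DD-002' if the token starts with an illegal character, else None."""
    legal = KNOWN_LEGAL | set(extra_legal)
    if token and token[0] not in legal:
        return "DD-002"
    return None


print(first_char_diagnostic("?NAME"))
print(first_char_diagnostic("NAME"))
```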
APPENDIX E
CONCOR EVALUATION FORM
CONCOR WORKSHOP, January 16, 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1. What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries?
2. If you are familiar with previous versions of CONCOR, how does this current version compare to them?
3. How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data with which you are familiar?
4. How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software?
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments.
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
  DEFINE INPUT-FILE
  DEFINE ERROR-FILE
  DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
  DEFINE RECORD-TYPE=
  DEFINE NEW-DATA
  DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
  DICTIONARY-ATTRIBUTES-SECTION
    DICTIONARY-NAME
  FILE-SECTION
    INPUT-FILE
    OUTPUT-FILE
    WRITE-FILE
    ERROR-FILE
  IDENTIFICATION-CONTROL-SECTION
    AREA-CONTROL
    QUESTIONNAIRE-CONTROL
    RECORD-CONTROL
  INPUT-RECORD-SECTION
    DEFINE-RECORD
  WORKING-DATA-SECTION
    NEW-DATA
    ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
  PROLOG
  FILTER TYPE ( )
  EPILOG
    ALLOCATE (ALL)
    ASSERT (AST)
    END
    ERROR
    FILTER TYPE ( ) WITH
    IF
    LET
    RANGE (RNG)
    RECODE (REC)
    STOP --keyword
    UPDATE (UPD)
    WRITE (WRT)
    XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
  REPORT-CONTROL-SECTION
    COUNT-IMPUTES
    GENERATE-EDIT-STATISTICS
  EDIT-SPECIFICATION-SECTION
    PROLOG-ROUTINE
    FILTER-ROUTINE (name-of-record-type)
    EPILOG-ROUTINE
      ALLOCATE (ALLOC)
      ASSERT (ASRT)
      DRECODE (DRCD)
      END-DIVISION
      EXIT
      GRECODE (GRCD)
      IF
      LET
      LIST
      OUTPUT
      RANGE (RNG)
      STOP
      UNTIL
      UPDATE (UPD)
      WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
(Version 1: December 1978. Version 2: December 1979.)

Version 1: ALLOCATE (ALL)
Version 2: ALLOCATE (ALLOC)
Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

Version 1: ASSERT (AST)
Version 2: ASSERT (ASRT)
Comments: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

Version 1: (see RECODE)
Version 2: DRECODE (DRCD)
Comments: Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.

Version 1: (none)
Version 2: EXIT
Comments: Provides the means to terminate execution (escape) of command statements within the UNTIL command.

Version 1: END
Version 2: END-IF, END-THEN, END-ELSE
Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

Version 1: ERROR
Version 2: not implemented
Comments: No longer supported in the language.

Version 1: FILTER TYPE ( ) WITH
Version 2: not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.

Version 1: (see XRECODE)
Version 2: GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1: IF
Version 2: IF
Comments: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1: LET
Version 2: LET
Comments: Virtually unchanged.

Version 1: (none)
Version 2: LIST
Comments: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.

Version 1: (none)
Version 2: OUTPUT
Comments: New command which provides the means by which records will be written to the output file.

Version 1: RANGE (RNG)
Version 2: RANGE (RNG)
Comments: As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

Version 1: RECODE (REC)
Version 2: name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1: STOP --keyword
Version 2: STOP --keyword
Comments: Unchanged.

Version 1: (none)
Version 2: UNTIL DO END-DO
Comments: DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

Version 1: UPDATE (UPD)
Version 2: UPDATE (UPD)
Comments: Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.

Version 1: WRITE (WRT)
Version 2: WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

Version 1: XRECODE (XREC)
Version 2: name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.
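The way UPDATE and ALLOCATE work together for imputation can be sketched as a simple hot deck in Python. This is an illustrative assumption about the general technique, not the code CONCOR generates: the `HotDeck` class, the field names (`sex`, `age_group`, `marital`) and the valid codes are all hypothetical.

```python
# Hypothetical sketch of hot-deck imputation: the UPDATE step keeps the most
# recent valid value in an array cell, and the ALLOCATE step copies that
# value into a field that failed its edit.

class HotDeck:
    def __init__(self, valid_codes):
        self.valid_codes = set(valid_codes)
        self.cells = {}      # (sex, age_group) -> last valid value seen
        self.imputes = 0     # number of values allocated so far

    def edit(self, record, field="marital"):
        key = (record["sex"], record["age_group"])
        if record[field] in self.valid_codes:
            self.cells[key] = record[field]      # UPDATE: refresh the donor
        elif key in self.cells:
            record[field] = self.cells[key]      # ALLOCATE: impute from donor
            self.imputes += 1
        return record


deck = HotDeck(valid_codes=[1, 2, 3, 4])
records = [
    {"sex": 1, "age_group": 3, "marital": 2},
    {"sex": 1, "age_group": 3, "marital": 9},    # invalid code, same cell
]
edited = [deck.edit(dict(r)) for r in records]
print(edited[1]["marital"], deck.imputes)
```

The `imputes` counter stands in for the kind of allocation statistics the comments above attribute to GENERATE-EDIT-STATISTICS. A record with no donor yet is left unchanged in this sketch.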
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
  DISPLAY-CONTROL-SECTION
    DISPLAY-EDIT-STATISTICS
  TOLERANCE-CONTROL-SECTION
    ERROR-RATE-CHECK
    REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
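The presorting requirement follows from control-break processing, in which a report is closed each time the control field changes. A minimal Python sketch, assuming a hypothetical `area` field in place of the AREA-CONTROL data field, shows what happens when records for one area are not contiguous:

```python
from itertools import groupby

def area_reports(records):
    """Yield (area, record_count) once per run of consecutive equal areas."""
    for area, group in groupby(records, key=lambda rec: rec["area"]):
        yield area, sum(1 for _ in group)


unsorted_input = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
sorted_input = sorted(unsorted_input, key=lambda rec: rec["area"])

print(list(area_reports(unsorted_input)))
print(list(area_reports(sorted_input)))
```

With unsorted input, area 01 is reported twice as two separate control areas. Sorting on the control field beforehand gives one report per area.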
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value to move to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
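The automatic range check proposed in item (13) can be sketched as follows. This is a hedged illustration of the idea only: the `DICTIONARY` mapping, the identifiers `AGE` and `SEX` and their ranges are hypothetical, and nothing here reflects actual DEFINE-RECORD syntax.

```python
# Hypothetical dictionary entries: identifier -> (low, high) valid range,
# as they might be declared alongside each user-identifier.
DICTIONARY = {"AGE": (0, 98), "SEX": (1, 2)}

def range_check(record, dictionary=DICTIONARY):
    """Return the identifiers whose values fall outside their declared range."""
    failures = []
    for name, (low, high) in dictionary.items():
        value = record.get(name)
        if value is None or not low <= value <= high:
            failures.append(name)
    return failures


print(range_check({"AGE": 130, "SEX": 2}))
```

Because the ranges live in the dictionary rather than in comment statements, the same declarations could be read by other software, which is the benefit the item suggests for tabulation programs.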
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
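The lookup table approach suggested in item (4) can be contrasted with a comparison chain in a short Python sketch. The recode table and both function names are assumptions for illustration, and the comparison chain is only one plausible alternative, since the report does not say how the generated code currently performs the recode.

```python
# Assumed recode: input codes 0..5 map to output codes by position, which
# matches DRECODE's requirement that inputs start at zero and ascend.
RECODE_TABLE = [0, 1, 1, 2, 2, 3]

def drecode_linear(value):
    """Comparison-chain style: test each input code in turn."""
    for code, recoded in enumerate(RECODE_TABLE):
        if value == code:
            return recoded
    return None

def drecode_lookup(value):
    """Lookup-table style: use the input code directly as an index."""
    if isinstance(value, int) and 0 <= value < len(RECODE_TABLE):
        return RECODE_TABLE[value]
    return None


print(drecode_linear(3), drecode_lookup(3))
```

Both functions give the same result, but the lookup version does a single bounds test and one index operation regardless of how many codes the table holds.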
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
     VERSION 1                     VERSION 2
 1   EOF-FLAG                      EOF-FLAG
 2   ERROR-IN-QUESTIONNAIRE        ERRORS-IN-QUESTIONNAIRE
 3   CURRENT-POINTER-VALUE
 4   TYPE-COUNT ( )                TYPE-COUNT ( )
 5   RECORD-COUNT                  IN-RECORD-COUNT
 6   QUESTIONNAIRE-COUNT           IN-QUESTIONNAIRE-COUNT
 7   RECORDS-IN-STORE              RECORDS-IN-STORE
 8   INVALID-RECORD-TYPE-COUNT     INVALID-RECORD-TYPE-COUNT
 9   INCOMPLETE-FLAG               INCOMPLETE-FLAG
10   CONTINUATION-FLAG             CONTINUATION-FLAG
11   INVALID-RECORD-FLAG
                                   OUT-OF-RANGE
                                   NOT-NUMBER
                                   BLANK
     (new internal counters)       OUT-RECORD-COUNT
                                   OUTPUT-QUEST-COUNT
                                   WRITE-RECORD-COUNT
COMMENTS (in the order given in the source)
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR, exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g., in, out.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
C 0 N C o R - E D I T O R EXECUT ION S-TA-T1 ST I CS RUN DATEt 517I
APPENDIX I
CONTROL AREA 0 INPUTS RELD OUTPUTS BY OUTPUTCOHMAND OUTPUTS BY W0ITECOMP4JD NO VALI 0 SEQUENCE NUMBER 00000 00 0 RECTYPES 0
OUESTIONNAIRES RECORDS 0 QUEST KEYWORD- 0 ECORDS- RECORDS FORQUFSTshy
bullDIVISION BY ZERO LINE NUMBER i18 to DIVISION BY ZERO LINE NUMBER 118 bo DIVISION BY ZERO 00 LINE NUMBER 118
Oboo DIVISION BY ZERO 00 LINE NUMBER 118 LTNE |UJMP 000 DIVISION BY ZERO 000 LINE NUMBER 118
bo DIVISION BY ZERO 0 LINE NUMBER 118 00 RUN ABORTED DUE TO DIVISION BY ZERO ERRORS 00 1it LIST IR003 = Ro3t
1 6 42
112 LIST
Comments One of the most sever criticisms 114 LIST IDIVISIION BY ZErO ERROR Cw FOLLOWS of the old CONCOR package was that ]l LET RO01 003 = O it would cease processing for no
apparent reason--ABEND without generating any messages The most )16 LIST iRO01 R003 SET TO ZEROS Pou19 Ron3 conmon types of cessations are division by zero array coordinate 117 LIST RO0 0 = errors and user specified terminations A program to deliberately 118 LET P003 = R002 RO01 test CONCORS ability to providediagnostics was written The system 119 LIST LIST R003 R002RO01 =1 R0031 perfoned exceptionally well and correctly provided the line numbers in the 120 source program which caused the data exceptions The program text has been 121 superimposed upon the Execution Statistics page for illustration purposes 122 END TEST OF OPERANDS AND MSG
123
124
1 5 PEGISTERS RESET TO ZERO
125 LET R001 RO00 R003 = nt
0 N C 1 - Z D I T 0 it - ( r S T A T I S T I C S QJi DAT 0119
I ~ - ~APPENDIX b)APPENDIX TL I A Cr14A) 01 3TUTS JRtTF -0 VtLIQ AP-A I PUT R - 3ITD JTS LY UTPUT - Y C
I)ESTIjAAIES - PECLRW cU] I S T igry-t QEC UP )S irrkqS F] ) T
VLLII= FF= 15( C 1 u U-1 Tr JJ TI A)[ 9 )i 1= TI =WATE ) L i - U = VI L IJ) L T J3r-= I n
D 1 4A 1 EItR P IXSTI J A 1 L = 7 II I fIAT E I i Y C
= y 4 I 19f CZ U AT -Z-2j 1ESTIJh14 E CJ)PI) IATE= 2 V L 1E= L T I P1 =UM 11
= rU F r
F R L- -- I JE TI rjA OrPR eTE= rUMiF=E II V1 LIIF= LIT V M
tYfZU ttj gR R QUIESTI3Ii1 j Vi r IATE= LI n l FJw== 0R V J L T C ( -pJIT E~Q - cUESTT Y2ZjI 1= g rfiflUI)IATL= VeLUJE= P LIf PP= 1 3)
bull~ - A ED 011 1 Tn CCy -tENCE C rGR DI T FOR UP S )r 0 JCCUJ0EICF FJ71 ECGPO
173
175 UNTIL ROO1 gt 10 VARY P001 FROM 1 By Comments
In this example RO01 a 2 dimensional array 177 DO with 10 rows and 10 columns is attempting
178 UNTIL R002 - 10 VARY BO FROM 1 BY 1 to update the smaller (5 X 5) TA2 Nonshy- Uexistent element references caused the error
The program text has been superimposed on this
page for illustration purposes
180 ALLOCATE NOOP = TA1 (RO0] ROO)I
JAI UPDATE TA2 (RO01 R002) RO031
182 LIST N002 = R0037 NOO R003
1P3 LIST VALUES OF TA1 = RO03v TAl (RO01 R002)
184 LIST
185 LIST ITA2=TA2 RO01 Rn02)l
l6 END-DOI LIST
187 END-DO
deg r- -- shy
APPENJDIXI1 J( j
CONTROL AREA INPUTS READ OUTPUTS BY OUTPUT LOMANO OUTPUTS TEltCOMMNDfN
SEQUENCE NUMBER 11T A QUESTIONNAIRES RECORDS QUEST KEYWORD RECORDS SFf- 0 T rr-
JOB STOPPED DUETO USER STOP-RUN COMMAND ON LINC 215 - - -- 2 15 _ _ 4$ A
EDITOR -- NO RMAL END OF J B r T I1)
214 IF IN-RECORD-COUNT gt Comments In this example when the internal variable1215 ITHEN STOP-RUN END-THEN was than
215 TE END-THEN IN-RECORD-COUNT was greater than 500 processing was directed by the source COBOL CONCOR language
216 1 program to cease The program text has been - upon this page for illustrationI _-EC superimposed
S IF IN-RECORD-COUNT- lt 2 - - purposes ~~~~~~~~~~- bull - - ------I-
218~ THEN -LIST lt(THAN 20RECS INR
amp-~~~~~ ntai-beDcent
I
APPENDIX J
DIAGNOSTIC MEFSAGE GUIDE EXAMPLE
APPENDIX J DD-2
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character ()p the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () he delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
0
APPENDIX E
CONCOR WORKSHOP January 16 1980
BASIC COBOL VERSION 2
December 1979 Release
Participant Evaluation of Package
1 What is your impression of this version of CONCOR in terms of its usefulness and reliability for editing housing and population census data in less developed countries
2 If you are familiar with previous versions of CONCOR how does this current version compare to them
2D
35
2
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
36
3
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer andor in training userR on the packages applicability to their work
37
7 In what way(s) could improvements or enhancements be made to this version
of the CONCOR software andor its documentation
8 Other comments
1shy
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 (DECEMBER 1978)
DATA-DICTIONARY
  DEFINE INPUT-FILE
  DEFINE ERROR-FILE
  DEFINE COMMON-DATA (contains the questionnaire-ID and record type location information)
  DEFINE RECORD-TYPE=
  DEFINE NEW-DATA
  DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 (DECEMBER 1979)
DICTIONARY-DIVISION
  DICTIONARY-ATTRIBUTES-SECTION
    DICTIONARY-NAME
  FILE-SECTION
    INPUT-FILE, OUTPUT-FILE, WRITE-FILE, ERROR-FILE
  IDENTIFICATION-CONTROL-SECTION
    AREA-CONTROL
    QUESTIONNAIRE-CONTROL
    RECORD-CONTROL
  INPUT-RECORD-SECTION
    DEFINE-RECORD
  WORKING-DATA-SECTION
    NEW-DATA, ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 (DECEMBER 1978)
EDIT-SPECIFICATION
  PROLOG
  FILTER TYPE ( )
  EPILOG
Commands: ALLOCATE (ALL), ASSERT (AST), END, ERROR, FILTER TYPE ( ) WITH, IF, LET, RANGE (RNG), RECODE (REC), STOP --keyword, UPDATE (UPD), WRITE (WRT), XRECODE (XREC)
NEW CONCOR VERSION 2 (DECEMBER 1979)
EXECUTION-DIVISION
  REPORT-CONTROL-SECTION
    COUNT-IMPUTES, GENERATE-EDIT-STATISTICS
  EDIT-SPECIFICATION-SECTION
    PROLOG-ROUTINE
    FILTER-ROUTINE (name-of-record-type)
    EPILOG-ROUTINE
Commands: ALLOCATE (ALLOC), ASSERT (ASRT), DRECODE (DRCD), END-DIVISION, EXIT, GRECODE (GRCD), IF, LET, LIST, OUTPUT, RANGE (RNG), STOP, UNTIL, UPDATE (UPD), WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 VERSION 1 / DECEMBER 1979 VERSION 2: COMMENTS
ALLOCATE (ALL) / ALLOCATE (ALLOC): Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
ASSERT (AST) / ASSERT (ASRT): The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and a MESSAGE option. Additionally, the DUMPR and DUMPO keywords give the user the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
(see RECODE) / DRECODE (DRCD): Provides a method for recoding data items whose values start at zero and continue in ascending sequence. Replaces the previous RECODE command.
(none) / EXIT: Provides the means to terminate execution (ESCAPE) of command statements within the UNTIL command.
END / END-IF, END-THEN, END-ELSE: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
ERROR / not implemented: No longer supported in the language.
FILTER TYPE ( ) WITH / not implemented: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.
(see XRECODE) / GRECODE (GRCD): This new command provides a method for the group recoding of input values.
IF / IF: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
LET / LET: Virtually unchanged.
(none) / LIST: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
(none) / OUTPUT: New command which provides the means by which records will be written to the output-file.
RANGE (RNG) / RANGE (RNG): As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) / name changed: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP --keyword / STOP --keyword: Unchanged.
(none) / UNTIL DO END-DO: DO-END-DO is one of the most significant and powerful enhancements to the CONCOR language. This command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) / UPDATE (UPD): Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) / WRITE (WRT): The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) / name changed: Old command which has become the GRECODE command in the December 1979 version.
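The way UPDATE and ALLOCATE cooperate to provide imputation can be illustrated outside CONCOR. The Python sketch below is not CONCOR code and says nothing about how the EDITOR program is actually generated. The field names (sex, age), the valid range of 0 to 98, and the function name hot_deck_edit are assumptions made purely for illustration. A valid value refreshes the imputation array (the role played by UPDATE); an invalid value is replaced from the array (the role played by ALLOCATE), and each replacement is counted.

```python
from typing import Dict, List, Optional


def hot_deck_edit(records: List[Dict[str, Optional[int]]],
                  deck: Dict[int, int],
                  low: int = 0,
                  high: int = 98) -> int:
    """Edit the 'age' field of each record in place; return the impute count.

    deck maps a sex code to the most recent valid age seen for that sex.
    It must be initialized with a starting value for every sex code.
    """
    imputes = 0
    for rec in records:
        age = rec["age"]
        if age is not None and low <= age <= high:
            # Valid value: store it so that later records can borrow it
            # (the role of UPDATE).
            deck[rec["sex"]] = age
        else:
            # Invalid or missing value: take the stored value instead
            # (the role of ALLOCATE) and count the imputation.
            rec["age"] = deck[rec["sex"]]
            imputes += 1
    return imputes


if __name__ == "__main__":
    deck = {1: 30, 2: 28}
    recs = [{"sex": 1, "age": 40}, {"sex": 1, "age": None}, {"sex": 2, "age": 150}]
    print(hot_deck_edit(recs, deck), recs)
```

Because the array is refreshed by every valid record, the imputed value depends on the order of the input file, which is one reason the order of records matters to this style of editing.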
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
  DISPLAY-CONTROL-SECTION
    DISPLAY-EDIT-STATISTICS
  TOLERANCE-CONTROL-SECTION
    ERROR-RATE-CHECK, REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
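The presorting requirement can be illustrated with a short Python sketch. It is not CONCOR code; the record layout (a dictionary with an "area" field) and the function name area_breaks are assumptions for illustration only. The sketch treats every change in the control field as an area break, so records for the same area that are not contiguous in the file produce several separate report groups instead of one.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def area_breaks(records: List[Dict[str, int]], key: str = "area") -> List[Tuple[int, int]]:
    """Return (area, record count) for each run of contiguous records.

    A new group starts whenever the control field changes, which mimics
    a report organized by area breaks without any merging step.
    """
    return [(area, len(list(group)))
            for area, group in groupby(records, key=lambda rec: rec[key])]


if __name__ == "__main__":
    unsorted_file = [{"area": 1}, {"area": 1}, {"area": 2}, {"area": 1}, {"area": 2}]
    # Unsorted input: areas 1 and 2 each appear as two separate groups.
    print(area_breaks(unsorted_file))
    # Presorted input: one group per area.
    print(area_breaks(sorted(unsorted_file, key=lambda rec: rec["area"])))
```

Sorting once before the edit run keeps the editing pass itself a simple sequential read, at the cost of an extra processing step outside the package.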
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if item (13) of the Dictionary Division list above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
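The lookup table approach suggested for DRECODE in item (4) can be sketched as follows. The Python below is illustrative only. The report does not say how the generated COBOL currently evaluates DRECODE, so the sequential comparison shown for contrast is an assumption, as are the function names and the sample recode values. The sketch relies on the property that DRECODE source values start at zero and ascend, so the source value can serve directly as an index.

```python
from typing import List, Optional, Sequence, Tuple


def drecode_linear(value: int, pairs: Sequence[Tuple[int, int]]) -> Optional[int]:
    """Recode by comparing the value against each source code in turn."""
    for source, target in pairs:
        if value == source:
            return target
    return None  # value is not in the recode list


def drecode_lookup(value: int, table: List[int]) -> Optional[int]:
    """Recode by direct indexing: table[n] holds the new code for n."""
    if 0 <= value < len(table):
        return table[value]
    return None  # value is outside the recode list


if __name__ == "__main__":
    # Hypothetical recode: 0 and 1 become 10, 2 and 3 become 20, 4 becomes 30.
    table = [10, 10, 20, 20, 30]
    pairs = list(enumerate(table))
    print(drecode_linear(3, pairs), drecode_lookup(3, table))
```

The indexed version does a fixed amount of work per value regardless of the length of the recode list, which is the source of the execution time saving. The table itself must be built when the program is generated, which is consistent with the additional communication between generator programs that the item mentions.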
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler language (ALC) code for the EDITOR program, to be used on IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
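The weighting scheme proposed in item (5) can be illustrated with a small Python sketch. Nothing here comes from CONCOR itself: the field names (tenure, weight), the function name weighted_totals, and the idea that each record carries its own expansion weight are assumptions for illustration. The sketch contrasts an unweighted record count with a total in which each sampled record stands for the number of units its weight represents.

```python
from collections import defaultdict
from typing import Dict, List, Union

Record = Dict[str, Union[str, float]]


def weighted_totals(records: List[Record], field: str,
                    weight_field: str = "weight") -> Dict[str, float]:
    """Sum the expansion weights of the records within each category of field."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in records:
        totals[str(rec[field])] += float(rec[weight_field])
    return dict(totals)


if __name__ == "__main__":
    sample = [
        {"tenure": "own", "weight": 20.0},
        {"tenure": "rent", "weight": 20.0},
        {"tenure": "own", "weight": 50.0},
    ]
    # Two of the three sampled records are owners, but the weighted total
    # reflects the number of units each record represents.
    print(weighted_totals(sample, "tenure"))
```

With weights available, tolerance and imputation statistics could be reported in terms of the population represented rather than the number of records read, which is the sense in which unweighted totals are less meaningful for sample surveys.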
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT

COMMENTS
Change in spelling of variable.
Not mentioned: possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset. Most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR, exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution statistics printout, division by zero test. Legible diagnostics: DIVISION BY ZERO LINE NUMBER 118 (repeated), followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program text (lines 112-125) is only partly legible: line 114 lists a message announcing that a division by zero error follows, line 115 is a LET statement setting R001 and R003 to zero, and line 118 is the LET statement that computes R003 from R002 and R001.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, i.e., ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
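The diagnostic behavior shown on this page (a message naming the source line for each division by zero, then an orderly abort once errors accumulate) can be sketched in Python. This is not how the EDITOR program is implemented. The statement representation, the function name run_with_diagnostics, and the error limit are assumptions for illustration; the printout does not state the limit CONCOR applies.

```python
from typing import Callable, Dict, List, Tuple

Registers = Dict[str, float]
Statement = Tuple[int, Callable[[Registers], None]]  # (source line number, action)


def run_with_diagnostics(statements: List[Statement],
                         records: List[Registers],
                         max_zero_divides: int = 6) -> List[str]:
    """Run every statement against every record, logging divisions by zero."""
    log: List[str] = []
    errors = 0
    for registers in records:
        for line_no, action in statements:
            try:
                action(registers)
            except ZeroDivisionError:
                # Report the offending source line instead of ending abnormally.
                errors += 1
                log.append(f"DIVISION BY ZERO LINE NUMBER {line_no}")
                if errors >= max_zero_divides:
                    log.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return log
    log.append("NORMAL END OF JOB")
    return log


def set_zero(registers: Registers) -> None:
    registers["R001"] = 0


def divide(registers: Registers) -> None:
    registers["R003"] = registers["R002"] / registers["R001"]


if __name__ == "__main__":
    program = [(115, set_zero), (118, divide)]
    data = [{"R001": 1, "R002": 8, "R003": 0} for _ in range(3)]
    for message in run_with_diagnostics(program, data, max_zero_divides=2):
        print(message)
```

Carrying the source line number alongside each statement is what allows the message to point the user back to the command that failed.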
[Execution statistics printout, array reference test. The printed diagnostics are illegible in this copy. The superimposed program text (lines 173-187) is partly legible: two nested UNTIL ... VARY ... DO loops run the counters R001 and R002 from 1 until each exceeds 10, and enclose an ALLOCATE that reads TA1 (R001 R002), an UPDATE of TA2 (R001 R002) with R003, several LIST statements, and two END-DO statements.]
Comments: In this example TA1, a 2 dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
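The kind of error demonstrated on this page, subscripts running from 1 to 10 applied to a 5 x 5 array, can be reproduced in a few lines of Python. The sketch is illustrative only: the function names and the one-based subscript convention are assumptions, and the actual text of the CONCOR diagnostic is not legible in the printout.

```python
from typing import List, Optional, Tuple


def update(table: List[List[int]], row: int, col: int, value: int) -> None:
    """Store value at one-based (row, col), rejecting nonexistent elements."""
    rows, cols = len(table), len(table[0])
    if not (1 <= row <= rows and 1 <= col <= cols):
        raise IndexError(f"({row}, {col}) is outside the {rows} x {cols} array")
    table[row - 1][col - 1] = value


def first_bad_reference(table: List[List[int]], limit: int) -> Optional[Tuple[int, int]]:
    """Loop both subscripts from 1 to limit; return the first invalid pair."""
    for row in range(1, limit + 1):
        for col in range(1, limit + 1):
            try:
                update(table, row, col, 0)
            except IndexError:
                return (row, col)
    return None


if __name__ == "__main__":
    ta2 = [[0] * 5 for _ in range(5)]
    print(first_bad_reference(ta2, 10))
```

The explicit range test matters because an unchecked subscript can either stop the run without explanation or silently touch the wrong element; checking before the store is what makes a meaningful diagnostic possible.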
[Execution statistics printout, user termination test. Legible diagnostics: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. The superimposed program text (lines 214-218) is partly legible: line 214 is an IF statement testing whether IN-RECORD-COUNT exceeds a limit, line 215 is THEN STOP-RUN END-THEN, and lines 217-218 are a second IF statement on IN-RECORD-COUNT whose THEN branch issues a LIST message.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001): BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002): ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003): END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004): COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
35
2
3 How would you compare the utility of this version of CONCOR for less developed countries with other systems for editing data of which you are familiar
4 How well do you feel the documentation (Reference Manual and Diagnostic Messages Guide) supports the software
36
3
5 How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys
6 If your organization does statistical data processing by computer would you recommend to your superiors that this software be used to edit the data Do you feel assistance would be required in the installation of this software on your organizations computer andor in training userR on the packages applicability to their work
37
7 In what way(s) could improvements or enhancements be made to this version
of the CONCOR software andor its documentation
8 Other comments
1shy
APPENDIX F
NEWOLD COMMAND COMPARISONS
0 0
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-IDand record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
APPENDIX F
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
DOENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DiVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 ofthe coding sheet The position notation of Version 2 permits thecommand to begin in columns 1-72
The period preceding a statement
signifies that it is but a commentcard Accordingly it should be noted that while the END-DIVISION command must be present the division name identifier DICTIONARY-DIVISION
has not been implemented
Note Some parameters for each commandhave been altered even where general correspondence exists
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECI-FICATION commands of both CONCOR language versions Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general corresponshydence exists Individual commands are discussed on the following pages
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and with the UPDATE command providesthe imputation mechanism Major difference is this statement now providesfor a receiving-identifier list and statistics are generated as coded inthe GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION
ASSERT fAST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR Command now provides for a NOERROR and MESSAGE option Additionallyby utilizinq the DUMPR DUMPO keywords provides the user with the option of using all the current value of all user-identifiers defined for the record type being processed Other changes to this command include the pass end-pass fail end-fail constructions for the P purpose of maintaining structured logic
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continue in ascending sequence Replaces t previous XRECODE command
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN
A solitary END command is no lonqer permitted in conditional construction Note the similarity to COBOL pseudocode
END-ELSE
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continuej)
4)bull
Version 1: RANGE (RNG). Version 2: RANGE (RNG). Comments: A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.
Version 1: RECODE (REC). Version 2: name changed. Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
Version 1: STOP-keyword. Version 2: STOP-keyword. Comments: Unchanged.
Version 1: (none). Version 2: UNTIL DO END-DO. Comments: One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
Version 1: UPDATE (UPD). Version 2: UPDATE (UPD). Comments: Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
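The interplay of the RANGE test, its PASS and FAIL paths, and the ALLOCATE and UPDATE commands can be pictured with a small sketch. It is written in Python rather than CONCOR, and it assumes a simple "last valid value for the same sex" donor rule, a technique commonly called a hot deck. The field names, the valid age range, and the starting donor values are all hypothetical; the report does not give them.

```python
# Illustrative sketch only: this is not CONCOR syntax, and every name and
# value below is an assumption made for the example.
AGE_RANGE = range(0, 100)          # assumed valid ages 0..99


def edit_ages(records):
    """Range-test each record's age and impute failures from the last
    valid age seen for the same sex. Returns the number of imputations."""
    donor_array = {1: 30, 2: 30}   # assumed starting values per sex code
    imputed = 0
    for rec in records:
        if rec["age"] in AGE_RANGE:
            # PASS path: refresh the donor array (the role UPDATE plays)
            donor_array[rec["sex"]] = rec["age"]
        else:
            # FAIL path: assign a donor value (the role ALLOCATE plays)
            rec["age"] = donor_array[rec["sex"]]
            imputed += 1
    return imputed


people = [
    {"sex": 1, "age": 42},
    {"sex": 2, "age": 130},   # out of range; no valid female age seen yet
    {"sex": 1, "age": -5},    # out of range; donor value is 42
]
count = edit_ages(people)
```

Keeping the donor array current on the PASS path is what lets a later failure draw on a recently observed, plausible value instead of a fixed default.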
Version 1: WRITE (WRT). Version 2: WRITE (WRT). Comments: The WRITE language statement, once the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated by specifying the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of the DICTIONARY-DIVISION.)
Version 1: XRECODE (XREC). Version 2: name changed. Comments: Old command which has become the GRECODE command in the December 1979 version.
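Group recoding, as attributed to GRECODE above, maps ranges of input values onto a single code. The following Python sketch shows the general idea only; the age groups and codes are hypothetical, and nothing here reproduces the GRECODE syntax, which the report does not list.

```python
from bisect import bisect_right

# Assumed groups for illustration: 0-14 -> 1, 15-64 -> 2, 65 and over -> 3
LOWER_BOUNDS = [0, 15, 65]
GROUP_CODES = [1, 2, 3]


def group_recode(value):
    """Return the code of the group whose lower bound is the largest one
    not exceeding value; reject values below the first bound."""
    index = bisect_right(LOWER_BOUNDS, value) - 1
    if index < 0:
        raise ValueError(f"{value} is below every group")
    return GROUP_CODES[index]


codes = [group_recode(age) for age in (0, 14, 15, 64, 65, 99)]
```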
Version 1 (December 1978): Not implemented.
Version 2 (December 1979):
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
Comments:
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
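The presorting requirement follows from sequential control-break processing: a program that closes an area's totals as soon as the area code changes cannot rejoin records for that area met later in the file. The Python sketch below illustrates this with the standard library's itertools.groupby, which likewise groups only adjacent records. The record layout and area codes are hypothetical.

```python
from itertools import groupby


def area_record_counts(records):
    """Count records per control-area break while reading sequentially."""
    return [(area, sum(1 for _ in group))
            for area, group in groupby(records, key=lambda r: r["area"])]


unsorted_file = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
sorted_file = sorted(unsorted_file, key=lambda r: r["area"])

split_report = area_record_counts(unsorted_file)   # area 01 appears twice
merged_report = area_record_counts(sorted_file)    # one entry per area
```

With the unsorted file, area 01 yields two separate report entries; after an external sort on the control field it yields one.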
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. The user would have to be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires; otherwise the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement in the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data formats that may be referenced. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
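Item (11) above asks for a signed numeric format that tolerates leading blanks and negative values, as found in right-justified integer fields written by FORTRAN programs. A minimal Python sketch of such a conversion follows. The function name and the decision to treat an all-blank field as missing are assumptions made for illustration, not CONCOR behavior.

```python
def parse_signed_field(field):
    """Convert a fixed-width field such as '   -12' to an int.

    Leading and trailing blanks are ignored. An all-blank field returns
    None (assumed here to mean 'missing'). Any other non-numeric content
    raises ValueError, as int() does.
    """
    text = field.strip()
    if not text:
        return None
    return int(text)


values = [parse_signed_field(f) for f in ("   -12", "     7", "      ")]
```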
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion in the universe of observations and the number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
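The lookup table approach suggested in item (4) can be sketched as follows. Because DRECODE source values start at zero and ascend, the source value can serve directly as a table position. The report does not describe the code the EDITOR program currently generates, so the sequential search shown for contrast is an assumption, and the recode values are hypothetical. The sketch is in Python, not in the COBOL that CONCOR generates.

```python
# Hypothetical recode of source codes 0..4 (old value, new value)
PAIRS = [(0, 9), (1, 1), (2, 1), (3, 2), (4, 3)]

# Position i of the table holds the recode of source value i.
TABLE = [new for _, new in PAIRS]


def recode_by_search(value):
    """Assumed baseline: compare against each source code in turn."""
    for old, new in PAIRS:
        if old == value:
            return new
    raise ValueError(f"no recode for {value}")


def recode_by_table(value):
    """Lookup table: a single bounds check and one indexed access."""
    if not 0 <= value < len(TABLE):
        raise ValueError(f"no recode for {value}")
    return TABLE[value]


same = all(recode_by_search(v) == recode_by_table(v) for v in range(5))
```

The table version does the same work for every input value, while the search takes longer the further down the list the value sits.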
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
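The check-in procedure of item (2) amounts to comparing the questionnaire identifiers actually received against a master list of selected cases. A brief Python sketch follows; the identifiers and the shape of the result are hypothetical.

```python
def check_in(expected_ids, received_ids):
    """Compare received questionnaire identifiers with the master list."""
    expected, received = set(expected_ids), set(received_ids)
    return {
        "missing": sorted(expected - received),     # selected, not received
        "unexpected": sorted(received - expected),  # received, not selected
    }


result = check_in(["A01", "A02", "A03"], ["A01", "A03", "B07"])
```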
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
Version 1 (December 1978):
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
Version 2 (December 1979):
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Comments:
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown owing to the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page with the test program text superimposed. The message DIVISION BY ZERO LINE NUMBER 118 is printed repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program lines (112 to 125) set R001 and R003 to zero at line 115 and perform the failing LET assignment to R003 at line 118. The remainder of the listing is illegible.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, that is, ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program was written to deliberately test CONCOR's ability to provide diagnostics. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
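The behavior described above (a message naming the offending source line for each division by zero, then an abort once too many have occurred) can be sketched in Python as follows. The class names, the error limit of 3, and the placeholder result of zero are assumptions for illustration. Only the message wording is taken from the printout.

```python
class RunAborted(Exception):
    """Raised when the error limit is reached."""


class ProtectedDivider:
    def __init__(self, error_limit):
        self.error_limit = error_limit   # assumed, user-chosen limit
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Integer division that reports, rather than crashes on, zero."""
        if denominator != 0:
            return numerator // denominator
        self.messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
        if len(self.messages) >= self.error_limit:
            raise RunAborted("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
        return 0   # assumed placeholder so that the run can continue


divider = ProtectedDivider(error_limit=3)
aborted = False
try:
    for _ in range(5):
        divider.divide(10, 0, line_number=118)
except RunAborted:
    aborted = True
```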
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page with the test program text superimposed. The error messages and run totals are illegible. The legible program lines (175 to 187) show two nested UNTIL ... VARY ... FROM 1 BY 1 loops over R001 and R002, each running to 10, enclosing an ALLOCATE from TA1 (R001 R002) at line 180, an UPDATE of TA2 (R001 R002) at line 181, and several LIST statements.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
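The array coordinate error in this example can be reproduced in outline with a bounds-checked table. In the Python sketch below, the class names and the message text are hypothetical. Only the array sizes (loops running to 10 against a 5 x 5 target) and the line number of the UPDATE statement come from the example.

```python
class ArrayCoordinateError(Exception):
    """Raised when a subscript falls outside the declared array."""


class Table2D:
    def __init__(self, rows, cols, fill=0):
        self.rows, self.cols = rows, cols
        self.cells = [[fill] * cols for _ in range(rows)]

    def update(self, row, col, value, line_number):
        # Subscripts are 1-based, as in the loops of the test program.
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise ArrayCoordinateError(
                f"SUBSCRIPT ({row}, {col}) OUTSIDE ARRAY AT LINE {line_number}")
        self.cells[row - 1][col - 1] = value


ta2 = Table2D(5, 5)
first_error = None
for r in range(1, 11):
    for c in range(1, 11):
        try:
            ta2.update(r, c, r * c, line_number=181)
        except ArrayCoordinateError:
            if first_error is None:
                first_error = (r, c)   # first reference outside the 5 x 5
```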
[Printout: CONCOR-EDITOR EXECUTION STATISTICS page with the test program text superimposed. Legible messages: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. Legible program lines: 214 IF IN-RECORD-COUNT > ... and 215 THEN STOP-RUN END-THEN. The remaining lines (216 to 218) are illegible.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string
5. How well do you think this version of CONCOR would serve for editing other types of statistical censuses or surveys?
6. If your organization does statistical data processing by computer, would you recommend to your superiors that this software be used to edit the data? Do you feel assistance would be required in the installation of this software on your organization's computer and/or in training users on the package's applicability to their work?
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
(Version 1 is the December 1978 version; Version 2 is the December 1979 version.)
Version 1: ALLOCATE (ALL). Version 2: ALLOCATE (ALLOC). Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
Version 1: ASSERT (AST). Version 2: ASSERT (ASRT). Comments: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, the DUMPR and DUMPO keywords provide the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
Version 1: (see RECODE). Version 2: DRECODE (DRCD). Comments: Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous XRECODE command.
Version 1: (none). Version 2: EXIT. Comments: Provides the means to terminate execution of (escape from) the command statements within the UNTIL command.
Version 1: END. Version 2: END-IF END-THEN END-ELSE. Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
Version 1: ERROR. Version 2: not implemented. Comments: No longer supported in the language.
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continuej)
4)bull
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 VERSION 1
RANGE (RNG)
RECODE (REC)
STOP- keyword
UPDATE (UPD)
(continued)
DECEMBER 1979 VERSION 2
RANGE (RNG)
name changed
STOP- keyword
UNTIL DO END-DO
UPDATE (UPD)
COMMENTS
As a basic editing command of CONCOR the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO optionswithin the command Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction permitting the specification of logicalcommand paths depending upon the outcome of the range test
Name changed to DRECODE Virtually identical with the DRC command in the COCENTS system
Unchanged
DO-END-DO one of the most significant and powerful enhancements to the CONCOR language this command provides a loopingcapability similar to most other high-level pogramminglanguages The number of iterations is under user control as well as the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements
Virtually unchanged it is the basic mechanism for maintaining arrays involved in imputation processes
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
0 0 0
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIViSION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to controlthe type of edit statistics reports to be generated Generally all reports produced by the commands ofthis division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION Notethat this means the input data file must be preproshycessed (presorted external to CONCOR) on the basisof the AREA-CONTROL data field as CONCOR is incapableof merging uncontiguous records
There is currently no command to enable users to specify unique report headings on the listingfrom this division
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
G-1
Appendix G
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census da a using automatic correction techniquns As a result CONCOR should not be construed to be the ultimate software package for generalized data editing The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process A second group of changes or enhancements also exist that could facilitate the editing of survey or other types of census data It should be noted that some of the modifications to meet these two objectives at times are mutually exclusive Some of the possible modifications and considerations are outlined below
1 Housing and Population Census Processing
1 1 Software
1111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added This would allow the creation of edited output files in a different format and length than the file read into CONCOR This implementation would require other enhanceients to provide a simple means of moving values from th-iT i tato tile output file Unique naming of user-identifiers would also need to be addressed Another possiblity is the declaration of both the input and output location in the same command when specifying the contents of tile data record to be edited
(4) Addition of user-identifier naires to fields used in the controlling of summary statistics and unique questionnaire determination This would require modification to the system level data dictionary
46 G-2
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification lo the
dictionary as tell as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of difFerent data item types in the CONTROL
commands This would allow the greatest flexibilty in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value to be move to the output
record when the record is to be written out using the OUTPUT
command 0
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
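The signed numeric format proposed in item (11) of the Dictionary Division list can be illustrated with a minimal Python sketch. It is not CONCOR code: the function name and the exact field grammar (leading blanks, an optional sign, then digits) are assumptions made for illustration, based only on the statement that the format would allow leading blanks and negative values.

```python
import re

# Assumed field grammar: leading blanks, an optional sign, then decimal digits.
_SIGNED_FIELD = re.compile(r" *([+-]?)([0-9]+)")


def parse_signed_field(field: str) -> int:
    """Convert a fixed-width signed numeric field to an integer.

    Raises ValueError when the field is blank or contains anything other
    than leading blanks, an optional sign, and digits.
    """
    match = _SIGNED_FIELD.fullmatch(field)
    if match is None:
        raise ValueError(f"not a signed numeric field: {field!r}")
    sign, digits = match.groups()
    value = int(digits)
    return -value if sign == "-" else value
```

A right-justified field such as `"   -42"`, as a FORTRAN program might write it, converts to the integer -42, while a field with embedded blanks is rejected rather than silently repaired.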
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division-by-zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run-total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 of the Dictionary Division list is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
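The lookup table approach suggested for DRECODE in item (4) of the Execution Division list can be sketched as follows. This is an illustrative Python sketch, not the generated COBOL: the function names are hypothetical, and the sequential-comparison version is only an assumption about the kind of code a table would replace. It relies on the property stated elsewhere in this report that DRECODE input values start at zero and continue in ascending sequence, so the input value can serve directly as the table index.

```python
def build_recode_table(new_codes):
    """Return a list in which position i holds the recoded value for input i."""
    return list(new_codes)


def drecode(value, table):
    """Recode by direct indexing; None signals an input outside the table."""
    if 0 <= value < len(table):
        return table[value]
    return None


def drecode_sequential(value, new_codes):
    """Equivalent recode by successive comparisons, shown only for contrast."""
    for original, recoded in enumerate(new_codes):
        if value == original:
            return recoded
    return None
```

Both functions return the same result for every input, but the indexed version performs one bounds test and one table access regardless of how many codes are defined, whereas the sequential version performs up to one comparison per code.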
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
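The weighting scheme requested in item (5) of the survey list amounts to multiplying each record's value by its sampling weight before totals are accumulated. The following Python sketch shows the arithmetic only. The record layout, the field names, and the idea that each record carries its own weight are assumptions made for illustration; the source does not specify how weights would be supplied.

```python
def weighted_total(records, value_key, weight_key="weight"):
    """Sum value * weight over all records (hypothetical record layout)."""
    return sum(record[value_key] * record[weight_key] for record in records)


def weighted_count(records, weight_key="weight"):
    """Estimated number of units represented by the sampled records."""
    return sum(record[weight_key] for record in records)
```

With two sampled households of 4 and 2 persons carrying weights of 10 and 25, the unweighted total is 6 persons, while the weighted total is 4 x 10 + 2 x 25 = 90 persons across an estimated 35 households.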
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
OLD CONCOR
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
NEW CONCOR
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters)
OUT-RECORD-COUNT
OUTPUT-QUEST-COUNT
WRITE-RECORD-COUNT
COMMENTS
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
(Execution Statistics page for the division-by-zero test. The printout layout did not survive extraction. Legible messages: DIVISION BY ZERO LINE NUMBER 118, repeated several times, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. Legible fragments of the superimposed program text, lines 111-125, show LIST statements, LET statements setting registers R001 and R003 to zero, and at line 118 a LET statement assigning R003 from R002 and R001.)
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, that is, ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
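The behavior seen on this page, a message naming the offending source line for each division by zero and an aborted run once such errors accumulate, can be modeled with a short Python sketch. It is a hypothetical model rather than the EDITOR implementation: the class names, the error limit of 5, the use of integer division, and the choice to yield zero for a failed division are all assumptions. Only the two message texts are taken from the printout.

```python
class RunAborted(Exception):
    """Raised when the assumed division-by-zero error limit is exceeded."""


class Diagnostics:
    def __init__(self, max_zero_divides=5):
        # The limit is an assumed value; the printout does not state it.
        self.max_zero_divides = max_zero_divides
        self.zero_divides = 0
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Integer division that reports the source line on a zero divisor."""
        if denominator == 0:
            self.zero_divides += 1
            self.messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if self.zero_divides > self.max_zero_divides:
                self.messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted(f"too many zero divides at line {line_number}")
            return 0  # assumed placeholder result for a failed division
        return numerator // denominator
```

Because every message carries the line number of the statement that failed, the user can locate the faulty statement without a storage dump, which is the improvement over the old package that the comments describe.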
(Execution Statistics page for the array coordinate test. The printout layout did not survive extraction. Legible fragments of the superimposed program text, lines 173-187, show an outer loop beginning UNTIL R001 > 10 VARY R001 FROM 1 and an inner loop over R002, containing an ALLOCATE referencing TA1 (R001 R002), an UPDATE of TA2 (R001 R002), and several LIST statements, with each loop closed by END-DO.)
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
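The array coordinate error described in the comments above can be reproduced in miniature. The Python sketch below is illustrative only: the class name, the 1-based coordinates, the error-record layout, and the use of line number 181 for the UPDATE statement are assumptions. It walks a 10 x 10 range of coordinates while updating a 5 x 5 table and records each reference to a nonexistent element instead of failing silently.

```python
class BoundsCheckedTable:
    """A 2-dimensional table that records out-of-bounds update attempts."""

    def __init__(self, rows, cols, fill=0):
        self.rows, self.cols = rows, cols
        self.cells = [[fill] * cols for _ in range(rows)]
        self.errors = []  # (line_number, row, col) for each bad reference

    def update(self, row, col, value, line_number):
        """Store value at 1-based (row, col); log and skip bad coordinates."""
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            self.errors.append((line_number, row, col))
            return False
        self.cells[row - 1][col - 1] = value
        return True


# Loop over 10 x 10 coordinates while updating the smaller 5 x 5 table.
ta2 = BoundsCheckedTable(5, 5)
for r in range(1, 11):
    for c in range(1, 11):
        ta2.update(r, c, r * c, line_number=181)
```

Of the 100 attempted updates, the 25 inside the table succeed and the remaining 75 are logged with their coordinates and line number.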
(Execution Statistics page for the user-specified termination test. The printout layout did not survive extraction. Legible messages: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, and EDITOR -- NORMAL END OF JOB. Legible fragments of the superimposed program text, lines 214-218, show an IF on IN-RECORD-COUNT whose THEN branch issues STOP-RUN, followed by a second IF on IN-RECORD-COUNT whose THEN branch issues a LIST.)
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
7. In what way(s) could improvements or enhancements be made to this version of the CONCOR software and/or its documentation?
8. Other comments
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL)
ASSERT (AST)
END
ERROR
FILTER TYPE ( ) WITH
IF
LET
RANGE (RNG)
RECODE (REC)
STOP-keyword
UPDATE (UPD)
WRITE (WRT)
XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC)
ASSERT (ASRT)
DRECODE (DRCD)
END-DIVISION
EXIT
GRECODE (GRCD)
IF
LET
LIST
OUTPUT
RANGE (RNG)
STOP
UNTIL
UPDATE (UPD)
WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS

Version 1 (December 1978): ALLOCATE (ALL)
Version 2 (December 1979): ALLOCATE (ALLOC)
Comments: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

Version 1 (December 1978): ASSERT (AST)
Version 2 (December 1979): ASSERT (ASRT)
Comments: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for a NOERROR and MESSAGE option. Additionally, the DUMPR and DUMPO keywords provide the user with the option of using the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

Version 1 (December 1978): (see RECODE)
Version 2 (December 1979): DRECODE (DRCD)
Comments: Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous XRECODE command.

Version 1 (December 1978): (none)
Version 2 (December 1979): EXIT
Comments: Provides the means to terminate execution (escape) of command statements within the UNTIL command.

Version 1 (December 1978): END
Version 2 (December 1979): END-IF, END-THEN, END-ELSE
Comments: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

Version 1 (December 1978): ERROR
Version 2 (December 1979): not implemented
Comments: No longer supported in the language.

Version 1 (December 1978): FILTER TYPE ( ) WITH
Version 2 (December 1979): not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO depending on construction.

Version 1 (December 1978): (see XRECODE)
Version 2 (December 1979): GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1 (December 1978): IF
Version 2 (December 1979): IF
Comments: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1 (December 1978): LET
Version 2 (December 1979): LET
Comments: Virtually unchanged.

Version 1 (December 1978): (none)
Version 2 (December 1979): LIST
Comments: New command which allows for the generation of specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.

Version 1 (December 1978): (none)
Version 2 (December 1979): OUTPUT
Comments: New command which provides the means by which records will be written to the output file.

Version 1 (December 1978): RANGE (RNG)
Version 2 (December 1979): RANGE (RNG)
Comments: As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

Version 1 (December 1978): RECODE (REC)
Version 2 (December 1979): name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1 (December 1978): STOP-keyword
Version 2 (December 1979): STOP-keyword
Comments: Unchanged.

Version 1 (December 1978): (none)
Version 2 (December 1979): UNTIL DO END-DO
Comments: DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

Version 1 (December 1978): UPDATE (UPD)
Version 2 (December 1979): UPDATE (UPD)
Comments: Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.

Version 1 (December 1978): WRITE (WRT)
Version 2 (December 1979): WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

Version 1 (December 1978): XRECODE (XREC)
Version 2 (December 1979): name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.
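The table above describes GRECODE only as a method for the group recoding of input values. One common reading of group recoding is the mapping of a range of values to a single code, as when single years of age are collapsed into age groups. The Python sketch below illustrates that reading. It is an assumption about what such a command does, not a description of GRECODE's actual syntax or semantics, and the function name, group boundaries, and codes are hypothetical.

```python
from bisect import bisect_right


def group_recode(value, lower_bounds, codes):
    """Return the code of the group whose lower bound is the largest one
    not exceeding value; None if value lies below the first group.

    lower_bounds must be sorted ascending and align one-to-one with codes.
    """
    index = bisect_right(lower_bounds, value) - 1
    if index < 0:
        return None
    return codes[index]


# Hypothetical age groups: 0-14 -> 1, 15-64 -> 2, 65 and over -> 3.
AGE_LOWER_BOUNDS = [0, 15, 65]
AGE_GROUP_CODES = [1, 2, 3]
```

For example, `group_recode(14, AGE_LOWER_BOUNDS, AGE_GROUP_CODES)` yields code 1 and `group_recode(15, AGE_LOWER_BOUNDS, AGE_GROUP_CODES)` yields code 2, with the binary search locating the group without testing every boundary in turn.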
OLD CONCOR VERSION 1 DECEMBER 1978
Not implemented
NEW CONCOR VERSION 2 DECEMBER 1979
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK
REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
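The presorting requirement noted above follows from processing area breaks in a single pass: a new area total is started whenever the control field changes, so records for one area that are not contiguous produce several partial totals. The Python sketch below demonstrates the effect with `itertools.groupby`, which behaves the same way. The record layout and the function name are assumptions for illustration and do not reflect CONCOR's internal design.

```python
from itertools import groupby


def area_break_counts(records):
    """Count records per run of consecutive records sharing an area code."""
    return [
        (area, sum(1 for _ in group))
        for area, group in groupby(records, key=lambda record: record["area"])
    ]


unsorted_file = [{"area": "A"}, {"area": "B"}, {"area": "A"}]
presorted_file = sorted(unsorted_file, key=lambda record: record["area"])
```

Applied to `presorted_file`, the function reports one total per area. Applied to `unsorted_file`, area A appears twice with a partial count each time, which is the situation the presort is meant to prevent.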
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier naires to fields used in the controlling of summary statistics and unique questionnaire determination This would require modification to the system level data dictionary
46 G-2
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification lo the
dictionary as tell as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of difFerent data item types in the CONTROL
commands This would allow the greatest flexibilty in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value to be move to the output
record when the record is to be written out using the OUTPUT
command 0
(9) Implement a COMMION-DATA concept to handle items command to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This wJould require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IIPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential ihen dealing with data files generated by
FORTRAN programs
0
+
G-3 47
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
112 Execution Division
(M) Implement the Run Control Section This would allow the user to increase execution speed of the EDITOR program This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking In this secion the user could also specify new maximum values for the EDITOR error limits
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered The only problem to resolve is how to treat this id-ntifier outside of the Filter Routines
(3) Implement a CREATE commnand This command would allow the user to create a record that Was not found on the input file This implementation would require special care as it gives the user the ability to change the valme of a CONCOR internal identifier (TYPE-CGUNT) and modify the contents of the store Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new records inclusion into the universe of observations and number of changes
iK
48 G-4
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilized a lookup table approach
This would decrease the execution time of the command but woul C
require additional communication between the GENED and GENDD
programs
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command command that would change
CONCORs default occurrence pointer for the duration of the SET
command This command would be ever helpful as the validation
of the existence of a record would only have to be done once
when the SET command was encounterd The major drawback to
this command is in the users ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
ident ifiers
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed nother program would then be
required to format the information gathered into a readable
form
113 Report Division
(I) Provide a means by which the user may specify their own headings or titles -for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a neans by which the statistics gathered may be
presented in other aqgregations other than the area-break and
total levels now pro ided This would require another
procedure to aggregate the statistics and pass them onto the
Report Division before the printing of the reports began Thi
extra procedure is required because of program size
cons i derat ions
G-5
49
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360370 series type computers
(2) Modification of the GENSRC program i utilize a lookup table instead of the linear search technioa2 now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Hlodification to the system data dictionary to better arrange and make available information needed most frequently Also to examine the possibility of making the section dealing with the message text and report generation a separate file Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth
(5) Make the parsing and syntacial routines interactive so as to increase the product ity of the CONCOR user This does not mean to imply that the editing process itself should be made interactive but rather only the development of the user code should be put online
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
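The weighting scheme of item (5) can be illustrated with a short Python sketch. It assumes each sample record carries its own sampling weight. The field names PERSONS and WEIGHT and the function names are hypothetical, and nothing here reflects an actual CONCOR facility.

```python
def weighted_total(records, value_field, weight_field="WEIGHT"):
    """Sum a field over sample records, scaling each record by its weight."""
    return sum(r[value_field] * r[weight_field] for r in records)

def weighted_mean(records, value_field, weight_field="WEIGHT"):
    """Weighted mean of a field; the weights must not sum to zero."""
    total_weight = sum(r[weight_field] for r in records)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return weighted_total(records, value_field, weight_field) / total_weight

# Two hypothetical sample households with different sampling weights.
sample = [
    {"PERSONS": 4, "WEIGHT": 10.0},
    {"PERSONS": 2, "WEIGHT": 30.0},
]
estimated_persons = weighted_total(sample, "PERSONS")
mean_household_size = weighted_mean(sample, "PERSONS")
```

An unweighted mean of the two households would be 3 persons. The weighted figure is lower because the smaller household represents three times as many households in the frame.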
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout with the test program text superimposed; much of the listing is illegible in this copy. Legible column headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND. The message DIVISION BY ZERO, LINE NUMBER 118 is printed repeatedly, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The statement at line 118 of the superimposed program appears to read LET R003 = R002 / R001, executed after R001 had been set to zero.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
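The behaviour reported in this example, a diagnostic naming the offending source line and an abort once errors accumulate, can be sketched in Python as follows. This is an illustration under stated assumptions, not the EDITOR's generated code. The class names, the error limit of 3, and the choice to return zero after a trapped division are all hypothetical. Only the message wording is taken from the printout.

```python
class RunAborted(Exception):
    """Raised when the number of data exceptions reaches the error limit."""

class Diagnostics:
    def __init__(self, error_limit):
        self.error_limit = error_limit  # hypothetical limit set by the caller
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Divide, or log a diagnostic naming the source line and return 0."""
        if denominator == 0:
            self.messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if len(self.messages) >= self.error_limit:
                self.messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted(self.messages[-1])
            return 0  # assumption: processing continues with a zero result
        return numerator / denominator

diag = Diagnostics(error_limit=3)
aborted = False
try:
    for _ in range(10):  # each pass repeats the faulty statement
        diag.divide(5, 0, line_number=118)
except RunAborted:
    aborted = True
```

Carrying the source line number into every guarded operation is what lets the diagnostic point back to the user's statement rather than to the generated program.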
[Execution Statistics printout with program text superimposed; the printed error messages are illegible in this copy. Legible lines of the program text follow.]
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
186 END-DO
187 END-DO
Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
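The array coordinate error in this example can be reproduced with a small bounds-checked table. The Python sketch below is an illustration only. The `Table` class, the 1-based subscripts, and the message wording are assumptions, since the actual CONCOR message is illegible in this copy. For simplicity the value read from TA1 is the one written to TA2.

```python
class BoundsError(Exception):
    """Raised when a subscript refers to a nonexistent array element."""

class Table:
    """A fixed-size two-dimensional array with 1-based, checked subscripts."""

    def __init__(self, name, rows, cols):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[0] * cols for _ in range(rows)]

    def _check(self, row, col, line_number):
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise BoundsError(
                f"{self.name}({row},{col}) DOES NOT EXIST, LINE NUMBER {line_number}")

    def get(self, row, col, line_number):
        self._check(row, col, line_number)
        return self.cells[row - 1][col - 1]

    def update(self, row, col, value, line_number):
        self._check(row, col, line_number)
        self.cells[row - 1][col - 1] = value

ta1 = Table("TA1", 10, 10)
ta2 = Table("TA2", 5, 5)
error = None
try:
    for r001 in range(1, 11):          # outer loop, as in source line 175
        for r002 in range(1, 11):      # inner loop, as in source line 178
            value = ta1.get(r001, r002, line_number=180)
            ta2.update(r001, r002, value, line_number=181)
except BoundsError as exc:
    error = str(exc)
```

The first failing reference is the sixth column of the first row, the first subscript pair that exists in the 10 by 10 table but not in the 5 by 5 one.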
[Execution Statistics printout with program text superimposed; parts of the listing are illegible in this copy. Legible column headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND. Legible messages and program lines follow.]
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
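The first-character test behind message DD-002 can be sketched in Python. This is an illustration, not the CONCOR parser. Several characters of the legal set (the delimiter, the command terminator, and the literal string delimiter) did not survive in this copy, so the sketch includes only the characters that are legible and lets the caller supply the rest through the hypothetical `extra_legal` parameter.

```python
import string

# Characters legible in the guide: A-Z, 0-9, hyphen, plus, comma, and the
# keyword separator "=". The remaining legal characters are unknown here.
KNOWN_LEGAL = set(string.ascii_uppercase) | set(string.digits) | set("-+,=")

def check_first_character(text, extra_legal=""):
    """Return a DD-002 style message if the first character is illegal."""
    legal = KNOWN_LEGAL | set(extra_legal)
    if text and text[0] not in legal:
        return "ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING"
    return None
```

For example, `check_first_character("#AREA")` reports the error, while `check_first_character("AREA-CONTROL")` returns nothing.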
APPENDIX F
NEW/OLD COMMAND COMPARISONS
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
  DEFINE INPUT-FILE
  DEFINE ERROR-FILE
  DEFINE COMMON-DATA
  (contains the questionnaire-ID and record type location information)
  DEFINE RECORD-TYPE=
  DEFINE NEW-DATA
  DEFINE ARRAY-DATA
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
  DICTIONARY-ATTRIBUTES-SECTION
    DICTIONARY-NAME
  FILE-SECTION
    INPUT-FILE
    OUTPUT-FILE
    WRITE-FILE
    ERROR-FILE
  IDENTIFICATION-CONTROL-SECTION
    AREA-CONTROL
    QUESTIONNAIRE-CONTROL
    RECORD-CONTROL
  INPUT-RECORD-SECTION
    DEFINE-RECORD
  WORKING-DATA-SECTION
    NEW-DATA
    ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present, even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
VERSION 1 is the December 1978 language; VERSION 2 is the December 1979 language.
VERSION 1: ALLOCATE (ALL)
VERSION 2: ALLOCATE (ALLOC)
COMMENTS: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.
VERSION 1: ASSERT (AST)
VERSION 2: ASSERT (ASRT)
COMMENTS: The ASSERT command is the basic consistency editor command of CONCOR. The command now provides for NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords provide the user with the option of listing the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.
VERSION 1: (see RECODE)
VERSION 2: DRECODE (DRCD)
COMMENTS: Provides a method for recoding data items having a starting value of zero and continuing in ascending sequence. Replaces the previous RECODE command.
VERSION 1: (none)
VERSION 2: EXIT
COMMENTS: Provides the means to terminate execution (escape) of command statements within the UNTIL command.
VERSION 1: END
VERSION 2: END-IF, END-THEN, END-ELSE
COMMENTS: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.
VERSION 1: ERROR
VERSION 2: not implemented
COMMENTS: No longer supported in the language.
VERSION 1: FILTER TYPE ( ) WITH
VERSION 2: not implemented
COMMENTS: Option statements are no longer supported in the language. This command once specified essentially a GOTO DEPENDING ON construction.
VERSION 1: (see XRECODE)
VERSION 2: GRECODE (GRCD)
COMMENTS: This new command provides a method for the group recoding of input values.
VERSION 1: IF
VERSION 2: IF
COMMENTS: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.
VERSION 1: LET
VERSION 2: LET
COMMENTS: Virtually unchanged.
VERSION 1: (none)
VERSION 2: LIST
COMMENTS: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.
VERSION 1: (none)
VERSION 2: OUTPUT
COMMENTS: New command which provides the means by which records will be written to the output file.
VERSION 1: RANGE (RNG)
VERSION 2: RANGE (RNG)
COMMENTS: RANGE is a basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
VERSION 1: RECODE (REC)
VERSION 2: name changed
COMMENTS: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
VERSION 1: STOP-keyword
VERSION 2: STOP-keyword
COMMENTS: Unchanged.
VERSION 1: (none)
VERSION 2: UNTIL DO END-DO
COMMENTS: One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user-identifier (counter), which is incremented with each successive execution of the command statements.
VERSION 1: UPDATE (UPD)
VERSION 2: UPDATE (UPD)
COMMENTS: Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
VERSION 1: WRITE (WRT)
VERSION 2: WRITE (WRT)
COMMENTS: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
VERSION 1: XRECODE (XREC)
VERSION 2: name changed
COMMENTS: Old command which has become the GRECODE command in the December 1979 version.
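The two recode commands in the table above can be pictured with a short Python sketch. This is one reading of the descriptions given, not CONCOR syntax. A direct recode is taken to mean that input codes 0, 1, 2, ... index straight into a table of replacement values, and a group recode is taken to mean mapping ranges of input values onto a single new code. The function names and the age groups are hypothetical.

```python
def direct_recode(value, table):
    """Direct recode: input codes 0, 1, 2, ... index straight into `table`."""
    if not 0 <= value < len(table):
        raise ValueError(f"value {value} has no recode entry")
    return table[value]

def group_recode(value, groups):
    """Group recode: `groups` holds (low, high, new_code) inclusive ranges."""
    for low, high, new_code in groups:
        if low <= value <= high:
            return new_code
    raise ValueError(f"value {value} falls in no group")

# Hypothetical tables for illustration.
DIRECT_TABLE = [0, 10, 20]                            # 0 -> 0, 1 -> 10, 2 -> 20
AGE_GROUPS = [(0, 14, 1), (15, 64, 2), (65, 99, 3)]   # broad age groups
```

Because a direct recode needs no comparison at all, it is a natural candidate for the lookup-table code generation mentioned among the future enhancements in Appendix G.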
DECEMBER 1978 VERSION 1
Not implemented
DECEMBER 1979 VERSION 2
REPORT-DIVISION
  DISPLAY-CONTROL-SECTION
    DISPLAY-EDIT-STATISTICS
  TOLERANCE-CONTROL-SECTION
    ERROR-RATE-CHECK
    REJECT-FILE
END-DIVISION
COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
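The presorting requirement noted in the comments above follows from control-break processing, which starts a new report group whenever the control field changes. The Python sketch below illustrates the effect with `itertools.groupby`, which likewise groups only consecutive records. The field name AREA and the function name are hypothetical, and the sketch does not reproduce CONCOR's report logic.

```python
from itertools import groupby

def area_totals(records, area_field="AREA"):
    """Count records per control area, one entry per run of equal keys."""
    return [(area, sum(1 for _ in group))
            for area, group in groupby(records, key=lambda r: r[area_field])]

# Area "01" appears in two separate places in the unsorted file.
unsorted_records = [{"AREA": "01"}, {"AREA": "02"}, {"AREA": "01"}]
sorted_records = sorted(unsorted_records, key=lambda r: r["AREA"])

broken = area_totals(unsorted_records)   # area "01" is reported twice
correct = area_totals(sorted_records)    # one total per area
```

On the unsorted file area 01 yields two separate totals of one record each. After sorting it yields a single total of two.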
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields.
This implementation would require modification to the
dictionary as well as to the error records passed to the Report
Division. Another problem would be the formatting of the extra
data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL
commands. This would allow the greatest flexibility in choosing
control fields. In order to implement this enhancement, the
dictionary and the generated COBOL program (EDITOR) would
require modification.
(7) Give the user access to the values in the control fields during
the editing process. Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a
questionnaire basis would become meaningless. Another
ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement. This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command. A possible problem here
is in the determination of which value is to be moved to the output
record when the record is to be written out using the OUTPUT
command.
(9) Implement a COMMON-DATA concept to handle items common to all
record types. This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type. Permitting only one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record.
This would require the CONCOR program to perform many more
internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now
permitted. External numeric data (type N) could be expanded to
handle 18-digit numbers. Also, signed numeric and packed
decimal data formats could be added. The signed numeric format
would allow both leading blanks and negative data values. This
format is essential when dealing with data files generated by
FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
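The automatic range check suggested in item (13) amounts to storing the valid range alongside each identifier's declaration and testing every value before user code sees it. The following Python sketch illustrates the idea. The dictionary layout, the identifiers SEX and AGE with their ranges, and the treatment of a missing value as a failure are all assumptions made for illustration.

```python
# Hypothetical dictionary entries: identifier -> (lowest valid, highest valid).
DICTIONARY_RANGES = {"SEX": (1, 2), "AGE": (0, 99)}

def automatic_range_check(record, ranges=DICTIONARY_RANGES):
    """Return the identifiers whose values fall outside their declared range."""
    failures = []
    for identifier, (low, high) in ranges.items():
        value = record.get(identifier)
        # Assumption: a missing value is reported the same way as a bad one.
        if value is None or not low <= value <= high:
            failures.append(identifier)
    return failures
```

Because the ranges live in the dictionary rather than in comment statements, the same table could also be read by tabulation software, as the item notes.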
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilize a lookup table approach.
This would decrease the execution time of the command but would
require additional communication between the GENED and GENDD
programs.
(5) Permit the user to continue the message text field over
successive lines of code.
(6) Implementation of a SET command that would change
CONCOR's default occurrence pointer for the duration of the SET
command. This command would be very helpful, as the validation
of the existence of a record would only have to be done once,
when the SET command was encountered. The major drawback to
this command is in the user's ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code.
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement. This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed. The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
identifiers.
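The two sets of internal identifiers described in item (8) can be sketched as a pair of counters updated together, with only the control-area set cleared at each area break. The Python below is an illustration, not CONCOR's implementation. The class and method names are hypothetical, and IN-RECORD-COUNT is used simply because it is one of the existing internal variables.

```python
class Counters:
    """Keeps a control-area counter set and a parallel run-total set."""

    def __init__(self):
        self.area = {"IN-RECORD-COUNT": 0}       # reset at each area break
        self.run_total = {"IN-RECORD-COUNT": 0}  # hypothetical run-level set

    def read_record(self):
        """Count one input record in both sets."""
        self.area["IN-RECORD-COUNT"] += 1
        self.run_total["IN-RECORD-COUNT"] += 1

    def area_break(self):
        """Reinitialize only the control-area set."""
        for name in self.area:
            self.area[name] = 0

counters = Counters()
for _ in range(3):
    counters.read_record()    # three records in the first control area
counters.area_break()
for _ in range(2):
    counters.read_record()    # two records in the second control area
```

After the second area the control-area count stands at 2 while the run total stands at 5, which is the distinction the item is asking for.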
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed nother program would then be
required to format the information gathered into a readable
form
113 Report Division
(I) Provide a means by which the user may specify their own headings or titles -for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a neans by which the statistics gathered may be
presented in other aqgregations other than the area-break and
total levels now pro ided This would require another
procedure to aggregate the statistics and pass them onto the
Report Division before the printing of the reports began Thi
extra procedure is required because of program size
cons i derat ions
G-5
49
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360370 series type computers
(2) Modification of the GENSRC program i utilize a lookup table instead of the linear search technioa2 now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Hlodification to the system data dictionary to better arrange and make available information needed most frequently Also to examine the possibility of making the section dealing with the message text and report generation a separate file Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth
(5) Make the parsing and syntacial routines interactive so as to increase the product ity of the CONCOR user This does not mean to imply that the editing process itself should be made interactive but rather only the development of the user code should be put online
12 Documentation
(1)The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
G-6
0
50
2 Survey or other Census Processing
21 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected This new input file would contain the master list of those oberservations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifer name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
42
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
0 APPENO H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-TN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible ommission from manual
Not implemented
Note Current documentation equivocates when some variables
are reset--most are initialized with each control area break
Though implemented as reserved identifiers due to poor documentation of (old) CONCOR exact values and utility were unknown
These apparently new counters permit access to valuable totals within control areas
Note These are not cumulative counters Also outputshyquest-count violates naming conversion eg in out
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
C 0 N C o R - E D I T O R EXECUT ION S-TA-T1 ST I CS RUN DATEt 517I
APPENDIX I
CONTROL AREA 0 INPUTS RELD OUTPUTS BY OUTPUTCOHMAND OUTPUTS BY W0ITECOMP4JD NO VALI 0 SEQUENCE NUMBER 00000 00 0 RECTYPES 0
OUESTIONNAIRES RECORDS 0 QUEST KEYWORD- 0 ECORDS- RECORDS FORQUFSTshy
bullDIVISION BY ZERO LINE NUMBER i18 to DIVISION BY ZERO LINE NUMBER 118 bo DIVISION BY ZERO 00 LINE NUMBER 118
Oboo DIVISION BY ZERO 00 LINE NUMBER 118 LTNE |UJMP 000 DIVISION BY ZERO 000 LINE NUMBER 118
bo DIVISION BY ZERO 0 LINE NUMBER 118 00 RUN ABORTED DUE TO DIVISION BY ZERO ERRORS 00 1it LIST IR003 = Ro3t
1 6 42
112 LIST
Comments One of the most sever criticisms 114 LIST IDIVISIION BY ZErO ERROR Cw FOLLOWS of the old CONCOR package was that ]l LET RO01 003 = O it would cease processing for no
apparent reason--ABEND without generating any messages The most )16 LIST iRO01 R003 SET TO ZEROS Pou19 Ron3 conmon types of cessations are division by zero array coordinate 117 LIST RO0 0 = errors and user specified terminations A program to deliberately 118 LET P003 = R002 RO01 test CONCORS ability to providediagnostics was written The system 119 LIST LIST R003 R002RO01 =1 R0031 perfoned exceptionally well and correctly provided the line numbers in the 120 source program which caused the data exceptions The program text has been 121 superimposed upon the Execution Statistics page for illustration purposes 122 END TEST OF OPERANDS AND MSG
123
124
1 5 PEGISTERS RESET TO ZERO
125 LET R001 RO00 R003 = nt
0 N C 1 - Z D I T 0 it - ( r S T A T I S T I C S QJi DAT 0119
I ~ - ~APPENDIX b)APPENDIX TL I A Cr14A) 01 3TUTS JRtTF -0 VtLIQ AP-A I PUT R - 3ITD JTS LY UTPUT - Y C
I)ESTIjAAIES - PECLRW cU] I S T igry-t QEC UP )S irrkqS F] ) T
VLLII= FF= 15( C 1 u U-1 Tr JJ TI A)[ 9 )i 1= TI =WATE ) L i - U = VI L IJ) L T J3r-= I n
D 1 4A 1 EItR P IXSTI J A 1 L = 7 II I fIAT E I i Y C
= y 4 I 19f CZ U AT -Z-2j 1ESTIJh14 E CJ)PI) IATE= 2 V L 1E= L T I P1 =UM 11
= rU F r
F R L- -- I JE TI rjA OrPR eTE= rUMiF=E II V1 LIIF= LIT V M
tYfZU ttj gR R QUIESTI3Ii1 j Vi r IATE= LI n l FJw== 0R V J L T C ( -pJIT E~Q - cUESTT Y2ZjI 1= g rfiflUI)IATL= VeLUJE= P LIf PP= 1 3)
bull~ - A ED 011 1 Tn CCy -tENCE C rGR DI T FOR UP S )r 0 JCCUJ0EICF FJ71 ECGPO
173
175 UNTIL ROO1 gt 10 VARY P001 FROM 1 By Comments
In this example RO01 a 2 dimensional array 177 DO with 10 rows and 10 columns is attempting
178 UNTIL R002 - 10 VARY BO FROM 1 BY 1 to update the smaller (5 X 5) TA2 Nonshy- Uexistent element references caused the error
The program text has been superimposed on this
page for illustration purposes
180 ALLOCATE NOOP = TA1 (RO0] ROO)I
JAI UPDATE TA2 (RO01 R002) RO031
182 LIST N002 = R0037 NOO R003
1P3 LIST VALUES OF TA1 = RO03v TAl (RO01 R002)
184 LIST
185 LIST ITA2=TA2 RO01 Rn02)l
l6 END-DOI LIST
187 END-DO
deg r- -- shy
APPENDIX I
CONTROL AREA    INPUTS READ    OUTPUTS BY OUTPUT COMMAND    OUTPUTS BY WRITE COMMAND
SEQUENCE NUMBER    QUESTIONNAIRES    RECORDS    QUEST KEYWORD    RECORDS
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB

Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.

214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < 2
218 THEN LIST < THAN 20 RECS IN R
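A user-specified termination of this kind can be sketched in Python as follows. The names StopRun and edit_records are hypothetical, and the sketch assumes only what the listing shows: a record counter, a limit of 500, and a stop message that names the source line.

```python
class StopRun(Exception):
    """Signals a user-requested stop, carrying the source line number."""

    def __init__(self, line_number):
        super().__init__(
            f"JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE {line_number}"
        )
        self.line_number = line_number


def edit_records(records, limit=500):
    in_record_count = 0          # plays the role of IN-RECORD-COUNT
    processed = []
    try:
        for record in records:
            in_record_count += 1
            if in_record_count > limit:
                # Line 215 of the listing holds the STOP-RUN command.
                raise StopRun(line_number=215)
            processed.append(record)
    except StopRun as stop:
        return processed, str(stop)
    return processed, "NORMAL END OF JOB"


processed, message = edit_records(range(600))
```

The stop is an orderly end of job with a message, not an abnormal termination.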
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE

WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.

ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.

ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()).

ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
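The first two diagnostics can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the CONCOR parser: the function check_line is hypothetical, strings are split on blanks for simplicity, and the legal character set is limited to the characters that survive in the text above (A-Z, 0-9, -, +, the comma and =), since the remaining special characters are not legible in the source.

```python
import string

# Only the legible members of the legal character set; the others are unknown here.
LEGAL = set(string.ascii_uppercase + string.digits + "-+,=")


def check_line(line):
    """Return a list of (code, message) diagnostics for one command line."""
    if not line.strip():
        return [("DD-001", "BLANK COMMAND STATEMENT LINE")]
    diagnostics = []
    for token in line.split():
        # DD-002 looks only at the first character of each string.
        if token[0] not in LEGAL:
            diagnostics.append(("DD-002", "ILLEGAL FIRST CHARACTER IN STRING"))
    return diagnostics
```

As in the guide, a diagnostic does not stop the scan; checking resumes with the next string on the line.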
OLD CONCOR VERSION 1 DECEMBER 1978
DATA-DICTIONARY
DEFINE INPUT-FILE
DEFINE ERROR-FILE
DEFINE COMMON-DATA
(contains the questionnaire-ID and record type location information)
DEFINE RECORD-TYPE=
DEFINE NEW-DATA
DEFINE ARRAY-DATA
APPENDIX F
NEW CONCOR VERSION 2 DECEMBER 1979
DICTIONARY-DIVISION
DICTIONARY-ATTRIBUTES-SECTION
DICTIONARY-NAME
FILE-SECTION
INPUT-FILE OUTPUT-FILE WRITE-FILE ERROR-FILE
IDENTIFICATION-CONTROL-SECTION
AREA-CONTROL
QUESTIONNAIRE-CONTROL
RECORD-CONTROL
INPUT-RECORD-SECTION
DEFINE-RECORD
WORKING-DATA-SECTION
NEW-DATA ARRAY-DATA
END-DIVISION
COMMENTS
In Version 1 all command keyword labels had to begin in column 1 of the coding sheet. The position notation of Version 2 permits the command to begin in columns 1-72.
The period preceding a statement signifies that it is but a comment card. Accordingly, it should be noted that while the END-DIVISION command must be present, the division name identifier DICTIONARY-DIVISION has not been implemented.
Note: Some parameters for each command have been altered even where general correspondence exists.
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR.
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented.
This figure additionally permits a comparison of the EDIT-SPECIFICATION commands of both CONCOR language versions. Note the changes in abbreviations, and that some keyword options (not shown) have been altered even where general correspondence exists. Individual commands are discussed on the following pages.
CONCOR LANGUAGE COMMAND STATEMENTS
(VERSION 1, DECEMBER 1978, compared with VERSION 2, DECEMBER 1979)

VERSION 1: ALLOCATE (ALL)
VERSION 2: ALLOCATE (ALLOC)
COMMENTS: Allows for the assignment of values and, with the UPDATE command, provides the imputation mechanism. The major difference is that this statement now provides for a receiving-identifier list, and statistics are generated as coded in the GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION.

VERSION 1: ASSERT (AST)
VERSION 2: ASSERT (ASRT)
COMMENTS: The ASSERT command is the basic consistency editing command of CONCOR. The command now provides NOERROR and MESSAGE options. Additionally, the DUMPR and DUMPO keywords give the user the option of dumping the current values of all user-identifiers defined for the record type being processed. Other changes to this command include the PASS END-PASS FAIL END-FAIL constructions for the purpose of maintaining structured logic.

VERSION 1: (see RECODE)
VERSION 2: DRECODE (DRCD)
COMMENTS: Provides a method for recoding data items that have a starting value of zero and continue in ascending sequence. Replaces the previous RECODE command.

VERSION 1: (none)
VERSION 2: EXIT
COMMENTS: Provides the means to terminate execution (escape) of command statements within the UNTIL command.

VERSION 1: END
VERSION 2: END-IF END-THEN END-ELSE
COMMENTS: A solitary END command is no longer permitted in conditional constructions. Note the similarity to COBOL pseudocode.

VERSION 1: ERROR
VERSION 2: not implemented
COMMENTS: No longer supported in the language.

VERSION 1: FILTER TYPE ( ) WITH
VERSION 2: not implemented
COMMENTS: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.

VERSION 1: (see XRECODE)
VERSION 2: GRECODE (GRCD)
COMMENTS: This new command provides a method for the group recoding of input values.

VERSION 1: IF
VERSION 2: IF
COMMENTS: The IF statement has properties found in virtually all programming languages, i.e., it provides a means of evaluating a condition and controlling the execution of alternative logical statements. Similar to pseudocode, a CONCOR programmer must utilize the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

VERSION 1: LET
VERSION 2: LET
COMMENTS: Virtually unchanged.

VERSION 1: (none)
VERSION 2: LIST
COMMENTS: New command which allows for the generation of specific messages, as well as specific individual identifiers, to be displayed in the report of edit statistics by questionnaire.

VERSION 1: (none)
VERSION 2: OUTPUT
COMMENTS: New command which provides the means by which records will be written to the output-file.

VERSION 1: RANGE (RNG)
VERSION 2: RANGE (RNG)
COMMENTS: A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction permitting the specification of logical command paths depending upon the outcome of the range test.

VERSION 1: RECODE (REC)
VERSION 2: name changed
COMMENTS: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

VERSION 1: STOP keyword
VERSION 2: STOP keyword
COMMENTS: Unchanged.

VERSION 1: (none)
VERSION 2: UNTIL DO END-DO
COMMENTS: One of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

VERSION 1: UPDATE (UPD)
VERSION 2: UPDATE (UPD)
COMMENTS: Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.

VERSION 1: WRITE (WRT)
VERSION 2: WRITE (WRT)
COMMENTS: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

VERSION 1: XRECODE (XREC)
VERSION 2: name changed
COMMENTS: Old command which has become the GRECODE command in the December 1979 version.
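The two recoding commands compared above can be pictured with a short Python sketch. It rests on assumptions: the report gives only one-line descriptions, so the functions drecode and grecode, their arguments and the age groups are hypothetical illustrations of the ideas (a direct recode of values that start at zero and ascend, and a group recode of ranges of input values), not CONCOR syntax.

```python
def drecode(value, codes, default=None):
    """Direct recode: input values 0, 1, 2, ... index straight into a code list."""
    if 0 <= value < len(codes):
        return codes[value]
    return default


def grecode(value, groups, default=None):
    """Group recode: return the code of the first (low, high, code) group holding value."""
    for low, high, code in groups:
        if low <= value <= high:
            return code
    return default


# Hypothetical example data, not taken from the report.
RELATIONSHIP_CODES = ["HEAD", "SPOUSE", "CHILD"]
AGE_GROUPS = [(0, 4, 1), (5, 14, 2), (15, 64, 3), (65, 120, 4)]
```

For instance, drecode(2, RELATIONSHIP_CODES) yields "CHILD" and grecode(10, AGE_GROUPS) yields 2; values outside the tables fall through to the default.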
DECEMBER 1978 VERSION 1
Not implemented

DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION

COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
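The presorting requirement can be demonstrated with a Python sketch. The record layout and the function area_totals are hypothetical; the point is only that a program which reports on each contiguous run of an area code, as CONCOR is described as doing, splits an area into several report sections when the file is not presorted.

```python
from itertools import groupby


def area_totals(records):
    """Yield (area, record_count) for each contiguous run of area codes."""
    for area, run in groupby(records, key=lambda rec: rec["area"]):
        yield area, sum(1 for _ in run)


# Hypothetical input: area 01 appears in two noncontiguous places.
unsorted_file = [{"area": "01"}, {"area": "02"}, {"area": "01"}]
presorted_file = sorted(unsorted_file, key=lambda rec: rec["area"])

split_report = list(area_totals(unsorted_file))
merged_report = list(area_totals(presorted_file))
```

With the unsorted file, area 01 is reported twice with one record each; after the external sort it is reported once with both records.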
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers), to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
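The automatic range check proposed in item (13) could look roughly like the following Python sketch. The dictionary contents, the field names and the function out_of_range_items are hypothetical; the sketch assumes only that each user-identifier carries a declared low and high value that can be checked before user edit code sees the record.

```python
# Hypothetical dictionary: user-identifier -> (lowest valid value, highest valid value).
DICTIONARY = {"AGE": (0, 99), "SEX": (1, 2)}


def out_of_range_items(record, dictionary=DICTIONARY):
    """Return the names of items that are missing or outside their declared range."""
    failures = []
    for name, (low, high) in dictionary.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return failures
```

Keeping the valid ranges in the dictionary, not in comment statements, is also what would let tabulation software reuse them.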
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
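Item (9) can be pictured with a Python sketch that combines an imputation array with a frequency count of the values handed out. Everything named here is hypothetical: the class HotDeck, the function edit_ages and the record fields are illustrations of the ALLOCATE/UPDATE idea described in this report, not the code CONCOR generates.

```python
from collections import Counter


class HotDeck:
    """Holds the most recent valid value per cell and counts each allocation."""

    def __init__(self, initial):
        self.cells = dict(initial)      # donor values keyed by cell (here: sex)
        self.allocations = Counter()    # frequency distribution of allocated values

    def update(self, cell, value):
        # Counterpart of UPDATE: remember the latest valid value for this cell.
        self.cells[cell] = value

    def allocate(self, cell):
        # Counterpart of ALLOCATE: hand out the stored value and tally it.
        value = self.cells[cell]
        self.allocations[value] += 1
        return value


def edit_ages(records, deck):
    for rec in records:
        if rec["age"] is None:
            rec["age"] = deck.allocate(rec["sex"])
        else:
            deck.update(rec["sex"], rec["age"])
    return records


deck = HotDeck({1: 30, 2: 28})
edited = edit_ages(
    [{"sex": 1, "age": 40}, {"sex": 1, "age": None}, {"sex": 2, "age": None}],
    deck,
)
```

The Counter handles discrete values only; as the item notes, continuous values would need a separate grouping mechanism.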
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
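The weighting scheme suggested in item (5) amounts to multiplying each sampled value by its sampling weight before totaling. The Python sketch below is a minimal illustration under assumptions: the field names persons and weight and the function weighted_total are hypothetical, and the weights are taken as given instead of being derived from a sampling frame.

```python
def weighted_total(records, field, weight_field="weight"):
    """Sum a field over sample records, scaling each value by its sampling weight."""
    return sum(rec[field] * rec[weight_field] for rec in records)


# Hypothetical sample: each household represents `weight` households in the frame.
sample = [
    {"persons": 4, "weight": 10.0},
    {"persons": 2, "weight": 25.0},
]

estimated_persons = weighted_total(sample, "persons")
```

Without the weights the total would be 6 persons; with them the estimate for the frame is 90.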
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES

1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG

EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT

COMMENTS
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
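The difference between a counter that is initialized at each control area break and a cumulative run total can be sketched in Python. The class Counters and its attribute run_record_count are hypothetical; only the per-area behavior is described in this appendix, and the run-total counter is shown purely for contrast.

```python
class Counters:
    def __init__(self):
        self.in_record_count = 0     # initialized again at every control area break
        self.run_record_count = 0    # hypothetical cumulative counterpart
        self.current_area = None

    def read(self, area):
        """Register one input record belonging to the given control area."""
        if area != self.current_area:
            # Control area break: the per-area counter starts over.
            self.current_area = area
            self.in_record_count = 0
        self.in_record_count += 1
        self.run_record_count += 1


counters = Counters()
for area in ["01", "01", "02"]:
    counters.read(area)
```

After the three reads the per-area counter holds 1 (the single record of area 02), while the run total holds 3.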
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS

CONCOR-EDITOR EXECUTION STATISTICS    RUN DATE
CONTROL AREA    INPUTS READ    OUTPUTS BY OUTPUT COMMAND    OUTPUTS BY WRITE COMMAND
SEQUENCE NUMBER    QUESTIONNAIRES    RECORDS    QUEST KEYWORD    RECORDS
DIVISION BY ZERO    LINE NUMBER 118 (message printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS

Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.

111 LIST R003 = R003
112 LIST
114 LIST DIVISION BY ZERO ERROR FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS R001 R003
117 LIST
118 LET R003 = R002 / R001
119 LIST R003 = R002 / R001 = R003
122 END TEST OF OPERANDS AND MSG
124 REGISTERS RESET TO ZERO
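The diagnostic behavior shown on this page can be sketched in Python. The sketch is an assumption, not the generated EDITOR code: the class Editor and the exception RunAborted are hypothetical, the result of a failed division is arbitrarily set to zero, and the limit of six errors is inferred only from the six messages visible on the listing.

```python
class RunAborted(Exception):
    """Raised when the division-by-zero error limit is reached."""


class Editor:
    def __init__(self, max_zero_divides=6):
        self.max_zero_divides = max_zero_divides   # assumed limit, see lead-in
        self.zero_divides = 0
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        if denominator == 0:
            # Report the offending source line instead of ending without a message.
            self.zero_divides += 1
            self.messages.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
            if self.zero_divides >= self.max_zero_divides:
                raise RunAborted("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
            return 0
        return numerator // denominator


editor = Editor()
aborted = False
try:
    for _ in range(10):
        editor.divide(5, 0, line_number=118)   # line 118 holds the LET with the division
except RunAborted:
    aborted = True
```

Each failure names the source line, and the run ends with an explicit abort message once the limit is reached.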
OLD CONCOR VERSION 1 DECEMBER 1978
EDIT-SPECIFICATION
PROLOG
FILTER TYPE ( )
EPILOG
ALLOCATE (ALL) ASSERT (AST)
END ERROR
FILTER TYPE ( ) WITH
IF LET
RANGE (RNG) RECODE (REC) STOP --keyword
UPDATE (UPD) WRITE (WRT) XRECODE (XREC)
NEW CONCOR VERSION 2 DECEMBER 1979
EXECUTION-DIVISION
REPORT-CONTROL-SECTION
COUNT-IMPUTES GENERATE-EDIT-STATISTICS
EDIT-SPECIFICATION-SECTION
PROLOG-ROUTINE
FILTER-ROUTINE (name-of-record-type)
EPILOG-ROUTINE
ALLOCATE (ALLOC) ASSERT (ASRT) DRECODE (DRCD) END-DIVISION
EXIT
GRECODE (GRCD) IF LET LIST OUTPUT RANGE (RNG)
STOP UNTIL UPDATE (UPD) WRITE (WRT)
END-DIVISION
COMMENTS
Periods preceding division and section names signify that they are treated as comments by CONCOR
END-DIVISION signifies the termination of the CONCOR division organization and must always be present even though division identifiers have not been implemented
This figure additionally permits a comparison of the EDIT-SPECI-FICATION commands of both CONCOR language versions Note the changes in abbreviations and that some keyword options (not shown) have been altered even where general corresponshydence exists Individual commands are discussed on the following pages
CONCOR LANGUAGE COMMAND STATEMENTS
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
ALLOCATE (ALL) ALLOCATE (ALLOC) Allows for the assignment of values and with the UPDATE command providesthe imputation mechanism Major difference is this statement now providesfor a receiving-identifier list and statistics are generated as coded inthe GENERATE-EDIT-STATISTICS command option of the EXECUTION-DIVISION
ASSERT fAST) ASSERT (ASRT) The ASSERT command is the basic consistency editor command of CONCOR Command now provides for a NOERROR and MESSAGE option Additionallyby utilizinq the DUMPR DUMPO keywords provides the user with the option of using all the current value of all user-identifiers defined for the record type being processed Other changes to this command include the pass end-pass fail end-fail constructions for the P purpose of maintaining structured logic
(see RECODE) DRECODE (DRCD) Provides a method for recoding data items having a starting value of zero and continue in ascending sequence Replaces t previous XRECODE command
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN
A solitary END command is no lonqer permitted in conditional construction Note the similarity to COBOL pseudocode
END-ELSE
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continuej)
4)bull
RANGE (RNG) | RANGE (RNG) | A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through the DUMPR and DUMPO options within the command. Another new option of this command is a PASS END-PASS FAIL END-FAIL construction, which permits the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP- keyword | STOP- keyword | Unchanged.
(none) | UNTIL DO END-DO | DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
DECEMBER 1978 VERSION 1 | DECEMBER 1979 VERSION 2 | COMMENTS
Not implemented | REPORT-DIVISION; DISPLAY-CONTROL-SECTION; DISPLAY-EDIT-STATISTICS; TOLERANCE-CONTROL-SECTION; ERROR-RATE-CHECK REJECT-FILE; END-DIVISION | This new division (note: division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records. There is currently no command to enable users to specify unique report headings on the listing from this division.
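The presorting requirement noted above can be illustrated with a short sketch. Python is used purely for illustration and is not part of CONCOR; the record layout and the names `area` and `reports_by_area` are hypothetical. A report routine that starts a new block whenever the control field changes fragments its output on unsorted input, while presorted input yields one block per area.

```python
from itertools import groupby

def reports_by_area(records):
    """Count records per control-area break, starting a new block whenever
    the (hypothetical) 'area' field changes between adjacent records."""
    return [(area, len(list(group)))
            for area, group in groupby(records, key=lambda r: r["area"])]

records = [{"area": "02"}, {"area": "01"}, {"area": "02"}, {"area": "01"}]

# Unsorted input: each area is reported in fragments.
unsorted_report = reports_by_area(records)

# Presorted input (sorted before the editing run): one block per area.
sorted_report = reports_by_area(sorted(records, key=lambda r: r["area"]))
```

The sort is deliberately kept outside `reports_by_area`, mirroring the report's statement that sorting must be done external to the editor.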
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications needed to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system-level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
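The automatic range check proposed in item (13) might behave roughly as in the following sketch. Python is used for illustration only; the identifiers `AGE` and `SEX`, their ranges, and the function name are hypothetical and are not taken from CONCOR.

```python
# Hypothetical dictionary entries: identifier -> (lowest valid, highest valid).
VALID_RANGES = {"AGE": (0, 99), "SEX": (1, 2)}

def out_of_range_items(record, valid_ranges=VALID_RANGES):
    """Return the identifiers whose values are missing from the record
    or fall outside the range declared in the dictionary."""
    failures = []
    for name, (low, high) in valid_ranges.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return failures

failed = out_of_range_items({"AGE": 130, "SEX": 2})
```

Keeping the ranges in the dictionary rather than in comment statements is what would let other software, such as a tabulation package, reuse them.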
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion into the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run-total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
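The lookup table approach suggested in item (4) can be contrasted with sequential comparison in a brief sketch. Python is used for illustration only. The report does not state how the generated COBOL currently performs the recode, so the sequential version is an assumption made for contrast, and the code pairs are hypothetical.

```python
# Hypothetical recode: source codes 0..4 mapped to new values.
RECODE_PAIRS = [(0, 9), (1, 1), (2, 1), (3, 2), (4, 2)]

def recode_linear(value, pairs=RECODE_PAIRS):
    """Compare against each source code in turn; cost grows with table size."""
    for source, target in pairs:
        if value == source:
            return target
    raise ValueError(f"no recode defined for {value}")

# The source codes start at zero and ascend without gaps, so the targets
# can be stored in a list indexed directly by the source code.
RECODE_TABLE = [target for _, target in RECODE_PAIRS]

def recode_lookup(value, table=RECODE_TABLE):
    """Fetch the target in a single indexing step."""
    if not 0 <= value < len(table):
        raise ValueError(f"no recode defined for {value}")
    return table[value]
```

Direct indexing works here only because the codes begin at zero and are contiguous; sparse codes would need a different table structure.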
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR Users Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
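The weighting scheme mentioned in item (5) amounts to multiplying each sampled record's value by its sampling weight before summing. The sketch below is in Python for illustration only; the field names `persons` and `weight` and the sample values are hypothetical.

```python
def weighted_total(records, value_field, weight_field="weight"):
    """Estimate a population total by summing value times sampling weight."""
    return sum(r[value_field] * r[weight_field] for r in records)

# Two hypothetical sampled households, each standing for `weight` households.
sample = [
    {"persons": 4, "weight": 10.0},
    {"persons": 2, "weight": 25.0},
]

unweighted = sum(r["persons"] for r in sample)
estimate = weighted_total(sample, "persons")
```

Without the weights, the summary statistics would describe only the sample and not the population from which it was drawn.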
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown due to poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
DIVISION BY ZERO LINE NUMBER 118 (message printed six times)
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, i.e., ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
Program text (legible lines):
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS R001 R003
118 LET R003 = R002 / R001
122 END TEST OF OPERANDS AND MSG
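The behavior shown on this page (one diagnostic per failing statement, keyed by source line number, followed by an abort) can be sketched as follows. Python is used for illustration only and this is not CONCOR's implementation; the error limit of six is an assumption based on the six messages visible in the listing, and the register values are hypothetical.

```python
def run_statements(statements, max_errors=6):
    """Execute (line_number, function) pairs, logging each division by zero
    with its line number and aborting once max_errors have been logged."""
    messages = []
    errors = 0
    for line_number, statement in statements:
        try:
            statement()
        except ZeroDivisionError:
            errors += 1
            messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if errors >= max_errors:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                break
    return messages

# Hypothetical register contents; R001 is zero, as in the test program.
registers = {"R001": 0, "R002": 5, "R003": 0}

def divide():
    registers["R003"] = registers["R002"] / registers["R001"]

# The failing statement on line 118 is reached repeatedly (eight times here).
log = run_statements([(118, divide)] * 8)
```

The point of the design is that the run continues long enough to report where the fault lies instead of stopping silently on the first exception.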
Comments: In this example TA1, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
Program text (legible lines):
175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
177 DO
178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
180 ALLOCATE N002 = TA1 (R001 R002)
181 UPDATE TA2 (R001 R002) R003
182 LIST N002 = R003 N002 R003
183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
184 LIST
185 LIST TA2 = TA2 (R001 R002)
186 END-DO LIST
187 END-DO
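The array coordinate error described in the comments can be reproduced with a bounds-checked array. The sketch is in Python for illustration only; the class name and the wording of the diagnostic are hypothetical, since the actual message text on the printout is not legible.

```python
class BoundedArray:
    """A 2-dimensional array that rejects references to nonexistent elements."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.cells = [[0] * cols for _ in range(rows)]

    def update(self, row, col, value, line_number):
        """Store value at 1-based (row, col), or raise with a diagnostic."""
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise IndexError(
                f"ARRAY COORDINATE ERROR ({row}, {col}) LINE NUMBER {line_number}")
        self.cells[row - 1][col - 1] = value

ta2 = BoundedArray(5, 5)   # the smaller 5 x 5 array
first_error = None
for r001 in range(1, 11):          # loop bounds sized for the 10 x 10 array
    for r002 in range(1, 11):
        try:
            ta2.update(r001, r002, 0, line_number=181)
        except IndexError as exc:
            first_error = str(exc)
            break
    if first_error:
        break
```

The first failure occurs as soon as the inner counter exceeds the column limit of the smaller array, which is why the diagnostic needs to carry both the coordinates and the source line.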
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
Program text (legible lines):
214 IF IN-RECORD-COUNT > 500
215 THEN STOP-RUN END-THEN
217 IF IN-RECORD-COUNT < 20
218 THEN LIST < THAN 20 RECS
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
EXIT Provides the means to terminate EXECUTION (ESCAPE) of command statements within the UNTIL command
END END-IF END-THEN
A solitary END command is no lonqer permitted in conditional construction Note the similarity to COBOL pseudocode
END-ELSE
ERROR not implemented No longer supported in language
(continued)
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
FILTER TYPE ( ) WITH not implemented No longer are option statements supported in language This command once specified essentially a GOTO depending on construction
(see XRECODE) GRECODE (GRCD) This new command provides a method for the group recoding of input values
IF IF The IF statement has properties found virtually in all programming languages ie provides a means of evaluating a condition and controlling the execution of alternative logical statements Similar to pseudocode a CONCOR programmer must utilize the structured command IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version
LET LET Virtually unchanged
LIST New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire as well as specific individual identifiers
OUTPUT New command which provides the means by which records will be written to the output-file
(continuej)
4)bull
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 VERSION 1
RANGE (RNG)
RECODE (REC)
STOP- keyword
UPDATE (UPD)
(continued)
DECEMBER 1979 VERSION 2
RANGE (RNG)
name changed
STOP- keyword
UNTIL DO END-DO
UPDATE (UPD)
COMMENTS
As a basic editing command of CONCOR the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO optionswithin the command Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction permitting the specification of logicalcommand paths depending upon the outcome of the range test
Name changed to DRECODE Virtually identical with the DRC command in the COCENTS system
Unchanged
DO-END-DO one of the most significant and powerful enhancements to the CONCOR language this command provides a loopingcapability similar to most other high-level pogramminglanguages The number of iterations is under user control as well as the value initially assigned to the user identifier (counter) which is incremented with each successive execution of the command statements
Virtually unchanged it is the basic mechanism for maintaining arrays involved in imputation processes
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
0 0 0
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIViSION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to controlthe type of edit statistics reports to be generated Generally all reports produced by the commands ofthis division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION Notethat this means the input data file must be preproshycessed (presorted external to CONCOR) on the basisof the AREA-CONTROL data field as CONCOR is incapableof merging uncontiguous records
There is currently no command to enable users to specify unique report headings on the listingfrom this division
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
G-1
Appendix G
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census da a using automatic correction techniquns As a result CONCOR should not be construed to be the ultimate software package for generalized data editing The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process A second group of changes or enhancements also exist that could facilitate the editing of survey or other types of census data It should be noted that some of the modifications to meet these two objectives at times are mutually exclusive Some of the possible modifications and considerations are outlined below
1 Housing and Population Census Processing
1 1 Software
1111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added This would allow the creation of edited output files in a different format and length than the file read into CONCOR This implementation would require other enhanceients to provide a simple means of moving values from th-iT i tato tile output file Unique naming of user-identifiers would also need to be addressed Another possiblity is the declaration of both the input and output location in the same command when specifying the contents of tile data record to be edited
(4) Addition of user-identifier naires to fields used in the controlling of summary statistics and unique questionnaire determination This would require modification to the system level data dictionary
46 G-2
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification lo the
dictionary as tell as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of difFerent data item types in the CONTROL
commands This would allow the greatest flexibilty in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value to be move to the output
record when the record is to be written out using the OUTPUT
command 0
(9) Implement a COMMION-DATA concept to handle items command to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This wJould require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IIPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential ihen dealing with data files generated by
FORTRAN programs
0
+
G-3 47
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
112 Execution Division
(M) Implement the Run Control Section This would allow the user to increase execution speed of the EDITOR program This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking In this secion the user could also specify new maximum values for the EDITOR error limits
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered The only problem to resolve is how to treat this id-ntifier outside of the Filter Routines
(3) Implement a CREATE commnand This command would allow the user to create a record that Was not found on the input file This implementation would require special care as it gives the user the ability to change the valme of a CONCOR internal identifier (TYPE-CGUNT) and modify the contents of the store Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new records inclusion into the universe of observations and number of changes
(4) The code generated in the EDITOR program for the DRECODE command should be modified to use a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implement a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run-total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
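The trade-off behind point (4) can be illustrated with a small Python sketch. It does not reproduce the COBOL that GENED generates; the function names and the recode pairs are hypothetical. It contrasts a linear scan over (old, new) pairs with a table built once and consulted directly, and keeps the first pair for any repeated code so that both approaches return the same result.

```python
def recode_linear(value, pairs, default):
    """Scan the (old, new) pairs in order and return the first match."""
    for old, new in pairs:
        if value == old:
            return new
    return default


def build_lookup(pairs):
    """Build a table once; setdefault keeps the first pair for a repeated code."""
    table = {}
    for old, new in pairs:
        table.setdefault(old, new)
    return table


def recode_lookup(value, table, default):
    """Single table probe instead of a scan over every pair."""
    return table.get(value, default)


# Hypothetical recode: detailed codes collapsed into broad groups.
PAIRS = [(1, 10), (2, 10), (3, 20), (4, 20), (5, 30)]
TABLE = build_lookup(PAIRS)

print(recode_linear(4, PAIRS, 99), recode_lookup(4, TABLE, 99))
```

The cost of the scan grows with the number of pairs for every record processed, while the table is built once per run, which is the saving in execution time the point refers to.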
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports begins. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2 Survey or other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of the questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
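What the weighting scheme of point (5) would compute can be shown with a minimal Python sketch. The record layout, the field names PERSONS and WEIGHT, and the sample weights are all hypothetical; the sketch assumes each record carries a single expansion weight derived from the sampling frame, and it uses floating point arithmetic, which point (3) notes is not yet available in CONCOR.

```python
def weighted_total(records, field, weight_field="WEIGHT"):
    """Sum of field values, each multiplied by its record's sampling weight."""
    return sum(rec[field] * rec[weight_field] for rec in records)


def weighted_mean(records, field, weight_field="WEIGHT"):
    """Weighted total divided by the sum of the weights."""
    total_weight = sum(rec[weight_field] for rec in records)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return weighted_total(records, field, weight_field) / total_weight


# Hypothetical sample: two households with different expansion weights.
SAMPLE = [
    {"PERSONS": 4, "WEIGHT": 10.0},
    {"PERSONS": 2, "WEIGHT": 25.0},
]

print(weighted_total(SAMPLE, "PERSONS"))
print(weighted_mean(SAMPLE, "PERSONS"))
```

An unweighted count of these two records would report 6 persons; the weighted total estimates the figure for the population the sample represents.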
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible omission from manual
Not implemented
Note: Current documentation equivocates about when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, their exact values and utility were unknown because of the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention, e.g. IN, OUT.
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason, that is, ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Execution Statistics page, CONTROL AREA 0. The page prints DIVISION BY ZERO LINE NUMBER 118 six times, followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS. The superimposed program text covers source lines 111 through 125 and is only partly legible: lines 115 and 116 set R001 and R003 to zero and list them, line 118 is the LET statement that computes R003 from R002 and R001, and the closing lines report REGISTERS RESET TO ZERO.]
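The behaviour shown on this page (a diagnostic naming the source line for each division by zero, then an abort once too many have occurred) can be sketched in Python. This is not the EDITOR's generated code. The class and method names are hypothetical, the error limit of six is an assumption chosen only to match the six messages on the page, and returning zero after a failed division is likewise an assumption.

```python
class RunAborted(Exception):
    """Raised when the division-by-zero error limit is reached."""


class Diagnostics:
    def __init__(self, error_limit):
        self.error_limit = error_limit  # assumed limit; the real value is not given
        self.errors = 0
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Integer division that reports the source line instead of crashing."""
        if denominator == 0:
            self.errors += 1
            self.messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if self.errors >= self.error_limit:
                self.messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted(self.messages[-1])
            return 0  # assumed result; processing continues until the limit
        return numerator // denominator


diag = Diagnostics(error_limit=6)
try:
    for _ in range(10):
        diag.divide(42, 0, line_number=118)
except RunAborted:
    pass
print("\n".join(diag.messages))
```

The point of the design is that every exception is traceable to a line of user code, which is what the old package failed to provide.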
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Execution Statistics page; the printed diagnostic lines are not legible in this copy. The superimposed program text covers source lines 173 through 187: two nested UNTIL ... VARY ... FROM 1 BY 1 loops (DO ... END-DO) over R001 and R002, the outer one running until R001 > 10. Inside them, line 180 is an ALLOCATE from TA1 (R001 R002), line 181 is an UPDATE of TA2 (R001 R002) with R003, and lines 182 through 185 are LIST statements that print R003, TA1 (R001 R002), and TA2 (R001 R002).]
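The array coordinate error can be reproduced in miniature with the Python sketch below. It assumes 1-based subscripts and a 10 x 10 source table feeding a 5 x 5 target, as the comments describe; the class, the function name, and the wording of the diagnostic are hypothetical and are not CONCOR's actual message text.

```python
class BoundsError(Exception):
    """Raised when a table element that does not exist is referenced."""


class Table:
    def __init__(self, name, rows, cols, fill=0):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[fill] * cols for _ in range(rows)]

    def _check(self, row, col, line_number):
        # Subscripts are 1-based, as in the superimposed program text.
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise BoundsError(
                f"{self.name} ({row} {col}) DOES NOT EXIST LINE NUMBER {line_number}"
            )

    def get(self, row, col, line_number):
        self._check(row, col, line_number)
        return self.cells[row - 1][col - 1]

    def update(self, row, col, value, line_number):
        self._check(row, col, line_number)
        self.cells[row - 1][col - 1] = value


def copy_tables(source, target):
    """Walk the 10 x 10 index space; return the first diagnostic, or None."""
    for r in range(1, 11):
        for c in range(1, 11):
            try:
                value = source.get(r, c, line_number=180)
                target.update(r, c, value, line_number=181)
            except BoundsError as err:
                return str(err)
    return None


TA1 = Table("TA1", 10, 10, fill=7)
TA2 = Table("TA2", 5, 5)
print(copy_tables(TA1, TA2))
```

The first five updates of row 1 succeed; the sixth column does not exist in the smaller table, so the check stops the run and names the offending statement.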
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
[Execution Statistics page. The page prints JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB. The superimposed program text covers source lines 214 through 218: line 214 is an IF test on IN-RECORD-COUNT, line 215 is THEN STOP-RUN END-THEN, and a second IF on IN-RECORD-COUNT with a THEN LIST statement follows.]
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
CONCOR LANGUAGE COMMAND STATEMENTS (continued)

Version 1 (December 1978): FILTER TYPE ( ) WITH
Version 2 (December 1979): not implemented
Comments: Option statements are no longer supported in the language. This command once specified essentially a GOTO, depending on construction.

Version 1 (December 1978): (see XRECODE)
Version 2 (December 1979): GRECODE (GRCD)
Comments: This new command provides a method for the group recoding of input values.

Version 1 (December 1978): IF
Version 2 (December 1979): IF
Comments: The IF statement has properties found in virtually all programming languages, i.e. it provides a means of evaluating a condition and controlling the execution of alternative logical statements. As in pseudocode, a CONCOR programmer must use the structured IF THEN END-THEN ELSE END-ELSE construction in place of the IF THEN END ELSE END statements acceptable to the 1978 version.

Version 1 (December 1978): LET
Version 2 (December 1979): LET
Comments: Virtually unchanged.

Version 1 (December 1978): (no entry)
Version 2 (December 1979): LIST
Comments: New command which allows for the generation of specific messages to be displayed in the report of edit statistics by questionnaire, as well as specific individual identifiers.

Version 1 (December 1978): (no entry)
Version 2 (December 1979): OUTPUT
Comments: New command which provides the means by which records will be written to the output file.
CONCOR LANGUAGE COMMAND STATEMENTS (continued)

Version 1 (December 1978): RANGE (RNG)
Version 2 (December 1979): RANGE (RNG)
Comments: A basic editing command of CONCOR. The December 1979 version permits the listing of the out-of-range values, as well as the entire record or questionnaire, through the DUMPR and DUMPO options within the command. Another new option of this command involves the use of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.

Version 1 (December 1978): RECODE (REC)
Version 2 (December 1979): name changed
Comments: Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.

Version 1 (December 1978): STOP- keyword
Version 2 (December 1979): STOP- keyword
Comments: Unchanged.

Version 1 (December 1978): (no entry)
Version 2 (December 1979): UNTIL DO END-DO
Comments: DO-END-DO is one of the most significant and powerful enhancements to the CONCOR language. This command provides a looping capability similar to that of most other high-level programming languages. The number of iterations is under user control, as is the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.

Version 1 (December 1978): UPDATE (UPD)
Version 2 (December 1979): UPDATE (UPD)
Comments: Virtually unchanged. It is the basic mechanism for maintaining arrays involved in imputation processes.
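The looping construct described in the comments on UNTIL DO END-DO can be approximated in Python for readers unfamiliar with it. The sketch assumes a loop of the form UNTIL counter > 10 VARY counter FROM 1 BY 1, and it assumes the UNTIL condition is tested before each pass through the body; the report does not state whether the test comes before or after the body, so that ordering is a guess. The function name is hypothetical.

```python
def until_vary(start, step, stop_when):
    """Yield successive counter values until stop_when(counter) is true.

    Assumption: the stop condition is evaluated before each pass.
    """
    counter = start
    while not stop_when(counter):
        yield counter
        counter += step


# Counter starts at 1, is incremented by 1, and the loop ends once it exceeds 10.
rows = list(until_vary(1, 1, lambda r: r > 10))
print(rows)
```

Both the starting value and the increment are parameters here, mirroring the statement that the initial value and the number of iterations are under user control.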
CONCOR LANGUAGE COMMAND STATEMENTS (continued)

Version 1 (December 1978): WRITE (WRT)
Version 2 (December 1979): WRITE (WRT)
Comments: The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)

Version 1 (December 1978): XRECODE (XREC)
Version 2 (December 1979): name changed
Comments: Old command which has become the GRECODE command in the December 1979 version.

Version 1 (December 1978): Not implemented
Version 2 (December 1979):
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
Comments: This new division (note: division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications needed to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1 Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Signed numeric and packed decimal data formats could also be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 VERSION 1 | DECEMBER 1979 VERSION 2 | COMMENTS
RANGE (RNG) | RANGE (RNG) | As a basic editing command of CONCOR, the December 1979 version permits the listing of the out-of-range values as well as the entire record or questionnaire through DUMPR and DUMPO options within the command. Another new option of this command involves the utilization of a PASS END-PASS FAIL END-FAIL construction, permitting the specification of logical command paths depending upon the outcome of the range test.
RECODE (REC) | name changed | Name changed to DRECODE. Virtually identical with the DRC command in the COCENTS system.
STOP- keyword | STOP- keyword | Unchanged.
 | UNTIL DO END-DO | DO-END-DO: one of the most significant and powerful enhancements to the CONCOR language, this command provides a looping capability similar to most other high-level programming languages. The number of iterations is under user control, as well as the value initially assigned to the user identifier (counter), which is incremented with each successive execution of the command statements.
UPDATE (UPD) | UPDATE (UPD) | Virtually unchanged; it is the basic mechanism for maintaining arrays involved in imputation processes.
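For readers more familiar with current languages, the looping construction described above (a counter with a user-chosen initial value and increment, repeated until a condition holds) can be sketched in Python. The helper name do_until is hypothetical, and the assumption that the UNTIL condition is tested before each pass is ours; the report does not say when CONCOR evaluates it.

```python
def do_until(start, step, until, body):
    """Run body(counter) repeatedly, varying counter from start by step.

    Assumption: the UNTIL condition is checked before each pass, so the
    body never runs for a counter value that already satisfies it.
    """
    counter = start
    while not until(counter):
        body(counter)
        counter += step
    return counter


# Rough analogue of: UNTIL R001 > 10 VARY R001 FROM 1 BY 1 ... END-DO
visited = []
final = do_until(1, 1, lambda r001: r001 > 10, visited.append)
```

Here the body runs for counter values 1 through 10, and the counter is left at 11 when the loop ends.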
DECEMBER 1978 VERSION 1 | DECEMBER 1979 VERSION 2 | COMMENTS
WRITE (WRT) | WRITE (WRT) | The WRITE language statement, while once utilized as the primary output command, is now used to create an auxiliary or derivative file. Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION. (See WRITE-FILE of DICTIONARY-DIVISION.)
XRECODE (XREC) | name changed | Old command which has become the GRECODE command in the December 1979 version.
Not implemented | REPORT-DIVISION; DISPLAY-CONTROL-SECTION; DISPLAY-EDIT-STATISTICS; TOLERANCE-CONTROL-SECTION; ERROR-RATE-CHECK REJECT-FILE; END-DIVISION | This new division (note: division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging uncontiguous records. There is currently no command to enable users to specify unique report headings on the listing from this division.
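The presorting requirement noted above follows from how control-break reporting works: a report is closed each time the area value changes, so records for one area that are not contiguous produce several separate reports. A minimal Python sketch of the effect, assuming each record is a tuple whose first element is a hypothetical area code (not CONCOR's actual record layout):

```python
from itertools import groupby


def area_reports(records):
    """Return (area, record_count) for each run of consecutive equal areas."""
    return [
        (area, sum(1 for _ in group))
        for area, group in groupby(records, key=lambda rec: rec[0])
    ]


unsorted_file = [("A", 1), ("B", 2), ("A", 3)]
presorted_file = sorted(unsorted_file, key=lambda rec: rec[0])
```

With the unsorted file, area A is reported twice with one record each; after presorting it is reported once with two records.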
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Only permitting one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18 digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. The inclusion of such values is now done by many users in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
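Item (11) in the list above proposes a signed numeric format that tolerates leading blanks and negative values, as found in right-justified fields written by FORTRAN programs. A minimal Python sketch of such a conversion follows; the function name and the choice to return None for an all-blank field are assumptions made for illustration and do not describe how CONCOR would handle the format.

```python
def parse_signed_field(field):
    """Convert a right-justified fixed-width field such as '   -42' to an int.

    Returns None for an all-blank field and raises ValueError for anything
    that is not an optional sign followed only by digits.
    """
    text = field.lstrip(" ")
    if text == "":
        return None
    sign = 1
    if text[0] in "+-":
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a signed numeric field: {field!r}")
    return sign * int(text)
```

Embedded or trailing blanks are rejected rather than silently ignored, so a mis-keyed field surfaces as an error instead of being converted to a wrong value.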
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out of bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance in terms of this new record's inclusion in the universe of observations and number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
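Item (4) in the list above contrasts the code currently generated for DRECODE with a lookup table approach. The trade-off can be sketched in Python as follows. The rule format (low, high, new code) and the age-group values are hypothetical and chosen only for illustration; the report does not describe DRECODE's actual syntax or the code that EDITOR generates.

```python
def recode_linear(value, rules, default=None):
    """Scan the rules in order and return the code of the first matching range."""
    for low, high, new_code in rules:
        if low <= value <= high:
            return new_code
    return default


def build_lookup(rules, lowest, highest, default=None):
    """Precompute one table entry per possible input value."""
    table = [default] * (highest - lowest + 1)
    # Apply rules in reverse so that earlier rules win, matching the scan order.
    for low, high, new_code in reversed(rules):
        for value in range(max(low, lowest), min(high, highest) + 1):
            table[value - lowest] = new_code
    return table


def recode_lookup(value, table, lowest, default=None):
    """Recode with a single index operation instead of a scan."""
    index = value - lowest
    if 0 <= index < len(table):
        return table[index]
    return default


AGE_GROUPS = [(0, 14, 1), (15, 64, 2), (65, 99, 3)]  # hypothetical recode rules
AGE_TABLE = build_lookup(AGE_GROUPS, 0, 99)
```

The table is built once, before any records are processed, which is consistent with the remark that the generating programs would need to exchange more information; each recode at execution time then costs one index operation regardless of the number of rules.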
1.1.3 Report Division
(1) Provide a means by which the user may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports begins. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series type computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification to the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean to imply that the editing process itself should be made interactive, but rather that only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has not been finished nor tested.
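The check-in procedure proposed in item (2) of the list above amounts to comparing the questionnaire identifiers actually received against a master list of those expected. A minimal Python sketch, assuming identifiers are plain strings; the function name and the shape of the result are hypothetical.

```python
def check_in(expected_ids, received_ids):
    """Compare received questionnaire identifiers with the master list."""
    expected = set(expected_ids)
    received = set(received_ids)
    return {
        "missing": sorted(expected - received),     # selected but never received
        "unexpected": sorted(received - expected),  # received but not on the list
    }


report = check_in(["001", "002", "003"], ["001", "003", "004"])
```

In this example case 002 is reported as missing and case 004 as unexpected, which are the two conditions a survey check-in would need to flag before editing begins.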
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g. IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution statistics header: RUN DATE, CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND; the values are illegible in the source]
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
DIVISION BY ZERO LINE NUMBER 118
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
111 LIST R003 = R003
112 LIST
114 LIST DIVISION BY ZERO ERROR FOLLOWS
115 LET R001 R003 = 0
116 LIST R001 R003 SET TO ZEROS R001 R003
117 LIST [remainder of line illegible]
118 LET R003 = R002 / R001
119 LIST R003 R002/R001 = R003
122 END TEST OF OPERANDS AND MSG
[line number illegible] REGISTERS RESET TO ZERO
125 LET R001 R002 R003 = 0
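The diagnostic behaviour praised in the comments above (each data exception reported with its source line number, and the run aborted once the errors accumulate) can be sketched in Python. The limit of six errors simply mirrors the six messages printed on this page; whether CONCOR's actual error limit is six is an assumption, as are the function name and the use of Python's built-in exception.

```python
def run_editor(pairs, max_errors=6):
    """Divide each (r002, r001) pair, logging zero divisors instead of crashing.

    max_errors is an assumed limit; after that many errors the run is aborted.
    """
    results = []
    messages = []
    errors = 0
    for r002, r001 in pairs:
        try:
            results.append(r002 / r001)  # analogue of line 118 in the listing
        except ZeroDivisionError:
            errors += 1
            messages.append("DIVISION BY ZERO LINE NUMBER 118")
            if errors >= max_errors:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                break
    return results, messages
```

The point of the design is that the user always receives an explanation: every exception leaves a message naming the offending line, and the abort itself is announced rather than silent.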
0 N C 1 - Z D I T 0 it - ( r S T A T I S T I C S QJi DAT 0119
I ~ - ~APPENDIX b)APPENDIX TL I A Cr14A) 01 3TUTS JRtTF -0 VtLIQ AP-A I PUT R - 3ITD JTS LY UTPUT - Y C
I)ESTIjAAIES - PECLRW cU] I S T igry-t QEC UP )S irrkqS F] ) T
VLLII= FF= 15( C 1 u U-1 Tr JJ TI A)[ 9 )i 1= TI =WATE ) L i - U = VI L IJ) L T J3r-= I n
D 1 4A 1 EItR P IXSTI J A 1 L = 7 II I fIAT E I i Y C
= y 4 I 19f CZ U AT -Z-2j 1ESTIJh14 E CJ)PI) IATE= 2 V L 1E= L T I P1 =UM 11
= rU F r
F R L- -- I JE TI rjA OrPR eTE= rUMiF=E II V1 LIIF= LIT V M
tYfZU ttj gR R QUIESTI3Ii1 j Vi r IATE= LI n l FJw== 0R V J L T C ( -pJIT E~Q - cUESTT Y2ZjI 1= g rfiflUI)IATL= VeLUJE= P LIf PP= 1 3)
bull~ - A ED 011 1 Tn CCy -tENCE C rGR DI T FOR UP S )r 0 JCCUJ0EICF FJ71 ECGPO
173
175 UNTIL ROO1 gt 10 VARY P001 FROM 1 By Comments
In this example RO01 a 2 dimensional array 177 DO with 10 rows and 10 columns is attempting
178 UNTIL R002 - 10 VARY BO FROM 1 BY 1 to update the smaller (5 X 5) TA2 Nonshy- Uexistent element references caused the error
The program text has been superimposed on this
page for illustration purposes
180 ALLOCATE NOOP = TA1 (RO0] ROO)I
JAI UPDATE TA2 (RO01 R002) RO031
182 LIST N002 = R0037 NOO R003
1P3 LIST VALUES OF TA1 = RO03v TAl (RO01 R002)
184 LIST
185 LIST ITA2=TA2 RO01 Rn02)l
l6 END-DOI LIST
187 END-DO
deg r- -- shy
APPENJDIXI1 J( j
CONTROL AREA INPUTS READ OUTPUTS BY OUTPUT LOMANO OUTPUTS TEltCOMMNDfN
SEQUENCE NUMBER 11T A QUESTIONNAIRES RECORDS QUEST KEYWORD RECORDS SFf- 0 T rr-
JOB STOPPED DUETO USER STOP-RUN COMMAND ON LINC 215 - - -- 2 15 _ _ 4$ A
EDITOR -- NO RMAL END OF J B r T I1)
214 IF IN-RECORD-COUNT gt Comments In this example when the internal variable1215 ITHEN STOP-RUN END-THEN was than
215 TE END-THEN IN-RECORD-COUNT was greater than 500 processing was directed by the source COBOL CONCOR language
216 1 program to cease The program text has been - upon this page for illustrationI _-EC superimposed
S IF IN-RECORD-COUNT- lt 2 - - purposes ~~~~~~~~~~- bull - - ------I-
218~ THEN -LIST lt(THAN 20RECS INR
amp-~~~~~ ntai-beDcent
I
APPENDIX J
DIAGNOSTIC MEFSAGE GUIDE EXAMPLE
APPENDIX J DD-2
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character ()p the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () he delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
CONCOR LANGUAGE COMMAND STATEMENTS (continued)
DECEMBER 1978 DECEMBER 1979 VERSION 1 VERSION 2 COMMENTS
WRITE (WRT) WRITE (WRT) The WRITE language statement while once utilized as the primary output command is now used to create an auxiliary or derivative file Statistics may be accumulated with the specification of the GENERATE-EDIT-STATISTICS option of the EXECUTION-DIVISION (See WRITE-FILE of DICTIONARY-DIVISION)
XRECODE (XREC) name changed Old command which has become the GRECODE command in the December 1979 version
0 0 0
DECEMBER 1978 VERSION 1
DECEMBER 1979 VERSION 2
Not implemented REPORT-DIViSION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION
COMMENTS
This new division (note division and section names treated as comments) provides the means to controlthe type of edit statistics reports to be generated Generally all reports produced by the commands ofthis division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION Notethat this means the input data file must be preproshycessed (presorted external to CONCOR) on the basisof the AREA-CONTROL data field as CONCOR is incapableof merging uncontiguous records
There is currently no command to enable users to specify unique report headings on the listingfrom this division
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
G-1
Appendix G
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census da a using automatic correction techniquns As a result CONCOR should not be construed to be the ultimate software package for generalized data editing The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process A second group of changes or enhancements also exist that could facilitate the editing of survey or other types of census data It should be noted that some of the modifications to meet these two objectives at times are mutually exclusive Some of the possible modifications and considerations are outlined below
1 Housing and Population Census Processing
1 1 Software
1111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added This would allow the creation of edited output files in a different format and length than the file read into CONCOR This implementation would require other enhanceients to provide a simple means of moving values from th-iT i tato tile output file Unique naming of user-identifiers would also need to be addressed Another possiblity is the declaration of both the input and output location in the same command when specifying the contents of tile data record to be edited
(4) Addition of user-identifier naires to fields used in the controlling of summary statistics and unique questionnaire determination This would require modification to the system level data dictionary
46 G-2
(5) Increasing the number of area and questionnaire control fields
This implementation would require modification lo the
dictionary as tell as to the error records passed to the Report
Division Another problem would be the formatting of the extra
data values in the report headings
(6) Permit the mixing of difFerent data item types in the CONTROL
commands This would allow the greatest flexibilty in choosing
control fields In order to implement this enhancement the
dictionary and the generated COBOL program (EDITOR) would
require modification
(7) Give the user access to the values in the control fields during
the editing process Either the user would be prohibited from
changing these values (as is currently done with CONCOR
internal identifiers) to prevent the creation of new questionnaires or the summary information gathered on a
questionnaire basis would become meaningless Another
ramification of this enhancement is how to handle fields defined as alphanumeric as CONCOR internally works in binary
(8) Allow input data records with a similar format to be defined by
the same DEFINE-RECORD command statement This could be
accomplished by permitting multiple values in the RECORD-TYPE
clause of the DEFINE-RECORD command A possible problem here
is in the determination of which value to be move to the output
record when the record is to be written out using the OUTPUT
command 0
(9) Implement a COMMION-DATA concept to handle items command to all
record types This could be costly in terms of core storage
and additional conversion time if a storage area is allocated
for each identifier for each record type Only permitting one
common area for all records would require validation of the
data to ensure the values were exactly the same on each record
This wJould require the CONCOR program to perform many more
internal checks when reading the data file into the store area
(10) Permit the selective inclusion of data items found in the
DEFINE-RECORD command statement into the universe count used
for the determination of tolerance levels when using the COUNT-IIPUTES command statement
(11) Increase the number and types of data format referencing now
permitted External numeric data (type N) could be expanded to
handle 18 digit numbers Also signed numeric and packed
decimal data format could be added The signed numeric format
would allow both leading blanks and negative data values This
format is essential ihen dealing with data files generated by
FORTRAN programs
0
+
G-3 47
(12) Increase the number of comparison strings permitted for alphanumeric data items This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire The inclusion of such values are now done by many users in the form of comment statements This enhancement would also facilitate the use of the dictionary by tabulation software
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma character as valid delimiters The user would still be required to code two commas to receive the system default value for any parameter
(15) Provide a repetition factor in the initialization of arrays
(16) Print the data dictionary name in the titles of the dictionary documentation or provide a means by which the user may specify a title to appear in the listings
112 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program by eliminating some of the system protection features, such as division-by-zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and to modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of the new record's inclusion in the universe of observations and in the number of changes.
(4) The code generated in the EDITOR program for the RECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implement a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run-total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
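Item (4) above proposes a lookup table for recoding. The following is a minimal Python sketch of the idea, not CONCOR's generated COBOL: the recode pairs, the function names, and the assumption that the current code scans the pairs one by one are all hypothetical.

```python
# Hypothetical recode pairs (source code -> recoded value); not taken from CONCOR.
RECODE_PAIRS = [(1, 10), (2, 10), (3, 20), (4, 30)]


def recode_linear(value, pairs, default=None):
    """Scan the pairs one by one; cost grows with the number of pairs."""
    for source, target in pairs:
        if source == value:
            return target
    return default


# Built once, before any records are processed.
RECODE_TABLE = dict(RECODE_PAIRS)


def recode_lookup(value, table, default=None):
    """Single table lookup; cost does not grow with the number of pairs."""
    return table.get(value, default)
```

Both functions return the same result for every input; only the cost per record differs, which matters when the recode runs once per record of a census file.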
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that with this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
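The weighting scheme in item (5) above can be illustrated with a short Python sketch. The field names (`persons`, `weight`) and the sample records are hypothetical; the report does not say how weights would be stored or applied in CONCOR.

```python
def weighted_total(records, value_field, weight_field="weight"):
    """Sum each record's value multiplied by its sampling weight."""
    return sum(rec[value_field] * rec[weight_field] for rec in records)


def weighted_count(records, weight_field="weight"):
    """Estimated number of units in the population: the sum of the weights."""
    return sum(rec[weight_field] for rec in records)


# Two sampled households, each standing for a different number of households.
SAMPLE = [
    {"persons": 4, "weight": 10.0},
    {"persons": 2, "weight": 25.0},
]
```

With these two records the unweighted person count is 6, while the weighted total is 4 x 10 + 2 x 25 = 90, which is the kind of "meaningful total" the item refers to.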
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES

      Old CONCOR                    New COBOL CONCOR
  1   EOF-FLAG                      EOF-FLAG
  2   ERROR-IN-QUESTIONNAIRE        ERRORS-IN-QUESTIONNAIRE
  3   CURRENT-POINTER-VALUE         --
  4   TYPE-COUNT ( )                TYPE-COUNT ( )
  5   RECORD-COUNT                  IN-RECORD-COUNT
  6   QUESTIONNAIRE-COUNT           IN-QUESTIONNAIRE-COUNT
  7   RECORDS-IN-STORE              RECORDS-IN-STORE
  8   INVALID-RECORD-TYPE-COUNT     INVALID-RECORD-TYPE-COUNT
  9   INCOMPLETE-FLAG               INCOMPLETE-FLAG
 10   CONTINUATION-FLAG             CONTINUATION-FLAG
 11   INVALID-RECORD-FLAG           --
      --                            OUT-OF-RANGE
      --                            NOT-NUMBER
      --                            BLANK
      --                            (new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT

Change in spelling of variable.
Not mentioned--possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset--most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR, exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[Execution Statistics printout, control area 0: the message "DIVISION BY ZERO LINE NUMBER 118" appears repeatedly, followed by "RUN ABORTED DUE TO DIVISION BY ZERO ERRORS". The superimposed program text (source lines 112-125) consists of LET and LIST statements on registers R001, R002, and R003; line 118 is a LET statement that computes R003 from R002 and R001.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason--ABEND without generating any messages. The most common types of cessations are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
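The behavior shown on this page (a message naming the source line for each division by zero, then an abort once too many have occurred) can be sketched in Python as follows. This is an illustration under stated assumptions, not the EDITOR's logic: the class and method names are hypothetical, the error limit of 3 used in the usage note is arbitrary, and returning zero after a trapped error is an assumption. Integer division is used because the report lists floating point as a future enhancement.

```python
class RunAborted(Exception):
    """Raised when the number of trapped errors reaches the limit."""


class Diagnostics:
    def __init__(self, error_limit):
        self.error_limit = error_limit  # assumed to be user-settable
        self.errors = 0
        self.messages = []

    def divide(self, numerator, denominator, line_number):
        """Integer division that reports the source line instead of crashing."""
        if denominator == 0:
            self.errors += 1
            self.messages.append(f"DIVISION BY ZERO LINE NUMBER {line_number}")
            if self.errors >= self.error_limit:
                self.messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted(self.messages[-1])
            return 0  # assumption: the result is set to zero and the run continues
        return numerator // denominator
```

For example, with `Diagnostics(error_limit=3)`, two calls to `divide(1, 0, 118)` each log a message and return zero, and the third logs the abort message and raises `RunAborted`.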
[Execution Statistics printout: the message text on this page is not legible. The superimposed program text (source lines 173-187) consists of two nested UNTIL ... VARY ... FROM 1 BY 1 ... DO loops over R001 and R002, containing an ALLOCATE from TA1 (R001 R002), an UPDATE of TA2 (R001 R002), and several LIST statements, closed by two END-DO statements.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
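The failure described in the comment above can be reproduced with a small Python sketch of a bounds-checked table. The class, the wording of the message, and the line number 181 are assumptions made for illustration; only the 5 x 5 size of TA2 and the loops running from 1 to 10 come from the page.

```python
class BoundsError(Exception):
    """Raised when a table reference falls outside the declared dimensions."""


class Table:
    def __init__(self, name, rows, cols):
        self.name = name
        self.rows = rows
        self.cols = cols
        self.cells = [[0] * cols for _ in range(rows)]

    def update(self, row, col, value, line_number):
        # Subscripts are 1-based, matching the VARY ... FROM 1 loops.
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise BoundsError(
                f"{self.name} ({row} {col}) DOES NOT EXIST, LINE NUMBER {line_number}"
            )
        self.cells[row - 1][col - 1] = value


def first_bad_reference():
    """Walk a 10 x 10 index space while updating a 5 x 5 table."""
    ta2 = Table("TA2", 5, 5)
    for r001 in range(1, 11):
        for r002 in range(1, 11):
            try:
                ta2.update(r001, r002, 1, line_number=181)
            except BoundsError:
                return (r001, r002)
    return None
```

The first nonexistent element is reached as soon as the inner subscript passes 5, so the check fires on the first row rather than after corrupting adjacent storage.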
[Execution Statistics printout: "JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215", followed by "EDITOR -- NORMAL END OF JOB". The superimposed program text (source lines 214-218) contains an IF test on IN-RECORD-COUNT whose THEN clause, on line 215, issues STOP-RUN.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE

WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.

ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.

ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).

ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
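The first two diagnostics above can be sketched in Python as a simple line check. This is an illustration only: the legal character set below contains just the characters that are legible in this guide (A-Z, 0-9, -, +, the comma, and =), the remaining legal characters are unknown here, and splitting strings on blanks is an assumption about how the parser separates them.

```python
import string

# Assumption: only the characters legible in the guide; the real set is larger.
LEGAL_CHARACTERS = set(string.ascii_uppercase + string.digits + "-+,=")

BLANK_LINE = "WARNING (DD-001) BLANK COMMAND STATEMENT LINE"
ILLEGAL_FIRST = "ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING"


def diagnose_line(line):
    """Return the diagnostic messages raised by one command statement line."""
    if not line.strip():
        return [BLANK_LINE]
    messages = []
    for token in line.split():  # assumption: strings are separated by blanks
        if token[0] not in LEGAL_CHARACTERS:
            messages.append(ILLEGAL_FIRST)
    return messages
```

As in the guide, a bad string does not stop the check: parsing resumes with the next string, so one line can yield several messages.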
DECEMBER 1978 VERSION 1
Not implemented

DECEMBER 1979 VERSION 2
REPORT-DIVISION
DISPLAY-CONTROL-SECTION
DISPLAY-EDIT-STATISTICS
TOLERANCE-CONTROL-SECTION
ERROR-RATE-CHECK REJECT-FILE
END-DIVISION

COMMENTS
This new division (note that division and section names are treated as comments) provides the means to control the type of edit statistics reports to be generated. Generally, all reports produced by the commands of this division are organized according to the AREA-CONTROL command of the DICTIONARY-DIVISION. Note that this means the input data file must be preprocessed (presorted external to CONCOR) on the basis of the AREA-CONTROL data field, as CONCOR is incapable of merging noncontiguous records.
There is currently no command to enable users to specify unique report headings on the listing from this division.
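The presorting requirement noted above can be illustrated with Python's `itertools.groupby`, which, like the area-break reporting described here, only groups records that are adjacent in the input. The record layout and the `area` field name are hypothetical.

```python
from itertools import groupby


def area_break_counts(records, area_field="area"):
    """Count records per control area, breaking whenever the area value changes."""
    return [
        (area, sum(1 for _ in group))
        for area, group in groupby(records, key=lambda rec: rec[area_field])
    ]


PRESORTED = [{"area": "A"}, {"area": "A"}, {"area": "B"}]
UNSORTED = [{"area": "A"}, {"area": "B"}, {"area": "A"}]
```

On the presorted file each area is reported once; on the unsorted file area A is reported twice, as two separate breaks, because its records are not contiguous.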
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
Future Versions of CONCOR: Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census data using automatic correction techniques. As a result, CONCOR should not be construed to be the ultimate software package for generalized data editing. The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process. A second group of changes or enhancements also exists that could facilitate the editing of survey or other types of census data. It should be noted that some of the modifications needed to meet these two objectives are at times mutually exclusive. Some of the possible modifications and considerations are outlined below.
1. Housing and Population Census Processing
1.1 Software
1.1.1 Dictionary Division
(1) Implementation of the Dictionary Attributes Section. This would allow the user to specify the dictionary size (number of pages) on disk and to control other file characteristics, such as volume specification and retention periods, if applicable.
(2) Addition of an optional input and/or output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process. This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values.
(3) An output record format area (Output Record Section) could be added. This would allow the creation of edited output files in a different format and length than the file read into CONCOR. This implementation would require other enhancements to provide a simple means of moving values from the input data to the output file. Unique naming of user-identifiers would also need to be addressed. Another possibility is the declaration of both the input and output location in the same command when specifying the contents of the data record to be edited.
(4) Addition of user-identifier names to fields used in the controlling of summary statistics and unique questionnaire determination. This would require modification to the system-level data dictionary.
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. Many users now include such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
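The automatic range check proposed in item (13) above can be sketched in Python. The identifier names and ranges below are hypothetical, and keeping the ranges in a dictionary keyed by user-identifier is an assumption standing in for the DEFINE-RECORD declarations.

```python
# Hypothetical valid ranges, as they might be declared alongside each user-identifier.
VALID_RANGES = {
    "AGE": (0, 99),
    "SEX": (1, 2),
}


def out_of_range_items(record, ranges=VALID_RANGES):
    """Return the names of the data items whose values fall outside their declared range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= record[name] <= high
    ]
```

Because the ranges live with the declarations rather than in comments, the same table could be read by other software, which is the benefit the item mentions for tabulation programs.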
112 Execution Division
(M) Implement the Run Control Section This would allow the user to increase execution speed of the EDITOR program This would be accomplished by eliminating some of the system protection features such as division by zero and out of bounds checking In this secion the user could also specify new maximum values for the EDITOR error limits
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered The only problem to resolve is how to treat this id-ntifier outside of the Filter Routines
(3) Implement a CREATE commnand This command would allow the user to create a record that Was not found on the input file This implementation would require special care as it gives the user the ability to change the valme of a CONCOR internal identifier (TYPE-CGUNT) and modify the contents of the store Another consequence of this command is what action is to be taken in the calculation of tolerance in terms of this new records inclusion into the universe of observations and number of changes
iK
48 G-4
(4) The code generated in the EDITOR program for the DRECODE
command should be modified to utilized a lookup table approach
This would decrease the execution time of the command but woul C
require additional communication between the GENED and GENDD
programs
(5) Permit the user to continue the message text field over
successive lines of code
(6) Implementation of a SET command command that would change
CONCORs default occurrence pointer for the duration of the SET
command This command would be ever helpful as the validation
of the existence of a record would only have to be done once
when the SET command was encounterd The major drawback to
this command is in the users ability to remember the action
taken during the last execution of the command when the
Execution Division command statements may cover many pages of
code
(7) Implement a method of specifying different data formats for
fields on the WRITE command statement This would give greater
flexibility to the user but would require major revision to the
routines now used to process the WRITE command
(8) Provide the user with two sets of internal identifiers The
first set currently exists in the system and is valid on the
basis of the current control area (if specified) being
executed The second set would be accumulated on a run total
basis and would require the creation of new CONCOR internal
ident ifiers
(9) Implement automatic array declarations for capturing allocation
frequency distributions This could be done by the system for
discrete values (especially if point number 13 above is
implemented) but a mechanism for handling continuous values
would need to be designed nother program would then be
required to format the information gathered into a readable
form
113 Report Division
(I) Provide a means by which the user may specify their own headings or titles -for the reports
(2) Give the user the ability to override page ejection as a means
of saving paper
(3) Provide a neans by which the statistics gathered may be
presented in other aqgregations other than the area-break and
total levels now pro ided This would require another
procedure to aggregate the statistics and pass them onto the
Report Division before the printing of the reports began Thi
extra procedure is required because of program size
cons i derat ions
G-5
49
114 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360370 series type computers
(2) Modification of the GENSRC program i utilize a lookup table instead of the linear search technioa2 now used
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system
(4) Hlodification to the system data dictionary to better arrange and make available information needed most frequently Also to examine the possibility of making the section dealing with the message text and report generation a separate file Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth
(5) Make the parsing and syntacial routines interactive so as to increase the product ity of the CONCOR user This does not mean to imply that the editing process itself should be made interactive but rather only the development of the user code should be put online
12 Documentation
(1)The development of a self-teaching CONCOR manual
(2) The development of a CONCOR Users Guide
G-6
0
50
2 Survey or other Census Processing
21 Software
(1) Make optional the specification of the record type and questionnaire identification information as some files are not
hierarchical in nature
(2) Provide another input file as a means of doing a check-in procedure of sample cases expected This new input file would contain the master list of those oberservations selected to be
examined
(3) Allow floating point calculations
(4) Provide a mechanism for referencing repetitive sections of questionnaire This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented Note that by using this approach two versions of the software would be required One version would be as CONCOR is now implemented and the second would accept flat data files where the subscript indicated a repetitive section on the survey document
(5) Add a weighting scheme to allow meaningful totals and summary
statistics based upon the sampling frame used to gather the data being edited
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms
(7) Provide a means of manual correction of erroneous data items This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifer name and
questionnaire identification CELADE has a COBOL version of
this program but it has not been finished nor tested
42
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
0 APPENO H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-TN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT OUTPUT-QUEST-COUNT WRITE-RECORD-COUNT
Change in spelling of variable
Not mentioned--possible ommission from manual
Not implemented
Note Current documentation equivocates when some variables
are reset--most are initialized with each control area break
Though implemented as reserved identifiers due to poor documentation of (old) CONCOR exact values and utility were unknown
These apparently new counters permit access to valuable totals within control areas
Note These are not cumulative counters Also outputshyquest-count violates naming conversion eg in out
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
C 0 N C o R - E D I T O R EXECUT ION S-TA-T1 ST I CS RUN DATEt 517I
APPENDIX I
CONTROL AREA 0 INPUTS RELD OUTPUTS BY OUTPUTCOHMAND OUTPUTS BY W0ITECOMP4JD NO VALI 0 SEQUENCE NUMBER 00000 00 0 RECTYPES 0
OUESTIONNAIRES RECORDS 0 QUEST KEYWORD- 0 ECORDS- RECORDS FORQUFSTshy
bullDIVISION BY ZERO LINE NUMBER i18 to DIVISION BY ZERO LINE NUMBER 118 bo DIVISION BY ZERO 00 LINE NUMBER 118
Oboo DIVISION BY ZERO 00 LINE NUMBER 118 LTNE |UJMP 000 DIVISION BY ZERO 000 LINE NUMBER 118
bo DIVISION BY ZERO 0 LINE NUMBER 118 00 RUN ABORTED DUE TO DIVISION BY ZERO ERRORS 00 1it LIST IR003 = Ro3t
1 6 42
112 LIST
Comments One of the most sever criticisms 114 LIST IDIVISIION BY ZErO ERROR Cw FOLLOWS of the old CONCOR package was that ]l LET RO01 003 = O it would cease processing for no
apparent reason--ABEND without generating any messages The most )16 LIST iRO01 R003 SET TO ZEROS Pou19 Ron3 conmon types of cessations are division by zero array coordinate 117 LIST RO0 0 = errors and user specified terminations A program to deliberately 118 LET P003 = R002 RO01 test CONCORS ability to providediagnostics was written The system 119 LIST LIST R003 R002RO01 =1 R0031 perfoned exceptionally well and correctly provided the line numbers in the 120 source program which caused the data exceptions The program text has been 121 superimposed upon the Execution Statistics page for illustration purposes 122 END TEST OF OPERANDS AND MSG
123
124
1 5 PEGISTERS RESET TO ZERO
125 LET R001 RO00 R003 = nt
0 N C 1 - Z D I T 0 it - ( r S T A T I S T I C S QJi DAT 0119
I ~ - ~APPENDIX b)APPENDIX TL I A Cr14A) 01 3TUTS JRtTF -0 VtLIQ AP-A I PUT R - 3ITD JTS LY UTPUT - Y C
I)ESTIjAAIES - PECLRW cU] I S T igry-t QEC UP )S irrkqS F] ) T
VLLII= FF= 15( C 1 u U-1 Tr JJ TI A)[ 9 )i 1= TI =WATE ) L i - U = VI L IJ) L T J3r-= I n
D 1 4A 1 EItR P IXSTI J A 1 L = 7 II I fIAT E I i Y C
= y 4 I 19f CZ U AT -Z-2j 1ESTIJh14 E CJ)PI) IATE= 2 V L 1E= L T I P1 =UM 11
= rU F r
F R L- -- I JE TI rjA OrPR eTE= rUMiF=E II V1 LIIF= LIT V M
tYfZU ttj gR R QUIESTI3Ii1 j Vi r IATE= LI n l FJw== 0R V J L T C ( -pJIT E~Q - cUESTT Y2ZjI 1= g rfiflUI)IATL= VeLUJE= P LIf PP= 1 3)
bull~ - A ED 011 1 Tn CCy -tENCE C rGR DI T FOR UP S )r 0 JCCUJ0EICF FJ71 ECGPO
173
175 UNTIL ROO1 gt 10 VARY P001 FROM 1 By Comments
In this example RO01 a 2 dimensional array 177 DO with 10 rows and 10 columns is attempting
178 UNTIL R002 - 10 VARY BO FROM 1 BY 1 to update the smaller (5 X 5) TA2 Nonshy- Uexistent element references caused the error
The program text has been superimposed on this
page for illustration purposes
180 ALLOCATE NOOP = TA1 (RO0] ROO)I
JAI UPDATE TA2 (RO01 R002) RO031
182 LIST N002 = R0037 NOO R003
1P3 LIST VALUES OF TA1 = RO03v TAl (RO01 R002)
184 LIST
185 LIST ITA2=TA2 RO01 Rn02)l
l6 END-DOI LIST
187 END-DO
deg r- -- shy
APPENJDIXI1 J( j
CONTROL AREA INPUTS READ OUTPUTS BY OUTPUT LOMANO OUTPUTS TEltCOMMNDfN
SEQUENCE NUMBER 11T A QUESTIONNAIRES RECORDS QUEST KEYWORD RECORDS SFf- 0 T rr-
JOB STOPPED DUETO USER STOP-RUN COMMAND ON LINC 215 - - -- 2 15 _ _ 4$ A
EDITOR -- NO RMAL END OF J B r T I1)
214 IF IN-RECORD-COUNT gt Comments In this example when the internal variable1215 ITHEN STOP-RUN END-THEN was than
215 TE END-THEN IN-RECORD-COUNT was greater than 500 processing was directed by the source COBOL CONCOR language
216 1 program to cease The program text has been - upon this page for illustrationI _-EC superimposed
S IF IN-RECORD-COUNT- lt 2 - - purposes ~~~~~~~~~~- bull - - ------I-
218~ THEN -LIST lt(THAN 20RECS INR
amp-~~~~~ ntai-beDcent
I
APPENDIX J
DIAGNOSTIC MEFSAGE GUIDE EXAMPLE
APPENDIX J DD-2
WARNING(DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION A blank command statement line was found in the user Dictionary Division command statement file
ACTION TAKEN Parsing continued with the next command statement line in the Dictionary Division command file
USER RESPONSE Put a comment indicator () in column 1 of the command statement line if spacing is desired if not remove the command statement line
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION A character not in the CONCOR legal character set (A-Z 0-9 - + the comma character () the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found as the first character in the string
ACTION TAKEN Parsing began again with the next string
USER RESPONSE Check for possible keying errors and ensure that the first character is a member of the valid CONCOR
character set
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION A string starting with a character not valid in the CONCOR legal character set (A-Z 0-9 - + the comma character ()p the delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () he delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
APPENDIX G
ISPC FUTURE ENHANCEMENT LIST
G-1
Appendix G
Future Versions of CONCOR Design Considerations
The present version of the CONCOR system (Basic COBOL Version 2) was designed to provide adequate computing power to properly edit housing and population census da a using automatic correction techniquns As a result CONCOR should not be construed to be the ultimate software package for generalized data editing The developers of this version of CONCOR realize that there is a group of improvements that can be made to the system which would facilitate the census data editing process A second group of changes or enhancements also exist that could facilitate the editing of survey or other types of census data It should be noted that some of the modifications to meet these two objectives at times are mutually exclusive Some of the possible modifications and considerations are outlined below
1 Housing and Population Census Processing
1 1 Software
1111 Dictionary Division
(1) Implementation of the Dictionary Attributes Section This would allow the user to specify the dictionary size (number of pages) on disk and control other file characteristics such as volume specification and retention periods if applicable
(2) Addition of an optional input andor output file(s) which could be used to initialize and save values stored in arrays and used during the allocation process This feature would require additional syntax analysis and the creation of new commands to perform the movement of the data values
(3) An output record format area (Output Record Section) could be added This would allow the creation of edited output files in a different format and length than the file read into CONCOR This implementation would require other enhanceients to provide a simple means of moving values from th-iT i tato tile output file Unique naming of user-identifiers would also need to be addressed Another possiblity is the declaration of both the input and output location in the same command when specifying the contents of tile data record to be edited
(4) Addition of user-identifier naires to fields used in the controlling of summary statistics and unique questionnaire determination This would require modification to the system level data dictionary
(5) Increasing the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers), to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is in the determination of which value is to be moved to the output record when the record is to be written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Also, signed numeric and packed decimal data formats could be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed prior to CONCOR giving the user access to the data items on the questionnaire. Many users now record such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and the comma character as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
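The delimiter rule proposed in item (14) can be sketched briefly. The sketch below is written in Python purely for illustration and is not part of CONCOR; the function name `split_parameters`, the `DEFAULT` marker, and the sample parameter values are hypothetical. It assumes that a run of spaces counts as one delimiter, that a comma is also a delimiter, and that an empty slot between two commas requests the system default.

```python
DEFAULT = None  # hypothetical marker meaning "use the system default"


def split_parameters(text):
    """Split a parameter list on spaces and commas.

    Assumptions of this sketch: spaces around a comma are ignored, a run
    of spaces is a single delimiter, and an empty slot between two commas
    selects the default value for that parameter.
    """
    if not text.strip():
        return []
    params = []
    for slot in text.strip().split(","):
        words = slot.split()
        if words:
            params.extend(words)    # one or more space-delimited values
        else:
            params.append(DEFAULT)  # two consecutive commas: default
    return params
```

Under these assumptions, `split_parameters("AGE 12 2")` and `split_parameters("AGE,12,2")` yield the same three values, while `split_parameters("AGE,12,,N")` yields a default marker in the third position.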
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program. This would be accomplished by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of what action is to be taken in the calculation of tolerance, in terms of this new record's inclusion into the universe of observations and the number of changes.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implementation of a SET command. This command would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command lies in the user's ability to remember the action taken during the last execution of the command when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if item (13) of the Dictionary Division list above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
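The lookup table approach suggested in item (4) can be contrasted with a linear search in a short sketch. This is a Python illustration only, not the COBOL that GENED would generate; the function names and the `(low, high, new_value)` range layout are hypothetical, and the sketch assumes recoded values are small non-negative integers.

```python
def recode_linear(value, ranges):
    """Linear search: test each (low, high, new_value) range in order."""
    for low, high, new_value in ranges:
        if low <= value <= high:
            return new_value
    return value  # no range matched: leave the value unchanged


def build_lookup(ranges, max_value):
    """Precompute one table entry for every value from 0 to max_value."""
    table = list(range(max_value + 1))  # default: a value maps to itself
    # Apply ranges last-to-first so that the first matching range wins,
    # which keeps the table consistent with recode_linear.
    for low, high, new_value in reversed(ranges):
        for v in range(max(low, 0), min(high, max_value) + 1):
            table[v] = new_value
    return table
```

The table is built once, before any records are read; each recode during editing then costs a single index operation (`table[value]`) instead of a scan over all ranges. The price is the storage for the table and the work of building it, which is presumably the kind of trade-off that the extra communication between the generating programs would have to support.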
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
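The separate aggregation procedure described in item (3) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not a description of the Report Division: it assumes that statistics arrive keyed by a tuple of control-field values (for example province and district), and the function name `aggregate` and the counter names used in the usage note are hypothetical.

```python
from collections import defaultdict


def aggregate(area_stats, key_length):
    """Sum per-area counters over the first key_length control fields."""
    totals = defaultdict(lambda: defaultdict(int))
    for area_key, counters in area_stats.items():
        group = area_key[:key_length]  # e.g. keep only the province code
        for name, count in counters.items():
            totals[group][name] += count
    return {group: dict(counters) for group, counters in totals.items()}
```

With a `key_length` of 1, districts keyed as `("01", "001")` and `("01", "002")` are combined under `("01",)`; a `key_length` of 0 reproduces a single run total.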
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program, to be used on IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not mean that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2. Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
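The check-in procedure proposed in item (2) amounts to comparing a master list of expected cases with the questionnaires actually found on the data file. A minimal Python sketch follows; it is illustrative only, and the function name `check_in`, the result keys, and the identifier values are hypothetical.

```python
def check_in(master_ids, received_ids):
    """Compare expected sample cases with the questionnaires actually read."""
    expected = set(master_ids)
    seen = set()
    duplicates = set()
    for qid in received_ids:
        if qid in seen:
            duplicates.add(qid)  # the same questionnaire appeared twice
        seen.add(qid)
    return {
        "missing": sorted(expected - seen),     # selected but never received
        "unexpected": sorted(seen - expected),  # received but not selected
        "duplicates": sorted(duplicates),
    }
```

For a master list of `A1`, `A2`, `A3` and received cases `A1`, `A3`, `A3`, `B9`, the sketch reports `A2` as missing, `B9` as unexpected, and `A3` as duplicated.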
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset; most are initialized with each control area break.
Though implemented as reserved identifiers, due to poor documentation of (old) CONCOR their exact values and utility were unknown.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN-, OUT-).
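The distinction drawn in these notes, between counters that restart at each control area break and cumulative totals, can be shown in a small Python sketch. It is a hedged illustration, not CONCOR's implementation: the `Counters` class is hypothetical, only IN-RECORD-COUNT is modelled, and the run-total counterpart is an assumed addition of the kind proposed in Appendix G rather than an existing internal variable.

```python
class Counters:
    """Per-control-area counter plus a hypothetical run-total counterpart."""

    def __init__(self):
        self.area = {"IN-RECORD-COUNT": 0}  # restarts at each area break
        self.run = {"IN-RECORD-COUNT": 0}   # assumed cumulative counterpart
        self.current_area = None

    def read_record(self, control_area):
        # A change of control area is treated as an area break:
        # the area counters restart, the run totals keep accumulating.
        if control_area != self.current_area:
            self.current_area = control_area
            for name in self.area:
                self.area[name] = 0
        self.area["IN-RECORD-COUNT"] += 1
        self.run["IN-RECORD-COUNT"] += 1
```

After reading two records from area `01` and one from area `02`, the area counter holds 1 while the run total holds 3, which is why a test such as "stop after 500 records" behaves differently depending on which kind of counter it consults.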
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
[CONCOR-EDITOR Execution Statistics printout, partly illegible. Legible messages: DIVISION BY ZERO, LINE NUMBER 118 (repeated several times), followed by RUN ABORTED DUE TO DIVISION BY ZERO ERRORS.]
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason (an ABEND) without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
[Superimposed program text, source lines 112-125, partly illegible. Legible fragments: a LIST announcing that a division by zero error follows; a LET setting R001 and R003 to zero; line 118, a LET assigning R003 from R002 and R001; a LIST reading END TEST OF OPERANDS AND MSG; and a LIST reading REGISTERS RESET TO ZERO.]
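The behaviour described in the comments above, reporting the source line of each division by zero and aborting once too many have occurred, can be sketched as follows. This is a Python illustration under stated assumptions, not the generated EDITOR program: the statement layout, the `run` function, the `RunAborted` exception, and the `error_limit` default are hypothetical, and the message wording only imitates the legible parts of the printout.

```python
class RunAborted(Exception):
    """Raised when the number of data exceptions reaches the error limit."""


def run(statements, registers, error_limit=6):
    """Execute (line_number, target, numerator, denominator) divisions."""
    messages = []
    errors = 0
    for line_number, target, num, den in statements:
        if registers[den] == 0:
            # Report the offending source line instead of failing silently.
            errors += 1
            messages.append(f"DIVISION BY ZERO  LINE NUMBER {line_number}")
            if errors >= error_limit:
                messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                raise RunAborted("\n".join(messages))
            continue
        registers[target] = registers[num] // registers[den]
    return messages
```

The point of the design is that the check happens before the division, so the run can name the source line and continue with the next statement until the limit is reached, rather than ending in an unexplained abnormal termination.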
[CONCOR-EDITOR Execution Statistics printout for the array coordinate test; the page is largely illegible.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 x 5) TA2. Nonexistent element references caused the error. The program text has been superimposed on this page for illustration purposes.
[Superimposed program text, source lines 173-187, partly illegible. Legible fragments: two nested DO loops, the outer reading UNTIL R001 > 10 VARY R001 FROM 1 BY 1 and the inner controlled by R002 with a limit of 10; an ALLOCATE referencing TA1 (R001 R002); an UPDATE of TA2 (R001 R002) with R003; LIST statements printing values of TA1 and TA2; and two END-DO statements.]
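The array coordinate error in this example comes from indexing a 5 x 5 table with loop variables that run to 10. A minimal Python sketch of a bounds-checked update is given below; the `Table` class, the `BoundsError` exception, the message wording, and the line number passed in the usage note are hypothetical and are not taken from CONCOR.

```python
class BoundsError(Exception):
    """Raised when a table reference names a nonexistent element."""


class Table:
    """A 1-based two-dimensional table with explicit bounds checking."""

    def __init__(self, name, rows, cols):
        self.name, self.rows, self.cols = name, rows, cols
        self.cells = [[0] * cols for _ in range(rows)]

    def update(self, row, col, value, line_number):
        # Validate both coordinates before touching storage.
        if not (1 <= row <= self.rows and 1 <= col <= self.cols):
            raise BoundsError(
                f"{self.name}({row},{col}) OUT OF BOUNDS  LINE NUMBER {line_number}"
            )
        self.cells[row - 1][col - 1] = value
```

Driving a `Table("TA2", 5, 5)` with two nested loops from 1 to 10 succeeds for the first five columns of row 1 and then fails at element (1,6), the first reference outside the smaller table.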
[CONCOR-EDITOR Execution Statistics printout, partly illegible. Legible messages: JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215, followed by EDITOR -- NORMAL END OF JOB.]
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
[Superimposed program text, source lines 214-218, partly illegible. Legible fragments: line 214, IF IN-RECORD-COUNT > ...; line 215, THEN STOP-RUN END-THEN; a second IF testing IN-RECORD-COUNT with a less-than comparison; and line 218, THEN LIST ...]
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.
ACTION TAKEN Parsing began again with the next user command statement line in the Dictionary Division command file
USER RESPONSE Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z 0-9 - + the comma character () he delimiter character () the keyword separator character (=) the command terminator character () and the literal string delimiter ())
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION The comment indicator character () was found embedded in a string
(5) Increase the number of area and questionnaire control fields. This implementation would require modification to the dictionary as well as to the error records passed to the Report Division. Another problem would be the formatting of the extra data values in the report headings.
(6) Permit the mixing of different data item types in the CONTROL commands. This would allow the greatest flexibility in choosing control fields. In order to implement this enhancement, the dictionary and the generated COBOL program (EDITOR) would require modification.
(7) Give the user access to the values in the control fields during the editing process. Either the user would be prohibited from changing these values (as is currently done with CONCOR internal identifiers) to prevent the creation of new questionnaires, or the summary information gathered on a questionnaire basis would become meaningless. Another ramification of this enhancement is how to handle fields defined as alphanumeric, as CONCOR internally works in binary.
(8) Allow input data records with a similar format to be defined by the same DEFINE-RECORD command statement. This could be accomplished by permitting multiple values in the RECORD-TYPE clause of the DEFINE-RECORD command. A possible problem here is determining which value is to be moved to the output record when the record is written out using the OUTPUT command.
(9) Implement a COMMON-DATA concept to handle items common to all record types. This could be costly in terms of core storage and additional conversion time if a storage area is allocated for each identifier for each record type. Permitting only one common area for all records would require validation of the data to ensure the values were exactly the same on each record. This would require the CONCOR program to perform many more internal checks when reading the data file into the store area.
(10) Permit the selective inclusion of data items found in the DEFINE-RECORD command statement into the universe count used for the determination of tolerance levels when using the COUNT-IMPUTES command statement.
(11) Increase the number and types of data format referencing now permitted. External numeric data (type N) could be expanded to handle 18-digit numbers. Signed numeric and packed decimal data formats could also be added. The signed numeric format would allow both leading blanks and negative data values. This format is essential when dealing with data files generated by FORTRAN programs.
(12) Increase the number of comparison strings permitted for alphanumeric data items. This would allow for easier processing of alphanumeric strings but would require modifications to the system data dictionary.
(13) If a range of valid values were specified with the declaration of a user-identifier in the DEFINE-RECORD command, an automatic range check could be performed before CONCOR gives the user access to the data items on the questionnaire. Many users now record such values in the form of comment statements. This enhancement would also facilitate the use of the dictionary by tabulation software.
(14) The language format used in the Dictionary Division could be modified to recognize both the space and comma characters as valid delimiters. The user would still be required to code two commas to receive the system default value for any parameter.
(15) Provide a repetition factor in the initialization of arrays.
(16) Print the data dictionary name in the titles of the dictionary documentation, or provide a means by which the user may specify a title to appear in the listings.
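The automatic range check proposed in item (13) can be illustrated with a short sketch. It is written in Python rather than CONCOR or COBOL, and the item names (AGE, SEX), their ranges, and the ItemSpec structure are hypothetical stand-ins for dictionary entries, not part of the CONCOR system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemSpec:
    """Hypothetical dictionary entry: an item name and its valid range."""
    name: str
    low: int
    high: int


def range_check(record, specs):
    """Return the names of items that are missing or outside their range."""
    failures = []
    for spec in specs:
        value = record.get(spec.name)
        if value is None or not (spec.low <= value <= spec.high):
            failures.append(spec.name)
    return failures


# Assumed ranges, declared once alongside the item definitions.
SPECS = [ItemSpec("AGE", 0, 98), ItemSpec("SEX", 1, 2)]

if __name__ == "__main__":
    # The check runs before any user edit logic sees the record.
    print(range_check({"AGE": 34, "SEX": 3}, SPECS))
```

Because the ranges live with the item declarations, the same entries could in principle be read by tabulation software, which is the secondary benefit the item mentions.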
1.1.2 Execution Division
(1) Implement the Run Control Section. This would allow the user to increase the execution speed of the EDITOR program by eliminating some of the system protection features, such as division by zero and out-of-bounds checking. In this section the user could also specify new maximum values for the EDITOR error limits.
(2) Provide an internal identifier (OCCURRENCE-PTR) which would contain the occurrence number of the record currently being filtered. The only problem to resolve is how to treat this identifier outside of the Filter Routines.
(3) Implement a CREATE command. This command would allow the user to create a record that was not found on the input file. This implementation would require special care, as it gives the user the ability to change the value of a CONCOR internal identifier (TYPE-COUNT) and modify the contents of the store. Another consequence of this command is the question of how the new record's inclusion in the universe of observations and the number of changes should be treated in the calculation of tolerance.
(4) The code generated in the EDITOR program for the DRECODE command should be modified to utilize a lookup table approach. This would decrease the execution time of the command but would require additional communication between the GENED and GENDD programs.
(5) Permit the user to continue the message text field over successive lines of code.
(6) Implement a SET command that would change CONCOR's default occurrence pointer for the duration of the SET command. This command would be very helpful, as the validation of the existence of a record would only have to be done once, when the SET command was encountered. The major drawback to this command is that the user must remember the action taken during the last execution of the command, when the Execution Division command statements may cover many pages of code.
(7) Implement a method of specifying different data formats for fields on the WRITE command statement. This would give greater flexibility to the user but would require major revision to the routines now used to process the WRITE command.
(8) Provide the user with two sets of internal identifiers. The first set currently exists in the system and is valid on the basis of the current control area (if specified) being executed. The second set would be accumulated on a run total basis and would require the creation of new CONCOR internal identifiers.
(9) Implement automatic array declarations for capturing allocation frequency distributions. This could be done by the system for discrete values (especially if point number 13 above is implemented), but a mechanism for handling continuous values would need to be designed. Another program would then be required to format the information gathered into a readable form.
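The trade-off behind item (4) can be shown with a small sketch that contrasts a linear search through recode rules with a table built once and indexed directly. This is a Python illustration only; the rule layout (low, high, new value) and the age groups are assumptions, and nothing here reflects the code the EDITOR program actually generates.

```python
# Hypothetical recode rules: (low, high, new_value) triples.
RULES = [(0, 14, 1), (15, 64, 2), (65, 98, 3)]


def recode_linear(value, rules=RULES):
    """Test each rule in turn; cost grows with the number of rules."""
    for low, high, new_value in rules:
        if low <= value <= high:
            return new_value
    return None


def build_lookup(rules=RULES):
    """Expand the rules once into a value -> new_value table."""
    table = {}
    for low, high, new_value in rules:
        for value in range(low, high + 1):
            table[value] = new_value
    return table


LOOKUP = build_lookup()


def recode_lookup(value, table=LOOKUP):
    """Single table access per value, regardless of the number of rules."""
    return table.get(value)


if __name__ == "__main__":
    print(recode_linear(40), recode_lookup(40))
```

The table costs storage proportional to the range of values covered, which is the kind of expense the report weighs against execution time elsewhere in this list.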
1.1.3 Report Division
(1) Provide a means by which users may specify their own headings or titles for the reports.
(2) Give the user the ability to override page ejection as a means of saving paper.
(3) Provide a means by which the statistics gathered may be presented in aggregations other than the area-break and total levels now provided. This would require another procedure to aggregate the statistics and pass them on to the Report Division before the printing of the reports began. This extra procedure is required because of program size considerations.
1.1.4 System Level
(1) Generation of assembler (ALC) language code for the EDITOR program to be used on the IBM 360/370 series computers.
(2) Modification of the GENSRC program to utilize a lookup table instead of the linear search technique now used.
(3) Modification of the RDMSGTXT program to look at continuation lines when searching for the text to be supplied by the system.
(4) Modification of the system data dictionary to better arrange and make available the information needed most frequently. Also, examine the possibility of making the section dealing with the message text and report generation a separate file. Investigate the industry standards for dictionary format and see how CONCOR meets the requirements set forth.
(5) Make the parsing and syntactical routines interactive so as to increase the productivity of the CONCOR user. This does not imply that the editing process itself should be made interactive; only the development of the user code should be put online.
1.2 Documentation
(1) The development of a self-teaching CONCOR manual.
(2) The development of a CONCOR User's Guide.
2 Survey or Other Census Processing
2.1 Software
(1) Make optional the specification of the record type and questionnaire identification information, as some files are not hierarchical in nature.
(2) Provide another input file as a means of doing a check-in procedure of the sample cases expected. This new input file would contain the master list of those observations selected to be examined.
(3) Allow floating point calculations.
(4) Provide a mechanism for referencing repetitive sections of a questionnaire. This referencing (loop letter approach) could be accomplished in the same fashion as interrecord checking is now implemented. Note that by using this approach two versions of the software would be required. One version would be as CONCOR is now implemented, and the second would accept flat data files where the subscript indicated a repetitive section on the survey document.
(5) Add a weighting scheme to allow meaningful totals and summary statistics based upon the sampling frame used to gather the data being edited.
(6) Increase the number of user-identifiers allowed in the system because of the more detailed information found on survey forms.
(7) Provide a means of manual correction of erroneous data items. This correction scheme would utilize the data dictionary and allow for correction based upon the user-identifier name and questionnaire identification. CELADE has a COBOL version of this program, but it has been neither finished nor tested.
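As a rough illustration of what the weighting scheme in item (5) would compute, the sketch below forms a weighted total and a weighted mean for one item. It is written in Python; the field names WEIGHT and CHILDREN and the sample values are invented for the example, and the report does not say how weights would be derived or stored.

```python
def weighted_summary(records, item, weight_field="WEIGHT"):
    """Return (weighted total, weighted mean) of `item` over the records.

    Each record is assumed to carry a sampling weight; how that weight is
    derived from the sampling frame is outside the scope of this sketch.
    """
    total_weight = sum(r[weight_field] for r in records)
    weighted_total = sum(r[weight_field] * r[item] for r in records)
    mean = weighted_total / total_weight if total_weight else None
    return weighted_total, mean


# Invented sample: three questionnaires with unequal weights.
SAMPLE = [
    {"WEIGHT": 10, "CHILDREN": 3},
    {"WEIGHT": 20, "CHILDREN": 5},
    {"WEIGHT": 20, "CHILDREN": 2},
]

if __name__ == "__main__":
    total, mean = weighted_summary(SAMPLE, "CHILDREN")
    print(total, mean)
```

An unweighted count of the same three records would give a total of 10 children, which shows why unweighted summary statistics from a sample survey can mislead.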
APPENDIX H
CONCOR SYSTEM INTERNAL VARIABLES
1 EOF-FLAG
2 ERROR-IN-QUESTIONNAIRE
3 CURRENT-POINTER-VALUE
4 TYPE-COUNT ( )
5 RECORD-COUNT
6 QUESTIONNAIRE-COUNT
7 RECORDS-IN-STORE
8 INVALID-RECORD-TYPE-COUNT
9 INCOMPLETE-FLAG
10 CONTINUATION-FLAG
11 INVALID-RECORD-FLAG
EOF-FLAG
ERRORS-IN-QUESTIONNAIRE
TYPE-COUNT ( )
IN-RECORD-COUNT
IN-QUESTIONNAIRE-COUNT
RECORDS-IN-STORE
INVALID-RECORD-TYPE-COUNT
INCOMPLETE-FLAG
CONTINUATION-FLAG
OUT-OF-RANGE
NOT-NUMBER
BLANK
(new internal counters) OUT-RECORD-COUNT, OUTPUT-QUEST-COUNT, WRITE-RECORD-COUNT
Change in spelling of variable.
Not mentioned; possible omission from manual.
Not implemented.
Note: Current documentation equivocates on when some variables are reset. Most are initialized with each control area break.
Though these were implemented as reserved identifiers, their exact values and utility were unknown because of the poor documentation of (old) CONCOR.
These apparently new counters permit access to valuable totals within control areas.
Note: These are not cumulative counters. Also, OUTPUT-QUEST-COUNT violates the naming convention (e.g., IN, OUT).
APPENDIX I
CONCOR-EDITOR EXECUTION STATISTICS
CONCOR - EDITOR EXECUTION STATISTICS (run date illegible)
Report headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND
DIVISION BY ZERO   LINE NUMBER 118
DIVISION BY ZERO   LINE NUMBER 118
DIVISION BY ZERO   LINE NUMBER 118
DIVISION BY ZERO   LINE NUMBER 118
DIVISION BY ZERO   LINE NUMBER 118
DIVISION BY ZERO   LINE NUMBER 118
RUN ABORTED DUE TO DIVISION BY ZERO ERRORS
Comments: One of the most severe criticisms of the old CONCOR package was that it would cease processing for no apparent reason: it would ABEND without generating any messages. The most common types of cessation are division by zero, array coordinate errors, and user-specified terminations. A program to deliberately test CONCOR's ability to provide diagnostics was written. The system performed exceptionally well and correctly provided the line numbers in the source program which caused the data exceptions. The program text has been superimposed upon the Execution Statistics page for illustration purposes.
  111 LIST R003 = R003
  112 LIST
  114 LIST DIVISION BY ZERO ERROR FOLLOWS
  115 LET R001 R003 = 0
  116 LIST R001 R003 SET TO ZEROS R001 R003
  117 LIST [illegible]
  118 LET R003 = R002 / R001
  119 LIST R003 = R002/R001 = R003
  122 END TEST OF OPERANDS AND MSG
  125 [illegible; registers reset to zero]
CONCOR - EDITOR EXECUTION STATISTICS
[The array coordinate error messages printed on this page are illegible in this copy.]
Comments: In this example R001, a 2-dimensional array with 10 rows and 10 columns, is attempting to update the smaller (5 X 5) TA2. Non-existent element references caused the error. The program text has been superimposed on this page for illustration purposes.
  175 UNTIL R001 > 10 VARY R001 FROM 1 BY 1
  177 DO
  178 UNTIL R002 > 10 VARY R002 FROM 1 BY 1
  180 ALLOCATE N002 = TA1 (R001 R002)
  181 UPDATE TA2 (R001 R002) R003
  182 LIST N002 = R003 N002 R003
  183 LIST VALUES OF TA1 = R003 TA1 (R001 R002)
  184 LIST
  185 LIST TA2 = TA2 (R001 R002)
  186 END-DO
  187 END-DO
CONCOR - EDITOR EXECUTION STATISTICS
Report headings: CONTROL AREA, SEQUENCE NUMBER, INPUTS READ (QUESTIONNAIRES, RECORDS), OUTPUTS BY OUTPUT COMMAND, OUTPUTS BY WRITE COMMAND
JOB STOPPED DUE TO USER STOP-RUN COMMAND ON LINE 215
EDITOR -- NORMAL END OF JOB
Comments: In this example, when the internal variable IN-RECORD-COUNT was greater than 500, processing was directed by the source COBOL CONCOR language program to cease. The program text has been superimposed upon this page for illustration purposes.
  214 IF IN-RECORD-COUNT > 500
  215 THEN STOP-RUN END-THEN
  217 IF IN-RECORD-COUNT < 20
  218 THEN LIST [illegible] THAN 20 RECS IN [illegible]
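The behavior this appendix documents (report the source line of each data exception, then abort once too many have occurred) can be sketched as follows. This is a Python illustration, not the COBOL that CONCOR generates. The error limit of six, the choice to skip the rest of a record after an exception, and the run_editor and divide names are assumptions made for the example; only the message wording follows the pages above.

```python
def run_editor(program, records, max_errors=6):
    """Run `program` once per record, trapping division by zero.

    `program` is a list of (source_line_number, action) pairs, where each
    action takes a dict of registers. Returns the diagnostic messages.
    """
    messages = []
    errors = 0
    for record in records:
        registers = dict(record)
        for line_no, action in program:
            try:
                action(registers)
            except ZeroDivisionError:
                errors += 1
                messages.append(f"DIVISION BY ZERO  LINE NUMBER {line_no}")
                if errors >= max_errors:  # assumed error limit
                    messages.append("RUN ABORTED DUE TO DIVISION BY ZERO ERRORS")
                    return messages
                break  # assumption: skip the rest of this record
    messages.append("NORMAL END OF JOB")
    return messages


def divide(registers):
    """Counterpart of source line 118: R003 = R002 / R001."""
    registers["R003"] = registers["R002"] // registers["R001"]


PROGRAM = [(118, divide)]

if __name__ == "__main__":
    bad_records = [{"R001": 0, "R002": 5, "R003": 0}] * 6
    for message in run_editor(PROGRAM, bad_records):
        print(message)
```

Carrying the source line number with each generated statement is what lets the message point back at the user's program rather than at the generated code.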
APPENDIX J
DIAGNOSTIC MESSAGE GUIDE EXAMPLE
WARNING (DD-001) BLANK COMMAND STATEMENT LINE
EXPLANATION: A blank command statement line was found in the user Dictionary Division command statement file.
ACTION TAKEN: Parsing continued with the next command statement line in the Dictionary Division command file.
USER RESPONSE: Put a comment indicator () in column 1 of the command statement line if spacing is desired; if not, remove the command statement line.
ERROR (DD-002) ILLEGAL FIRST CHARACTER IN STRING
EXPLANATION: A character not in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found as the first character in the string.
ACTION TAKEN: Parsing began again with the next string.
USER RESPONSE: Check for possible keying errors and ensure that the first character is a member of the valid CONCOR character set.
ERROR (DD-003) END OF LINE REACHED WITH NO DELIMITER FOR ILLEGAL STRING
EXPLANATION: A string starting with a character not valid in the CONCOR legal character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()) was found without one of the delimiter characters ( =) when the end of the command line was reached.
ACTION TAKEN: Parsing began again with the next user command statement line in the Dictionary Division command file.
USER RESPONSE: Check for possible keying errors and ensure that each of the characters in the string is a member of the CONCOR valid character set (A-Z, 0-9, -, +, the comma character (,), the delimiter character (), the keyword separator character (=), the command terminator character (), and the literal string delimiter ()).
ERROR (DD-004) COMMENT INDICATOR () IN STRING
EXPLANATION: The comment indicator character () was found embedded in a string.