
Surgical skills assessment: A blinded examination of obstetrics and gynecology residents



Barbara A. Goff, MD,a Peter E. Nielsen, MD,b Gretchen M. Lentz, MD,a Greg E. Chow, MD,b Robert W. Chalmers, MD,b Dee Fenner, MD,a and Lynn S. Mandel, PhDa

Seattle and Tacoma, Wash

OBJECTIVE: We have previously shown that objective structured assessment of technical skills (OSATS) is an innovative, reliable, and valid method of assessing surgical skills. Our goal was to establish the feasibility, reliability, and validity of our surgical skills assessment instrument when administered in a blinded fashion.

STUDY DESIGN: A 7-station OSATS was administered to 16 obstetrics and gynecology residents from Madigan Army Medical Center. The test included laparoscopic (salpingostomy, intracorporeal knot, and ligation of vessels with clips) and open abdominal procedures (subcuticular closure, bladder neck suspension, enterotomy repair, and abdominal wall closure). All tasks were performed with lifelike surgical models. Residents were timed and assessed at each station with 3 methods of scoring: task-specific checklist, global rating scale, and pass/fail grade. Each resident was evaluated by one examiner blinded as to the postgraduate year level and one examiner who had previously worked with the resident.

RESULTS: Assessment of construct validity (the ability to distinguish between resident levels) found significant differences on the checklist, global rating scale, and pass/fail grade by residency level for both blinded and unblinded examiners. Reliability indices calculated with Cronbach's α were .82 for the checklists and .93 for the global rating scale. Overall interrater reliability between blinded and unblinded examiners was 0.95 for the global rating scale and ranged from 0.74 to 0.97 for the checklists. The cost to administer the examination for the 16 residents was approximately $1000.

CONCLUSIONS: OSATS administered in either a blinded or unblinded fashion can assess residents' surgical skills with a high degree of reliability and validity. This study provides further evidence that OSATS can be used to establish surgical competence. (Am J Obstet Gynecol 2002;186:613-7.)

Key words: Surgical skills assessment

From the Department of Obstetrics and Gynecology,a University of Washington School of Medicine, and the Department of Obstetrics and Gynecology,b Madigan Army Medical Center.
Supported in part by a grant from the National Board of Medical Examiners (NBME) Medical Education Research Fund and by an educational grant from Ortho-McNeil.
Charles Hunter Award Paper, to be presented at the Annual Meeting of the American Gynecological and Obstetrical Society, September 12-14, 2002, Hot Springs, Virginia.
The project does not necessarily reflect NBME policy, and NBME support provides no official endorsement. The views expressed in this article are those of the authors and do not reflect the official policy or position of the Department of the Army, the Department of Defense, or the United States Government.
Reprint requests: Barbara A. Goff, MD, Department of Obstetrics and Gynecology, Box 356460, University of Washington School of Medicine, Seattle, WA 98195-6460.
Copyright 2002, Mosby, Inc. All rights reserved. doi:10.1067/mob.2002.122145

Evaluation of residents' surgical skills is usually performed by subjective faculty assessment. The assessment is typically performed at the end of the rotation and is based on the recollection of how the resident performed during that rotation. This type of assessment has been shown to have poor reliability and validity and is often biased by factors other than technical skill.1 Despite the importance of technical skills for surgical competency, <1% of obstetrics and gynecology residencies actually test the technical skills of their residents.2 Once out in practice, there are no requirements to document skill competency, putting the surgical professions in stark contrast to commercial airline pilots, who undergo extensive and continually updated certification.

During the past several years, we have developed an objective structured assessment of technical skills (OSATS) for obstetrics and gynecology residents.3,4 This examination tool is based on the work of Reznick et al5 from the University of Toronto. We have designed an assessment that can be administered in animals or surgical models and can evaluate laparoscopic and open abdominal surgical skills. In our previous studies3,4 we have shown that OSATS is a feasible method to assess surgical skills and can be done with high reliability, interrater reliability, and validity. We have also found that testing in bench models is equivalent to testing in animal models, with the advantage that examinations conducted in bench models are significantly less costly and more easily administered.

Although the results of our initial studies have been promising, additional research is still needed to verify the results. In our previous studies, the examinations were conducted only at the University of Washington in a single residency program. Additional information about feasibility can be obtained by administering the examination to residents from other programs. Another concern has been that all of the examiners (University of Washington faculty) had worked closely with the residents, which could have biased the results. The purpose of our current study was to perform the OSATS in a blinded fashion and compare blinded scores with unblinded scores to determine the reliability and validity of this surgical assessment tool.

Material and methods

We administered a written examination and a surgical skills assessment examination to 16 of the 20 obstetrics and gynecology residents at Madigan Army Medical Center. The examination was given in September 2000, on a single day from 9 AM to 5 PM. It was similar to the OSATS that had been administered at the University of Washington and previously described in detail.3,4 A 7-station examination was developed, which included laparoscopic (linear salpingostomy, intracorporeal knot, and ligation of vessels with clips) and open abdominal procedures (abdominal wall closure, repair of enterotomy, bladder neck suspension, and subcuticular closure). Tasks were selected for a range of difficulty based on our previous research.

Each task was performed in a surgical model purchased from Limbs and Things, Inc (Bristol, England). Because of the need to evaluate laparoscopic skills, the testing was performed in the animal surgical facilities so that we would have access to laparoscopic equipment and cameras. For each procedure, the faculty member acts as a qualified assistant but does only what is asked by the resident and provides no input on surgical management. The resident is responsible for choosing the appropriate instruments and suture and for directing the assistant. For instruments and suture, appropriate choices as well as distractors are provided.

The examiners consisted of 3 faculty members from the University of Washington (1 gynecologic oncologist and 2 urogynecologists) who were completely blinded to any information about the residents. In addition, 6 faculty members from Madigan Army Medical Center (1 gynecologic oncologist, 2 urogynecologists, 1 perinatologist, and 2 generalists) served as the unblinded examiners. The University of Washington faculty had participated in our previous studies of OSATS. The Madigan faculty had participated in a 60-minute orientation session on how to conduct the examination a month before the test date. Each resident was graded by one University of Washington faculty member and 2 Madigan faculty members. Faculty members were assigned specific procedures, and the residents rotated through the tasks. Each faculty member evaluated the same task for every resident.

Three evaluation methods, which have been reported previously,3,4 were used to score the residents. Briefly, we used a task-specific checklist, a global rating scale, and an overall pass/fail judgment. The total possible score for the global rating scale was 35, and the total possible scores for the checklists ranged from 26 to 52, depending on the individual task. Residents could fail for 2 reasons: poor performance or inability to complete the task in 10 minutes. The reason for failure was recorded for each task. A checklist score, global rating scale score, pass/fail grade, and time to perform the procedure were recorded for each task the resident performed.

After completing the skills assessment, residents were also asked to complete a written examination on surgical anatomy, sutures, and laparoscopy. This written test has been evaluated previously for reliability and validity.8,9 In addition, we had all residents fill out an evaluation form about their experiences.

Statistical analyses were performed with SPSS for Windows, version 10.0 (SPSS, Inc, Chicago, Ill). Internal consistency, which is a measure of the reliability of the examination, was calculated with Cronbach's α. Interrater reliability was calculated with intraclass correlation coefficients. Validity, the extent to which the test measures what it is intended to measure, is very difficult to assess in a single study, so educators often settle for proxy measures, such as construct validity. Construct validity, which is the ability to distinguish between residency levels, was assessed by analyzing resident performance with a one-way analysis of variance, with residency year as the independent variable. Post hoc contrasts were done with the Student-Newman-Keuls test. Pass/fail data were analyzed with nonparametric tests, including χ2 and Mann-Whitney U tests.

Table I. Mean total scores, time, and pass rates for objective structured assessment of technical skills*

Procedure | Mean checklist† | Mean global† | Mean time (s) | Pass rate (%) | Fail because of time (%)
Subcuticular closure | 76 ± 16 | 67 ± 22 | 409 ± 122 | 87 | 0
Abdominal wall closure | 67 ± 20 | 65 ± 19 | 606 ± 209 | 44 | 38
Burch | 39 ± 28 | 47 ± 26 | 300 ± 78 | 37 | 0
Ligate vessels | 78 ± 14 | 73 ± 18 | 369 ± 123 | 81 | 19
Intracorporeal knot | 54 ± 24 | 56 ± 23 | 572 ± 59 | 38 | 6
Repair enterotomy | 48 ± 17 | 50 ± 17 | 332 ± 161 | 25 | 0
Laparoscopic salpingostomy | 57 ± 23 | 53 ± 22 | 554 ± 124 | 27 | 47

*Scores represent means from all 16 residents. †Data given as % ± SD.
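The internal-consistency index named above, Cronbach's α, is a simple function of the per-station score variances and the variance of each examinee's total score. The sketch below illustrates the formula only; the score matrix is invented for illustration and is not data from this study:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_examinees x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)."""
    X = np.asarray(scores, dtype=float)
    k = X.shape[1]                         # number of items (stations)
    item_vars = X.var(axis=0, ddof=1)      # sample variance of each station
    total_var = X.sum(axis=1).var(ddof=1)  # variance of examinee totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical checklist scores: 4 residents x 3 stations
scores = [[3, 4, 3],
          [5, 5, 4],
          [2, 3, 2],
          [4, 4, 5]]
print(round(cronbach_alpha(scores), 2))  # -> 0.9
```

In practice a statistics package (such as the SPSS reliability procedure the authors used) reports the same quantity; the point here is only that α rises as the stations vary together across examinees.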

Results

The surgical procedures chosen for this examination were tasks that had been previously tested on residents at the University of Washington and that the Madigan faculty knew were performed routinely by their residents. The mean total scores, times, and pass/fail data for the 7 procedures are provided in Table I. The procedures provided a range of difficulty. The subcuticular closure and laparoscopic ligation of vessels with clips had scores and pass rates that were significantly higher than those of the other procedures. The Burch procedure and repair of enterotomy had the lowest scores and pass rates. Some of the more junior residents may not have had practical experience with these 2 tasks, because the test was administered only 3 months into the academic year. A large number of residents were unable to complete the abdominal wall closure and laparoscopic salpingostomy within the 10-minute period, although most were technically able to perform the procedure.

One-way analysis of variance with the Student-Newman-Keuls post hoc test was used to evaluate construct validity, which is the ability to distinguish between residency levels. The results of the mean total checklist, global score, time, and pass/fail analyses for each residency level are shown in Table II. Data are provided separately for University of Washington and Madigan faculty. Both the total checklist and the total global score for all 7 procedures showed significant differences among the residency levels. On the global rating scale, both blinded (University of Washington) and unblinded (Madigan) examiners found significant differences between the postgraduate year (PGY) 1s, the PGY2s, the PGY3s, and the PGY4s. For the total checklists, the blinded examiners had better construct validity than the unblinded examiners. Time was not a significant discriminator among resident levels, in part because some residents did not attempt procedures they felt they could not perform. Analysis of overall pass/fail data also revealed significant differences among the residents, with PGY1 < PGY2, PGY3 < PGY4.
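The construct-validity test described above, a one-way ANOVA with residency year as the factor, reduces to the ratio of between-group to within-group mean squares. A minimal sketch, with invented percentage scores rather than the study's data:

```python
import numpy as np

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: between-group mean square
    divided by within-group mean square."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()
    # between-group SS: group sizes times squared offsets of group means
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # within-group SS: spread of scores around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical total OSATS percentages by postgraduate year
pgy_scores = [[40, 45, 43, 44],   # PGY1
              [55, 57, 52, 56],   # PGY2
              [60, 62, 58, 61],   # PGY3
              [75, 79, 77, 78]]   # PGY4
f_stat = one_way_anova_f(pgy_scores)
```

A large F (here driven by wide separation between year-group means relative to the scatter within each year) is what licenses the post hoc contrasts; in practice one would use a packaged routine such as scipy.stats.f_oneway, which also returns the P value.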

Reliability indices are shown in Table III. Internal consistency of the examination was calculated with Cronbach's α for both the blinded and the unblinded examiners. The overall reliability for the global rating scale was 0.93 for University of Washington faculty and 0.89 for Madigan faculty. The overall reliability for the checklist was 0.82 (University of Washington) and 0.74 (Madigan). For the individual tasks, the reliability indices ranged from 0.46 to 0.95. Interrater reliability between blinded and unblinded examiners was calculated with intraclass correlation. Although the unblinded examiners tended to score residents slightly higher than the blinded examiners (Table II), there was excellent interrater reliability. Overall, interrater reliability was 0.95 for the global rating scale and 0.96 for the checklist. For the individual tasks, interrater reliability ranged from 0.74 to 0.97.
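The intraclass correlation used for interrater agreement comes in several forms; the article does not specify which, so the sketch below uses one common consistency form (two-way, single-rating, often written ICC(3,1)), computed from ANOVA mean squares. The ratings are invented; a systematic offset between raters (as between the unblinded and blinded examiners here) does not lower this consistency coefficient:

```python
import numpy as np

def icc_consistency(ratings):
    """ICC(3,1): consistency of single ratings in a subjects x raters table,
    computed from two-way ANOVA mean squares (rater effect partialed out)."""
    Y = np.asarray(ratings, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    ss_rows = k * ((Y.mean(axis=1) - grand) ** 2).sum()   # subjects
    ss_cols = n * ((Y.mean(axis=0) - grand) ** 2).sum()   # raters
    ss_err = ((Y - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical global-rating scores: 5 residents scored by 2 examiners,
# the second running ~2 points higher (as the unblinded raters did)
ratings = [[20, 22],
           [28, 31],
           [15, 17],
           [33, 34],
           [25, 28]]
print(round(icc_consistency(ratings), 2))
```

Because the rater column effect is removed, two raters who rank the residents identically score close to 1.0 even when one grades more leniently, which matches the pattern the authors report.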

Evaluation of written examination scores also revealed significant differences in surgical knowledge among the residents. Mean examination scores were 25.5, 26.8, 30.8, and 34.0 for PGY1s, 2s, 3s, and 4s, respectively. The resident evaluation forms were filled out anonymously and indicated a mean overall rating of the experience of 4.3 on a scale of 1 (poor) to 5 (excellent). Fourteen residents indicated that they would like to participate in more testing sessions. Half of the residents indicated that they felt the testing was beneficial for self-evaluation, and several commented that the examination would heighten their awareness of things during surgery that they had taken for granted. What residents disliked most about the testing was the lack of feedback they received during and after the test.

The resources used to administer the examination were also documented. We had purchased a female trainer, 2 laparoscopic trainers, an abdominal wall closure model, and a bowel model for approximately $5000 from Limbs and Things (Bristol, England). These trainers had been used in previous studies of University of Washington residents and can be reused an indefinite number of times.

Table II. Construct validity: total examination scores for all 7 tasks per resident level*

Evaluation tool | PGY1 | PGY2 | PGY3 | PGY4 | P value | Post hoc test
Mean global score (blinded UW faculty) (%) | 42 | 55 | 60 | 77 | .001 | PGY1 < PGY2, PGY3 < PGY4
Mean global score (unblinded Madigan faculty) (%) | 49 | 60 | 63 | 79 | .001 | PGY1 < PGY2, PGY3 < PGY4
Mean checklist (blinded UW faculty) (%) | 44 | 57 | 63 | 79 | .001 | PGY1 < PGY2, PGY3 < PGY4
Mean checklist (unblinded Madigan faculty) (%) | 53 | 63 | 62 | 84 | .001 | PGY1, 2, 3 < PGY4
Mean time (s) | 548 | 455 | 406 | 447 | .06 |
% passing | 19 | 44 | 40 | 79 | .001 | PGY1 < PGY2, PGY3 < PGY4

*Scores represent percent of total possible points.


All of the models were transported from the University of Washington to Madigan for testing. The cost for replaceable items was approximately $1000 for all 16 residents. This included items such as pig's feet and the nonreusable components of certain models (ectopic pregnancies, dissection pad, bowel, abdominal wall, and bladder neck suspension). There was no cost to use the surgical training laboratories at Madigan. The total time to test 16 residents was 8 hours for each faculty member. One hour was spent orienting and training faculty. Approximately 8 hours were spent planning and coordinating the examination. The faculty time does not include data entry and analysis.

Comment

Surgical competence is essential for all surgeons and expected by patients. Residencies are under increasing pressure to certify that the residents they graduate are competent physicians. The challenge remains how to assess competency accurately. This is especially true for technical skills, where very little research has focused on how to teach or evaluate these types of skills.1,8 The industry that has been most successful in certifying competency of technical skills has been the airlines. Commercial pilots must undergo annual certification in simulators to show that they are competent to fly. However, surgeons do not have to obtain certification or "prove" they are competent in performing a set of standardized skills before they are allowed to operate unsupervised. Our current study investigates methods that could be used to help assess surgical competence.

The concept of objective structured assessment of technical skills was pioneered by Reznick et al from the University of Toronto.1,5-7 Their group designed an assessment of surgical skills for general surgery residents and found that this type of examination could be conducted with high reliability and good validity. In 1999, we developed a similar type of examination that could assess both laparoscopic and open abdominal skills for gynecologic procedures.3 In our studies, we had results that were very similar to those of Reznick et al. Overall reliability for our OSATS was 0.87, and we had good construct validity, with significant differences in scores between the PGY1s, the PGY2s, the PGY3s, and the PGY4s. However, no investigators had conducted these types of assessments of surgical skills in a blinded fashion. We were concerned that our close association with the residents we were grading might have unintentionally biased our results. It was for that reason, and to expand the number of residents participating in this testing, that we administered our OSATS to the obstetrics and gynecology residents at Madigan Army Medical Center in Tacoma, Wash.

At both the University of Washington and the University of Toronto, examinations have been conducted in animal (pig) and surgical models. Studies have shown that the use of bench models for objective structured assessment of technical skills is equivalent to the same type of assessment in animal models.3-7 Surgical models are significantly less expensive, and the examination is more easily administered because there is no need for an operating room or a veterinary technician. We also found in our current study that the examination was completely portable and could easily be administered at another institution.

Because we owned the surgical models, the cost for replaceable items was $62 per resident. The most difficult factor in administering OSATS was the faculty time commitment. We had 3 University of Washington faculty members and 6 Madigan Army Medical Center faculty members participate in 8 hours of testing. In addition, we spent 1 hour setting up the test, 1 hour providing a training session for the Madigan examiners, and 3 hours traveling back and forth. Time commitments for faculty could be reduced by eliminating the second examiner. For each resident, approximately 2 hours were spent performing the examination.

We found that, in addition to its feasibility, the examination had very good reliability indices when it was administered in either a blinded or an unblinded fashion. The overall reliability was 0.93 and 0.89 for the blinded and unblinded global rating scale, respectively, and 0.82 and 0.74 for the blinded and unblinded overall checklist score. Studies have shown that examinations with reliability indices above 0.80 can be used for high-stakes purposes such as certification.1 Although the unblinded examiners tended to grade the residents slightly higher than the blinded examiners, there was excellent correlation between the 2 groups with respect to overall rankings of the residents. The interrater reliability was 0.95 for the overall global and 0.96 for the total checklist scores.

This study has also shown that surgical skills assessment conducted in either a blinded or unblinded manner has significant construct validity. Even with only 4 residents per year, 3 significant groups were identified by the post hoc tests, with PGY1 < PGY2, PGY3 < PGY4. Both the checklist and the global rating scales were excellent discriminators among residency levels. Time, not surprisingly, was not a significant discriminator. In our previous studies, we found some instances in which time to complete a task was a significant discriminator, with more advanced residents able to complete tasks more quickly.4 However, it is important to point out that time to complete a task is not necessarily a good surrogate for ability.3,8-10 For example, an inexperienced resident can do a procedure quickly but completely wrong. In this study, as has been shown previously, pass/fail scoring also revealed significant construct validity. At the beginning of the PGY3 year, 40% of the tasks were passed by the residents, compared with 79% for the PGY4s. This emphasizes the importance of the last 2 years of residency training for surgical experience. We have seen similar findings among our own residents at the University of Washington.3,4

Table III. Reliability indices

Task | UW reliability | Madigan reliability | Interrater reliability
Global | 0.93 | 0.89 | 0.95
Checklist overall | 0.82 | 0.74 | 0.96
Subcuticular closure | 0.55 | 0.46 | 0.74
Abdominal wall closure | 0.75 | 0.79 | 0.86
Burch | 0.95 | 0.88 | 0.97
Ligate vessel | 0.62 | 0.49 | 0.76
Intracorporeal knot | 0.90 | 0.90 | 0.91
Repair enterotomy | 0.67 | 0.70 | 0.91
Laparoscopic salpingostomy | 0.81 | 0.80 | 0.93

In addition to helping the faculty at the University of Washington refine our surgical evaluation tool, the objective assessment of residents' surgical skills was very useful to the faculty at Madigan. Having a resident operate completely independently, without the usual verbal or physical cues that attending physicians often unconsciously provide residents, is very revealing. It allows faculty members as well as the residents to see whether they can do a basic procedure on their own. This type of testing is also useful because the tasks are standardized and every resident is doing the identical procedure. This allows faculty members to accurately assess whether a resident is falling behind his/her peers in technical skills. This type of testing can also expose weaknesses in the surgical curriculum of the residency. As a result of this testing, the Madigan faculty have made changes to the surgical curriculum, and this has heightened faculty awareness about teaching surgical skills.

Although the results of this pilot project are encouraging, additional research is still needed to verify the results. During the next 2 years, we will be conducting extensive testing in 4 additional residency programs. In each institution, we will have blinded and unblinded examiners, and we will attempt to test approximately 100 residents. If additional research confirms that OSATS is a valid and reliable method to assess surgical skills, there are many potential uses. Residents could be tested at the end of each year to ascertain whether the residency programs are providing appropriate surgical education. Residents who fall behind could be identified early for additional instruction and practice. Testing can provide residents with the self-confidence that they can do a procedure without input from a supervising physician. Finally, this type of testing could allow surgical educators to be confident that we have trained competent surgeons.

REFERENCES

1. Reznick RK. Teaching and testing technical skills. Am J Surg 1993;165:358-61.
2. Mandel LS, Lentz GM, Goff BA. Teaching and evaluating surgical skills. Obstet Gynecol 2000;95:783-5.
3. Goff BA, Lentz GM, Lee D, Houmard B, Mandel LS. Development of an objective structured assessment of technical skills for obstetric and gynecology residents. Obstet Gynecol 2000;96:146-50.
4. Goff BA, Lentz GM, Lee D, Fenner D, Morris J, Mandel LS. Development of a bench station objective structured assessment of technical skills. Obstet Gynecol 2001;98:412-6.
5. Reznick R, Regehr G, MacRae H, Martin J, McCulloch W. Testing technical skill via an innovative "bench station" examination. Am J Surg 1997;173:226-30.
6. Martin JA, Regehr G, Reznick R, MacRae H, Murnaghan J, Hutchison C, Brown M. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg 1997;84:273-8.
7. Winckel CP, Reznick RK, Cohen R, Taylor B. Reliability and construct validity of a structured technical skills assessment form. Am J Surg 1994;167:423-7.
8. Goff BA, Lentz GM, Lee DM, Mandel LS. Formal teaching of surgical skills in an obstetric-gynecology residency. Obstet Gynecol 1999;93:785-90.
9. Chung JY, Sackier JM. A method of objectively evaluating improvements in laparoscopic skills. Surg Endosc 1998;12:1111-6.
10. Lentz GM, Mandel LS, Lee D, Melville J, Gardella C, Goff BA. Testing surgical skills of obstetric and gynecology residents in a bench lab setting: validity and interrater reliability. Am J Obstet Gynecol 2001;184:1462-8.