
Medical Education 1990, 24, 376-381

Training psychiatrists and family doctors in evaluating interpersonal skills

J. E. DES MARCHAIS, P. JEAN† & L. G. CASTONGUAY‡

Université de Sherbrooke, Quebec, †The McLaughlin Centre for Evaluation of Clinical Competence, Ottawa and ‡Department of Psychology, State University of New York at Stony Brook

Summary. Ten psychiatrists and 15 family doctors were asked to score videotapes of patient-doctor encounters before and after each of two training periods. One period focused on the theory of assessment of doctors’ interpersonal skills, while the other was purely practical. Results indicate that after one training session in either theory or practice, both groups of doctors achieved a significantly higher interrater reliability. The second session, which crossed over theory and practice of assessment, did not increase the improvement in interrater agreement achieved by the first training period. Although both groups of doctors showed a significant increase in interrater agreement, psychiatrists exhibited greater reliability scores than family doctors before the experiment as well as after the second training session. These results were discussed in terms of their implications for future research on the doctor-patient relationship.

Key words: psychiatry/*educ; physicians, family/*educ; *physician-patient relations; *clinical competence; education, medical, continuing; Quebec

Correspondence: Professor Jacques E. Des Marchais, Cabinet du vice-doyen aux études, Faculté de médecine, Université de Sherbrooke, Centre hospitalier universitaire de Sherbrooke, Sherbrooke, Québec, Canada J1H 5N4.

Introduction

The doctor-patient relationship is gaining recognition as an important feature of medical practice. Comstock et al. (1982) have shown that the level of patient satisfaction depends on doctors’ attitude and on the amount of information that they communicate to the patient. According to Gray (1982), the majority of patient complaints about doctor behaviour deal with interpersonal skills (e.g. listening, understanding, communicating). Most medical schools agree that medical knowledge and technical skills alone are not sufficient for proper student training (Rezler 1974; Anon. 1980). Indeed, future doctors are expected to master the skills necessary to establish and maintain a helpful, empathic, respectful relationship with their patients. It is therefore important that whoever teaches these skills should also be able to assess them reliably. Although a variety of methods has been developed to evaluate interpersonal skills, teachers and researchers are still confronted with a problem of interrater reliability (Wilson et al. 1969; Ludbrook & Marshall 1971).

Reliable measurement of doctors’ interpersonal skills depends on many factors (Bujold et al. 1982). Few studies have investigated the impact of the competence or expertise of the judges (or observers) on reliability. Their results appear conflicting. For instance, it is not clear that specific training in the assessment of the doctor-patient relationship has a positive effect on interrater agreement. Some studies suggest that this type of training does not significantly decrease the variation between judges (Keck & Arnold 1979; Liston et al. 1981). According to Newble et al. (1980), such training is useless for the good judges and inefficient for the bad ones. At best, it allowed the elimination of the latter. Several other studies, however, have reported that specific training prior to the assessment improves interrater agreement (Barbee et al. 1967; Dancer et al. 1978; Bird & Lindley 1979).

Professional training and daily activities are other factors that could also influence the judge’s competence in the assessment of interpersonal skills. To the authors’ knowledge, however, no attempt has yet been made to assess the role of this potentially important variable.

The present research was designed first to verify whether specific training in the assessment of interpersonal skills could lead to greater agreement among experienced medical teachers. A second objective was to determine whether a teacher’s own specialty would influence interjudge reliability. Two types of training were assessed: theoretical and practical. Of particular interest was the measurement of the separate and combined effect of these two types of training as well as the comparison of their relative efficacy. Two types of clinicians, psychiatrists and family doctors, were also compared. Assuming that psychiatrists receive more extensive education in the establishment and evaluation of therapeutic relationships, it appeared relevant to determine if they would show higher pretest agreement and if training would affect the potential difference between psychiatrists and family doctors.

Three null hypotheses were then formulated, taking into account the three independent variables under study: (1) there is no difference in interrater reliability between psychiatrists and family doctors; (2) there is no difference in interrater reliability after one period of either practical or theoretical training; and (3) there is no difference in interrater reliability after one or two training periods, irrespective of the order of theory and practice.

Materials and methods

Subjects and sequence assignment

Ten psychiatrists and 15 family doctors participated in the study. All 25 subjects were medical teachers affiliated with the Université de Montréal and had experience in clinical teaching. The subjects were randomly assigned in a two-group crossover design. One group of 13 subjects (five psychiatrists and eight family doctors) received theoretical training in the evaluation of interpersonal skills first, followed by practical training. The second group, made up of 12 subjects (five psychiatrists and seven family practitioners) received the same training, but in reverse order. On three occasions, before training (time 0) and after the first (time +1) and second (time +2) training sessions, each group of subjects had to score doctors’ behaviour as observed on videotaped clinical encounters, using a 5-grade scale (see Appendix).

Instruments

Videotapes. Before and after the two training periods, subjects were asked to score three videotapes of 10-minute clinical encounters in which junior or senior clerks were interviewing and examining patients in the usual fashion, at a hospital ambulatory clinic. All patients consented to participate in the study. To be eligible for the study, their presenting problem had to be frequent, moderately ambiguous (not too easy to diagnose), and of limited complexity. The problems encountered fell into the categories of orthopaedic surgery and general medicine (more specifically headaches). The three videotapes used during the study were selected from a sample of 12 recorded interviews. The interviews retained were judged by two experimenters to be of equivalent difficulty in terms of student interpersonal skills. In other words, exceptionally good or bad clinical performances were eliminated since they could have artificially inflated the interrater agreement. The length of all the videotapes was reduced to 10 minutes. Care was taken, however, to retain the important elements of the doctor-patient relationship to be assessed by the judges.

Rating scale. A rating scale for assessing interpersonal skills (see Appendix) was constructed using a modified Flanagan method (Flanagan 1954), in which each item of the scale refers to an observable doctor behaviour. Items were included and ordered so that the scale sequentially covered the usual steps of a clinical interview. Critical incidents in an optimal doctor-patient relationship were collected from 20 people such as nurses, receptionists, and patients who observe doctors’ interpersonal skills on a continuing basis. An initial list of 101 descriptive behaviours was reduced to 20 items by a research team composed of clinicians, psychologists, psychiatrists, and medical educators. Items selected were thought to reflect four major dimensions of adequate interpersonal skills: politeness, empathy, communication, and quality of the information given to the patient.

Conditions of the experimental training

The theoretical training session was conducted as an interactive lecture by a psychologist and a psychiatrist with clinical teaching experience. It was based on a 20-page booklet describing the four dimensions of a good doctor-patient relationship upon which the rating scale was constructed (see Appendix). Each behaviour was discussed and illustrated by one adequate and one unsatisfactory example. Participants were asked to share their own experience with those behaviours. Thus the theoretical training aimed at a good understanding of the rating scale, to be used in the assessment of interpersonal skills.

The practical training session was given by two senior members of the teaching staff with vast experience in teacher training development. The session began with the videotape scored by all subjects before the experiment. Participants were asked to rescore the encounter. In order to reach a consensus, individual extreme ratings were discussed. Each behaviour of the rating scale was therefore discussed before participants decided on a final common rating. Thus the practical training session aimed at an adequate use of the rating scale.

Index of interrater reliability

Scores given by each judge on all 20 items describing the four dimensions were summated. The deviation of each judge’s score from the median of his group was computed to measure interrater reliability. The average deviation score (ADS) within a group became its interrater reliability index.
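The index described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names and the sample totals are hypothetical, and the use of the absolute deviation is an assumption, since the text does not say whether signed or absolute deviations were averaged.

```python
from statistics import mean, median


def total_score(item_ratings):
    """Sum one judge's ratings over the 20 items of the scale (each rated 1 to 5)."""
    return sum(item_ratings)


def average_deviation_score(totals):
    """Average deviation of the judges' total scores from their group median.

    Assumption: deviations are taken as absolute values; the source text
    only says "deviation from the median of his group".
    """
    group_median = median(totals)
    return mean(abs(t - group_median) for t in totals)


# Hypothetical totals for five judges scoring the same videotape.
example_totals = [70, 72, 75, 68, 80]
ads = average_deviation_score(example_totals)  # median is 72, so ADS = 17 / 5 = 3.4
```

A lower value means the judges' totals cluster more tightly around the group median, which is how the ADS is read throughout the Results.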

Statistical analysis

Considering the limited number of subjects, a two-way analysis of variance was made. The dependent variable consisted of the scoring of three videotapes at each test time. Deviation of each subject from the median of his group was calculated. Two sets of scores were obtained: one score for each item and a total score for all the items.

Results

Effect of professional background on interrater reliability

The average deviation score before training (time 0) and after two training sessions (time +2) for family doctors and psychiatrists is shown in Fig. 1. Family doctors had an ADS of 3.1 before training and 2.1 after, while psychiatrists scored 2.7 and 1.8 respectively. When considered as a single group, deviation scores were 3.0 at time 0 and 2.0 at time +2.

These results suggest that although the two groups of professionals showed a significant increase in interrater reliability, psychiatrists had greater reliability scores than family doctors before as well as after two training sessions (the lower the score, the higher the interrater reliability).

The results of the analysis of variance indicate a significant effect for the professional group (P = 0.056) and a significant effect for the total assessment period (P = 0.000). There was no interaction between the two.

Effect of training sequence on interrater reliability

As shown in Fig. 2, family doctors and psychiatrists, when considered as a single group, recorded an average deviation score of 3.0 from the median before any training, at scoring time 0. After a first training session, at scoring time +1, both groups improved significantly to respective ADSs of 2.1 (theoretical training) and 1.8 (practical training). After a second training session, at scoring time +2, the first group (receiving theory before practice) achieved further improvement with an ADS of 1.9, while the second group (receiving practice before theory) scored 2.1. Combining the two groups yields an average individual deviation from the median of 2.8 before the experiment, 1.9 after one training session, and 2.0 after two training sessions, at scoring time +2.
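As a rough arithmetic check on the combined figures, the sketch below pools the two sequence groups by a size-weighted mean. This is an assumption: the paper does not state how the combined values were computed, and the helper name `pooled_ads` is hypothetical. The group sizes (13 and 12) and group ADSs are those reported in the text.

```python
def pooled_ads(group_ads, group_sizes):
    """Size-weighted mean of per-group ADS values (assumed pooling rule)."""
    total_subjects = sum(group_sizes)
    weighted_sum = sum(a * n for a, n in zip(group_ads, group_sizes))
    return weighted_sum / total_subjects


# Group 1: theory first (13 subjects); group 2: practice first (12 subjects).
sizes = [13, 12]
after_one_session = pooled_ads([2.1, 1.8], sizes)   # roughly 1.96
after_two_sessions = pooled_ads([1.9, 2.1], sizes)  # roughly 2.00
```

Under this assumption the pooled values fall close to the reported 1.9 and 2.0; the small gap after one session is within what rounding of the published group values could produce.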

Figure 1. Effect of professional background on interrater reliability in assessing doctors' interpersonal skills. Vertical axis: ADS; horizontal axis: before and after training; series: family doctors and psychiatrists. (ADS: Average Deviation Score from the median of the group. The lower the ADS, the greater the interrater reliability.)

The analysis of variance indicates no significant effect of treatment in the first (time +1) and second (time +2) training sessions. However, a significant difference was noted over the total assessment period (P = 0.000). The interaction of these two variables is not significant.

To determine the source of the significant difference over the total assessment period, we conducted a multiple comparison of means (with the Scheffé method), which revealed a significant difference between scoring time 0 and scoring time +1 (P = 0.000) and between scoring time 0 and scoring time +2, but no difference between scoring times +1 and +2. These results indicate that both training sequences (theory/practice and practice/theory) lead to an increase in interrater reliability, but that neither sequence is superior. Moreover, although the first training session improved interrater agreement, there was no difference between the two types of training (theoretical and practical). Furthermore, the addition of a second training session does not lead to a significant rise in reliability.

Discussion

This study suggests that a short specific training can increase interrater agreement in the assessment of the doctor-patient relationship. Since psychiatrists exhibited greater agreement than family doctors before the experiment and after two training sessions, the first null hypothesis (professional background does not make any difference) has to be rejected. However, one can accept the second null hypothesis (there is no difference after one period of either practical or theoretical training) because a comparable reliability index is seen after a first training, either practical or theoretical. Finally the third null hypothesis (there is no difference in interrater reliability after one or two training periods, irrespective of the order of theory and practice) cannot be rejected, since the two strategies appeared equally efficient.

Perhaps the most significant finding in this study is that, although one training session (theoretical or practical) led to higher interrater reliability, a second one does not further increase the level of agreement. One plausible explanation is similarity of training: since both types of training were centred around learning the coding scale, the same basic information may have been presented in a different form. This tentative explanation could also account for the fact that no significant difference was obtained between the two types of training in the first test, as well as between the two sequences of training.

It is also possible, however, that the second training session was too short to lead to higher reliability. Longer training may be necessary in order to improve on the progress achieved in the mastery of the scale after the first training session. A second training session of more than 2 hours may also lead to different types of learning, especially if a different training method is used. It may well be that when it is sufficiently exhaustive, theoretical/practical training can provide a unique perspective on a complex phenomenon such as the doctor-patient relationship. If this is the case, our first treatment period of 2 hours might have been too short to show any potential difference between the two training sessions. More importantly, the second treatment period, also 2 hours long, might not have been long enough to foster the potential complementary synergistic effect of the two types of training.

Figure 2. Effect of training format on interrater reliability in assessing doctors' interpersonal skills. Vertical axis: ADS; horizontal axis: training sessions (0, +1, +2); series: theory before practice, practice before theory, total group. (ADS: Average Deviation Score from the median of the group. The lower the ADS, the greater the interrater reliability.)

The hypothesis that a longer second training session may lead to a significant increase in reliability compared to the first training session seems to be indirectly supported by the difference between the two groups of professionals. The types of specialists in this study were chosen because their professional training put a special emphasis on the acquisition of interpersonal skills. However, the psychiatrists had generally received more extensive (theoretical and practical) training about the doctor-patient relationship than the family doctors. This might explain why the psychiatrists demonstrated better reliability than the family doctors at the pretest. The fact that this difference was maintained in the second test may also indicate that family doctors require much longer training before being as reliable as psychiatrists.

More research is clearly needed to determine whether a longer and/or different type of training is beneficial in assessing the doctor-patient relationship. Furthermore, although our findings suggest that evaluation of interpersonal skills can be made more reliable in a short period of time, differences in doctors and training approaches give rise to many unanswered questions. For instance, it would be important to know if the increase in interrater reliability is durable. Hence, future research should include follow-up data. More important, research should be conducted to determine the possible learning effect of assessing more than one set of videotapes. In order to investigate this potential artifact in a study such as this one, a control group (including psychiatrists and family doctors) would have to score the same set of clinical encounters without any training and in the same experimental conditions.

Acknowledgements

We acknowledge the participation of colleagues Dr François Borgeat, Dr Pierre Delorme and Mrs Hélène David for their collaboration with the research team, M. Roch Roy from le Département d’information et recherche opérationnelle, Université de Montréal, Ms Michelle Newman for her editorial support and Dr Hugh Scott for his contribution to the translation of the scale.

This research project was supported by Le Fonds d’éducation médicale, Unité de Recherche et de Développement en Éducation Médicale, Faculté de Médecine, Université de Montréal, Quebec, Canada.

References

Anon. (1980) R. S. McLaughlin. Requests for research proposals. Forum XW5: 10.

Barbee R. A., Feldman S. & Chosy L. W. (1967) The quantitative evaluation of student performance in the medical interview. Journal of Medical Education 42, 238-43.

Bird J. & Lindley P. (1979) Interviewing skill: the effect of ultra-brief training for general practitioners. A preliminary report. Medical Education 13, 349-55.

Bujold N., Des Marchais J., Dufour H., Ferland J. & Gagnon S. (1982) Problématique de la mesure des attitudes en médecine. Revue de l’éducation médicale 2, 10-16; 3, 17-21.

Comstock L. R., Hooper E. M., Goodwin J. M. & Goodwin J. S. (1982) Physician behaviors that correlate with patient satisfaction. Journal of Medical Education 57, 105-12.

Dancer D. D., Braukman C. J., Schumaker J. B., Kirigin K. A., Willner A. G. & Wolf M. M. (1978) The training and validation of behavior observation and description skills. Behavior Modification 2, 113-33.

Flanagan J. C. (1954) The critical incident technique. Psychological Bulletin 51, 327-58.

Gray C. (1982) Beware the malpractice minefield. Canadian Medical Association Journal 127, 243-45.

Keck J. W. & Arnold L. (1979) Development and validation of an instrument to assess the clinical performance of medical residents. Educational and Psychological Measurements 39, 903-8.

Liston E. H., Yager J. & Strauss G. D. (1981) Assessment of psychotherapy skills: the problem of interrater agreement. American Journal of Psychiatry 138, 1069-74.

Ludbrook J. & Marshall V. R. (1971) Examiner training for clinical examinations. British Journal of Medical Education 5, 152-5.

Newble D. I., Hoare J. & Sheldrake P. F. (1980) The selection and training of examiners for clinical examinations. Medical Education 14, 345-9.

Rezler A. G. (1974) Attitude changes during medical school: a review of the literature. Journal of Medical Education 49, 1022-30.

Wilson G. M., Lewer R., Harden K. McG., Robertson J. I. S. & MacRitchie J. (1969) Examination of clinical examiners. Lancet i, 37-40.

Appendix: Rating scale of doctor-patient relationship

According to the following scale, please judge the degree of behavioural realization.

1 - absolutely not; 2 - a little; 3 - more or less; 4 - sufficiently; 5 - absolutely

(1) He fetches the patient in a personal manner.
(2) He allows the expression of the motives for consultation.
(3) He uses judiciously open questions.
(4) He listens attentively while the patient speaks.
(5) He refrains from doing other things during the interview.
(6) He rephrases the patient’s description of his problems.
(7) He leads the interview towards what he deems pertinent points.
(8) He puts his patient at ease.
(9) He does not encourage the expression of pent-up emotions.
(10) He invites the patient to communicate his reactions and solicits questions.
(11) He clarifies and synthesizes patient’s deliberations.
(12) He modifies the patient’s problem.
(13) He respectfully requests the patient’s submission to examination.
(14) He examines the patient with tact.
(15) He explains clearly his understanding of the problem presented.
(16) He adequately answers all questions.
(17) His recommendations and their justification are expressed in a confused manner.
(18) He assumes a moralistic attitude.
(19) He concludes the consultation with a mutual understanding on the nature of and management of the problem.
(20) Globally, I find the consultation good.

Received 17 July 1989; editorial comments to authors 10 October 1989; accepted for publication 9 January 1990