

Hidden Markov Model for Quantifying Clinician Expertise in Flexible Instrument Manipulation

Jagadeesan Jayender 1,2, Raúl San José Estépar 1, Keith Obstein 3, Vaibhav Patil 1,2, Christopher C. Thompson 3,*, and Kirby G. Vosburgh 1,2

1 Department of Radiology, Harvard Medical School, Brigham and Women’s Hospital, Boston, MA, USA
[email protected], [email protected], [email protected], [email protected]

2 CIMIT Image Guidance Laboratory, Boston, MA, USA

3 Division of Gastroenterology, Brigham and Women’s Hospital, Boston, Massachusetts, USA
[email protected], [email protected]

Abstract. Clinicians are trained to manipulate a colonoscope while minimizing the force exerted on the colon walls to reduce the danger of luminal perforation and discomfort to the patient. Here, we classify the expertise of the clinician performing colonoscopy using a Hidden Markov Model. Seven models are trained corresponding to the performance of the expert in the entire colon, ascending, transverse and descending colon and three gestures corresponding to roll and two angulations of the distal end of the scope. Experimental results in a colon model (CM-1, Olympus, Tokyo, Japan) are shown to compare the performance of the four groups of users: first year, second year and third year GI residents and expert physicians.

1 Introduction

The performance of gastroenterologists, surgeons, and related practitioners has historically been assessed subjectively by senior physicians in both training and operating environments. In recent years concern regarding poor surgical dexterity [1] and the broader use of minimally invasive interventions has promoted efforts to better characterize operator performance [2] and improve the effectiveness and efficacy of training [3]. Analytical approaches, such as task partitioning [4], kinematics analysis [5], off-line “data mining” to train hidden Markov models [6] or support vector machines [7], and many others have been developed. These techniques may now be used to measure the potential value of new interventional systems, for example determining the value of augmented reality displays to assist an endoscopist or surgeon in performing procedures [ref withheld] and to elucidate features which are most “helpful” to guide further development.

* This work has been funded by NIH/NCI under award 2 R42 CA115112-02A2 and the Center for Integration of Medicine and Innovative Technology (CIMIT), Boston, MA.

H. Liao et al. (Eds.): MIAR 2010, LNCS 6326, pp. 363–372, 2010. © Springer-Verlag Berlin Heidelberg 2010


Here, we consider colonoscopy, wherein a highly flexible endoscope (colonoscope) is inserted into the colon. The colon is elastic and deforms under the force applied by the colonoscope. Our initial study suggests that using just the location and kinematics of the distal tip of the colonoscope is not sufficient to classify the operator’s performance. In addition, it was also observed that users had significantly different performance in each section of the colon. Typically, it is most difficult to maneuver the scope in the transverse and ascending colon region since the length of the colonoscope inserted is large, resulting in greater flexing of the scope. Kinematics-based metrics do not identify these differences or highlight the gestures required to manipulate the scope within the colon. Here we develop and evaluate a probabilistic approach based on the Hidden Markov Model (HMM) to classify the operator performance. Also, we establish a model of expert performance to analyze and classify the ability of colonoscopy trainees. Other investigators [6], [8], [9], [10] have used HMM techniques to analyze operator performance. We build on these studies to characterize colonoscopy, including the use of flexible instruments, identifying the operator’s performance in each segment of the colon, and specifying the gestures for performing colonoscopy at the expert level. Characterizing the expertise of a user would be useful in developing curricula and simulators to train operators to smoothly guide the colonoscope with minimal discomfort to the patient.

2 Experimental Setup

The experimental setup (Figure 1) consists of a colon model (CM-1, Olympus, Tokyo, Japan), which closely mimics the human colon and includes the ascending, descending and transverse colon. The model is loosely tethered to the back support, allowing it to flex and stretch, as observed in an actual procedure. The model is draped with a cloth to prevent the user from observing the location of the scope inside the model. A pediatric colonoscope (PCF-Q180AL, Olympus, Tokyo, Japan) is equipped with four electromagnetic 6-DOF position sensors (“Microbird” sensors from Ascension Technologies Corp. (ATC), Burlington, VT). The sensors are placed at 0 cm, 10 cm, 30 cm and 55 cm from the distal end. Sensor 1 and sensor 2 are placed to record the angulation of the distal end of the scope in 2-DOF about the y and z axes. The positions of sensors 3 and 4 are chosen such that these sensors are approximately in the recto-sigmoid junction when the distal end of the scope is in the transverse colon region, thereby permitting the detection of flexing and looping of the scope. The ATC electromagnetic system is connected to an Intel Quad Core 2 GHz computer with 4 GB RAM. The position readings are logged at a sampling rate of 67 Hz using MATLAB Simulink.

Four attending endoscopists who have performed more than 2000 colonoscopies (“Experts”) and 9 gastroenterology fellows (3 first-year, 3 second-year, 3 third-year) who have performed fewer than 500 colonoscopies were selected to perform a colonoscopy. Kinematics data consisting of the position and orientation of the four sensors, and time were recorded from the instant of insertion of the scope into the anus to the instant when the terminal ileum was intubated. The trajectories recorded for two expert clinicians were selected to train the expert HMM model. In addition, an experienced resident with considerable training on the colon simulator was chosen to perform 5 insertions and retractions of the colonoscope into each segment of the model. During this experiment, the distal tip of the colonoscope in the colon model was tracked visually during the insertion. These trajectories were used to train three models corresponding to the motion of the scope in the ascending, descending and transverse colon. The “roll” and the “angulation” of the colonoscope in 2-DOF were computed from the measurements of the distal two sensors: these were used as features to be recognized by the HMM.

Fig. 1. (a) Colon model and colonoscope showing the positions of sensors 1, 2 and 3. Sensor 4 is out of the field. (b) Inner view of the model showing a realistic modeling of the human colon.

3 Hidden Markov Model

Our HMM analysis of colonoscopy is based on the approach and notation of Rabiner [11]. HMM analysis to quantify surgical expertise is suitable when the measurements obtained from the sensors on the colonoscope are statistically correlated to the measurements obtained from other subjects with a similar level of expertise. The parameters of the HMM model are defined as follows:

– The HMM is assumed to have N states. The transition probability from state i to state j is given by

$$a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \qquad A = \{a_{ij}\} \tag{1}$$

– Each state also has M possible observation symbols $O_t$. The probability of observing a particular symbol $O_t$ in state j is

$$b_j = P(O_t \mid q_t = S_j), \qquad B = \{b_j\} \tag{2}$$

– Also, a state prior $\pi_i$ is defined, which is the probability of beginning in state $S_i$.

In short, the HMM can be represented as λ = (A, B, π). Therefore, to completely define the HMM, we must specify the number of states N, the number of observation symbols M per state, and the probability measure λ. In our case, N = 8 and M = 16 provided the best results. A larger value of N or M led to greater complexity of the model and insufficient training, while a smaller value of N and M led to an overly simple model that did not capture the variations in the surgical gestures. The model λ is trained on the trajectories obtained from the expert clinicians.
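The structure λ = (A, B, π) can be made concrete with a minimal sketch. It assumes a random row-stochastic initialization, which the paper does not specify; the function name `random_hmm` and the seed are illustrative only. The values N = 8 and M = 16 are the ones reported above.

```python
import numpy as np

def random_hmm(n_states=8, n_symbols=16, seed=0):
    """Return a randomly initialised discrete HMM lambda = (A, B, pi).

    The random initialisation is an assumption made for illustration.
    """
    rng = np.random.default_rng(seed)
    # A[i, j] = P(q_{t+1} = S_j | q_t = S_i); every row sums to 1.
    A = rng.random((n_states, n_states))
    A /= A.sum(axis=1, keepdims=True)
    # B[j, k] = P(symbol k | q_t = S_j); every row sums to 1.
    B = rng.random((n_states, n_symbols))
    B /= B.sum(axis=1, keepdims=True)
    # pi[i] = probability of starting in state S_i.
    pi = rng.random(n_states)
    pi /= pi.sum()
    return A, B, pi

A, B, pi = random_hmm()
print(A.shape, B.shape, pi.shape)
```

With these sizes the model has 8 × 8 transition probabilities, 8 × 16 emission probabilities and 8 priors, so two expert trajectories must supply enough symbols to estimate all of them; this is the trade-off between model complexity and training data described above.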

Short-time Fourier Transform

The observation sequence is created from the trajectory of the four sensors. First, the time-domain trajectories are converted into the corresponding Fourier transforms to extract the important information from the trajectories. The Fourier transform is invariant to rotation of the trajectory, preserves the information in the signals and can be computed efficiently. However, the Fourier transform lacks temporal localization of the frequencies. Therefore, we use the Short-Time Fourier Transform (STFT) in short time periods and obtain a feature vector corresponding to each time window [6], [12]. The STFT is computed as

$$\mathrm{STFT}_x^{\gamma}(t', f) = \int_{\tau} \left[ x(\tau)\, \gamma(\tau - t') \right] e^{-j 2 \pi f \tau}\, d\tau \tag{3}$$

where x(τ) is the trajectory, γ(τ − t′) is the sampling window of the trajectory shifted to time t′, and f is the frequency. The Fourier transform in each sampling window is computed by the Fast Fourier Transform (FFT) algorithm. Information loss is minimized by overlapping the STFT windows.

The trajectories to be recognized by the HMM are the positions of the four sensors in 3-DOF Cartesian space. In each sampling window, the STFT for a single DOF trajectory of a sensor consists of the magnitude of N discrete frequency contributions. Therefore, the entire feature space corresponding to the four sensors in 3-DOF is a 12N-tuple. In addition, we have also independently trained three HMMs corresponding to the roll of the colonoscope and 2 angulations of the distal end of the colonoscope. For training each of these HMMs, the feature map is an N-tuple.
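The feature extraction described above can be sketched as follows. The window length, hop size, Hann window and number of retained frequency bins are assumptions chosen for illustration, since the paper does not report them; `stft_features` and `trajectory_features` are hypothetical helper names, and the sine waves are placeholders for the recorded sensor trajectories.

```python
import numpy as np

def stft_features(x, win_len=64, hop=32, n_freq=8):
    """Magnitudes of the first n_freq FFT bins in overlapping windows.

    win_len, hop, n_freq and the Hann window are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)  # hop < win_len gives overlap
    frames = np.stack([x[s:s + win_len] * window for s in starts])
    spectrum = np.fft.rfft(frames, axis=1)        # FFT of each windowed frame
    return np.abs(spectrum[:, :n_freq])

def trajectory_features(channels, **kwargs):
    """Stack per-channel magnitudes into one row per time window."""
    return np.hstack([stft_features(c, **kwargs) for c in channels])

fs = 67.0                          # sampling rate reported in Section 2
t = np.arange(670) / fs            # 10 s of placeholder data
# 4 sensors x 3 Cartesian axes = 12 channels (synthetic stand-ins).
channels = [np.sin(2 * np.pi * 0.5 * t + k) for k in range(12)]
features = trajectory_features(channels)
print(features.shape)              # (number of windows, 12 * n_freq)
```

Each row of `features` is one 12N-tuple in the sense used above. Passing a single roll or angulation channel instead of twelve position channels yields the N-tuple features used for the gesture models.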

Vector Quantization

Since the HMM structure considered in this paper is discrete, we convert the 12N- or N-tuple vector into a single discrete observation symbol using the k-means clustering algorithm [13]. Consider that there are p 12N-tuples over the entire duration of the trajectory for training the expert model. The k-means algorithm partitions the p vectors into L sets so as to minimize the within-cluster sum of squares. Here L is the size of the codebook and has been chosen as 16. The discrete observation symbol is the index of the codebook vector closest to the given 12N-tuple vector, i.e., the cluster to which the vector belongs.
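A minimal sketch of this quantization step is given below. It uses plain Lloyd iterations rather than the global k-means variant of [13], and the random feature vectors, the iteration count and the names `quantize` and `kmeans_codebook` are assumptions made for illustration. The codebook size L = 16 is the value stated above.

```python
import numpy as np

def quantize(vectors, codebook):
    """Index of the nearest codebook vector (squared Euclidean distance)."""
    vectors = np.asarray(vectors, dtype=float)
    d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_codebook(vectors, n_codes=16, n_iter=50, seed=0):
    """Plain Lloyd iterations; a stand-in for the k-means variant in [13]."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    # Initialise the codebook with n_codes distinct training vectors.
    codebook = vectors[rng.choice(len(vectors), n_codes, replace=False)].copy()
    for _ in range(n_iter):
        labels = quantize(vectors, codebook)
        for k in range(n_codes):
            members = vectors[labels == k]
            if len(members):                 # leave empty clusters unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

rng = np.random.default_rng(1)
feature_vectors = rng.normal(size=(200, 96))   # placeholder 12N-tuples
codebook = kmeans_codebook(feature_vectors)
symbols = quantize(feature_vectors, codebook)  # discrete observation sequence
print(symbols[:10])
```

The codebook would be built once from the expert training data and then reused, unchanged, to quantize the trajectories of every other user, so that all observation sequences share the same symbol alphabet.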

HMM training

Having generated the discrete observation symbols corresponding to the trajectory of the expert clinicians, the observation sequence is provided to the HMM network to obtain the updated model λ. The parameters of the model are estimated by maximizing, over the re-estimated model $\bar{\lambda}$, the auxiliary function

$$Q(\lambda, \bar{\lambda}) = \sum_{Q} P(Q \mid O, \lambda) \log\left[ P(O, Q \mid \bar{\lambda}) \right] \tag{4}$$

This optimization problem is solved iteratively by the Baum-Welch method [11]. Seven different HMM models were trained corresponding to the Expert, Ascending Colon, Transverse Colon, Descending Colon, Roll, Angulation about y-axis and Angulation about z-axis. The Expert model was trained on the trajectories of two expert clinicians, while the Ascending Colon, Transverse Colon and Descending Colon HMM models were trained on the trajectories executed by a highly experienced third year resident with several hours of practice on the colon model. The “Roll” and two “Angulation” models have been trained on the roll and angulation trajectories computed from the orientation trajectories of the distal sensors recorded for the clinician.
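The Baum-Welch re-estimation can be sketched for a single observation sequence as follows, using the scaled forward and backward variables of Rabiner [11]. The random initialization and the random symbol sequence are placeholders, not the authors' data, and the loop count of 18 simply echoes the number of iterations reported in Section 4.

```python
import numpy as np

def forward_backward(obs, A, B, pi):
    """Scaled forward and backward variables for one symbol sequence."""
    T, N = len(obs), len(pi)
    alpha, beta, scale = np.zeros((T, N)), np.ones((T, N)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1]) / scale[t + 1]
    return alpha, beta, scale

def baum_welch_step(obs, A, B, pi):
    """One re-estimation step; returns (A, B, pi, log P(O | old model))."""
    obs = np.asarray(obs)
    alpha, beta, scale = forward_backward(obs, A, B, pi)
    gamma = alpha * beta  # P(q_t = S_i | O, lambda); each row sums to 1
    # xi[t, i, j] = P(q_t = S_i, q_{t+1} = S_j | O, lambda)
    emit_next = B[:, obs[1:]].T * beta[1:] / scale[1:, None]
    xi = alpha[:-1, :, None] * A[None, :, :] * emit_next[:, None, :]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack(
        [gamma[obs == k].sum(axis=0) for k in range(B.shape[1])], axis=1
    )
    new_B /= gamma.sum(axis=0)[:, None]
    return new_A, new_B, gamma[0], np.log(scale).sum()

rng = np.random.default_rng(0)
N, M = 8, 16                                  # values reported in Section 3
A = rng.random((N, N))
A /= A.sum(axis=1, keepdims=True)
B = rng.random((N, M))
B /= B.sum(axis=1, keepdims=True)
pi = np.full(N, 1.0 / N)
obs = rng.integers(0, M, size=300)            # placeholder symbol sequence

history = []
for _ in range(18):                           # illustrative iteration count
    A, B, pi, loglik = baum_welch_step(obs, A, B, pi)
    history.append(loglik)
print(history[0], history[-1])
```

Each step cannot decrease log P(O | λ), so `history` traces a non-decreasing curve of the same kind as the training log-likelihood plot in Figure 3 (b). Training on two expert trajectories would require accumulating `gamma` and `xi` over both sequences before normalizing, which this single-sequence sketch omits.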

HMM Prediction

Once the HMMs have been trained, the next step is to measure whether the HMM classifies the expertise of the operators. That is, we evaluate the likelihood that a particular HMM describes the observation sequence. Input data includes the position trajectories of the four sensors, and the roll and 2-DOF angulation of the distal end of the scope. The probability of the observation sequence given the HMM model is computed inductively using the forward-backward algorithm:

$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i) \tag{5}$$

where $\alpha_T(i) = P(O_1 O_2 \ldots O_T, q_T = S_i \mid \lambda)$ is the forward variable. The reader is referred to [11] for further details.
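Equation (5) can be evaluated with the forward recursion, sketched here with per-step scaling so that long sequences do not underflow. The two-state model and the symbol sequence are toy values chosen only so that the result can be checked against a brute-force sum over all state paths; they are not taken from the paper.

```python
import itertools
import numpy as np

def log_likelihood(obs, A, B, pi):
    """log P(O | lambda) by the scaled forward recursion."""
    alpha = pi * B[:, obs[0]]
    loglik = 0.0
    for symbol in obs[1:]:
        norm = alpha.sum()
        loglik += np.log(norm)                 # accumulate the scale factors
        alpha = ((alpha / norm) @ A) * B[:, symbol]
    return loglik + np.log(alpha.sum())

def brute_force_likelihood(obs, A, B, pi):
    """P(O | lambda) by summing over every state path (tiny models only)."""
    total = 0.0
    for path in itertools.product(range(len(pi)), repeat=len(obs)):
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total

# Toy two-state, two-symbol model (illustrative values).
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
obs = [0, 1, 0, 0, 1]

loglik = log_likelihood(obs, A, B, pi)
reference = brute_force_likelihood(obs, A, B, pi)
print(np.exp(loglik), reference)               # the two values should agree
```

The raw log-likelihood decreases as the sequence gets longer. Since completion times in this study ranged from 82 to 1065 seconds, comparing users may call for dividing by the sequence length T; the paper does not state whether such a normalization was applied.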

4 Experimental Results

Thirteen GI endoscopists performed colonoscopy in the colon model. The time taken to reach the terminal ileum from the anus ranged from 82 seconds to 1065 seconds. The position measurements from the four electromagnetic sensors were logged continuously, as shown in Figure 2. Note that the orientations of the four sensors vary considerably as the colonoscope moves through the different regions of the colon (Figure 3 (a)). A number of kinematic metrics were computed and are shown in Table 1. The position trajectories of the four sensors were provided as input to the trained Expert HMM model. In addition, the orientation of the four sensors with respect to the electromagnetic transmitter was also logged. Based on the orientation of sensor 2, the position and orientation trajectories were segmented into descending, transverse and ascending colon. These trajectories were provided as input to the trained HMM models corresponding to the Descending, Transverse and Ascending colon. The distal end of the colonoscope is capable of bending in 2-DOF about the y and z axes. From the orientation of sensor 1 and sensor 2, the angulation of the colonoscope in 2-DOF was computed and is shown in Figure 4 (a). In addition, the roll of the colonoscope was calculated by measuring the roll of the local frame of the sensor at any time point with respect to the initial frame of reference, as shown in Figure 4 (b).
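As a rough illustration of how simple kinematic metrics can be derived from a logged position trajectory, the sketch below computes duration, path length and mean speed for one sensor. The exact definitions behind Table 1 are not given in the paper, so these formulas and the function name `kinematic_metrics` are assumptions; only the 67 Hz sampling rate comes from Section 2.

```python
import numpy as np

def kinematic_metrics(positions, fs=67.0):
    """Duration, path length and mean speed of one sensor trajectory.

    positions: array of shape (n_samples, 3); the output units follow the
    input units. The metric definitions are assumptions for illustration.
    """
    positions = np.asarray(positions, dtype=float)
    # Distance travelled between consecutive samples.
    steps = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    duration = (len(positions) - 1) / fs
    path_length = steps.sum()
    return {
        "time_s": duration,
        "path_length": path_length,
        "avg_speed": path_length / duration,
    }

# Synthetic check: a straight 5-unit move sampled for exactly one second.
line = np.linspace([0.0, 0.0, 0.0], [3.0, 4.0, 0.0], 68)
metrics = kinematic_metrics(line)
print(metrics)
```

Metrics of this kind reduce a whole procedure to a single number per user, which is the limitation raised in the Discussion when motivating the HMM approach.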


Fig. 2. Trajectory of the four electromagnetic sensors

Fig. 3. (a) Variation in orientation of sensor 2 in the descending, transverse and ascending colon. (b) Log-likelihood as a function of iteration during training.

Fig. 4. (a) Graph showing the angulation of the distal end of the scope (b) Graphshowing the roll of the four sensors


Fig. 5. (a) Spectral distribution of the motion of the colonoscope manipulated by a novice (b) Spectral distribution of the motion of the colonoscope manipulated by an expert

The spectral analyses of the position trajectories of a novice and an expert are shown in Figure 5. The STFTs of the position trajectories were converted to discrete observation symbols and provided to the different HMM models to first train and then predict the performance of the user. During training, only 2 expert trajectories were utilized to update the model parameters corresponding to the Expert HMM. The number of iterations required for complete training of the HMM was 18 and the log-likelihood of observing the training sequence as a function of the iteration is shown in Figure 3 (b). The trained HMM classifies the performance of an operator based on the manipulation of the colonoscope by the user. The result of the prediction from the Expert model is shown in Figure 6 (a). In addition, the segmented descending, ascending and transverse colon position trajectories were provided to the corresponding trained HMM models. These models provide insight into the expertise of the user in manipulating the colonoscope in the corresponding regions of the colon. The result of the prediction of the user’s performance in the three segments of the colon is shown in Figure 6 (a). In addition, the roll and angulation trajectories were provided as input to the HMM trained for identifying the gestures corresponding to the rotation and bending of the distal end of the colonoscope. The results corresponding to the HMM prediction of these gestures are shown in Figure 6 (b).

Fig. 6. (a) Evaluation of surgical performance of the users in manipulating the colonoscope compared to the expert model. The bar graph also shows the performance of the users in different regions of the colon. (b) Evaluation of the roll and angulation gestures compared to the expert model.

5 Discussion

We conclude from Table 1 that the time taken for completion of the procedure is far less for an expert clinician than for the fellows. In addition, the average path length of the four sensors for the expert group is less than that of the first, second and third year fellows. However, none of the simple kinematics parameters shows a significant difference among the four groups of users. Further, path length and time are not ideal kinematics parameters since in an actual colonoscopy procedure, the operator could take time or move the scope locally to study a particular feature or lesion. That is, path length and time are not only a function of the expertise of the clinician but also of the complexity of the procedure. In addition, the kinematics metrics provide a single global metric to quantify the performance of the clinician, which is not sufficient to analyze the trajectories in detail. For example, it would be useful to analyze the performance of the user in various regions. Typically, in conventional surgical training, the expert surgeon is considered as the gold standard of performance and the novices are trained to follow the expert’s movements. This method of training may be captured by the HMM by comparing the novice’s performance to the expert’s.

It is observed in Figure 5 that the experts have a larger frequency component compared to novices during the entire duration of the procedure. This is contradictory to the findings in [6], wherein it is observed that the novices have a larger frequency component in manipulating a laparoscope (rigid surgical tool) compared to the experts. It is our hypothesis that due to the flexibility of the colonoscope and the elastic nature of the colon, the high frequency component of the motion of the colonoscope is damped. In addition, the experts are observed to manipulate the distal end of the colonoscope with higher velocities (as suggested by Table 1), resulting in higher frequency components in the STFT. This may indicate that the approach adopted for training clinicians to perform laparoscopic surgery cannot be applied to endoscopy-based procedures.

Table 1. Metrics for evaluating clinician’s performance

| Group | Time (sec) | Pathlength (m) | Flexing (m) | Av.Vel. (mm/sec) | Av.Accel. (mm/sec^2) | Angulation Y (degrees) | Angulation Z (degrees) | Roll (rev.) |
|---|---|---|---|---|---|---|---|---|
| 1st Yr | 715.4 | 10.92 | 5.33 | 0.83 | 0.65 | 39.7 | 95.7 | 0.21 |
| 2nd Yr | 288.2 | 7.56 | 3.17 | 0.98 | 0.91 | 44.4 | 105.8 | 0.26 |
| 3rd Yr | 274.9 | 5.16 | 1.53 | 1.15 | 0.63 | 40.5 | 96.8 | 0.16 |
| Expert | 150.1 | 3.31 | 1.75 | 1.29 | 0.99 | 42.0 | 95.3 | 0.22 |

From Figure 6 (a), it can be seen that the performance of the 13 subjects can be clearly classified by the Expert trained HMM model based on their known expertise. The figure shows that the first year novices are less likely to achieve the “Expert” performance compared to other groups of users. It can also be seen that the performance of the first and second year residents is significantly less likely to match the expert performance in all three regions of the colon. However, the third year residents show comparable performance to the experts in the descending colon region. A likely explanation is that the descending colon is closest to the anus and therefore the length of the colonoscope inserted into the model is small, resulting in less flexing in the colonoscope and easier manipulation. However, once the colonoscope enters the transverse and ascending regions of the colon, the insertion becomes more difficult due to flexing and excessive curvature in the scope. This can be noticed in the performance of the residents compared to that of the experts (Figure 6 (a)). Figure 6 (b) shows the comparison of the gestures (roll and angulation of the distal end of the colonoscope) among the four groups of users. It is observed that the gestures performed by the residents are less likely to match “Expert” performance.

6 Conclusion

We have developed a Hidden Markov Model (HMM) to quantify the performance of a clinician performing colonoscopy using a realistic physical colon model. In addition, we have also analyzed the motion of the scope in each segment of the colon to identify the degree of expertise in manipulating the scope in the ascending, descending and transverse colon. We have shown that the HMM approach robustly classifies the expertise of the users based on their experience. In addition, the roll and angulation gestures are significantly different for the four groups of users and are clustered based on the expertise of the clinicians. This may provide a useful training tool to characterize the expertise of a physician in training. Further work is underway in validating the results of this work in human subjects.

References

1. Darzi, A., Smith, S., Taffinder, N.: Assessing operative skill: Need to become more objective. British Medical Journal 318, 887–888 (1999)

2. Satava, R., Cuschieri, A., Hamdorf, J.: Metrics for objective assessment: Preliminary results of the surgical skills workshop. Surgical Endoscopy 17, 220–226 (2003)

3. Peters, J., Fried, G., Swanstrom, L., Soper, N., Silin, L., Schirmer, B., Hoffman, K., et al.: Development and validation of a comprehensive program of education and assessment of the basic fundamentals of laparoscopic surgery. Surgery 135, 21–27 (2004)

4. Heinrichs, W., Srivastava, S., Montgomery, K., Dev, P.: The fundamental manipulations of surgery: A structured vocabulary for designing surgical curricula and simulators. J. Amer. Assoc. of Gynecologic Laparoscopists 11, 450–456 (2004)

5. Dosis, D., Aggarwal, R., Bello, F., Moorthy, K., Munz, Y., Gillies, D., Darzi, A.: Synchronized video and motion analysis of the assessment of procedures in the operating theater. Arch. Surg. 140, 293–299 (2005)


6. Megali, G., Sinigaglia, S., Tonet, O., Dario, P.: Modelling and evaluation of surgical performance using hidden markov models. IEEE Transactions on BioMedical Engineering 53, 1911–1919 (2006)

7. Allen, B., Nistor, V., Dutson, E., Carman, G., Lewis, C., Faloutsos, P.: Support vector machines improve the accuracy of performance evaluation of laparoscopic training tasks. Surgical Endoscopy, 1–14 (2009)

8. Blum, T., Padoy, N., Feußner, H., Navab, N.: Modeling and online recognition of surgical phases using hidden markov models. In: Metaxas, D., Axel, L., Fichtinger, G., Szekely, G. (eds.) MICCAI 2008, Part II. LNCS, vol. 5242, pp. 627–635. Springer, Heidelberg (2008)

9. Rosen, J., Brown, J., Chang, L., Sinanan, M., Hannaford, B.: Generalized approach for modeling minimally invasive surgery as a stochastic process using a discrete markov model. IEEE Trans. on Biomed. Eng. 53, 399–413 (2006)

10. Leong, J.J.H., Nicolaou, M., Atallah, L., Mylonas, G.P., Darzi, A.W., Yang, G.Z.: HMM assessment of quality of movement trajectory in laparoscopic surgery. Computer Aided Surgery 12, 335–346 (2007)

11. Rabiner, L.: A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286 (1989)

12. Hannaford, B., Lee, P.: Hidden Markov Model analysis of force/torque information in telemanipulation. The International Journal of Robotics Research 10, 528–539 (1991)

13. Likas, A., Vlassis, N., Verbeek, J.J.: The global k-means clustering algorithm. Pattern Recognition 36, 451–461 (2003)