

Are tungro disease counts repeatable?

K.G. Schoenly

Abstract

Scientists often assume that the measurements they take are repeatable across observers and sites; however, experience from different and unrelated disciplines indicates that inter-observer repeatability varies with the sophistication of the measurement, observer experience, and the measurement scale used. Recent developments in repeatability methodology from quantitative genetics, for example, have produced an easy-to-interpret repeatability index R, derived from a one-way ANOVA (the intraclass correlation), that varies from 0 (no repeatability) to 1 (perfect repeatability) and may be useful for plant protection workers. For measurements that are repeatable across a range of observers, the upper confidence limit for R (e.g., 95%) should equal or approach 1. For tungro disease counts, one study from Thailand revealed an R value of 0.1407 and an upper 95% confidence limit of 0.5501. Although this value is low, without more field counts from more sites it is premature to ask whether tungro disease counts are repeatable. Although no scientific measurement is expected to have perfect repeatability, this Thai study underscores the need for plant protection workers to conduct frequent repeatability trials of their scientific measurements as a routine quality assurance procedure, particularly when multiple persons are required at multiple sites and when data from such studies are pooled for later statistical analysis.

Repeatability of tungro counts

Measurements that scientists take are often assumed to be repeatable and highly precise across a range of observers (Krebs 1989); however, inter-observer repeatability varies with the sophistication of the measurement, observer experience, and the measurement scale used. Published trials of repeatability for plant injuries in rice are scarce but revealing. In examining possible causes of variation in varietal reaction to rice tungro disease, Ling (1979) reported that, of 561 rice varieties scored for their reaction to tungro by four observers in 1976, only 49% of within-observer readings (based on two readings of the same varieties) scored identically. Inter-scorer results from 28 rice varieties, taken by six scorers in 1978, showed that 43% of the varieties differed by 2 points on a 5-point scoring scale. This study also showed that scoring ability can be improved by experience. For field surveys, Ling (1979) recommended that data recording for tungro be restricted to one person using a tape recorder and suggested that a data transfer method be used instead of employing multiple persons to record data at the same site. Studies that require the same data to be collected at multiple sites, however, will likely employ multiple persons. If data from such studies are pooled for statistical analysis, inter-observer repeatability becomes an unavoidable scientific issue.

Recent developments in repeatability methodology from quantitative genetics (Becker 1984, Lessells and Boag 1987) have produced an easy-to-interpret repeatability index R (based on a one-way ANOVA classification; see Krebs 1989 for working examples) that varies from 0 (no repeatability) to 1 (perfect repeatability). R is also known as the intraclass correlation, accessible in standard biometrics textbooks (Sokal and Rohlf 1995, Zar 1984).
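As an illustration of how R is obtained from a one-way ANOVA, the sketch below follows the variance-component form given by Lessells and Boag (1987): R = s²A / (s² + s²A), where s² is the within-group mean square and s²A = (MS among − MS within) / n0. The function name and the small data set are hypothetical and are not taken from any study cited here; each inner list is assumed to hold the repeated measurements of one item (for example, the counts that different observers recorded at one sampling station).

```python
from statistics import mean


def repeatability(groups):
    """Repeatability R (intraclass correlation) from a one-way ANOVA.

    groups: list of lists; each inner list holds the repeated
    measurements of one item (hypothetical layout, e.g. one sampling
    station counted by several observers).
    """
    a = len(groups)                      # number of groups
    sizes = [len(g) for g in groups]     # measurements per group
    total_n = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total_n

    # Sums of squares among and within groups
    ss_among = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_among = ss_among / (a - 1)
    ms_within = ss_within / (total_n - a)

    # n0 equals the common group size when the design is balanced
    n0 = (total_n - sum(n * n for n in sizes) / total_n) / (a - 1)

    s2_among = (ms_among - ms_within) / n0   # among-group variance component
    return s2_among / (ms_within + s2_among)


# Hypothetical counts: three stations, two observers each
hypothetical_counts = [[1, 2], [3, 4], [5, 6]]
print(round(repeatability(hypothetical_counts), 4))  # 0.8824
```

Because s²A is estimated by subtraction, the estimate can fall below 0 when observers disagree more within a station than stations differ from one another; such values are usually read as no detectable repeatability.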

Repeatability methodology is perhaps most useful in the context of quality assurance for identifying both the source and magnitude of correctable error for groups of scientific measurements before they are routinely used in future laboratory, greenhouse, and field trials. In May 1999 at IRRI, for example, a small repeatability study asked 20 participants to record, individually and independently, several agronomic traits and injuries from the same 10 hills in the field. Tiller number ranked as the most repeatable measure (R = 0.8528 [mean of trained and untrained groups]), followed by plant height (0.7483) and number of whiteheads (0.5856; Schoenly and Domingo, unpublished data). In contrast, counts of tungro-diseased plants recorded by four observers at 10 sampling stations in one farmer’s field in Thailand (Disthaporn 1987) gave an R value of 0.1407 and an upper 95% confidence limit of 0.5501.
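The source does not state how the confidence limit for R was calculated. One standard approach for a balanced one-way design, sketched here under that assumption, uses the F ratio of the among-group to within-group mean squares, with a groups and n measurements per group. The symbols F_L, F_U, R_L, and R_U are introduced only for this illustration.

```latex
F = \frac{MS_A}{MS_W} = 1 + \frac{nR}{1 - R}

F_L = \frac{F}{F_{0.975}\bigl(a-1,\; a(n-1)\bigr)}, \qquad
F_U = F \cdot F_{0.975}\bigl(a(n-1),\; a-1\bigr)

R_L = \frac{F_L - 1}{F_L + n - 1}, \qquad
R_U = \frac{F_U - 1}{F_U + n - 1}
```

If the 10 sampling stations in the Thai study are treated as the groups (a = 10) and the four observers as the repeated measurements (n = 4), then R = 0.1407 gives F of about 1.655. With a tabled value of about 3.56 for F_{0.975}(30, 9), F_U is about 5.89 and R_U is about 0.55, which is consistent with the reported upper limit of 0.5501, although this agreement does not confirm that the same method was used.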

Without additional repeatability results at more sites, it is premature to ask whether multi-person field counts of tungro-diseased plants are repeatable relative to whitehead counts, for example. These studies, however, underscore the need for plant protection workers to conduct frequent repeatability trials of their scientific measurements before they are put to routine use and to revise or even discard, if necessary, those measurements that are not repeatable.

Other contexts where repeatability methodology is potentially useful include pre- and post-testing exercises in training workshops. This venue gives the principal investigator(s) the opportunity to observe the recorders firsthand and to detect and correct departures from protocol (Kahn and Sempos 1989). Publishing repeatability results alerts colleagues in plant protection disciplines to expected error values and confidence limits for scientific measurements that are gathered under specific laboratory, greenhouse, and field conditions.

References

Becker WA. 1984. A manual of quantitative genetics. Pullman, Washington: Academic Enterprises.

Disthaporn S. 1987. Studies on sampling methods for rice diseases in Thailand. Ph.D. dissertation. Justus-Liebig University.

Kahn HA, Sempos CT. 1989. Statistical methods in epidemiology. New York: Oxford University Press.

Krebs CJ. 1989. Ecological methodology. New York: Harper-Collins.

Lessells CM, Boag PT. 1987. Unrepeatable repeatabilities: a common mistake. Auk 104:116–121.


Ling KC. 1979. Variation in varietal reaction to rice tungro disease: possible causes. IRRI Research Paper Series 32:32–38.

Sokal RR, Rohlf FJ. 1995. Biometry. 3rd edition. New York: WH Freeman and Company.

Zar JH. 1984. Biostatistical analysis. 2nd edition. Englewood Cliffs, New Jersey: Prentice Hall.

Notes

Author’s address: K.G. Schoenly, International Rice Research Institute, MCPO Box 3127, Makati City 1271, Philippines.

Citation: Schoenly KG. 1999. Are tungro disease counts repeatable? p. 81-83. In: Chancellor TCB, Azzam O, Heong KL (editors). 1999. Rice tungro disease management. Proceedings of the International Workshop on Tungro Disease Management. 9-11 November 1998, International Rice Research Institute, Los Baños, Philippines, 166 p.
