
Authors’ Response to Reviews

Iain McLean, University of Oxford

André Blais, Université de Montréal

James C. Garand, Louisiana State University

Micheal Giles, Emory University

Political Studies Review: 2009, Vol. 7 (1), 88–92

We thank Political Studies Review for organising this symposium. Three of the four discussants have been involved at senior levels in the UK Research Assessment Exercise (RAE), and the fourth with the Australian equivalent. We thank them for their insights on the policy issues as well as the political science.

Ron Johnston (2009) accuses us of ‘quantifying the unquantifiable’; queries our impact score; and says we are ‘over-optimistic’ in interpreting the r² values we report. If social scientists succeed in quantifying something which is difficult to quantify they deserve praise, not criticism, which is due only if they quantify something that is literally impossible to quantify, or offer spurious precision in observations of noisy data.

Is journal quality unquantifiable? No. The ISI database generates a noisy signal. So does its newer rival, Scopus. So does Google Scholar, as an unobtrusive by-product of a tool designed for a different purpose. On books, which enter our analysis only indirectly, a noisy signal of quality is generated by the number of academic libraries that buy them (White et al., 2008; for a paper using our methodology to rank academic publishers, see Goodson et al., 1999). We offer another, admittedly noisy, signal: the evaluations of political science journals by those best qualified to evaluate them, namely academic political scientists in PhD-awarding university departments. Some of the noise in these rival signals may cancel out, getting us closer to the unknowable ‘true’ quality of each journal.

Furthermore, as all the discussants note, the quality of political science research is regularly judged by policy makers in the UK, Australia and numerous other countries. The results are used to dole out scarce resources. This is not going to cease happening; but, as the discussants note, there is a current dispute between proponents of peer review and of metric-based methods, with some protagonists maintaining that the latter are not appropriate in political science. However, a simple calculation of the number of items to be read and the number of person hours available for reading shows that peer reviewers must be using (possibly informal, possibly idiosyncratic) metrics if they are to do the job in the time available. Our article offers one more possible tool to assist them.

Johnston criticises our ‘impact’ measure as arbitrary. The measure was introduced by Garand (1990) as a summary number to embody both the familiarity and the perceived quality of each journal, and it has been used since then in the series of studies of which ours is one. As Johnston notes (2009, p. 53), Garand wrote in his 1990 paper: ‘In measuring journal impact it is necessary to weight the evaluation indicator by the familiarity measure’ (Johnston’s emphasis), and goes on to explain that a simple product of the two was unsatisfactory because it correlated with familiarity more closely than with evaluation. After some trial and error, the present measure was devised.

Johnston’s italicising of the word ‘necessary’ in his citation of Garand (1990) implies dissent. But Garand was making a plain statement of fact. Impact is indeed a function of both familiarity and perceived quality. If you want a summary measure of impact, you need to combine these two criteria – which ISI also does. If you do not want or need to do that, the raw familiarity and evaluation data are in our article. To Johnston’s further complaint that our impact measure embodies a value judgement about the priority of familiarity over score, we plead guilty as charged. Journals are more valuable and important to the discipline if many political scientists are familiar with the work published in a journal and think favourably of it. The problem with using only the mean evaluation score is that numerous journals get high marks from the small groups of scholars who read them. If 5 per cent of political scientists are familiar with a given journal and can comment on that journal, it is not having a discipline-wide impact. Hence, underpinning the use of the impact measure is the argument that the most important journals in political science are the ones that get high marks for the quality of research that they produce and are widely read by (or familiar to) political scientists.

Does our measure ‘privilege’ broad-based journals with strong favourable evaluations over journals that are thought of highly by narrower groups of scholars? Yes. But we think that this is a value position that is held by many political scientists. We do not mean to imply that political scientists should avoid specialist journals. Different sorts of paper are properly addressed to different audiences. One of us believes that his most important recent paper was in Welsh History Review (Peterson and McLean, 2007), which is in neither the ISI political science set nor our list of 92.

Finally, Johnston objects to our description of the r² values for the correlations between pairs of countries as ‘quite high’ (as does Russell; although Weale disagrees). Perhaps we were too prescriptive here. The data are what they are. These are population, not sample, data, and they are for the reader to interpret. Johnston’s endnote 6 introduces some confusion. The three correlations he reports are all correct, and they do not contradict one another, because the correlations reported by Garand (1990) are between different quantities from those reported by Giles et al. (1989). Garand created an initial impact measure by weighting journal evaluations by the proportion of respondents who were familiar with each respective journal. This impact measure was differentially correlated with its two component variables, familiarity (r = 0.9738) and mean evaluation (r = 0.5438). When Garand created his final impact measure by adding the mean evaluation and the familiarity-weighted mean evaluation, this variable was correlated with the component variables familiarity (r = 0.8615) and evaluation (r = 0.7952). Hence the new impact measure was similarly correlated with the two variables that were used in creating the measure, suggesting that these two components were relatively equally weighted in creating the final journal impact measure reported by Garand (1990), Garand and Giles (2003) and in our PSR article. We appreciate the opportunity to clarify this misunderstanding here.
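In symbols (our notation, not Garand’s), if F is the proportion of respondents familiar with a journal and E is its mean evaluation, the initial measure described above is the product F × E, and the final measure is E + F × E. The minimal sketch below is not the authors’ code; it uses made-up illustrative numbers rather than the survey data, simply to show how such a measure can be constructed and how its correlations with the two components can be checked.

```python
import numpy as np

# Hypothetical summaries for five journals (illustrative values only):
# familiarity = proportion of respondents familiar with each journal,
# evaluation  = mean quality rating from respondents familiar with it.
familiarity = np.array([0.95, 0.60, 0.35, 0.10, 0.05])
evaluation = np.array([8.5, 7.8, 8.0, 8.9, 9.1])

# Initial measure (Garand 1990, first attempt): evaluation weighted by familiarity.
weighted_evaluation = familiarity * evaluation

# Final measure: mean evaluation plus the familiarity-weighted mean evaluation,
# so that breadth of readership and perceived quality both contribute.
impact = evaluation + weighted_evaluation

def corr(x, y):
    """Pearson correlation between two vectors."""
    return float(np.corrcoef(x, y)[0, 1])

# Report how the constructed measure correlates with each of its components.
# (For the published survey data the response reports r = 0.8615 with
# familiarity and r = 0.7952 with evaluation; these toy numbers will differ.)
print("corr(impact, familiarity):", round(corr(impact, familiarity), 3))
print("corr(impact, evaluation): ", round(corr(impact, evaluation), 3))
```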

Albert Weale (2009) describes the 2001 UK RAE in political science, which he chaired. We are grateful that he thinks that our methodology ‘has a validity that is lacking in some other attempts to rank journals’ (p. 46). He points out that peer review uninformed by metrics suffers from too few peers. He goes on, however, to warn that a research assessment should not rely solely on a single metric such as our data. We agree.

Like Johnston, he insists that a good researcher may sometimes publish a good paper in a specialised or interdisciplinary journal, not familiar to most political scientists. However, as a long-standing editor of a leading journal, he can say authoritatively:

Within-journal variation of quality undoubtedly exists. To be sure, with highly ranked journals there is likely to be competition for publication and to the extent to which competition is a quality filter, the journal name will provide some evidence of quality (p. 46).

Define a ‘Type I’ error as publishing a poor quality paper and a ‘Type II’ error as failing to publish a paper of at least as high a quality as the lowest-quality paper published in the journal. Any editor will confess to making Type II errors. But the sheer pressure of submissions for high-quality journals will likely minimise the Type I errors associated with them. When an article is reviewed for a journal it receives peer review, typically by a set of scholars with some level of expertise over its subject matter and methods. Also, reviewers have higher standards for what they consider to be the most prestigious journals. Accepted work is likely to be of higher quality, consider broader questions with broader data and be assessed by the reviewers to have the potential for a major impact on the field.

Andrew Russell (2009) makes a number of policy points, which we consider later. He makes three methodological points.

(1) In setting up our internal panel to evaluate candidate journals to add to our lists, we were engaging in a mini-peer review ourselves. Response: Yes, we were. That shows that metrics and peer review are complements rather than substitutes.

(2) Some sub-fields may have responded more fully than others to our survey, which may bias our results. Response: Yes, that is possible: we have sub-field specialism for our respondents, but not for non-respondents, and we are unaware of population data on sub-field for the three populations of interest. However ...

(3) Some sub-field interests may have gamed our surveys by encouraging sub-field colleagues to respond, which may augment any bias. Response: Once this possibility becomes common knowledge, it becomes self-defeating as all sub-field lobbies will, in equilibrium, do the same, which will induce a welcome increase in total response to surveys such as ours.

Claire Donovan (2009) suggests that a future exercise might extend to getting respondents to evaluate academic book publishers as well. It might: the constraints on our survey were research budget and respect for the limited time of busy respondents. The first can be overcome; the second may be trickier. Meanwhile, unobtrusive measures such as those being piloted by Howard White et al. (2008), also cited by Donovan, are being developed.

All four responses discuss the policy context, especially in the UK. Johnston, Russell and Donovan all share the widely expressed disquiet in humanities and social science at any move to an exclusively metrics-based assessment of people or departments in the Research Excellence Framework (REF) that is to succeed the UK’s RAE. They note that the UK government’s 2006 attempt to scrap the 2008 RAE when it was already in progress was met not with the boundless joy that the government may have expected, but with deep scepticism. We might add that the UK government’s purported demonstration that a metrics-based assessment would have produced quick, cheap results failed to control for university size, so that HM Treasury’s graphs proved that small universities were small and large universities were large – see Figure 1.

Figure 1: A Textbook Example of Spurious Correlation (Source: HM Treasury, 2006. Crown Copyright)

This shows that naïve reliance on metrics can be as dangerous as naïve reliance on peer review. Our article argued that journal evaluations complement citation data. If we want to measure ‘quality’, which we agree is a subjective evaluation, it makes more sense to use subjective evaluations than to count citations. The conceptual fit seems better.

Our article does not recommend that metrics supplant peer review. If there is a policy lesson, it is that metrics should inform peer review. Smart peer reviewers might begin by distinguishing between (1) an effort to determine the initial quality and (potential) impact of an article and (2) an effort to assess the actual influence/impact of the article. Using a panel of scholars to accomplish (1) seems a pointless reduplication of the panel of reviewers serving the journal that published the work and grossly inefficient compared to reliance on the perceived quality/prestige of the journal in which the work appears. On the other hand, such scholars could serve a purpose in evaluating evidence for (2), and in evaluating papers published in more specialist journals. Our article appeals to a principle well known to Aristotle and Condorcet – the wisdom of crowds; at least, the wisdom of crowds of political scientists.

(Accepted: 24 September 2008)

About the Authors

Iain McLean, Department of Politics and International Relations, University of Oxford, Oxford OX1 1NF, UK; email: [email protected]

André Blais, Département de science politique, Université de Montréal, CP 6128, succursale Centre-ville, Montréal, Québec, H3C 3J7, Canada; email: [email protected]

James C. Garand, Department of Political Science, 205 Stubbs Hall, Louisiana State University, Baton Rouge, Louisiana 70803-5433, USA; email: [email protected]

Micheal Giles, Department of Political Science, Emory University, Atlanta, Georgia, USA; email: [email protected]

References

Donovan, C. (2009) ‘Gradgrinding the Social Sciences: The Politics of Metrics of Political Science’, Political Studies Review, 7 (1), 73–83.

Garand, J. C. (1990) ‘An Alternative Interpretation of Recent Political Science Journal Evaluations’, PS: Political Science and Politics, 23 (2), 448–51.

Garand, J. C. and Giles, M. W. (2003) ‘Journals in the Discipline: A Report on a New Survey of American Political Scientists’, PS: Political Science and Politics, 36 (2), 293–308.

Giles, M., Mizell, F. and Paterson, D. (1989) ‘Political Scientists’ Journal Evaluation Revisited’, PS: Political Science and Politics, 22 (3), 613–7.

Goodson, L., Dillman, B. and Hira, A. (1999) ‘Ranking the Presses: Political Scientists’ Evaluations of Publisher Quality’, PS: Political Science and Politics, 32 (2), 257–62.

Treasury, H. M. (2006) Science and Innovation Investment Framework 2004–2014: Next Steps. London: published for HM Treasury and other departments by The Stationery Office. Available from: http://www.hm-treasury.gov.uk/media/7/8/bud06_science_332v1.pdf [Accessed 15 September 2008].

Johnston, R. (2009) ‘Where There are Data ... Quantifying the Unquantifiable’, Political Studies Review, 7 (1), 50–62.

Peterson, S. and McLean, I. (2007) ‘Of Wheat, the Church in Wales, and the West Lothian Question’, Welsh History Review, 23 (3), 151–74.

Russell, A. (2009) ‘Retaining the Peers: How Peer Review Triumphs over League Tables and Faulty Accounting in the Assessment of Political Science Research’, Political Studies Review, 7 (1), 63–72.

Weale, A. (2009) ‘Metrics versus Peer Review?’, Political Studies Review, 7 (1), 39–49.

White, H. D., Boell, S. K., Yu, H., Davis, M., Wilson, C. S. and Fletcher, C. (2008) ‘Libcitations: A Measure for Comparative Assessment of Book Publications in the Humanities and Social Sciences’. Unpublished paper, College of Information Science and Technology, Drexel University/Bibliometrics and Informetrics Research Group, University of New South Wales.
