Statistics in Nutrition and Dietetics

Michael Nelson
Emeritus Reader in Public Health Nutrition
King’s College London
Public Health Nutrition Research Ltd

UK

This edition first published 2020 © 2020 by John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Michael Nelson to be identified as the author of this work has been asserted in accordance with law.

Registered Office(s)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty
The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting scientific method, diagnosis, or treatment by physicians for any particular patient. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data
Names: Nelson, Michael (Nutritionist), author.
Title: Statistics in nutrition and dietetics / Michael Nelson.
Description: Hoboken, NJ : John Wiley & Sons, 2020. | Includes bibliographical references and index.
Identifiers: LCCN 2019030279 (print) | ISBN 9781118930649 (paperback) | ISBN 9781118930632 (adobe pdf) | ISBN 9781118930625 (epub)
Subjects: MESH: Nutritional Sciences–statistics & numerical data | Statistics as Topic | Research Design
Classification: LCC RM217 (print) | LCC RM217 (ebook) | NLM QU 16.1 | DDC 613.2072/7–dc23
LC record available at https://lccn.loc.gov/2019030279
LC ebook record available at https://lccn.loc.gov/2019030280

Cover Design: Laurence Parc | NimbleJack & Partners | www.nimblejack.co.uk
Cover Image: © filo/Getty Images; PHN Courtesy of Public Health Nutrition Research Ltd

Set in 10.5/13pt STIXTwoText by SPi Global, Pondicherry, India

10 9 8 7 6 5 4 3 2 1

To Stephanie

Contents

About the Author

Preface

Acknowledgements

About the Companion Website

PART 1 Setting the Statistical Scene

CHAPTER 1 The Scientific Method

CHAPTER 2 Populations and Samples

CHAPTER 3 Principles of Measurement

CHAPTER 4 Probability and Types of Distribution

CHAPTER 5 Confidence Intervals and Significance Testing

PART 2 Statistical Tests

CHAPTER 6 Two Sample Comparisons for Normal Distributions: The t-test

CHAPTER 7 Nonparametric Two-Sample Tests

CHAPTER 8 Contingency Tables, Chi-Squared Test, and Fisher’s Exact Test

CHAPTER 9 McNemar’s Test

CHAPTER 10 Association: Correlation and Regression

CHAPTER 11 Analysis of Variance

PART 3 Doing Research

CHAPTER 12 Design, Sample Size, and Power

CHAPTER 13 Describing Statistical Models and Selecting Appropriate Tests

CHAPTER 14 Designing a Research Protocol

CHAPTER 15 Presenting Results to Different Audiences

PART 4 Solutions to Exercises

APPENDIX A1 Probabilities (P) of the Binomial Distribution for n, r, and p (Based on Sample Proportions) or π (Proportion in the Population)

APPENDIX A2 Areas in the Tail of the Normal Distribution

APPENDIX A3 Areas in the Tail of the t Distribution

APPENDIX A4 Wilcoxon U Statistic (Mann–Whitney Test)

APPENDIX A5 Wilcoxon T Statistic

APPENDIX A6 Sign Test Statistic R

APPENDIX A7 Percentages in the Tail of the Chi-Squared Distribution

APPENDIX A8 Quantiles of the Spearman Rank Correlation Coefficient

APPENDIX A9 Percentages in the Tail of the F Distribution

APPENDIX A10 Flow Chart for Selecting Statistical Tests

Index

About the Author

Dr. Michael Nelson is Emeritus Reader in Public Health Nutrition at King’s College London, and former Director of Research and Nutrition at the Children’s Food Trust. He is currently Director of Public Health Nutrition Research Ltd (http://www.phnresearch.org.uk/).

His early career with the Medical Research Council sparked a keen interest in nutritional epidemiology, statistics, and measurement validity. Research interests have included the diets of UK school children and links between diet and poverty, cognitive function, behaviour and attainment, and monitoring the impact of standards on school lunch take‐up and consumption. He collaborates nationally and internationally to promote a strong evidence base for school food policy. He has published over 200 peer‐reviewed articles and other publications in the public domain.

January 2020

Preface

WHY IS THIS BOOK NEEDED?

Worldwide, there is no basic statistics textbook that provides examples relevant to nutrition and dietetics. While it could be argued that general medical science statistics texts address the needs of nutrition and dietetics students, it is clear that students find it easier to take on board the concepts relating to statistical analysis and research if the examples are drawn from their own area of study. Many books also make basic assumptions about students’ backgrounds that may not always be appropriate, and use statistical jargon that can be very off‐putting for students who are coming to statistics for the first time.

WHO IS THIS BOOK FOR?

The book is aimed at undergraduate and postgraduate students studying nutrition and dietetics, as well as their tutors and lecturers. In addition, there are many researchers in nutrition and dietetics who apply basic statistical techniques in the analysis of their data, for whom a basic textbook provides useful guidance, and which helps to refresh their university learning in this area with examples relevant to their own field.

LEVEL AND PRE-REQUISITE

The level of the material is basic. It is based on a course that I taught at King’s College London over many years to nutrition and dietetics students, physiotherapists, nurses, and medical students. One of the aims was to take the fear and boredom out of statistics. I did away with exams, and assessed understanding through practical exercises and coursework.

This book takes you only to the foothills of statistical analysis. A reasonable competence with arithmetic and a little algebra are required. For the application of more demanding and complex statistical techniques, the help of a statistician will be needed. Once you have mastered the material in this book, you may want to attempt a more advanced course on statistics.

AIMS AND SCOPE

The aim of this book is to provide clear, uncomplicated explanations and examples of statistical concepts and techniques for data analysis relevant to learning and research in nutrition and dietetics. There are lots of short, practical exercises to work through. These support insight into why various tests work. There are also examples of SPSS¹ output for each test. This makes it possible to marry up the outcomes computed manually with those produced by the computer. Examples are taken from around the globe relating to all aspects of nutrition, from biochemical experiments to public health nutrition, and from clinical and community practice in dietetics. All of this is complemented by material online, including data sets ready for analysis, so that students can begin to understand how to generate and interpret SPSS output more clearly.

¹ SPSS stands for ‘Statistical Package for the Social Sciences’. It was developed at Stanford University in California, and the first manual was authored by Norman Nie, Dale Bent, and Hadlai Hull in 1970. The package was bought in 2009 by IBM. The worked examples and syntax in this book are based on Version 24 (2016). It has come a long way since its first incarnation, in terms of ease of use, error trapping, and output. Be grateful.

The book focuses on quantitative analysis. Qualitative analysis is highly valuable, but uses different approaches to data collection, analysis, and interpretation. There is an element of overlap, for example when quantitative statistical approaches are used to assess opinion data collected using questionnaires. But the two approaches have different underlying principles regarding data collection and analysis. They complement one another, but cannot replace one another.

Two things this book is not. First, it is not a ‘cookbook’ with formulas. Learning to plug numbers into formulas by rote does not provide insight into why and how statistical tests work. Such books are good for reminding readers of the formulas which underlie the tests, but useless at conveying the necessary understanding to analyze data properly or read the scientific literature intelligently. Second, it is not a course in SPSS or Excel. While SPSS and Excel are used to provide examples of output (with some supporting syntax for clarity), this book is no substitute for a proper course in computer‐based statistical analysis.

Scope

The book provides:

• a basic introduction to the scientific method
• an understanding of populations and samples, principles of measurement, and confidence intervals
• an understanding of the basic theory underlying basic statistical tests, including ‘parametric’ tests (those intended for use with data that follow mathematically defined distributions such as the so‐called ‘normal’ distribution); and ‘non‐parametric’ tests, for use with data distributions that are not parametric
• lots of worked examples and exercises that show how to compute the relevant outcome measures for each test, both by hand and using SPSS
• real examples from the nutrition and dietetics literature, including biochemical, clinical, and population‐based examples
• principles of research design, transformations, the relevance of sample size, and the concept and calculation of Power

All of the exercises have worked solutions.

Some students say, ‘Why do we have to do the exercises by hand when the computer can do the same computations in a fraction of a second?’ The answer is: computers are stupid. The old adage ‘garbage in, garbage out’ means that if you don’t have insight into why certain tests work the way they do, a computer will generate output that might be meaningless, but it won’t tell you that you’ve made a mistake, or ask ‘Is this really what you wanted to do?’ So, the purpose of the textbook and supporting learning materials is to help ensure that when you do use a computer, what goes in isn’t garbage, and what comes out is correct and provides meaningful answers to your research questions that you can interpret intelligently.

Finally, it is worth saying that some students will find that this textbook provides welcome explanations about why things work the way they do. Others will find it annoyingly slow and detailed, with too much explanation for concepts and applications that seem readily apparent. If you are in the first group, I hope you enjoy the care with which explanations and examples are presented and that it helps to demystify what may at first seem a difficult topic. If you are in the second group, read quickly to get to the heart of the matter, and look for other references and resources for material that you feel is better suited to what you want to achieve. However hard or easy the text seems, students in both groups should seek to make friends with a local statistician or tutor experienced in statistical analysis and not try to do it alone.

Unique features

There are many unique features in this textbook and supporting material:

• Examples specific to nutrition and dietetics
• Clear simple language for students unfamiliar with statistical terms and approaches. For many students, the study of statistics is seen as either a burden or irrelevant to their decision to study nutrition and/or dietetics. But they will be required to pass a statistics module as part of their course. The aim is to make this as engaging and painless as possible.
• Lots of worked examples, with examples of SPSS output to help students with the interpretation of their analyses in the future.
• Putting statistics into context so that it is relevant to many undergraduate and postgraduate research projects.
• A website that provides complementary exercises, data sets, and learning and teaching tools and resources for both students and tutors.

CONTENTS

This textbook is based on over 20 years of teaching experience. There are four parts:

Part 1: Setting the statistical scene

This introduces concepts related to the scientific method and approaches to research; populations and samples; principles of measurement; probability and types of distribution of observations; and the notion of statistical testing.

Part 2: Statistical tests

This covers the basic statistical tests for data analysis. For each test, the underlying theory is explained, and practical examples are worked through, complemented by interpretation of SPSS output.

Part 3: Doing research

Most undergraduate and postgraduate courses require students to collect data and/or interpret existing data sets. This section places the concepts in Part 1 and the learning in Part 2 into a framework to help you design studies, and determine sample size and the strength of a study to test your hypothesis (‘Power’). A Flow Chart helps you select the appropriate statistical test for a given study design.

The last chapter explores briefly how to present findings to different audiences – what you say to a group of parents in a school should differ in language and visual aids from a presentation to a conference of your peers.

Part 4: Solutions to exercises

It would be desperately unfair of me to set exercises at the end of each chapter and not provide the solutions. Sometimes the solutions are obvious. Other times, you will find a commentary about why the solution is what it is, and not something else.

ONLINE

No textbook is complete these days without online resources that students and tutors can access. For this textbook, the online elements include:

• Teaching tools
  ○ Teaching notes
  ○ PowerPoint slides for each chapter
  ○ SPSS data, syntax, and output files
• Learning resources
  ○ Links to online software and websites that support learning about statistics and use of statistical software

TEACHING TOOLS

Teaching notes

For lecturers delivering courses based on the textbook, I have prepared brief teaching notes. These outline the approach taken to teach the concepts set out in the textbook. I used traditional lecturing coupled with in‐class work, practical exercises, homework, and research protocol development. My current practice is to avoid exams for any of this material. Exams and formal tests tend to distract students from revision of study materials more central to their course. Some students get completely tied up in knots about learning stats, and they fret about not passing the exam, ultimately to their academic disadvantage.

PowerPoint slide sets

The principal aid for tutors and lecturers is slide sets in PowerPoint. These save hours of preparation, provide consistent format of presentation, and build on approaches that have worked well with literally thousands of students who have taken these courses. When using the slides outside the context of teaching based on the textbook, please ensure that you cite the source of the material.

SPSS data, syntax, and output files

A complete set of SPSS files for the examples and exercises in the textbook is provided.

Learning resources

The page on Learning Resources includes website links and reviews of the strengths of a number of sites that I like and find especially helpful.

Unsurprisingly, there is a wealth of websites that support learning about statistics. Some focus on the basics. These are mainly notes from University courses that have been made available to students online. Some are good, some are not so good. Many go beyond the basics presented in this textbook. Diligent searching by the student (or tutor) will no doubt unearth useful material. This will be equivalent to perusing the reading that I outline in the Introduction to Chapter 1.

Flow Charts are useful to find the statistical test that best fits the data. Appendix A10 in this book shows one. There are more online. Two that I like are described in more detail on the Learning Resources page. I have also included links to sites for determining Power and sample size.

Finally, guidance on the use of Excel and SPSS in statistics is very helpful. There are many sites that offer support, but my favourites are listed on the Learning Resources page.

Acknowledgements

I would like to thank the hundreds of students who attended my classes on research methods and statistics. They gave me valuable feedback on what worked and what didn’t in the teaching sessions, the notes, and exercises. Irja Haapala and Peter Emery at King’s College London took over the reins when I was working on other projects and made helpful contributions to the notes and slides. Charles Zaiontz at Real Statistics kindly helped with the Wilcoxon U table, and Ellen Marshall at Sheffield Hallam University very helpfully made available the data on diet for the two‐way analysis of variance. Mary Hickson at the University of Plymouth made helpful comments on the text. Mary Hickson, Sarah Berry, and Wendy Hall at King’s College London, and Charlotte Evans at the University of Leeds kindly made data sets available. Thanks to the many colleagues who said, ‘You should turn the notes into a book!’ Stephanie, Rob, Tom, Cherie, and Cora all gave me great encouragement to keep going and get the book finished. Tom and Cora deserve a special thanks for the illustrations of statisticians. The Javamen supplied the coffee. Finally, I would like to thank Sandeep Kumar, Yogalakshmi Mohanakrishnan, Thaatcher Missier Glen, Mary Aswinee Anton, James Schultz, Madeleine Hurd, and Hayley Wood at Wiley’s for bearing with me over the years, and for their support, patience and encouragement.

About the Companion Website

This book is accompanied by a companion Website:

www.wiley.com/go/nelson/statistics

The Website includes:

• Datasets
• Learning resources
• Teaching notes

Statistics in Nutrition and Dietetics, First Edition. Michael Nelson. © 2020 John Wiley & Sons Ltd. Published 2020 by John Wiley & Sons Ltd. Companion website: www.wiley.com/go/nelson/statistics

PART 1 SETTING THE STATISTICAL SCENE

Learning Objectives
You should be reading this textbook because you want to:

• Learn how to design and analyze research projects
• Develop the skills to communicate the results and inferences from research
• Learn to evaluate the scientific literature

The ideas upon which these skills are founded – an understanding of the scientific method, an introduction to different models of scientific investigation, and the statistical tools to understand the significance of research findings – form the core of this book. Practical, worked examples are used throughout to facilitate an understanding of how research methods and statistics operate at their most fundamental level. Exercises are given at the end of each chapter (with detailed, worked solutions at the end of the book, with more examples and solutions online) to enable you to learn for yourself how to apply and interpret the statistical tools.

Approaching the Statistician

I have a grown‐up son and a grand‐daughter, age 6 and ¾. They are both very artistic. When I asked them to put their heads together and draw a picture of a statistician by way of illustration for this book, this is what they came up with (Figure 1):

‘What’s that!’ I cried. ‘He’s hideous!’

‘Well’, they explained, ‘the eyes are for peering into the dark recesses of the student’s incompetence, the teeth for tearing apart their feeble attempts at research design and statistical analysis and reporting, and the tongue for lashing them for being so stupid’.

‘No, no, no’, I said. ‘Statisticians are not like that’. So here was their second attempt (Figure 2):

‘That’s better’, I said.

They interpreted the new drawing. ‘Statisticians may appear a bit monstrous, but really they’re quite cuddly. You just have to become familiar with their language, and then they will be very friendly and helpful. Don’t be put off if some of them look a bit flabby or scaly. This one can also recommend a great dentist and a very creative hair‐stylist’.

Using Computers

Because computers can do in a few seconds what takes minutes or hours by hand, the use of computer statistical software is recommended and encouraged. However, computers are inherently stupid, and if they are not given the correct instructions, they will display on screen a result which is meaningless in relation to the problem being solved. It is vitally important, therefore, to learn how to enter relevant data and instructions correctly and interpret computer output to ensure that the computer has done what you wanted it to do. Throughout the book, examples of output from SPSS are used to show how computers can display the results of analyses, and how these results can be interpreted.

This text is unashamedly oriented toward experimental science and the idea that things can be measured objectively or in controlled circumstances. This is a different emphasis from books which are oriented toward qualitative science, where descriptions of how people feel or perceive themselves or others are of greater importance than quantitative measures such as nutrient intake or blood pressure. Both approaches have their strengths and weaknesses, and it is not my intention to argue their relative merits here.

The examples are taken mainly from studies in nutrition and dietetics. The aim is to provide material relevant to the reader’s working life, be they students, researchers, tutors, or practicing nutrition scientists or dietitians.

FIGURE 1 A SCARY STATISTICIAN.

FIGURE 2 A FRIENDLY STATISTICIAN.

CHAPTER 1

The Scientific Method

Learning Objectives
After studying this chapter you should be able to:

• Describe the process called the scientific method: the way scientists plan, design, and carry out research
• Define different types of logic, hypotheses, and research designs
• Know the principles of presenting data and reporting the results of scientific research

1.1 KNOWING THINGS

What can I know? – Immanuel Kant, philosopher

The need to know things is essential to our being in the world. Without learning we die. At the very least, we must learn how to find food and keep ourselves warm. Most people, of course, are interested in more than these basics, in developing lives which could be described as fulfilling. We endeavour to learn how to develop relationships, earn a livelihood, cope with illness, write poetry (most of it pretty terrible), and make sense of our existence. At the core of these endeavours is the belief that somewhere there is the ‘truth’ about how things ‘really’ are.

Much of the seeking after truth is based on feelings and intuition. We may ‘believe’ that all politicians are corrupt (based on one lot of evidence), and at the same time believe that people are inherently good (based on a different lot of evidence). Underlying these beliefs is a tacit conviction that there is truth in what we believe, even though all of our observations are not consistent. There are useful expressions like: ‘It is the exception that proves the rule’ to help us cope with observations that do not fit neatly into our belief systems. But fundamentally, we want to be able to ‘prove’ that what we believe is correct (i.e. true), and we busy ourselves collecting examples that support our point of view.

Some beliefs are easier to prove than others. Arguments rage about politics and religion, mainly because the evidence which is presented in favour of one position is often seen as biased and invalid by those who hold differing or opposing points of view. Some types of observations, however, are seen as more ‘objective’. In science, it is these so‐called objective measures, ostensibly free from bias, that are supposed to enable us to discover truths which will help us, in a systematic way, to make progress in improving our understanding of the world and how it works. This notion may be thought to apply not only to the physical and biological sciences but also the social sciences, and even disciplines such as economics. There are ‘laws’ which are meant to govern the ways in which things or people or economies behave or interact. These ‘laws’ are developed from careful observation of systems. They may even be derived from controlled experiments in which researchers try to hold constant the many factors which can vary from one setting to another, and allow only one or two factors to vary in ways which can be measured systematically.

It is clear, however, that most of the laws which are derived are soon superseded by other laws (or truths) which are meant to provide better understanding of the ways in which the world behaves. This process of old truths being supplanted by new truths is often a source of frustration to those who seek an absolute truth which is secure and immutable. It is also a source of frustration to those who believe that science provides us with objective facts, and who cannot therefore understand why one set of ‘facts’ is regularly replaced by another set of ‘facts’ which are somehow ‘more true’ than the last lot. It is possible, however, to view this process of continual replacement as a truth in itself: this law states that we are unlikely¹ ever to find absolute truths or wholly objective observations, but we can work to refine our understanding and observations so that they more nearly approximate the truth (the world ‘as it is’). This assumes that there is in fact an underlying truth which (for reasons which we will discuss shortly) we are unable to observe directly.²

Karl Popper puts it this way:

We can learn from our mistakes. The way in which knowledge progresses, and especially our scientific knowledge, is by unjustified (and unjustifiable) anticipations, by guesses, by tentative solutions to our problems, by conjectures. These conjectures are controlled by criticism; that is, by attempted refutations, which include severely critical tests. Criticism of our conjectures is of decisive importance: by bringing out our mistakes it makes us understand the difficulties of the problems which we are trying to solve. This is how we become better acquainted with our problem, and able to propose more mature solutions: the very refutation of a theory – that is, of any serious tentative solution to our problem – is always a step forward that takes us nearer to the truth. And this is how we can learn from our mistakes.

From ‘Conjectures and Refutations. The Growth of Scientific Knowledge’ [1].

This is a very compassionate view of human scientific endeavour. It recognizes that even the simplest of measurements is likely to be flawed, and that it is only as we refine and improve our ability to make measurements that we will be able to develop laws which more closely approximate the truth. It also emphasizes a notion which the atomic physicist Heisenberg formulated in his Uncertainty Principle. The Uncertainty Principle states in general terms that as we stop a process to measure it, we change its characteristics. This is allied to the other argument which states that the observer interacts with the measurement process. Heisenberg was talking in terms of subatomic particles, but the same problem applies when measuring diet, or blood pressure, or even more subjective things like pain or well‐being. Asking someone to reflect on how they feel, and the interaction between the person doing the measuring and the subject, has the potential to change the subject’s behaviour and responses. This is contrary to Newton’s idea that measurement, if carried out properly, could be entirely objective. It helps to explain why the discovery of the ‘truth’ is a process under continual refinement and not something which can be achieved ‘if only we could get the measurements right’.

¹ As you can see, I am already beginning to hedge my bets. I am not saying that we will never find absolute truths or wholly objective observations. I am saying that it is unlikely. How unlikely is the basis for another discussion.

² My favourite description of the world is Proposition 1 from Wittgenstein’s Tractatus Logico‐Philosophicus: ‘The world is everything that is the case’. The Propositions get better and better. Have a look at: http://en.wikipedia.org/wiki/Tractatus_Logico-Philosophicus, or Wittgenstein for Beginners by John Heaton and Judy Groves (1994), Icon Books Ltd., if you want to get more serious.

Consider the question: ‘What do you understand if someone says that something has been proven “scientifically”?’ While we might like to apply to the demonstration of scientific proof words like ‘objective’, ‘valid’, ‘reliable’, ‘measured’, ‘true’, and so on, the common meanings of these words are very difficult to establish in relation to scientific investigation. For all practical purposes, it is impossible to ‘prove’ that something is ‘true’. This does not allow you to go out and rob a bank, for example. An inability on the part of the prosecution to establish the precise moment of the robbery, for instance, would not excuse your action in a court of law. We cope in the world by recognizing that there is a substantial amount of inexactitude, or ‘error’, in all that we do and consider. This does not mean we accept a shop‐keeper giving us the wrong change, or worry that a train timetable gives us information only to the nearest minute when the train might arrive before (or more commonly after) the advertised time. There are lots of ‘gross’ measurements that are good enough for us to plan our days without having to worry about the microdetail of all that we see and do.

For example, it is very difficult to describe an individual’s ‘usual’ intake of vitamin C, or to relate that person’s intake of vitamin C to risk of stroke. On the other hand, if we can accumulate sufficient evidence from many observations to show that increasing levels of usual vitamin C intake are associated with reduced risk of stroke (allowing for measurement error in assessing vitamin C intake and the diagnosis of particular types of stroke), it helps us to understand that we can work with imprecise observations and laws which are not immutable. Moreover, it is important (for the sake of the growth of scientific knowledge) that any belief which we hold is formulated in a statement in such a way as to make it possible to test whether or not that statement is true. Statements which convey great certainty about the world but which cannot be tested will do nothing to improve our scientific understanding of the world. The purpose of this book, therefore, is to learn how to design studies which allow beliefs to be tested, and how to cope with the imprecision and variation inherent in all measurements when both collecting and analyzing data.

1.2 LOGIC

In science, we rely on logic to interpret our observations. Our aim is usually to draw a conclusion about the ‘truth’ according to how ‘strong’ we think our evidence is. The type of observations we choose to collect, and how we collect them, is the focus of research design: ‘How are we going to collect the information we need to test our belief?’ The decision about whether evidence is ‘strong’ or ‘weak’ is the province of statistics: ‘Is there good evidence that our ideas are correct?’ As Sherlock Holmes put it, ‘It is a capital mistake to theorise before one has data’.³

There are two types of logic commonly applied to experience.

1.2.1 Inductive Logic

The aim with inductive logic is to infer a general law from particular instances: arguing from the particular to the general. This type of logic is good for generating new ideas about what we think might be true. It is less good for testing ideas about what we think is true.

³ Sherlock Holmes to Watson in: A Scandal in Bohemia


Examples of Research Designs that Depend on Inductive Logic

Case studies provide a single example of what is believed to be true. The example is so compelling by itself that it is used to infer that the particular instance described may be generally true. For example:

A dietitian treating a severely underweight teenage girl worked with a psychotherapist and the girl’s family to create an understanding of both the physiological and psychological basis and consequences of the disordered eating, resulting in a return to normal weight within six months. The approach contained unique elements not previously combined, and could be expected to have widespread benefit for similar patients.

A case study can be interesting and provide a powerful example. But it provides very limited evidence of the general truth of the observation.

Descriptive studies bring together evidence from a number of related observations that demonstrate repeatability in the evidence. For example:

In four old people’s homes, improved dining environments using baffles to reduce noise interference and allowing more time for staff to take orders and serve meals resulted in improved nutritional status among residents after one year.

This type of cross‐sectional evidence from numerous homes is better than evidence from a single home or a case study.

The generalizable conclusion, however, depends on a number of factors that might also need to be taken into account: what was the turnover among residents – did new residents have better nutritional status when they arrived, were they younger with better appetites, did they have better hearing so that they could understand more clearly what options were available on the menu for that day, etc.? One of the difficulties with descriptive studies is that we may not always be comparing like with like. We would have to collect information to demonstrate that apart from differences in noise levels and serving times, there were no other differences which could account for the change in nutritional status. We would also want to know if the circumstances in the four selected care homes were generalizable to other care homes with a similar population of residents.

Experimental studies are designed to assess the effect of a particular influence (exposure) on a particular outcome. Other variables which might affect the outcome are assumed to be held constant (or as constant as possible) during the period of evaluation.

Establish if a liquid iron preparation is effective in treating anaemia.

If an influence produces consistent effects in a chosen group of subjects, we are tempted to conclude that the same influences would have similar effects in all subjects with similar characteristics. When we evaluated the results from our observations, we would try to ensure that other factors which might affect the outcome (age, sex, dietary iron intake, dietary inhibitors of iron absorption, etc.) were taken into account.

1.2.2 Deductive Logic

Deductive logic argues from the general to the particular. This type of logic involves a priori reasoning. This means that we think we know the outcome of our observations or experiment even before we start. What is true generally for the population⁴ will be true for each individual within the population. Here is a simple example:

All animals die.
My dog is an animal.
My dog will die.

⁴ The term ‘population’ is defined in Chapter 2. It is not limited to the lay definition of all people living in a country. Instead, we can define our ‘population’. In the example above, we are talking about the population of all animals (from yeast to elephants). But we could equally well define a population as all women between 35 and 54 years of age living in London, or all GP surgeries in Liverpool. More to come in Chapter 2.


This type of logic is very powerful for testing to see if our ideas are ‘true’. The logic is: if ‘a’ is true, then ‘b’ will be the outcome. If the evidence is robust (i.e. as good a measure as we can get, given the limitations of our measuring instruments) and shows a clear relationship, it should stand up to criticism. And as we shall see, it provides the basis for the statistical inferences based on the tests described in later chapters.

There is a problem, however. The example above about my dog is relatively simple and straightforward. We can define and measure what we mean by an ‘animal’, and we can define and measure what we mean by ‘death’. But suppose we want to understand the impact of vitamin A supplementation on risk of morbidity and blindness from measles in children aged 1 to 5 years living in areas where vitamin A deficiency is endemic. Defining and measuring variables in complex biological systems is much harder (particularly in the field of nutrition and dietetics). It becomes harder to argue that what is true generally for the population will necessarily be true for each individual within the population. This is for two reasons. First, we cannot measure all the factors that link ‘a’ (vitamin A deficiency) and ‘b’ (morbidity and blindness from measles) with perfect accuracy. Second, individuals within a population will vary from one to the next in terms of their susceptibility to infection (for a wide range of reasons) and the consequent impact of vitamin A supplementation.

For deductive logic to operate, we have to assume that the group of subjects in whom we are conducting our study is representative of the population in which we are interested. (The group is usually referred to as a ‘sample’. Ideas about populations and samples are discussed in detail in Chapter 2.) If the group is representative, then we may reasonably assume that what is true in the population should be evident in the group we are studying. There are caveats to this around the size of the sample and the accuracy of our measurements, which will be covered in Chapters 2 and 12.
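The importance of a representative sample can be illustrated with a small simulation. What follows is a minimal sketch, not a method taken from this book: the population, its values, the sample size of 200, and the way the unrepresentative sample is drawn are all invented for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements (arbitrary units).
population = [random.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Representative sample: every member has the same chance of selection.
random_sample = random.sample(population, 200)

# Unrepresentative sample: drawn only from the upper half of the population,
# as might happen if only one kind of subject takes part.
upper_half = sorted(population)[5_000:]
biased_sample = random.sample(upper_half, 200)

# How far does each sample mean fall from the population mean?
random_error = abs(statistics.mean(random_sample) - population_mean)
biased_error = abs(statistics.mean(biased_sample) - population_mean)

print(f"population mean:    {population_mean:.1f}")
print(f"random sample mean: {statistics.mean(random_sample):.1f}")
print(f"biased sample mean: {statistics.mean(biased_sample):.1f}")
```

In this invented setting, the randomly drawn sample should give a mean close to that of the population, while the sample drawn from only part of the population should not. The sketch says nothing about how large a sample needs to be; that is one of the caveats mentioned above.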

Examples of Research Designs that Depend on Deductive Logic

Intervention trials are designed to prove that phenomena which are true in the population are also true in a representative sample drawn from that population.

Compare the relative impact of two iron preparations in the treatment of anaemia.

This may sound similar to the statement that was made under ‘Experimental Studies’. The two statements are different, however. In the intervention trial, we would try to ensure that the two groups in which we were comparing treatments were similar to each other and similar to the population from which they were drawn. In the experimental study, we chose a group of subjects, measured the exposure and outcome and other characteristics of the group, and assumed that if the outcome was true in that group, it would be true in the population with similar characteristics. These differences in approach and logic are subtle but important.

In practice, the aim of most studies is to find evidence which is generalizable to the population (or a clearly defined subgroup). The relationship between the type of logic used and the generalizability of the findings is discussed below. The limitations of inductive logic and their resolution are discussed lucidly by Popper [1, pp. 54–55].

1.3 EXPERIMENTATION AND RESEARCH DESIGN

Here is a quote from ‘The Design of Experiments’ by Sir Ronald Fisher [2]:

Men⁵ have always been capable of some mental processes of the kind we call ‘learning by experience’. Doubtless this experience was often a very imperfect basis, and the reasoning processes used in interpreting it were very insecure; but there must have been in these processes a sort of embryology of knowledge, by which new knowledge was gradually produced.

Experimental observations are only experience carefully planned in advance, and designed to form a secure basis of new knowledge; that is, they are systematically related to the body of knowledge already acquired, and the results are deliberately observed, and put on record accurately.

⁵ I presume he means men and women. And children. Or ‘humans’. Use of the term ‘men’ was common to his time of writing. Don’t take offence. The point he is making is important.

Research usually has one of two main purposes: either to describe in as accurate and reliable a way as possible what one observes, or to test an idea about what one believes to be true. To undertake research, be it quantitative or qualitative, a systematic process of investigation is needed. This involves formulating clear ideas about the nature of the problem to be investigated, designing methods for collecting information, analyzing the data in an appropriate way, and interpreting the results.

1.3.1 A Children’s Story

One of my favourite children’s stories is The Phantom Tollbooth by Norton Juster [3], in which he brilliantly summarizes the purpose of research and statistics. This may seem unlikely, but read on.

The book tells the story of Milo, a young boy living in an apartment in New York. He is endlessly bored and someone for whom everything is a waste of time. He arrives home after school one day to find a large package sitting in the middle of the living room. (I don’t know where his parents are.) He unpacks and assembles a tollbooth (he lives in America, don’t forget), gets in his electric car, deposits his coin, and drives through the tollbooth into a land of fanciful characters and logical challenges.

The story is this. The princesses Rhyme and Reason have been banished, and it is his job to rescue them and restore prosperity to the Kingdom of Wisdom. He drives from Dictionopolis (where only words are important) to Digitopolis (where – you guessed it – only numbers are important) to reach the Castle in the Air, where the princesses are held captive. He shares his journey with two companions: a Watchdog named Tock who is very vigilant about paying attention to everything (provided he keeps himself wound up); and the Humbug, ‘a large beetle‐like insect dressed in a lavish coat, striped trousers, checked waistcoat, spats and a derby hat’, whose favourite word is BALDERDASH – the great sceptic.

On the way to Digitopolis, the road divides into three, with an enormous sign pointing in all three directions stating clearly:

DIGITOPOLIS
5 miles
1 600 rods
8 800 yards
26 400 ft
316 800 in
633 600 half inches

They argue about which road to take. The Humbug thinks miles are shorter, Milo thinks half‐inches are quicker, and Tock is convinced that whichever road they take it will make a difference. Suddenly, from behind the sign appears an odd creature, the Dodecahedron, with a different face for each emotion for, as he says, ‘here in Digitopolis everything is quite precise’. Milo asks the Dodecahedron if he can help them decide which road to take, and the Dodecahedron promptly sets them a hideous problem, the type that makes maths pupils have nightmares and makes grown men weep:

If a small car carrying three people at thirty miles an hour for ten minutes along a road five miles long at 11.35 in the morning starts at the same time as three people who have been travelling in a little automobile at twenty miles an hour for fifteen minutes on another road and exactly twice as long as one half the distance of the other, while a dog, a bug, and a boy travel an equal distance in the same time or the same distance in an equal time along a third road in mid‐October, then which one arrives first and which is the best way to go?

They each struggle to solve the problem.

‘I’m not very good at problems’, admitted Milo.

‘What a shame’, sighed the Dodecahedron. ‘They’re so very useful. Why, did you know that if a beaver two feet long with a tail a foot and a half long can build a dam twelve feet high and six feet wide in two days, all you would need to build the Kariba Dam is a beaver sixty‐eight feet long with a fifty‐one foot tail?’

‘Where would you find a beaver as big as that?’ grumbled the Humbug as his pencil snapped.

‘I’m sure I don’t know’, he replied, ‘but if you did, you’d certainly know what to do with him’.

‘That’s absurd’, objected Milo, whose head was spinning from all the numbers and questions.

‘That may be true’, he acknowledged, ‘but it’s completely accurate, and as long as the answer is right, who cares if the question is wrong? If you want sense, you’ll have to make it yourself’.

‘All three roads arrive at the same place at the same time’, interrupted Tock, who had patiently been doing the first problem.

‘Correct!’ shouted the Dodecahedron. ‘Now you can see how important problems are. If you hadn’t done this one properly, you might have gone the wrong way’.

‘But if all the roads arrive at the same place at the same time, then aren’t they all the right way?’ asked Milo.

‘Certainly not’, he shouted, glaring from his most upset face. ‘They’re all the wrong way. Just because you have a choice, it doesn’t mean that any of them has to be right’.

That is research design and statistics in a nutshell. Let me elaborate.
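As an aside, the arithmetic behind the signpost and Tock’s answer can be checked directly. The sketch below is my own illustration rather than anything from the story: the dictionary names and the helper function `miles_travelled` are invented, and the conversion factors are the standard imperial ones (320 rods, 1760 yards, 5280 feet to the mile).

```python
from fractions import Fraction

# Number of each unit in one mile (standard imperial conversions).
UNITS_PER_MILE = {
    "miles": 1,
    "rods": 320,
    "yards": 1_760,
    "ft": 5_280,
    "in": 63_360,
    "half inches": 126_720,
}

# Distances as written on the signpost.
SIGN = {
    "miles": 5,
    "rods": 1_600,
    "yards": 8_800,
    "ft": 26_400,
    "in": 316_800,
    "half inches": 633_600,
}

# Convert every entry back to miles; exact fractions avoid rounding.
in_miles = {unit: Fraction(value, UNITS_PER_MILE[unit])
            for unit, value in SIGN.items()}


def miles_travelled(speed_mph, minutes):
    """Distance covered at a constant speed, in miles."""
    return Fraction(speed_mph) * Fraction(minutes, 60)


first_car = miles_travelled(30, 10)   # thirty miles an hour for ten minutes
second_car = miles_travelled(20, 15)  # twenty miles an hour for fifteen minutes

print(in_miles)
print(first_car, second_car)
```

Every entry on the sign works out at exactly 5 miles, and each car has covered exactly 5 miles, which is consistent with Tock’s conclusion. The numbers are completely accurate; whether the question was worth asking is another matter.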

1.4 THE BASICS OF RESEARCH DESIGN

According to the Dodecahedron, the basic elements of research are as shown in Box 1.1:

He may be a little confused, but trust me, all the elements are there.

1.4.1 Developing the Hypothesis

The Dodecahedron: ‘As long as the answer is right, who cares if the question is wrong?’

The Dodecahedron has clearly lost the plot here. Formulating the question correctly is the key starting point. If the question is wrong, no amount of experimentation or measuring will provide you with an answer.

The purpose of most research is to try and provide evidence in support of a general statement of what one believes to be true. The first step in this process is to establish a hypothesis. A hypothesis is a clear statement of what one believes to be true. The way in which the hypothesis is stated will also have an impact on which measurements are needed. The formulation of a clear hypothesis is the critical first step in the development of research. Even if we can’t make measurements that reflect the truth, the hypothesis should always be a statement of what you believe to be true. Coping with the difference between what the hypothesis says is true and what we can measure is at the heart of research design and statistics.

BOX 1.1

The four key elements of research

• Hypothesis

• Design

• Statistics

• Interpretation

We can test a hypothesis using both inductive and deductive logic. Inductive logic says that if we can demonstrate that something is true in a particular individual or group, we might argue that it is true generally in the population from which the individual or group was drawn. The evidence will always be relatively weak, however, and the truth of the hypothesis hard to test. Because we started with the individual or group, rather than the population, we are less certain that the person or group that we studied is representative of the population with similar characteristics. Generalizability remains an issue.

Deductive logic requires us to draw a sample from a defined population. It argues that if the sample in which we carry out our measurements can be shown to be representative of the population, then we can generalize our findings from our sample to the population as a whole. This is a much more powerful model for testing hypotheses.

As we shall see, these distinctions become important when we consider the generalizability of our findings and how we go about testing our hypothesis.

1.4.2 Developing the ‘Null’ Hypothesis

In thinking about how to establish the ‘truth’⁶ of a hypothesis, Ronald Fisher considered a series of statements:

No amount of experimentation can ‘prove’ an inexact hypothesis.

The first task is to get the question right! Formulating a hypothesis takes time. It needs to be a clear, concise statement of what we believe to be true,⁷ with no ambiguity. If our aim is to evaluate the effect of a new diet on reducing cholesterol levels in serum, we need to say specifically that the new diet will ‘lower’ cholesterol, not simply that it will ‘affect’ or ‘change’ it. If we are comparing growth in two groups of children living in different circumstances, we need to say in which group we think growth will be better, not simply that it will be ‘different’ between the two groups.
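The difference between a vague and a specific hypothesis can be made concrete with a few invented numbers. This is a minimal sketch only: the serum cholesterol values are made up, and the variable names are mine. It shows which observations would count in favour of each wording of the hypothesis; it does not assess how strong the evidence is, which is the business of the statistical tests in later chapters.

```python
from statistics import mean

# Hypothetical serum cholesterol values (mmol/L) for six subjects,
# measured before and after following a new diet. Invented data.
before = [6.2, 5.9, 6.8, 6.1, 7.0, 6.4]
after = [5.8, 5.7, 6.9, 5.6, 6.3, 6.0]

# Change for each subject: negative values mean cholesterol went down.
changes = [a - b for a, b in zip(after, before)]
mean_change = mean(changes)

# Vague hypothesis: the diet will 'change' cholesterol.
# Any non-zero mean change, up or down, would count in its favour.
consistent_with_change = mean_change != 0

# Specific hypothesis: the diet will 'lower' cholesterol.
# Only a negative mean change counts in its favour.
consistent_with_lower = mean_change < 0

print(f"mean change: {mean_change:.2f} mmol/L")
print("consistent with 'change':", consistent_with_change)
print("consistent with 'lower': ", consistent_with_lower)
```

With the vague wording, a rise in cholesterol would have counted as support just as readily as a fall. The specific wording rules that out before any data are collected, and it tells us what we need to measure and in which direction we expect it to move.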

The hypothesis that we formulate will determine what we choose to measure. If we take the time to discuss the formulation of our hypothesis with colleagues, we are more likely to develop a robust hypothesis and to choose the appropriate measurements. Failure to get the hypothesis right may result in the wrong measurements being taken, in which case all your efforts will be wasted. For example, if the hypothesis relates to the effect of diet on serum cholesterol, there may be a particular

TIP

Your first attempts at formulating hypotheses may not be very good. Always discuss your ideas with fellow students or researchers, or your tutor, or your friendly neighbourhood statistician. Then be prepared to make changes until your hypothesis is a clear statement of what you believe to be true. It takes practice – and don’t think you should be able to do it on your own, or get it right first time. The best research is collaborative, and developing a clear hypothesis is a group activity.

⁶ You will see that I keep putting the word ‘truth’ in single quotes. This is because although we want to test whether or not our hypothesis is true – it is, after all, a statement of what we believe to be true – we will never be able to collect measures that are wholly accurate. Hence, the truth is illusory, not an absolute. This is what the single quotes are intended to convey.

⁷ ‘The term “belief” is taken to cover our critical acceptance of scientific theories – a tentative acceptance combined with an eagerness to revise the theory if we succeed in designing a test which it cannot pass’ [1, p. 51]. It is important to get used to the idea that any ‘truth’ which we hope to observe is likely to be superseded by a more convincing ‘truth’ based on a more robust experiment or set of observations using better measuring instruments, and which takes into account some important details which we did not observe the first time.