Alan Aragon’s Research Review – May 2014 [Back to Contents] Page 1

Copyright © May 1st, 2014 by Alan Aragon
Home: www.alanaragon.com/researchreview
Correspondence: [email protected]

2 Optimizing activity-based fat loss for aesthetic athletes: Interval or steady-state training? By Joel Minden, PhD, CSCS

5 How to manipulate research. By James Heathers, PhD(c)

12 Changes in exercises are more effective than in loading schemes to improve muscle strength [reviewed by Brad Schoenfeld, PhD, CSCS, CSPS, FNSCA]. Fonseca RM, Roschel H, Tricoli V, de Souza EO, Wilson JM, Laurentino GC, Aihara AY, de Souza Leão AR, Ugrinowitsch C. J Strength Cond Res. 2014 May 14. [Epub ahead of print] [PubMed]

14 The effects of consuming a high protein diet (4.4 g/kg/d) on body composition in resistance-trained individuals. Antonio J, Peacock CA, Ellerbroek A, Fromhoff B, Silver T. J Int Soc Sports Nutr. 2014 May 12;11:19. [PubMed]

16 An amino acid-electrolyte beverage may increase cellular rehydration relative to carbohydrate-electrolyte and flavored water beverages. Tai CY, Joy JM, Falcone PH, Carson LR, Mosman MM, Straight JL, Oury SL, Mendez C, Loveridge NJ, Kim MP, Moon JR. Nutr J 2014, 13:47 doi:10.1186/1475-2891-13-47 [PubMed]

17 Calorie shifting diet versus calorie restriction diet: a comparative clinical trial study. Davoodi SH, Ajami M, Ayatollahi SA, Dowlatshahi K, Javedan G, Pazoki-Toroudi HR. Int J Prev Med. 2014 Apr;5(4):447-56. [PubMed]

19 Processed foods - are they really that bad for you? By Chris & Eric Martinez

22 How can you get through to people who *think* they understand the science behind a certain topic? By Alan Aragon

4 - May - 2014

    Optimizing activity-based fat loss for aesthetic athletes: Interval or steady-state training?

    By Joel Minden

    __________________________________________________

    For aesthetic athletes, such as dancers, gymnasts, and

    bodybuilders, managing body mass and composition is just as

    important as sport-specific training. At a selected body weight,

    fat mass should be minimized, and dietary strategies, such as

    caloric restriction or macronutrient manipulation, are frequently

    used to achieve this. For those who prefer to emphasize activity-

    based methods to reduce body fat, the optimal strategy is

    unclear. Although increasing activity to create a negative energy

    balance should be the primary goal, there is considerable debate

    concerning the differential effectiveness of interval versus

    steady-state training. Perhaps the lack of consensus is due to the

    fact that empirical research in this area is compromised by

    methodological limitations and an inability to control, either

    physically or statistically, for the numerous contextual variables

    that cloud interpretability.

    For example, research on acute metabolic responses to exercise

    is sometimes criticized for the artificiality of the experimental

    setting, limited time course, and uncertain relation of measured

    variables (i.e., substrate utilization, gas exchange, plasma, and

    biopsy data) to long-term changes in body composition.

    Similarly, research on chronic responses to exercise has its own

    set of limitations: individual differences in protocol compliance,

    nonexercise activity, and dietary behavior; unknown accuracy of

    subjects' record keeping; and questionable reliability of

    instruments used to track changes in body composition. Finally,

    both acute and chronic outcome data should be interpreted

    within the context of participant variables, including

    demographic characteristics and fitness levels, and dimensions

    of training protocols, such as modality, intensity, duration, and

    frequency of exercise. In light of these factors, it's no surprise

    the efficacy debate continues.

    Despite the many challenges to interpretability, consistencies in

    the literature can be identified, and tentative conclusions can be

    made by directly comparing the effects of multi-week interval

    and steady-state training programs on body mass and

    composition. Given the enthusiasm for interval training in both

    scientific and popular media, it's somewhat surprising that these

    direct comparisons are limited. In the following section, I'll

    present the results of these studies. For ease of interpretation,

    data on strength training or diet-only conditions will not be

    reported, nor will metabolic or cardiovascular outcome data.

    Studies that compared interval training to no-exercise controls or

    those that combined interval with steady-state training will also

    be excluded. In all studies, interval training sessions, unless

    otherwise noted, included 4 to 15 work intervals performed for

    15 to 240 seconds, with each repetition followed by low- to

    moderate-intensity periods of active recovery for up to 4

    minutes.

    The Research

    In perhaps the earliest direct comparison, Thomas et al1 assigned

    recreationally active male and female college students to steady-

    state or interval running programs matched for energy

    expenditure, 500 kcal per session. Exercise bouts were

    performed 3 times per week for 12 weeks. After statistically

    controlling for pre-intervention differences in body composition

    (assessed through hydrostatic weighing), the data revealed that

    subjects in both conditions experienced a reduction in body fat

    percentage. There were, however, no differences between the

    exercise conditions.

    Following the emergence of research by Tremblay et al,2 steady-

    state endurance training as a fat loss strategy was dismissed by

    many as inferior to intense but brief interval training. In this

    classic study, adults with no previous exercise history completed

    either a 20-week endurance training program or a 5-week

    endurance training program followed by 15 weeks of interval

    training bouts that varied in duration and intensity.

    Heralded as a breakthrough study two decades ago, the results

    appeared to demonstrate a paradoxical advantage of brief

    interval work for fat loss despite an energy cost well below that

    of endurance training. Although frequently noted for its finding

    that subcutaneous fat loss was ninefold greater for those in the

    interval condition, this estimate was made after statistically

    correcting for the energy cost of each type of exercise. When

    actual fat loss between the two conditions was compared, the

    difference was nonsignificant. Other aspects of this heavily cited

    study make firm conclusions about fat loss differences by

    protocol difficult: the undetermined reliability of skinfold data,

    the inclusion of an endurance training component (25 30-minute

    sessions) to the interval training program, and no control for

    dietary behavior.

    Years after the release of this promising study, additional

    evaluations of interval training began to emerge, the bulk of

    which failed to demonstrate any reliable advantage of interval

    training. For example, in Tjønna and colleagues' 16-week study

    of metabolic syndrome patients,3 subjects exercised on inclined

    treadmills, and work volume for the interval and endurance

    conditions was equivalent. Both groups experienced reductions

    in weight, BMI, and waist circumference, but no differences

    between the groups were observed.

    Trapp et al4 compared fat loss outcomes of a 20-minute interval

    program and a 40-minute steady-state program, both performed

    by young adult women on cycle ergometers 3 times per week for

    15 weeks. Despite the difference in duration of exercise bouts,

    estimated energy expenditure over the study period for the two

    groups was equivalent. This was achieved by having subjects in

    the interval condition perform 60 8-second intervals, followed

    by 12-second recovery periods, in each session. The interval

    training group, but not the steady-state group, experienced a

    reduction in DEXA-measured fat mass (~2.5 kg) at the

    completion of the study. This apparent intervention effect must,

    however, be interpreted with caution due to pre-existing group

    differences. At the beginning of the study, the mean fat mass for

    the interval group was 3.8 kg greater than that of the steady-state

    group, and follow-up analyses revealed that approximately of

    the variance in fat loss was accounted for by level of body fat at

    the beginning of the study.

    Schjerve et al5 compared fat loss responses in obese adults to 12

    weeks of interval or steady-state treadmill training performed 3

    times per week. Conditions were equalized for energy

    expenditure. Both groups experienced similarly small but

    significant reductions in weight, BMI, and body fat percentage.

    There were no differences between the conditions in these

    outcomes.

    Wallman et al6 examined the effects of 8 weeks of interval or

    steady-state training performed by overweight and obese men

    and women 4 times per week on a cycle ergometer. Energy

    expenditure between the two conditions was equivalent. The

    results yielded nonsignificant reductions in weight or fat mass

    for both conditions.

    Perhaps the greatest support for fat loss benefits of interval

    training comes from MacPherson et al.7 In this study,

    recreationally athletic college-aged men and women performed 3

    weekly sessions of sprint interval training or steady-state

    running for 6 weeks. Both groups experienced significant

    reductions in body fat percentage and fat mass, as well as small

    increases in lean mass. Although the interval group experienced

    a larger total decrease in fat mass (1.7 kg vs. 0.8 kg), the

    difference between the conditions was nonsignificant. In contrast

    to the methods used in the aforementioned studies, MacPherson

    et al. did not attempt to equalize work or energy expenditure,

    which makes the difference in total exercise time across the

    study period (13.5 and 0.75 hours for the steady-state and

    interval conditions, respectively) noteworthy. Nevertheless,

    subjects in the interval condition were encouraged to engage in

    active rest on the treadmill for 4 minutes following each of

    their maximal effort sprints, which resulted in a total activity

    time commitment of 6.75 hours.
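
The time totals above are easier to compare per session. The following is a rough sketch that assumes every subject completed all 18 sessions (3 per week for 6 weeks) and that time was spread evenly across sessions; neither assumption comes from the study itself, and the function name is only illustrative.

```python
# Per-session time implied by the study-period totals quoted above.
# Assumption: full compliance, i.e. 3 sessions per week for 6 weeks.
SESSIONS = 3 * 6


def minutes_per_session(total_hours: float) -> float:
    """Convert a study-period total (hours) into average minutes per session."""
    return total_hours * 60 / SESSIONS


totals_hours = {
    "steady-state running": 13.5,
    "sprint work only": 0.75,
    "sprints plus active rest": 6.75,
}

for label, hours in totals_hours.items():
    print(f"{label}: {minutes_per_session(hours):.1f} min per session")
```

On these assumptions, the interval group's total time commitment works out to half that of the steady-state group, not the 18-fold difference suggested by sprint time alone.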

    Recently, Keating et al8 compared fat loss outcomes for

    overweight adults randomly assigned to either an interval or

    steady-state cycle ergometer program. Both groups performed

    exercise 3 days per week for 12 weeks. There was a significant

    decrease in DEXA-measured body fat percentage for the steady-

    state (-2.6%) but not the interval (-0.3%) group. The absence of

    change for the interval group is somewhat unexpected, given

    that the aforementioned studies found equivalent effects for the

    two types of training. The authors indicated that this result may

    be partially explained by the use of interval training bouts that,

    to protect this clinical population, were less intense than those

    used in previous studies. However, a comparison of protocols

    shows the intensity of interval training for the Keating et al.

    subjects (~120% of VO2peak and ~90% of maximal heart rate)

    was consistent with those used in other studies (e.g., Schjerve,

    Tjønna, Wallman and their colleagues) of overweight or obese

    subjects.

    An alternative explanation is that unmeasured subject variables

    contribute to responsivity to exercise. Graphs from the Keating

    et al. study show considerable within-group variability for both

    exercise groups in body fat percentage change. In fact, some

    subjects in both conditions actually gained body fat. This

    highlights the importance of going beyond the aggregate data to

    search for individual differences that distinguish responders

    from non-responders.

    Conclusion

    Collectively, the data reveal that interval training offers no

    reliable advantage over steady-state endurance training for fat

    loss. In addition, the effectiveness of interval training is more

    likely to be demonstrated when work or energy expenditure is

    matched to that of steady-state protocols. This suggests that, in

    spite of any acute metabolic or cardiovascular benefits of

    interval training, intense but brief exercise is insufficient for

    stimulating meaningful fat loss. This was indirectly highlighted

    in Boutcher's recent review of research in this area.9 Of the 6

    interval training studies in which fat loss outcomes were

    identified, the two cited (Boudou et al,10 Mourier et al11) for

    demonstrating the strongest effects included 2 days per week of

    45-minute steady-state training bouts in a program with only one

    interval-training day each week.

    Regarding application, assuming energy intake is regulated,

    activity-based fat loss programs should prioritize energy cost of

    exercise and activity preference. For athletes already involved in

    frequent and intense sport-specific training, activities that have a

    negative impact on quality of practice and competitive

    performance should be avoided. If interval training results in

    poor program compliance, fatigue, overeating, and reduced daily

    activity, alternative strategies should be explored. For aesthetic

    athletes, a realistic fat-loss strategy might involve small dietary

    changes combined with low- to moderate-intensity exercise,

    such as uphill walking at a comfortable pace, performed for an

    extended duration. In sum, although intense interval training has

    value to the athlete, it may not be the best option for fat loss. In

    the larger context of athletic training, a moderate, comfortable

    approach offers the greatest chance for success.

    ____________________________________________________

    Joel Minden, Ph.D., CSCS, is a lecturer in the psychology and

    kinesiology departments at California State University, Chico.

    He writes about strength and conditioning, nutrition, sport

    psychology, and dance for his website www.joelminden.com.

    ____________________________________________________

    References

    1. Thomas, T. R., Adeniran, S. B., & Etheridge, G. L. (1984). Effects of different running programs on VO2 max, percent fat,

    and plasma lipids. Canadian Journal of Applied Sport Sciences,

    9(2), 55-62. [PubMed]

    2. Tremblay, A., Simoneau, J. A., & Bouchard, C. (1994). Impact of exercise intensity on body fatness and skeletal muscle

    metabolism. Metabolism, 43(7), 814-818. [PubMed]

    3. Tjønna, A. E., Lee, S. J., Rognmo, Ø., Stølen, T. O., Bye, A.,

    Haram, P. M., Loennechen, J. P., Al-Share, Q. Y., Skogvoll, E.,

    Slørdahl, S. A., Kemi, O. J., Najjar, S. M., & Wisløff, U.

    (2008). Aerobic interval training versus continuous moderate

    exercise as a treatment for the metabolic syndrome: a pilot

    study. Circulation, 118(4), 346-354. [PubMed]

    4. Trapp, E. G., Chisholm, D. J., Freund, J., & Boutcher, S. H.

    (2008). The effects of high-intensity intermittent exercise

    training on fat loss and fasting insulin levels of young women.

    International Journal of Obesity, 32(4), 684-691. [PubMed]

    5. Schjerve, I. E., Tyldum, G. A., Tjønna, A. E., Stølen, T.,

    Loennechen, J. P., Hansen, H. E., Haram, P. M., Heinrich, G.,

    Bye, A., Najjar, S. M., Smith, G. L., Slørdahl, S. A., Kemi,

    O. J., & Wisløff, U. (2008). Both aerobic endurance and strength

    training programmes improve cardiovascular health in obese

    adults. Clinical Science, 115(9), 283-293. [PubMed]

    6. Wallman, K., Plant, L. A., Rakimov, B., & Maiorana, A. J.

    (2009). The effects of two modes of exercise on aerobic fitness

    and fat mass in an overweight population. Research in Sports

    Medicine, 17(3), 156-170. [PubMed]

    7. Macpherson, R. E., Hazell, T. J., Olver, T. D., Paterson, D. H.,

    & Lemon, P. W. (2011). Run sprint interval training improves

    aerobic performance but not maximal cardiac output. Medicine

    and Science in Sports & Exercise, 43(1), 115-22. [PubMed]

    8. Keating, S. E., Machan, E. A., O'Connor, H. T., Gerofi, J. A., Sainsbury, A., Caterson, I. D., & Johnson, N. A. (2014).

    Continuous exercise but not high intensity interval training

    improves fat distribution in overweight adults. Journal of

    Obesity, 2014. [Journal of Obesity]

    9. Boutcher, S. H. (2010). High-intensity intermittent exercise and fat loss. Journal of Obesity, 2011. [Journal of Obesity]

    10. Boudou, P., Sobngwi, E., Mauvais-Jarvis, F., Vexiau, P., & Gautier, J. F. (2003). Absence of exercise-induced variations in

    adiponectin levels despite decreased abdominal adiposity and

    improved insulin sensitivity in type 2 diabetic men. European

    Journal of Endocrinology, 149(5), 421-424. [PubMed]

    11. Mourier, A., Gautier, J. F., De Kerviler, E., Bigard, A. X., Villette, J. M., Garnier, J. P., Duvallet, A., Guezennec, C. Y., &

    Cathelineau, G. (1997). Mobilization of visceral adipose tissue

    related to the improvement in insulin sensitivity in response to

    physical training in NIDDM: effects of branched-chain amino

    acid supplements. Diabetes Care, 20(3), 385-391. [PubMed]

    How to manipulate research.

    By James Heathers

    _________________________________________________________________

    Most of the audience for this article probably pays attention to

    the broader scientific literature in exercise and musculoskeletal

    physiology, strength and conditioning, nutrition, dietetics and

    sports medicine. From this, you take the available evidence and

    you slot it somewhere into an available framework of what's

    already known. This, everyone is familiar with.

    What people generally don't know is how to cheat.

    Yes, cheat. Let me outline why you would: the way academic

    funding presently works is that, in general, output is rewarded over insight: two papers are better than one. So, the more you write, the better off you're going to be. There are many problems with this, and the environment it creates. One of those problems

    is that it becomes very tempting to 'massage' results from

    different research projects in order to achieve reportable

    outcomes. I should mention here that the majority of the time

    this isn't actually dishonesty; it's the fact that researchers have convinced themselves that they've asked a good question, and that if they just change a few key variables with the analysis and

    reporting, suddenly they'll have the result that they know is there. And when that result turns up, they tell themselves it was because the initial

    analysis was wrong.

    Unfortunately, science doesn't work like that. Much of my academic work is in dealing with problems surrounding this

    issue; I am a methodologist. This means I concentrate heavily on

    how research should be conducted: essentially, research into research. Methodologists develop new techniques in analysis,

    and verify that old ones work in the manner we hope they do.

    Think of the production of knowledge via academic outcomes as

    a game of poker. Research, like poker, is an expensive,

    stochastic process full of frustration, late nights, and alcohol, but, also like poker, eventually if you're good enough, the balance of probabilities favours you winning. Insight, like money,

    is hard won.

    This process comes to a scrunching halt when someone starts to

    obscure the honest truth of what happened in a study, because

    there is no skill or reason that can be applied. You literally can't win, because the odds of something being supportable or

    repeatable are being manipulated.

    However, just like poker, there are 'tells': certain signs which allow you to detect another process at work. This is a partial list

    of those tells, illustrated with examples drawn liberally from the

    medical and social sciences. I've tried to use exercise science and nutrition studies where convenient, but the principles are the

    same regardless; often I've simply chosen the most convenient examples that have come to mind. Please bear in mind I don't think these papers are guilty of any kind of conscious

    dishonesty, they are merely convenient examples of the

    principles involved.

    This list is not comprehensive and is in no particular order: some of them work over time across different papers, some of

    them are specific to individual papers. These errors are both

    common and uncommon, both serious and trivial. They have

    various degrees of culpability (likely intent to deceive),

    significance (the ability to influence the outcome of the study

    overall), and detectability (how easy it is to spot from the article

    text). All have the potential to be dishonest.

    1. Altered endpoints, timepoints or measurement criteria

    Murphy et al1 investigated the effect of beetroot consumption on

    running performance due to their nitrate content. n=11 received

    a supplement of either beetroot (standardised to contain 500mg

    nitrates) or cranberry puree, in a double-blind cross-over

    fashion. Their heart rate, perceived exertion and time to

    completion of a 5km run were recorded. In the first mile,

    participants rated their perceived exertion significantly higher in

    the cranberry condition. In the final 1.8km, participants were

    significantly faster in the beetroot condition.

    Why was a 5km broken into miles? i.e. 0-1.6km, 1.6-3.2km, 3.2-5km.

    There are an infinite number of ways to divide a time interval

    into pieces. This analysis could have been performed as single

    kilometre intervals, or using a simple statistical model which

    predicts the overall effect of time through the race on exertion,

    and the overall difference between the groups by time. There is

    no reason to use a unit of measurement invented by the ancient

    Romans and formally defined in 1593.

    Researchers are well aware that trying different assortments of

    time intervals can uncover differences between timepoints due to random variation. Say we split the data into 100m intervals: there are now 50 separate comparisons over the 5km where we

    can analyse the difference between our Beetroot and Cranberry

    groups. We are essentially making so many comparisons that

    one will appear significant simply due to the noise present in the

    measurement.

    (Of course, there are methods for statistically controlling

    multiple comparisons2 but researchers don't report all the

    comparisons they used... in this case, the reader doesn't know

    that these multiple comparisons need to be controlled.)
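
A quick calculation shows how fast the risk of a spurious "significant" split grows. This is a minimal sketch that assumes the comparisons are independent and each is tested at a threshold of 0.05; adjacent race segments are certainly correlated in practice, so treat the numbers as illustrative. The function names are hypothetical, and the Bonferroni correction is shown as one common control method, not necessarily the one any cited paper used.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across n independent
    tests when there is no real effect in any of them."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


# 3 = the mile-based splits; 5 = single-kilometre splits; 50 = 100m splits.
for n in (1, 3, 5, 50):
    rate = familywise_error_rate(n)
    cutoff = bonferroni_threshold(n)
    print(f"{n:>2} comparisons: P(at least one false positive) = {rate:.3f}, "
          f"Bonferroni per-test threshold = {cutoff:.4f}")
```

Under these assumptions, 50 comparisons give roughly a 92% chance of at least one false positive, which is why the reader needs to know how many comparisons were actually run.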

    The other extreme is also a problem. Say we analyse the dataset

    only over the whole 5km, but beetroot consumption improved

    the finishing speed of the run. This would be a highly significant

    finding, as we know that in middle/long distance races there is

    already a pattern between laps or race phases (e.g., Tucker et

    al3).

    Culpability: low to medium

    Significance: medium

    Detectability: high

    2. Conveniently one-sided significance testing

    3. Methodological fiddles

    (These are not always associated, but they've been so neatly combined in a paper from a few years ago that I've put them together here.)

    Christian et al4 enrolled n=279 in a computer-support program

    for weight loss at an American public hospital. All participants

    were suffering from metabolic syndrome. Participants were

    given a full health and blood screening 12 months apart, and

    were assigned to either a computer-based tailored lifestyle

    intervention or a standard package of information on weight

    management. Participants were more likely to lose weight in the

    intervention vs. control group (-3.3lbs vs 0.33lbs, p=0.002).

    Participants who lost more than 10% BW had lower total

    cholesterol (-14.9 vs -3.9, p=0.05) which appears to be driven by

    the loss of LDL cholesterol (-14.0 vs. -4.1, p=0.04).

    Why was the outcome of the program determined by a group of people who lost 5% of bodyweight which included members of BOTH the intervention and the control group?

    This is the methodological fiddle: there were n=46 participants who lost more than 10% BW and n=11 of them were from the control group (and thus n=35 from the intervention). These were

    lumped together to create the impression that the program was

    effective. This is hardly the most honest conclusion when about

    a quarter of the people with significant weight loss were from the

    control group; it would be more accurate to say here that people with metabolic syndrome who lose weight improve their serum

    lipids regardless of how they do it. This is hardly evidence in favour of the intervention.

    The other fiddle is staring us in the face from the p-values above.

    Why was the above difference assessed with a one-sided t-test?

    As we're all probably aware here, the p-value is the calculated probability of getting the observed result if the null hypothesis is

    true. In this case, this is that the standard intervention and the

    computer-based intervention were identical. We accept that

    when a result is sufficiently unlikely to have occurred on this

    basis, the experimental hypothesis is true: in other words, that our intervention has actually intervened.

    One-tailed statistical tests assume that this process has a

    direction: that the effect will have a direction (i.e. A will be higher than B). This gives you twice the statistical flexibility you would otherwise have in a two-tailed test.
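
To see what that flexibility buys, here is a small sketch comparing the two p-values for the same test statistic. It uses a standard normal (z) approximation from the Python standard library instead of the t-distribution the paper's t-tests would use, and the value z = 1.75 is purely hypothetical, chosen to represent a result that "hasn't quite made it".

```python
from statistics import NormalDist


def p_values_from_z(z: float) -> tuple[float, float]:
    """Return (one_sided, two_sided) p-values for a z statistic.

    The one-sided value tests the directional hypothesis "effect > 0".
    Normal approximation; a t-test would use the t-distribution instead.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    one_sided = 1.0 - std_normal.cdf(z)
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return one_sided, two_sided


# Hypothetical test statistic in the predicted direction.
one_sided, two_sided = p_values_from_z(1.75)
print(f"one-sided p = {one_sided:.3f}, two-sided p = {two_sided:.3f}")
```

For an effect in the predicted direction, the one-sided p-value is exactly half the two-sided one, so the same data can land on either side of the 0.05 line depending on which test is reported.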

    There are a few situations where one-tailed tests are necessary.

    Firstly, when we have a strong directional hypothesis: good

    evidence that our intervention should be better than the control

    group. In this case, we do not: the researchers mention previous work with the same intervention being only somewhat effective

    in a diabetic sample. Secondly, when we are using very few t-tests

    to compare different values. In this case, we do not: the researchers have around twenty individual tests.

    However, these are not hard and fast rules, and researchers often

    have another rule of thumb which simply goes like this: one-tailed tests are what you use when you're trying to get something to achieve a criterion of significance when it hasn't quite made it. They have traditionally found refuge in questionable results, and

    as we've just discussed, they're being used here to assess the difference between 'did lose weight' and 'didn't lose weight' regardless of group. A classic fiddle, and one the reviewers

    really should have spotted.

    Conveniently one-sided tests.

    Culpability: medium to high

    Significance: low to medium

    Detectability: high

    Methodological fiddles

    Culpability: medium

    Significance: medium

    Detectability: medium to high

    4. Overly complicated or uninterpretable models

    Another rather impressive looking technique is to take individual

    measures which are quite complicated and roll them into a

    model far more complicated than the average reader can

    understand. Social scientists try this much more than exercise

    physiologists, in my experience. But it does occur.

    A recent paper5 studied the split times of 2 world-record

    marathon runs, most recently Patrick Makau's Berlin Marathon (2011) which was a scarcely believable 2 hours, 3 mins and 38

    seconds. It describes several different curve fits possible to these

    runs, combines headwind and gradient data with individual

    kilometre time splits, and tries to find an optimal model or

    pacing strategy.

    Towards the conclusion it states:

    "Oscillations at the micro-level overlay low-frequency, macro-level oscillations or modes indicating that an athlete's resulting pacing trace represents a potentially complex amalgam of numerous signalling processes emanating from the brain, each with their own activation frequency."

    Of course, concluding that the best ever marathon times are

    employing highly sub-optimal pacing strategies seems wildly

    implausible: because of the extraordinary amount of competition over such a long period of time, one might assume

    that either a) the best times ever were, in fact, fairly well paced

    by definition or b) that an optimal strategy doesn't exist due to individual differences that are impossible to predict (a stubbed

    toe, a very slightly tight hamstring, a bad night's sleep, a micro-change in gradient, and so on). An optimal pacing strategy can't be followed, of course, if it's highly impractical. That is essentially stating "If X was possible, then it would be better" in an environment where X can't practically be performed.

    Culpability: low

    Significance: low to medium

    Detectability: high

    5. Over-testing, a.k.a. random sifting

    I've thought long and hard about how to get you an example of this, and I'm not sure I can. Here's how random sifting works:

    We decide to measure the effect of a new training regime of volume squats on short-course track times.

    We assign 30 experienced middle-distance runners equally to three groups: no extra training, 1 extra

    training day of squats per fortnight, and 3 extra training

    days per fortnight.

    We take demographic variables to start with (age, gender, race), anthropometry (height, weight, BMI,

    body composition), bloods (c-reactive protein, cortisol)

    and training readiness (neurological assessment, heart

    rate variability).

    We take race variables (400m time, 3 times w. 5 mins rest between races), and 1500m time (with lap split

    times). Participants rate perceived effort and

    pain/soreness after each race. Then we run the program

    for 6 weeks, test all the above again (mid-line) and test

    again at 12 weeks. Naturally, we record the poundages

    moved in each session for the two training groups.

    Not the worst design ever, right? Comprehensive, detailed?

    Wrong. It's dreadful.

    This is the most unholy octopus of impossible interlocking

    variables you'll ever see. Any one of the above can be used to control for, or combine with, any other. Variables you add to a

    study are not additive: if I measure seven things at one

    timepoint, I don't have seven potential comparisons in the data. I have instead any combination of the presence or absence of

    those variables, using a cut-off that I define (or choose from the

    literature), or using the top or bottom standard deviation, or all

    the values over the mean (or the median) to define groups.
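
To put a rough number on that, here is a minimal sketch that counts how many different non-empty subsets of seven measured variables could be fed into an analysis. The variable names are placeholders loosely based on the hypothetical squat study, and the count deliberately ignores the extra choice of cut-off (mean, median, standard deviation), which would multiply it further.

```python
from itertools import combinations

# Hypothetical labels for seven things measured at one timepoint.
variables = ["age", "bmi", "body_fat", "crp", "cortisol", "hrv", "effort"]


def count_variable_subsets(names):
    """Count every non-empty subset of the measured variables."""
    return sum(
        1
        for size in range(1, len(names) + 1)
        for _ in combinations(names, size)
    )


n_subsets = count_variable_subsets(variables)
print(n_subsets)  # 2**7 - 1 = 127
```

Seven measurements already give 127 ways to pick which ones go into a model, before any decision about how to split or group participants.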

    With full access to the above information in my hypothetical study, there are so many ways to combine the outcomes that the answers you find border on meaningless unless the results are statistically very strong. Make no mistake: if I had the above dataset, I am 100%

    entirely confident that I could produce a set of statistical

    analyses which conclusively showed that our squat intervention was effective. Even if our squat intervention did literally nothing

    or even made performance worse.
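
A back-of-envelope sketch of why this works, under the simplifying assumption that every test is independent (real analyses on one dataset are correlated, so treat the figures as indicative only). It computes the chance of at least one false positive across many tests at the usual 0.05 threshold, plus the Bonferroni-corrected threshold from reference 2. The function names are mine, for illustration.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall false-positive rate near alpha."""
    return alpha / n_tests


for n in (1, 10, 100):
    rate = familywise_error_rate(n)
    print(f"{n} tests: P(at least one false positive) = {rate:.3f}, "
          f"Bonferroni threshold = {bonferroni_alpha(n):.4f}")
```

With 100 tests on an intervention that does nothing, at least one "significant" result is close to guaranteed.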

    The only trick is to hide all the analyses that didn't work, then write up the one analysis which worked by pure chance as being

    predicted by specific research questions that we started with.

    This is formally called post-hoc reasoning and is very hard to detect. After you test hundreds or thousands of pathways

    through the above variables and find that, say, any squat intervention (1 or 3 sessions per fortnight) is effective on split

    times in 1500m but not total times, and reduces perceived effort

    but only in men, you then come up with a reason which specifically addresses why you might find this (and you choose

    past literature to reference accordingly).

    The behavioural economist and statistical guru Uri Simonsohn

    has a now-classic paper which conclusively proves that listening to the song "When I'm Sixty-Four" actually makes you younger.6 Obviously, this is a crazy conclusion because a song can't modify your age, but it is borne out by the analysis that he conducted simply by hiding all the analyses which didn't work.

    He also has a great statement that he encourages reviewers to

    send to every paper they peer-review which goes like this:

    "I request that the authors add a statement to the paper confirming whether, for all experiments, they have reported all measures, conditions, data exclusions, and how they determined their sample sizes."

    In other words, if the researchers have tested hundreds or

    thousands of models trying to find a result, they need to report

    the fact that they did so. This statement forces researchers to

    either a) assent to the statement and upgrade their untrustworthy

    analysis to outright fraud or b) admit that over-testing occurred.

    The best way of controlling for this rather insidious and hard-to-detect method is study pre-registration: this is where the researchers write and publish a formal prediction of their study outcome before they start the research. It's not a perfect solution, but it's much better than the alternative.

    Culpability: medium to high

    Significance: medium to high

    Detectability: low

    6. The creeping over-extrapolation

    This fiddle is a little different to the others, as it involves the

    external perception of the study. It's also very common, so common that it took me about 45 seconds to find this example.

    The science journalism site sciencealert.com.au ran this rather

    bold headline a month or so ago.

    Depression can be detected with a blood test

    Interesting, right? Here's the subheadline, now that it has your attention.

    Doctors may soon be able to diagnose mental illness

    with a simple blood test, new research suggests.

    Sounds like a breakthrough, right? Not so fast. The title of the

    article it's describing is:

    "Platelet Serotonin Transporter Function Predicts Default-Mode Network Activity"7

    Here's the glossy and rather tortured logic that connects them:

    The serotonin transporter protein removes serotonin from

    extracellular space. The main method of this is via the

    transporter protein on blood platelets. There is also a good

    relationship between this platelet uptake and the synaptosomal

    uptake (the uptake by areas of the brain).

    Separately, there is a relationship between depression and the

    activity of the default-mode network in the brain, a coordinated system of activity which is active at rest and seems to be implicated in receiving and processing information

    which is self-referential. It is hypothesised that this network is

    disrupted in depressed people, which is the hypothetical source

    of intrusive thoughts and poor concentration in depression.

    Finally, we know that serotonin is implicated in depression as

    serotonin reuptake inhibitors are a frontline treatment for

    depression. That is to say, like most psychotropic medication,

    they work sometimes in some people. We also are well aware

    that while they fairly straightforwardly increase free serotonin

    levels this is probably NOT their primary method of action

    (otherwise, why would these drugs which raise serotonin in 20

    minutes take weeks to start improving mood in depressed

    patients?)

    But anyway: if we can measure the blood platelet serotonin

    reuptake velocity (related to the same function in the brain), it

    might be related to the metabolic activity of the brain by the

    default-mode network (impaired in depression; serotonin

    implicated in function).

    So the researchers took a sample of healthy people and found a reasonable relationship between their blood platelet serotonin

    uptake and the function of the default-mode network as

    measured by blood-oxygen level dependent fMRI scan.

    And finally, please recognise that the above is itself a simplification.

    This is what gives us "depression detected with a blood test".

    I understand, of course, that journalism sensationalises

    complicated topics like neurobiology. But the obvious caveat to that is that it really shouldn't simplify something so much that it isn't reasonably true anymore. And why would they do such a thing? Well, partly because it's their job, but also partly because the researchers put out a press release with exactly the same headline, containing wonderfully compelling but detail-poor sentences such as "serotonin transporter regulates neural depression networks".

    This is creeping over-extrapolation. You start with a result

    which, as far as I can tell, is a fairly solid piece of neurobiology

    relating brain oxygen level uptake over certain cortical networks

    to measured platelet serotonin uptake in the blood. Then you

    write a paper discussion and abstract which extrapolates the

    results somewhat, talking about what might be possible in future

    (if several important caveats are true). About this, you write a

    simplified press release which presents the results in a glowing

    light and presents those extrapolations as the point of the paper.

    Then you let a journalist with no formal science education write

    about it.8

    I'm including this as an error researchers make because it's the 21st century, and researchers have an obligation to ensure that

    their research is correctly reported. It is common for researchers

    trying to justify the external impact of their work in grant applications to collect these lazy, overwhelmingly positive

    stories and list them prominently on their CVs. Be cautious of

    any academic who is proud of how many newspaper articles are

    written about them.

    Culpability: medium

    Significance: medium to high

    Detectability: high

    7. Outlier forgetting 8. Outlier remembering.

    Again, these are together because they are closely related.

    Outlier remembering

    Reger et al9 matched a controlled dose of medium-chain

    triglyceride (MCT) oil in n=20 Alzheimer's Disease patients, to

    see if the presence of blood ketones had an immediate effect on

    cognition. I have reproduced the graph of the central result here on the left. It shows that an increase in performance on a cognitive task was correlated with an increase in blood ketones. Or was it?

    That point you can see on the left-hand side of the left-hand

    graph with the big arrow represents a participant who performed

    much more poorly on the cognitive task after MCT oil than after

    placebo. This is the dead-set opposite of what was predicted, a

    decrease in performance a few times bigger than the alleged increases in performance observed in other people. There's no good reason for this to happen, and it's both in the opposite of the predicted direction and dramatically in excess of everyone else's change scores.

    Now, there are several tests which determine whether or not a

    value is an outlier. Some researchers simply do this by feel, but the more correct way is with a test which compares the value to the rest of the sample. The most common version of this is Grubbs' test,10 and this flags that value as being an outlier.
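
For the curious, here is a minimal sketch of the statistic behind Grubbs' test: the distance of the most extreme value from the sample mean, in units of the sample standard deviation. The scores are invented for illustration and are not the Reger et al data. The critical value is an approximate figure for n = 20 at a two-sided alpha of 0.05 taken from published tables; treat it as an assumption and look up the right value for your own sample size.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (G, index) for the value furthest from the sample mean."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    index = max(range(len(values)), key=lambda i: abs(values[i] - centre))
    return abs(values[index] - centre) / spread, index


# Invented change scores for illustration only; NOT the Reger et al data.
scores = [2, 3, 1, 4, 2, 3, 5, 2, 1, 3, 4, 2, 3, 1, 2, 4, 3, 2, 3, -14]

# Approximate two-sided critical value for n = 20, alpha = 0.05 (assumption:
# taken from standard Grubbs tables; check a table for your own n and alpha).
CRITICAL_N20 = 2.71

g, idx = grubbs_statistic(scores)
print(f"G = {g:.2f} for value {scores[idx]}; flagged as outlier: {g > CRITICAL_N20}")
```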

    Why was the outlier left in?

    When this value is removed, the level of statistical significance

    weakens from p=0.02 to p=0.08, and the r value (the correlation coefficient) falls from 0.5 to 0.42. In other words, removing it waters down the impact of the central finding. While it isn't actually a big difference, it does cast doubt on the central

    result.11
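
The check itself is easy to run on any small dataset: compute the correlation with every point in, then again with the suspect point out, and report both. A minimal sketch follows. The numbers are invented and are not the Reger et al data, and the helper names are mine.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_with_and_without(xs, ys, drop_index):
    """Return (r on all points, r with the point at drop_index removed)."""
    kept_x = xs[:drop_index] + xs[drop_index + 1:]
    kept_y = ys[:drop_index] + ys[drop_index + 1:]
    return pearson_r(xs, ys), pearson_r(kept_x, kept_y)


# Invented numbers for illustration only; NOT the Reger et al data.
ketone_change = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, -0.5]
score_change = [1, 0, 2, 1, 3, 2, -8]

r_all, r_trimmed = r_with_and_without(ketone_change, score_change, drop_index=6)
print(f"r with all points: {r_all:.2f}; r without the extreme point: {r_trimmed:.2f}")
```

In this toy example one extreme participant props up the correlation, which is the pattern described above.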

    As you can probably tell from this, outliers being included are

    very easy to spot. Even when only the means and standard

    deviations of numbers are reported, it's usually obvious when

    something is off.

    Outlier forgetting

    It's hard to find an example of outlier forgetting (the removal of extreme values which disagree with the theory to improve the

    central result) for the simple reason that they aren't there to find! There are some sophisticated methods you can try to determine if there is enough variation in a sample, but until I'm writing for Alan Aragon's Statistical Review, we'll have to let these slide.

    Suffice to say, this can be a real problem. If you selectively

    remove values which ruin your result, it very quickly runs the

    risk of becoming straightforwardly dishonest. This is why I don't

    have an example of one; all I have is an example of where someone didn't do it.

    You can see a good recent example of this. Kogan et al12 examined the relationship between heart rate variability (HRV), the same kind we use for athletic monitoring, and depression / social functioning. They found some values which were outliers, repeated the analysis with outliers both out and in, and then reported the separate models. This is definitely the honest way to do business: if you're removing values, the fact that you're doing it, what the values are, and what this changed about the

    analysis should ALL be reported in the paper.

    Remembering:

    Culpability: medium

    Significance: medium

    Detectability: very high

    Forgetting:

    Culpability: high

    Significance: high

    Detectability: low

    9. 'Cute' covariates

    Arai et al13 looked at the inter-relationship between heart rate

    variability (the same kind we use for athletic monitoring) with

    QT-interval (another metric of health/autonomic outflow which

    we get out of the electrocardiogram) with a sea of possible

    covariates in n=150 young participants.

    If I criticised everything in this paper which I didn't like, I would bore you more than is strictly necessary and wear my fingers down to stumps. So let's leave the criticisms like the incorrect use of the analysis of covariance to one side, and just concentrate on what might be useful for you: how to spot dodgy covariates.

    There are several tells here. Firstly, the presence of a lot of covariates and models for a simple question. Here, 9 measures of

    different heart rate indices are compared with seven possible

    covariates. As before with our hypothetical squat study, a lot of

    possible comparisons is a red flag.

    Secondly, the use of covariates which are not statistically

    independent. For instance, there are models in the paper which

    use BMI in the same model as body fat percentage as measured by impedance. These numbers will obviously be related, and

    inter-relationships between these variables complicate our ability

    to understand the study outcomes dramatically.

    Lastly, the use of broad appeals. The paper justifies adding

    covariates of BMI and fat mass into the sample because it was

    relevant elsewhere (because obese and overweight people often

    have impaired HRV, for instance). But this sample Arai et al use

    is drawn from Japanese students at a school of medicine: the female sample has a mean BMI of 20.1 and a standard deviation of 2.1. This means that of the 86 women, it is likely that only one participant, or even absolutely no participants at all, were even overweight (let alone obese). Their comparison paper drew its sample from Mexico which, as you might be aware, holds the dubious honour of being the world's most obese country.
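
That estimate can be checked in a few lines. This is a rough sketch that assumes BMI in the female sample is approximately normally distributed (real BMI data tend to be right-skewed, so take it as a ballpark) and uses the conventional cut-off of 25 for overweight, which is my assumption and not a figure from the paper.

```python
from statistics import NormalDist

# Figures quoted above for the female sample in Arai et al.
female_bmi = NormalDist(mu=20.1, sigma=2.1)
n_women = 86

# Conventional overweight threshold; an assumption, not taken from the paper.
OVERWEIGHT_BMI = 25.0

p_overweight = 1.0 - female_bmi.cdf(OVERWEIGHT_BMI)
expected_overweight = n_women * p_overweight

print(f"P(BMI >= 25) = {p_overweight:.4f}")
print(f"Expected overweight women out of {n_women}: {expected_overweight:.2f}")
```

Under those assumptions the expected count comes out below one person, which is hardly a basis for modelling the effect of excess body mass.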

    Regression is a complicated topic, and it's very easy to hide dodgy techniques behind a wall of metrics and numbers. Researchers and reviewers fail to understand the implications of what they're doing with a concerning regularity.

    Culpability: very high

    Significance: very high

    Detectability: low

    10. Conflating statistical and practical effects

    DeWall et al14 tested n=93 undergraduates on the Intimate

    Partner Violence scale and Trait Physical Aggression scales in

    two groups, who received either a placebo or an intranasal dose

    of the hormone oxytocin, and a priming condition where they underwent painful / stressful tasks. The paper strongly concluded

    that oxytocin increased intimate partner violence inclinations in

    participants who were high in trait physical aggression.

    Now, this may be strictly true in the statistical sense: the results are probably calculated correctly. But does "X is mathematically different to Y" have any meaning in this context?

    The Intimate Partner Violence scale is a series of charming items

    where people are asked to score their likelihood of slapping,

    shoving, hitting, kicking etc. their current romantic partner. It is

    ranked from 1 ("not at all likely") through to 5 ("extremely likely"), and then averaged. The problem here is the whole group in the

    study had an average of 1.13 (SD = 0.39).

    I tried to model this, and it's impossible to predict well (remember that no scores can be below 1). Probably two-thirds of the entire sample in ALL groups put "not at all likely" for every single possible answer. The entire result could be driven by

    some combination of a) the very few people who reported some

    vague likelihood of violence, and b) the fact that some of the

    groups have no mathematical variability AT ALL: everyone put the same answer. In psychology this is called a "floor effect", and it has the potential to make analyses do awfully strange

    things.
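
One quick way to see the floor effect in the reported numbers: the group mean sits only a third of a standard deviation above the lowest possible score. The sketch below shows what a normal distribution with that mean and SD would imply. The normal model is deliberately the wrong one here (that is the point), and the variable names are mine.

```python
from statistics import NormalDist

SCALE_MIN = 1.0  # lowest possible average on the 1-to-5 scale

# Whole-group mean and SD as reported above.
ipv_scores = NormalDist(mu=1.13, sigma=0.39)

# How far the mean sits above the floor, in standard deviations.
sd_above_floor = (ipv_scores.mean - SCALE_MIN) / ipv_scores.stdev

# Share of a normal distribution that would fall below the minimum score.
share_below_floor = ipv_scores.cdf(SCALE_MIN)

print(f"Mean is {sd_above_floor:.2f} SD above the floor")
print(f"A normal model puts {share_below_floor:.0%} of scores below 1")
```

A normal model would put over a third of participants below a score that cannot exist, so the real data must be piled up at the minimum with a thin tail of higher scores.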

    As this is a social science example, let's cast the same scenario into a hypothetical exercise science study:

    Say we have a new supplement which is designed to decrease

    post-exercise pain. N=80 participants firstly take either our

    supplement or a matched placebo, then all perform a high-

    volume high-intensity deadlift program, doing sets of 85% 1RM

    to concentric failure, and then 80%, 75%, etc. until total

    concentric failure with 50% 1RM is reached. They then rate their

    lower back and hamstring pain 48 hours after exercise. More or

    less everyone writes "10: I am in the maximum amount of imaginable exercise discomfort", mainly because this is an insane protocol which shouldn't be attempted. But a few people in our supplement group write "8: I am in a very, very large amount of pain".

    Now, can we accept that this is a meaningful difference? Well,

    with hundreds and hundreds of participants, maybe. But it is far

    more likely to be semantic: we hurt our participants very badly and seem to only be fiddling at the margins of the value of

    interest. What we were looking for was the absence of pain, and not the presence of very slightly less.

    Our domestic violence questionnaire is the other way around: statistical significance or not, the change from, say, "extremely unlikely" to "quite unlikely" may not be particularly useful at telling us about actual aggressive tendencies.

    Culpability: high

    Significance: low to medium

    Detectability: high

    Bonus: Making up data

    I have to include this, although it isn't really a manipulation in the way other things are: it's fraud! Shang and Hasenberg15 investigated the effect of exercise training subsequent to Roux-en-Y gastric bypass (i.e. stomach surgery). N=60 morbidly

    obese participants were randomised to receive either once or

    twice-weekly exercise training. Significantly more body weight

    and fat mass was lost in the multiple-exercise group, who also

    showed significant improvement in co-morbidities.

    The problem here is that none of this actually happened.

    Someone from either the hospital or associated research group

    noticed that in the location where the data was reported from

    only n=21 patients had actually undergone any procedure at all

    in the period the paper was written over. The data, as it stood, couldn't exist! On questioning, Dr. Shang couldn't produce any of the raw data and had no answer for where it had come from.

    Naturally, this paper is retracted.

    Culpability: very high

    Significance: very high

    Detectability: very low

    Conclusions:

    Please keep in mind firstly that researchers aren't science-robots from an alternate dimension; they're people. They're people with children and mortgages, and research programs which have to work out so they can continue to be funded, in highly

    competitive jobs, often competing against people who are

    willing to bend publication requirements to look better.

    Research isn't by any means a hotbed of fraud and deceit.

    That being said, researchers, even from famous and venerable institutions, can also be stunningly ignorant of the sub-structure

    of the research methodology they need to understand, can make

    basic mistakes in analysis, can deceive themselves, and can

    cheat, manipulate or defraud the process of producing scientific

    knowledge.

    The thing that we have in our favour in trying to ascertain the

    presence of the above is that science is the pursuit of knowledge

    on the public record. Anything that's fiddled, or dishonest, or under-handed, or incorrect, can only ever be hidden in plain

    sight, and in general the ideas that everyone agrees are the most

    important receive the most scrutiny. This might sound laudable,

    but it is anything but straightforward. Progress lurches along

    quite slowly. There are a few things that you, the interested

    reader (or perhaps peer-reviewer) can do to help, and to satisfy

    your own curiosity.

    1. Contact the researchers. Ask for data.

    Researchers, in general, like to talk about their work. Generally

    the person who is on the paper as the corresponding author is the right person to ask about it. However, be aware that when the last author in the list of authors is listed as corresponding, this generally means the most senior person on the project is also the person you're contacting, who is also often the busiest.

    (In situations like this, I generally Google the first author and

    ask them if they can help...)

    Researchers can be notoriously precious about sending their data

    to other people. This isn't just because they're afraid of scrutiny or persecution (they often are). It's also because data files can be a complete mess after the completion of a study, in three

    different files (with different versions) only comprehensible to a

    co-author, and squirreled away on a university server with a

    password known only to the research assistant who quit 9

    months ago. What you're asking could represent a big investment of time on the part of the researchers. But you can

    always ask.

    2. Support efforts to put data in the public domain

    This is a big component of what's called "open science": the trend towards publishing datasets with experiments, as well as

    analytical tools etc. that are used. Remember that people who do

    this are extending what until now has been a privilege, which is

    the ability to look under the hood of how a study works. I feel strongly that researchers who publish data earn an extra degree

    of trust.

    3. Post on PubPeer or PubMed Commons

    These are both websites where you can leave comments for the

    public record on published research. If you want answers for

    questions that you have, they are very useful. To get access, I

    believe you need either an academic email address (i.e. one from

    a tertiary institution) or an invitation from an existing user.

    4. Start a conversation

    A few years ago, I was very amused when Alan was arguing

    with Dr. Robert Lustig of "sugar is evil" fame, and was told

    rather huffily that academics do not have head-to-head

    confrontations on blogs, social media, forums, etc. I was amused

    because they damn well do, all the time, and at great volume. There are plenty of outlets for legitimate questions about research which aren't the old, formal methods. If you know someone with a public blog, ask them to start a conversation for

    you. Or start one yourself. Invite the researchers to comment.

    Remember with all of the above to be courteous and show

    interest, rather than trying to storm the ramparts. Everyone is

    looking for answers, but some are looking better than others.

    ____________________________________________________

    James is just about to finish a

    PhD in cardiac electrophysiology.

    In his spare time, he breaks

    things for money. Everything else

    you need to know is here:

    jamesheathers.com

    ____________________________________________________

    References:

    1. Murphy, M., K. Eliot, et al. (2012). "Whole beetroot consumption acutely improves running performance." J Acad

    Nutr Diet 112(4): 548-552. [PubMed]

    2. http://en.wikipedia.org/wiki/Bonferroni_correction

    3. Tucker, R., M. I. Lambert, et al. (2006). "An analysis of

    pacing strategies during men's world-record performances in

    track athletics." Int J Sports Physiol Perform 1(3): 233-245.

    [PubMed]

    4. Christian, J. G., T. E. Byers, et al. (2011). "A computer support program that helps clinicians provide patients with

    metabolic syndrome tailored counseling to promote weight

    loss." J Am Diet Assoc 111(1): 75-83. [PubMed]

    5. Angus, S. D. (2014). "Did recent world record marathon runners employ optimal pacing strategies?" J Sports Sci

    32(1): 31-45. [PubMed]

    6. Simmons, J. P., L. D. Nelson, et al. (2011). "False-positive psychology: undisclosed flexibility in data collection and

    analysis allows presenting anything as significant." Psychol

    Sci 22(11): 1359-1366. [PubMed]

    7. Scharinger, C., U. Rabl, et al. (2014). "Platelet serotonin transporter function predicts default-mode network activity."

    PLoS One 9(3): e92543. [PubMed]

    8. And then someone who doesn't even understand the journalism uses it in an argument on the internet!

    9. Reger, M. A., S. T. Henderson, et al. (2004). "Effects of beta-hydroxybutyrate on cognition in memory-impaired adults."

    Neurobiol Aging 25(3): 311-314. [PubMed]

    10. http://en.wikipedia.org/wiki/Grubbs'_test_for_outliers

    11. In statistical terminology, this is only an outlier on the x-axis and it's in the right place so technically it's a point of leverage not an outlier.

    12. Kogan, A., J. Gruber, et al. (2013). "Too much of a good thing? Cardiac vagal tone's nonlinear relationship with well-

    being." Emotion 13(4): 599-604. [PubMed]

    13. Arai, K., Y. Nakagawa, et al. (2013). "Relationships between QT interval and heart rate variability at rest and the covariates

    in healthy young adults." Auton Neurosci 173(1-2): 53-57.

    [AN/BC]

    14. DeWall, C.N., O. Gillath, et al. (2014). "When the Love Hormone Leads to Violence: Oxytocin Increases Intimate Partner Violence Inclinations Among High Trait Aggressive People." Soc Psych Pers Sci, Published online Feb 12th. [SPPS]

    15. Shang, E. and T. Hasenberg (2010). "Aerobic endurance training improves weight loss, body composition, and co-

    morbidities in patients after laparoscopic Roux-en-Y gastric

    bypass." Surg Obes Relat Dis 6(3): 260-266. [PubMed]

    Changes in exercises are more effective than in loading schemes to improve muscle strength [reviewed by Brad Schoenfeld, PhD, CSCS, CSPS, FNSCA].

    Fonseca RM, Roschel H, Tricoli V, de Souza EO, Wilson JM,

    Laurentino GC, Aihara AY, de Souza Leão AR, Ugrinowitsch C.

    J Strength Cond Res. 2014 May 14. [Epub ahead of print]

    [PubMed]

    ____________________________________________________

    BACKGROUND/PURPOSE: This study investigated the

    effects of varying strength exercises and/or loading scheme on

    muscle cross-sectional area (CSA) and maximum strength after

    four strength training loading schemes: constant intensity and

    constant exercise (CICE), constant intensity and varied exercise

    (CIVE), varied intensity and constant exercise (VICE), varied

    intensity and varied exercise (VIVE). METHODS: Forty-nine

    individuals were allocated into five groups: CICE, CIVE, VICE,

    VIVE, and control group (C). Experimental groups underwent a

    twice a week training for 12 weeks. Squat 1RM was assessed at

    baseline and after the training period. Whole quadriceps muscle

    and its heads' CSA were also obtained pre- and post-training.

    RESULTS: The whole quadriceps CSA increased significantly

    (p

    for sure that the same findings would be seen in a well-trained

    population. Indeed, if the authors' hypothesis that changing the

    rep range had a negative effect on neural drive is in fact correct,

    it could alternatively be hypothesized that this detriment would

    not occur in more experienced subjects since neural adaptations

    would already be well-ingrained.

    One issue that can be raised with the design is that the rep range

    employed for the varied intensity groups (6-10 reps per set) was

    fairly narrow. It would be difficult to imagine that changes in

    muscle growth would have been significantly different using

    such a narrow range over the course of a few months. What

    would have been more interesting from a hypertrophy

    standpoint, IMO, is if the rep range had encompassed a low

    rep condition (i.e. 5 reps), moderate rep condition (10 reps) and

    a high rep condition (15 reps). Based on the concept of the

    strength-endurance continuum, comparing a constant intensity of

    10 reps per set versus a varied intensity of 5-10-15 reps per set

    would have made more sense to see if muscle hypertrophy

    differs along this continuum.

    Ultimately the study provides intriguing findings that have

    practical implications for training. Most importantly, it

    reinforces the need to vary exercise selection to maximize

    muscular symmetry as well as strength. It also suggests that,

    from a maximal strength standpoint, limiting variation in

    intensity of load is beneficial during the early stages of training.

    Ideally this study should be replicated, perhaps with wider

    intervals in rep range, in well-trained subjects to provide better

    generalizability for those with lifting experience.

    ____________________________________________________

    Brad Schoenfeld, PhD, CSCS, CSPS, FNSCA, is a

    lecturer in the exercise science department for

    Lehman College and is the head of their

    human performance laboratory. His primary

    research interests focus on elucidating the

    mechanisms of muscle hypertrophy and their

    application to resistance training. He has

    published over 40 peer-reviewed journal

    articles and currently serves on the Board of

    Directors for the NSCA. He is author of the

    book, "The M.A.X. Muscle Plan" which is

    available at all major bookstores and on

    Amazon.com. He maintains an active blog on his website:

    http://www.lookgreatnaked.com/

    The effects of consuming a high protein diet (4.4 g/kg/d) on body composition in resistance-trained individuals.

    Antonio J, Peacock CA, Ellerbroek A, Fromhoff B, Silver T. J

    Int Soc Sports Nutr. 2014 May 12;11:19. [PubMed] [Full Text]

    BACKGROUND: The consumption of dietary protein is important for resistance-trained individuals. It has been posited that intakes of 1.4 to 2.0 g/kg/day are needed for physically active individuals. Thus, the purpose of this investigation was to determine the effects of a very high protein diet (4.4 g/kg/d) on body composition in resistance-trained men and women. METHODS: Thirty healthy resistance-trained individuals participated in this study (mean ± SD; age: 24.1 ± 5.6 yr; height: 171.4 ± 8.8 cm; weight: 73.3 ± 11.5 kg). Subjects were randomly assigned to one of the following groups: Control (CON) or high protein (HP). The CON group was instructed to maintain the same training and dietary habits over the course of the 8 week study. The HP group was instructed to consume 4.4 grams of protein per kg body weight daily. They were also instructed to maintain the same training and dietary habits (e.g. maintain the same fat and carbohydrate intake). Body composition (Bod Pod), training volume (i.e. volume load), and food intake were determined at baseline and over the 8 week treatment period. RESULTS: The HP group consumed significantly more protein and calories pre vs post (p < 0.05). Furthermore, the HP group consumed significantly more protein and calories than the CON (p < 0.05). The HP group consumed on average 307 ± 69 grams of protein compared to 138 ± 42 in the CON. When expressed per unit body weight, the HP group consumed 4.4 ± 0.8 g/kg/d of protein versus 1.8 ± 0.4 g/kg/d in the CON. There were no changes in training volume for either group. Moreover, there were no significant changes over time or between groups for body weight, fat mass, fat free mass, or percent body fat. CONCLUSION: Consuming 5.5 times the recommended daily allowance of protein has no effect on body composition in resistance-trained individuals who otherwise maintain the same training regimen.
This is the first interventional study to demonstrate that consuming a hypercaloric high protein diet does not result in an increase in body fat. SPONSORSHIP: JA is the CEO of the International Society of Sports Nutrition. The protein powder was provided by MusclePharm and Adept Nutrition (Europa Sports Products brand); both are sponsors of the ISSN conferences.

    Study strengths

    A big strength of this study is the underlying concept, and the

    interesting question investigated. It's one of the fun studies that pushes the "what if we tried this crazy idea" factor, examining a highly experimental and exploitive protocol. And, it happened to

    yield some intriguing results. Overfeeding studies have thus far

    focused on carbohydrate and/or fat,1-7

    with a glaring scarcity of

    studies on protein overfeeding.8 Furthermore, the majority of

    overfeeding trials are short, ranging from a few days to less than

    a month. Subjects were resistance-trained, which minimizes the

    respond-strongly-to-anything tendency of novices.

    Study limitations

    Air displacement plethysmography (ADP, or Bod Pod) was used

    to assess body composition. A comprehensive review by Fields

    et al states:9 "In conclusion, the BOD POD is a reliable and valid technique that can quickly and safely evaluate body composition in a wide range of subject types, including those who are often difficult to measure, such as the elderly, children, and obese individuals." However, it should be noted that the majority of studies on the Bod Pod have compared it to

    hydrostatic weighing. Ball and Altena10

    compared Bod Pod to

    dual X-ray absorptiometry (DXA) in a large sample of men

    (n=160) and found that although the results from the two

    methods were highly correlated, the difference increased as

    bodyfat increased. Quoting their conclusion (which I feel is

    hugely important):10

    "Practitioners should be aware that even with the use of technologically sophisticated methods (i.e., Bod Pod, DXA), differences between methods exist and the determination of body composition is at best, an estimation."

    Another limitation is the questionable reliability of self-reported

    dietary intake (and activity output). Research that immediately

    comes to mind is Lichtman et al, who found that obese subjects

    with a reported history of diet resistance under-reported food intake by an average of 47%, and over-reported physical activity

    by 51%.11

    In the case of the present study, there was a massive

    amount of protein assigned to the experimental group (4.4 g/kg

    or 307 g/day). The investigators were aware of the inherent

    difficulty in carrying this out, hence their purposely uneven

    randomization: 20 subjects were assigned to the high-protein

    (HP) group, and 10 subjects to the control group. It's not out of the question that over-reporting occurred, since it's human nature to avoid admitting failure to fully follow the program.

    Aside from the limitations inherent with self-reported intake,

    there was no objective measure of energy expenditure. An

    attempt to control for training volume was made via daily

    journaling. This meant relying on the accuracy of the

    subjects' records, instead of an objective measure of energy expenditure such as the doubly labeled water (DLW) technique.

    The use of DLW has been called the "gold standard" of assessing energy expenditure, particularly in non-confined conditions.12

    However, it's rare to see DLW used in sports nutrition studies (or most any type of research, for that matter). This is because

    it's expensive and requires specifically trained personnel. Thus, we're left with open questions about how the experimental protein overfeeding affected non-exercise activity thermogenesis

    (NEAT). One of the most memorable examples of DLW use

    capturing the impressive extent of NEAT was in 1999 when

    Levine et al13

    found that the metabolic response to a 1000-kcal

    surplus ranged from a 98 kcal decrease to a 692 kcal increase in

    NEAT. The group's mean increase in NEAT was 336 kcal. The authors' summation is worth quoting directly:13

    "Thus, activation of NEAT can explain the variability of fat gain with overeating. As humans overeat, those with effective activation of NEAT can dissipate excess energy so that it is not available for storage as fat, [...] The maximum increase in NEAT that we detected (692 kcal/day, volunteer 5) could be accounted for by an increased strolling-equivalent activity of 15 min/hour during waking hours."
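
    To put the strolling figure from that quote in perspective, here is a rough back-of-the-envelope sketch in Python. The 692 kcal/day and 15 min/hour values come from Levine et al's summation; the 16 waking hours per day is my own assumption (the quote does not specify one), and the function name is hypothetical, so treat the resulting per-minute cost as illustrative only.

```python
def implied_strolling_cost(neat_kcal_per_day, minutes_per_hour, waking_hours):
    """Return (total strolling minutes per day, implied kcal burned per minute)."""
    total_minutes = minutes_per_hour * waking_hours
    return total_minutes, neat_kcal_per_day / total_minutes


# 692 kcal/day and 15 min/hour are from the quote; 16 waking hours is assumed.
minutes, kcal_per_min = implied_strolling_cost(692, 15, 16)
print(f"{minutes} min/day of strolling, about {kcal_per_min:.1f} kcal/min")
```

    Under that assumption, the maximum NEAT response corresponds to about 4 hours of strolling-equivalent movement spread across the day.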

    Comment/application

    The most salient finding was the lack of significant change in

    body composition in either group over the 8-week period:

    Surprisingly, the HP group's body composition showed no significant changes despite the assignment of an additional 800

    kcal (in protein) above and beyond that assigned to the control

    group. But, unlike Levine et al's overfeeding study, the present study based overfeeding on protein exclusively. The HP group's consumption of ~307 g protein versus the control group's ~138 g without a doubt had a higher thermic effect. As reported by

    Jéquier,14

    the thermic effect of protein (expressed as a

    percentage of energy content) is 25-30%, carbohydrate is 6-8%,

    and fat is 2-3%. However, not all of the literature is in precise

    agreement. Halton and Hu reported greater variability, with the

    thermic effect of protein being 20-35%, carbohydrate at 5-15%,

    and fat being subject to debate since some investigators found a

    lower thermic effect than carbohydrate while others found no

    difference.15

    Despite relative variations in carbohydrate and fat,

    protein has consistently shown a markedly higher thermic effect

    than either of them. Taken together, the thermic effect of protein

    and a liberal presumption of NEAT account for the majority of

    the dissipated protein energy. The remainder is

    plausibly attributable to reporting error.
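
    As a rough illustration of that accounting, the sketch below tallies how much of the ~800 kcal protein surplus the thermic effect and NEAT might cover. The thermic effect percentages and the NEAT values (336 kcal mean, 692 kcal maximum) are the figures cited above; applying Levine et al's NEAT numbers, which came from a 1000-kcal mixed surplus, to this study's subjects is purely an assumption, and the function and scenario names are hypothetical.

```python
# Rough energy accounting for the HP group's protein surplus.
# Assumption: NEAT values from Levine et al are borrowed for illustration only;
# NEAT was not measured in the present study.

SURPLUS_KCAL = 800  # additional protein energy assigned to the HP group


def account_for_surplus(surplus_kcal, tef_fraction, neat_kcal):
    """Return (tef_kcal, dissipated_kcal, unexplained_kcal)."""
    tef_kcal = surplus_kcal * tef_fraction
    # Dissipation cannot exceed the surplus itself.
    dissipated_kcal = min(surplus_kcal, tef_kcal + neat_kcal)
    return tef_kcal, dissipated_kcal, surplus_kcal - dissipated_kcal


scenarios = [
    ("TEF 20%, mean NEAT (336 kcal)", 0.20, 336),
    ("TEF 30%, mean NEAT (336 kcal)", 0.30, 336),
    ("TEF 30%, max NEAT (692 kcal)", 0.30, 692),
]

for label, tef_fraction, neat_kcal in scenarios:
    tef, dissipated, unexplained = account_for_surplus(
        SURPLUS_KCAL, tef_fraction, neat_kcal
    )
    share = 100 * dissipated / SURPLUS_KCAL
    print(f"{label}: TEF {tef:.0f} kcal, dissipated {dissipated:.0f} kcal "
          f"({share:.0f}%), unexplained {unexplained:.0f} kcal")
```

    Under these assumptions, the thermic effect plus an average NEAT response covers roughly 60-70% of the surplus, and only the most liberal NEAT figure covers all of it.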

    In a recent study that made waves for being the first of its kind,

    Bray et al16

    compared the overfeeding effects of a low-protein

    (5%), normal-protein (15%), and high-protein (25%) diet.

    Carbohydrate was kept the same across the treatments, with fat

    filling in the remainder. Among this study's design strengths was the use of DLW to assess energy expenditure. A 40%

    energy surplus (954 kcal) was imposed for 8 weeks. The low-protein

    group lost lean mass and all groups increased fat mass equally, while

    the normal- and high-protein groups gained lean mass, with the

    latter gaining more lean mass by a small margin. The low-protein

    group gained significantly less total bodyweight than the

    higher-protein groups, but this was due to differences in lean

    mass gain.

    In the present study, no lean mass was gained despite an

    increased protein intake in the HP group. This can be attributed

    to the advanced resistance-trained status of the subjects (they

    trained an average of 8.5 hours/week for the past 8.9 years), and

    their baseline protein intake was already high (~1.9-2.3 g/kg). In

    contrast, Bray et al's subjects were untrained, and their protein intake at baseline was 1.2 g/kg, and this was raised to 1.8 g/kg in

    the high-protein treatment, essentially crossing the threshold from sub-optimal to optimal. Another point made by the authors

    of the present study was that the subjects were instructed to

    maintain their habitual training program, thus precluding any

    novel or greater training stimulus that might elicit further gains.

    Taking the results at face value, it almost seems that surplus calories

    don't count since NEAT will save you as long as the surplus is from protein. However, it's worth reiterating that not all aspects of this study were tightly controlled, and reporting error could

    have played a significant confounding role in the results.

    On the other hand, there is still the possibility that relatively

    advanced, resistance-trained subjects have a heightened

    capability of dissipating surplus energy from dietary protein

    through involuntary, non-exercise means. This possibility also

    holds potentially important implications for meal planning of

    dieting individuals as well as those who are striving to maintain

    weight loss, but also need to control appetite.

    On a practical note, the following detail should be weighed into

    consideration: "...every subject in the high protein group consumed protein powder in order to meet the requirements for the study. Otherwise, it would have been virtually impossible or highly unlikely that one could consume a 4.4 g/kg/d via food alone." Protein supplements (in this case, whey and casein powders) only contain trace amounts of fat and carbohydrate.

    Those who want to experiment with higher protein intakes

    should keep in mind that inadvertent addition of fat and

    carbohydrate along with the extra protein (i.e., via mixed-

    macronutrient dishes and/or fatty meats) would not mimic the

    protocol nor the effects seen in the present study.

    An amino acid-electrolyte beverage may increase cellular rehydration relative to carbohydrate-electrolyte and flavored water beverages.

    Tai CY, Joy JM, Falcone PH, Carson LR, Mosman MM,

    Straight JL, Oury SL, Mendez C, Loveridge NJ, Kim MP,

    Moon JR. Nutr J 2014, 13:47 doi:10.1186/1475-2891-13-47

    [PubMed]

    BACKGROUND: In cases of dehydration exceeding a 2% loss of body weight, athletic performance can be significantly compromised. Carbohydrate and/or electrolyte containing beverages have been effective for rehydration and recovery of performance, yet amino acid containing beverages remain unexamined. Therefore,

    the purpose of this study is to compare the rehydration capabilities of an electrolyte-carbohydrate (EC), electrolyte-branched chain amino acid (EA), and flavored water (FW) beverages. METHODS: Twenty men (n = 10; 26.7 +/- 4.8 years; 174.3 +/- 6.4 cm; 74.2 +/- 10.9 kg) and women (n = 10; 27.1 +/- 4.7 years; 175.3 +/- 7.9 cm; 71.0 +/- 6.5 kg) participated in this crossover study. For each trial, subjects were dehydrated, provided one of three random beverages, and monitored for the following three hours. Measurements were

    collected prior to and immediately after dehydration and 4 hours after dehydration (3 hours after rehydration) (AE = -2.5 +/- 0.55%; CE = -2.2 +/- 0.43%; FW = -2.5 +/- 0.62%). Measurements collected at each time point were urine volume, urine specific gravity, drink volume, and fluid retention. RESULTS: No significant differences (p > 0.05) existed between beverages for urine volume, drink volume, or fluid retention for any time-point.

    Treatment x time interactions existed for urine specific gravity (USG) (p < 0.05). Post hoc analysis revealed differences occurred between the FW and EA beverages (p = 0.003) and between the EC and EA beverages (p = 0.007) at 4 hours after rehydration. Wherein, EA USG returned to baseline at 4 hours post-dehydration (mean difference from pre to 4 hours post-dehydration = -0.0002; p > 0.05) while both EC (-0.0067) and FW (-0.0051) continued to produce

    dilute urine and failed to return to baseline at the same time-point (p < 0.05). CONCLUSION: Because no differences existed for fluid retention, urine or drink volume at any time point, yet USG returned to baseline during the EA trial, an EA supplement may enhance cellular rehydration rate compared to an EC or FW beverage in healthy men and women after acute dehydration of around 2% body mass loss. SPONSORSHIP: MusclePharm Corporation.

    Study strengths

    This study is innovative since it's the first to compare the hydrating effects of a BCAA-electrolyte (AE) beverage with that

    of a carbohydrate-electrolyte (CE) beverage. Furthermore, the

    protocol involved a more realistic fluid dose than the typically

    massive fluid doses given in previous research examining

    rehydration beverages. Subjects were required to have a

    minimum of one year of endurance and resistance training

    experience, which minimized the chance of confounding

    "newbie effects." This investigation is of relevance to trainees aiming to economize caloric intake, which is often hiked by the

    carbohydrate content of conventional recovery beverages.

    Study limitations

    While this study may have relevance to those seeking to

    economize carbohydrate intake, such a population would be very

    sparse among trainees seeking to improve endurance-type

    performance. This is because the inclusion of carbohydrate

    would serve the dual purpose of driving better exercise

    performance, as well as faster glycogen resynthesis post-exercise

    (both of which can benefit competitive endurance sports, especially those with multiple glycogen-depleting events per

    day). Missing from the comparison was a condition containing

    amino acids, carbohydrate, and electrolytes. However, the

    authors duly cite research by Lambert et al,17

    who found that in

    the absence of electrolytes, no significant differences in

    rehydration were seen between beverages containing versus

    omitting carbohydrate. This implicates electrolytes as the critical

    factor in rehydration (rather than carbohydrate, whose function

    would be limited to glycogen resynthesis). Still, potentially

    interactive or synergistic effects of a combination of carbs,

    electrolytes, and amino acids on hydration would have been a

    worthy condition to investigate in the present study's comparison. For example, chocolate milk has demonstrated

    effectiveness for rehydration, glycogen resynthesis, and muscle

    recovery, and is more nutrient-dense than typical commercial

    recovery drinks.18

    A final limitation was that the treatments were

    not equal in terms of potassium content (AE had the most).

    Comment/application

    The findings only partially agreed with the authors' hypothesis going into the experiment. They originally predicted that AE and

    CE beverages would rehydrate similarly, yet to a greater extent

    than the flavored water (FW) beverage. Interestingly, quoting

    the authors: "The AE and CE beverages rehydrated about equally; however, they were also equal to the FW beverage." However, they go on to mention a subtle detail that separated the

    AE beverage from the other two, rendering it superior. CE and

    FW yielded more diluted urine than AE, as indicated by urine

    specific gravity (USG), depicted below:

    At 4 hours post-dehydration, USG in the CE and FW trials was

    significantly lower than pre-testing, while AE's USG was the same as pre-testing at this time point. This suggests greater

    urinary diuresis and less cellular retention in CE & FW

    compared to AE. It should be noted that the measured

    differences in fluid retention between conditions were not

    statistically significant (43.5% in AE, 40.8% in CE, and 42.2%

    in FW). This raises the question of how clinically relevant these

    small differences really are, and how necessary or beneficial this

    special rehydration product is despite its inclusion of BCAA and absence of carbohydrate.

    Calorie shifting diet versus calorie restriction diet: a comparative clinical trial study. Davoodi SH, Ajami M, Ayatollahi SA, Dowlatshahi K, Javedan G, Pazoki-Toroudi HR. Int J Prev Med. 2014 Apr;5(4):447-56. [PubMed]

    BACKGROUND: Finding new tolerable methods in weight loss

    has largely been an issue of interest for specialists. Present study

    compared a novel method of calorie shifting diet (CSD) with classic

    calorie restriction (CR) on weight loss in overweight and obese

    subjects. METHODS: Seventy-four subjects (body mass index

    ≥25; ≤37) were randomized to 4 weeks control diet, 6 weeks CSD or CR diets, and 4 weeks follow-up period. CSD consisted of three

    phases each lasts for 2 weeks, 11 days calorie restriction which

    included four meals every day, and 4 h fasting between meals

    follow with 3 days self-selecting diet. CR subjects receive

    determined low calorie diet. Anthropometric and metabolic

    measures were assessed at different time points in the study.

    RESULTS: Four weeks after treatment, significant weight, and fat

    loss started (6.02 and 5.15 kg) and continued for 1 month of follow-

    up (5.24 and 4.3 kg), which was correlated to the restricted energy

    intake (P < 0.05). During three CSD phases, resting metabolic rate

    tended to remain unchanged. The decrease in plasma glucose, total

    cholesterol, and triacylglycerol were greater among subjects on the

    CSD diet (P < 0.05). Feeling of hunger decreased and satisfaction

    increased among those on the CSD diet after 4 weeks (P < 0.05).

    CONCLUSIONS: The CSD diet was associated with a greater

    improvement in some anthropometric measures. Adherence was

    better among CSD subjects. Longer and larger studies are required

    to determine the long-term safety and efficacy of CSD diet.

    SPONSORSHIP: None listed.

    Study strengths

    This is the first study to compare the effects of this particular permutation of a calorie shifting diet (CSD: 11 days restricted, 3 days unrestricted) pattern with a linear calorie-restricted (CR) diet. The investigation is an important one given the generally unimpressive weight loss and weight loss maintenance from conventional caloric restriction.19-21 The sample size (n=74) was fairly large, especially for diet research, which is notorious for its small subject numbers. More subjects translates to greater statistical power and less likelihood of by-chance occurrences.

    Study limitations

    The design included an intervention period as well as a follow-up period, which is a good thing; it's just that both periods were short (6-week intervention, 4-week follow-up). This essentially gives us hypothesis-generating pilot data rather than long-term data that we can lean on with greater confidence. Another limitation is the diet construction: these are your typical, crappy research diets. Protein intake during the intervention phase was actually less than the subjects' habitual intakes at baseline in both groups. The deficits were rather severe, but strangely, they were not equated. CSD's reduction was set at 45% of baseline reported maintenance, and CR's was set at 55% of maintenance. The more severe deficit in CSD may have imparted an advantage. Furthermore, the results of this study might be limited to the subject profile (obese, untrained). Bioelectrical impedance analysis (BIA) was used to assess body composition.

    Comment/application

    CSD outperformed CR on several parameters:

    • CSD yielded greater weight loss at the end of the follow-up period but not the intervention period. (Full body composition change details here).

    • CSD yielded greater fat loss at the end of the follow-up period but not at the end of the intervention period.

    • CSD's decrease in RMR was less than that of CR (which ended up being lower than baseline by the end of the study).

    • CSD yielded greater decreases in glucose, total cholesterol, and triacylglycerol by the end of the study.

    • CSD tended to yield lesser subjective feelings of hunger toward the end of the trial.

    • CSD had a much higher subject retention rate; 36.8% dropped out of CR, and 15.6% dropped out of CSD.

    Overall, CSD trumped CR, especially by the end of the 4-week follow-up period. However, it's very important to view these results in the proper perspective. It can't be overemphasized that this was a short intervention (6 weeks plus a 4-week follow-up). I'll also reiterate that the diets imposed upon both groups were far from optimal in terms of protein intake. Baseline protein intake of CSD was ~1.1 g/kg, and this dropped to ~0.9 g/kg during the intervention. Baseline protein intake in CR was ~1.1 g/kg, and this dropped to ~0.8 g/kg during the intervention. These protein intakes are approximately half of what has repeatedly been shown to be a favorable and effective target for optimizing muscular adaptations to hypocaloric conditions.22-24 Nevertheless, perhaps a case can be made for CSD over CR under conditions of subpar protein intake.

    Still, the biological plausibility of there being inherent advantages to the CSD pattern is questionable beyond its potential to bolster compliance, at least in the short-term. CSD had more rules and structuring, particularly during the 11-day cycles where 4 meals were strictly spaced 4 hours apart with no eating allowed in between. This may have raised subjects' awareness and focus on the protocol, keeping them more compliant. In contrast, the linear calorie-restricted group could have been lulled into a monotonous grind conducive to the loosening of adherence over time. The CSD group, meanwhile, essentially took a 3-day "diet break" (consuming maintenance-level calories) after every 11-day block of dieting.

    The weight and fat loss benefits of CSD were not clearly apparent until the end of the follow-up period. It's thus easy to speculate that the 6 weeks of linear, aggressive caloric restriction may have been met with deprivation backlash during the follow-up period where the objective was to consume maintenance-level calories. Remember that in CR, 55% of baseline intake was subtracted, leaving subjects with 6 weeks of consuming 1186 kcal/day (down from 2432 kcal at baseline). An important indicator of the CSD's effectiveness was the more than twofold higher dropout rate in CR (36.8% versus 15.6%). The more favorable biochemical changes in CSD can be attributed primarily to the greater weight and fat loss by the end of the follow-up. The relative success of the 11/3 CSD model points to the potential effectiveness of other more convenient and realistic non-linear models. For example, a 5/2 model, with 5 calorie-restricted days followed by 2 self-selected days, would mirror a weekdays/weekend cycle, which could potentially fit better into the common work schedule.
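
    The figures in this discussion are easy to sanity-check with a few lines of Python. The sketch below uses only numbers reported above (2432 and 1186 kcal/day for CR, dropout rates of 36.8% and 15.6%, and the 11/3 and 5/2 day patterns). The function names are hypothetical, and the 5/2 pattern is the speculative model just mentioned, not something the study tested.

```python
def percent_reduction(baseline_kcal, diet_kcal):
    """Percent of baseline intake removed by the diet."""
    return 100 * (baseline_kcal - diet_kcal) / baseline_kcal


def restricted_share(restricted_days, free_days):
    """Percent of days in one cycle spent in caloric restriction."""
    return 100 * restricted_days / (restricted_days + free_days)


cr_cut = percent_reduction(2432, 1186)   # CR baseline vs. intervention intake
dropout_ratio = 36.8 / 15.6              # CR dropout rate relative to CSD

print(f"CR intake reduction implied by reported intakes: {cr_cut:.1f}%")
print(f"CR-to-CSD dropout ratio: {dropout_ratio:.2f}")
print(f"Restricted days, 11/3 cycle: {restricted_share(11, 3):.1f}%")
print(f"Restricted days, 5/2 cycle: {restricted_share(5, 2):.1f}%")
```

    The reported intakes work out to a cut of roughly 51%, in the neighborhood of the 55% target, and a 5/2 pattern would keep a somewhat smaller share of days restricted than the 11/3 cycle.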

    1. Lecoultre V, Egli L, Carrel G, Theytaz F, Kreis R, Schneiter P, Boss A, Zwygart K, Lê KA, Bortolotti M, Boesch C, Tappy L. Effects of fructose and glucose overfeeding on hepatic insulin sensitivity and intrahepatic lipids in healthy humans. Obesity (Silver Spring). 2013 Apr;21(4):782-5. [PubMed]

    2. Sobrecases H, Lê KA, Bortolotti M, Schneiter P, Ith M, Kreis R, Boesch C, Tappy L. Effects of short-term overfeeding with fructose, fat and fructose plus fat on plasma