4 - May - 2014

Alan Aragon's Research Review, May 2014

Copyright May 1st, 2014 by Alan Aragon Home: www.alanaragon.com/researchreview Correspondence: [email protected]

Optimizing activity-based fat loss for aesthetic athletes: Interval or steady-state training?
By Joel Minden, PhD, CSCS

How to manipulate research.
By James Heathers, PhD(c)

Changes in exercises are more effective than in loading schemes to improve muscle strength [reviewed by Brad Schoenfeld, PhD, CSCS, CSPS, FNSCA].
Fonseca RM, Roschel H, Tricoli V, de Souza EO, Wilson JM, Laurentino GC, Aihara AY, de Souza Leão AR, Ugrinowitsch C. J Strength Cond Res. 2014 May 14. [Epub ahead of print] [PubMed]

The effects of consuming a high protein diet (4.4 g/kg/d) on body composition in resistance-trained individuals.
Antonio J, Peacock CA, Ellerbroek A, Fromhoff B, Silver T. J Int Soc Sports Nutr. 2014 May 12;11:19. [PubMed]

An amino acid-electrolyte beverage may increase cellular rehydration relative to carbohydrate-electrolyte and flavored water beverages.
Tai CY, Joy JM, Falcone PH, Carson LR, Mosman MM, Straight JL, Oury SL, Mendez C, Loveridge NJ, Kim MP, Moon JR. Nutr J 2014, 13:47 doi:10.1186/1475-2891-13-47 [PubMed]

Calorie shifting diet versus calorie restriction diet: a comparative clinical trial study.
Davoodi SH, Ajami M, Ayatollahi SA, Dowlatshahi K, Javedan G, Pazoki-Toroudi HR. Int J Prev Med. 2014 Apr;5(4):447-56. [PubMed]

Processed foods - are they really that bad for you?
By Chris & Eric Martinez

How can you get through to people who *think* they understand the science behind a certain topic?
By Alan Aragon


Optimizing activity-based fat loss for aesthetic athletes: Interval or steady-state training?

By Joel Minden

__________________________________________________

For aesthetic athletes, such as dancers, gymnasts, and

bodybuilders, managing body mass and composition is just as

important as sport-specific training. At a selected body weight,

fat mass should be minimized, and dietary strategies, such as

caloric restriction or macronutrient manipulation, are frequently

used to achieve this. For those who prefer to emphasize activity-

based methods to reduce body fat, the optimal strategy is

unclear. Although increasing activity to create a negative energy

balance should be the primary goal, there is considerable debate

concerning the differential effectiveness of interval versus

steady-state training. Perhaps the lack of consensus is due to the

fact that empirical research in this area is compromised by

methodological limitations and an inability to control, either

physically or statistically, for the numerous contextual variables

that cloud interpretability.

For example, research on acute metabolic responses to exercise

is sometimes criticized for the artificiality of the experimental

setting, limited time course, and uncertain relation of measured

variables (i.e., substrate utilization, gas exchange, plasma, and

biopsy data) to long-term changes in body composition.

Similarly, research on chronic responses to exercise has its own

set of limitations: individual differences in protocol compliance,

nonexercise activity, and dietary behavior; unknown accuracy of

subjects' record keeping; and questionable reliability of

instruments used to track changes in body composition. Finally,

both acute and chronic outcome data should be interpreted

within the context of participant variables, including

demographic characteristics and fitness levels, and dimensions

of training protocols, such as modality, intensity, duration, and

frequency of exercise. In light of these factors, it's no surprise

the efficacy debate continues.

Despite the many challenges to interpretability, consistencies in

the literature can be identified, and tentative conclusions can be

made by directly comparing the effects of multi-week interval

and steady-state training programs on body mass and

composition. Given the enthusiasm for interval training in both

scientific and popular media, it's somewhat surprising that these

direct comparisons are limited. In the following section, I'll

present the results of these studies. For ease of interpretation,

data on strength training or diet-only conditions will not be

reported, nor will metabolic or cardiovascular outcome data.

Studies that compared interval training to no-exercise controls or

those that combined interval with steady-state training will also

be excluded. In all studies, interval training sessions, unless

otherwise noted, included 4 to 15 work intervals performed for

15 to 240 seconds, with each repetition followed by low- to

moderate-intensity periods of active recovery for up to 4

minutes.

The Research

In perhaps the earliest direct comparison, Thomas et al1 assigned

recreationally active male and female college students to steady-

state or interval running programs matched for energy

expenditure, 500 kcal per session. Exercise bouts were

performed 3 times per week for 12 weeks. After statistically

controlling for pre-intervention differences in body composition

(assessed through hydrostatic weighing), the data revealed that

subjects in both conditions experienced a reduction in body fat

percentage. There were, however, no differences between the

exercise conditions.

Following the emergence of research by Tremblay et al,2 steady-

state endurance training as a fat loss strategy was dismissed by

many as inferior to intense but brief interval training. In this

classic study, adults with no previous exercise history completed

either a 20-week endurance training program or a 5-week

endurance training program followed by 15 weeks of interval

training bouts that varied in duration and intensity.

Heralded as a breakthrough study two decades ago, the results

appeared to demonstrate a paradoxical advantage of brief

interval work for fat loss despite an energy cost well below that

of endurance training. Although frequently noted for its finding

that subcutaneous fat loss was ninefold greater for those in the

interval condition, this estimate was made after statistically

correcting for the energy cost of each type of exercise. When

actual fat loss between the two conditions was compared, the

difference was nonsignificant. Other aspects of this heavily cited

study make firm conclusions about fat loss differences by

protocol difficult: the undetermined reliability of skinfold data,

the inclusion of an endurance training component (25 30-minute

sessions) to the interval training program, and no control for

dietary behavior.

Years after the release of this promising study, additional

evaluations of interval training began to emerge, the bulk of

which failed to demonstrate any reliable advantage of interval

training. For example, in Tjønna and colleagues' 16-week study

of metabolic syndrome patients,3 subjects exercised on inclined

treadmills, and work volume for the interval and endurance

conditions was equivalent. Both groups experienced reductions

in weight, BMI, and waist circumference, but no differences

between the groups were observed.

Trapp et al4 compared fat loss outcomes of a 20-minute interval

program and a 40-minute steady-state program, both performed


by young adult women on cycle ergometers 3 times per week for

15 weeks. Despite the difference in duration of exercise bouts,

estimated energy expenditure over the study period for the two

groups was equivalent. This was achieved by having subjects in

the interval condition perform 60 8-second intervals, followed

by 12-second recovery periods, in each session. The interval

training group, but not the steady-state group, experienced a

reduction in DEXA-measured fat mass (~2.5 kg) at the

completion of the study. This apparent intervention effect must,

however, be interpreted with caution due to pre-existing group

differences. At the beginning of the study, the mean fat mass for

the interval group was 3.8 kg greater than that of the steady-state

group, and follow-up analyses revealed that approximately of

the variance in fat loss was accounted for by level of body fat at

the beginning of the study.

Schjerve et al5 compared fat loss responses in obese adults to 12

weeks of interval or steady-state treadmill training performed 3

times per week. Conditions were equalized for energy

expenditure. Both groups experienced similarly small but

significant reductions in weight, BMI, and body fat percentage.

There were no differences between the conditions in these

outcomes.

Wallman et al6 examined the effects of 8 weeks of interval or

steady-state training performed by overweight and obese men

and women 4 times per week on a cycle ergometer. Energy

expenditure between the two conditions was equivalent. The

results yielded nonsignificant reductions in weight or fat mass

for both conditions.

Perhaps the greatest support for fat loss benefits of interval

training comes from MacPherson et al.7 In this study,

recreationally athletic college-aged men and women performed 3

weekly sessions of sprint interval training or steady-state

running for 6 weeks. Both groups experienced significant

reductions in body fat percentage and fat mass, as well as small

increases in lean mass. Although the interval group experienced

a larger total decrease in fat mass (1.7 kg vs. 0.8 kg), the

difference between the conditions was nonsignificant. In contrast

to the methods used in the aforementioned studies, MacPherson

et al. did not attempt to equalize work or energy expenditure,

which makes the difference in total exercise time across the

study period (13.5 and 0.75 hours for the steady-state and

interval conditions, respectively) noteworthy. Nevertheless,

subjects in the interval condition were encouraged to engage in

active rest on the treadmill for 4 minutes following each of

their maximal effort sprints, which resulted in a total activity

time commitment of 6.75 hours.
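To put the MacPherson et al. time commitments on a per-session footing, the totals above can be divided by the number of sessions (3 per week for 6 weeks). This is only a rough sketch: it assumes every session was the same length, which the text does not state, so the results are averages. The function name is illustrative.

```python
# Average minutes per session implied by the study totals quoted in the text.
# Assumption: sessions were of equal length (the results are averages only).

SESSIONS = 3 * 6  # 3 sessions per week for 6 weeks


def minutes_per_session(total_hours: float, sessions: int = SESSIONS) -> float:
    """Convert a total time in hours into average minutes per session."""
    return total_hours * 60 / sessions


steady_state = minutes_per_session(13.5)        # steady-state running
sprint_only = minutes_per_session(0.75)         # time spent sprinting
sprint_with_rest = minutes_per_session(6.75)    # sprints plus active rest

print(steady_state, sprint_only, sprint_with_rest)  # 45.0 2.5 22.5
```

On these averages, the interval group spent about half as long on the treadmill per session as the steady-state group, not the eighteenfold difference that the sprint-only figure suggests.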

Recently, Keating et al8 compared fat loss outcomes for

overweight adults randomly assigned to either an interval or

steady-state cycle ergometer program. Both groups performed

exercise 3 days per week for 12 weeks. There was a significant

decrease in DEXA-measured body fat percentage for the steady-

state (-2.6%) but not the interval (-0.3%) group. The absence of

change for the interval group is somewhat unexpected, given

that the aforementioned studies found equivalent effects for the

two types of training. The authors indicated that this result may

be partially explained by the use of interval training bouts that,

to protect this clinical population, were less intense than those

used in previous studies. However, a comparison of protocols

shows the intensity of interval training for the Keating et al.

subjects (~120% of VO2peak and ~90% of maximal heart rate)

was consistent with those used in other studies (e.g., Schjerve,

Tjønna, Wallman and their colleagues) of overweight or obese

subjects.

An alternative explanation is that unmeasured subject variables

contribute to responsivity to exercise. Graphs from the Keating

et al. study show considerable within-group variability for both

exercise groups in body fat percentage change. In fact, some

subjects in both conditions actually gained body fat. This

highlights the importance of going beyond the aggregate data to

search for individual differences that distinguish responders

from non-responders.

Conclusion

Collectively, the data reveal that interval training offers no

reliable advantage over steady-state endurance training for fat

loss. In addition, the effectiveness of interval training is more

likely to be demonstrated when work or energy expenditure is

matched to that of steady-state protocols. This suggests that, in

spite of any acute metabolic or cardiovascular benefits of

interval training, intense but brief exercise is insufficient for

stimulating meaningful fat loss. This was indirectly highlighted

in Boutcher's recent review of research in this area.9 Of the 6

interval training studies in which fat loss outcomes were

identified, the two cited (Boudou et al,10 Mourier et al11) for

demonstrating the strongest effects included 2 days per week of

45-minute steady-state training bouts in a program with only one

interval-training day each week.

Regarding application, assuming energy intake is regulated,

activity-based fat loss programs should prioritize energy cost of

exercise and activity preference. For athletes already involved in

frequent and intense sport-specific training, activities that have a

negative impact on quality of practice and competitive

performance should be avoided. If interval training results in

poor program compliance, fatigue, overeating, and reduced daily

activity, alternative strategies should be explored. For aesthetic

athletes, a realistic fat-loss strategy might involve small dietary

changes combined with low- to moderate-intensity exercise,

such as uphill walking at a comfortable pace, performed for an


extended duration. In sum, although intense interval training has

value to the athlete, it may not be the best option for fat loss. In

the larger context of athletic training, a moderate, comfortable

approach offers the greatest chance for success.

____________________________________________________

Joel Minden, Ph.D., CSCS, is a lecturer in the psychology and kinesiology departments at California State University, Chico. He writes about strength and conditioning, nutrition, sport psychology, and dance for his website www.joelminden.com.

____________________________________________________

References

1. Thomas, T. R., Adeniran, S. B., & Etheridge, G. L. (1984). Effects of different running programs on VO2 max, percent fat,

and plasma lipids. Canadian Journal of Applied Sport Sciences,

9(2), 55-62. [PubMed]

2. Tremblay, A., Simoneau, J. A., & Bouchard, C. (1994). Impact of exercise intensity on body fatness and skeletal muscle metabolism. Metabolism, 43(7), 814-818. [PubMed]

3. Tjønna, A. E., Lee, S. J., Rognmo, Ø., Stølen, T. O., Bye, A., Haram, P. M., Loennechen, J. P., Al-Share, Q. Y., Skogvoll, E., Slørdahl, S. A., Kemi, O. J., Najjar, S. M., & Wisløff, U. (2008). Aerobic interval training versus continuous moderate exercise as a treatment for the metabolic syndrome: a pilot study. Circulation, 118(4), 346-354. [PubMed]

4. Trapp, E. G., Chisholm, D. J., Freund, J., & Boutcher, S. H. (2008). The effects of high-intensity intermittent exercise training on fat loss and fasting insulin levels of young women. International Journal of Obesity, 32(4), 684-691. [PubMed]

5. Schjerve, I. E., Tyldum, G. A., Tjønna, A. E., Stølen, T., Loennechen, J. P., Hansen, H. E., Haram, P. M., Heinrich, G., Bye, A., Najjar, S. M., Smith, G. L., Slørdahl, S. A., Kemi, O. J., & Wisløff, U. (2008). Both aerobic endurance and strength training programmes improve cardiovascular health in obese adults. Clinical Science, 115(9), 283-293. [PubMed]

6. Wallman, K., Plant, L. A., Rakimov, B., & Maiorana, A. J. (2009). The effects of two modes of exercise on aerobic fitness and fat mass in an overweight population. Research in Sports Medicine, 17(3), 156-170. [PubMed]

7. Macpherson, R. E., Hazell, T. J., Olver, T. D., Paterson, D. H., & Lemon, P. W. (2011). Run sprint interval training improves aerobic performance but not maximal cardiac output. Medicine and Science in Sports & Exercise, 43(1), 115-22. [PubMed]

8. Keating, S. E., Machan, E. A., O'Connor, H. T., Gerofi, J. A., Sainsbury, A., Caterson, I. D., & Johnson, N. A. (2014).

Continuous exercise but not high Intensity interval training

improves fat distribution in overweight adults. Journal of

Obesity, 2014. [Journal of Obesity]

9. Boutcher, S. H. (2010). High-intensity intermittent exercise and fat loss. Journal of Obesity, 2011. [Journal of Obesity]

10. Boudou, P., Sobngwi, E., Mauvais-Jarvis, F., Vexiau, P., & Gautier, J. F. (2003). Absence of exercise-induced variations in

adiponectin levels despite decreased abdominal adiposity and

improved insulin sensitivity in type 2 diabetic men. European

Journal of Endocrinology, 149(5), 421-424. [PubMed]

11. Mourier, A., Gautier, J. F., De Kerviler, E., Bigard, A. X., Villette, J. M., Garnier, J. P., Duvallet, A., Guezennec, C. Y., &

Cathelineau, G. (1997). Mobilization of visceral adipose tissue

related to the improvement in insulin sensitivity in response to

physical training in NIDDM: effects of branched-chain amino

acid supplements. Diabetes Care, 20(3), 385-391. [PubMed]


How to manipulate research.

By James Heathers

_________________________________________________________________

Most of the audience for this article probably pays attention to

the broader scientific literature in exercise and musculoskeletal

physiology, strength and conditioning, nutrition, dietetics and

sports medicine. From this, you take the available evidence and

you slot it somewhere into an available framework of what's

already known. This, everyone is familiar with.

What people generally don't know is how to cheat.

Yes, cheat. Let me outline why you would: the way academic funding presently works is that, in general, output is rewarded over insight: two papers are better than one. So, the more you write, the better off you're going to be. There are many problems with this, and the environment it creates. One of those problems is that it becomes very tempting to 'massage' results from different research projects in order to achieve reportable outcomes. I should mention here that the majority of the time this isn't actually dishonesty; it's the fact that researchers have convinced themselves that they've asked a good question, and that if they just change a few key variables with the analysis and reporting, suddenly they'll have the result that they know is there. And when that result turns up, it was because the initial analysis was wrong.

Unfortunately, science doesn't work like that. Much of my academic work is in dealing with problems surrounding this issue; I am a methodologist. This means I concentrate heavily on how research should be conducted: essentially, research into research. Methodologists develop new techniques in analysis, and verify that old ones work in the manner we hope they do.

Think of the production of knowledge via academic outcomes as a game of poker. Research, like poker, is an expensive, stochastic process full of frustration, late nights, and alcohol. But, also like poker, if you're good enough, the balance of probabilities eventually favours you winning. Insight, like money, is hard won.

This process comes to a scrunching halt when someone starts to

obscure the honest truth of what happened in a study, because

there is no skill or reason that can be applied. You literally can't win, because the odds of something being supportable or

repeatable are being manipulated.

However, just like poker, there are 'tells': certain signs which allow you to detect another process at work. This is a partial list of those tells, illustrated with examples drawn liberally from the medical and social sciences. I've tried to use exercise science and nutrition studies where convenient, but the principles are the same regardless; often I've simply chosen the most convenient examples that have come to mind. Please bear in mind I don't think these papers are guilty of any kind of conscious dishonesty; they are merely convenient examples of the principles involved.

This list is not comprehensive and is in no particular order: some of them work over time across different papers, some of

them are specific to individual papers. These errors are both

common and uncommon, both serious and trivial. They have

various degrees of culpability (likely intent to deceive),

significance (the ability to influence the outcome of the study

overall), and detectability (how easy it is to spot from the article

text). All have the potential to be dishonest.

1. Altered endpoints, timepoints or measurement criteria

Murphy et al1 investigated the effect of beetroot consumption on running performance, on the basis of its nitrate content. Eleven participants (n=11) received a supplement of either beetroot (standardised to contain 500mg nitrates) or cranberry puree, in a double-blind cross-over fashion. Their heart rate, perceived exertion and time to completion of a 5km run were recorded. In the first mile, participants rated their perceived exertion significantly higher in the cranberry condition. In the final 1.8km, participants were significantly faster in the beetroot condition.

Why was a 5km run broken into miles (i.e. 0-1.6km, 1.6-3.2km, 3.2-5km)?

There are an infinite number of ways to divide a time interval

into pieces. This analysis could have been performed as single

kilometre intervals, or using a simple statistical model which

predicts the overall effect of time through the race on exertion,

and the overall difference between the groups by time. There is

no reason to use a unit of measurement invented by the ancient

Romans and formally defined in 1593.

Researchers are well aware that trying different assortments of time intervals can uncover differences between timepoints due to random variation. Say we split the data into 100m intervals: there are now 50 separate comparisons over the 5km where we can analyse the difference between our Beetroot and Cranberry groups. We are essentially making so many comparisons that at least one is likely to come out significant simply due to the noise present in the measurement.

(Of course, there are methods for statistically controlling multiple comparisons,2 but researchers don't always report all the comparisons they ran. When they don't, the reader has no way of knowing that multiple comparisons need to be controlled for.)
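The arithmetic behind this point can be sketched in a few lines. The example below assumes, purely for illustration, that each comparison is independent and tested at a threshold of 0.05. Neither assumption comes from the Murphy et al. paper, and adjacent race segments are unlikely to be truly independent. It also shows the Bonferroni correction, one common method of controlling multiple comparisons, not necessarily the one cited above.

```python
# Chance of at least one false positive when a race is cut into more segments.
# Assumptions (illustrative only): independent tests, alpha = 0.05 per test.


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of one or more false positives across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


# 3 mile-based segments, 5 one-kilometre segments, 50 hundred-metre segments
for n_segments in (3, 5, 50):
    rate = familywise_error_rate(n_segments)
    print(f"{n_segments:>2} comparisons: P(at least one false positive) = {rate:.3f}, "
          f"Bonferroni threshold = {bonferroni_alpha(n_segments):.4f}")
```

Under these assumptions, 50 comparisons give better than a 90% chance of at least one spurious "significant" segment, even when the supplement does nothing at all.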

The other extreme is also a problem. Say we analyse the dataset

only over the whole 5km, but beetroot consumption improved

the finishing speed of the run. This would be a highly significant

finding, as we know that in middle/long distance races there is

already a pattern between laps or race phases (e.g., Tucker et

al3).

Culpability: low to medium

Significance: medium

Detectability: high

2. Conveniently one-sided significance testing

3. Methodological fiddles

(These are not always associated, but they've been so neatly combined in a paper from a few years ago that I've put them together here.)

Christian et al4 enrolled n=279 in a computer-support program

for weight loss at an American public hospital. All participants


were suffering from metabolic syndrome. Participants were

given a full health and blood screening 12 months apart, and

were assigned to either a computer-based tailored lifestyle

intervention or a standard package of information on weight

management. Participants were more likely to lose weight in the

intervention vs. control group (-3.3lbs vs 0.33lbs, p=0.002).

Participants who lost more than 10% BW had lower total

cholesterol (-14.9 vs -3.9, p=0.05) which appears to be driven by

the loss of LDL cholesterol (-14.0 vs. -4.1, p=0.04).

Why was the outcome of the program determined by a group of people who lost 5% of bodyweight, which included members of BOTH the intervention and the control group?

This is the methodological fiddle: there were n=46 participants who lost more than 10% BW, and n=11 of them were from the control group (and thus n=35 from the intervention). These were lumped together to create the impression that the program was effective. This is hardly the most honest conclusion when about a quarter of the people with significant weight loss were from the control group; it would be more accurate to say that people with metabolic syndrome who lose weight improve their serum lipids regardless of how they do it. This is hardly evidence in favour of the intervention.
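The "about a quarter" figure follows directly from the counts quoted above. A minimal check, using only the numbers in the text (the variable names are illustrative):

```python
# Composition of the pooled "lost weight" group, from the counts in the text.

losers_total = 46      # participants who met the weight-loss criterion
losers_control = 11    # of those, the number from the control group
losers_intervention = losers_total - losers_control

control_share = losers_control / losers_total

print(f"intervention: {losers_intervention}, control: {losers_control}")
print(f"control share of the pooled group: {control_share:.1%}")  # 23.9%
```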

The other fiddle is staring us in the face from the p-values above.

Why was the above difference assessed with a one-sided t-test?

As we're all probably aware here, the p-value is the calculated probability of getting the observed result (or a more extreme one) if the null hypothesis is true. In this case, the null hypothesis is that the standard intervention and the computer-based intervention were identical. When a result is sufficiently unlikely to have occurred on this basis, we accept that the experimental hypothesis is true; in other words, that our intervention has actually intervened.

One-tailed statistical tests assume that the effect will have a direction (i.e. A will be higher than B). For an effect in the predicted direction, the one-tailed p-value is half the two-tailed p-value, which gives you twice the statistical leeway you would otherwise have.
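To see how much leeway a one-tailed test buys, here is a small sketch comparing the two p-values for the same test statistic. For simplicity it uses a z statistic and the standard normal distribution from Python's standard library, not the t-test used in the paper. That substitution is an assumption made to keep the example self-contained, and the value 1.8 is invented for illustration.

```python
from statistics import NormalDist


def p_values(z: float) -> tuple[float, float]:
    """Return (one_tailed, two_tailed) p-values for a z statistic.

    The one-tailed value assumes the effect was predicted to be positive.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    one_tailed = 1.0 - std_normal.cdf(z)
    two_tailed = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return one_tailed, two_tailed


# A hypothetical result that "hasn't quite made it" under a two-tailed test
one, two = p_values(1.8)
print(f"one-tailed p = {one:.3f}, two-tailed p = {two:.3f}")
```

With z = 1.8, the two-tailed p-value sits above 0.05 and the one-tailed value sits below it. The data are identical, but the headline changes.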

There are a few situations where one-tailed tests are justified. Firstly, when we have a strong directional hypothesis: good evidence that our intervention should be better than the control group. In this case, we do not: the researchers mention previous work with the same intervention being only somewhat effective in a diabetic sample. Secondly, when we are using very few t-tests to compare different values. In this case, we are not: the researchers have around twenty individual tests.

However, these are not hard and fast rules, and researchers often have another rule of thumb which simply goes like this: one-tailed tests are what you use when you're trying to get something to achieve a criterion of significance when it hasn't quite made it. They have traditionally found refuge in questionable results, and as we've just discussed, they're being used here to assess the difference between 'did lose weight' and 'didn't lose weight' regardless of group. A classic fiddle, and one the reviewers really should have spotted.

Conveniently one-sided tests.

Culpability: medium to high

Significance: low to medium

Detectability: high

Methodological fiddles

Culpability: medium


Detectability: medium to high

4. Overly complicated or uninterpretable models

Another rather impressive looking technique is to take individual

measures which are quite complicated and roll them into a

model far more complicated than the average reader can

understand. Social scientists try this much more than exercise

physiologists, in my experience. But it does occur.

A recent paper5 studied the split times of 2 world-record

marathon runs, most recently Patrick Makau's Berlin Marathon (2011) which was a scarcely believable 2 hours, 3 mins and 38

seconds. It describes several different curve fits possible to these

runs, combines headwind and gradient data with individual

kilometre time splits, and tries to find an optimal model or

pacing strategy.

Towards the conclusion it states:

"Oscillations at the micro-level overlay low-frequency, macro-level oscillations or modes indicating that an athlete's resulting pacing trace represents a potentially complex amalgam of numerous signalling processes emanating from the brain, each with their own activation frequency."

Of course, concluding that the best ever marathon times are employing highly sub-optimal pacing strategies seems wildly implausible. Because of the extraordinary amount of competition over such a long period of time, one might assume that either a) the best times ever were, in fact, fairly well paced by definition or b) that an optimal strategy doesn't exist due to individual differences that are impossible to predict (a stubbed toe, a very slightly tight hamstring, a bad night's sleep, a micro-change in gradient, and so on). An optimal pacing strategy can't be followed, of course, if it's highly impractical. That is essentially stating 'If X was possible, then it would be better' in an environment where X can't practically be performed.

Culpability: low


Detectability: high

5. Over-testing, a.k.a. random sifting

I've thought long and hard about how to get you an example of this, and I'm not sure I can. Here's how random sifting works:

We decide to measure the effect of a new training regime of volume squats on short-course track times.

We assign 30 experienced middle-distance runners equally to three groups: no extra training, 1 extra training day of squats per fortnight, and 3 extra training days per fortnight.

We take demographic variables to start with (age, gender, race), anthropometry (height, weight, BMI,

body composition), bloods (c-reactive protein, cortisol)

and training readiness (neurological assessment, heart

rate variability).

We take race variables (400m time, 3 times w. 5 mins rest between races), and 1500m time (with lap split

times). Participants rate perceived effort and

pain/soreness after each race. Then we run the program

for 6 weeks, test all the above again (mid-line) and test

again at 12 weeks. Naturally, we record the poundages

moved in each session for the two training groups.

Not the worst design ever, right? Comprehensive, detailed?

Wrong. It's dreadful.

This is the most unholy octopus of impossible interlocking variables you'll ever see. Any one of the above can be used to control for, or combine with, any other. Variables you add to a study are not additive: if I measure seven things at one timepoint, I don't have seven potential comparisons in the data. I have instead any combination of the presence or absence of those variables, using a cut-off that I define (or choose from the literature), or using the top or bottom standard deviation, or all the values over the mean (or the median) to define groups.
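A rough count shows how quickly this grows. The sketch below enumerates every non-empty combination of seven variables, then multiplies by a number of grouping rules. The figure of five rules is a loose, hypothetical tally of the options listed above (own cut-off, literature cut-off, top or bottom standard deviation, above the mean, above the median), and the count assumes just one rule is applied per analysis.

```python
from itertools import combinations


def count_variable_subsets(n_variables: int) -> int:
    """Number of non-empty subsets of the measured variables."""
    return sum(
        1
        for size in range(1, n_variables + 1)
        for _ in combinations(range(n_variables), size)
    )


def count_analysis_paths(n_variables: int, n_grouping_rules: int) -> int:
    """Variable subsets multiplied by the ways of splitting subjects into groups."""
    return count_variable_subsets(n_variables) * n_grouping_rules


print(count_variable_subsets(7))    # 127 combinations, not 7
print(count_analysis_paths(7, 5))   # 635 with five hypothetical grouping rules
```

This is a lower bound of sorts: it ignores multiple timepoints, multiple outcomes, and the option of mixing grouping rules within one analysis.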

With full access to the above information in my hypothetical study, there are so many ways you can find to combine the outcomes that the answers you find are bordering on meaningless, unless the results are statistically very strong. Make no mistake: if I had the above dataset, I am 100% confident that I could produce a set of statistical analyses which appeared to show conclusively that our squat intervention was effective. Even if our squat intervention did literally nothing or even made performance worse.
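This claim can be illustrated with a simulation. The sketch below is a simplified stand-in for the hypothetical squat study, not a reanalysis of any real data. It generates 200 outcome measures in which the squat groups (20 runners pooled) and the control group (10 runners) are drawn from the same distribution, so any "effect" is noise by construction. It then counts how many outcomes pass a conventional two-sample t-test at the 5% level. The number of outcomes, the seed, and the normal distribution are all assumptions chosen for illustration.

```python
import random
from statistics import mean, variance

# Two-sided 5% critical value of Student's t with 28 degrees of freedom
T_CRITICAL_DF28 = 2.048


def pooled_t(a: list, b: list) -> float:
    """Equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def count_false_positives(n_outcomes: int = 200, seed: int = 1) -> int:
    """Count 'significant' differences when the intervention does nothing."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_outcomes):
        control = [rng.gauss(0, 1) for _ in range(10)]
        squats = [rng.gauss(0, 1) for _ in range(20)]  # same distribution: no effect
        if abs(pooled_t(control, squats)) > T_CRITICAL_DF28:
            hits += 1
    return hits


print(count_false_positives(), "of 200 null outcomes look 'significant'")
```

On average about 5% of the null outcomes (roughly 10 of 200) will clear the threshold. Report only those and discard the rest, and an intervention that does nothing appears to work.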

The only trick is to hide all the analyses that didn't work, then write up the one analysis which worked by pure chance as being predicted by specific research questions that we started with. This is formally called post-hoc reasoning and is very hard to detect. After you test hundreds or thousands of pathways through the above variables and find that, say, any squat intervention (1 or 3 sessions per fortnight) is effective on split times in 1500m but not total times, and reduces perceived effort but only in men, you then come up with a reason which specifically addresses why you might find this (and you choose past literature to reference accordingly).

The behavioural economist and statistical guru Uri Simonsohn has a now-classic paper which 'conclusively proves' that listening to the song "When I'm Sixty-Four" actually makes you younger.6 Obviously, this is a crazy conclusion because a song can't modify your age, but it is borne out by the analysis that he conducted, simply by hiding all the analyses which didn't work.

He also has a great statement that he encourages reviewers to

send to every paper they peer-review which goes like this:

"I request that the authors add a statement to the paper confirming whether, for all experiments, they have reported all measures, conditions, data exclusions, and how they determined their sample sizes."

In other words, if the researchers have tested hundreds or

thousands of models trying to find a result, they need to report

the fact that they did so. This statement forces researchers to

either a) assent to the statement and upgrade their untrustworthy

analysis to outright fraud or b) admit that over-testing occurred.

The best way of controlling for this rather insidious and hard-to-detect method is study pre-registration: this is where the researchers write and publish a formal prediction of their study outcome before they start the research. It's not a perfect solution, but it's much better than the alternative.

Culpability: medium to high

Significance: medium to high

Detectability: low

6. The creeping over-extrapolation

This fiddle is a little different to the others, as it involves the

external perception of the study. It's also very common, so common that it took me about 45 seconds to find this example.

The science journalism site sciencealert.com.au ran this rather

bold headline a month or so ago.

Depression can be detected with a blood test

Interesting, right? Here's the subheadline, now that it has your attention.

Doctors may soon be able to diagnose mental illness

with a simple blood test, new research suggests.

Sounds like a breakthrough, right? Not so fast. The title of the

article it's describing is:

"Platelet Serotonin Transporter Function Predicts Default-Mode Network Activity"7

Here's the glossy and rather tortured logic that connects them:

The serotonin transporter protein removes serotonin from

extracellular space. The main method of this is via the

transporter protein on blood platelets. There is also a good

relationship between this platelet uptake and the synaptosomal

uptake (the uptake by areas of the brain).

Separately, there is a relationship between depression and the

activity of the default-mode network in the brain, a coordinated system of activity which is active at rest and seems

to be implicated with receiving and processing information

which is self-referential. It is hypothesised that this network is

disrupted in depressed people, which is the hypothetical source

of intrusive thoughts and poor concentration in depression.

Finally, we know that serotonin is implicated in depression as

serotonin reuptake inhibitors are a frontline treatment for


depression. That is to say, like most psychotropic medication,

they work sometimes in some people. We also are well aware

that while they fairly straightforwardly increase free serotonin

levels, this is probably NOT their primary method of action

(otherwise, why would these drugs which raise serotonin in 20

minutes take weeks to start improving mood in depressed

patients?)

But anyway: if we can measure the blood platelet serotonin

reuptake velocity (related to the same function in the brain), it

might be related to the metabolic activity of the brain by the

default-mode network (impaired in depression; serotonin

implicated in function).

So the researchers took a sample of healthy people and found a reasonable relationship between their blood platelet serotonin

uptake and the function of the default-mode network as

measured by blood-oxygen level dependent fMRI scan.

And finally, please recognise that the above is itself a simplification.

This is what gives us "depression detected with a blood test".

I understand, of course, that journalism sensationalises

complicated topics like neurobiology. But the obvious caveat to

that is that it really shouldn't simplify something so much that it isn't reasonably true anymore. And why would they do such a thing? Well, partly because it's their job, but also partly because the researchers put out a press release with exactly the same

headline, containing wonderfully compelling but detail-poor

sentences such as "serotonin transporter regulates neural depression networks".

This is creeping over-extrapolation. You start with a result

which, as far as I can tell, is a fairly solid piece of neurobiology

relating brain oxygen level uptake over certain cortical networks

to measured platelet serotonin uptake in the blood. Then you

write a paper discussion and abstract which extrapolates the

results somewhat, talking about what might be possible in future

(if several important caveats are true). About this, you write a

simplified press release which presents the results in a glowing

light and presents those extrapolations as the point of the paper.

Then you let a journalist with no formal science education write

about it.8

I'm including this as an error researchers make because it's the 21st

century, and researchers have an obligation to ensure that

their research is correctly reported. It is common for researchers

trying to justify the external impact of their work in grant applications to collect these lazy, overwhelmingly positive

stories and list them prominently on their CVs. Be cautious of

any academic who is proud of how many newspaper articles are

written about them.

Culpability: medium

Significance: medium to high

Detectability: high

7. Outlier forgetting 8. Outlier remembering.

Again, these are together because they are closely related.

Outlier remembering

Reger et al9 matched a controlled dose of medium-chain

triglyceride (MCT) oil in n=20 Alzheimer's Disease patients, to

see if the presence of blood ketones had an immediate effect on

cognition. I have reproduced the graph of the central result here

on the left; it shows that an increase in performance on a cognitive task was correlated with an increase in blood ketones. Or was it?

That point you can see on the left-hand side of the left-hand

graph with the big arrow represents a participant who performed

much more poorly on the cognitive task after MCT oil than after

placebo. This is the dead-set opposite of what was predicted, a

decrease in performance a few times bigger than the alleged increases in performance observed in other people. There's no good reason for this to happen, and it's both in the opposite of the predicted direction and dramatically in excess of everyone

else's change scores.

Now, there are several tests which determine whether or not a

value is an outlier. Some researchers simply do this by feel, but the more correct way is with a test which compares the value

to the rest of the sample. The most common version of this is

Grubbs' test,10 and this flags that value as being an outlier.

Why was the outlier left in?

When this value is removed, the level of statistical significance

drops from p=0.02 to p=0.08, and the r value (the

correlation coefficient) falls from 0.5 to 0.42. In other words, it

waters down the impact of the central finding. While it isn't actually a big difference, it does cast doubt on the central

result.11
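
A small Python sketch shows how much a single extreme point can prop up a correlation. The numbers below are invented for illustration and are not the data from Reger et al; the variable names are hypothetical too.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented data: six unremarkable participants, then one extreme point
# (very small ketone change, very large drop in score) listed last.
ketone_change = [9, 10, 11, 12, 13, 14, 0]
score_change = [2, 1, 3, 2, 3, 2, -6]

r_all = pearson_r(ketone_change, score_change)
r_trimmed = pearson_r(ketone_change[:-1], score_change[:-1])
print(f"r with the extreme point:    {r_all:.2f}")
print(f"r without the extreme point: {r_trimmed:.2f}")
```

In this toy sample the correlation is strong with the extreme point and weak without it, which is why reporting both versions of the analysis matters.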

As you can probably tell from this, outliers being included are

very easy to spot. Even when only the means and standard

deviations of numbers are reported, it's usually obvious when

something is off.

Outlier forgetting

It's hard to find an example of outlier forgetting (the removal of extreme values which disagree with the theory to improve the

central result) for the simple reason that they aren't there to find! There are some sophisticated methods you can try to determine

if there is enough variation in a sample, but until I'm writing for Alan Aragon's Statistical Review, we'll have to let these slide.

Suffice to say, this can be a real problem. If you selectively

remove values which ruin your result, it very quickly runs the

risk of becoming straightforwardly dishonest. This is why I don't

have an example of one; all I have is an example of where someone didn't do it.

There is a good recent example of this. Kogan et al12

examined the relationship between heart rate variability (HRV, the same kind we use for athletic monitoring) and depression / social functioning. They found some values which were outliers,

and repeated the analysis with outliers both out and in, and then reported the separate models. This is definitely the honest way to do business: if you're removing values, the fact that you're doing it, what the values are, and what this changed about the

analysis should ALL be reported in the paper.

Remembering:

Culpability: medium


Detectability: very high

Forgetting:

Culpability: high

Significance: high

Detectability: low

9. 'Cute' covariates

Arai et al13 looked at the inter-relationship between heart rate

variability (the same kind we use for athletic monitoring) and

QT-interval (another metric of health/autonomic outflow which

we get out of the electrocardiogram), along with a sea of possible

covariates, in n=150 young participants.

If I criticised everything in this paper which I didn't like, I would bore you more than is strictly necessary and wear my

fingers down to stumps. So let's leave criticisms like the incorrect use of the analysis of covariance to one side, and just

concentrate on what might be useful for you: how to spot

dodgy covariates.

There are several tells here. Firstly, the presence of a lot of covariates and models for a simple question. Here, nine measures of

different heart rate indices are compared with seven possible

covariates. As before with our hypothetical squat study, a lot of

possible comparisons is a red flag.

Secondly, the use of covariates which are not statistically

independent. For instance, there are models in the paper which

use BMI in the same model as body fat percentage as measured by impedance. These numbers will obviously be related, and

inter-relationships between these variables complicate our ability

to understand the study outcomes dramatically.
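
One rough way to see the problem is the variance inflation factor (VIF), which for a model with exactly two predictors reduces to 1 / (1 - r^2), where r is the correlation between them. The sketch below uses invented BMI and body fat values for six lean participants (not data from the paper), and the threshold in the comment is a common rule of thumb rather than a hard rule.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented values for illustration only.
bmi = [18.5, 19.2, 20.1, 21.0, 22.4, 23.3]
body_fat_pct = [19.0, 21.5, 22.0, 24.5, 27.0, 28.5]

r = pearson_r(bmi, body_fat_pct)
# Two-predictor case: each predictor's VIF is 1 / (1 - r^2).
# Values above roughly 5 to 10 are often treated as a warning sign.
vif = 1.0 / (1.0 - r ** 2)
print(f"r = {r:.2f}, VIF = {vif:.1f}")
```

When two covariates track each other this closely, the model cannot cleanly separate their contributions, so the individual coefficients become unstable and hard to interpret.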

Lastly, the use of broad appeals. The paper justifies adding

covariates of BMI and fat mass into the models because they were

relevant elsewhere (because obese and overweight people often

have impaired HRV, for instance). But this sample Arai et al use

is drawn from Japanese students at a school of medicine: the female sample has a mean BMI of 20.1 and a standard deviation

of 2.1. This means that of the 86 women, it is likely that only one participant, or even absolutely no participants at all, were even overweight (let alone obese). Their comparison paper was drawn

from a sample in Mexico which, as you might be aware, holds the dubious honour of being the world's most obese country.
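
The "one participant or none" estimate can be checked with a short calculation. This sketch assumes BMI in the female sample is roughly normally distributed (real BMI data tend to be somewhat right-skewed) and uses the conventional overweight cutoff of 25, which is not stated in the text above.

```python
from statistics import NormalDist

n_women = 86
bmi = NormalDist(mu=20.1, sigma=2.1)  # mean and SD reported for the female sample
overweight_cutoff = 25.0              # conventional threshold, an assumption here

# Chance that any one woman is at or above the cutoff.
p_overweight = 1.0 - bmi.cdf(overweight_cutoff)
# Expected number of overweight women in the sample.
expected_overweight = n_women * p_overweight
# Chance that not a single woman in the sample is overweight.
p_nobody = (1.0 - p_overweight) ** n_women

print(f"P(BMI >= 25) per person:   {p_overweight:.4f}")
print(f"Expected overweight women: {expected_overweight:.2f}")
print(f"P(no one is overweight):   {p_nobody:.2f}")
```

Under the normality assumption, the expected count comes out below one woman, so a covariate justified by findings in overweight and obese people has almost nothing to work with in this sample.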

Regression is a complicated topic, and it's very easy to hide dodgy techniques behind a wall of metrics and numbers.

Researchers and reviewers fail to understand the implications of what they're doing with concerning regularity.

Culpability: very high

Significance: very high

Detectability: low

10. Conflating statistical and practical effects

DeWall et al14 tested n=93 undergraduates on the Intimate

Partner Violence scale and Trait Physical Aggression scales in

two groups, who received either a placebo or an intranasal dose

of the hormone oxytocin and a priming condition where they underwent painful / stressful tasks. The paper strongly concluded

that oxytocin increased intimate partner violence inclinations in

participants who were high in trait physical aggression.

Now, this may be strictly true in the statistical sense; the results are probably calculated correctly. But does "X is mathematically different to Y" have any meaning in this context?

The Intimate Partner Violence scale is a series of charming items

where people are asked to score their likelihood of slapping,

shoving, hitting, kicking etc. their current romantic partner. It is

ranked from 1 ("not at all likely") through to 5 ("extremely likely"), and then averaged. The problem here is the whole group in the

study had an average of 1.13 (SD = 0.39).

I tried to model this, and it's impossible to predict well (remember that no scores can be below 1). Probably two-thirds

of the entire sample in ALL groups put "not at all likely" for every single possible answer. The entire sample could be driven by

some combination of a) the very few people who reported some

vague likelihood of violence, and b) the fact that some of the

groups have no mathematical variability AT ALL: everyone put the same answer. In psychology this is called a floor effect, and it has the potential to make analyses do awfully strange

things.
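
To see how lopsided a sample has to be to produce those summary statistics, here is a toy Python example. The scores are invented: it is just one of many configurations that roughly reproduce a mean of 1.13 and SD of 0.39 on a scale that cannot go below 1, and it says nothing about how the real responses were actually distributed.

```python
from statistics import mean, stdev

# Invented sample of 93 averaged scale scores (1 = "not at all likely"):
# the large majority sit exactly on the floor, a handful sit well above it.
scores = [1.0] * 84 + [2.3] * 9

print(f"mean = {mean(scores):.2f}, SD = {stdev(scores):.2f}")
print(f"share at the floor = {scores.count(1.0) / len(scores):.0%}")
```

In a sample shaped like this, any group difference is carried almost entirely by a few individuals, which is exactly the situation in which standard analyses start to behave strangely.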

As this is a social science example, let's cast the same scenario into a hypothetical exercise science study:

Say we have a new supplement which is designed to decrease

post-exercise pain. N=80 participants firstly take either our

supplement or a matched placebo, then all perform a high-

volume high-intensity deadlift program, doing sets of 85% 1RM

to concentric failure, and then 80%, 75%, etc. until total


concentric failure with 50% 1RM is reached. They then rate their

lower back and hamstring pain 48 hours after exercise. More or

less everyone writes 10 ("I am in the maximum amount of imaginable exercise discomfort"), mainly because this is an insane protocol which shouldn't be attempted. But a few people in our supplement group write 8 ("I am in a very, very large amount of pain").

Now, can we accept that this is a meaningful difference? Well,

with hundreds and hundreds of participants, maybe. But it is far

more likely to be semantic: we hurt our participants very badly and seem to only be fiddling at the margins of the value of

interest. What we were looking for was the absence of pain, and not the presence of very slightly less.

Our domestic violence questionnaire is the other way around: statistical significance or not, the change from, say, "extremely unlikely" to "quite unlikely" may not be particularly useful at telling us about actual aggressive tendencies.

Culpability: high


Detectability: high

Bonus: Making up data

I have to include this although it isn't really a manipulation in the way other things are: it's fraud! Shang and Hasenberg15 investigated the effect of exercise training subsequent to

Roux-en-Y gastric bypass (i.e. stomach surgery). N=60 morbidly

obese participants were randomised to receive either once or

twice-weekly exercise training. Significantly more body weight

and fat mass was lost in the multiple-exercise group, who also

showed significant improvement in co-morbidities.

The problem here is that none of this actually happened.

Someone from either the hospital or associated research group

noticed that in the location where the data was reported from

only n=21 patients had actually undergone any procedure at all

in the period the paper was written over; the data, as it stood, couldn't exist! On questioning, Dr. Shang couldn't produce any of the raw data and had no answer for where it had come from.

Naturally, this paper has been retracted.

Culpability: very high

Significance: very high

Detectability: very low

Conclusions:

Please keep in mind firstly that researchers aren't science-robots from an alternate dimension; they're people. They're people with children and mortgages, and research programs which have to work out so they can continue to be funded, in highly

competitive jobs, often competing against people who are

willing to bend publication requirements to look better.

Research isn't by any means a hotbed of fraud and deceit.

That being said, researchers, even from famous and venerable institutions, can also be stunningly ignorant of the sub-structure

of the research methodology they need to understand, can make

basic mistakes in analysis, can deceive themselves, and can

cheat, manipulate or defraud the process of producing scientific

knowledge.

The thing that we have in our favour in trying to ascertain the

presence of the above is that science is the pursuit of knowledge

on the public record. Anything that's fiddled, or dishonest, or under-handed, or incorrect, can only ever be hidden in plain

sight, and in general the ideas that everyone agrees are the most

important receive the most scrutiny. This might sound laudable,

but it is anything but straightforward. Progress lurches along

quite slowly. There are a few things that you, the interested

reader (or perhaps peer-reviewer) can do to help, and to satisfy

your own curiosity.

1. Contact the researchers. Ask for data.

Researchers, in general, like to talk about their work. Generally

the person who is on the paper as the corresponding author is the right person to ask about it. However, be aware when the last author in the list of authors is listed as corresponding: this generally means the most senior person on the project is also the

person you're contacting, who is also often the busiest.

(In situations like this, I generally Google the first author and

ask them if they can help...)

Researchers can be notoriously precious about sending their data

to other people. This isn't just because they're afraid of scrutiny or persecution (they often are). It's also because data files can be a complete mess after the completion of a study, in three

different files (with different versions) only comprehensible to a

co-author, and squirreled away on a university server with a

password known only to the research assistant who quit 9

months ago. What you're asking could represent a big investment of time on the part of the researchers. But you can

always ask.

2. Support efforts to put data in the public domain

This is a big component of what's called open science, the trend towards publishing datasets with experiments, as well as

analytical tools etc. that are used. Remember that people who do

this are extending what until now has been a privilege, which is

the ability to look under the hood of how a study works. I feel strongly that researchers who publish data earn an extra degree

of trust.

3. Post on PubPeer or PubMed Commons

These are both websites where you can leave comments for the

public record on published research. If you want answers for

questions that you have, they are very useful. To get access, I

believe you need either an academic email address (i.e. one from

a tertiary institution) or an invitation from an existing user.

4. Start a conversation

A few years ago, I was very amused when Alan was arguing

with Dr. Robert Lustig of "sugar is evil" fame, and was told


rather huffily that academics do not have head-to-head

confrontations on blogs, social media, forums, etc. I was amused

because they damn well do, all the time, and at great volume. There are plenty of outlets for legitimate questions about

research which aren't the old, formal methods: if you know someone with a public blog, ask them to start a conversation for

you. Or start one yourself. Invite the researchers to comment.

Remember with all of the above to be courteous and show

interest, rather than trying to storm the ramparts. Everyone is

looking for answers, but some are looking better than others.

____________________________________________________

James is just about to finish a

PhD in cardiac electrophysiology.

In his spare time, he breaks

things for money. Everything else

you need to know is here:

jamesheathers.com

____________________________________________________

References:

1. Murphy, M., K. Eliot, et al. (2012). "Whole beetroot consumption acutely improves running performance." J Acad

Nutr Diet 112(4): 548-552. [PubMed]

2. http://en.wikipedia.org/wiki/Bonferroni_correction

3. Tucker, R., M. I. Lambert, et al. (2006). "An analysis of

pacing strategies during men's world-record performances in

track athletics." Int J Sports Physiol Perform 1(3): 233-245.

[PubMed]

4. Christian, J. G., T. E. Byers, et al. (2011). "A computer support program that helps clinicians provide patients with

metabolic syndrome tailored counseling to promote weight

loss." J Am Diet Assoc 111(1): 75-83. [PubMed]

5. Angus, S. D. (2014). "Did recent world record marathon runners employ optimal pacing strategies?" J Sports Sci

32(1): 31-45. [PubMed]

6. Simmons, J. P., L. D. Nelson, et al. (2011). "False-positive psychology: undisclosed flexibility in data collection and

analysis allows presenting anything as significant." Psychol

Sci 22(11): 1359-1366. [PubMed]

7. Scharinger, C., U. Rabl, et al. (2014). "Platelet serotonin transporter function predicts default-mode network activity."

PLoS One 9(3): e92543. [PubMed]

8. And then someone who doesn't even understand the journalism uses it in an argument on the internet!

9. Reger, M. A., S. T. Henderson, et al. (2004). "Effects of beta-hydroxybutyrate on cognition in memory-impaired adults."

Neurobiol Aging 25(3): 311-314. [PubMed]

10. http://en.wikipedia.org/wiki/Grubbs'_test_for_outliers

11. In statistical terminology, this is only an outlier on the x-axis and it's in the right place so technically it's a point of leverage not an outlier.

12. Kogan, A., J. Gruber, et al. (2013). "Too much of a good thing? Cardiac vagal tone's nonlinear relationship with well-

being." Emotion 13(4): 599-604. [PubMed]

13. Arai, K., Y. Nakagawa, et al. (2013). "Relationships between QT interval and heart rate variability at rest and the covariates

in healthy young adults." Auton Neurosci 173(1-2): 53-57.

[AN/BC]

14. DeWall, C.N., O. Gillath, et al. (2014). "When the Love Hormone Leads to Violence: Oxytocin Increases Intimate

Partner Violence Inclinations Among High Trait Aggressive

People." Soc Psych Pers Sci, Published online Feb 12th. [SPPS]

15. Shang, E. and T. Hasenberg (2010). "Aerobic endurance training improves weight loss, body composition, and co-

morbidities in patients after laparoscopic Roux-en-Y gastric

bypass." Surg Obes Relat Dis 6(3): 260-266. [PubMed]


Changes in exercises are more effective than in loading schemes to improve muscle strength [reviewed by Brad Schoenfeld, PhD, CSCS, CSPS, FNSCA].

Fonseca RM, Roschel H, Tricoli V, de Souza EO, Wilson JM,

Laurentino GC, Aihara AY, de Souza Leo AR, Ugrinowitsch C.

J Strength Cond Res. 2014 May 14. [Epub ahead of print]

[PubMed]

____________________________________________________

BACKGROUND/PURPOSE: This study investigated the

effects of varying strength exercises and/or loading scheme on

muscle cross-sectional area (CSA) and maximum strength after

four strength training loading schemes: constant intensity and

constant exercise (CICE), constant intensity and varied exercise

(CIVE), varied intensity and constant exercise (VICE), varied

intensity and varied exercise (VIVE). METHODS: Forty-nine

individuals were allocated into five groups: CICE, CIVE, VICE,

VIVE, and control group (C). Experimental groups underwent a

twice a week training for 12 weeks. Squat 1RM was assessed at

baseline and after the training period. Whole quadriceps muscle

and its heads CSA were also obtained pre- and post-training.

RESULTS: The whole quadriceps CSA increased significantly

(p


for sure that the same findings would be seen in a well-trained

population. Indeed, if the authors' hypothesis that changing the

rep range had a negative effect on neural drive is in fact correct,

it could alternatively be hypothesized that this detriment would

not occur in more experienced subjects since neural adaptations

would already be well-ingrained.

One issue that can be raised with the design is that the rep range

employed for the varied intensity groups (6-10 reps per set) was

fairly narrow. It would be difficult to imagine that changes in

muscle growth would have been significantly different using

such a narrow range over the course of a few months. What

would have been more interesting from a hypertrophy

standpoint, IMO, is if the rep range had encompassed a low

rep condition (i.e. 5 reps), moderate rep condition (10 reps) and

a high rep condition (15 reps). Based on the concept of the

strength-endurance continuum, comparing a constant intensity of

10 reps per set versus a varied intensity of 5-10-15 reps per set

would have made more sense to see if muscle hypertrophy

differs along this continuum.

Ultimately the study provides intriguing findings that have

practical implications for training. Most importantly, it

reinforces the need to vary exercise selection to maximize

muscular symmetry as well as strength. It also suggests that,

from a maximal strength standpoint, limiting variation in

intensity of load is beneficial during the early stages of training.

Ideally this study should be replicated, perhaps with wider

intervals in rep range, in well-trained subjects to provide better

generalizability for those with lifting experience.

____________________________________________________

Brad Schoenfeld, PhD, CSCS, CSPS, FNSCA, is a

lecturer in the exercise science department for

Lehman College and is the head of their

human performance laboratory. His primary

research interests focus on elucidating the

mechanisms of muscle hypertrophy and their

application to resistance training. He has

published over 40 peer-reviewed journal

articles and currently serves on the Board of

Directors for the NSCA. He is author of the

book, "The M.A.X. Muscle Plan" which is

available at all major bookstores and on

Amazon.com. He maintains an active blog on his website:

http://www.lookgreatnaked.com/


The effects of consuming a high protein diet (4.4 g/kg/d) on body composition in resistance-trained individuals.

Antonio J, Peacock CA, Ellerbroek A, Fromhoff B, Silver T. J

Int Soc Sports Nutr. 2014 May 12;11:19. [PubMed] [Full Text]

BACKGROUND: The consumption of dietary protein is important for resistance-trained individuals. It has been posited that intakes of 1.4 to 2.0 g/kg/day are needed for physically active individuals. Thus, the purpose of this investigation was to determine the effects of a very high protein diet (4.4 g/kg/d) on body composition in resistance-trained men and women. METHODS: Thirty healthy resistance-trained individuals participated in this study (mean ± SD; age: 24.1 ± 5.6 yr; height: 171.4 ± 8.8 cm; weight: 73.3 ± 11.5 kg). Subjects were randomly assigned to one of the following groups: Control (CON) or high protein (HP). The CON group was instructed to maintain the same training and dietary habits over the course of the 8 week study. The HP group was instructed to consume 4.4 grams of protein per kg body weight daily. They were also instructed to maintain the same training and dietary habits (e.g. maintain the same fat and carbohydrate intake). Body composition (Bod Pod), training volume (i.e. volume load), and food intake were determined at baseline and over the 8 week treatment period. RESULTS: The HP group consumed significantly more protein and calories pre vs post (p < 0.05). Furthermore, the HP group consumed significantly more protein and calories than the CON (p < 0.05). The HP group consumed on average 307 ± 69 grams of protein compared to 138 ± 42 in the CON. When expressed per unit body weight, the HP group consumed 4.4 ± 0.8 g/kg/d of protein versus 1.8 ± 0.4 g/kg/d in the CON. There were no changes in training volume for either group. Moreover, there were no significant changes over time or between groups for body weight, fat mass, fat free mass, or percent body fat. CONCLUSION: Consuming 5.5 times the recommended daily allowance of protein has no effect on body composition in resistance-trained individuals who otherwise maintain the same training regimen.
This is the first interventional study to demonstrate that consuming a hypercaloric high protein diet does not result in an increase in body fat. SPONSORSHIP: JA is the CEO of the International Society of Sports Nutrition. The protein powder was provided by MusclePharm and Adept Nutrition (Europa Sports Products brand); both are sponsors of the ISSN conferences.

Study strengths

A big strength of this study is the underlying concept, and the

interesting question investigated. It's one of the fun studies that pushes the "what if we tried this crazy idea" factor, examining a highly experimental and exploratory protocol. And, it happened to

yield some intriguing results. Overfeeding studies have thus far

focused on carbohydrate and/or fat,1-7 with a glaring scarcity of

studies on protein overfeeding.8 Furthermore, the majority of

overfeeding trials are short, ranging from a few days to less than

a month, whereas this one ran for 8 weeks. Subjects were resistance-trained, which minimizes the

respond-strongly-to-anything tendency of novices.

Study limitations

Air displacement plethysmography (ADP, or Bod Pod) was used

to assess body composition. A comprehensive review by Fields

et al states:9 "In conclusion, the BOD POD is a reliable and

valid technique that can quickly and safely evaluate body composition in a wide range of subject types, including those who are often difficult to measure, such as the elderly, children, and obese individuals." However, it should be noted that the majority of studies on the Bod Pod have compared it to

hydrostatic weighing. Ball and Altena10 compared Bod Pod to

dual X-ray absorptiometry (DXA) in a large sample of men

(n=160) and found that although the results from the two

methods were highly correlated, the difference increased as

bodyfat increased. Quoting their conclusion (which I feel is

hugely important):10

Practitioners should be aware that even with the use of technologically sophisticated methods (i.e., Bod Pod, DXA), differences between methods exist and the determination of body composition is at best, an estimation.

Another limitation is the questionable reliability of self-reported

dietary intake (and activity output). Research that immediately

comes to mind is Lichtman et al, who found that obese subjects

with a reported history of diet resistance under-reported food intake by an average of 47%, and over-reported physical activity

by 51%.11 In the case of the present study, there was a massive

amount of protein assigned to the experimental group (4.4 g/kg

or 307 g/day). The investigators were aware of the inherent

difficulty in carrying this out, hence their purposely uneven

randomization: 20 subjects were assigned to the high-protein

(HP) group, and 10 subjects to the control group. It's not out of the question that over-reporting occurred, since it's human nature to avoid admitting failure to fully follow the program.

Aside from the limitations inherent with self-reported intake,

there was no objective measure of energy expenditure. An

attempt to control for training volume was made via daily

journaling. This meant relying on the accuracy of the

subjects' records, instead of an objective measure of energy expenditure such as the doubly labeled water (DLW) technique.

The use of DLW has been called the gold standard of assessing energy expenditure, particularly in non-confined conditions.12

However, it's rare to see DLW used in sports nutrition studies (or most any type of research, for that matter). This is because

it's expensive and requires specifically trained personnel. Thus, we're left with open questions about how the experimental protein overfeeding affected non-exercise activity thermogenesis

(NEAT). One of the most memorable examples of DLW use

capturing the impressive extent of NEAT was in 1999 when

Levine et al13 found that the metabolic response to a 1000-kcal

surplus ranged from a 98 kcal decrease to a 692 kcal increase in

NEAT. The group's mean increase in NEAT was 336 kcal. The authors' summation is worth quoting directly:13

Thus, activation of NEAT can explain the variability of fat gain with overeating. As humans overeat, those with effective activation of NEAT can dissipate excess energy so that it is not available for storage as fat, [...] The maximum increase in NEAT that we detected (692 kcal/day, volunteer 5) could be accounted for by an increased strolling-equivalent activity of 15 min/hour during waking hours.

Comment/application

The most salient finding was the lack of significant change in

body composition in either group over the 8-week period:


Surprisingly, the HP group's body composition showed no significant changes despite the assignment of an additional 800

kcal (in protein) above and beyond that assigned to the control

group. But, unlike Levine et al's overfeeding study, the present study based overfeeding on protein exclusively. The HP group's consumption of ~307 g protein versus the control group's ~138 g without a doubt had a higher thermic effect. As reported by

Jéquier,14 the thermic effect of protein (expressed as a

percentage of energy content) is 25-30%, carbohydrate is 6-8%,

and fat is 2-3%. However, not all of the literature is in precise

agreement. Halton and Hu reported greater variability, with the

thermic effect of protein being 20-35%, carbohydrate at 5-15%,

and fat being subject to debate since some investigators found a

lower thermic effect than carbohydrate while others found no

difference.15

Despite relative variations in carbohydrate and fat,

protein has consistently shown a markedly higher thermic effect

than either of them. In combination, the thermic effect of protein

combined with a liberal presumption of NEAT, the majority of

the dissipated protein energy is accounted for. The remainder is

plausibly attributable to reporting error.
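The accounting above can be sketched as back-of-envelope arithmetic. The snippet below is only an illustration, not the study's analysis: it assumes 4 kcal per gram of protein (the standard Atwater factor, which the review does not state) and borrows the NEAT values from Levine et al's mixed overfeeding study, which may not transfer to protein-only overfeeding in trained subjects. The function names are hypothetical.

```python
# Rough accounting of the extra protein energy in the HP group.
# Assumption: 4 kcal/g protein (Atwater factor), not stated in the review.
KCAL_PER_G_PROTEIN = 4.0


def extra_protein_kcal(hp_g, control_g):
    """Energy difference between the two reported protein intakes."""
    return (hp_g - control_g) * KCAL_PER_G_PROTEIN


def dissipated_fraction(extra_kcal, tef_fraction, neat_kcal):
    """Share of the extra energy covered by thermic effect plus NEAT, capped at 1."""
    dissipated = extra_kcal * tef_fraction + neat_kcal
    return min(dissipated / extra_kcal, 1.0)


extra = extra_protein_kcal(307, 138)  # reported ~307 g vs ~138 g

# Thermic effect of protein: 25-30% (Jequier), 20-35% (Halton and Hu).
# NEAT: 336 kcal mean and 692 kcal maximum, borrowed from Levine et al.
for tef in (0.20, 0.25, 0.30, 0.35):
    for neat in (336, 692):
        share = dissipated_fraction(extra, tef, neat)
        print(f"TEF {tef:.0%}, NEAT {neat} kcal: {share:.0%} of {extra:.0f} kcal")
```

Under these assumptions, a mean-sized NEAT response plus the thermic effect covers most of the difference, and a NEAT response near Levine et al's maximum covers all of it, which is consistent with attributing the remainder to reporting error.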

In a recent study that made waves for being the first of its kind, Bray et al16 compared the overfeeding effects of a low-protein (5%), normal-protein (15%), and high-protein (25%) diet. Carbohydrate was kept the same across the treatments, with fat filling in the remainder. Among this study's design strengths was the use of DLW to assess energy expenditure. A 40% energy surplus (954 kcal) was imposed for 8 weeks. All groups increased fat mass equally, but the low-protein group lost lean mass while the normal- and high-protein groups gained lean mass, with the latter gaining more by a small margin. The low-protein group gained significantly less total bodyweight than the higher-protein groups, but this was due to differences in lean mass gain.

In the present study, no lean mass was gained despite an increased protein intake in the HP group. This can be attributed to the advanced resistance-trained status of the subjects (they trained an average of 8.5 hours/week for the past 8.9 years) and to their already high baseline protein intake (~1.9-2.3 g/kg). In contrast, Bray et al's subjects were untrained, and their protein intake at baseline was 1.2 g/kg, which was raised to 1.8 g/kg in the high-protein treatment, essentially crossing the threshold from sub-optimal to optimal. Another point made by the authors of the present study was that the subjects were instructed to maintain their habitual training program, thus precluding any novel or greater training stimulus that might elicit further gains.

Taking the results at face value, it almost seems surplus calories don't count since NEAT will save you as long as the surplus is from protein. However, it's worth reiterating that not all aspects of this study were tightly controlled, and reporting error could have played a significant confounding role in the results.

On the other hand, there is still the possibility that relatively advanced, resistance-trained subjects have a heightened capability of dissipating surplus energy from dietary protein through involuntary, non-exercise means. This possibility also holds potentially important implications for meal planning of dieting individuals as well as those who are striving to maintain weight loss, but also need to control appetite.

On a practical note, the following detail should be taken into consideration: "...every subject in the high protein group consumed protein powder in order to meet the requirements for the study. Otherwise, it would [have been] virtually impossible or highly unlikely that one could consume a 4.4 g/kg/d via food alone. Protein supplements (in this case, whey and casein powders) only contain trace amounts of fat and carbohydrate."

Those who want to experiment with higher protein intakes should keep in mind that inadvertent addition of fat and carbohydrate along with the extra protein (i.e., via mixed-macronutrient dishes and/or fatty meats) would mimic neither the protocol nor the effects seen in the present study.


An amino acid-electrolyte beverage may increase cellular rehydration relative to carbohydrate-electrolyte and flavored water beverages.

Tai CY, Joy JM, Falcone PH, Carson LR, Mosman MM, Straight JL, Oury SL, Mendez C, Loveridge NJ, Kim MP, Moon JR. Nutr J 2014, 13:47 doi:10.1186/1475-2891-13-47 [PubMed]

BACKGROUND: In cases of dehydration exceeding a 2% loss of body weight, athletic performance can be significantly compromised. Carbohydrate and/or electrolyte containing beverages have been effective for rehydration and recovery of performance, yet amino acid containing beverages remain unexamined. Therefore, the purpose of this study is to compare the rehydration capabilities of an electrolyte-carbohydrate (EC), electrolyte-branched chain amino acid (EA), and flavored water (FW) beverages. METHODS: Twenty men (n = 10; 26.7 +/- 4.8 years; 174.3 +/- 6.4 cm; 74.2 +/- 10.9 kg) and women (n = 10; 27.1 +/- 4.7 years; 175.3 +/- 7.9 cm; 71.0 +/- 6.5 kg) participated in this crossover study. For each trial, subjects were dehydrated, provided one of three random beverages, and monitored for the following three hours. Measurements were collected prior to and immediately after dehydration and 4 hours after dehydration (3 hours after rehydration) (AE = -2.5 +/- 0.55%; CE = -2.2 +/- 0.43%; FW = -2.5 +/- 0.62%). Measurements collected at each time point were urine volume, urine specific gravity, drink volume, and fluid retention. RESULTS: No significant differences (p > 0.05) existed between beverages for urine volume, drink volume, or fluid retention for any time-point. Treatment x time interactions existed for urine specific gravity (USG) (p < 0.05). Post hoc analysis revealed differences occurred between the FW and EA beverages (p = 0.003) and between the EC and EA beverages (p = 0.007) at 4 hours after rehydration. Wherein, EA USG returned to baseline at 4 hours post-dehydration (mean difference from pre to 4 hours post-dehydration = -0.0002; p > 0.05) while both EC (-0.0067) and FW (-0.0051) continued to produce dilute urine and failed to return to baseline at the same time-point (p < 0.05). CONCLUSION: Because no differences existed for fluid retention, urine or drink volume at any time point, yet USG returned to baseline during the EA trial, an EA supplement may enhance cellular rehydration rate compared to an EC or FW beverage in healthy men and women after acute dehydration of around 2% body mass loss. SPONSORSHIP: MusclePharm Corporation.

Study strengths

This study is innovative since it's the first to compare the hydrating effects of a BCAA-electrolyte (AE) beverage with that of a carbohydrate-electrolyte (CE) beverage. Furthermore, the protocol involved a more realistic fluid dose than the typically massive fluid doses given in previous research examining rehydration beverages. Subjects were required to have a minimum of one year of endurance and resistance training experience, which minimized the chance of confounding "newbie" effects. This investigation is of relevance to trainees aiming to economize caloric intake, which is often hiked by the carbohydrate content of conventional recovery beverages.

Study limitations

While this study may have relevance to those seeking to economize carbohydrate intake, such a population would be very sparse among trainees seeking to improve endurance-type performance. This is because the inclusion of carbohydrate would serve the dual purpose of driving better exercise performance, as well as faster glycogen resynthesis post-exercise (both of which can benefit competitive endurance sports, especially those with multiple glycogen-depleting events per day). Missing from the comparison was a condition containing amino acids, carbohydrate, and electrolytes. However, the authors duly cite research by Lambert et al,17 who found that in the absence of electrolytes, no significant differences in rehydration were seen between beverages containing versus omitting carbohydrate. This implicates electrolytes as the critical factor in rehydration (rather than carbohydrate, whose function would be limited to glycogen resynthesis). Still, potentially interactive or synergistic effects of a combination of carbs, electrolytes, and amino acids on hydration would have been a worthy condition to investigate in the present study's comparison. For example, chocolate milk has demonstrated effectiveness for rehydration, glycogen resynthesis, and muscle recovery, and is more nutrient-dense than typical commercial recovery drinks.18 A final limitation was that the treatments were not equal in terms of potassium content (AE had the most).

Comment/application

The findings only partially agreed with the authors' hypothesis going into the experiment. They originally predicted that AE and CE beverages would rehydrate similarly, yet to a greater extent than the flavored water (FW) beverage. Interestingly, quoting the authors: "The AE and CE beverages rehydrated about equally; however, they were also equal to the FW beverage." However, they go on to mention a subtle detail that separated the AE beverage from the other two, rendering it superior. CE and FW yielded more diluted urine than AE, as indicated by urine specific gravity (USG), depicted below:

At 4 hours post-dehydration, USG in the CE and FW trials was significantly lower than pre-testing, while AE's USG was the same as pre-testing at this time point. This suggests greater diuresis and less cellular retention in CE & FW compared to AE. It should be noted that the measured differences in fluid retention between conditions were not statistically significant (43.5% in AE, 40.8% in CE, and 42.2% in FW). This raises the question of how clinically relevant these small differences really are, and how necessary or beneficial this special rehydration product is despite its inclusion of BCAA and absence of carbohydrate.


Calorie shifting diet versus calorie restriction diet: a comparative clinical trial study. Davoodi SH, Ajami M, Ayatollahi SA, Dowlatshahi K, Javedan G, Pazoki-Toroudi HR. Int J Prev Med. 2014 Apr;5(4):447-56. [PubMed]

BACKGROUND: Finding new tolerable methods in weight loss has largely been an issue of interest for specialists. Present study compared a novel method of calorie shifting diet (CSD) with classic calorie restriction (CR) on weight loss in overweight and obese subjects. METHODS: Seventy-four subjects (body mass index ≥25; ≤37) were randomized to 4 weeks control diet, 6 weeks CSD or CR diets, and 4 weeks follow-up period. CSD consisted of three phases each lasts for 2 weeks, 11 days calorie restriction which included four meals every day, and 4 h fasting between meals follow with 3 days self-selecting diet. CR subjects receive determined low calorie diet. Anthropometric and metabolic measures were assessed at different time points in the study. RESULTS: Four weeks after treatment, significant weight, and fat loss started (6.02 and 5.15 kg) and continued for 1 month of follow-up (5.24 and 4.3 kg), which was correlated to the restricted energy intake (P < 0.05). During three CSD phases, resting metabolic rate tended to remain unchanged. The decrease in plasma glucose, total cholesterol, and triacylglycerol were greater among subjects on the CSD diet (P < 0.05). Feeling of hunger decreased and satisfaction increased among those on the CSD diet after 4 weeks (P < 0.05). CONCLUSIONS: The CSD diet was associated with a greater improvement in some anthropometric measures, Adherence was better among CSD subjects. Longer and larger studies are required to determine the long-term safety and efficacy of CSD diet. SPONSORSHIP: None listed.

Study strengths

This is the first study to compare the effects of this particular permutation of a calorie shifting diet (CSD: 11 days restricted, 3 days unrestricted) pattern with a linear calorie-restricted (CR) diet. The investigation is an important one given the generally unimpressive weight loss and weight loss maintenance from conventional caloric restriction.19-21 The sample size (n=74) was fairly large, especially for diet research, which is notorious for its small subject numbers. More subjects translate to greater statistical power and less likelihood of by-chance occurrences.

Study limitations

The design included an intervention period as well as a follow-up period, which is a good thing; it's just that both periods were short (6-week intervention, 4-week follow-up). This essentially gives us hypothesis-generating pilot data rather than long-term data that we can lean on with greater confidence. Another limitation is the diet construction: these are your typical, crappy research diets. Protein intake during the intervention phase was actually less than the subjects' habitual intakes at baseline in both groups. The deficits were rather severe, but strangely, they were not equated. CSD's reduction was set at 45% of baseline reported maintenance, and CR's was set at 55% of maintenance. The more severe deficit in CSD may have imparted an advantage. Furthermore, the results of this study might be limited to the subject profile (obese, untrained). Bioelectrical impedance analysis (BIA) was used to assess body composition.

Comment/application

CSD outperformed CR on several parameters:

CSD yielded greater weight loss at the end of the follow-up period but not the intervention period. (Full body composition change details here.)

CSD yielded greater fat loss at the end of the follow-up period but not at the end of the intervention period.

CSD's decrease in RMR was less than that of CR (which ended up being lower than baseline by the end of the study).

CSD yielded greater decreases in glucose, total cholesterol, and triacylglycerol by the end of the study.

CSD tended to yield lesser subjective feelings of hunger toward the end of the trial.

CSD had a much higher subject retention rate; 36.8% dropped out of CR, and 15.6% dropped out of CSD.
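
To put the retention figures from the last point in proportion, the reported dropout percentages can be compared directly. This is a minimal sketch using only the two percentages quoted above; the function and variable names are hypothetical, and no per-group subject counts are assumed.

```python
# Compare the reported dropout rates (percent of each group).
def dropout_ratio(rate_a_pct, rate_b_pct):
    """How many times higher rate A is than rate B."""
    return rate_a_pct / rate_b_pct


cr_dropout_pct = 36.8   # reported for CR
csd_dropout_pct = 15.6  # reported for CSD

ratio = dropout_ratio(cr_dropout_pct, csd_dropout_pct)
retention_pct = {
    "CR": 100 - cr_dropout_pct,
    "CSD": 100 - csd_dropout_pct,
}

print(f"CR dropout was {ratio:.1f}x that of CSD")
for group, pct in retention_pct.items():
    print(f"{group} retention: {pct:.1f}%")
```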

Overall, CSD trumped CR, especially by the end of the 4-week follow-up period. However, it's very important to view these results in the proper perspective. It can't be overemphasized that this was a short intervention (6 weeks plus a 4-week follow-up). I'll also reiterate that the diets imposed upon both groups were far from optimal in terms of protein intake. Baseline protein intake of CSD was ~1.1 g/kg, and this dropped to ~0.9 g/kg during the intervention. Baseline protein intake in CR was ~1.1 g/kg, and this dropped to ~0.8 g/kg during the intervention. These protein intakes are approximately half of what has repeatedly been shown to be a favorable and effective target for optimizing muscular adaptations to hypocaloric conditions.22-24 Nevertheless, perhaps a case can be made for CSD over CR under conditions of subpar protein intake.

Still, the biological plausibility of there being inherent advantages to the CSD pattern is questionable beyond its potential to bolster compliance, at least in the short-term. CSD had more rules and structuring, particularly during the 11-day cycles where 4 meals were strictly spaced 4 hours apart with no between-meal eating allowed. This may have raised subjects' awareness and focus on the protocol, keeping them more compliant. In contrast, the linear calorie-restricted group could have been lulled into a monotonous grind conducive to the loosening of adherence over time. The CSD group, meanwhile, essentially took a 3-day diet break (consuming maintenance-level calories) after every 11-day block of dieting.

The weight and fat loss benefits of CSD were not clearly apparent until the end of the follow-up period. It's thus easy to speculate that the 6 weeks of linear, aggressive caloric restriction may have been met with deprivation backlash during the follow-up period where the objective was to consume maintenance-level calories. Remember that in CR, 55% of baseline intake was subtracted, leaving subjects with 6 weeks of consuming 1186 kcal/day (down from 2432 kcal at baseline). An important indicator of the CSD's effectiveness was the more than twofold higher dropout rate in CR. The more favorable biochemical changes in CSD can be attributed primarily to the greater weight and fat loss by the end of the follow-up. The relative success of the 11/3 CSD model raises the possibility that other, more convenient and realistic non-linear models could also be effective. For example, a 5/2 model, with 5 calorie-restricted days followed by 2 self-selected days, would mirror a weekday/weekend cycle, which could potentially fit better into the common work schedule.
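
For anyone weighing the 11/3 model against a 5/2 model, the cycle-averaged intake is easy to estimate. The sketch below is illustrative only: the 2400 kcal maintenance and 1200 kcal restricted figures are hypothetical round numbers rather than values from the study, the self-selected days are assumed to land at maintenance, and the function names are made up for this example.

```python
# Average daily intake across one restricted/self-selected cycle.
# Hypothetical inputs: not taken from the study.
MAINTENANCE_KCAL = 2400
RESTRICTED_KCAL = 1200


def cycle_average_kcal(restricted_days, free_days,
                       restricted_kcal=RESTRICTED_KCAL,
                       free_kcal=MAINTENANCE_KCAL):
    """Mean kcal/day over a cycle of restricted days followed by free days."""
    total_days = restricted_days + free_days
    total_kcal = restricted_days * restricted_kcal + free_days * free_kcal
    return total_kcal / total_days


def restricted_share(restricted_days, free_days):
    """Fraction of days in the cycle spent in restriction."""
    return restricted_days / (restricted_days + free_days)


for name, (on, off) in {"11/3": (11, 3), "5/2": (5, 2)}.items():
    avg = cycle_average_kcal(on, off)
    share = restricted_share(on, off)
    print(f"{name}: {share:.0%} restricted days, ~{avg:.0f} kcal/day on average")
```

With these placeholder numbers the 5/2 pattern spends a slightly smaller share of days in restriction, so its average deficit is somewhat smaller unless the restricted days are made correspondingly stricter.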


1. Lecoultre V, Egli L, Carrel G, Theytaz F, Kreis R, Schneiter P, Boss A, Zwygart K, Lê KA, Bortolotti M, Boesch C, Tappy L. Effects of fructose and glucose overfeeding on hepatic insulin sensitivity and intrahepatic lipids in healthy humans. Obesity (Silver Spring). 2013 Apr;21(4):782-5. [PubMed]

2. Sobrecases H, Lê KA, Bortolotti M, Schneiter P, Ith M, Kreis R, Boesch C, Tappy L. Effects of short-term overfeeding with fructose, fat and fructose plus fat on plasma
