The impact of recent forcing and ocean heat uptake
data on estimates of climate sensitivity
Nicholas Lewis1
Judith Curry2
Permission to place a copy of this work on this server has been provided by the AMS. The AMS does
not guarantee that the copy provided here is an accurate copy of the published work. This version is of
the manuscript accepted by Journal of Climate on 12 April 2018, reformatted for easier reading.
1 Corresponding author: Nicholas Lewis, Bath, United Kingdom.
2 Climate Forecast Applications Network, Reno, NV, USA.
Abstract: Energy budget estimates of equilibrium climate sensitivity (ECS) and transient
climate response (TCR) are derived based on the best estimates and uncertainty ranges for
forcing provided in the IPCC Fifth Assessment Scientific Report (AR5). Recent revisions to
greenhouse gas forcing and post-1990 ozone and aerosol forcing estimates are incorporated
and the forcing data extended from 2011 to 2016. Reflecting recent evidence against strong
aerosol forcing, its AR5 uncertainty lower bound is increased slightly. Using a 1869–1882
base period and a 2007−2016 final period, which are well-matched for volcanic activity and
influence from internal variability, medians are derived for ECS of 1.50 K (5−95%: 1.05−2.45
K) and for TCR of 1.20 K (5−95%: 0.9−1.7 K). These estimates both have much lower upper
bounds than those from a predecessor study using AR5 data ending in 2011. Using infilled,
globally-complete temperature data gives slightly higher estimates; a median of 1.66 K for
ECS (5−95%: 1.15−2.7 K) and 1.33 K for TCR (5−95%: 1.0−1.90 K). These ECS estimates
reflect climate feedbacks over the historical period, assumed time-invariant. Allowing for
possible time-varying climate feedbacks increases the median ECS estimate to 1.76 K
(5−95%: 1.2−3.1 K), using infilled temperature data. Possible biases from non-unit forcing
efficacy, temperature estimation issues and variability in sea-surface temperature change
patterns are examined and found to be minor when using globally-complete temperature data.
These results imply that high ECS and TCR values derived from a majority of CMIP5 climate
models are inconsistent with observed warming during the historical period.
1. Introduction
There has been considerable scientific investigation of the magnitude of the warming of
Earth’s climate from changes in atmospheric carbon dioxide (CO2) concentration. Two
standard metrics summarize the sensitivity of global surface temperature to an externally
imposed radiative forcing. Equilibrium climate sensitivity (ECS) represents the equilibrium
change in surface temperature to a doubling of atmospheric CO2 concentration. Transient
climate response (TCR), a shorter-term measure over 70 years, represents warming at the time
CO2 concentration has doubled when it is increased by 1% a year.
For over thirty years, climate scientists have presented a likely range for ECS that has
hardly changed. The ECS range 1.5−4.5 K in 1979 (Charney 1979) is unchanged in the 2013
Fifth Assessment Scientific Report (AR5) from the IPCC. AR5 did not provide a best
estimate value for ECS, stating (Summary for Policymakers D.2): "No best estimate for
equilibrium climate sensitivity can now be given because of a lack of agreement on values
across assessed lines of evidence".
At the heart of the difficulty surrounding the values of ECS and TCR is the substantial
difference between values derived from climate models versus values derived from changes
over the historical instrumental data record using energy budget models. The median ECS
given in AR5 for current generation (CMIP5) atmosphere-ocean global climate models
(AOGCMs) was 3.2 K, versus 2.0 K for the median values from historical-period energy
budget based studies cited by AR5.
Subsequently Lewis and Curry (2015; hereafter LC15) derived, using observationally-
based energy budget methodology, a median ECS estimate of 1.6 K from AR5's global
forcing and heat content estimate time series, which made the discrepancy with ECS values
derived from AOGCMs even larger. LC15 also derived a median TCR value of 1.3 K, well
below the 1.8 K median TCR for CMIP5 models in AR5.
Considerable effort has been expended in attempts to reconcile the observationally-
based ECS values with values determined using climate models. Most of these efforts have
focused on arguments that the methodologies used in the energy balance model
determinations result in ECS and/or TCR estimates that are biased low (e.g., Marvel et al.
2016; Richardson et al. 2016; Armour 2017).
Using a standard global energy budget approach, this paper seeks to clarify the
implications for climate sensitivity (both ECS and TCR) of incorporating the most up-to-date
surface temperature, forcing and ocean heat content data. Forcing and heat content estimates
given in AR5 are extended from 2011 to 2016, with recent revisions to greenhouse gas
forcing-concentration relationships and post-1990 tropospheric ozone and aerosol forcing
changes applied and a new ocean heat content dataset incorporated. This paper also addresses
a range of concerns that have been raised regarding using energy balance models to determine
climate sensitivity: variability in patterns of sea-surface temperature change, non-unit forcing
efficacy, temperature estimation issues and time-varying climate feedbacks.
The paper is structured as follows. The global energy budget approach is discussed in
Section 2. Section 3 deals with data sources and uncertainties, Section 4 with choice of base
and final periods, and methods are described in Section 5. Section 6 sets out the results, which
are discussed in Section 7. Section 8 concludes.
2. Global energy budget approach
A general energy budget framework has been widely used in the estimation and analysis of
climate sensitivity, such as by Armour and Roe (2011) and Roe and Armour (2011), and in
AR5 (Bindoff et al. 2014). Estimation of climate sensitivity from changes in conditions
between periods early and late in the industrial era has been developed by Gregory et al.
(2002), Otto et al. (2013), Masters (2014), LC15 and other papers. LC15 describes the
advantages of the energy budget approach: relative to less simple models that use zonally,
hemispherically or land-ocean resolved data, using only global mean data improves both the
quantification of uncertainties and robustness against them.
Generally, complex models are ill-suited to observationally-based climate sensitivity
estimation since it may not be practicable to produce, by perturbing their internal parameters,
a simulated climate system that is adequately consistent with observed variables. An
increasingly popular alternative is the 'emergent constraint' approach: identifying
observationally-constrainable metrics in the current climate which correlate with ECS in
complex models. However, it has been shown for CMIP5 models that all such metrics are
likely only to constrain shortwave cloud feedback, and not other factors controlling their ECS
(Qu et al. 2017). The ability in a state-of-the-art complex model to engineer ECS over a wide
range (largely arising from differing shortwave cloud feedback) by varying the formulation of
convective precipitation, without being able to find a clear observational constraint that favors
one version over the others (Zhao et al. 2016), casts further doubt on the emergent constraint
approach.
Using a simple rather than a complex climate model also has the important advantage
of transparency and reproducibility. What determines ECS and TCR in a complex model is
obscure, and their estimation is affected by internal variability. The energy budget framework
provides an extremely simple physically-based climate model that, given the assumptions
made, follows directly from energy conservation. It has only one uncertain parameter, λ,
which can be directly derived from estimates of changes in historical global mean surface
temperature (hereafter 'surface temperature'), forcing and heat uptake rate.
The main assumption made by the energy budget model concerns the radiative
response ΔR to a change in radiative forcing ΔF that alters positively the Earth's net
downwards top-of-atmosphere (TOA) radiative imbalance N. The assumption is that, in
temporal mean terms, ΔR – the change in net outgoing radiation resulting from the change in
the state of the climate system caused by the forcing imposition – is linearly proportional to
the forcing-induced change in surface temperature ΔT. Mathematically,
$$\Delta R = \lambda\,\Delta T + \mu_R \tag{1}$$
with λ – the climate feedback parameter, representing the increase in net outgoing energy flux
per degree of surface warming – constant, and µR a random zero-mean residual term
representing internal fluctuations in the system unrelated to fluctuations in T. Together µR and
fluctuations in R arising, through its relation to T, from internal variability in T – which will
have a different signature – represent the internal variability in R. A constant λ implies it is
independent of T, other aspects of the climate state, the magnitude and composition of ΔF,
and the time since forcing was applied.
By conservation of energy, ΔN = ΔF − ΔR . Therefore, in temporal mean terms,
substituting using (1)
$$\lambda = (\Delta F - \Delta N) / \Delta T \tag{2}$$
It follows from (2) that, designating the radiative forcing from a doubling of atmospheric CO2
concentration as $F_{2\times\mathrm{CO2}}$, once equilibrium is restored following such a doubling
(implying ΔN=0),
$$\lambda = F_{2\times\mathrm{CO2}} / \mathrm{ECS} \tag{3}$$
Hence, substituting in (2), in general
$$\mathrm{ECS} = F_{2\times\mathrm{CO2}}\,\frac{\Delta T}{\Delta F - \Delta N} \tag{4}$$
with the CO2 forcing component of ΔF calculated on a basis consistent with that used for
$F_{2\times\mathrm{CO2}}$. Here, N is conventionally regarded and measured as the rate of planetary
heat uptake, which provides identical ΔN values to measuring its net downwards radiative imbalance.
Equation (4) assumes that ΔT is entirely externally forced, but it does not imply a linear
relationship between ΔN and ΔT, unlike the 'kappa' model (Gregory and Forster 2008).
We apply (4) to estimate ECS based on changes in mean values of estimates for T, F
and N between well separated, fairly long base and final periods.
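As a minimal sketch of how (4) is applied, the short Python fragment below evaluates the two-period estimator. The function name and the input changes (ΔT = 0.8 K, ΔF = 2.5 Wm−2, ΔN = 0.5 Wm−2) are illustrative round numbers assumed for this example, not the estimates derived in this study; the 3.80 Wm−2 default for the doubled-CO2 forcing is the revised value quoted in Section 3a.

```python
def effective_ecs(delta_t, delta_f, delta_n, f_2xco2=3.80):
    """Two-period energy budget estimate, equation (4):
    ECS = F_2xCO2 * dT / (dF - dN).

    delta_t : change in mean surface temperature between periods (K)
    delta_f : change in mean effective radiative forcing (W m-2)
    delta_n : change in mean planetary heat uptake rate (W m-2)
    """
    denominator = delta_f - delta_n
    if denominator <= 0:
        raise ValueError("dF - dN must be positive for a finite, positive estimate")
    return f_2xco2 * delta_t / denominator


# Illustrative placeholder inputs (assumed for this sketch, not the paper's data).
example_ecs = effective_ecs(delta_t=0.8, delta_f=2.5, delta_n=0.5)
# Implied feedback parameter from equation (2), in W m-2 K-1.
example_lambda = (2.5 - 0.5) / 0.8
print(round(example_ecs, 2), round(example_lambda, 2))
```

Guarding against a non-positive denominator matters in practice: when forcing uncertainty is sampled, ΔF − ΔN can approach zero, and the estimate then diverges.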
Being inferred from transient changes, ECS as defined in (4) is an effective climate
sensitivity that embodies the assumption of a constant linear climate feedback parameter λ.
Equilibrium climate sensitivity, by contrast, requires the atmosphere-ocean system (although
not slow components of the climate system, such as ice sheets) to have equilibrated.
Equilibrium and effective climate sensitivity will not be identical if the feedback parameter is
inconstant over time or dependent on ΔF or ΔT. The behavior of CMIP5 models may provide
some insight into these issues.
Throughout 140-year simulations in which CO2 forcing is increased smoothly at 1%
per annum (1pctCO2), the responses of almost all CMIP5 AOGCMs can be accurately
emulated by convolving the rate of increase in forcing with the step response in their
simulations in which CO2 concentration is abruptly quadrupled (abrupt4xCO2) (Figure 1;
Good et al. 2011; Caldeira and Myhrvold 2013). Such behavior strongly suggests that
feedback strength in CMIP5 models generally does not change with ΔF or ΔT per se, at
least for CO2 forcing from up to respectively quadrupling of its preindustrial concentration
and the warming reached in abrupt4xCO2 simulations after half a century or so (typically 4−5
K). Otherwise one would expect to see divergences, particularly in the first few decades of the
1pctCO2 simulation when the applicable temperature is furthest below the mean temperature
of the abrupt4xCO2-derived step-emulation components. We have also investigated feedback
strength in the MPI-ESM-1.2 AOGCM under differing abrupt CO2 increases. Feedback
strength is almost the same between abrupt2xCO2 and abrupt4xCO2 simulations up to at least
year 150, when ΔT reaches 5 K under quadrupled CO2.
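The step-emulation idea can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the procedure used to produce Figure 1: the two-timescale step response, the 7.6 Wm−2 step forcing and the strictly linear forcing ramp are all hypothetical, and the non-logarithmic element of the CO2 forcing–concentration relationship is ignored. Each annual forcing increment is treated as a small step, and the scaled, time-shifted step responses are summed.

```python
import math


def step_emulate(forcing, step_response, step_forcing):
    """Superpose scaled, time-shifted copies of an abrupt-forcing step response.

    forcing[t]       : forcing in year t (W m-2)
    step_response[k] : response k years after the abrupt step (e.g. K)
    step_forcing     : forcing applied in the abrupt simulation (W m-2)
    """
    emulated = []
    for n in range(len(forcing)):
        total = 0.0
        previous = 0.0
        for i in range(n + 1):
            increment = forcing[i] - previous  # forcing added in year i
            previous = forcing[i]
            total += (increment / step_forcing) * step_response[n - i]
        emulated.append(total)
    return emulated


YEARS = 140
STEP_FORCING = 7.6  # hypothetical abrupt4xCO2 forcing, W m-2 (assumed)

# Hypothetical two-timescale step response in K; not taken from any model.
step_response = [
    3.0 * (1.0 - math.exp(-(k + 0.5) / 4.0))
    + 2.0 * (1.0 - math.exp(-(k + 0.5) / 200.0))
    for k in range(YEARS)
]

# Idealised linear forcing ramp reaching the step forcing in year 140.
ramp = [STEP_FORCING * (t + 1) / YEARS for t in range(YEARS)]
ramp_response = step_emulate(ramp, step_response, STEP_FORCING)

# Sanity check: a constant forcing equal to the step forcing recovers the step response.
recovered = step_emulate([STEP_FORCING] * YEARS, step_response, STEP_FORCING)
```

If feedback strength depended on ΔF or ΔT, a linear superposition of this kind would diverge from the actual ramped simulation, which is the basis of the comparison described above.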
Fig. 1 Comparison of actual and step-emulated ensemble-mean changes from preindustrial
in global surface temperature, ΔT, and TOA radiative imbalance, ΔN, in 1pctCO2
simulations. Small and large circles show respectively annual and pentadal mean actual
values, blue for ΔT and green for ΔN. The red and magenta lines show respectively ΔT and
ΔN values as emulated from the step-responses of the same models in abrupt4xCO2
simulations. The non-logarithmic element of the CO2 forcing–concentration relationship
(Byrne and Goldblatt 2014; Etminan et al. 2016) has been allowed for. The same ensemble
of 31 CMIP5 models is used as in Table S2. The minor excess of the emulated ΔN values in
the middle years is due principally to the behavior of GISS-E2 models; if their p3 versions
are excluded the match for ΔN becomes almost perfect throughout, while that for ΔT
remains so.
However, in most CMIP5 AOGCMs, λ (here −dN/dT) tends to decrease a few decades
into abrupt4xCO2 simulations – when N is plotted against T (a 'Gregory plot'), the slope is
gentler after that time – although generally λ then remains almost constant for the rest of the
simulation (Armour 2017 Fig. S1). In some cases, the decrease in λ may be linked to
temperature- or time-dependent energy leakage (Hobbs et al. 2016). However, typically the
decrease in λ appears to arise primarily from the strength of modeled shortwave cloud
feedbacks varying with time, likely linked to evolving patterns of surface warming (Andrews
et al. 2015). The decrease in λ means that effective climate sensitivity estimates derived from
simulations forced by abrupt or ramped CO2 changes tend to increase with the analysis period,
although in most cases they change only modestly once a multidecadal period has elapsed. It
is unclear to what extent, if any, this behavior occurs in the real climate system. Possible
implications of time-varying feedbacks for historical period energy budget ECS estimation are
analyzed in section 7f. Until then, ECS estimates are not distinguished according to what
extent they are potentially affected by time-varying feedbacks.
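To illustrate how a change in slope on a Gregory plot translates into a change in effective climate sensitivity, the Python sketch below fits ordinary least squares lines to the early and late parts of a synthetic abrupt-forcing simulation. All numbers (the 7.6 Wm−2 forcing, feedback values of 1.5 and 1.0 Wm−2 K−1, and the break at 3.4 K) are hypothetical and noise-free, chosen only to make the arithmetic transparent.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical synthetic simulation (assumed values, no internal variability).
FORCING = 7.6       # abrupt forcing, W m-2
LAMBDA_EARLY = 1.5  # feedback parameter in the first decades, W m-2 K-1
LAMBDA_LATE = 1.0   # feedback parameter thereafter, W m-2 K-1
T_BREAK = 3.4       # warming at which the slope changes, K

t_early = [0.5 + 0.15 * k for k in range(20)]
n_early = [FORCING - LAMBDA_EARLY * t for t in t_early]
n_break = FORCING - LAMBDA_EARLY * T_BREAK
t_late = [T_BREAK + 0.02 * k for k in range(130)]
n_late = [n_break - LAMBDA_LATE * (t - T_BREAK) for t in t_late]

# lambda is minus the slope of N against T.
lambda_early_fit = -ols_slope(t_early, n_early)
lambda_late_fit = -ols_slope(t_late, n_late)

# Warming at which N would reach zero, extrapolating each fitted line.
t_eq_early = FORCING / lambda_early_fit
t_eq_late = T_BREAK + n_break / lambda_late_fit
print(round(lambda_early_fit, 2), round(lambda_late_fit, 2))
```

In this synthetic case the extrapolated equilibrium warming is larger when inferred from the later, gentler slope, which is the sense in which effective sensitivity estimates tend to increase with the analysis period.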
ECS would also differ from the estimate provided by (4) if that were significantly
affected by internal variability, or if the effect on ΔT or ΔN of the composite forcing change over
the estimation period differed from that of CO2 forcing. These issues are discussed in Sections
3b, 3c, 4 and 7c. The possibility of internal variability in spatial surface temperature patterns
affecting ECS estimation is discussed in Section 7a.
Transient climate response (TCR) is the increase in surface temperature (averaged
over twenty years) at the time of CO2 concentration doubling when it is increased by 1% a
year, implying an almost linear forcing ramp over 70 years. Although designed as a measure
of transient response in AOGCMs, TCR can be regarded as a property of the real climate
system. TCR can be estimated by scaling the ratio of the response of global surface
temperature to the change in forcing accruing approximately linearly over a period of circa 70
years (Bindoff et al. 2014, p.920). That is:
$$\mathrm{TCR} = F_{2\times\mathrm{CO2}}\,\frac{\Delta T}{\Delta F} \tag{5}$$
TCR can be estimated using (5) with a recent final period and a base period ending circa
1950. Although occurring mainly over the last 70 years, the effect on surface temperature of
the development of forcing over the whole historical period (post ~1850) has been estimated
to be broadly equivalent to that of a 100-year linear forcing ramp (Armour 2017). TCR may
therefore also be estimated using a base period early in the historical period, with a possible
marginal upwards bias since with a longer ramp period the climate system will have had more
time to respond to the ramped forcing. LC15 found that estimating TCR using (5) with a
recent final period and a base period either early in the historical period or of 1930−1950
provided an estimate of TCR closely consistent with its definition.
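A brief Python sketch of (5), alongside (4) for comparison, is given below. The input changes are the same kind of illustrative round numbers as before (assumed, not this study's estimates); the function names are likewise hypothetical. The sketch makes explicit that, for identical inputs, the ratio of the TCR estimate to the ECS estimate is (ΔF − ΔN)/ΔF.

```python
F_2XCO2 = 3.80  # W m-2, revised doubled-CO2 forcing quoted in Section 3a


def tcr_estimate(delta_t, delta_f, f_2xco2=F_2XCO2):
    """Equation (5): TCR = F_2xCO2 * dT / dF."""
    if delta_f <= 0:
        raise ValueError("dF must be positive")
    return f_2xco2 * delta_t / delta_f


def ecs_estimate(delta_t, delta_f, delta_n, f_2xco2=F_2XCO2):
    """Equation (4): ECS = F_2xCO2 * dT / (dF - dN)."""
    return f_2xco2 * delta_t / (delta_f - delta_n)


# Illustrative placeholder inputs (assumed for this sketch).
tcr = tcr_estimate(delta_t=0.8, delta_f=2.5)
ecs = ecs_estimate(delta_t=0.8, delta_f=2.5, delta_n=0.5)
ratio = tcr / ecs  # equals (dF - dN) / dF
print(round(tcr, 3), round(ratio, 2))
```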
The energy budget approach has also been applied to estimate both ECS and TCR
using regression over all or a substantial part of the historical period, rather than taking
differences between base and final periods (Gregory and Forster 2008; Schwartz 2012).
Although regression makes fuller use of available information than the two-period method,
using averages over base and final periods captures much of the available information, since
internal variability is high on sub-decadal timescales and total forcing has only become
reasonably large relative to its uncertainty relatively recently. Moreover, handling
multidecadal internal variability and volcanic eruptions poses a challenge when using
regression. Gregory and Forster (2008) excluded years with significant volcanism, but
subsequent years may be affected by the recovery from volcanic forcing.
It is important to use an appropriate forcing metric for energy budget sensitivity
estimation. The surface temperature response to forcing from a particular agent relative to that
from CO2 (its 'efficacy': Hansen et al. 2005) is in some cases sensitive to the metric used. In
such cases, efficacy is normally much closer to unity when the effective radiative forcing
(ERF) metric (Sherwood et al. 2015; Myhre et al. 2014) is used rather than the common
stratospherically-adjusted radiative forcing (RF) metric. Unlike ERF, the RF metric does not
allow for the troposphere and land surface adjusting to the imposed forcing. Since ERF is a
construct designed to fit the global radiative response as a linear function of ΔT over time
scales of decades to a century (Sherwood et al. 2015), it is an appropriate metric for energy
budget sensitivity estimation. References here to forcing are to ERF except where indicated
otherwise. AR5 only gives estimated forcing time-series for ERF. Its best estimates of 2011
ERF differ from those of RF only for aerosols and contrails, although uncertainty ranges are
generally wider for ERF than for RF.
Uncertainty in energy budget estimates of ECS and TCR from instrumental
observations stems primarily from uncertainty in ΔF (LC15), which also produces most of
the asymmetry in probability distributions for ECS and TCR estimates (Roe and Armour
2011). The two main contributors to uncertainty in ΔF are aerosols and, to a substantially
smaller extent, well-mixed greenhouse gases (WMGG).
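The origin of the asymmetry can be seen in a small Monte Carlo sketch. Because ΔF sits in the denominator of (4), a symmetric uncertainty in ΔF maps into a distribution for ECS with a long upper tail. The Python fragment below assumes a Gaussian ΔF with hypothetical mean and standard deviation, holds ΔT and ΔN fixed, and uses a fixed random seed; none of these settings reproduces the uncertainty treatment of this study.

```python
import random


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list (0 <= q <= 1)."""
    index = int(round(q * (len(sorted_values) - 1)))
    return sorted_values[index]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

F_2XCO2 = 3.80       # W m-2
DELTA_T = 0.8        # K, illustrative and held fixed
DELTA_N = 0.5        # W m-2, illustrative and held fixed
DELTA_F_MEAN = 2.5   # W m-2, hypothetical central value
DELTA_F_SD = 0.5     # W m-2, hypothetical standard deviation

samples = []
for _ in range(20000):
    delta_f = rng.gauss(DELTA_F_MEAN, DELTA_F_SD)
    denominator = delta_f - DELTA_N
    if denominator > 0:  # discard physically meaningless draws
        samples.append(F_2XCO2 * DELTA_T / denominator)
samples.sort()

p05 = percentile(samples, 0.05)
p50 = percentile(samples, 0.50)
p95 = percentile(samples, 0.95)
upper_gap = p95 - p50
lower_gap = p50 - p05
print(upper_gap > lower_gap)
```

The gap between the median and the 95th percentile exceeds that between the 5th percentile and the median, even though the sampled ΔF distribution is symmetric.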
3. Data sources and uncertainties
As in LC15, forcing and heat uptake data and uncertainty estimates identical to those given in
AR5 have been used unless stated otherwise. AR5 estimates represent carefully considered
assessments in which many climate scientists with relevant expertise were involved, and
underwent an extensive review process. Post-2011 values have insofar as possible been
derived entirely from observational data, on a basis consistent with that in AR5. Trend-based
extrapolation has only been used for some minor forcing and heat uptake components, except
for 2016 aerosol and tropospheric ozone forcing. Only a brief discussion of the treatment of
data uncertainties and internal variability is given here, since full details of our treatment can
be found in LC15. This section summarizes information about the forcing, heat uptake and
temperature data. Full details of changes relative to AR5 estimates for certain forcing and heat
uptake components, and of the updating of all components from 2011 to 2016, are provided in
the Supplementary Material (S1 and S2).
a. Forcings
ERF time series medians up to 2011 (relative to 1750) are sourced from Table AII.1.2
of AR5, with uncertainty estimates for 2011 derived from Table 8.6 and Table 8.SM.5 of
AR5. The only changes to Table AII.1.2 values concern forcing from the principal WMGG,
where recent revisions to forcing-concentration relationships (Etminan et al. 2016) have been
incorporated throughout, and post-1990 changes in aerosol and tropospheric ozone forcing,
where new estimates of their evolution based on updated anthropogenic emission data for
1990−2015 (Myhre et al. 2017) have been adopted, adding their estimated post-1990 changes
to the AR5 1990 values. Recent evidence concerning volcanic forcing (Andersson et al. 2015)
was considered, but no revision to AR5 estimates was found necessary (Supplementary
Material S1). The principal effect of these revisions is to make methane (CH4) forcing more
positive, and post-1990 aerosol forcing less negative, than per AR5. After reaching −0.9 Wm−2
in 1995, ERFAerosol weakens to −0.8 Wm−2 in 2011. The 2011 forcing uncertainty
ranges are used, in conjunction with AR5 2011 medians, to specify the fractional uncertainty
for each forcing constituent.
Since AR5, understanding of anthropogenic aerosol forcing (ERFAerosol) has improved.
A number of recent studies point to total aerosol forcing being substantially weaker than the
lower end of the −1.9 to −0.1 Wm−2 2011 range (median −0.9 Wm−2) given in AR5, primarily
due to negative forcing from aerosol-cloud interactions being weaker than previously thought
(Seifert et al. 2015; Stevens 2015; Gordon et al. 2016; Zhou and Penner 2017; Nazarenko et
al. 2017; Lohmann 2017; Malavelle et al. 2017; Stevens et al. 2017; Fiedler et al. 2017; Toll
et al. 2017). Recent evidence regarding positive aerosol forcing from absorbing carbonaceous
aerosols (Wang et al. 2014, Samset et al. 2014, Wang et al. 2016, Zhang et al. 2017) is mixed,
on balance suggesting it may be lower than the AR5 best estimate, but above its lower
uncertainty bound in AR5. Although some post-AR5 studies (e.g. Cherian et al. 2014, McCoy
et al. 2017) have reported relatively strong aerosol forcing, Stevens (2015) presented several
observationally-based arguments that total aerosol forcing since preindustrial was weak, and
could not be stronger than −1.0 Wm−2.¹ Supporting those arguments, Zhou and Penner (2017)
and Sato et al. (2018) showed that negative cloud-lifetime aerosol forcing simulated by
AOGCMs was unrealistic, Bender et al. (2016) showed that the positive correlation between
aerosol loading and cloud albedo displayed in most climate models is not seen in
observations, and Nazarenko et al. (2017) showed that aerosol forcing was weaker when
climate feedbacks were allowed for. In the light of these developments, the −1.9 Wm−2
model-derived lower bound for 2011 aerosol forcing in AR5 now appears too strong. We have
therefore weakened it slightly to −1.7 Wm−2, as in Armour (2017), making the range
symmetrical about the AR5 2011 median.
¹ Substituting, for consistency, the higher WMGG forcing used in this study for that used in Stevens (2015)
would slightly change its −1.0 Wm−2 aerosol forcing lower bound, to −1.06 Wm−2, too little to weaken the
argument for the proposal made here.
Following LC15, CO2 and 'GHG Other' forcings are combined into a single ERFGHG
time series, since AR5 does not distinguish between the two as regards ERF uncertainty.
Uncertainty in forcing from WMGG almost entirely relates to how much forcing a given
concentration of each greenhouse gas produces – uncertainty in concentrations is minor – and
is likely highly correlated among WMGG. AR5 (Section 8.5.1) assumes that the fractional ERF
uncertainty for CO2 applies to all WMGG and to total WMGG, implying that fractional
uncertainty in $F_{2\times\mathrm{CO2}}$ is the same as, and fully correlated with, that in ERFGHG.
We follow Otto et al. (2013) and LC15 in adopting this assumption. Although uncertainty in WMGG
forcing is substantial, since $F_{2\times\mathrm{CO2}}$ appears in the numerator of (4) and (5), and ΔF
(to which ERFGHG is by far the largest contributor) in the denominator, the effects on ECS and TCR
estimation of uncertainty in forcing from WMGG cancel out to a substantial extent. Dropping the
assumption of uncertainty being correlated between CO2 and 'GHG Other' forcing would have
a negligible effect on ECS and TCR estimate uncertainty ranges. The same applies if in
addition the ERF-to-RF uncertainty ratio for non-CO2 WMGG were increased from the
20%:10% ratio assumed in AR5 to 30%:10%, even if uncertainty were treated as perfectly
correlated between all non-CO2 WMGG, as in AR5.
Ozone (both Tropospheric and Stratospheric), Stratospheric H2O (Water vapor) and
Land Use Change forcings, for which uncertainty distributions can be added in quadrature, are
combined into a single ERFOWL forcing component series (termed ERFnonGABC in LC15).
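Addition in quadrature can be checked against the AR5 2011 ranges listed in Table 1. The Python sketch below assumes the three component uncertainties are independent and symmetric, takes half the width of each 90% range, and combines them as a root sum of squares; the variable names are illustrative.

```python
import math

# AR5 1750-2011 90% ranges from Table 1, in W m-2: (lower, upper).
ar5_ranges = {
    "ozone_total": (0.141, 0.559),
    "stratospheric_h2o": (0.022, 0.124),
    "land_use_albedo": (-0.253, -0.047),
}

# Half-width of each 90% range.
half_widths = {name: (hi - lo) / 2.0 for name, (lo, hi) in ar5_ranges.items()}

# Root-sum-of-squares combination, valid for independent uncertainties.
combined_half_width = math.sqrt(sum(h ** 2 for h in half_widths.values()))

# Half-width of the Total OWL range given in Table 1 (0.034 to 0.512 W m-2).
table_half_width = (0.512 - 0.034) / 2.0
print(round(combined_half_width, 3), round(table_half_width, 3))
```

The combined half-width agrees with that of the Total OWL range in Table 1 to within rounding.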
The resulting forcing best estimates and uncertainties used for the main results are
summarised in Table 1, for both 2011 and 2016. AR5 forcing estimates and uncertainty
ranges for 2011 are also shown. Following LC15, the uncertainty ranges for solar and
volcanic forcing have been widened. The revised total 1750−2011 anthropogenic forcing
estimate has increased by 9% from the AR5 value; the largest contribution comes from the
revision in CH4 forcing. $F_{2\times\mathrm{CO2}}$ has also been revised up by 2.5%, to 3.80 Wm−2,
which has an opposing effect on sensitivity estimation to the upward revision in total forcing. Figure 2
shows the original AR5 and revised anthropogenic forcing time-series.
LC15 concluded that volcanic forcing (ERFVolcano) in AR5 needs to be scaled down by
40−50% in order to produce a comparable effect on surface temperature to ERFGHG and other
forcings. Gregory et al. (2016) likewise found that volcanic forcing produced a substantially
smaller response in AOGCMs than CO2 forcing. They quantified the effect in HadCM3,
where ERFVolcano was smaller relative to stratospheric aerosol optical depth than per AR5 and
its efficacy was also lower, implying that AR5 volcanic forcing needed to be scaled down by
~50% for use in a global energy budget model. Since there is no authority in AR5 for
applying an adjustment factor, the issue is sidestepped by using base and final periods with
matching mean volcanic forcing, as in LC15. The results of applying a scaling factor of 0.55
are shown where sensitivity testing of estimates to the choice of base and final periods
involves mismatched volcanic forcing. Likewise, as in LC15 the AR5 Land Use Change
forcing (ERFLUC) series is used despite it representing only effects on surface albedo. AR5
assessed that including other effects of land use change it is about as likely as not to have
caused net cooling. The effect of setting ERFLUC to zero is also reported. AR5 gives an
estimated efficacy range of 2−4 for the minor black carbon on snow and ice forcing
(ERFBCsnow), which is applied probabilistically.
Fig. 2 Anthropogenic forcings from 1750 to 2016. All time-series that are affected by the
revisions to AR5 CO2, CH4 and nitrous oxide forcing–concentration relationships and to
post-1990 revisions to AR5 aerosol and tropospheric ozone forcing are shown separately. In
some cases the Original AR5 1750–2011 time-series overlay the Revised 1750–2016 time-series
prior to 2012. Unrevised anthropogenic forcing components (Stratospheric H2O,
Land use (albedo), BC on snow, Contrails) have been combined into a single Other
Anthropogenic time-series. Natural forcings (Solar, Volcanic) are not shown as they have
not been revised and post 2011 changes in them are very small.
| ERF component | This study 1750–2016 best estimate | Revised 1750–2011 best estimate | AR5 1750–2011 best estimate and 90% CI | Part treated as independent | Added fixed uncertainty 90% CI |
|---|---|---|---|---|---|
| WMGG | 3.176 | 2.989 | 2.831 (2.260–3.400) | 0% | |
| Ozone (total) | 0.392 | 0.379 | 0.350 (0.141–0.559) | | |
| Stratospheric H2O | 0.074 | 0.073 | 0.073 (0.022–0.124) | | |
| Land use (albedo) | −0.151 | −0.150 | −0.150 (−0.253–−0.047) | | |
| Total OWL | 0.315 | 0.302 | 0.273 (0.034–0.512) | 50% | |
| Aerosol (total) | −0.769 | −0.777 | −0.900 (−1.900–−0.100); revised to (−1.700–−0.100) | 25% | |
| BC on snow | 0.040 | 0.040 | 0.040 (0.020–0.090) | Ignored | |
| Contrails | 0.059 | 0.050 | 0.050 (0.020–0.150) | Ignored | |
| Total anthropogenic | 2.821 | 2.581 | 2.294 (1.134–3.334) | | |
| Solar | 0.021 | 0.030 | 0.030 (−0.021–+0.081) | 50% | ±0.05 |
| Volcanic | −0.099 | −0.125 | −0.125 (−0.160–−0.090) | 50% | ±0.072 |

Table 1 Components of ERF and treatment of their uncertainties. Units are Wm−2.
b. Heat uptake
Planetary heat uptake – the rate of increase in its heat content – occurs primarily
(>90%) in the ocean. The AR5 estimates for heat uptake by the atmosphere, ice, land and the
deep (sub-2000 m) ocean are used unaltered up to 2011 and extended to 2016. AR5's source
for 700−2000 m ocean heat content (OHC), Levitus et al. (2012), has been updated, but a new
dataset (Cheng et al. 2017) is also available; the average of those two datasets is used here.
AR5's source for 0–700 m OHC has not been updated to 2016. The average of three available
fully updated 0−700 m OHC datasets (Cheng et al., Levitus et al., and Ishii and Kimoto
(2009)) is used instead, for all years. There are considerable divergences between OHC
estimates from the various datasets, arising from differences in the data used, corrections
made to it, and the mapping (infilling) methods used. Averaging results from different OHC
datasets reduces the effect of errors particular to individual datasets. Over the main
1995−2011 and 1987−2011 final periods used in LC15, implementing the foregoing changes
to the sourcing and calculation of OHC estimates produces slightly higher 0−2000 m ocean
heat uptake (OHU) estimates than use of the original AR5 datasets. Since the mid-2000s,
when the Argo floating buoy network achieved near-global coverage, OHC uncertainty has
been lower. The revised estimation basis produces total heat uptake within 0.02 Wm−2 of the
estimates by Desbruyeres et al. (2017) of 0.72 Wm−2 over 2006−2014 and by Johnson et al.
(2016) of 0.71 Wm−2 over 2005−2015.
As in previous energy budget studies, AOGCM simulation-derived estimates of heat
uptake are used for the base periods, since OHC was not measured then. The heat uptake
values used in LC15, which were derived from simulations by CCSM4 starting in AD 850
(Gregory et al. 2013), scaled by 0.60, were 0.15, 0.10 and 0.20 Wm−2 respectively for the
1859−1882, 1850−1900 and 1930−1950 base periods. The unscaled CCSM4-derived values
were consistent with the value derived by Gregory et al. (2002) from a different AOGCM.
The LC15 values are adopted (taking the 1859−1882 value for 1869−1882), as are the LC15
standard error estimates, being in each case 50% of the heat uptake estimate.
The variability in total heat uptake of 0.045 Wm−2 for all base and final periods used
in LC15, derived from the ultra-long HadCM3 (Gordon et al. 2000) control run, is also
adopted. Investigation showed this to be adequate for each of the base and final periods used
here.
c. Surface temperature
As in LC15, the HadCRUT4 surface temperature dataset (Morice et al. 2012) is used,
updated from HadCRUT4v2 to HadCRUT4v5. Results are also presented using a globally-
complete version infilled by kriging (Had4_krig_v2: Cowtan and Way 2014). The surface
temperature trends over 1900−2010 are identical in both versions, with Had4_krig_v2
warming faster than HadCRUT4v5 early and late in the record.
Unlike GISStemp and NCDC MLOST (now NOAA GlobalTemp), the other two
surface temperature datasets cited in AR5, HadCRUT4 extends back to 1850 rather than 1880,
providing adequate data early in the historical period prior to the period of heavy volcanism
from 1883 on. The warming shown by the infilled GISStemp and NOAA4.0.1 datasets
between twenty-year periods early and late in their records (1880−1899 and 1997−2016) was
respectively 0.85 K and 0.82 K, against 0.83 K for HadCRUT4v5 and 0.89 K for
Had4_krig_v2.
Both versions of HadCRUT4 provide an ensemble of 100 temperature realizations that
preserves the time-dependent correlation structure. Uncertainty in mean surface temperature
for each period is calculated on a basis consistent with the applicable covariance matrix of
observational uncertainty, and combined in quadrature with an estimate of inter-period
internal variability in ΔT. The LC15 estimate of 0.08 K standard deviation for such internal
variability is adopted; it was conservatively scaled up from 0.06 K derived from the ultra-long
HadCM3 control run. Sensitivity testing in LC15 showed that a further 50% increase in
internal variability in ΔT had almost no effect on uncertainty in ECS and TCR estimates.
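The combination in quadrature described above can be written compactly as follows. The symbols σ_obs (observational uncertainty in the inter-period temperature change) and σ_int (inter-period internal variability) are labels introduced here for illustration, and the expression assumes the two terms are independent.

```latex
\sigma_{\Delta T} = \sqrt{\sigma_{\mathrm{obs}}^{2} + \sigma_{\mathrm{int}}^{2}},
\qquad \sigma_{\mathrm{int}} = 0.08\ \mathrm{K}
```

On this reading, the LC15 sensitivity test corresponds to replacing σ_int by 1.5 × 0.08 K = 0.12 K.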
4. Choice of base and final periods
Two-period energy budget studies have used base and final periods lasting between one and
five decades. Longer periods reduce the effects of interannual and decadal internal variability,
but shorter periods make it feasible to avoid major volcanism and a short final period provides
a higher signal. Base and final periods should be at least a decade, to sufficiently reduce the
influence of interannual variability. Volcanic forcing efficacy, relative to AR5 forcing
estimates, appears to be substantially below unity, and may differ according to the location
and type of eruption. Moreover, prior to the satellite (post-1978) era there are considerable
uncertainties regarding the magnitude of volcanic eruptions and resulting forcing. Therefore,
accurate sensitivity estimation requires estimated volcanic forcing to be matched between the
base and final period, and relatively low. Likewise, initial and final periods should be well
matched regarding the influence of the principal sources of interannual and multidecadal
internal variability, notably ENSO and Atlantic multidecadal variability.
Atlantic multidecadal variability is often quantified by an index of detrended north
Atlantic sea-surface temperatures, either including (Enfield et al. 2001) or excluding (van
Oldenborgh et al. 2009) the tropics, and termed the Atlantic Multidecadal Oscillation (AMO).
The internal multidecadal pattern in near-global sea-surface temperature found by Delsole et
al. (2011) is very similar to Enfield et al.'s AMO index. Enfield et al. (2001) detrended
relative to time, whereas van Oldenborgh et al. (2009) detrended relative to surface temperature. While
following van Oldenborgh et al. in excluding the tropics (which are more affected by ENSO
state than the extratropics), we prefer detrending relative to total forcing, omitting volcanic
years, in order to exclude any forced signal. Whichever definition is used, the AMO has had a
quasi-periodicity of 60−70 years during the instrumental record, peaking around 1875, 1940
and 2005. When using a final period ending in 2016, to maximise the anthropogenic warming
signal, matching its mean AMO state requires a base period either early in the historical
period or in the mid-twentieth century.
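The detrending choice described above (regressing North Atlantic SST on total forcing, with volcanic years left out of the fit) might be sketched as below. This is an illustrative sketch only: the function name, the list-based inputs, the −0.5 Wm−2 threshold used to flag volcanic years, and the choice to return residuals for every year (including volcanic ones) are assumptions made here, not details confirmed by the text.

```python
from statistics import fmean


def amo_index(sst, forcing, volcanic, threshold=-0.5):
    """Residuals of SST regressed on total forcing.

    The straight-line fit uses only years whose volcanic forcing is not
    below `threshold`; residuals are then returned for all years.
    """
    keep = [i for i, v in enumerate(volcanic) if v >= threshold]
    f_mean = fmean(forcing[i] for i in keep)
    s_mean = fmean(sst[i] for i in keep)
    sxx = sum((forcing[i] - f_mean) ** 2 for i in keep)
    sxy = sum((forcing[i] - f_mean) * (sst[i] - s_mean) for i in keep)
    slope = sxy / sxx
    intercept = s_mean - slope * f_mean
    # What is left after removing the forcing-related part is the index.
    return [s - (intercept + slope * f) for s, f in zip(sst, forcing)]
```

Detrending against forcing rather than time means that any forced change in SST, whatever its time profile, is removed before the residual is interpreted as internal variability.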
Matching mean ENSO state for base and final period levels is not practical where a
base period early in the record is used, since the mean ENSO state, as represented by the MEI
index (Wolter and Timlin 1993), was lower then than in recent decades. However, the MEI
index depends partly on non-detrended sea-surface temperature (SST) and could include a
forced element, so use of a detrended version is arguably preferable. On that basis, there is no
difficulty in matching mean ENSO state. In any event, of the natural sources of influence on
sensitivity estimation considered, mean ENSO state appears to be the least influential.
LC15 used base periods of 1859−1882, 1930−1950 and 1850−1900. LC15's preferred
base and final periods were 1859−1882 and 1995−2011, being the longest periods near the
start and at the end of the instrumental record with low volcanic activity and with adequately
matched AMO influence. As volcanic activity has remained low since 2011, the obvious
choice of updated final period is 1995−2016. This includes a number of relatively cold years
but also two very strong El Niños. The decade 2007−2016, which includes a mix of cold and
warm years and ends with a powerful El Niño, is arguably preferable as it provides a higher
ΔF and the best constrained TCR and ECS estimates. Moreover, as the Argo network was
operational throughout 2007−2016, confidence in the reliability of OHU estimation is higher.
Although 1859−1882 is well matched with both 1995−2016 and 2007−2016 as regards
mean volcanic forcing, and acceptably matched for mean AMO state, HadCRUT4v5
observational data sampled a particularly low proportion of the Earth's surface throughout
most of the 1860s – substantially lower than both prior to 1860 and from 1869 on. During the
same period, larger than usual differences arose between the original HadCRUT4v5 and the
globally complete Had4_krig_v2 surface temperature estimates. Infilling through kriging is
subject to greater uncertainty when observations are sparser. There is merit in using the longer
1850−1882 period, excluding all years with low (under 20% of global area) HadCRUT4v5
coverage (being 1860−1868); however, as volcanic forcing was strong (below –0.5 Wm−2)
over 1856−1858 those years would also need to be excluded to avoid mismatched volcanic
forcing. Since the complete shorter 1869−1882 period produces essentially identical TCR and
ECS estimation we use that instead. It is well matched with the 1995−2016 and 2007−2016
final periods as regards mean volcanic forcing as well as AMO and ENSO state. The better
observed 1930−1950 period is also well matched with those final periods, although its mean
AMO state is stronger.
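The year-exclusion rule discussed above for the longer 1850−1882 base period (drop years with coverage under 20% of global area or volcanic forcing below –0.5 Wm−2) could be expressed as a simple filter. The function and argument names are hypothetical, and the input values in the usage example are made up for illustration; they are not HadCRUT4v5 or AR5 data.

```python
def usable_base_years(years, coverage, erf_volcano,
                      min_coverage=0.2, min_volcanic=-0.5):
    """Keep years with adequate areal coverage and weak volcanic forcing."""
    return [
        year
        for year, cover, volc in zip(years, coverage, erf_volcano)
        if cover >= min_coverage and volc >= min_volcanic
    ]


# Made-up illustrative inputs (not observational data).
example = usable_base_years(
    years=[1855, 1857, 1862, 1870],
    coverage=[0.25, 0.24, 0.15, 0.30],
    erf_volcano=[-0.1, -0.9, -0.1, -0.2],
)
```

With these invented inputs, the second year fails the volcanic test and the third fails the coverage test, so only the first and last years are retained.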
TCR and ECS estimates are also computed using much longer base and final periods.
The 1850−1900 long base period, taken in AR5 to represent pre-industrial surface
temperature, has substantial mean volcanic forcing. It is matched with 1980−2016, which has
almost identical mean volcanic forcing and acceptably similar mean AMO and ENSO states.
Figure 3 shows variations in the three sources of natural variability discussed, along
with areal coverage of HadCRUT4v5. Five year running means are shown for the MEI and
AMO indexes.
Fig. 3 Natural factors that influence selection of base and final periods, and surface
temperature dataset coverage, during 1850–2016. Volcanic forcing is from AR5. The AMO
index comprises the residuals from regressing 25–60°N, 5–70°W HadSST3 data on total
forcing with years in which volcanic forcing is < −0.5 Wm−2 omitted, and is scaled up by 3
times. The MEI index has been extended before 1950 using a regression fit to the MEI.ext
index (Wolter and Timlin 2011), and then detrended (relative to time). The two indices are
plotted as five-year centered means (three-year/one-year means for next-but-end/end years);
their units are arbitrary. Annual means of HadCRUT4v5 monthly grid-cell coverage as a
fraction of the Earth's surface are shown. The preferred base and final periods are shaded.
5. Methods
The method used to calculate ECS and TCR is identical to that in LC15, where it is set out in
detail. In summary, the main steps in deriving best estimates and uncertainty ranges for ECS
and TCR for each base period and final period combination are as follows:
1. Unrevised AR5 2011 values for each forcing component (ERFGHG, ERFAerosol,
ERFBCsnow, ERFContrails, ERFOWL, ERFSolar and ERFVolcano) are sampled, using the original
AR5 uncertainty distributions except for aerosol forcing. For aerosol forcing a normal
distribution with unchanged −0.9 Wm−2 median but the revised −0.1 to −1.7 Wm−2
5−95% uncertainty range is used. Where appropriate, part of fractional-type uncertainty
in a forcing component (being all but any fixed element) is treated as independent
between the base and final periods, and the total uncertainty is split between separate
common and independent random elements before sampling. The AR5 efficacy range
for ERFBCsnow is applied probabilistically at this stage.
After dividing by the AR5 2011 best estimates, the (one million) samples are used to
scale the period means computed from the best estimate time series (revised from AR5
where relevant), samples from the fixed elements of solar and volcanic forcing
uncertainty are added and the components combined, thus deriving sampled ΔF values.
The central F2×CO2 value is scaled in the same proportion as the central ERFGHG values.
This produces F2×CO2 samples with uncertainty realizations (proportionately) matching
those for WMGG forcing.
2. Uncertainty distributions for ΔT (using the relevant 100-realizations ensemble) and
for ΔN are computed, adding in quadrature the estimated uncertainties of the base and
final period means and the estimated internal variability, and random samples drawn
from those distributions.
3. For each sample realization of ΔT, ΔF, ΔN and F2×CO2, the ECS and TCR values
given by equations (4) and (5) are calculated. Histograms of the sample ECS and TCR
values are then computed to provide median estimates, uncertainty ranges and
probability densities, treating samples where the denominator is negative as having
infinitely positive sensitivities.
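The sampling steps above can be illustrated with a heavily simplified Monte Carlo sketch. It is not the LC15 procedure: it assumes that equations (4) and (5) take the standard energy-budget forms ECS = F2×CO2 ΔT / (ΔF − ΔN) and TCR = F2×CO2 ΔT / ΔF, draws ΔT, ΔF and ΔN as independent normal variables fitted to a median and 5–95% range, and holds F2×CO2 fixed rather than sampling it jointly with WMGG forcing. The function names, the 200,000-sample size and the F2×CO2 value of 3.80 Wm−2 (the value implied by the λ of 2.29 Wm−2 K−1 and ECS of 1.66 K quoted in Section 7a) are choices made for this illustration. The inputs are the first row of Table 2.

```python
import random
from statistics import median

Z95 = 1.6449  # 95th percentile of the standard normal distribution


def sd_from_range(low, high):
    """Standard deviation of a normal distribution with this 5-95% range."""
    return (high - low) / (2 * Z95)


def sample_sensitivities(d_t, d_f, d_n, f2x, n=200_000, seed=0):
    """Each of d_t, d_f, d_n is (median, 5% bound, 95% bound).

    Returns lists of sampled ECS and TCR values. Samples with a
    non-positive denominator are assigned an infinite sensitivity.
    """
    rng = random.Random(seed)
    ecs, tcr = [], []
    for _ in range(n):
        t = rng.gauss(d_t[0], sd_from_range(d_t[1], d_t[2]))
        f = rng.gauss(d_f[0], sd_from_range(d_f[1], d_f[2]))
        h = rng.gauss(d_n[0], sd_from_range(d_n[1], d_n[2]))
        ecs.append(f2x * t / (f - h) if f - h > 0 else float("inf"))
        tcr.append(f2x * t / f if f > 0 else float("inf"))
    return ecs, tcr


# First row of Table 2 (1869-1882 base, 2007-2016 final, HadCRUT4v5).
ecs, tcr = sample_sensitivities(
    d_t=(0.80, 0.65, 0.95),
    d_f=(2.52, 1.68, 3.36),
    d_n=(0.50, 0.25, 0.75),
    f2x=3.80,
)
ecs_median, tcr_median = median(ecs), median(tcr)
```

Because the inputs are treated as independent and F2×CO2 is not sampled, the medians should land near the central-value ratios (3.80 × 0.80 / 2.02 and 3.80 × 0.80 / 2.52), but the percentile ranges from this sketch are not expected to reproduce those reported in the paper.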
The estimates of ΔT, ΔF and ΔN, and their uncertainty ranges, are given in Table 2, with
the relevant corresponding values from LC15 shown for comparison.
Base period  Final period  ΔT HadCRUT4 [K]   ΔT Had4_krig_v2 [K]  ΔF [Wm−2]         ΔN [Wm−2]
1869–1882    2007–2016     0.80 (0.65–0.95)  0.88 (0.73–1.03)     2.52 (1.68–3.36)  0.50 (0.25–0.75)
1869–1882    1995–2016     0.73 (0.58–0.87)  0.79 (0.63–0.94)     2.26 (1.44–3.09)  0.49 (0.29–0.69)
1850–1900    1980–2016     0.65 (0.51–0.79)  0.71 (0.56–0.86)     2.01 (1.21–2.82)  0.40 (0.21–0.60)
1930–1950    2007–2016     0.61 (0.47–0.75)  0.65 (0.51–0.79)     1.94 (1.22–2.66)  0.45 (0.18–0.72)
Lewis and Curry (2015) estimates for comparison
1859–1882    1995–2011     0.71 (0.56–0.86)  n/a                  1.98 (0.99–2.86)  0.36 (0.15–0.58)
1850–1900    1987–2011     0.66 (0.52–0.81)  n/a                  1.88 (0.92–2.74)  0.41 (0.19–0.63)
Table 2 Best estimates (medians) and 5–95% uncertainty ranges for changes ΔT in global mean
surface temperature, ΔF in effective radiative forcing and ΔN in total heat uptake between the
base and final periods indicated. The final two lines show comparative values for LC15 for the
first two period combinations given in that paper. The values for ΔF are after probabilistically
applying the AR5 efficacy range for ERFBCsnow.
6. Results
ECS and TCR estimates based on each of the four combinations of base period – final period
are presented in Table 3. The ECS estimates in this Section assume that the climate feedback
parameter over the historical period, which they reflect, is a constant. That is, they measure
effective climate sensitivity but assume it equals equilibrium climate sensitivity; the possible
implications of relaxing this assumption are discussed in Section 7f. The relevant results from
LC15 are shown for comparison. Estimates based on both original HadCRUT4v5 surface
temperature data and on the globally-complete Had4_krig_v2 version are given. Probability
density functions (PDFs) for these ECS and TCR estimates are presented in Figure 4.
Base period  Final period  ΔT dataset       ECS best      ECS 17–83%  ECS 5–95%   TCR best      TCR 17–83%  TCR 5–95%
                                            estimate [K]  range [K]   range [K]   estimate [K]  range [K]   range [K]
1869–1882    2007–2016     HadCRUT4v5       1.50          1.2–1.95    1.05–2.45   1.20          1.0–1.45    0.9–1.7
                           Had4_krig_v2     1.66          1.35–2.15   1.15–2.7    1.33          1.1–1.60    1.0–1.9
1869–1882    1995–2016     HadCRUT4v5       1.56          1.2–2.1     1.05–2.75   1.22          1.0–1.5     0.85–1.85
                           Had4_krig_v2     1.69          1.35–2.25   1.15–3.0    1.32          1.1–1.65    0.95–2.0
1850–1900    1980–2016     HadCRUT4v5       1.54          1.2–2.15    1.0–2.95    1.23          1.0–1.6     0.85–1.95
                           Had4_krig_v2     1.67          1.3–2.3     1.1–3.2     1.33          1.05–1.7    0.9–2.15
1930–1950    2007–2016     HadCRUT4v5       1.56          1.2–2.15    1.0–3.0     1.20          0.95–1.5    0.85–1.85
                           Had4_krig_v2     1.65          1.25–2.3    1.05–3.15   1.27          1.05–1.6    0.9–1.95
Lewis and Curry (2015) results for comparison
1859–1882    1995–2011     HadCRUT4 (LC15)  1.64          1.25–2.45   1.05–4.05   1.33          1.05–1.8    0.90–2.5
1850–1900    1987–2011     HadCRUT4 (LC15)  1.67          1.25–2.6    1.0–4.75    1.31          1.0–1.8     0.85–2.55
Table 3 Best estimates (medians) and uncertainty ranges for ECS and TCR using the base and
final periods indicated. Values in roman type compute ΔT using the HadCRUT4v5 dataset;
values in italics compute ΔT using the infilled, globally-complete Had4_krig_v2 dataset. The
preferred estimates are shown in bold. Ranges are stated to the nearest 0.05 K. The final two
lines show the comparable results from LC15 for the first two period combinations given in
that paper. All these ECS estimates assume that the climate feedback parameter is a constant.
For each source of surface temperature data, the four best (median) estimates agree
closely for both ECS and TCR. Based on HadCRUT4v5 data, the best estimates are in the
range 1.50−1.56 K for ECS and 1.20−1.23 K for TCR. Based on globally-complete
Had4_krig_v2 data, which show greater warming, the best estimates are in the range
1.65−1.69 K for ECS and 1.27−1.33 K for TCR. Lower (5%) uncertainty bounds for ECS and
TCR vary little between the four period combinations. Use of 1869−1882 as the base period
and 2007−2016 as the final period provides the best-constrained, preferred, estimates, with
95% bounds for ECS and TCR of 2.45 K and 1.7 K respectively using HadCRUT4v5 (2.7 K
and 1.9 K using Had4_krig_v2); the corresponding median estimates are 1.50 K and 1.20 K
(Had4_krig_v2: 1.66 K and 1.33 K).
The new ECS and TCR median estimates based on HadCRUT4v5 are approximately
10% lower than those in LC15, largely due to the positive revisions to estimated CH4 and
post-1990 aerosol forcing, partly offset by the higher estimated F2×CO2 and (for ECS) by
estimated heat uptake in the final period being a slightly higher fraction of forcing.
Fig. 4 Estimated probability density functions for ECS and TCR using each period
combination shown in the main results. Original GMST refers to use of the HadCRUT4v5
record; Infilled GMST refers to use of the Had4_krig_v2 record. Box plots show probability
percentiles, accounting for probability beyond the range plotted: 5–95 (bars at line ends),
17–83 (box-ends) and 50 (bar in box: median).
Results of some sensitivity analyses are shown in Table 4, with various aspects of the
1869−1882 base period, 2007−2016 final period case being modified. These analyses do not
systematically explore all possible variations in choice of data, uncertainty assumptions or
methodology. For clarity, only values based on HadCRUT4v5 surface temperature data are
shown; fractional sensitivities are similar using Had4_krig_v2 data.
Using 1850−1882 as the base period, with low observational coverage and volcanic
years excluded, produces virtually identical ECS and TCR medians and uncertainty ranges to
using 1869−1882. Generally, estimates of ECS and TCR are modestly sensitive to selection of
base period if no allowance is made for volcanic forcing (as estimated in AR5) having a low
efficacy; when its efficacy is taken as 0.55 the ECS and TCR best estimates are little changed
upon substituting 1850−1900 or 1850−1882 (all years) as the base period. Moreover, applying
a volcanic forcing efficacy of 0.55 when regressing surface temperature per HadCRUT4v5 on
(efficacy-adjusted) forcing over all years in 1850−2016 produces a TCR estimate of 1.19 K,
almost identical to the two-period estimate. By comparison, doing so using unit volcanic
efficacy gives a much lower TCR value of 0.98 K.
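The regression-based TCR check described above might look like the following sketch. It assumes an ordinary least-squares fit with an intercept and takes TCR as the fitted slope multiplied by F2×CO2; neither detail is spelled out in the text, and the function and argument names are hypothetical. Volcanic forcing is scaled by the stated efficacy before being added to the other forcings.

```python
from statistics import fmean


def tcr_by_regression(temp, erf_other, erf_volcano, f2x,
                      volcanic_efficacy=0.55):
    """TCR estimate from regressing temperature on efficacy-adjusted forcing."""
    forcing = [o + volcanic_efficacy * v
               for o, v in zip(erf_other, erf_volcano)]
    f_mean, t_mean = fmean(forcing), fmean(temp)
    sxy = sum((f - f_mean) * (t - t_mean) for f, t in zip(forcing, temp))
    sxx = sum((f - f_mean) ** 2 for f in forcing)
    slope = sxy / sxx  # warming per unit of adjusted forcing (K per Wm-2)
    return f2x * slope
```

Setting `volcanic_efficacy=1.0` in the same function gives the unit-efficacy variant that the text compares against.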
The residuals from regressing surface temperature per HadCRUT4v5 on efficacy-
adjusted forcing over 1850−2016 with volcanic efficacy set at 0.55 have a mean over
2007−2016 only 0.01 K higher than that over 1869−1882 (0.03 K higher using Had4_krig_v2
data). For the 1995−2016 final period the corresponding excesses are similar. The tiny
magnitudes of these inter-period differences indicate that both final periods are well matched
with the 1869−1882 base period as regards internal variability.
Variation from 1869–1882 base period, 2007–2016 final period, main results for the HadCRUT4v5 case
                                                        ECS best      ECS 5–95%   TCR best      TCR 5–95%
                                                        estimate [K]  range [K]   estimate [K]  range [K]
Base case – no variations                               1.50          1.05–2.45   1.20          0.9–1.7
Base period 1850–1882² ex low cover & volcanic yrs³     1.50          1.05–2.45   1.20          0.9–1.7
Base period 1850–1900; volcanic efficacy 1.0            1.44          1.05–2.15   1.16          0.9–1.6
Base period 1850–1900; volcanic efficacy 0.55           1.52          1.1–2.35    1.21          0.9–1.65
Base period 1850–1882¹; volcanic efficacy 0.55          1.52          1.1–2.4     1.22          0.9–1.7
ERFAerosol uncertainty range 5% bound as per AR5        1.51          1.05–2.65   1.21          0.9–1.8
ERFWMGG uncertainty range scaled up by 50%              1.50          1.05–2.6    1.20          0.9–1.75
ERFOWL uncertainty range scaled up by 50%               1.50          1.05–2.55   1.20          0.85–1.75
ERFAerosol uncertainty range scaled down by 50%         1.50          1.1–2.15    1.20          0.95–1.55
AR5 original ERFGHG + >1990 aerosol & O3 forcing⁴       1.68          1.1–3.25    1.31          0.9–2.05
0-2000 m OHC based only on Cheng et al data             1.47          1.05–2.35   n/a           n/a
0-2000 m OHC based only on Levitus/NOAA data            1.54          1.05–2.55   n/a           n/a
ERFLUC set to zero (increases ΔF by 0.10 Wm−2)          1.43          1.0–2.25    1.16          0.85–1.6
Table 4 Sensitivity of best estimates (medians) and uncertainty ranges for ECS and TCR. Ranges
are stated to the nearest 0.05 K.
Reverting the aerosol forcing 5% uncertainty bound back from –1.7 Wm−2 to the
original AR5 –1.9 Wm−2 level increases the 95% bounds for ECS and TCR by respectively
0.2 K and 0.1 K; their median estimates barely change. Scaling up by 50% the uncertainty
range for ERFWMGG increases those bounds by 0.15 K and 0.05 K respectively, while doing so
for ERFOWL increases them by 0.1 K and 0.05 K respectively; scaling down these uncertainty
ranges by 50% has approximately equal but opposite effects. Reducing the aerosol forcing
uncertainty range by 50% reduces the 95% bound for ECS by 0.3 K to 2.15 K and that for
TCR by 0.15 K to 1.55 K.
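For reference, the normal distribution used for aerosol forcing in Section 5 (median −0.9 Wm−2, 5−95% range −0.1 to −1.7 Wm−2) can be constructed from its bounds as sketched below. The helper name is hypothetical, and reading "scaled down by 50%" as halving the standard deviation about an unchanged median is an assumption made here, not something the text states.

```python
from statistics import NormalDist


def normal_from_5_95(low, high):
    """Normal distribution whose 5th and 95th percentiles are low and high."""
    z95 = NormalDist().inv_cdf(0.95)  # about 1.645
    return NormalDist(mu=(low + high) / 2, sigma=(high - low) / (2 * z95))


# Revised aerosol ERF distribution: 5-95% range of -1.7 to -0.1 Wm-2.
revised = normal_from_5_95(-1.7, -0.1)

# Assumed reading of the "scaled down by 50%" case: same median, half the spread.
halved = NormalDist(mu=revised.mean, sigma=revised.stdev / 2)
halved_range = (halved.inv_cdf(0.05), halved.inv_cdf(0.95))
```

Under that assumption the halved range runs from about −1.3 to −0.5 Wm−2. The original AR5 range (−0.1 to −1.9 Wm−2 with a −0.9 Wm−2 median) is asymmetric, so this symmetric construction would not apply to it.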
2 Heat uptake for the 1850−1882 base period is set mid-way between those for the 1850−1900 and
1869−1882 periods, or equal to the latter when low coverage and volcanic years are excluded.
3 The criterion for excluding years from the 1850−1882 base period due to low coverage or volcanism is
HadCRUT4v5 areal coverage < 0.2 or ERFVolcano < –0.5 Wm−2.
4 AR5 original (unrevised) post-2011 tropospheric ozone and aerosol forcing are derived by extrapolation
using their small 2002−11 trends.
Using unrevised AR5 forcing–concentration relationship estimates for the principal
WMGG and for post-1990 aerosol and tropospheric ozone forcing results in the ECS and
TCR median values increasing by 0.18 K and 0.11 K respectively. The 95% uncertainty
bounds for ECS and TCR increase more, by 0.8 K and 0.35 K respectively, but remain well
below their levels in LC15. In contrast, computing 0−2000 m OHU using only Cheng et al., or
only Levitus et al., data instead of using estimates averaged over those datasets (and, for the
0–700 m layer, the Ishii and Kimoto dataset), affects ECS best estimates by merely ±2−3%,
with the 95% bound altering by ±0.1 K; TCR estimates are unaffected.
7. Discussion
Since publication of LC15, various papers have claimed that the energy budget approach
and/or temperature dataset used in LC15 do not enable ECS and TCR to be determined
satisfactorily from historical observations, and lead to the LC15 estimates being biased low.
Here we address these critiques, as well as implications of feedback analysis and research
concerning SST warming patterns.
a. Role of historical sea-surface temperature warming patterns
The pattern of observed surface warming over the historical period differs from that
simulated by most CMIP5 models. Gregory and Andrews (2016) (GA16) found that feedback
strength λ in simulations by two atmosphere-only models (AGCMs), HadGEM2-A and
HadCM3-A, driven by observed evolving changes in SST and sea-ice, but with preindustrial
atmospheric composition and other forcings fixed (amipPiForcing), was considerably higher
over the historical period than in years 1–20 of abrupt4xCO2 simulations. Moreover, λ
showed substantial decadal variation, being particularly large over the post-1978 period. Zhou
et al. (2016) found broadly similar behavior in two other AGCMs.
We focus here on GA16's amipPiForcing simulation data from the more advanced,
current generation HadGEM2 model. GA16's analysis of variation in λ (their α ) measured by
regression over a 30-year sliding window, with small temperature changes except towards the
end, is not relevant to energy budget estimation spanning much longer periods and larger
changes. Moreover, GA16’s analysis method produces large variability in λ estimates when
tested on pseudodata embodying a constant λ (Figure S2).
Plotting ΔR against ΔT using pentadal means, averaging-out interannual noise, and
considering how averages over consecutive longer periods compare (Figure 5a), provides a
more suitable assessment of the stability of feedback strength in HadGEM2-A over the
historical period. Over the last 75 years, during which over 80% of the total forcing change
occurred, ΔR and ΔT pentadal anomalies are clustered around the best-fit line, with means for
all five 15-year sub-periods lying very close to it. There are a few pentadal points some
distance from the best fit line, as one would expect from internal variability, but little
evidence of fluctuating multidecadal feedback strength. The largest excursions of ΔR from the
best-fit λ estimate of 1.90 Wm−2 K−1 were in the 20 years prior to 1925 and in the decade
centered on 1980 (Figure S3).⁵ The latter was responsible for the strong 1970−1995 upwards
trend in 30-year regression-based λ in GA16 Figure 2(a); if the 1976−1985 ΔR values are
suitably adjusted, the trend is flat from 1960 on (Figure S4). However, the anomalous ΔR
values circa 1980 have only a minor effect on λ estimates derived from 15-year means: for
both 1966−1980 and 1981−1995, ΔR / ΔT was only 7% lower than for 1996−2010. The early
heavy volcanism (during 1883–1905) appears not to have affected the best-fit λ: the ratio of
changes in R and T between 1931−1960 and 1996−2010, two volcanism-free periods, gives
almost the same value. Fits for each individual amipPiForcing run are very similar (negligible
y-intercept, slopes within 5% of the 1.90 Wm−2 K−1 for the ensemble-mean, R² = 0.93 versus
0.94 for the ensemble-mean, in all cases with 1906−25 data excluded). This analysis shows
that HadGEM2-A displays a near constant λ of 1.9 Wm−2 K−1 over the historical period when
driven by observed evolving SST patterns – over 2.3 times as high as the 0.82 Wm−2 K−1 over
years 1−20 of HadGEM2-ES's abrupt4xCO2 simulation, and corresponding to an effective
climate sensitivity of only 1.67 K.⁶
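The two figures quoted at the end of this paragraph follow directly from the stated values, assuming effective climate sensitivity is obtained as F2×CO2 divided by λ and using the HadGEM2-ES F2×CO2 of 3.18 Wm−2 given in footnote 6. The symbol S_eff is a label introduced here.

```latex
S_{\mathrm{eff}} = \frac{F_{2\times\mathrm{CO_2}}}{\lambda}
  = \frac{3.18\ \mathrm{W\,m^{-2}}}{1.90\ \mathrm{W\,m^{-2}\,K^{-1}}}
  \approx 1.67\ \mathrm{K},
\qquad
\frac{\lambda_{\mathrm{amipPiForcing}}}{\lambda_{\mathrm{abrupt4xCO2}}}
  = \frac{1.90}{0.82} \approx 2.3
```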
GA16 offered three possible explanations for feedback strength being higher over the
historical period in their amipPiForcing experiments than over years 1−20 of the
abrupt4xCO2 simulations. They found two of them conflicted with their calculated trends in
λ, leading them to favour the importance of the third explanation, being that unforced
variability strongly influenced historical variations in SST patterns. However, Zhou et al.
(2016) found that if CMIP5 control simulations realistically estimate internal variability on
decadal timescales, then at least part of the 1980-2005 SST trend pattern must be forced. In
HadGEM2's case, under 1% of internal variability realizations simulated by CMIP5
AOGCMs would raise the ΔR value for the final 15 years of the amipPiForcing run implied
by the λ value HadGEM2-ES exhibits early in its abrupt4xCO2 simulation even 30% towards
its actual amipPiForcing value (Figure S5). Our finding that the relationship between pentadal
ΔR and ΔT in HadGEM2-A during its amipPiForcing experiment is stable, apart from two
excursions, (Figures 5a and S3) strongly points to the observed SST pattern evolution being
largely forced and to much lower λ values in years 1−20 of HadGEM2-ES's abrupt4xCO2
experiment reflecting unrealistic simulated SST pattern evolution. It follows that there is no
reason to believe energy budget sensitivity estimates based on changes over the full historical
period are biased downwards by internal variability in SST patterns.
5 The first excursion is cotemporaneous with a period of strongly negative SST anomalies in the North
Atlantic and reconstructed salinity anomalies in the Labrador Sea (Muller et al. 2014). The second
excursion is cotemporaneous with decadal variability linked to the 1976 Pacific climate shift (Trenberth
and Hurrell 1994). Both events likely arose from multidecadal internal variability; there is little evidence of
either being forced.
6 Based on our estimated F2×CO2 for HadGEM2-ES of 3.18 Wm−2.
Fig. 5 Change in net outgoing radiation ΔR plotted against change in surface temperature
ΔT. Blue and cyan circles show pentadal means, red squares show 15-year means. Panel a:
average anomalies, relative to the 1871–1900 mean, from two 1871–2010 amipPiForcing
simulations by HadGEM2-A. The black line shows the linear fit with pentads spanning
1906–1925 (cyan circles) excluded (see Figure S2). No-intercept fitting with all pentads
included yields an almost identical fit. Plotted 15-year means are for periods ending 1950,
1965, 1980, 1995 and 2010. Panel b: observationally-estimated anomalies over 1872–2016
relative to the 1850–1884 mean. Forcing is as per section 3.a, with an efficacy of 0.55
applied to ERFVolcano. ΔR is estimated as (2.52 − 0.50)/2.52 * ΔF; this scaling is based on
the ΔF and ΔN values from row one of Table 2. Had4_krig_v2 is used for ΔT. The black
line shows the no-intercept linear fit to all pentadal values. Fitting with an intercept, but
excluding pentads spanning 1907–1926, gives a 1% lower best-fit slope. Plotted 15-year
means are for periods 1927–1941, 1942–1956, 1957–1971, 1972–1986, 1987–2001 and
2002–2016. Pre-1927 pentads are colored cyan.
Observational estimates of the relationship between ΔR and ΔT throughout the
historical period are also relevant. We estimate λ using all 15-year periods in 1927−2016, as
well as by regression over 1872−2016, anomalizing relative to a 1850−1884 base period.
Average volcanism in 1850−1884 matches that over both 1927−2016 and 1872−2016, and
when using 2007−2016 anomalies that base period gives the same λ estimate (2.29 Wm−2 K−1,
corresponding to an ECS of 1.66 K) as per the main 2007-2016 based results with globally-
complete ΔT. Until recent decades ΔR was unobserved; we approximate it by scaling ΔF pro
rata to the observationally-estimated ΔR:ΔF ratio for 1869−1882 to 2007−2016, assuming
that ΔN is proportional to ΔT over the historical period (Gregory and Forster 2008). We scale
ERFVolcano by 0.55 to adjust for its low efficacy. Our no-intercept pentadal regression fit over
1872–2016 gives λ = 2.27 Wm−2 K−1. Post-1926 (ΔR, ΔT) pentadal means (Figure 5b) cluster
around the best-fit line, while most of the 15-year means lie almost on it.
The considerable stability of observationally-based λ estimates over 1927−2016
provides further evidence that feedback strength did not fluctuate materially during the
historical period, and strengthens confidence in our main results.
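The approximation of ΔR and the no-intercept fit used in this subsection can be sketched as follows. The scaling factor (2.52 − 0.50)/2.52 is the one given in the Fig. 5 caption; the function names are hypothetical, and the inputs are assumed to be pentadal-mean anomalies with the 0.55 volcanic efficacy already applied to ΔF.

```python
def delta_r_from_forcing(delta_f, d_f_ref=2.52, d_n_ref=0.50):
    """Approximate radiative response by scaling forcing pro rata.

    The reference values are the change in forcing and in heat uptake
    between 1869-1882 and 2007-2016 (row one of Table 2).
    """
    scale = (d_f_ref - d_n_ref) / d_f_ref
    return [scale * f for f in delta_f]


def no_intercept_slope(x, y):
    """Least-squares slope of y on x for a line forced through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def feedback_parameter(delta_t, delta_f):
    """Estimate lambda (Wm-2 K-1) from temperature and forcing anomalies."""
    return no_intercept_slope(delta_t, delta_r_from_forcing(delta_f))
```

As a single-point check, the Table 2 values give ΔR ≈ 2.02 Wm−2 against ΔT = 0.88 K for Had4_krig_v2, a ratio of about 2.3 Wm−2 K−1, in line with the λ values reported above.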
b. Weaknesses in the feedback analysis constraint
It has been argued that relatively well understood feedbacks (water vapor/lapse rate,
albedo) imply, in the absence of evidence for cloud feedbacks being significantly negative, an
upper bound on the climate feedback parameter corresponding to ECS being 2 K or higher,
particularly if anvil cloud-height feedback is also included. However, an analysis of feedbacks
and forcing in CMIP5 models (Caldwell et al. 2016) indicates that if diagnosed cloud
feedbacks are excluded, the median implied ECS reduces from 3.4 K to 2.3 K, with ECS
falling below 2 K in a quarter of the models. More fundamentally, the fact that AGCMs can
generate widely varying climate feedback strength depending on the pattern of SST change
(which feedback analysis does not constrain) weakens the feedbacks constraint argument.
A substantial part of the initial radiative response to CO2 forcing may be viewed (and
mathematically modeled) as reflecting a sub-decadal timescale ocean adjustment process
during which ocean heat transport and SST patterns alter, negatively affecting shortwave
cloud radiative effect (Andrews et al. 2015) so that R increases for a given T, thus partially
counteracting the forcing independently of surface temperature increase (Williams et al. 2008;
Sherwood et al. 2015; Rugenstein et al. 2016). Feedback analysis derived constraints, even if
correct, do not apply to such an adjustment. Accordingly, as during the initial decade or two
the radiative response partly reflects adjustments, the apparent climate feedback parameter
may be considerably higher then than feedback analysis suggests is possible. While the
(lower) underlying climate feedback parameter is not affected by adjustments, and may be
time-invariant, ECS is affected.
In abrupt4xCO2 simulations, where diagnosed climate feedback strength is typically
substantially greater in the first decade or two than subsequently, eigenmode decomposition
of CMIP5 AOGCM responses (Proistosescu and Huybers 2017) indicates that only about one-
third of the initial forcing remains once sub-decadal timescale responses are complete, and
that the climate feedback parameter associated with sub-decadal timescale responses, if not
regarded as partially associated with adjustment processes, ranges up to 3 Wm−2 K−1.
We conclude that simple global feedback analysis cannot rule out low ECS even if
global cloud feedback is ultimately positive, because radiative response, forcing adjustments
and feedbacks depend on the pattern of SST warming, which may differ significantly from
that simulated by AOGCMs.
c. ERF efficacy
There have been suggestions that the composite forcing during the historical period
has an overall ERF efficacy below one, so that historical forcing will have produced less
warming than CO2 forcing of equal ERF magnitude (Shindell 2014; Kummer and Dessler
2014; Marvel et al. 2016). In most cases, the shortfall is attributed principally to spatially
inhomogeneous negative aerosol forcing having an efficacy exceeding one. Using historical
all-forcings, WMGG-only and natural forcings-only simulations by a small ensemble of
CMIP5 models, Shindell estimated that aerosol ERF – combined with the much smaller ozone
ERF – had an efficacy of 1.5, resulting in the (transient) efficacy of historical ERF being
approximately 0.85. Kummer and Dessler showed that applying Shindell's aerosol and ozone
ERF efficacy estimate increased their ECS estimate by 50%.
Marvel et al., using the GISS-E2-R model and a set of single-forcing simulation-
ensembles as well as a historical all-forcings simulation-ensemble, with the applicable ERF
determined from a further set of simulation-ensembles, estimated historical composite ERF to
have transient and equilibrium efficacies below one; we discuss these findings below.
However, they found that these shortfalls were due to solar, volcanic, ozone and (for
equilibrium efficacy) WMGG ERF having an efficacy below one, with aerosol ERF having an
efficacy of 1.0. Other single forcing simulation studies also indicate that aerosol ERF does not
have an efficacy exceeding one (Hansen et al. 2005; Ocko et al. 2014; Paynter and Frölicher
2015; Forster 2016). Although Rotstayn et al. (2015) obtained an aerosol ERF efficacy
estimate of 1.4 by regressing surface temperature change over the historical period against
estimated aerosol ERF in an ensemble of CMIP5 models, their result is strongly model-
ensemble dependent. Excluding an outlier model (FGOALS-s2) makes their efficacy estimate
statistically indistinguishable from one.
Complicating matters, for aerosols the forcing and response may vary significantly
with climate state (Miller et al. 2014; Nazarenko et al. 2017). Shindell (2014) (and thereby
Kummer and Dessler 2014) and Marvel et al. (2016) estimated aerosol ERF using model
simulations in which the climate state differed from that when composite historical forcing
was applied, so their results are unreliable in the presence of aerosol forcing or response
climate-state dependency. As Shindell differenced results from forced simulations involving
different climate states and forcing combinations, his findings (and thereby Kummer and
Dessler's) are particularly susceptible to bias from aerosol forcing or response climate-state
dependency.
Efficacy estimates based purely on composite historical forcing may be more reliable.
Marvel et al. estimated the efficacy (their transient efficacy) of composite historical
instantaneous radiative forcing at the tropopause (iRF, an approximation to RF) as 1.00.
Although their corresponding ERF (transient) efficacy estimate, which is more relevant to
energy-budget studies, was 0.88, they derived it by comparing year-2000 forcing with mean
1996−2005 temperatures, which does not produce a satisfactory estimate. In GISS-E2-R, year
2000 forcing was higher than the 1996−2005 mean, and surface temperature in the second
half of the 1990s was still depressed by recovery from the Pinatubo eruption (Table S1).
Recalculating efficacy using warming over 2000−2005, scaling year-2000 historical ERF by
the ratio of average 2000−2005 iRF to year-2000 iRF, raises the Marvel et al. (transient)
efficacy of historical ERF to 1.00 (Supplementary Material: S3). Consistent with this, Hansen
et al. (2005) estimated (transient) efficacy relative to historical ERF derived by regression as
marginally above one.
Marvel et al. also derived a new efficacy metric, equilibrium efficacy, that accounts
for variation in heat uptake efficiency between forcings. However, their methods also bias
downwards their historical forcing equilibrium efficacy estimates. Recalculating equilibrium
efficacy for historical ERF using the same mean 2000−2005 historical ERF value as for our
re-estimation of transient efficacy, and the full TOA radiative imbalance rather than just its
ocean heat uptake component, raises their 0.76 equilibrium efficacy estimate to 1.04 when the
comparison is made with the response to CO2-only forcing over a similar time period
(Supplementary Material: S3).
Hence we conclude that assertions that historical forcing has an efficacy below one
appear to be unjustified, so that the assumption of λ being independent of forcing composition
holds for the change in composite forcing over the historical period (of which the volcanic
component is negligible).
d. Global incompleteness of the surface temperature dataset
In principle a globally-complete surface temperature dataset is preferable, although the
potential inaccuracy introduced by infilling might be greater than estimated, particularly in the
early part of the record. Even during the well-observed satellite period, it is not invariably true
that infilling is beneficial. ECMWF (2015) gives a global-mean comparison over 1979−2014
of 2 m air temperature for land and SST for ocean per ERA-interim (Dee et al. 2011) –
generally considered the best reanalysis dataset – both on a globally-complete basis and with
monthly coverage reduced to match that of HadCRUT4. The 1979−2014 linear trend of their
globally-complete estimates was closely in line with that based on HadCRUT4 coverage
(which equaled the actual HadCRUT4v5 trend), whereas Had4_krig_v2 shows a 9% higher
trend over that period.
Nevertheless, it is more appropriate to use sensitivity estimates based on globally-
complete surface temperature data for comparisons with CMIP5 model ECS and TCR values
and others based on globally-complete data. We use only our Had4_krig_v2-based estimates
for doing so.
e. Use of anomaly temperatures and SST versus air temperature over the oceans
Using CMIP5 model simulations, it has been claimed (Cowtan et al. 2015; Richardson
et al. 2016) that even a globally-complete surface temperature estimate like Had4_krig_v2
may understate warming in global mean near-surface air temperature due to its use of SST
over the ice-free ocean and of anomaly temperatures. Richardson et al. (2016) estimated a
historical bias of 7−9% if real-world behavior matched that of the average CMIP5 model.
They refer to the related discussion by Cowtan et al. (2015), who estimated an average bias of
7% for historical warming (their Table S1, averaging all periods with >0.2 K warming). Two
causes each contributed approximately half of the 7% bias.
First, Cowtan et al. argued that temperature changes in areas becoming free of sea ice,
as it shrinks, are understated due to the use of anomalies. However, CMIP5 model simulations
cannot provide a realistic estimate of any resulting bias in historical warming, since most
models simulate strong warming in Antarctica and a reduction in surrounding sea ice, whereas
little Antarctic warming has occurred and sea ice there has actually increased. Cowtan and
Way (2014: update) found that in reality the effect on temperature estimates of assuming sea
ice extent was fixed (in which case no bias arises) was minimal.
Secondly, Cowtan et al. argued that in CMIP5 models SST (tos) warms less than
ocean near-surface air temperature (tas), resulting on average in surface temperature warming
less when SST rather than marine air temperature is used. However, CMIP5 models generally
treat the ocean's skin temperature, which determines its interactions with the atmosphere, as
equal to the top model ocean layer, typically 10 m deep, so that tas – tos really reflects the
difference between model-simulated air temperature and ocean skin temperature. Even if the
excess of near-surface air temperature increase over ocean skin temperature increase in
CMIP5 models is realistic, SST, which is typically measured at 5−10 m deep, is significantly
different from skin temperature and may increase faster. Observations provide alternative
evidence. The HadNMAT2 (Kent et al. 2013) dataset shows a lower global trend in near-
surface marine air temperature over its 1880−2010 record than does HadSST3.1.1.0, the sea-
surface temperature component of HadCRUT4v5, although possible inhomogeneities mean
this result is uncertain. Moreover, the 1979−2014 trend in the globally-complete ERA-interim
data increases by just 2% when using background 2 m marine air temperature (calculated by
the reanalysis AGCM) rather than SST (ECMWF 2015; see footnote 7). Over 1979 to July 2016 – a period
in which the bulk of the historical period warming and sea-ice reduction occurred – ERA-
interim shows marginally greater warming when using background marine air temperature
rather than analyzed SST, but the trend is 0.17 K/decade in both cases – and lower than the
0.18 K/decade per both the SST-using HadCRUT4v5 and Had4_krig_v2 datasets (Simmons
et al. 2017).
Footnote 7: Digitizing the complete global averages data in the ECMWF (2015) bar graph gives a 1979−2014 trend of
0.158 K decade−1, or 0.159 K decade−1 when masked to HadCRUT4 coverage. This data is a blend of 2 m
temperature over land and SST over ocean (Paul Berrisford, ECMWF, pers. comm. 2016). A 2% higher
1979−2014 trend of 0.161 K decade−1 was computed using data from
https://climate.copernicus.eu/sites/default/files/repository/Temp_maps/Data_for_month_8_2017_plot_3.txt. That
data is for surface air temperature anomalies. Both sets of data have been adjusted by ECMWF for
inhomogeneities in their source of analyzed SST.
On balance the observational evidence points to past warming in global mean
temperature when using near-surface air temperature everywhere being little different from
when blending it with SST over the ocean. The evidence from comparing ERA-interim trends
using marine air temperature and using SST, which points to approximately 2% slower
warming when using SST, is perhaps most credible. However, this excess is tiny, and could
be biased high by the reanalysis AGCM's behavior.
We conclude that any underestimation of past global near-surface air temperature
warming arising from blending SST data over the ice-free ocean with near-surface air
temperature elsewhere, as in Had4_krig_v2, is sufficiently small to be ignored (and could
even be negative). While incorporating an extra, multiplicative, uncertainty with a standard
deviation of 4% in all the Had4_krig_v2 ΔT values might nevertheless be justified, it would
not alter any ECS or TCR 5-95% uncertainty range by more than ±0.01 K.
f. ECS versus ECShist
The possibility that energy-budget climate sensitivity estimates based on changes over
the historical period, which measure λ over that period and assume it is invariant (and which
thus actually reflect an effective climate sensitivity, ECShist), might differ from ECS was
brought up in section 2. ECShist can be quantified fairly accurately in AOGCMs, their ECS
estimated from centennial model response in abrupt4xCO2 simulations, and an ECS-to-
ECShist ratio derived.
We have calculated an ECS-to-ECShist ratio for an ensemble of 31 CMIP5 models,
deriving ECS by Gregory-plot regression (Gregory et al. 2004) over years 21−150 (Armour
2017) and taking the mean ECShist estimate from three methods that access different
realisations of model internal variability (Supplementary Material: S4). The three methods
provide almost identical ensemble-mean ECShist estimates (Table S2). Over the entire
ensemble, ECS varies between 0.91× and 1.52× ECShist, the median ratio being 1.095, very
close to the 1.096 ratio estimated by Mauritsen and Pincus (2017). Armour (2017) and
Proistosescu and Huybers (2017) reported higher ECS-to-ECShist ratios (respectively 1.26
ensemble-mean and 1.34 ensemble-median), but we find their estimation methods less
satisfactory, causing quantifiable biases.
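The Gregory-plot step described above can be sketched in a few lines. This is a minimal illustration, assuming annual-mean ΔT and ΔN anomalies from an abrupt4xCO2 run are available as arrays; the function name gregory_ecs, the synthetic noise-free input series and the default 4×CO2-to-2×CO2 forcing ratio of 2 (a purely logarithmic assumption) are illustrative choices, not the code or data used for this study.

```python
import numpy as np

def gregory_ecs(dT, dN, first_year=21, last_year=150, f4x_over_f2x=2.0):
    """OLS Gregory-plot ECS estimate from annual-mean abrupt4xCO2 anomalies.

    dT, dN: annual means, with index 0 corresponding to simulation year 1.
    f4x_over_f2x: assumed ratio of 4xCO2 to 2xCO2 forcing (2.0 if purely logarithmic).
    """
    dT = np.asarray(dT, dtype=float)[first_year - 1:last_year]
    dN = np.asarray(dN, dtype=float)[first_year - 1:last_year]
    slope, intercept = np.polyfit(dT, dN, 1)  # fit dN = intercept + slope * dT
    dT_eq_4x = -intercept / slope             # warming at which dN reaches zero
    return dT_eq_4x / f4x_over_f2x

# Synthetic, noise-free series for illustration only (not model output).
years = np.arange(1, 151)
dT = 6.0 * (1.0 - np.exp(-years / 40.0))
dN = 7.4 - 1.2 * dT                           # assumed forcing 7.4 W m^-2, feedback 1.2 W m^-2 K^-1
ecs = gregory_ecs(dT, dN)
print(f"Gregory-plot ECS: {ecs:.2f} K")
```

With real model output both ΔT and ΔN contain internal variability, so an OLS fit of this kind is affected by regressor error; the sketch ignores that complication.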
A reconciliation of the mean ECS-to-ECShist ratio for CMIP5 per Armour (2017)
(A17) to our 1.095 ratio is as follows. We provide a similar reconciliation for Proistosescu
and Huybers (2017) in the Supplementary Material (S5).
1. A17, in calculating ECShist values (there termed ECSinfer), estimated F2×CO2 from the
y-axis intercept when regressing ΔN against ΔT over years 1−5 of abrupt4xCO2
simulations. Doing so does not provide an unbiased ERF basis estimate of F2×CO2, since
during year one CO2 top-of-atmosphere forcing is moving from its instantaneous value
towards its ERF value as the stratosphere, troposphere and other annual- or
shorter-timescale climate system components adjust to the imposed forcing independently of
surface temperature increase. For example, stratospheric adjustment, which reduces
forcing, takes several months to complete. When regressing over only five years, the
inclusion of year one data significantly increases the mean F2×CO2 estimate, resulting in
lower ECShist estimates. We regress over years 2−10, avoiding bias from non-fully
adjusted year one data; time-variation of λ is insignificant in the first decade. A17's
ensemble-mean ECS-to-ECShist ratio calculated using regression over years 2−10 to
determine F2×CO2, but otherwise using his methods, would be 1.215.
2. A17 did not allow for the slightly faster than logarithmic relationship of CO2 forcing
to concentration (Etminan et al. 2016). There is no reason to think that CO2 radiative
forcing code in CMIP5 models does not, on average, reflect that relationship – the
logarithmic relationship given in AR5 was known only to be an approximation. The
effect is a 0.7% upwards bias in A17's mean ECShist estimate (which is based on ΔN
and ΔT values in years 85−115 of 1pctCO2 simulations) but a 4.6% upwards bias in
A17's mean ECS estimate (which is based on abrupt4xCO2 simulations). Adjusting
for this bias (3.9% net) reduces A17's ECS-to-ECShist ratio estimate further, to 1.170.
3. A17 estimate ECS using OLS regression of annual-mean years 21−150 abrupt4xCO2
ΔN and ΔT values, but ΔT as well as ΔN is affected by internal variability and their
fluctuations are generally weakly correlated. Where the regressor variable contains
errors, OLS regression underestimates the slope coefficient (Deming, 1943). Using
Deming regression to derive unbiased ECS estimates (Supplementary Material S4),
A17's ensemble-mean ECS estimate is 2.0% lower than when using OLS regression.
Adjusting for this bias further reduces A17's ensemble-mean ECS-to-ECShist ratio, to
1.146.
4. We use three different methods to estimate ECShist, one being A17's method,
averaging their results. For A17's ensemble our mean ECShist estimate is the same as
when using only A17's method, so using our ECShist estimation basis its mean ECS-to-
ECShist ratio is also 1.146.
5. A17 quote a mean ECS-to-ECShist ratio, but since the distribution is skewed it is
appropriate to use the median, a robust and parameterization-independent measure, as
the central estimate. The ensemble-median A17 ECS-to-ECShist ratio, using our
ECShist calculation basis, is 1.115, lower than the 1.146 mean ECS-to-ECShist ratio.
6. A17 use a smaller ensemble of CMIP5 models (21 rather than our 31), which
disproportionately excludes models with low ECS-to-ECShist ratios. For our ensemble,
the median ECS-to-ECShist ratio using our calculation basis is 1.095 (see footnote 8).
Footnote 8: If our ensemble were equally weighted by modeling center, the median ECS-to-ECShist ratio would be
1.082.
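The arithmetic linking steps 1 to 3 of the reconciliation can be checked with a short sketch. It assumes that the stated percentage biases combine multiplicatively, which the text implies but does not state; the variable names are illustrative.

```python
# Ratio after step 1 (regression over years 2-10 to determine F2xCO2), as quoted in the text.
ratio_step1 = 1.215

# Step 2: upward biases from assuming a purely logarithmic CO2 forcing relationship.
bias_ecshist = 0.007  # 0.7% in A17's mean ECShist estimate
bias_ecs = 0.046      # 4.6% in A17's mean ECS estimate
net_bias_pct = ((1 + bias_ecs) / (1 + bias_ecshist) - 1) * 100
ratio_step2 = ratio_step1 * (1 + bias_ecshist) / (1 + bias_ecs)

# Step 3: Deming rather than OLS regression lowers the mean ECS estimate by 2.0%.
ratio_step3 = ratio_step2 * (1 - 0.020)

print(f"net step-2 bias: {net_bias_pct:.1f}%")
print(f"ratio after step 2: {ratio_step2:.3f}")
print(f"ratio after step 3: {ratio_step3:.3f}")
```

Under that assumption the sketch reproduces the 3.9% net bias and the 1.170 and 1.146 ratios quoted in steps 2 and 3. Steps 4 to 6 depend on per-model values and cannot be reproduced from the summary figures alone.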
The ECS-to-ECShist ratio in CMIP5 models should vary positively with ECShist
(Armour 2017); it tends to do so and is generally moderate (≤ 1.16) where ECShist is under 2.9
K, although a linear fit has little explanatory power. We derive a probabilistic estimate for
ECS that reflects behavior of CMIP5 models by scaling our globally-complete Had4_krig_v2-
based energy-budget ECShist estimate using CMIP5 model ECS-to-ECShist ratios, binned (0.2
K width) by ECShist. We allocate the million sample observationally-based ECShist estimates
between the bins and scale them by the ECS-to-ECShist ratios of models in each bin, taking
models from the nearest bin(s) where the ECShist bin is empty and allocating samples falling
in each bin equally between the applicable models. The resulting ECS median estimate is 1.76
K (5–95%:1.2–3.1 K). Scaling the median of our energy-budget ECShist estimate by the 1.06
median ECS-to-ECShist ratio for the 14 CMIP5 models with an ECShist value within its
1.15−2.7 K uncertainty range likewise produces a 1.76 K median ECS estimate. A 3.1 K 95%
uncertainty bound for ECS also results if the million sample Had4_krig_v2-based ECShist
estimates are scaled using the CMIP5 ensemble-median ECS-to-ECShist ratio of 1.095 with
normally-distributed uncertainty added to give a 0.79–1.40 5−95% range.
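As a worked check on two of the figures in this paragraph: the first line scales the 1.66 K median ECShist estimate by the 1.06 median ratio. The remaining lines assume a symmetric normal distribution centred on the 1.095 ensemble-median ratio, so that the 5% and 95% points lie 1.645 standard deviations either side of it. The symbol σ_r for the standard deviation of the ratio is introduced here purely for illustration.

```latex
\begin{align*}
1.66\ \mathrm{K} \times 1.06 &\approx 1.76\ \mathrm{K},\\
\sigma_r &\approx \frac{1.40 - 0.79}{2 \times 1.645} \approx 0.185,\\
1.095 \pm 1.645\,\sigma_r &\approx (0.79,\ 1.40).
\end{align*}
```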
The upper bound generated for ECS is not necessarily robust; the joint distribution of
ECS and ECShist in CMIP5 models may not be a realistic enough measure of uncertainty in
the ECS-to-ECShist ratio, nor is it known how accurately ECS can be estimated from 150-year
abrupt4xCO2 simulations. If the 95% uncertainty bound for ECShist estimated using
Had4_krig_v2 data, 2.7 K, were scaled up by the highest ECS-to-ECShist ratio among CMIP5
models with an ECShist below 2.85 K, the ECS upper bound would be 3.4 K. However, much
of any excess of ECS over ECShist would take centuries to be realised in surface warming,
with little effect on warming in 2100. Twenty-first century warming arising from future
forcing increases will largely be determined by TCR, with any excess of ECS over ECShist
being almost irrelevant. Even if the highest ECS-to-ECShist ratio found in CMIP5 models
applied, warming in 2100 due to the past increase in forcing would be only 0.1 K greater than
if ECS equaled ECShist (Mauritsen and Pincus 2017).
Observationally-based evidence of the ECS-ECShist relationship can be obtained by
comparing historical-period energy budget sensitivity estimates with those based on past
changes between equilibrium climate states (implying zero ΔN), using proxy paleoclimate
data. However, uncertainties in forcing and temperature changes are considerably greater for
past periods, particularly for more remote periods, and climate feedbacks might have been
considerably different then. The most recent and best studied such change is that from the last
glacial maximum (LGM) to the preindustrial Holocene. It is not obvious that ECS for the
LGM transition should be lower than from preindustrial conditions, and an energy-budget
approach has long been applied to estimate ECS from this period. Although Goelzer et al.
(2011) found that the LGM-transition ECS could be reduced by melting ice sheets, the effect
was minimal when estimated ECS was below 2.5 K.
Reasonably thorough proxy-based estimates of changes in surface temperature (Annan
et al. 2013: 4.0 K; Friedrich et al. 2016: 5.0 K) and forcings (Köhler et al. 2010: total 9.5
W m−2) are available for the LGM transition. These values imply, using (4), an ECS estimate
of 1.76 K (averaging the two surface temperature increase estimates and taking F2×CO2 per
AR5, since the WMGG forcings were derived using AR5 formulae), in line with the median
obtained by scaling this study's ECShist estimate.
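The LGM-transition figure can be reproduced with a few lines of arithmetic. The sketch assumes that equation (4) takes the usual energy-budget form ECS = F2×CO2 ΔT / (ΔF − ΔN), with ΔN = 0 between equilibrium states, and that the AR5 value of F2×CO2 is 3.71 W m−2. Neither is quoted in this section, and the function name is illustrative.

```python
def energy_budget_ecs(delta_t, delta_f, delta_n=0.0, f_2xco2=3.71):
    """ECS [K] = F2xCO2 * dT / (dF - dN); 3.71 W m^-2 is the assumed AR5 F2xCO2."""
    return f_2xco2 * delta_t / (delta_f - delta_n)

# Mean of the two proxy-based LGM-to-preindustrial warming estimates (4.0 K and 5.0 K).
delta_t_lgm = (4.0 + 5.0) / 2
# Total forcing change over the LGM transition [W m^-2], as quoted in the text.
delta_f_lgm = 9.5

ecs_lgm = energy_budget_ecs(delta_t_lgm, delta_f_lgm)
print(f"LGM-transition ECS estimate: {ecs_lgm:.2f} K")
```

Under these assumptions the result rounds to the 1.76 K quoted above.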
8. Conclusions
Using updated and revised data, we have derived ECShist and TCR estimates that are much
better constrained, and slightly lower when using the same surface temperature dataset
(HadCRUT4), than those in the predecessor LC15 study: 1.50 K median (5−95%: 1.05−2.45
K) for ECShist and 1.20 K median (5−95%: 0.9−1.7 K) for TCR. Using infilled, globally-complete
temperature data (Had4_krig_v2) slightly increases the new estimates, to a median
of 1.66 K for ECShist (5−95%: 1.15−2.7 K) and 1.33 K for TCR (5−95%: 1.0−1.90 K). We
have also shown that various concerns that have been raised about the accuracy of historical
period energy budget climate sensitivity estimation are misplaced. We assess that there is nil bias
from either non-unit forcing efficacy or varying SST warming patterns, and that any downwards
estimation bias when using blended infilled surface temperature data is trivial. We find that
high CMIP5 model-based estimates of the ratio of ECS to ECShist, the proxy for ECS that
historical period based studies estimate, become far lower when calculated more
appropriately. By using the ECS-to-ECShist ratios that we calculate for CMIP5 models to scale
our Had4_krig_v2-based ECShist probability distribution, we derive a median estimate for
ECS of 1.76 K (5−95%: 1.2−3.1 K).
Relative to LC15, most of the improvement in ECS estimation precision is due to
higher greenhouse gas concentrations when using data to 2016 rather than 2011 and to the
revisions to estimated CH4 and post-1990 aerosol forcing. Forcing uncertainty remains the
dominant contributor to the widths of the ECS and TCR ranges, and reducing the uncertainty
in aerosol forcing would narrow them much more than reducing uncertainty in any other
forcing component.
It is notable that the best estimates for both ECS and TCR are almost identical across
all four combinations of base period and final period. This is consistent with a modest
influence of shorter-term climate system internal variability and of measurement/estimation
error on energy budget sensitivity estimates. The estimates using the 1869−1882 base period
and 2007−2016 final period combination are preferred; they have the highest ΔT and ΔF
values and as a result are best constrained. Moreover, with the Argo ocean-observing network
fully operational throughout 2007–2016, there is also higher confidence in the reliability of
the ocean heat uptake estimate when using that final period. Although HadCRUT4
observational coverage was modest during 1869−1882, the fact that TCR estimation is very
similar using the higher-coverage 1930−1950 base period gives confidence in the ECS and
TCR estimates using the former base period.
Parameter and lowest inconsistent     Implied value of each variable if the other
value in CMIP5 model ensemble         two each equal their median value
                                      ΔT [K]       ΔF/F2×CO2    ΔN/F2×CO2
ECShist: 2.89 K                       1.54         0.43         0.36
TCR: 1.91 K                           1.26         0.46         n/a
Observational 5-95% range             0.73–1.03    0.49–0.83    0.06–0.21

Table 5 Simplified, one-at-a-time analysis of data values implied by statistically inconsistent
CMIP5 models. The first two rows of data show the values of each of ΔT, ΔF/F2×CO2 and
ΔN/F2×CO2 implied by the stated ECShist and TCR values, if the remaining two of those variables
each took its median value. Those ECShist and TCR values are, for each parameter, the lowest for
any CMIP5 model in the ensemble that is above the 95% uncertainty bounds given by the
preferred (1869–82 to 2007–16) estimates from the main analysis using the globally-complete
Had4_krig_v2 dataset. The final row shows the observationally-derived uncertainty ranges of the
three variables. Best estimates and uncertainty ranges are derived from the same one million
samples used for the main statistical analysis.
Over half of the 31 CMIP5 models have best-estimate ECShist values of 2.9 K or higher,
exceeding by over 7% our 2.7 K observationally-based 95% uncertainty bound using infilled
temperature data. Moreover, a majority of the models have best-estimate TCR values above
our corresponding 1.9 K 95% bound. A majority of the models also have best-estimate ECS
values above our 3.1 K 95% bound. A simplified analysis (Table 5) based on considering in
turn uncertainty only in ΔT, ΔF/F2×CO2 and ΔN/F2×CO2 (thus taking into account
uncertainty in F2×CO2), confirms that in each case the lowest CMIP5 model TCR and ECShist
values that we find to be inconsistent with observed warming imply, implausibly, that ΔT,
ΔF/F2×CO2 and ΔN/F2×CO2 have values outside their uncertainty ranges.
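The one-at-a-time logic behind Table 5 can be illustrated as follows. The sketch assumes the relations ECShist = ΔT / (ΔF/F2×CO2 − ΔN/F2×CO2) and TCR = ΔT / (ΔF/F2×CO2). It uses rounded median values (0.88 K, 0.66 and 0.127) that were back-solved from the table entries for illustration only; they are not quoted in the text, and the function names are hypothetical.

```python
# Illustrative medians back-solved from Table 5 (assumptions, not quoted in the text).
DT_MED = 0.88   # median dT [K]
F_MED = 0.66    # median dF / F2xCO2
N_MED = 0.127   # median dN / F2xCO2

def implied_for_ecshist(ecs_hist):
    """Value each variable alone would need for ECShist = dT / (f - n) to hold."""
    dt = ecs_hist * (F_MED - N_MED)
    f = DT_MED / ecs_hist + N_MED
    n = F_MED - DT_MED / ecs_hist
    return dt, f, n

def implied_for_tcr(tcr):
    """Value each variable alone would need for TCR = dT / f to hold."""
    return tcr * F_MED, DT_MED / tcr

ecs_row = [round(v, 2) for v in implied_for_ecshist(2.89)]
tcr_row = [round(v, 2) for v in implied_for_tcr(1.91)]
print("ECShist 2.89 K:", ecs_row)
print("TCR 1.91 K:    ", tcr_row)
```

With these assumed medians the sketch returns the ECShist and TCR rows of Table 5 to two decimal places, and each implied value lies outside the corresponding observational 5-95% range in the final row of the table.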
The implications of our results are that high best estimates of ECShist, ECS and TCR
derived from a majority of CMIP5 climate models are inconsistent with observed warming
during the historical period (confidence level 95%). Moreover, our median ECS and TCR
estimates using infilled temperature data imply multicentennial or multidecadal future
warming under increasing forcing of only 55−70% of the mean warming simulated by CMIP5
models.
Acknowledgements. We thank Cheng Lijing for providing OHC data, updated to 2016, and
three reviewers for helpful comments.
References
Andersson, S. M., et al., 2015: Significant radiative impact of volcanic aerosol in the lowermost
stratosphere. Nature communications, 6, 7692.
Andrews T, Gregory JM, Webb MJ, Taylor KE, 2012: Forcing, feedbacks and climate sensitivity in
CMIP5 coupled atmosphere-ocean climate models. Geophys. Res. Lett.,
39, doi:10.1029/2012GL051607.
Andrews, T., J. M. Gregory, and M. J. Webb, 2015: The dependence of radiative forcing and
feedback on evolving patterns of surface temperature change in climate models. J. Climate, 28(4),
1630-1648.
Annan, J. D., and J. C. Hargreaves , 2013: A new global reconstruction of temperature changes at the
Last Glacial Maximum. Climate Past., 9, 367–376.
Armour, K. C., and G. H. Roe, 2011: Climate commitment in an uncertain world. Geophys. Res. Lett.
38:L01707.
Armour, K. C., 2017: Energy budget constraints on climate sensitivity in light of inconstant climate
feedbacks. Nature Climate Change, 7, 331-335.
Bender, F. A. M., A. Engström and J. Karlsson, 2016: Factors controlling cloud albedo in marine
subtropical stratocumulus regions in climate models and satellite observations. J Climate, 29(10),
3559-3587.
Bindoff, N. L., and Coauthors, 2014: Detection and Attribution of Climate Change: from Global to
Regional. Climate Change 2013: The Physical Science Basis Contribution of Working Group I to the
Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University
Press, Cambridge.
Byrne, B., and C. Goldblatt, 2014: Radiative forcing at high concentrations of well‐mixed greenhouse
gases. Geophys. Res. Lett., 41, 152–160, doi:10.1002/2013gl058456.
Caldeira, K., and N. P. Myhrvold, 2013: Projections of the pace of warming following an abrupt
increase in atmospheric carbon dioxide concentration. Environmental Res. Lett. 8.3: 034039.
Caldwell, P.M., M.D. Zelinka, K.E.Taylor, and K. Marvel, 2016: Quantifying the sources of
intermodel spread in equilibrium climate sensitivity. Journal of Climate, 29(2), pp.513-524.
Charney, J. G., 1979: Carbon Dioxide and Climate: A Scientific Assessment. National Academies of
Science Press, Washington, DC, 22 pp.
Cheng, L., K. Trenberth, J. Fasullo, T. Boyer, J. Abraham, and J. Zhu, 2017: Improved estimates of
ocean heat content from 1960 to 2015. Science Advances, 3, e1601545
Cowtan, K., and R. G. Way, 2014: Coverage bias in the HadCRUT4 temperature series and its impact
on recent temperature trends. Quart. J. Roy. Meteor. Soc., 140(683), 1935-1944. (update at
http://www.webcitation.org/6t09bN8vM; data at http://www-
users.york.ac.uk/%7Ekdc3/papers/coverage2013/series.html)
Cowtan, K., and Coauthors, 2015: Robust comparison of climate models with observations using
blended land air and ocean sea surface temperatures. Geophys. Res. Lett., 42, 6526–6534.
Cherian, R., J. Quaas, M. Salzmann, and M. Wild, 2014: Pollution trends over Europe constrain global
aerosol forcing as simulated by climate models, Geophys. Res. Lett., 41, 2176-2181,
doi:10.1002/2013GL058715.
Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the
data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597
DelSole, T., M.K. Tippett, and J. Shukla, 2011: A significant component of unforced multidecadal
variability in the recent acceleration of global warming. J. Climate, 24, 909-926
Deming, W. E., 1943: Statistical Adjustment of Data. Wiley, NY (Dover Publications edition, 1985),
288 pp.
Desbruyeres, D., E. McDonagh, B. King, and V. Thierry, 2017: Global and Full depth Ocean
Temperature Trends during the early 21st century from Argo and Repeat Hydrography. J. Climate,
30(6), 1985-97.
ECMWF, 2015: ECMWF releases global reanalysis data for 2014. European Centre for Medium-
Range Weather Forecasts. Archived at:
http://web.archive.org/web/20180118202517/https://www.ecmwf.int/en/about/media-
centre/news/2015/ecmwf-releases-global-reanalysis-data-2014-0.
Enfield D. B., A. M. Mestas-Nunez, and P. J. Trimble, 2001: The Atlantic Multidecadal Oscillation
and its relationship to rainfall and river flows in the continental US. Geophys. Res. Lett. 28:2077-2080.
Etminan, M., G. Myhre, E. J. Highwood, and K. P. Shine, 2016: Radiative forcing of carbon dioxide,
methane, and nitrous oxide: A significant revision of the methane radiative forcing. Geophys. Res.
Lett. 43(24) doi:10.1002/2016GL071930.
Fiedler, S., B. Stevens, and T. Mauritsen, 2017: On the sensitivity of anthropogenic aerosol forcing to
model-internal variability and parameterizing a Twomey effect, J. Adv. Model. Earth Syst., 9,
doi:10.1002/2017MS000932.
Forster, P, 2016: Inference of Climate Sensitivity from Analysis of Earth's Energy Budget. Annu. Rev.
Earth Planet. Sci., 44, doi: 10.1146/annurev-earth-060614-105156
Friedrich, T., A. Timmermann, M. Tigchelaar, O. E. Timm and A. Ganopolski, 2016: Nonlinear
climate sensitivity and its implications for future greenhouse warming. Science Advances, 2(11),
e1501923.
Goelzer, H., P. Huybrechts, M. F.Loutre, H. Goosse, T. Fichefet and A. Mouchet, 2011: Impact of
Greenland and Antarctic ice sheet interactions on climate sensitivity. Climate Dynamics, 37(5-6),
1005-1018.
Good, P., J. M. Gregory and J. A. Lowe, 2011: A step‐response simple climate model to reconstruct
and interpret AOGCM projections. Geophys. Res. Lett., 38(1).
Gordon, C., C. Cooper, C. A. Senior, H. Banks, J. M. Gregory, T. C. Johns, J. F. B. Mitchell and R. A.
Wood, 2000: The simulation of SST, sea ice extents and ocean heat transports in a version of the
Hadley Centre coupled model without flux adjustments. Climate Dynamics., 16:147-168
Gordon, H., and Coauthors, 2016: Reduced anthropogenic aerosol radiative forcing caused by
biogenic new particle formation. Proceedings of the National Academy of Sciences, 113, 43, 12053–
12058
Gregory, J. M., R. J., Stouffer, S. C. B. Raper, P. A. Stott and N. A. Rayner, 2002: An observationally
based estimate of the climate sensitivity. J. Climate, 15:3117–3121
Gregory, J.M., and P. M. Forster, 2008: Transient climate response estimated from radiative forcing
and observed temperature change. J. Geophys. Res., 113(D23).
Gregory, J. M., and Coauthors, 2013: Climate models without pre-industrial volcanic forcing
underestimate historical ocean thermal expansion. Geophys. Res. Lett., 40(8), 1600–1604.
Gregory, J. M., and Coauthors, 2004: A new method for diagnosing radiative forcing and climate
sensitivity. Geophys. Res. Lett. 31.3, L03205.
Gregory, J. M., and T. Andrews, 2016: Variation in climate sensitivity and feedback parameters during
the historical period. Geophys. Res. Lett., 43: 3911–3920.
Hansen, J., and Coauthors, 2005: Efficacy of climate forcings. J. Geophys. Res., 110, D18104,
doi:10.1029/2005JD005776.
Hobbs, W., M. D. Palmer, and D. Monselesan, 2016: An energy conservation analysis of ocean drift
in the CMIP5 global coupled models. J. Climate, 29(5), 1639-1653
Ishii, M., and M. Kimoto, 2009: Reevaluation of historical ocean heat content variations with time-
varying XBT and MBT depth bias corrections. J. Oceanography, 65, 287-299.
Johnson, G. C., J. M. Lyman and N. G. Loeb, 2016: Improving estimates of Earth's energy
imbalance. Nature Climate Change, 6, 639-640.
Kent, E. C., and Coauthors, 2013: Global analysis of night marine air temperature and its uncertainty
since 1880: The HadNMAT2 data set. J. Geophys. Res., 118, 1281–1298
Köhler, P., R. Bintanja, H. Fischer, F. Joos, R. Knutti, G. Lohmann, and V. Masson-Delmotte, 2010:
What caused Earth's temperature variations during the last 800,000 years? Data-based evidence on
radiative forcing and constraints on climate sensitivity. Quaternary Science Reviews, 29, 129–145.
Kummer, J. R., and A. E. Dessler, 2014: The impact of forcing efficacy on the equilibrium climate
sensitivity. Geophys. Res. Lett. 41, 3565-3568.
Levitus, S., and Coauthors, 2012: World ocean heat content and thermosteric sea level change
(0-2000 m), 1955-2010. Geophys. Res. Lett., 39:L10603
Lewis, N., and J. A. Curry, 2015: The implications for climate sensitivity of AR5 forcing and heat
uptake estimates. Climate Dynamics, 45(3-4), 1009-1023.
Lohmann, U., 2017: Why does knowledge of past aerosol forcing matter for future climate change? J.
Geophys. Res., 122, 5021–5023.
McCoy, D. T., F. A.-M. Bender, J. K. C. Mohrmann, D. L. Hartmann, R. Wood, and D. P. Grosvenor,
2017: The global aerosol-cloud first indirect effect estimated using MODIS, MERRA, and AeroCom,
J. Geophys. Res. Atmos., 122, 1779–1796.
Malavelle, F.F. and Coauthors, 2017: Strong constraints on aerosol–cloud interactions from volcanic
eruptions. Nature, 546, 485–491.
Marvel, K., G. A. Schmidt, R. L. Miller and L. S. Nazarenko, 2016: Implications for climate
sensitivity from the response to individual forcings. Nature Climate Change, 6(4), 386-389.
Masters, T., 2014: Observational estimate of climate sensitivity from changes in the rate of ocean heat
uptake and comparison to CMIP5 models. Climate Dynamics, 42, 2173-2181,
doi:10.1007/s00382-013-1770-4.
Mauritsen, T., and R. Pincus, 2017: Committed warming inferred from observations. Nature Climate
Change, 7(9), nclimate3357.
Miller, R. L., and Coauthors, 2014: CMIP5 historical simulations (1850–2012) with GISS ModelE2. J.
Advances in Modeling Earth Systems, 6, 441–477.
Morice, C. P., J. J. Kennedy, N. A. Rayner and P. D. Jones, 2012: Quantifying uncertainties in global
and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data
set. J. Geophys. Res., 117, D08101, doi:10.1029/2011JD017187.
Müller, W.A., and Coauthors, 2015: A twentieth-century reanalysis forced ocean model to reconstruct
the North Atlantic climate variation during the 1920s. Climate Dynamics, 44(7-8), 1935-1955.
Myhre, G. and Coauthors, 2014: Anthropogenic and Natural Radiative Forcing. Climate Change 2013:
The Physical Science Basis Contribution of Working Group I to the Fifth Assessment Report of the
Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge
Myhre, G., and Coauthors, 2017: Multi-model simulations of aerosol and ozone radiative forcing due
to anthropogenic emission changes during the period 1990–2015. Atmos. Chemistry and Phys., 17(4),
2709-2720.
Nazarenko, L., D. Rind, K. Tsigaridis, A. D. Del Genio, M. Kelley, and N. Tausnev, 2017: Interactive
nature of climate change and aerosol forcing. J. Geophys. Res., 122, 6, 3457-3480.
Ocko I. B., V. Ramaswamy and Y. Ming, 2014: Contrasting climate responses to the scattering and
absorbing features of anthropogenic aerosol forcings. J. Climate, 27, 5329–5345
Otto, A. and Coauthors, 2013: Energy budget constraints on climate response. Nature Geosci., 6, 415–
416.
Paynter, D., and T. L. Frölicher, 2015: Sensitivity of radiative forcing, ocean heat uptake, and climate
feedback to changes in anthropogenic greenhouse gases and aerosols, J. Geophys. Res. Atmos., 120,
9837–9854, doi:10.1002/2015JD023364.
Proistosescu, C., and P. J. Huybers, 2017: Slow climate mode reconciles historical and model-based
estimates of climate sensitivity. Science Advances, 3(7), e1602821.
Qi, L., Q. Li, C. He, X. Wang, and J. Huang, 2017: Effects of the Wegener–Bergeron–Findeisen process on
global black carbon distribution. Atmos. Chem. Phys., 17, 7459-7479, https://doi.org/10.5194/acp-17-
7459-2017.
Qu, X., and Coauthors, 2017: On the emergent constraints of climate sensitivity. J. Climate,
doi:10.1175/JCLI-D-17-0482.1.
Richardson, M., K. Cowtan, E. Hawkins, and M. B. Stolpe, 2016: Reconciled climate response
estimates from climate models and the energy budget of Earth. Nature Climate Change, 6(10), 931-
935.
Ridley, D. A., and Coauthors, 2014: Total volcanic stratospheric aerosol optical depths and
implications for global climate change, Geophys. Res. Lett., 41, 7763–7769.
Roe, G. H., and K. C. Armour, 2011: How sensitive is climate sensitivity? Geophys. Res. Lett.,
38, L14708.
Rotstayn, L. D., M. A. Collier, D. T. Shindell, and O. Boucher, 2015: Why does aerosol forcing
control historical global-mean surface temperature change in CMIP5 models? J. Climate, 28, 6608–
6625, doi:10.1175/jcli-d-14-00712.1.
Rugenstein, M. A., J. M. Gregory, N. Schaller, J. Sedláček and R. Knutti, 2016: Multiannual ocean–
atmosphere adjustments to radiative forcing. J. Climate, 29(15), 5643-5.
Samset, B. H., and Coauthors, 2014: Modelled black carbon radiative forcing and atmospheric
lifetime in AeroCom Phase II constrained by aircraft observations. Atmos. Chem. Phys., 14, 12465-
12477.
Sato, M., J.E. Hansen, M.P. McCormick, and J.B. Pollack, 1993: Stratospheric aerosol optical depth,
1850-1990. J. Geophys. Res., 98, 22987-22994.
Sato, Y., D. Goto, T. Michibata, K. Suzuki, T. Takemura, H. Tomita, and T. Nakajima, 2018: Aerosol
effects on cloud water amounts were successfully simulated by a global cloud-system resolving model.
Nature Communications, 9(1), 985.
Seifert, A., T. Heus, R. Pincus, and B. Stevens, 2015: Large-eddy simulation of the transient and near-
equilibrium behavior of precipitating shallow convection. J. Advances in Modeling Earth Systems, 7,
1918–1937.
Sherwood, S.C., and Coauthors, 2015: Adjustments in the forcing-feedback framework for
understanding climate change. Bulletin of the American Meteorological Society, 96(2), 217-228.
Shindell, D.T., 2014: Inhomogeneous forcing and transient climate sensitivity. Nature Climate
Change, 4, 274–277.
Simmons, A. J., and Coauthors, 2017: A reassessment of temperature variations and trends from
global reanalyses and monthly surface climatological datasets. Quart. J. Roy. Meteor. Soc., 143, 101–
119.
Stevens, B., 2015: Rethinking the lower bound on aerosol radiative forcing. J. Climate, 28, 4794–
4819.
Stevens, B., and Coauthors, 2017: MACv2-SP: a parameterization of anthropogenic aerosol optical
properties and an associated Twomey effect for use in CMIP6, Geosci. Model Dev., 10, 433-452.
Toll, V., M. Christensen, S. Gass, and N. Bellouin, 2017: Volcano and ship tracks indicate excessive
aerosol-induced cloud water increases in a climate model. Geophys. Res. Lett.,
doi:10.1002/2017GL075280.
Trenberth, K.E. and J. W. Hurrell, 1994: Decadal atmosphere-ocean variations in the Pacific. Climate
Dynamics, 9(6), 303-319.
van Oldenborgh, G. J., L. A. te Raa, H.A. Dijkstra, and S. Y. Philip, 2009: Frequency- or amplitude-
dependent effects of the Atlantic meridional overturning on the tropical Pacific Ocean, Ocean Science,
5, 293-301, https://doi.org/10.5194/os-5-293-2009.
Wang, Q. Q., and Coauthors, 2014: Global budget and radiative forcing of black carbon
aerosol: Constraints from pole-to-pole (HIPPO) observations across the Pacific. J. Geophys. Res.
Atmos., 119, 195-206.
Wang, R., and Coauthors, 2016: Estimation of global black carbon direct radiative forcing and its
uncertainty constrained by observations. J. Geophys. Res. Atmos., 121, 5948–5971.
Williams, K. D., W. J. Ingram, and J. M. Gregory, 2008: Time Variation of Effective Climate
Sensitivity in GCMs. J. Climate, 21, 5076–5090.
Wolter, K., and M. S. Timlin, 1993: Monitoring ENSO in COADS with a seasonally adjusted
principal component index. Proc. of the 17th Climate Diagnostics Workshop, Norman, OK,
NOAA/NMC/CAC, NSSL, Oklahoma Clim. Survey, CIMMS and the School of Meteor., University of
Oklahoma, 52-57.
Wolter, K., and M. S. Timlin, 2011: El Niño/Southern Oscillation behaviour since 1871 as diagnosed
in an extended multivariate ENSO index (MEIext). Intl. J. Climatology, 31, 1074–1087.
Zhang, Y., and Coauthors, 2017: Top-of-atmosphere radiative forcing affected by brown carbon in the
upper troposphere. Nature Geoscience, 10, 486–489.
Zhao, M., and Coauthors, 2016: Uncertainty in model climate sensitivity traced to representations of
cumulus precipitation microphysics. J. Climate, 29(2), 543-560.
Zhou, C., and J. E. Penner, 2017: Why do general circulation models overestimate the aerosol cloud
lifetime effect? A case study comparing CAM5 and a CRM. Atmos. Chem. Phys., 17, 21-29.
Zhou, C., M. D. Zelinka, and S. A. Klein, 2016: Impact of decadal cloud variations on the Earth’s
energy budget. Nature Geoscience, 9, 871–874.