Hat Graphs 1
Running Header: INTRODUCING HAT GRAPHS
Introducing Hat Graphs
Jessica K. Witt
Colorado State University
Under review at Cognitive Research: Principles and Implications
Visualizing data through graphs can be an effective way to communicate one’s results. A ubiquitous
graph and common technique to communicate behavioral data is the bar graph. The bar graph was
invented in 1786, and little has changed in its format since. Here, a replacement for the bar graph is proposed.
The new format, called a hat graph, maintains some of the critical features of the bar graph such as its
discrete elements, but eliminates redundancies that are problematic when the baseline is not at 0. Hat
graphs also include elements designed to improve Gestalt grouping principles and engage object-based
attention. The effectiveness of the hat graph was tested in two empirical studies for which participants
had to find and identify the condition that led to the biggest improvement from baseline to final test.
Performance with hat graphs was 30-40% faster than with bar graphs.
Key words: Information visualization; graphs
Significance Statement: Visualizations are an important way to communicate results from data. As Big
Data has an increasing impact on daily life, communicating data well is of critical importance. The bar graph is
ubiquitous and yet has not been fundamentally updated since its inception in the 18th century. The hat
graph is offered as a modernized version that more effectively communicates differences across
conditions. Hat graphs increased the speed to find the condition associated with the biggest difference
by 30-40% relative to bar graphs. Hat graphs can significantly improve how scientists communicate their
data.
Introducing Hat Graphs
Bar graphs are a common technique to visualize data in psychology. In a recent issue of
Psychological Science, 50% of the articles included a bar graph (2018, Vol. 29, Issue 12). The bar graph
dates back to 1786 (Playfair, 1786), and the first bar graph bears great resemblance to typical bar graphs
used today. Bar graphs are problematic for behavioral research. Bar graphs depict differences in two
ways: one is by the relative position of the end of each bar and the other is by length of each bar (and,
perhaps a third, is by area of each bar). If the baseline of the graph is not set to 0, these various sources
of information will be inconsistent with each other. The relative position of the bar ends will accurately
reflect the differences between the bars, but the relative difference in length will be exaggerated (in
cases for which the baseline is greater than 0 and the bars are positive) or understated (in cases for
which the baseline is less than 0 and the bars are positive). For this reason, users are encouraged to
always use a baseline of 0 for bar graphs (Pandey, Rall, Satterthwaite, Nov, & Bertini, 2015). However,
this baseline creates large biases in readers’ perceptions of the size of effects depicted in bar graphs
(Witt, in press). Even big effects appear small with a baseline at 0. An ideal baseline is one that is at
least 0.75 standard deviations (SDs) below the general mean so as to maximize compatibility between
the visual size of the effect and the effect size being depicted. Given that effect size in psychology is
often measured in terms of SDs, and an effect size of 0.8 is considered “big” (Cohen, 1988), it is sensible
to set the range of the y-axis to be 1.5 SDs. This range is problematic when using bar graphs, however,
given the mixed meanings across the different features when the baseline is not set to 0.
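To make this recommendation concrete, the axis limits implied by a 1.5-SD range centered on the grand mean (putting the baseline 0.75 SDs below it) can be expressed as a small helper. This is an illustrative Python sketch; the function name and interface are mine, not from the paper, whose own code is in R.

```python
def recommended_ylim(grand_mean, sd, range_in_sds=1.5):
    """Return y-axis limits spanning `range_in_sds` standard deviations,
    centered on the grand mean, so the baseline sits 0.75 SDs below it
    (per the recommendation discussed in the text)."""
    half_range = (range_in_sds / 2.0) * sd
    return (grand_mean - half_range, grand_mean + half_range)

# e.g., for a grand mean of 50 and SD of 10, the axis runs 42.5 to 57.5
print(recommended_ylim(50, 10))
```

The same arithmetic appears in the paper's R appendix, where the half-range is computed as `mean(dtSEM$DV) * .75`.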
Alternatives to bar graphs include point graphs and line graphs. Point graphs have the
advantage that only one feature specifies the data, namely relative position. Thus, inconsistencies are
not created by non-zero baselines. But point graphs do not have as strong Gestalt grouping principles to
help facilitate perceptual grouping of pairs of data points. This grouping problem can be solved by
connecting the points with a line, thus making the graph a line graph. Line graphs are a natural choice
when communicating trends, such as differences in a dependent variable across a continuous
independent variable. Line graphs are not, however, a natural choice when communicating discrete values.
They can even lead to misinterpretations of discrete variables, such as when comparing across distinct
groups like construction workers versus librarians.
The goal here was to create a new kind of graph that had the strong grouping properties of bar
and line graphs, the natural discrete interpretations of bar graphs, and the flexibility to have a
standardized y-axis range with a non-zero baseline. The new graph format, called hat graphs, is a simple
modification from the bar graph (see Figure 1). The “brim” of the hat is the top border of the bar of the
first of a pair of conditions. The “crown” of the hat is the portion of the bar from the second of the pair
that is different from the first. Thus, the information contained in the hat graph is the same as the
information in the bar graph. Now, however, the length of the bar (i.e., the size of the hat) corresponds
to the difference between the two conditions. Typically it is differences between two conditions that
represents the main finding in a paper, so the hat graph also has the advantage of emphasizing the key
result. Because the data are no longer represented as bars themselves, a non-zero baseline will not
create conflicting interpretations of the data. Because the pieces of the hat are connected, the hat will
benefit from some of the strongest Gestalt grouping principles (connectedness, proximity) and likely be
seen as a single object.
Figure 1. Illustration of the format of the hat graph as it relates to a bar graph.
To corroborate these theoretical advantages for hat graphs, empirical studies were conducted
to explore the relative benefits of hat graphs versus bar graphs. In two experiments, participants
judged which advertisement produced the biggest change in attitudes from baseline to final test based
on information displayed in bar graphs or hat graphs.
Experiment 1
Participants were shown images depicting attitude scores on baseline and final tests for 3 or 6
advertisements. Their task was to indicate which advertisement produced the largest improvement in
attitude at final score over baseline.
Method
Participants. Twenty-two participants volunteered in exchange for course credit. Given the
theoretical reasons to think hat graphs would have an advantage over bar graphs, a large effect was
assumed. A power analysis for a paired-samples t-test with an effect size of d = 0.80 and alpha = 0.05
(two-tailed) showed that 14 pairs are needed to achieve 80% power. Data collection was scheduled to
stop on a day for which this number was likely to be achieved, although more participants were
collected than needed, resulting in 95% power to find an effect size of d = 0.80.
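The reported power figures can be reproduced, to a close approximation, from the noncentral t distribution. This is an illustrative Python sketch using SciPy; the paper does not state which software performed its power analysis, and the function here is my own.

```python
from scipy import stats

def paired_t_power(d, n, alpha=0.05):
    """Approximate power of a two-tailed paired-samples t-test
    with n pairs and standardized effect size d."""
    df = n - 1
    ncp = d * n ** 0.5                         # noncentrality parameter
    tcrit = stats.t.ppf(1 - alpha / 2, df)     # two-tailed critical value
    # power = P(|t| > tcrit) under the noncentral t distribution
    return (1 - stats.nct.cdf(tcrit, df, ncp)) + stats.nct.cdf(-tcrit, df, ncp)

# power with 14 pairs is close to .80; with 22 pairs, close to .95
print(paired_t_power(0.8, 14), paired_t_power(0.8, 22))
```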
Ethics, Consent and Permissions. All participants provided informed consent, and the protocol
was approved by Colorado State University’s IRB (protocol number 12-3709H).
Stimuli and Apparatus. Stimuli were displayed on computer monitors. The stimuli were
created using data simulated and plotted in R (R Core Team, 2017). Four factors were manipulated.
One factor was graph type (hat graph versus bar graph). Each set of simulated data was plotted with a
hat graph and with a bar graph. Another factor was number of advertisements (3 or 6). A third factor
was the position of the target (best) advertisement. These were evenly distributed across the locations,
and target position was repeated for the graphs with only 3 advertisements. The fourth factor was the
alignment across advertisements. One third of the graphs were aligned to have similar baselines, so the
target advertisement also had the highest final score. One third were aligned to have similar final
scores, so the target advertisement had the lowest baseline score. The final third were aligned at the
center or middle value between baseline and final score (see Figure 2). Each graph style was repeated 4
times to have several variants to show to the participants. This resulted in 288 unique graphs (288 = 2 x
2 x 6 x 3 x 4).
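The factorial structure of the stimulus set can be sketched as an enumeration. This is illustrative Python (the paper's stimuli were generated in R), and the labels are mine.

```python
from itertools import product

graph_types = ["hat", "bar"]                   # 2 graph types
set_sizes = [3, 6]                             # 2 set sizes
target_positions = range(6)                    # 6 positions (reused for 3-ad graphs)
alignments = ["baseline", "final", "center"]   # 3 alignment conditions
variants = range(4)                            # 4 repeats of each graph style

stimuli = list(product(graph_types, set_sizes, target_positions,
                       alignments, variants))
print(len(stimuli))  # 288 = 2 x 2 x 6 x 3 x 4
```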
Figure 2. Sample graphs from Experiment 1. Panels A and B show the hat graph version and the bar graph version of the same data. Panels A, C, and D show the 3 types of alignment (baseline, final, center, respectively).
Data for the graphs were created from simulations. As noted below, the task for participants
was to indicate which advertisement produced the largest change in attitude, so the critical data are the
differences between conditions. For the non-target advertisements, the differences between baseline
and final scores were the mean value of 100 samples from a normal distribution with a mean of 1 and a
SD of 2. For the target advertisements, the differences between baseline and final scores were the
mean value of 100 samples from a normal distribution with a mean of 2.5 and a SD of 1. Thus, the
target advertisement produced a bump in attitude scores 2.5 times as large as that of the non-target
advertisements. These difference scores were added to the baseline scores for the baseline aligned and
center aligned graphs. For these graphs, the baseline scores were the mean value of 100 samples from
a normal distribution with a mean of 3 and a SD of 1. For the final-aligned graphs, the final scores were the
mean value of 100 samples from a normal distribution with a mean of 5.5 and a SD of 1, and the
difference scores were subtracted from the final scores. The process was the same for the target
conditions with the exception that .5 was subtracted from the baseline condition so that it would not
align with the other baseline conditions.
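The simulation recipe above can be sketched as follows for a baseline-aligned graph. This is an illustrative Python translation (the paper's simulations were run in R); the seed and function name are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed (the paper does not report one)

def simulate_baseline_aligned(n_ads=6, target=0):
    """One baseline-aligned graph's data, following the paper's recipe:
    non-target difference = mean of 100 draws from N(1, 2);
    target difference = mean of 100 draws from N(2.5, 1);
    baselines = mean of 100 draws from N(3, 1), target shifted down 0.5."""
    baselines, finals = [], []
    for ad in range(n_ads):
        base = rng.normal(3, 1, 100).mean()
        if ad == target:
            base -= 0.5  # keep the target off the common baseline
            diff = rng.normal(2.5, 1, 100).mean()
        else:
            diff = rng.normal(1, 2, 100).mean()
        baselines.append(base)
        finals.append(base + diff)
    return baselines, finals

baselines, finals = simulate_baseline_aligned()
```

With these parameters the target's improvement is essentially always the largest, since the sampling error of each mean (SD/10) is tiny relative to the 1.5-point gap between target and non-target differences.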
For each set of data, two graphs were created. One was a bar graph and one was a hat graph.
Thus, the data contained in the graphs were identical across graph types. For the bar graph, the
baseline condition was white and the final condition was black. For the hat graphs, the lines were black
and the crown was white. The y-axis ranged from 1 to 8 on every graph. Advertisements were labeled A
– F and were always in alphabetical order.
Procedure. Participants completed two blocks of trials, one with bar graphs and one with hat
graphs. Start order was counterbalanced across participants. For the hat graphs, they were shown
these initial instructions: “An advertising company is interested in which ads lead to the biggest changes
in attitude. They ran a study testing several different ads. In each study, they measured attitude at
BASELINE (before seeing any ads) and again at the FINAL test (after seeing the ads). All of the ads
increased attitude. Your task is to determine which ad produced the BIGGEST increase in attitude. The
baseline attitude will be shown as a horizontal line. The final attitude is shown as the top of the box. The
height of the box shows the change in attitude from baseline to final test. Which ad produces the
biggest change? Enter your response for each graph on the keyboard. Respond as fast and accurately as
possible. Press ENTER to begin”. For the bar graphs, the instructions were the same except instead of
describing the hat graph, they were told the following: “The baseline attitude will be shown in white
boxes. The final attitude will be shown in black boxes. The difference between the white and black
boxes shows change in attitude from baseline to final test.”
On each trial, after a fixation screen of 500ms, a graph was shown and participants entered a
response A-C (for graphs with 3 advertisements) or A-F (for graphs with 6 advertisements). The graph
remained visible until participants made their response, at which point a blank screen was shown for
500ms before the next trial began. Participants completed 144 trials with one type of graph before
switching to the block of trials with the other graph type. Order within block was randomized.
Results and Discussion
The data were initially explored for outliers. Reaction times (RTs) beyond 1.5 times the
interquartile range (IQR) for each subject for each condition were excluded (6.2% of the data). Next,
median RTs and mean accuracy scores were calculated for each subject and each condition and
submitted to separate boxplots. No participants were beyond 1.5 times the IQR on the RTs, but four were
beyond 1.5 times the IQR for accuracy scores. These participants were excluded. For the remaining participants,
accuracy was nearly perfect (M = 98.9%, SD = 1.2%), so the analysis focused on RTs.
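The 1.5 x IQR exclusion rule can be sketched as a small function. This is illustrative Python (the analysis itself was done in R), applied per subject and condition as described above.

```python
import numpy as np

def exclude_rt_outliers(rts):
    """Drop reaction times more than 1.5 * IQR outside the quartiles,
    mirroring the trimming rule described in the text."""
    rts = np.asarray(rts, dtype=float)
    q1, q3 = np.percentile(rts, [25, 75])
    iqr = q3 - q1
    keep = (rts >= q1 - 1.5 * iqr) & (rts <= q3 + 1.5 * iqr)
    return rts[keep]

# e.g., a 5000 ms response among ~500 ms responses is excluded
print(exclude_rt_outliers([500, 520, 540, 560, 5000]))
```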
RTs are positively skewed, so they were log-transformed. Data were analyzed with linear mixed
models using the LME4 and LMERTEST packages in R (Bates, Machler, Bolker, & Walker, 2015;
Kuznetsova, Brockhoff, & Christensen, 2017). A linear mixed model was run with the log RTs as the
dependent factor. The independent factors were graph type (bar or hat), number of advertisers (3 or 6;
entered as a factor), graph alignment (baseline, final, center), and start condition. Only two-way
interactions with graph type were included in the model. Random effects were included for participants,
including intercepts and slopes for graph type and number of advertisements. The model did not converge
with the inclusion of a random slope for graph alignment. Random intercepts were also included for
each set of simulated data that was plotted in both the hat and bar graphs. The main effect of graph
type was significant, t = 12.66, p < .001, estimate = 0.48, SE = .04. Relative to hat graphs, responses to
bar graphs were 38% slower (see Figure 3).
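The modeling step can be sketched in simplified form. This is an illustrative Python analogue of the lme4 model (statsmodels, with only a random intercept per participant and a made-up effect size on simulated data); the paper's actual model also included random slopes and a crossed random intercept for each simulated data set.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for subj in range(20):
    subj_shift = rng.normal(0, 0.10)  # random intercept per participant
    # "bar" responses simulated slower on the log scale (effect size is made up)
    for graph, effect in [("bar", 0.30), ("hat", 0.0)]:
        for _ in range(30):
            rows.append({"subject": subj, "graph": graph,
                         "log_rt": 6.5 + subj_shift + effect + rng.normal(0, 0.20)})
df = pd.DataFrame(rows)

# Linear mixed model on log RTs with a random intercept for participant
fit = smf.mixedlm("log_rt ~ graph", df, groups=df["subject"]).fit()
print(fit.params["graph[T.hat]"])  # recovers a negative (hat faster) coefficient
```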
Figure 3. Reaction time is plotted as a function of number of items and graph type for Experiment 1. The brim of the hat corresponds to the RTs for the hat graph condition, and the top of the crown of the hat corresponds to RTs for the bar graph condition. The size of the crown indicates the difference in RTs between the two conditions. Error bars are 1 SEM calculated within-subjects. The range of the y-axis is 4 within-subjects SDs.
The main effect of number of items was significant, t = 5.90, p < .001, estimate = .13, SE = .02.
Going from 3 to 6 items was associated with a 14% increase in reaction time. The interaction between
graph type and number of items was not significant, t = -0.27, p = .78, estimate < .01, SE = .01. Thus, hat
graphs made people faster at finding the target item but not necessarily more efficient given the similar
search slopes (see Figure 3). Graph alignment had a significant effect on RTs. RTs were 5% slower when
the final scores were aligned than when the baseline scores were aligned, t = 2.23, p = .027, estimate =
.054, SE = .02. RTs were just less than 5% slower when the center of both scores were aligned than
when the baseline scores were aligned, t = 1.97, p = .050, estimate = .048, SE = .02. The interaction
between graph type and graph alignment was significant. The hat-graph advantage was 6% larger when
the baseline scores were aligned than when the final scores were aligned, t = 3.88, p <
.001, estimate = .06, SE = .02, and was less than 3% larger when baseline scores were aligned than when
center values were aligned, though this difference did not reach statistical
significance, t = 1.67, p = .096, estimate = .026, SE = .015 (see Figure 4).
Figure 4. Reaction time is plotted as a function of graph alignment and graph type for Experiment 1. The brim of the hat corresponds to the RTs for the hat graph condition, and the top of the crown of the hat corresponds to RTs for the bar graph condition. The size of the crown indicates the difference in RTs between the two conditions. Error bars are 1 SEM calculated within-subjects. The range of the y-axis is 4 within-subjects SDs.
There was a significant interaction between graph type and start condition, t = 4.74, p < .001,
estimate = .26, SE = .06 (see Figure 5). Hat graphs had a bigger advantage over bar graphs for people
who started with the bar graphs than for people who started with the hat graphs. Another way to
interpret this pattern, however, is that both groups got faster in the second block, and there was also a
main effect of graph type (see Figure 5).
Figure 5. Reaction time is plotted as a function of graph type and start condition (left panel) or block (right panel) for Experiment 1. The brim of the hat corresponds to the RTs for the hat graph condition, and the top of the crown of the hat corresponds to RTs for the bar graph condition. The size of the crown indicates the difference in RTs between the two conditions. In the left panel, error bars are 1 SEM calculated within-subjects for each start condition. In the right panel, the error bars are 1 SEM calculated between-subjects. The range of the y-axis is 6 within-subjects SDs.
Hat graphs improved speed to find the advertisement that produced the biggest boost in
attitude relative to bar graphs.
Experiment 2
Experiment 2 serves as a replication of Experiment 1. The effect size depicted in the graphs was
slightly smaller than in Experiment 1.
Method
Seventeen students volunteered in exchange for course credit. Everything was the same as in
Experiment 1 except the target difference was simulated as a magnitude of 2 (rather than 2.5).
Results and Discussion
Data were analyzed as before, and 5% of trials were excluded because the RT was beyond 1.5
times the IQR for that participant for that condition. RTs were then summarized for each participant for
each condition, and 3 participants were identified as having a median RT beyond 1.5 times the IQR
for the group, and 4 participants had mean accuracy scores beyond 1.5 times the IQR for the group. All
7 were excluded. This eliminated a large portion of the data, but resulted in the inclusion of 10
participants, which was sufficient to achieve 95% power to find an effect of d = 1.28 (as a one-sample t-
test). For the included participants, mean accuracy was 98.9% (SD = 1.2%).
The data were submitted to a linear mixed model with the log of the reaction times as the
dependent factor, graph type and number of items as independent factors, and subject and graph image
as random effects (including random slopes for subject for both independent factors). The main effect
for graph type was significant, t = 12.46, p < .001, estimate = 0.37, SE = .03 (see Figure 6). Participants
were 31% slower to respond to bar graphs than to hat graphs. The main effect for number of items was
significant, t = 7.16, p < .001, estimate = .17, SE = .02. Participants were 15% slower to respond to 6
items than to 3 items. The interaction between graph type and number of items was not significant, t =
0.46, p = .64, estimate < .01, SE = .02. Hat graphs led to faster response times but not more efficient
search slopes.
Figure 6. Reaction time is plotted as a function of number of items and graph type for Experiment 2. The brim of the hat corresponds to the RTs for the hat graph condition, and the top of the crown of the hat corresponds to RTs for the bar graph condition. The size of the crown indicates the difference in RTs between the two conditions. Error bars are 1 SEM calculated within-subjects. The range of the y-axis is 5 within-subjects SDs.
Responses were faster to graphs with aligned baselines than to graphs with aligned centers, t =
2.18, p = .031, estimate = .05, SE = .02. Participants were marginally faster to graphs with aligned
baselines than to graphs with aligned final scores, t = 1.61, p = .110, estimate = .04, SE = .02. The
interaction between graph type and alignment was significant for the comparison between baseline
aligned and final score aligned graphs, t = 2.10, p = .036, estimate = .05, SE = .02, but not for the other
comparisons, ps > .40 (see Figure 7).
Figure 7. Reaction time is plotted as a function of graph alignment and graph type for Experiment 2. The brim of the hat corresponds to the RTs for the hat graph condition, and the top of the crown of the hat corresponds to RTs for the bar graph condition. The size of the crown indicates the difference in RTs between the two conditions. Error bars are 1 SEM calculated within-subjects. The range of the y-axis is 4 within-subjects SDs.
The interaction between graph type and start condition was significant, t = 7.02, p < .001,
estimate = .25, SE = .04. Participants who started with bar graphs were even faster with hat graphs
relative to bar graphs compared with participants who started with hat graphs (see Figure 8). An
alternative explanation is that both groups got faster from block 1 to block 2, and there was also a main
effect of graph type.
Figure 8. Reaction time is plotted as a function of graph type and start condition (left panel) or block (right panel) for Experiment 2. The brim of the hat corresponds to the RTs for the hat graph condition, and the top of the crown of the hat corresponds to RTs for the bar graph condition. The size of the crown indicates the difference in RTs between the two conditions. In the left panel, error bars are 1 SEM calculated within-subjects for each start condition. In the right panel, the error bars are 1 SEM calculated between-subjects. The range of the y-axis is 6 within-subjects SDs.
The results replicate those found in Experiment 1 and show a large advantage for the hat graphs
over the bar graphs.
General Discussion
Hat graphs were designed based on principles of Gestalt grouping, graph design, and object-
based attention as a better alternative to bar graphs. The current experiments showed that
performance on detecting differences across pairs of conditions was 30-40% faster for hat graphs than for
bar graphs. In a 2 x N design for which comparisons across pairs of conditions are of interest, a graph of
the data is easier to read when the pairs are visually grouped together. Such grouping can be readily
achieved through the principles of Gestalt grouping (Wagemans et al., 2012; Wertheimer, 1912). Hat
graphs leverage two of the strongest principles, namely connectedness and proximity. Because the brim
of the hat is close to and physically connected to the crown, the hat should be seen as a grouped object.
This likely contributed to the performance advantage of hat graphs over bar
graphs.
Graph design principles also contributed to the creation of the hat graphs. According to graph
design, discrete elements (like in bar graphs) are more likely to be interpreted as a discrete effect (e.g.
“males are taller than females”) whereas continuous lines (like in line plots) are more likely to be
interpreted as a continuous effect even when such an interpretation is inappropriate (e.g., "the more
male a person is, the taller he is"; Zacks & Tversky, 1999). Although not directly tested here, the hat
graphs bear more of a resemblance to the discrete elements of the bar graph than to the continuous
elements of the line graph. It is recommended that hat graphs be used instead of bar graphs for plotting
discrete data, whereas line graphs continue to be used for continuous data.
In addition to performance being better with hat graphs than with bar graphs, hat graphs have
another advantage related to graph design in that they are not restricted to having a baseline at zero.
Bar graphs should always have a zero baseline so that the impression given by the length of the bars is
consistent with the differences between the values of the various conditions (e.g., Pandey et al., 2015).
Hat graphs do not have the same restriction because they do not contain the misleading element within
bar graphs, namely bar length. It is important that the baseline not be restricted to zero because
graph design indicates that for behavioral studies for which effect size is measured in terms of standard
deviations, the range of the y-axis should be 1.5 standard deviations (Witt, in press). Setting the y-axis
in this way helps to maximize compatibility between the visual impression of the effect and the size of
the effect. Small effects will look small and big effects will look big. In cases for which the size of the
effect is too big to fit within 1.5 SDs (as was the case in the current studies), the y-axis should be
expanded and its range noted in the figure caption.
Another graph design principle concerns the data-to-ink ratio (Tufte, 2001). According to this
principle, redundant ink should be minimized within reason. For the bar graph, the lengths of the
two side lines and the position of the top of the bar are all redundant, so to minimize redundant ink, one
could remove two of the three indicators. The hat graph has a higher data-to-ink ratio than the bar
graph, which could be one of the factors contributing to its advantage. The hat graph still contains
redundant elements (such as the brim of the hat). However, sometimes redundancy can be
advantageous for reading graphs, so it is important to not take the data-to-ink ratio too far (Carswell,
1992; Gillan & Richman, 1994). Future studies could determine whether adding or deleting ink from hat
graphs improves performance.
A third cognitive principle at play with hat graphs relates to attention. Attention can be
deployed spatially or towards a particular object (e.g., Posner, 1980; Scholl, 2001). With hat graphs, the
difference between conditions is itself an object and thus could benefit from object-based attention,
whereas the difference in spatial location between the brim of the hat and the crown of the hat could
benefit from spatial attention. With hat graphs, either type of attention would result in the same
outcome. Neither bar graphs nor point graphs would likely benefit as much as hat graphs from object-
based attention. The added benefit of the difference between conditions being represented as an
object and thus engaging object-based attention could contribute to the advantage for hat graphs over
bar graphs.
When researchers want to emphasize differences across two conditions (such as difference in
RTs), there is a dilemma over whether to highlight the differences themselves by plotting those differences in
a bar graph versus plotting the raw RTs and having the reader infer the differences between the bars or
the points. Hat graphs solve this dilemma by plotting both at the same time. One disadvantage to the
hat graphs compared with plotting the differences themselves is that the crowns in the hat graphs are
unlikely to be perfectly aligned. The relative lengths of aligned bars are easier to discriminate than the
relative lengths of misaligned bars (Cleveland & McGill, 1985). Thus, researchers will have to decide if
the benefit of a more precise estimate of length outweighs the benefit of showing both the
differences (as misaligned bars in the crown of the hat) and the raw scores. Of course, if precision is an
important goal for the researcher, the researcher could put the actual number corresponding to the
differences in or near the crown of the hat or in the text itself.
Hat graphs have a number of limitations and future research questions. One question concerns
the best way to add error bars. The figures here show one way to do so, but this format was not
empirically tested. Another open question is how best to signal when the direction of the hat is reversed
from one condition to another (such as would be found in a crossed interaction). Perhaps filling in
crowns of hats that are “upside-down” would be beneficial, but reversing the hat so the brim points to
the right is another alternative. The Appendix provides R code for creating hat graphs, which can be
manipulated to try various options.
A critical limitation is that hat graphs are limited to 2 x N designs for which comparisons across
pairs are the critical comparisons. It is unclear how to expand hat graphs to allow comparison across 3
or more conditions. Many reported studies involve designs with factors that have 2 levels, so there is
plenty of room for hat graphs to exert their advantage even if they are not generalizable to all research
scenarios.
An unknown factor, and potential limitation, is that the current studies focused on a task for
which participants had to identify differences between baseline and final scores. The nature of the task
dictates which graph will be most appropriate (e.g., Gillan, Wickens, Hollands, & Carswell, 1998). For
finding differences, hat graphs proved to be more effective than bar graphs. However, if the task is
instead to ignore differences and identify the precise value of a given condition or identify the baseline
with the highest value, another format may prove more effective than hat graphs. According to the
proximity compatibility principle (Wickens & Carswell, 1995), the greater proximity achieved in the hat
graphs should improve performance on an information integration task (such as finding the biggest
difference) but should decrease performance on a focused attention task (such as finding the precise
value of a particular condition).
In conclusion, hat graphs resulted in better performance than bar graphs. Hat graphs show that
design principles based on the nature of cognitive processes (e.g., Gestalt grouping principles, object-
based attention) can significantly improve how researchers visualize and communicate their data.
Author Note
Jessica K. Witt, Department of Psychology, Colorado State University.
Data, scripts, and supplementary materials available at https://osf.io/khjb9/. For peer-review
process, use this link: https://osf.io/khjb9/?view_only=6c453d634ee24b019d1b247c5be14b6b. This
work was supported by a grant from the National Science Foundation (BCS-1632222). Author has no
competing interests to declare.
Address correspondence to JKW, Department of Psychology, Colorado State University, Fort
Collins, CO 80523, USA. Email: [email protected]
Appendix: R Code for Creating Hat Graphs

# Note: You can run code as is to get random data and see how graphs work
# Instructions for how to implement with your data below - see [1]
# Instructions for how to add error bars - see [2]
# Instructions for how to shade in upside-down hats - see [3]

dt <- data.frame(subjectID = seq(1, 30),
                 DV = rnorm(30, 50, 20),
                 IV1 = rep(c(1, 2), 15),
                 IV2 = c(rep(1, 15), rep(2, 15)))
#dt <- read.csv("YourDataHere.csv", header = T)

#### [1] - HOW TO IMPLEMENT WITH YOUR DATA ####
# Replace DV with your dependent measure (or name your dependent measure "DV")
# Replace IV1 with your independent measure that you want paired in your hats
#   (or name this measure "IV1"); assumes only two levels of IV1
# Replace IV2 with your independent measure that you want spaced across the
#   x-axis (or name this measure "IV2")
#
# Also replace label names here to match how you want your axes labeled:
xLab <- "Independent Variable 2"
yLab <- "Dependent Variable"
legendLab <- c("IV1: Level 1", "IV1: Level 2")
upsideDownColor <- "black" # [3] Color to fill hat crown when second part is less
                           # than first part - set to "white" if you do not want shading

## Computes means for plotting
# Also computes SD for setting y-axis range and for error bars
# If your design involves within-subjects factors, see Loftus & Masson (1994)
# for computing SEM within-subjects
dtMeans <- aggregate(DV ~ IV1 + IV2, dt, mean)
dtSEM <- aggregate(DV ~ IV1 + IV2, dt, sd) # assumes between-subjects effects only
dtMeans$SEM <- dtSEM$DV / sqrt(length(unique(dt$subjectID)))
dtMeans$hatPos <- as.integer(as.factor(dtMeans$IV1))
dtMeans$xPos <- as.integer(as.factor(dtMeans$IV2))

## Values for setting the range of the y-axis
# Based on recommendations from Witt (in press)
# If data do not fit in graph with this value, increase .75 to a greater number
# Should not decrease .75
grandMean <- mean(dtMeans$DV)
grandSD <- mean(dtSEM$DV) * .75 # produces y-axis range of 1.5 SDs (see Witt (in press))
hatWidth <- .25
## Set up an empty plotting frame, then draw the hats
plot(dtMeans$IV1, dtMeans$DV,
     ylim = c(grandMean - grandSD, grandMean + grandSD),
     xlim = c(min(dtMeans$xPos) - (2 * hatWidth), max(dtMeans$xPos) + (2 * hatWidth)),
     xaxt = "n", col = "white", bty = "l", xlab = xLab, ylab = yLab)
axis(side = 1, at = dtMeans$xPos, labels = dtMeans$IV2)

for (i in unique(dtMeans$xPos)) {
  m1 <- which(dtMeans$xPos == i & dtMeans$hatPos == 1)
  m2 <- which(dtMeans$xPos == i & dtMeans$hatPos == 2)
  segments(i - hatWidth, dtMeans$DV[m1], i, dtMeans$DV[m1], lwd = 3) # Brim of the hat
  if (dtMeans$DV[m2] > dtMeans$DV[m1]) {
    rect(i, dtMeans$DV[m1], i + hatWidth, dtMeans$DV[m2], lwd = 3) # Crown of the hat
  } else {
    rect(i, dtMeans$DV[m1], i + hatWidth, dtMeans$DV[m2], lwd = 3,
         col = upsideDownColor) # Crown of the hat (upside down, shaded)
  }
}

# [2] - Add error bars
dtMeans$semPos <- ifelse(dtMeans$hatPos == 1,
                         dtMeans$xPos - (hatWidth / 2),
                         dtMeans$xPos + (hatWidth / 2))
for (i in 1:length(dtMeans$semPos)) {
  segments(dtMeans$semPos[i], dtMeans$DV[i] - dtMeans$SEM[i],
           dtMeans$semPos[i], dtMeans$DV[i] + dtMeans$SEM[i])
}

# Add legend
legend("topleft", pch = c(NA, 0), lwd = c(3, NA), pt.cex = c(NA, 3),
       cex = 1.5, legend = legendLab, bty = "n")
References
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4.
Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01
Carswell, C. M. (1992). Choosing Specifiers: An Evaluation of the Basic Tasks Model of Graphical
Perception. Human Factors, 34(5), 535-554. doi:10.1177/001872089203400503
Cleveland, W. S., & McGill, R. (1985). Graphical Perception and Graphical Methods for Analyzing
Scientific Data. Science, 229(4716), 828-833. doi:10.1126/science.229.4716.828
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. New York, NY: Routledge
Academic.
Gillan, D. J., & Richman, E. H. (1994). Minimalism and the syntax of graphs. Human Factors, 36(4),
619-644.
Gillan, D. J., Wickens, C. D., Hollands, J. G., & Carswell, C. M. (1998). Guidelines for presenting
quantitative data in HFES publications. Human Factors, 40(1), 28-41.
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest Package: Tests in Linear Mixed
Effects Models. Journal of Statistical Software, 82(13), 1-26. doi:10.18637/jss.v082.i13
Pandey, A. V., Rall, K., Satterthwaite, M. L., Nov, O., & Bertini, E. (2015). How deceptive are deceptive
visualizations? An empirical analysis of common distortion techniques. In Proceedings of the
33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Republic of Korea.
Playfair, W. (1786). Commercial and political atlas: Representing, by copper-plate charts, the progress of
the commerce, revenues, expenditure, and debts of England, during the whole of the eighteenth
century. London: Corry.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32(1), 3-25.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80(1-2), 1-46.
R Core Team. (2017). R: A language and environment for statistical computing. Retrieved from
https://www.r-project.org
Tufte, E. R. (2001). The Visual Display of Quantitative Information (2nd ed.). Cheshire, CT:
Graphics Press.
Wagemans, J., Elder, J. H., Kubovy, M., Palmer, S. E., Peterson, M. A., Singh, M., & von der Heydt, R.
(2012). A century of Gestalt psychology in visual perception: I. Perceptual grouping and
figure–ground organization. Psychological Bulletin, 138(6), 1172-1217.
Wertheimer, M. (1912). Experimentelle Studien über das Sehen von Bewegung. Zeitschrift für
Psychologie, 61, 161-265. (Translated extract reprinted as "Experimental studies on the seeing
of motion". In T. Shipley (Ed.), (1961). Classics in psychology (pp. 1032-1089). New York, NY:
Philosophical Library.)
Wickens, C. D., & Carswell, C. M. (1995). The proximity compatibility principle: Its psychological
foundation and relevance to display design. Human Factors, 37(3), 473-494.
Witt, J. K. (in press). Graph Construction: An Empirical Investigation on Setting the Range of the Y-Axis.
Meta-Psychology.
Zacks, J., & Tversky, B. (1999). Bars and lines: A study of graphic communication. Memory & Cognition,
27(6), 1073-1079.