New Lecture 4 · nens230.stanford.edu/slides/week4.pdf · 2015. 10. 13.


Lecture 4: Visualization

Outline

• Basic plotting commands
• Types of plots
• Customizing plots graphically
• Specifying color
• Customizing plots programmatically
• Exporting figures

Why use Matlab for visualization?

• Matlab is flexible enough to let you quickly visualize data, and powerful enough to give you complete control over the final product

• Features:
  • Interactive plotting
  • Simple 3D plotting
  • Programmatic annotation

Basic Plots

• 2D Visualization
  • plot (line plots)
  • histogram
  • scatter (scatter plots)
  • image/imagesc (images)

• 3D Visualization
  • surf/mesh (surfaces)
  • plot3 (lines)
  • scatter3

Working with Figures

• Create a new figure: figure();
• Specify a figure number: figure(1)
• Hold onto a figure "handle":
  • figHandle1 = figure(1);
• Re-select a figure:
  • figure(figHandle1)
• Some useful functions:
  • clf: clear figure
  • close all: closes all figures
  • gcf: get handle to current figure
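
The commands above can be strung together as a short script. This is a minimal sketch, assuming a Matlab session where figures 1 and 2 are not already in use; the variable names are illustrative only.

```matlab
% Create two figures and keep their handles for later use.
figHandle1 = figure(1);                 % figure number 1
figHandle2 = figure(2);                 % figure 2 is now the current figure

figure(figHandle1);                     % re-select figure 1; it becomes current
isCurrent = isequal(gcf, figHandle1);   % gcf returns the current figure

clf(figHandle1);                        % clear the contents of figure 1
close(figHandle2);                      % close only figure 2
close all;                              % close every remaining figure
```

Holding onto the handle means later commands can target a specific figure regardless of which one happens to be current.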

Demo: Figures

plot
– Syntax: plot(x,y) plots points in the vector y against points in the vector x

histogram/bar
– Syntax: histogram(y) plots a histogram of the values in y, bar(x,y) plots bars at the points given by (x,y)

scatter
– Syntax: scatter(x,y,s,c) lets you specify the size (s) and color (c) of each point given by (x,y)

image/imagesc
– Syntax: image(C) plots the values stored in the matrix C as an image

surf & mesh
– Syntax: surf(x,y,z) and mesh(x,y,z) are used to visualize a surface in three dimensions

plot3
– Syntax: plot3(x,y,z) plots points in 3D
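
As a rough illustration of the plot types listed above, the sketch below calls each one on made-up sample data (the data and variable names are assumptions, not from the lecture). Note that histogram requires Matlab R2014b or later.

```matlab
% Hypothetical sample data (not from the lecture).
x = linspace(0, 2*pi, 50);
y = sin(x);
[X, Y] = meshgrid(linspace(-2, 2, 30));
Z = X .* exp(-X.^2 - Y.^2);

figure();
plot(x, y);                     % line plot of y against x

figure();
histogram(randn(1, 1000));      % histogram of 1000 normally distributed samples

figure();
scatter(x, y, 36, y);           % marker size 36, color of each point taken from y

figure();
imagesc(Z);                     % matrix shown as an image with scaled colors

figure();
surf(X, Y, Z);                  % 3D surface; mesh(X, Y, Z) draws a wireframe instead

figure();
plot3(cos(x), sin(x), x);       % 3D line (a helix)
```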

Demo: Plot Types

http://www.mathworks.com/help/matlab/2-and-3d-plots.html

subplots
– the 'subplot' command lets you plot multiple plots on one figure
– syntax: subplot(nRows, nCols, index)

(Figure 1)

subplot(1,3,1) subplot(1,3,2) subplot(1,3,3)


(Figure 2)

subplot(3,2,1) subplot(3,2,2)
subplot(3,2,3) subplot(3,2,4)
subplot(3,2,5) subplot(3,2,6)
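
The 3-by-2 layout can be reproduced with a short loop. This is a sketch on made-up data; the handle array ax is kept only so the panel positions can be inspected afterwards.

```matlab
x = linspace(0, 2*pi, 100);     % hypothetical sample data

figure();
for k = 1:6
    ax(k) = subplot(3, 2, k);   % index k counts across each row, then down
    plot(x, sin(k * x));
    title(sprintf('subplot(3,2,%d)', k));
end
```

Because the index runs along rows first, panels 1 and 2 share the top row, and panels 1, 3, and 5 form the left column.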

Demo: Subplots

Other functions
– gca: get handle to current axis
– panel

Panel()
– user-submitted function from Matlab File Exchange (FEX)
– http://www.mathworks.com/matlabcentral/fileexchange/20003-panel
– Provides MUCH more control over subplot positioning, layout, margins, etc.

Customizing Graphs Graphically

Plot Tools

Demo: Customizing Graphically

The plot() function (again)

• plot() plots along dimension 1 of an array.
• If there are multiple dimensions, plot creates a separate line for each column
• If your data isn't constructed this way, just transpose with the apostrophe character:
  • plot(data')
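
A small sketch of the column behavior, using made-up signals (the names t, data, and rowData are illustrative):

```matlab
t = (0:0.1:10)';                     % 101-by-1 column of sample times
data = [sin(t), cos(t), sin(2*t)];   % 101-by-3: one signal per column

figure();
h = plot(data);                      % three lines, one per column

rowData = data';                     % 3-by-101: one signal per row
figure();
h2 = plot(rowData');                 % transpose so each signal is a column again
```

Without the transpose, plot(rowData) would draw 101 lines of three points each, one line per column.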

Demo: Plot()

Plot Line Style

• For line plots, specify the line type using a format string:
  • plot(x,y,'b') % plots blue line (default)
  • plot(x,y,'b.') % plots blue dots
  • plot(x,y,'b:') % plots blue dotted line
  • plot(x,y,'k--') % plots black dashed line
  • plot(x,y,'ro') % plots red circles

• Chain together characters for full specification of color, marker, and line
  • plot(x,y,'ro-') % plots red circles with solid line
  • plot(x,y,'ro:') % plots red circles with dotted line
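
To see how a format string maps onto line properties, the sketch below plots made-up data with several format strings and keeps the line handles so the resulting properties can be queried.

```matlab
x = 0:0.5:10;                          % hypothetical sample data

figure();
hold on;                               % keep all lines on the same axes
h1 = plot(x, sin(x), 'b');             % blue solid line
h2 = plot(x, cos(x), 'k--');           % black dashed line
h3 = plot(x, sin(x) + 1, 'ro');        % red circles, no connecting line
h4 = plot(x, cos(x) + 1, 'ro:');       % red circles joined by a dotted line
hold off;

styleOfH3 = get(h3, 'LineStyle');      % a marker-only string gives 'none'
```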

Demo: LineSpec

Customizing Programmatically

• Everything you can do graphically you can also do programmatically.

• DON’T do something by hand if you have to do it more than once!

• Examples
  • axis labels: xlabel('text'), ylabel('text')
  • plot/axis title: title('text')
  • Add text: text(x,y, 'text to add')
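
A minimal sketch of these annotation calls on a made-up curve; the label strings are placeholders.

```matlab
x = linspace(0, 1, 50);                % hypothetical data
figure();
plot(x, x.^2);

xlabel('Time (s)');                    % x-axis label
ylabel('Response (a.u.)');             % y-axis label
title('Hypothetical response curve');  % title above the axes
text(0.2, 0.8, 'text to add');         % text placed at data coordinates (0.2, 0.8)

xLabelString = get(get(gca, 'XLabel'), 'String');
```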

Customizing Programmatically

• Graphics parameters are usually specified as ‘parameter’, value pairs:

• plot(x,y, 'linewidth', 1.4)
• plot(x,y, 'bo-', 'linewidth', 2, 'markersize', 15)
• plot(x,y, 'o-', 'MarkerFaceColor', [1 0 0], 'markerEdgeColor', [0 0 1])
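
The same pairs can be combined in one call and read back from the line handle. A sketch on made-up data:

```matlab
x = 0:10;                                     % hypothetical data
figure();
h = plot(x, x.^2, 'o-', ...
    'LineWidth', 2, ...                       % line thickness in points
    'MarkerSize', 15, ...                     % marker size in points
    'MarkerFaceColor', [1 0 0], ...           % red fill
    'MarkerEdgeColor', [0 0 1]);              % blue outline

lineWidth = get(h, 'LineWidth');              % property names are case-insensitive
```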

Other useful functions

• grid on: adds grid lines
• axis off: turns off the axes
• colorbar: adds a colorbar to an image plot
• colormap hot: switches the colormap
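
A short sketch of these commands in use. The data comes from peaks, a sample surface function that ships with Matlab; the variable names are illustrative.

```matlab
figImg = figure();
imagesc(peaks(40));          % sample 40-by-40 matrix shown as an image
colorbar;                    % show how colors map to data values
colormap hot;                % switch to the built-in 'hot' colormap
axis off;                    % hide tick marks and axis lines

figLine = figure();
plot(1:10);
grid on;                     % add grid lines to the line plot
```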

Demo: Customizing Programmatically

Plot colors

Matlab has 8 built-in colors:

Black (k), Red (r), Blue (b), Green (g), Cyan (c), Magenta (m), Yellow (y), White (w)

We can specify other colors using RGB (red, green, blue) notation:
red = [1 0 0]
blue = [0 0 1]
green = [0 1 0]
gray = [0.2 0.2 0.2]
black = [0 0 0]

All RGB colors are 1x3 arrays, and all elements are between 0 and 1.
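
RGB triplets are passed to plot through the 'Color' property. In this sketch the orange and midGray values are illustrative choices, not colors from the lecture.

```matlab
orange  = [1 0.5 0];                   % full red, half green, no blue
midGray = [0.5 0.5 0.5];               % equal parts of all three channels

x = 0:0.1:5;                           % hypothetical data
figure();
hold on;
h1 = plot(x, sin(x), 'Color', orange, 'LineWidth', 2);
h2 = plot(x, cos(x), 'Color', midGray);
hold off;
```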

Demo: Color

Colormaps

Colormaps are used to specify how data gets mapped onto different colors.

Matlab has a few built-in colormaps, but you can also specify your own!
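
A custom colormap is simply an n-by-3 matrix of RGB rows. The sketch below builds a hypothetical blue-to-white ramp (the name blueToWhite and the choice of 64 levels are assumptions) and applies it to sample data.

```matlab
n = 64;                                     % number of color levels
ramp = linspace(0, 1, n)';                  % column vector from 0 to 1
blueToWhite = [ramp, ramp, ones(n, 1)];     % rows go from [0 0 1] to [1 1 1]

figure();
imagesc(peaks(50));                         % built-in sample data
colormap(blueToWhite);                      % lowest values blue, highest white
colorbar;
```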

Why are colormaps important?

Much better!

Avoid the jet colormap (the default before Matlab R2014b)

Manipulating Figures

Figures in Matlab are referenced using “handles”, which are pointers to different parts of the figure.

Example:
myhandle = plot(x,y);

Will return a handle to the plot. Then you can run the following:

get(myhandle); % to see a list of properties
set(myhandle,'Name',Value); % to set the value of a property
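
A concrete version of the get/set pattern, sketched on made-up data. Here 'LineWidth', 'Color', and 'LineStyle' stand in for the generic 'Name' above.

```matlab
x = 0:0.1:2*pi;                              % hypothetical data
figure();
myhandle = plot(x, sin(x));                  % handle to the line object

props = get(myhandle);                       % struct of all properties and values
set(myhandle, 'LineWidth', 3);               % change one property
set(myhandle, 'Color', [1 0 0], ...
              'LineStyle', '--');            % or several in one call

currentWidth = get(myhandle, 'LineWidth');   % read a single property back
```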

Different parts of the figure are organized hierarchically:

>> gcf
>> gca
>> get(gca,'Children')
>> get(gcf,'Children')
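
The hierarchy can be walked in both directions with the 'Children' and 'Parent' properties. A minimal sketch, assuming a fresh figure containing a single line plot:

```matlab
fig = figure();
lineHandle = plot(1:10);                    % a line inside the current axes
ax = gca;                                   % the axes that contain the line

axesInFigure  = get(fig, 'Children');       % children of the figure (the axes)
objectsInAxes = get(ax, 'Children');        % children of the axes (the line)
parentOfLine  = get(lineHandle, 'Parent');  % walking back up: line -> axes
parentOfAxes  = get(ax, 'Parent');          % axes -> figure
```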

Demo: Annotating plots

Exporting Figures - Formats

Matlab saves figures using its own .fig format.

To share figures or view them outside Matlab, export to other formats, including:

JPG, PNG, EPS, PDF, TIFF

Bitmap vs. Vector graphics

Two main classes of image formats: bitmap vs. vector graphics

Bitmap (jpg, png):
• Fixed image sizes
• Best for actual images (pictures of stuff)

Vector (eps, pdf):
• Variable image sizes
• Best for line / bar graphs, scatter plots, etc.

Exporting Figures

print(figHandle,filename,formattype)

e.g.: print(figure(1), 'MyPlot', '-dpng')

formats: '-dpng', '-depsc2', '-dpdf', etc.

add flag for resolution: '-r300', etc.
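
Putting the pieces together, the sketch below exports one figure as both a bitmap and a vector file. Writing to tempdir is an assumption made here to avoid cluttering the working folder; print appends the file extension for the chosen format.

```matlab
x = 0:0.1:2*pi;                             % hypothetical data
fig = figure();
plot(x, sin(x));

outBase = fullfile(tempdir, 'MyPlot');      % hypothetical output location
print(fig, outBase, '-dpng', '-r300');      % bitmap at 300 dpi: MyPlot.png
print(fig, outBase, '-dpdf');               % vector graphics: MyPlot.pdf
```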

Demo: Exporting Figures

Other Resources

2D and 3D visualization examples:
http://www.mathworks.com/help/matlab/2-and-3d-plots.html

Custom colormaps:
http://colorbrewer.org

Panel:
http://www.mathworks.com/matlabcentral/fileexchange/20003-panel

Colors in figures (blog post):
http://figuredesign.blogspot.com/2012/04/meeting-recap-colors-in-figures.html

Poor Graphs

Figure 1. Classification of TFBS Regions

TFBS regions for Sp1, cMyc, and p53 were classified based upon proximity to annotations (RefSeq, Sanger hand-curated annotations, GenBank full-length mRNAs, and Ensembl predicted genes). The proximity was calculated from the center of each TFBS region. TFBS regions were classified as follows: within 5 kb of the 5′ most exon of a gene, within 5 kb of the 3′ terminal exon, or within a gene, novel or outside of any annotation, and pseudogene/ambiguous (TFBS overlapping or flanking pseudogene annotations, limited to chromosome 22, or TFBS regions falling into more than one of the above categories).


Cawley et al., Cell, Volume 116, Issue 4, 20 February 2004.

Poor Graphs

D.J. Cotter et al., Journal of Clinical Epidemiology 57 (2004) 1086–1095

[Figure: Proportion (0% to 100%) by Hematocrit Group (%): <30, 30-<33, 33-<36, 36-<39, >=39, All. Legend: <=8,738 units/wk, >8,738-13,944 units/wk, >13,944-21,692 units/wk, >21,692 units/wk]

Fig. 1. Distribution of epoetin dose by quartiles Q1–Q4, using mean dose per week (units/wk) disaggregated by hematocrit group. Within each epoetin dose quartile, the distribution of dosing resembles a bell-shaped curve around the recommended target hematocrit range (33% to <36%). Quartiles are represented by shaded segments on histogram bars, darkest for the first quartile (bottom), lightest for the fourth quartile (bar), with the following values: Q1, ≤8,738; Q2, >8,738 to 13,944; Q3, >13,944 to 21,692; Q4, >21,692.


Table 3. Unadjusted 1-year mortality rates per 1,000 patients, by hematocrit level and epoetin dose quartile

                   Hematocrit group
Dose quartile(a)   <30%   30% to <33%   33% to <36%   36% to <39%   ≥39%   All
Q1                 271    245           185           184           177    203
Q2                 344    278           212           195           186    232
Q3                 425    316           247           199           180    265
Q4                 501    354           280           227           196    310
All                412    297           225           200           186    251

(a) For dose quartiles, see Table 2.
