5/19/2009
1
Simulation of Molecular
Evolution
with Bioinformatics Analysis
Barbara N. Beck, Rochester Community and Technical College, Rochester, MN
Project created by:
Barbara N. Beck, Ph.D., Rochester Community and Technical College, Rochester, MN
Chi-Cheng Lin, Ph.D., Winona State University, Winona, MN
Mingrui Zhang, Ph.D., Winona State University, Winona, MN
Gayle Olsen, M.S., C.N.P., Winona State University, Winona, MN
Robyn L. Keyport, M.Ed., Hastings High School, Hastings, MN
Learning objective
Students will cause a set of molecules to “evolve” and
then use bioinformatics computational tools to analyze
the relatedness of this set of molecules, displaying the
results as phylogenetic trees.
Students will gain an understanding of how phylogenetic
trees display evolutionary relationships.
• The “evolved” set of molecules is created by manipulating strings of
Pop-It beads consisting of four colors of beads, representing the four
DNA bases.
• Base substitutions result from substituting one color of bead for
another.
• The ancestor (original sequence) molecule diverges into two lineages, each of which undergoes an independent mutation.
• Each of these lineages also diverges, with the descendants undergoing independent mutations, until a population of eight lineages is created.
#1 ______●______________________________
#2 _________________●____________________
#1a ______●__________●___________________
#1b ______●___________________●__________
#2a _________________●___●________________
#2b ___________●_____●____________________
#1aa ______●__________●______________●_____
#1ab ______●__________●__________________●_
#1ba ______●_______________●___●___________
#1bb ______●________●__________●___________
#2aa __●______________●___●________________
#2ab _______________●_●___●________________
#2ba ___________●_____●_______________●____
#2bb ___________●_____●_________●__________
● = a mutated site
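The branching scheme in the diagram can also be tried out in software before working with the beads. The following is a minimal sketch, not part of the original exercise files: the function names `mutate` and `evolve`, the ancestor sequence, and its length are all illustrative assumptions. Each lineage splits in two per generation, and each daughter receives one independent random base substitution.

```python
import random

BASES = "ACGT"

def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen site changed to a different base."""
    pos = rng.randrange(len(seq))
    new_base = rng.choice([b for b in BASES if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]

def evolve(ancestor, generations=3, seed=None):
    """Split every lineage in two per generation; each daughter gets one new mutation."""
    rng = random.Random(seed)
    lineages = {"": ancestor}
    for gen in range(generations):
        # First split is labeled 1/2, later splits append a/b (as in the diagram).
        suffixes = ("1", "2") if gen == 0 else ("a", "b")
        lineages = {
            name + suffix: mutate(seq, rng)
            for name, seq in lineages.items()
            for suffix in suffixes
        }
    return lineages

if __name__ == "__main__":
    ancestor = "ACGT" * 10  # arbitrary example ancestor, not the one used in class
    for name, seq in evolve(ancestor, seed=2009).items():
        print(name, seq)
```

Three generations yield the eight tips 1aa through 2bb. Unlike the diagram, this sketch allows a later mutation to land on an already mutated site, so a tip may differ from the ancestor at fewer than three positions.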
Data recording
• Line up bead strings with “bead-size” Excel spreadsheet pages
(cells in spreadsheet same size as beads) to record mutations
(changes in bead color).
Data recording
• Transfer changes to “live” spreadsheet.
Data recording
• Save the altered file with a new name, then select the data-containing
cells (rows 3-10, columns B through BG) and click Copy.
Data format conversion
• We need to convert the bead data from RGBY (bead colors) to
ACGT (nucleotides) and to convert it to FASTA format, a format
compatible with the publicly available analysis tools.
• Open a new document in WordPad or Notepad (Notepad is easier) and click
Paste to transfer your data into the text editor.
Data format conversion
• Click File, Save as, choose a location for storing the file (the
Desktop is easy), type in a filename, and in the Save as type field,
choose Text Document (*.txt).
Data format conversion
• Open the Java file BBead.jar by double-clicking it.
• Browse for the data file you just saved (as a .txt file) and select it as
the input file.
Data format conversion
• Type in an output file name and then click Convert Sequence.
• The converted sequence will be displayed on the screen and the
output file will be created in the location of the BBead.jar file.
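For readers curious about what the conversion involves, here is a minimal sketch of an RGBY-to-FASTA conversion. It is not the BBead.jar code. The color-to-base mapping, the function name `beads_to_fasta`, and the assumption that each pasted row is a tab-separated list of single-letter bead colors are all illustrative; the slides do not specify the mapping or input layout that BBead.jar uses.

```python
# Hypothetical mapping: the slides do not state which color stands for which base.
COLOR_TO_BASE = {"R": "A", "G": "C", "B": "G", "Y": "T"}

def beads_to_fasta(text, names):
    """Convert tab-separated rows of bead colors into FASTA-formatted text."""
    rows = [line for line in text.splitlines() if line.strip()]
    if len(rows) != len(names):
        raise ValueError("expected one sequence name per data row")
    records = []
    for name, row in zip(names, rows):
        cells = [c.strip().upper() for c in row.split("\t") if c.strip()]
        # An unknown color letter raises KeyError, which flags a recording mistake.
        seq = "".join(COLOR_TO_BASE[c] for c in cells)
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + "\n"

if __name__ == "__main__":
    pasted = "R\tG\tB\tY\nY\tY\tR\tG\n"
    print(beads_to_fasta(pasted, ["1aa", "1ab"]))
```

Each FASTA record is a header line beginning with ">" followed by the sequence itself, which is the layout the analysis tools expect.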
Data analysis
• Open a web browser (for example, Internet Explorer or Firefox) and enter the
URL http://workbench.sdsc.edu.
• A new user is required to register for a FREE account
and then log in.
• We will upload our FASTA-formatted data file and use
the computational tools available through this site to
align the sequences and calculate the phylogenetic tree.
• Your work will be saved in your session.
• To start a new session, click on Session Tools.
• You will then see the following screen. Highlight Start New Session
and click Run.
Data analysis
• Name your session and then click Start New Session.
• Your new session will be highlighted as shown below.
Data analysis
• Our data are DNA sequences, so we will use Nucleic Tools.
• Click on Nucleic Tools. You will see the following page. “-Empty”
means that no DNA sequences have yet been imported.
• Click Add New Nucleic Sequence and then click Run.
Data analysis
• Click on Browse, find the .txt data file that you converted to FASTA
format, select it, and then click Upload File.
Data analysis
• Once the file is uploaded, your sequences will appear on the page.
Click Save to store them on the site.
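If an upload does not show the expected sequences, it can help to check the FASTA file locally first. The sketch below is an optional aid, not part of the Workbench or the exercise files; `read_fasta` and `check_records` are hypothetical helper names, and the default of eight sequences comes from the eight lineages created in this activity.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping each record name to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate record name: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[name] += line.upper()
    return records

def check_records(records, expected=8):
    """Return a list of problems; an empty list means nothing obvious is wrong."""
    problems = []
    if len(records) != expected:
        problems.append(f"expected {expected} sequences, found {len(records)}")
    for name, seq in records.items():
        bad = set(seq) - set("ACGT")
        if bad:
            problems.append(f"{name}: unexpected characters {sorted(bad)}")
    if len({len(seq) for seq in records.values()}) > 1:
        problems.append("sequences differ in length")
    return problems

if __name__ == "__main__":
    sample = ">1aa\nACGT\n>1ab\nACGA\n"
    print(check_records(read_fasta(sample), expected=2))
```

Because this activity uses only base substitutions, all eight sequences should have the same length; a length mismatch usually means a bead was skipped or recorded twice.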
Data analysis
• In the drop-down list, first highlight Select All Sequences (all the
boxes should get checked), and then highlight CLUSTALW –
Multiple Sequence Alignment.
Data analysis
• CLUSTALW compares and then aligns the sequences.
• You are now on a “Check” page. Verify that all your sequences are
listed. (If not, click Abort and redo the last step.) Then click
Submit.
Data analysis
• The alignment may take a minute or so. When it is finished, scroll down the page. The sequence alignments are color-coded. Blue indicates that all the nucleotides at that position are identical.
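The idea behind the blue columns can be illustrated with a few lines of code. This is a minimal sketch, assuming the aligned sequences are already available as equal-length strings; `conserved_columns` is a hypothetical helper, not a CLUSTALW function.

```python
def conserved_columns(aligned):
    """Return 0-based indices of columns where every sequence has the same base."""
    seqs = list(aligned.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i, column in enumerate(zip(*seqs))
        # A column of identical gap characters is not counted as conserved.
        if len(set(column)) == 1 and column[0] != "-"
    ]

if __name__ == "__main__":
    alignment = {"1aa": "ACGT", "1ab": "ACTT", "1ba": "AGGT"}
    print(conserved_columns(alignment))  # prints [0, 3]
```

Positions missing from the returned list are the sites where at least one lineage carries a mutation.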
Guide trees vs. phylogenetic trees
• If you scroll down below the alignment, you will see a guide tree
displayed. CLUSTALW builds a guide tree to help align the
sequences; the guide tree is not the same as a phylogenetic tree,
although it may look very similar.
• In order to calculate the phylogenetic tree, we will do the following:
• On the CLUSTALW result page, click “Import Alignment(s)” near either
the top or the bottom of the page.
Calculating phylogenetic trees
• On the result page, check the box in front of “CLUSTALW -
Nucleic”. In the drop-down box, highlight and click “DRAWTREE”.
Calculating phylogenetic trees
• Click Submit on the newly returned page and you will see the
inferred unrooted phylogenetic tree in the result. An unrooted tree
does not assume a direction of evolution and therefore will not
include a single ancestor node, but will show evolutionary distances.
Unrooted Phylogenetic Tree
Calculating phylogenetic trees
• Click Return to get back to the page showing the alignment.
Highlight and click “DRAWGRAM” in the drop-down box. Click
Submit on the newly returned page and you will see the inferred
rooted tree in the result. The rooted tree assumes there is a
direction of evolution and displays a single ancestor node.
Rooted Phylogenetic Tree
Tree analysis
The detailed results obtained will vary each time, of course, since the
input sequences are created by “random” mutation of the original
sequence. However, the overall picture should be similar:
The paired sequences, 1aa and 1ab, 1ba and 1bb, 2aa and 2ab,
2ba and 2bb should be most closely related to each other.
Then the 1aa – 1ab and 1ba – 1bb pairs should cluster, as should
the 2aa – 2ab and 2ba – 2bb pairs.
The clustering reflects the temporal order in which the lineages were
derived, thus showing that the alignment and tree-drawing
algorithms can correctly infer the ancestral relationships.
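The expected clustering can also be checked with a simple distance-based method. The sketch below uses UPGMA (average-linkage clustering) on Hamming distances as a stand-in; it is not necessarily the method used by the Workbench tools, and the function names and example sequences are illustrative assumptions.

```python
from itertools import combinations

def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))

def upgma(seqs):
    """Cluster sequences by average linkage; return the tree as nested tuples of names."""
    # Each cluster is (subtree, member_names); start with one cluster per sequence.
    clusters = [(name, [name]) for name in seqs]

    def avg_dist(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(hamming(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

if __name__ == "__main__":
    # Hand-built four-tip example: 1a/1b share one mutation, 2a/2b share another.
    tips = {
        "1a": "CACAAAAA",
        "1b": "CAACAAAA",
        "2a": "ACAACAAA",
        "2b": "ACAAACAA",
    }
    print(upgma(tips))  # prints (('1a', '1b'), ('2a', '2b'))
```

Sister sequences differ only by their most recent mutations, so they have the smallest distances and are joined first, which mirrors the pairing described above.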
Conclusions
• It is hoped that the hands-on activity of creating a set of related
molecules and then of analyzing the sequence data represented by
those molecules will give students a firmer grasp on the concept of
how phylogenetic trees display information about evolutionary
relationships.
• This activity can be extended by providing (or having students find)
protein or DNA sequences using the Taxonomy Browser tool at the
NCBI Taxonomy Homepage and then using the tools at the SDSC
Biology Workbench to align the sequences and draw the
phylogenetic trees.
All files for this exercise and a PDF version of this presentation are available by e-mail from the
presenter at [email protected]