
An introduction to metalloenzymes and biotechnological approaches to studying them 12.755 L10


Page 1: An introduction to metalloenzymes and biotechnological approaches to studying them  12.755  L10

An introduction to metalloenzymes and biotechnological approaches to studying them

12.755 L10

Urease is an enzyme that catalyzes the hydrolysis of urea into carbon dioxide and ammonia. The reaction occurs as follows:

(NH2)2CO + H2O → CO2 + 2NH3

Aconitase


Outline

• Introduction – global BGC to cellular physiology to metalloenzyme and molecular

• Categories of metalloprotein and metalloenzyme functions

• The code: amino acids

• The Genomic Firehose

• Bioinformatic terminology

• Integrated Microbial Genomics Portal


Roles of metal in biology (From Bioinorganic Chemistry, Lippard and Berg)

Metalloprotein Functions

• Dioxygen Transport
  – Hemoglobin-myoglobin family
  – Hemocyanins
  – Hemerythrins

• Electron Transfer (e.g. nitrogen fixation)

• Structural Roles (zinc fingers)

Metalloenzyme Functions (Note: Metalloenzymes are metalloproteins that perform a catalytic function)

• Hydrolytic Enzymes (Carbonic Anhydrases)

• Two Electron Redox Enzymes (Nitrate Reductase, oxidation of hydrocarbons by P-450)

• Multielectron Pair Redox Enzymes (Cytochrome c Oxidase, PSII, Nitrogenase)

• Rearrangements (Vitamin B12)


Metalloenzymes in Photosynthesis


Metalloenzymes in Photosynthesis (From Raven 2000)


Metalloenzymes in carbon fixation


Metalloenzymes in Nitrogen Utilization


Metalloenzymes in the Nitrogen Biogeochemical Cycle


Key enzyme in the nitrification reaction:

ammonia (NH3) → hydroxylamine (NH2OH) → nitrite (NO2-)

Found in ammonia-oxidizing bacteria (AOB) but not in the more abundant ammonia-oxidizing archaea (AOA)

24 hemes (irons) per molecule!

What does nature actually use in the oceans if this enzyme is not present?


How does a particular amino acid sequence create the function of a metalloprotein or the activity of a metalloenzyme?


“The sequence itself is not informative; it must be analyzed by comparative methods against existing databases to develop hypothesis concerning relatives and function.”

Terminology for comparing sequences:

• Identity: The extent to which two (nucleotide or amino acid) sequences are invariant.

• Similarity: The extent to which nucleotide or protein sequences are related. The extent of similarity between two sequences can be based on percent sequence identity and/or conservation. In BLAST, similarity refers to a positive matrix score.

• Conservation: Changes at a specific position of an amino acid (or, less commonly, DNA) sequence that preserve the physico-chemical properties of the original residue.

• Homology: Similarity attributed to descent from a common ancestor. NOTE: it is binary; sequences have homology or they do not. Something cannot be “highly homologous”.

• Source: http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/glossary2.html
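The distinction between identity and similarity can be made concrete with a small sketch. It assumes the two sequences are already aligned to equal length, and it approximates "conservation" with a hand-picked grouping of amino acids; that grouping, the function names, and the example peptides are all illustrative assumptions. BLAST itself counts positions with a positive substitution-matrix score instead.

```python
# Illustrative groups of amino acids with similar physico-chemical properties.
# This grouping is an assumption made for the sketch; BLAST uses a
# substitution matrix (e.g. BLOSUM62) and counts positive scores instead.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
]


def is_conservative(a, b):
    """True if residues a and b fall in the same illustrative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def compare_aligned(seq1, seq2, gap="-"):
    """Percent identity and similarity over the full alignment length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    identical = 0
    conserved = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if gap in (a, b):
            continue  # a gap position counts as neither identical nor conserved
        if a == b:
            identical += 1
        elif is_conservative(a, b):
            conserved += 1
    length = len(seq1)
    return {
        "identity": 100.0 * identical / length,
        "similarity": 100.0 * (identical + conserved) / length,
    }


# Hypothetical aligned peptides: D/E and L/V are conservative changes here.
result = compare_aligned("HCDKL", "HCEKV")
print(result)
```

Note that neither number says anything about homology, which is an inference about common ancestry rather than a measured quantity.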


BLAST: Basic Local Alignment Search Tool

• E-Values: Expectation value. The number of different alignments with scores equivalent to or better than S that are expected to occur in a database search by chance. The lower the E value, the more significant the score.

• In the limit of sufficiently large sequence lengths m and n, the statistics of HSP scores are characterized by two parameters, K and lambda. Most simply, the expected number of HSPs with score at least S is given by the formula $E = K m n e^{-\lambda S}$. The parameters K and lambda can be thought of simply as natural scales for the search space size and the scoring system respectively.

• We call E the E-value for the score S. This formula makes eminently intuitive sense. Doubling the length of either sequence should double the number of HSPs attaining a given score. Also, for an HSP to attain the score 2x it must attain the score x twice in a row, so one expects E to decrease exponentially with score.

• Raw Score: The score of an alignment, S, calculated as the sum of substitution and gap scores. Substitution scores are given by a look-up table (see PAM, BLOSUM). Gap scores are typically calculated as the sum of G, the gap opening penalty, and L, the gap extension penalty. For a gap of length n, the gap cost would be G+Ln. The choice of gap costs G and L is empirical, but it is customary to choose a high value for G (10-15) and a low value for L (1-2).

• HSP: High-scoring segment pair. Local alignments with no gaps that achieve one of the top alignment scores in a given search.

Sources:
• http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/similarity.html
• http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html
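The two formulas on this slide, the E-value $E = K m n e^{-\lambda S}$ and the gap cost G+Ln, are easy to try numerically. The sketch below is only an illustration: the function names are made up, and the values chosen for K, lambda, the sequence lengths, and the scores are placeholder assumptions, not parameters taken from any particular BLAST run.

```python
import math


def expected_hsps(score, m, n, K, lam):
    """E-value: expected number of HSPs with score >= `score` by chance."""
    return K * m * n * math.exp(-lam * score)


def gap_cost(G, L, n):
    """Cost of a gap of length n with opening penalty G and extension penalty L."""
    return G + L * n


# Placeholder parameters, assumed purely for illustration.
K_EXAMPLE = 0.13
LAMBDA_EXAMPLE = 0.32
QUERY_LENGTH = 250          # m
DATABASE_LENGTH = 1_000_000  # n

e_50 = expected_hsps(50, QUERY_LENGTH, DATABASE_LENGTH, K_EXAMPLE, LAMBDA_EXAMPLE)
e_100 = expected_hsps(100, QUERY_LENGTH, DATABASE_LENGTH, K_EXAMPLE, LAMBDA_EXAMPLE)
e_double_m = expected_hsps(50, 2 * QUERY_LENGTH, DATABASE_LENGTH, K_EXAMPLE, LAMBDA_EXAMPLE)

print(f"E(S=50)  = {e_50:.3g}")
print(f"E(S=100) = {e_100:.3g}")
print(f"doubling m multiplies E by {e_double_m / e_50:.1f}")
print(f"gap of length 3 with G=11, L=1 costs {gap_cost(11, 1, 3)}")
```

Doubling either sequence length doubles E, while doubling the score squares the exponential factor, which is the behavior the slide describes.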


Program: Description

blastp: Compares an amino acid query sequence against a protein sequence database.

blastn: Compares a nucleotide query sequence against a nucleotide sequence database.

blastx: Compares a nucleotide query sequence translated in all reading frames against a protein sequence database. You could use this option to find potential translation products of an unknown nucleotide sequence.

tblastn: Compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.

tblastx: Compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. Please note that the tblastx program cannot be used with the nr database on the BLAST Web page because it is computationally intensive.
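To make "translated in all reading frames" concrete, here is a minimal sketch of a six-frame translation using the standard genetic code, together with a lookup that picks a program from the query and database types listed above. The function names and the short test sequence are illustrative assumptions; this is not how BLAST is implemented internally.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become 'X', '*' is stop."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translations of the three forward and three reverse reading frames."""
    frames = {}
    reverse = reverse_complement(dna)
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def choose_program(query_type, database_type, translate_both=False):
    """Pick a BLAST program name from the sequence types in the table above."""
    if query_type == "nucleotide" and database_type == "nucleotide":
        return "tblastx" if translate_both else "blastn"
    table = {
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",
        ("protein", "nucleotide"): "tblastn",
    }
    return table[(query_type, database_type)]


# Hypothetical 9-base sequence used only to show the six frames.
for frame, peptide in six_frame_translations("ATGGCCTAA").items():
    print(frame, peptide)
```

The six-fold increase in sequences to compare on both sides is one way to see why tblastx is described as computationally intensive.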


There are many metalloenzymes, often carrying out crucial cellular biochemical (and biogeochemical) processes.

Enzymes containing metals:
• Superoxide dismutase
• Urease
• Aconitase
• Zinc finger proteins
• Carbonic anhydrase
• Alkaline phosphatase
• DNA polymerase
• Nitrate reductase
• Multi-copper oxidase
• uvrA (ultraviolet resistance gene)
• Ferredoxin
• Nitrogenase
• Many more…

There are also many proteins and enzymes that are involved in metal processes (uptake, storage, insertion, transformations, etc.).


Integrated Microbial Genomics
Joint Genome Institute, Department of Energy

The U.S. Department of Energy (DOE) Office of Science supports innovative, high-impact, peer-reviewed biological science to seek solutions to difficult DOE mission challenges. These challenges include finding alternative sources of energy, understanding biological carbon cycling as it relates to global climate change, and cleaning up environmental wastes.

• Cleanup of toxic-waste sites worldwide.
• Production of novel therapeutic and preventive agents and pathways.
• Energy generation and development of renewable energy sources (e.g., methane and hydrogen).
• Production of chemical catalysts, reagents, and enzymes to improve efficiency of industrial processes.
• Management of environmental carbon dioxide, which is related to climate change.
• Detection of disease-causing organisms and monitoring of the safety of food and water supplies.
• Use of genetically altered bacteria as living sensors (biosensors) to detect harmful chemicals in soil, air, or water.
• Understanding of specialized systems used by microbial cells to live in natural environments with other cells.

http://microbialgenomics.energy.gov/index.shtml


The Integrated Microbial Genomes (IMG) system serves as a community resource for comparative analysis and annotation of all publicly available genomes from three domains of life, in a uniquely integrated context.

Go To: http://img.jgi.doe.gov/


Compile list of Organisms


IMG Carts

• Carts are needed since IMG resets your session’s cache when you leave the site.

• Carts are an easy way to save a list of:
  – Organisms (e.g. all cyanobacteria)
  – Genes (e.g. you have a list of genes that code for superoxide dismutase in 16 different organisms)
  – Functions (e.g. you have a list of the most popular metalloenzymes in the form of COG, Pfam, TIGRFAM, or EC number)

• Saved as tab-delimited text files


Organism Cart (cyanobac)

taxon_oid	Genome Name	Sequencing Status	Domain	Genes	GC Perc	Bases
641228474	Acaryochloris marina MBIC11017	Finished	Bacteria	8488	0.47	8361599
637000006	Anabaena variabilis ATCC 29413	Finished	Bacteria	5764	0.41	7068601
638341074	Crocosphaera watsonii WH 8501	Draft	Bacteria	6004	0.37	6238156
640612201	Cyanothece sp. CCY 0110	Draft	Bacteria	6520	0.37	5880532
637000121	Gloeobacter violaceus PCC 7421	Finished	Bacteria	4488	0.62	4659019
640963043	Leptolyngbya valderiana BDU 20041	Draft	Bacteria	12	0.53	89264
639857035	Lyngbya sp. PCC 8106	Draft	Bacteria	6185	0.41	7037511
639857037	Nodularia spumigena CCY9414	Draft	Bacteria	4904	0.41	5316258
638341137	Nostoc punctiforme PCC 73102	Draft	Bacteria	7818	0.41	9020037
637000199	Nostoc sp. PCC 7120	Finished	Bacteria	6217	0.41	7211789
640069321	Prochlorococcus marinus AS9601	Finished	Bacteria	1982	0.31	1669886
640753041	Prochlorococcus marinus MIT 9215	Finished	Bacteria	2056	0.31	1738790
640069322	Prochlorococcus marinus MIT 9301	Finished	Bacteria	1963	0.31	1641879
640069323	Prochlorococcus marinus MIT 9303	Finished	Bacteria	3127	0.5	2682675
637000210	Prochlorococcus marinus MIT 9312	Finished	Bacteria	1856	0.31	1709204
637000211	Prochlorococcus marinus MIT 9313	Finished	Bacteria	2345	0.51	2410873
640069324	Prochlorococcus marinus MIT 9515	Finished	Bacteria	1964	0.31	1704176
640069325	Prochlorococcus marinus NATL1A	Finished	Bacteria	2247	0.35	1864731
637000212	Prochlorococcus marinus NATL2A	Finished	Bacteria	1942	0.35	1842899
637000213	Prochlorococcus marinus marinus CCMP1375	Finished	Bacteria	1932	0.36	1751080
637000214	Prochlorococcus marinus pastoris CCMP1986	Finished	Bacteria	1765	0.31	1657990
641228501	Prochlorococcus marinus str. MIT 9211	Finished	Bacteria	1901	0.38	1688963
637000307	Synechococcus elongatus PCC 6301	Finished	Bacteria	2584	0.55	2696255
637000308	Synechococcus elongatus PCC 7942	Finished	Bacteria	2715	0.55	2742269
639857006	Synechococcus sp. BL107	Draft	Bacteria	2553	0.54	2283377
637000309	Synechococcus sp. CC9311	Finished	Bacteria	2945	0.52	2606748
637000310	Synechococcus sp. CC9605	Finished	Bacteria	2756	0.59	2510659
637000311	Synechococcus sp. CC9902	Finished	Bacteria	2358	0.54	2234828
637000312	Synechococcus sp. JA-2-3Ba(2-13)	Finished	Bacteria	2938	0.58	3046680
637000313	Synechococcus sp. JA-3-3Ab	Finished	Bacteria	2891	0.6	2932766
640427148	Synechococcus sp. RCC307	Finished	Bacteria	2583	0.61	2224914
639857007	Synechococcus sp. RS9916	Draft	Bacteria	3009	0.6	2664465
638341213	Synechococcus sp. RS9917	Draft	Bacteria	2820	0.64	2579542
638341214	Synechococcus sp. WH 5701	Draft	Bacteria	3401	0.65	3043834
640427149	Synechococcus sp. WH 7803	Finished	Bacteria	2586	0.6	2366980
638341215	Synechococcus sp. WH 7805	Draft	Bacteria	2938	0.58	2620367
637000314	Synechococcus sp. WH 8102	Finished	Bacteria	2586	0.59	2434428
637000315	Synechocystis sp. PCC 6803	Finished	Bacteria	3626	0.47	3947019
637000320	Thermosynechococcus elongatus BP-1	Finished	Bacteria	2554	0.54	2593857
637000329	Trichodesmium erythraeum IMS101	Finished	Bacteria	5124	0.34	7750108
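Because carts are saved as tab-delimited text, they are straightforward to read back in a script. The sketch below parses three rows copied from the organism cart above using Python's standard csv module. It assumes an exported file carries the same column headers as the slide, which may not match a real IMG export exactly, and the helper names are hypothetical.

```python
import csv
import io

# Three rows copied from the organism cart; the header names are assumed
# to match the exported file.
CART_SAMPLE = (
    "taxon_oid\tGenome Name\tSequencing Status\tDomain\tGenes\tGC Perc\tBases\n"
    "637000314\tSynechococcus sp. WH 8102\tFinished\tBacteria\t2586\t0.59\t2434428\n"
    "638341074\tCrocosphaera watsonii WH 8501\tDraft\tBacteria\t6004\t0.37\t6238156\n"
    "637000214\tProchlorococcus marinus pastoris CCMP1986\tFinished\tBacteria\t1765\t0.31\t1657990\n"
)


def read_cart(handle):
    """Read a tab-delimited organism cart into a list of dicts with typed numbers."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["Genes"] = int(row["Genes"])
        row["GC Perc"] = float(row["GC Perc"])
        row["Bases"] = int(row["Bases"])
        rows.append(row)
    return rows


def genes_per_megabase(row):
    """Gene density, a simple derived quantity for comparing genomes."""
    return row["Genes"] / (row["Bases"] / 1_000_000)


cart = read_cart(io.StringIO(CART_SAMPLE))
finished = [row for row in cart if row["Sequencing Status"] == "Finished"]
for row in finished:
    print(row["Genome Name"], round(genes_per_megabase(row)), "genes/Mb")
```

For a saved cart, `io.StringIO(CART_SAMPLE)` would be replaced by an opened file handle.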


Gene cart (Cu/Zn superoxide dismutase)

gene_oid	Locus Tag	Gene Symbol	Product Name	AA Seq Length	Genome
641254312	AM1_5239	sodCC	copper/zinc superoxide dismutase	196	Acaryochloris marina MBIC11017
637459373	glr1981		similar to superoxide dismutase	233	Gloeobacter violaceus PCC 7421
637459565	glr2170		similar to superoxide dismutase	191	Gloeobacter violaceus PCC 7421
640015250	L8106_24545		superoxide dismutase	201	Lyngbya sp. PCC 8106
639885006	BL107_14050		putative superoxide dismutase	198	Synechococcus sp. BL107
638115359	sync_1771	sodC	Copper/zinc superoxide dismutase	175	Synechococcus sp. CC9311
637776096	Syncc9605_1507		superoxide dismutase precursor (Cu-Zn)	178	Synechococcus sp. CC9605
637771156	Syncc9902_0982		putative superoxide dismutase	175	Synechococcus sp. CC9902
640545246	SynRCC307_0325	sodC	Superoxide dismutase [Cu-Zn] (EC:1.15.1.1)	175	Synechococcus sp. RCC307
639889548	RS9916_26849		superoxide dismutase precursor (Cu-Zn)	177	Synechococcus sp. RS9916
640543304	SynWH7803_0951	sodC	Superoxide dismutase [Cu-Zn] (EC:1.15.1.1)	174	Synechococcus sp. WH 7803
639020551	WH7805_01302		putative superoxide dismutase	174	Synechococcus sp. WH 7805


Function Cart (metalloenzymes)

func_id	func_name
COG0619	ABC-type cobalt transport system, permease component CbiQ and related transporters
COG1122	ABC-type cobalt transport system, ATPase component
COG1930	ABC-type cobalt transport system, periplasmic component
COG2032	Cu/Zn superoxide dismutase
COG2140	Thermophilic glucose-6-phosphate isomerase and related metalloenzymes
COG3227	Zinc metalloprotease (elastase)
COG4097	Predicted ferric reductase
COG4300	Predicted permease, cadmium resistance protein
pfam01676	Metalloenzyme
pfam01794	Ferric_reduct
pfam02022	Integrase_Zn
pfam02361	CbiQ
pfam02553	CbiN
pfam02742	Fe_dep_repr_C
pfam03596	Cad


In-Class Exercise on IMG: http://img.jgi.doe.gov

• Load genomes
  – Go to “FIND GENOMES”
  – Click “VIEW PHYLOGENETICALLY”
  – Click “CLEAR ALL” to unselect all genomes
  – Click “ALL” after the Cyanobacteria listings to select all cyanobacterial genomes
  – Click “SAVE SELECTIONS” to choose only these selected cyanobacterial genomes. Note that the top of the page should now say 40 genomes selected.

• Gene search for superoxide dismutase, using the “FIND GENES” function
  – By “GENE SEARCH”: type in superoxide dismutase and hit search. Note that this will only return genes that have been “annotated” as a superoxide dismutase by a previous computer or human annotator. Go ahead and grab a sequence for Synechococcus strain WH8102’s nickel superoxide dismutase by clicking on the 474bp link and copying the sequence to the clipboard (highlight the area and hit Control-C). Note that this is the DNA sequence.
  – Click the “FIND GENES” tab and then the “BLAST” tab. Paste the nickel superoxide dismutase sequence into the open box.
  – Choose BLASTn for a nucleotide (DNA) search
  – Set the cutoff value to 1e-2 (less stringent).
  – Note that the best hit is where you got the sequence from.
  – Repeat, but now with the amino acid sequence instead of the DNA sequence


Blast results

Sequences producing significant alignments:	(bits)	E-Value

637000314.NC_005070 Synechococcus sp. WH 8102, complete genome.	940	0.0
637000310.NC_007516 Synechococcus sp. CC9605, complete genome.	389	e-105
639857006.NZ_AATZ01000003 Synechococcus sp. BL107, unfinished se...	311	3e-82
637000311.NC_007513 Synechococcus sp. CC9902, complete genome.	287	4e-75
637000309.NC_008319 Synechococcus sp. CC9311, complete genome.	208	3e-51
640069323.NC_008820 Prochlorococcus marinus str. MIT 9303, compl...	168	3e-39
637000211.NC_005071 Prochlorococcus marinus str. MIT 9313, compl...	153	2e-34
640963030.NZ_ABCS01000039 Plesiocystis pacifica SIR-1, unfinishe...	68	7e-09
637000213.NC_005042 Prochlorococcus marinus subsp. marinus str. ...	54	1e-04
641228501.NC_009976 Prochlorococcus marinus str. MIT 9211, compl...	48	0.007
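The effect of the E-value cutoff can be explored by parsing a hit list like the one above. The following sketch copies four of the result lines, reads the bit score and E-value from the end of each line, and filters on a cutoff. The parsing approach and function names are assumptions for illustration; note that this output style abbreviates 1e-105 as "e-105", which the sketch handles explicitly.

```python
# Four hit lines copied from the results above (description, bits, E-value).
HITS = """\
637000314.NC_005070 Synechococcus sp. WH 8102, complete genome. 940 0.0
637000310.NC_007516 Synechococcus sp. CC9605, complete genome. 389 e-105
637000213.NC_005042 Prochlorococcus marinus subsp. marinus str. ... 54 1e-04
641228501.NC_009976 Prochlorococcus marinus str. MIT 9211, compl... 48 0.007
"""


def parse_evalue(text):
    """Convert an E-value string to float; 'e-105' is shorthand for 1e-105."""
    return float("1" + text) if text.startswith("e") else float(text)


def parse_hits(text):
    """Return (description, bit score, E-value) tuples, one per non-empty line."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The last two whitespace-separated fields are the score and E-value.
        description, bits, evalue = line.rsplit(None, 2)
        hits.append((description, int(bits), parse_evalue(evalue)))
    return hits


def filter_hits(hits, cutoff):
    """Keep hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit[2] <= cutoff]


hits = parse_hits(HITS)
print(len(filter_hits(hits, 1e-2)), "hits pass the less stringent cutoff 1e-2")
print(len(filter_hits(hits, 1e-5)), "hits pass a stricter cutoff of 1e-5")
```

With the cutoff of 1e-2 used in the exercise all four lines are kept, while a stricter cutoff drops the two weakest Prochlorococcus hits.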


The COG database: new developments in phylogenetic classification of proteins from complete genomes

Roman L. Tatusov, Darren A. Natale, Igor V. Garkavtsev, Tatiana A. Tatusova, Uma T. Shankavaram, Bachoti S. Rao, Boris Kiryutin, Michael Y. Galperin, Natalie D. Fedorova, and Eugene V. Koonin

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis.

Growth dynamics of the COG set with the increase of number of included genomes. The circles show the sequence of genome inclusion according to the actual order of sequencing, and the smooth line shows the mean of 10^6 random permutations of the genome order. The colored area indicates the range between the maximal and minimal value for each point (number of genomes) in 10^6 random permutations.

Nucleic Acids Res. 2001 January 1; 29(1): 22–28.


End for today