Sequence-based approaches to uncultivated microbes
Susannah Green Tringe
DOE Joint Genome Institute
Metagenome Program Lead

Department of Energy: Mission areas
• Bioenergy
• Carbon Cycle
• Biogeochemistry
DOE Joint Genome Institute
Walnut Creek, CA since 1999
Mission: To advance genomics in support of the DOE mission

Uncultivated microbial communities
Wetland carbon cycling
Root-associated communities

JGI Programs & Infrastructure
[Diagram labels: Bioenergy, Carbon Cycling, Biogeochemistry; Plants, Fungi, Microbes, Metagenomes; DNA Sequencing, Genomic Technologies, Computational Analysis, Synthesis]

What is metagenomics?
Genomics: isolate (pure culture), "tame"
Metagenomics: microbial community, "wild"
tame ≠ wild

Most bacteria don’t grow on plates: “the great plate count anomaly”

16S ribosomal RNA as a phylogenetic marker
70S ribosome subunits:
• 30S: 16S rRNA, 21 proteins
• 50S: 5S rRNA, 23S rRNA, 34 proteins
Escherichia coli 16S rRNA: primary and secondary structure
Falk Warnecke

16S-based phylogeny
Carl Woese, 1928-2012
Woese, Microbiol Rev 1987
Norm Pace

16S rRNA in environmental microbiology
Falk Warnecke

Bacterial phylogenetic tree expansion
[Figure: phylogenetic tree with cultured and uncultured lineages]
Modified from Baker et al 2013 (Microbe)

The situation is similar for archaea
Baker et al 2013 (Microbe)

Why metagenomics?
Most microbes are uncultured!
• Industrial enzymes
• Antibiotics
• Greenhouse gas cycling
Sulzenbacher et al 1997; www.chm.bris.ac.uk

Functional metagenomics
Turbomycin synthesis genes isolated from a soil metagenome library (Gillespie 2002)
Schloss & Handelsman 2003

Discovery of proteorhodopsin
Rhodopsin-like gene – never before seen in bacteria!
[Figure labels: Bacterial rRNA operon]
de la Torre, J. R. et al. (2003) Proc. Natl. Acad. Sci. USA 100, 12830-12835
Copyright ©2003 by the National Academy of Sciences

Shotgun metagenomics

Shotgun sequencing: Could it be done?
Schloss & Handelsman 2003

Metagenome assembly - like putting together several jigsaw puzzles
Falk Warnecke

. . . with some pieces missing
Falk Warnecke

Can we still reconstruct?
Falk Warnecke

One approach: a simple community
[Diagram labels: influent, anaerobic, aerobic, sedimentation tank, effluent, return sludge, waste sludge; EUB, PAO]
Enhanced Biological Phosphate Removal (EBPR) reactors are dominated by Candidatus Accumulibacter phosphatis (CAP), but it cannot be grown in pure culture
Crocetti et al., 2001

Proposed biological model for polyP-accumulating organisms (PAOs)
[Diagram: one cell in the anaerobic zone and one in the aerobic zone; labels: acetate, Acetyl-CoA, PHA, Glycogen, PolyP, PO4, ATP, ADP, NAD, NADH, TCA cycle, CO2]

CAP metabolic reconstruction
[Figure panels: Anaerobic Phase, Aerobic Phase]
Garcia Martin et al. 2006

What about more complex communities?
[Figure: species complexity axis with ticks at 1, 10, 100, 1000, 10000; labels: EBPR sludge, Sargasso Sea, Soil, "?"]

Environmental Gene Tags (EGTs)
[Diagram labels: habitat A, habitat B; adaptive gene for habitat A, adaptive gene for habitat B, essential gene]

Comparative metagenomics
• COG5524: Bacteriorhodopsin
• COG1292: Choline-glycine betaine transporter
• COG3459: Cellobiose phosphorylase
Tringe et al 2005

Metagenome sequence output
[Chart: sequence output (Tb) by fiscal year, 2004 to 2015; series: JGI sequenced bases, metagenome bases; annotations: 40 Gb, 30 Tb, 100 Tb]

meta-GENOMEs
• Hess 2012: 15 rumen genomes
• Wrighton 2012: 3 uncultivated phyla, 49 genomes
• Iverson 2012: Marine Group A

Annotation of metagenomes
[Workflow diagram labels: Shotgun data; Assembly; Contig fasta; Singlet fasta; Contig stats; ORF prediction; RNA prediction; Genome binning, extraction, improvement; ORF prediction; RNA prediction]
Contig stats:
Contig    Depth   G+C
Contig1   10.1    .49
Contig2    5.6    .35
Contig3   19.8    .38
Jill Banfield, UCB
Chongle Pan, ORNL

The single-cell approach: how it works
• isolation
• lysis
• MDA: template DNA + polymerase + random hexamers + dNTPs
Tanja Woyke

Single cell genomics: key challenges
Steps: isolation, lysis
Challenges:
Sample
contamination (‘hitchhiker’ DNA)
No universal lysis for all taxa
MDA: chimerism, reagent contamination, MDA bias
Tanja Woyke

JGI single-cell sequencing pipeline
[Pipeline diagram labels: Sample; Community profiling; Single cell isolation; Whole genome amplification; 16S rRNA gene identification; Genome sequencing; Data QC; Assembly; Annotation; Data curation; Draft genome]
Tanja Woyke

Current limitations: 100 cells ≠ 100 SAGs
o Not every cell can be isolated
o Not every cell can be lysed and WGA’d
o Not every cell can be 16S ID’d
[Pipeline diagram labels as on the preceding pipeline slide, plus "???"]
Tanja Woyke

Recovered diversity: 16S tags vs SAGs
Modified from Clingenpeel et al 2014 (Frontiers in Microbiol)
Tanja Woyke

[Diagram: approaches and the data they yield]
marker genes
culture: draft/complete genomes
single cell: partial draft genomes, complete genomes (rarely)
target cell enrichment: draft genomes, complete genomes
metagenomics: unassembled data, genome bins, complete genomes
Tanja Woyke

[Diagram labels: marker genes, culture, single cell, target cell enrichment, metagenomics; metagenomic approach, single-cell approach]
Tanja Woyke

Uncultivated microbial communities
Wetland carbon cycling
Root-associated communities

Wetland restoration
Global land area: wetlands (9%)
Global carbon: carbon stored by wetlands (35% of all terrestrial carbon)
[Diagram labels: CH4, CO2]

Wetland restoration
What determines if a wetland serves as a greenhouse gas source or a greenhouse gas sink?
[Diagram labels: CH4, CO2]

San Francisco Bay wetlands
PAST: Wetland
PRESENT: Farmland, Salt pond
www.wrmp.org

Subsidence and carbon loss
[Figure labels: 1850s, Elevation loss, Levee failure]
Mount and Twiss 2005

Twitchell Island restored wetland - Est.
1997
Peat accretion: ~4 cm/year
Net GHG budget: -494 g CO2 eq/m2
Shaomei He, Mark Waldrop, Lisamarie Windham-Myers

Twitchell wetland: sampling site gradients
[Diagram: sites A, B, C/L between water inlet and water outflow; gradients: methane flux; peat accretion; oxygen, nitrate, sulfate]
Do the microbial communities reflect these changes in geochemistry?

What controls methane (CH4)?
• Abundance
• Activity
• Species composition

Relative Gene Family Abundances
Samples with more methanogenesis genes have fewer genes in denitrification, dissimilatory sulfate reduction, and metal reduction.
Methane oxidation genes were more abundant in rhizomes.

Methanogen abundance
[Chart labels: DNA abundance, RNA abundance (÷2), methanogen marker genes from metagenome, CH4]
- Methane emissions correlated to methanogen ABUNDANCE and ACTIVITY

Methanogen diversity
Hydrogenotrophic Methanogenesis: CO2 + 4 H2 → CH4 + 2 H2O
Acetoclastic Methanogenesis: CH3COOH → CH4 + CO2
[Chart: share of hydrogenotrophic vs. acetoclastic methanogens, 0% to 100%, for samples Aug_A, Aug_B, Aug_C, Feb_A, Feb_B, Feb_L]
Legend ([H] = hydrogenotrophic, [A] = acetoclastic):
Methanomicrobiales (order) [H]
Methanosaetaceae;Methanosaeta [A]
Methanobacteriaceae;Methanobacterium [H]
Methanobacteriales (order) [H]
Methanosarcinaceae;Methanosarcina [A]
Methanocellaceae;Methanocella [H]
Methanomicrobiaceae;Methanofollis [H]
MSBL1;SAGMEG-1 [H]
Methanospirillaceae [H]
Methanococcales (order) [H]
Methanosarcinaceae;Methanomethylovorans [A]
Methanobacteriaceae;Methanobrevibacter [H]
Methanobacteriaceae;Methanosphaera [H]
pMC1;pMC1FA [H]

Bay / Delta Salinity Gradient
Historic and restored wetlands sampled along a salinity gradient
Methane flux varies with salinity and wetland age
[Chart labels: Salinity (ppt), Average seawater]
Susie Theroux

Uncultivated microbial communities
Wetland carbon cycling
Root-associated communities

Rhizosphere Grand Challenge
1) Rhizosphere / endophyte microbes
-provide nutrients
-protect from pathogens and stress
-influence growth
-sequester carbon
2) Challenges
-soil microbial communities are notoriously complex
-plant genomes are complex
-crosses multiple disciplines and
programs
-statistical rigor requires high sample numbers

Arabidopsis rhizosphere project
[Diagram labels: Rhizosphere, Endosphere, Soil]
Are there unique communities in each compartment?
Does the plant control access?

Rhizosphere Grand Challenge
Identifying the major determinants of microbial community assembly
[Diagram labels: Host factors, Root-associated microbial communities]
Full factorial design: 1117 samples, 16S pyrotag profiles
Variables investigated:
Soil type – Mason Farm vs. Clayton
Sample fraction – Bulk soil vs. rhizosphere vs. endophyte
Plant age – bolting (young) vs. senescent (old)
Genotype – 8 ecotypes
Individual – Aim for 10 individuals per condition
Jeff Dangl

The Arabidopsis microbiome
The endophyte community is unique, reproducible, and similar across soil types
[Figure: OTUs by sample type: Rhizosphere/Soil 1, Rhizosphere/Soil 2, Endophyte]
Lundberg Nature 2012

The Arabidopsis Microbiome
• Cultured isolates
• Single cells
• “Plate scrape” metagenomes
• Flow-sorted “mini-metagenomes”
[Figure labels: Sphingobacteriales OTU 2324, Streptomyces OTU 14834, Pseudonocardiaceae OTU 13797]
Dangl lab, UNC
Woyke lab, JGI

An endophyte genome catalog
99 isolates and 130 SAGs fully sequenced
>50% of target OTUs
[Figure: single cells and isolates by taxon: Actinobacteria, Firmicutes, Alphaproteobacteria, Cyanobacteria, Bacteroidetes, Chloroflexi, Betaproteobacteria]
S.
Clingenpeel

Recolonization and RNA-seq
                 Sterile               Colonized
Full phosphate   No bacteria full P    + Bacteria full P
Low phosphate    No bacteria low P     + Bacteria low P
Inoculation with root-associated microbes reverses low P phenotype

Conclusions
• Most microorganisms cannot be readily grown in the lab
• Nucleic acid sequencing provides a means to study uncultivated organisms via their genomes
• Next-gen sequencing is opening up new opportunities in metagenomics and single cell genomics
• These techniques are enabling advances in diverse areas of science

Acknowledgments
• EBPR sludge project
– Trina McMahon, UWM
– Phil Hugenholtz
• Wetlands project
– Shaomei He
– Lisamarie Windham-Myers, USGS
– Mark Waldrop, USGS
– Susie Theroux
– Wyatt Hartman
– Dennis Baldocchi, UCB
• Rhizosphere project
– Jeff Dangl, UNC
– Scott Clingenpeel
– Tanja Woyke

JGI: Next Generation Genome Science User Facility
• Community Science Program
• JGI-EMSL Collaborative Science Program
• Emerging Technologies Opportunity Program (ETOP)
• Visiting Scientist Program
• Distinguished Postdoctoral Fellow in Genomics

Questions?
Contact: sgtringe@lbl.gov

Genomic signatures of plant association
Taxon               # PA genomes   # NPA genomes   # total genes
Bacillus                 28             141             852,695
Burkholderiales          60             144           1,072,509
Microbacteriaceae        18              27             140,326
Micrococcaceae           13              22             117,775
Nocardiaceae             13              37             315,053
Paenibacillus            11              29             226,631
Pseudomonas             152              77           1,242,490
Rhizobiales             170             152           1,717,437
Sphingomonas              7              12              73,946
Streptomyces              9              65             513,229
Xanthomonadaceae         93              19             433,488
Comparing genomes of phylogenetically related plant-associated (PA) and non-plant-associated (NPA) bacteria to identify gene families overrepresented in PA organisms
1300 genomes (6.5M genes) included in analysis
Recurrent plant-associated functions include chemotaxis, certain transposases, carbohydrate transport and metabolism, type III secretion systems, and nodulation
Asaf Levy
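
The comparison described on the last slide (finding gene families overrepresented in PA genomes relative to NPA genomes within one taxon) can be illustrated with a small enrichment test. The slides do not say which statistic was used, so the one-sided hypergeometric (Fisher's exact) test below is an assumption, not the method of the study. The genome totals for Pseudomonas (152 PA, 77 NPA) come from the table; the gene-family presence counts and the function name `enrichment_p_value` are hypothetical.

```python
from math import comb


def enrichment_p_value(pa_with, pa_total, npa_with, npa_total):
    """One-sided hypergeometric p-value for a gene family in one taxon.

    Returns the probability of seeing the family in at least `pa_with`
    of the `pa_total` PA genomes if the genomes carrying it were
    distributed at random between PA and NPA genomes.
    """
    n_total = pa_total + npa_total   # all genomes in the taxon
    k_total = pa_with + npa_with     # genomes that carry the gene family
    upper = min(k_total, pa_total)   # largest possible PA count
    # Ways to place exactly k carriers among the PA genomes, summed over
    # all outcomes at least as extreme as the observed one.
    tail = sum(
        comb(k_total, k) * comb(n_total - k_total, pa_total - k)
        for k in range(pa_with, upper + 1)
    )
    return tail / comb(n_total, pa_total)


if __name__ == "__main__":
    # Genome totals from the table (Pseudomonas); the presence counts
    # 120 and 20 are made up purely for illustration.
    p = enrichment_p_value(pa_with=120, pa_total=152, npa_with=20, npa_total=77)
    print(f"hypothetical gene family, Pseudomonas: p = {p:.3g}")
```

In a real analysis over millions of genes, p-values like this would need a multiple-testing correction, and testing within each taxon separately (as the table's layout suggests) helps keep shared ancestry from being mistaken for plant association.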