
Novel method for comparing whole protein-coding genes between metagenomic data detects an environmental gradient for the microbiota

In a recent study published in the journal PLOS ONE, researchers from Japan compared phylogenetic trees based on proteomic data of microbiota with dendrograms of environmental factors to determine the role of environmental gradients in establishing microbial communities.




The role of environmental factors such as temperature, moisture, carbon and nitrogen content, and pH in microbial growth has been well studied. Determining correlations between changing microbial communities and environmental gradients can further our understanding of how principal environmental factors influence microbial communities.

With the advancement of sequencing technologies in the last few decades and the development of faster and more efficient bioinformatics tools, methods such as whole metagenomic shotgun sequencing, which can be used to identify new species, functional genes, and metabolic pathways, have become feasible.

With access to larger amounts of genomic and proteomic data and advanced phylogenetic tools, researchers can examine in detail the similarities and relationships among microbial species and communities, as well as their correlations with environmental factors.

About the study

The present study developed a method called Metagenomic Phylogeny by Average Sequence Similarity (MPASS), which uses average sequence similarity to compare the proteomic data obtained from whole metagenome shotgun sequencing.
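The article does not spell out how MPASS computes its score, so the following Python sketch only illustrates the general idea of turning average sequence similarity into a distance between two protein sets. The k-mer Jaccard score is a stand-in for a real alignment score, and all function names and toy sequences are hypothetical rather than taken from the study.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (an assumed stand-in for an alignment score)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def avg_best_hit(query, target):
    """Average, over the query proteins, of the best similarity found in the target set."""
    if not query or not target:
        raise ValueError("both protein sets must be non-empty")
    return sum(max(similarity(q, t) for t in target) for q in query) / len(query)


def distance(x, y):
    """Distance between two protein sets: 1 minus the symmetrized average similarity."""
    return 1.0 - 0.5 * (avg_best_hit(x, y) + avg_best_hit(y, x))


def distance_matrix(samples):
    """Pairwise distances between named samples, keyed by (name_a, name_b)."""
    names = sorted(samples)
    matrix = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        matrix[(a, b)] = matrix[(b, a)] = distance(samples[a], samples[b])
    return matrix


if __name__ == "__main__":
    # Hypothetical toy "metagenomes", each a short list of protein sequences.
    toy = {
        "sample_1": ["MKTAYIAKQR", "GAVLIPFMW"],
        "sample_2": ["MKTAYIAKQR", "GAVLIPFMW"],
        "sample_3": ["DDDDDDDD"],
    }
    for pair, value in sorted(distance_matrix(toy).items()):
        print(pair, round(value, 3))
```

A matrix of this kind could then be passed to a standard tree-building method such as neighbor joining; whether MPASS does exactly that is not stated in the article.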

To test the accuracy of MPASS in detecting sequence similarities, two simulated metagenomic datasets built from five microbial species (Acidobacterium capsulatum, Bacteroides fragilis, Nitrosospira multiformis, Proteus mirabilis, and the archaeon Sulfolobus islandicus) were analyzed using MPASS and used to construct phylogenetic trees. The branching order relationships and species compositions were then examined.

The method was then applied to an existing soil metagenomic dataset comprising 16 samples from distinct ecological biomes such as tundra, cold deserts, hot deserts, forests, and prairies. MPASS was also used to analyze an aquatic metagenomic dataset comprising 35 samples from a deep-sea hydrothermal vent, oceans, lakes, and hot springs.

Additionally, the branches of the Kirishima hot springs from the aquatic metagenomic tree were used in a program called TREEDIST to quantitatively compare the metagenome-based phylogenetic tree with a dendrogram of environmental parameters. The reported parameters of the hot springs were turbidity, pH, and concentrations of organic carbon, total nitrogen, and copper, zinc, and sulfate ions.
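The article does not say which tree-distance metric was used. As an illustration only, the Python sketch below computes the symmetric difference (Robinson-Foulds distance) for rooted trees, a common way to quantify how far two topologies disagree. The nested-tuple tree encoding, the function names, and the sample labels are all assumptions made for this example.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree written as nested tuples.

    Leaves are strings. Each clade is a frozenset of the leaf names below one
    internal node. The root clade (all leaves) is excluded because every tree
    on the same leaves shares it.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)
    return found, all_leaves


def symmetric_difference(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    clades_a, leaves_a = clades(tree_a)
    clades_b, leaves_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must be built on the same set of samples")
    return len(clades_a ^ clades_b)


if __name__ == "__main__":
    # Hypothetical four-sample trees: one from sequence data, one from
    # clustering environmental measurements.
    metagenome_tree = ((("s1", "s2"), "s3"), "s4")
    environment_tree = ((("s1", "s3"), "s2"), "s4")
    print(symmetric_difference(metagenome_tree, environment_tree))  # prints 2
```

A value of 0 means the two trees group the samples identically; larger values mean more groupings are found in only one of the trees.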

The researchers also analyzed the number of genes and species distributed across metagenomes of the same lineage and the number of genes shared between similar metagenomes.


Results

The results showed that MPASS accurately constructed metagenomic trees from both the simulated datasets and the real soil and water metagenomic samples, clustering the samples in each dataset correctly.

Furthermore, when the metagenomic tree was used to infer correlations between microbiome transitions and environmental factors, the branching orders of samples from the hydrothermal vents and hot springs clusters correlated with the distance from the heat sources as well as with increasing temperatures at sampling sites.

The metagenomic tree built using the MPASS method captured the microbiome transitions that occurred along changing environmental gradients. The tree topology also reflected the functional and taxonomic levels of the microbial dynamics. For the soil metagenomic dataset, the phylogenetic tree separated the samples into three clusters: hot desert, cold desert, and green biomes (prairie, forest, and tundra). Consistent with previous studies, the clusters reflected similarities in soil pH.

For the aquatic metagenomic dataset, the tree clustered the samples into freshwater and seawater biomes. Sulfur-reducing Epsilonproteobacteria, such as Campylobacteria, were abundant in and around hydrothermal vents, and their abundance decreased with increasing distance from the vents. Within the seawater cluster, there were three subclusters based on geographic location, with further differences in branching based on sampling depth. The freshwater samples were subclustered into lake and hot spring biomes.

The metagenomic tree was similar to the dendrograms of various environmental factors, including oxidation-reduction potential, vanadium ions, sulfate concentration, and total, total organic, dissolved organic, and particulate organic nitrogen. Aerobic, sulfur-metabolizing Crenarchaeota and anaerobic Aquificae were found in the highly turbid and the transparent regions of the hot springs, respectively.


Conclusions

Overall, the results indicated that MPASS accurately classified, on the basis of sequence similarity, whole proteomic data derived from metagenomic shotgun sequencing. The metagenomic tree constructed using MPASS was also useful for detecting correlations with environmental gradients.

Journal reference:
  • Satoh, S. et al. (2023) "Phylogeny analysis of whole protein-coding genes in metagenomic data detected an environmental gradient for the microbiota", PLOS ONE, 18(2), p. e0281288. doi: 10.1371/journal.pone.0281288.


Written by

Dr. Chinta Sidharthan

