Every one of us harbors 100 trillion microorganisms in our digestive tract.
This microbial community, or microbiota, is extremely diverse and includes
bacteria, archaea, viruses, bacteriophages [1], etc. One possible
method of exploration is to sequence all the DNA from this small world at once.
The sequences found in the resulting metagenome then simply need to be
identified by comparing them to indexed reference genes. There is just one
problem: most intestinal microorganisms, which are non-culturable in the
laboratory, are not listed in any catalogs. Moreover, the complexity of this
environment limits the use of conventional bioinformatics methods.
An international consortium led by INRA, and involving teams from the CEA-IG
(Genoscope) [2], has used simple logic to reduce the scale of this problem. On the
one hand, all genes from the same biological entity (the chromosome of a
bacterial species, for example) are present at the same abundance in a given
metagenome. This abundance simply reflects how numerous the entity is in the
sample analyzed. On the other hand, the composition of the intestinal population is
specific for each person, as the proportion of different organisms varies
greatly from one individual to the next. If, by comparing the content of
metagenomes from different subjects, sets of genes are identified whose
abundance varies in a similar manner from one microbiota to another, then these
genes probably belong to the same entity.
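The reasoning above can be illustrated with a minimal sketch. It is not the consortium's actual method: the gene names, abundance values, the 0.9 threshold and the greedy grouping rule are all assumptions chosen for illustration. Each gene has an abundance profile across several subjects, and genes whose profiles are strongly correlated are placed in the same group.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-variation signal
    return cov / sqrt(var_x * var_y)


def co_abundance_groups(profiles, threshold=0.9):
    """Greedily group genes whose profiles correlate with a group's first gene.

    `threshold` is a hypothetical cut-off, not a value from the study.
    """
    groups = []  # each group is a list of gene names; the first one is the seed
    for gene, profile in profiles.items():
        for group in groups:
            if pearson(profile, profiles[group[0]]) >= threshold:
                group.append(gene)
                break
        else:
            groups.append([gene])  # no matching group: start a new one
    return groups


# Made-up abundances of four genes in the metagenomes of four subjects.
profiles = {
    "gene_a": [10, 2, 7, 1],
    "gene_b": [20, 4, 14, 2],  # rises and falls with gene_a
    "gene_c": [1, 9, 2, 8],
    "gene_d": [2, 18, 5, 15],  # rises and falls with gene_c
}

groups = co_abundance_groups(profiles)
print(groups)
```

In this toy data set, gene_a and gene_b end up in one group and gene_c and gene_d in another, which mirrors the idea that genes varying together across individuals probably belong to the same entity.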
By analyzing the feces of nearly four hundred individuals, scientists have
divided around 3.9 million discovered genes into more than 7,000 “co-abundance
groups”. From these groups, the scientists have identified more than 700
species of bacteria, including 630 unknown species, and have reconstructed
reference genomes for 238 new species. Hundreds of phages (more than 800 unknown
species) were also identified. Furthermore, by revealing hundreds of
co-dependent relationships, the researchers have shed new light on the
survival mechanisms of each microorganism, as well as the overall functioning of
the populations.
[1] a virus that infects bacteria
[2] mixed teams from the CEA/CNRS/Université d’Evry