
Scientific result | Bioinformatics | DNA | Molecular mechanisms | Simulation & modelling

Next-generation sequencing data modeled like a chess game


A team at I2BC is proposing a thought experiment, called the NGS (Next-Generation Sequencing) chess problem, which treats sequencing data as a superimposed image of several independent chess games. This offers a more realistic view of the dynamics of cellular processes, which may not follow the same temporal trajectory in every cell, even within a homogeneous cell population. This point of view reveals the limitations of NGS approaches to temporal analysis.

Published on 17 February 2025

HOW CAN WE EXTRACT THE SPATIO-TEMPORAL BEHAVIOR OF UNDERLYING CELLULAR PROCESSES FROM MASSIVE SEQUENCING DATA?

The development of high-throughput sequencing technologies or Next-Generation Sequencing (NGS) has paved the way for the study of the spatio-temporal coordination of cellular processes along the genome (DNA repair, nucleosome arrangement, DNA methylation). However, data sets are generally limited to a few time points, and missing information has to be interpolated. Most models assume that the dynamics under study are similar between individual cells, so that a homogeneous cell culture can be represented by a population-wide average.
For several years, Julie Soutourina's team has been developing computational tools to assess molecular interactions between proteins and DNA using NGS data. Unlike studies that treat sequencing signals as average behavior, the team's researchers treat them as a superposition of stochastic interactions in independent cells. The central problem they raise is that any sequencing method freezes the cellular configuration at a given moment, so the states of the same cells before and after that moment are inaccessible.

BY PLAYING CHESS

In this study, the researchers proposed a minimal formal algorithm linking possible temporal trajectories of the cellular processes under consideration to population-scale NGS data, so as to infer missing information for studying cellular processes over time. To do this, they used numerical simulations based on a thought experiment, which they called the NGS chess problem, in which they compared the analysis of temporal sequencing data with the observation of a superimposed image of several independent chess games. Analysis of the obtained spatio-temporal kinetics argues in favor of a new methodology that would take into account the temporal trajectory of cellular processes considered in each cell independently, even for a homogeneous cell population.
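The idea of a superimposed image can be illustrated with a short simulation. The Python sketch below is not the researchers' algorithm; it is a minimal illustration under assumptions made here. Each cell is represented by a hypothetical "game" with a single piece performing a random king-like walk on an 8x8 board, and the piece moves at each tick only with some probability, so that games drift out of step with one another. The names `play_game`, `superimpose` and `move_prob` are invented for this example.

```python
import random

BOARD = 8  # 8x8 board, as in chess


def play_game(n_ticks, move_prob, rng):
    """Simulate one hypothetical 'cell' as a single piece on the board.

    At each tick the piece makes one king-like step with probability
    move_prob, otherwise it stays put, so independent games do not
    follow the same temporal trajectory.
    """
    row, col = 0, 0  # assumption: every game starts from the same square
    trajectory = [(row, col)]
    for _ in range(n_ticks):
        if rng.random() < move_prob:
            # All king moves that stay on the board
            steps = [
                (dr, dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= row + dr < BOARD
                and 0 <= col + dc < BOARD
            ]
            dr, dc = rng.choice(steps)
            row, col = row + dr, col + dc
        trajectory.append((row, col))
    return trajectory


def superimpose(games, tick):
    """Population-scale 'snapshot': count pieces per square at one tick.

    This is the only view kept here, mimicking a measurement that
    freezes all games at once and discards which piece came from
    which game.
    """
    board = [[0] * BOARD for _ in range(BOARD)]
    for trajectory in games:
        row, col = trajectory[tick]
        board[row][col] += 1
    return board


rng = random.Random(0)  # fixed seed so the run is reproducible
games = [play_game(n_ticks=20, move_prob=0.5, rng=rng) for _ in range(200)]
snapshot = superimpose(games, tick=10)

for line in snapshot:
    print(" ".join(f"{count:3d}" for count in line))
```

At tick 0 all pieces sit on the same square; at later ticks the counts spread over the board. A given snapshot is compatible with many different sets of individual trajectories, which is the difficulty the thought experiment highlights.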

By representing the dynamics of biological processes arbitrarily as a chess game, this work underlines the importance of developing new computational approaches, and shows how it is possible to obtain information about the internal dynamics of living cells from data on cell population-scale behavior.

Contact: Julie Soutourina (julie.soutourina@i2bc.paris-saclay.fr; julie.soutourina@cea.fr)

NGS: High-throughput sequencing or Next-Generation Sequencing (NGS) is a molecular methodology that enables the rapid sequencing of thousands to millions of DNA or RNA molecules simultaneously, by determining the unique and specific order of nucleic acid bases. It is a breakthrough technology that has considerably increased our understanding of diseases such as cancer.


