HOW CAN WE EXTRACT THE SPATIO-TEMPORAL BEHAVIOR OF UNDERLYING CELLULAR PROCESSES FROM MASSIVE SEQUENCING DATA?
The development of high-throughput sequencing technologies, or Next-Generation Sequencing (NGS), has paved the way for studying the spatio-temporal coordination of cellular processes along the genome (DNA repair, nucleosome arrangement, DNA methylation). However, data sets are generally limited to a few time points, so the missing information has to be interpolated. Most models assume that the dynamics under study are similar in every cell, so that a homogeneous cell culture can be represented by a population-wide average.
For several years, Julie Soutourina's team has been developing computational tools to assess molecular interactions between proteins and DNA using NGS data. Unlike studies that treat sequencing signals as average behavior, the team's researchers interpret them as a superposition of stochastic interactions in independent cells. The central problem they raise is that any sequencing method freezes the cellular configuration at a given moment, so that the earlier and later states of the same cells are inaccessible.
BY PLAYING CHESS
In this study, the researchers proposed a minimal formal algorithm linking the possible temporal trajectories of the cellular processes under consideration to population-scale NGS data, so as to infer the information missing between time points. To do this, they used numerical simulations based on a thought experiment, which they called the NGS chess problem, in which they compared the analysis of temporal sequencing data with the observation of a superimposed image of several independent chess games. Analysis of the resulting spatio-temporal kinetics argues for a new methodology that takes into account the temporal trajectory of the cellular processes in each cell independently, even for a homogeneous cell population.
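To make the idea of a superimposed image concrete, here is a minimal toy sketch in Python. It is not the team's algorithm: it assumes, purely for illustration, that each genomic site in each cell undergoes a single irreversible event (for example, a lesion being repaired) with a fixed probability per time step, and that a sequencing snapshot reports the fraction of cells in which the event has occurred. The function names, the number of sites and cells, and the probability `p_event` are all hypothetical.

```python
import random


def simulate_cell(n_sites, n_steps, p_event, rng):
    """Simulate one cell: a list of n_steps + 1 snapshots, each a 0/1 list per site.

    Assumption: a site switches irreversibly from 0 to 1 with probability
    p_event at every time step, independently of the other sites.
    """
    state = [0] * n_sites
    trajectory = [state[:]]
    for _ in range(n_steps):
        state = [1 if s == 1 or rng.random() < p_event else 0 for s in state]
        trajectory.append(state[:])
    return trajectory


def population_signal(trajectories, t):
    """Superimpose all cells at time t: fraction of cells in state 1 at each site."""
    n_cells = len(trajectories)
    n_sites = len(trajectories[0][t])
    return [sum(traj[t][i] for traj in trajectories) / n_cells
            for i in range(n_sites)]


rng = random.Random(0)  # fixed seed so the toy run is reproducible
cells = [simulate_cell(n_sites=8, n_steps=10, p_event=0.2, rng=rng)
         for _ in range(500)]

# Only a few time points are "sequenced"; everything in between is unobserved.
observed_times = [0, 5, 10]
snapshots = {t: population_signal(cells, t) for t in observed_times}

for t in observed_times:
    print(t, [round(v, 2) for v in snapshots[t]])
```

In this sketch every individual cell is strictly binary at each site, while the observed population signal is a smooth fraction between 0 and 1. The snapshots alone do not reveal which cell followed which trajectory, which is the kind of missing information the chess analogy is meant to highlight.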
By representing the dynamics of biological processes arbitrarily as a chess game, this work underlines the importance of developing new computational approaches, and shows how it is possible to obtain information about the internal dynamics of living cells from data on cell population-scale behavior.
Contact: Julie Soutourina (julie.soutourina@i2bc.paris-saclay.fr; julie.soutourina@cea.fr)
NGS: High-throughput sequencing or Next-Generation Sequencing (NGS) is a molecular methodology that enables the rapid sequencing of thousands to millions of DNA or RNA molecules simultaneously, by determining the unique and specific order of nucleic acid bases. It is a breakthrough technology that has considerably increased our understanding of diseases such as cancer.