Given the great mass of biological data produced, we intervene very early in projects by taking charge of the raw data as it comes directly off high-throughput systems such as next-generation sequencing (NGS) instruments.
Working with the data, we define pertinent analysis strategies, specify and carry out the necessary quality control, and then conduct the bioinformatics analyses. To implement these tasks, we use existing software or software that we develop ourselves, and deploy it in our analysis workflows ('bioinformatics pipelines'). We also develop new methodologies and approaches for exploiting the data generated by the latest generations of technology.
We also evaluate the IT dimensions of the problems raised and run our computations on the high-performance computing (HPC) system installed at the TGCC (Very Large Computing Center, Bruyères le Châtel, 3000 cores, 5 PB). Parallelizing our code enables us to respond rapidly to complex computational questions and thus to complete human genome analyses in a few hours.