Machine learning requires statistical models built on data collected from a variety of sources, something that is becoming more and more difficult amid growing data privacy and security concerns. Federated learning constitutes a powerful, and increasingly popular, solution to this problem. Here's how it works: multiple participants collaborate to train a shared model, with training coordinated by a central server. However, only the parameters of the locally trained models, never the raw data, are sent to the server. The server then aggregates these parameters into the final model.
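To make the protocol concrete, here is a minimal sketch of one round of federated averaging (FedAvg), the most common aggregation scheme. The linear model, the synthetic data, and all names are illustrative assumptions, not the platform's actual implementation:

```python
# A minimal FedAvg round: each participant trains locally, and only the
# resulting parameters (never the data) reach the server. Illustrative only.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Each participant refines the global weights on its own private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE loss
        w -= lr * grad
    return w  # only the updated parameters leave the device

def server_aggregate(client_weights, client_sizes):
    """The server averages parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Three participants, each holding data that never leaves them.
clients = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(20):  # one federated round per iteration
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = server_aggregate(updates, [len(y) for _, y in clients])

print(global_w)  # approaches [2.0, -1.0] without pooling any raw data
```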
CEA-List's new platform offers specially developed algorithms for heterogeneous (non-IID) data, where the data is not evenly distributed across contributors; tools inspired by incremental learning to limit the risk of catastrophic forgetting; and transfer learning tools to tailor the federated model to individual contributors. The overriding objective is to ensure that the federated models are at least as accurate as models trained directly on the pooled data.
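The article does not specify which algorithms the platform uses for these tasks, but one widely used way to handle heterogeneous data and limit drift away from previously learned knowledge is a FedProx-style proximal term, sketched below under that assumption:

```python
# A hedged sketch of local training with a proximal penalty (the FedProx
# idea): the term mu/2 * ||w - global_w||^2 keeps each participant's update
# close to the global model, stabilizing aggregation when local datasets
# differ sharply. Whether the platform uses this exact mechanism is not
# stated in the article; names and hyperparameters are illustrative.
import numpy as np

def local_update_prox(global_w, X, y, mu=0.5, lr=0.1, epochs=5):
    """Local training regularized toward the current global model."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # task loss gradient
        grad += mu * (w - global_w)            # pulls w back toward global_w
        w -= lr * grad
    return w
```

In the same spirit, the transfer-learning customization the article mentions can be as simple as each participant fine-tuning the aggregated model on its own data as a final local pass (for instance, running the update above with mu=0).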
CEA-List's platform also includes several robust aggregation algorithms (such as the median) to shield the models developed from poisoning attacks, in which malicious participants inject false training data to throw the model off. These algorithms can keep the models reliable as long as bad actors remain a minority, that is, below 50% of participants.
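The intuition behind median-based aggregation is easy to see in code. The sketch below contrasts a plain mean, which a single attacker can skew arbitrarily, with a coordinate-wise median; the attack values and client updates are illustrative:

```python
# Robust aggregation sketch: a coordinate-wise median ignores extreme
# updates as long as honest participants form a majority. Illustrative only.
import numpy as np

def median_aggregate(client_weights):
    """Per-coordinate median across client updates."""
    return np.median(np.stack(client_weights), axis=0)

honest = [np.array([2.01, -0.98]),
          np.array([1.97, -1.03]),
          np.array([2.02, -1.01])]
poisoned = [np.array([100.0, 100.0])]  # a malicious participant's update

print(np.mean(np.stack(honest + poisoned), axis=0))  # mean is thrown far off
print(median_aggregate(honest + poisoned))           # median stays near [2, -1]
```

Because the median only moves when more than half of the submitted values move, a poisoned update has no leverage until attackers outnumber honest participants, which is exactly the below-50% condition stated above.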
The platform is available to companies in the healthcare, mobility, and IoT industries seeking to develop federated learning solutions in scenarios where the training data is either too sensitive or too large to be centralized using conventional methods. Academic researchers can also use the platform to evaluate their own algorithms.