Closed-form, interpretable mathematical models have been instrumental in advancing our understanding of the world. Think, for example, of Newton’s law of gravitation and how its mathematical analysis has enabled us to predict astronomical phenomena with great accuracy and, perhaps more importantly, to understand central forces in general and, ultimately, the relationship between symmetry and conservation laws. With the data revolution, we may now be in a position to uncover new mathematical models for many systems, from physics to the social sciences. However, to deal with increasing amounts of data, we need “machine scientists” that are able to extract these closed-form mathematical models automatically from data.

In a series of papers, ICREA professor Roger Guimerà and colleagues at Universitat Rovira i Virgili have developed a Bayesian machine scientist. The Bayesian machine scientist assigns model plausibilities rigorously, and establishes its prior expectations about the models by learning from a large empirical corpus of mathematical expressions. It also explores the space of all possible closed-form mathematical models in ways that provide guarantees of eventually finding the correct one, if it exists.
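To make the idea of "model plausibility" concrete, here is a minimal Python sketch of how candidate closed-form models might be ranked by combining goodness of fit with a prior. It is not the authors' implementation. It assumes the Bayesian information criterion (BIC) as a rough stand-in for the marginal likelihood, uses synthetic data, and takes its prior from hypothetical counts (`prior_counts`) of how often each expression form appears in an imagined corpus. All function names are illustrative, and the search over the space of expressions is left out entirely.

```python
import math
import random


def fit_scale(basis, xs, ys):
    """Least-squares fit of y = a * basis(x); returns (a, sum of squared errors)."""
    fx = [basis(x) for x in xs]
    a = sum(f * y for f, y in zip(fx, ys)) / sum(f * f for f in fx)
    sse = sum((y - a * f) ** 2 for f, y in zip(fx, ys))
    return a, sse


def bic(sse, n, k):
    """BIC for Gaussian errors: lower is better; k is the number of parameters."""
    return n * math.log(sse / n) + k * math.log(n)


def plausibilities(candidates, prior_counts, xs, ys):
    """Normalized plausibility of each candidate: exp(-BIC/2) times its prior."""
    n = len(xs)
    total = sum(prior_counts.values())
    log_scores = {}
    for name, basis in candidates.items():
        _, sse = fit_scale(basis, xs, ys)
        log_prior = math.log(prior_counts[name] / total)
        log_scores[name] = -0.5 * bic(sse, n, k=1) + log_prior
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    weights = {m: math.exp(s - top) for m, s in log_scores.items()}
    z = sum(weights.values())
    return {m: w / z for m, w in weights.items()}


# Synthetic data (assumption): y = 2 x^2 plus small Gaussian noise.
rng = random.Random(0)
xs = [0.1 * i for i in range(1, 41)]
ys = [2.0 * x**2 + rng.gauss(0.0, 0.1) for x in xs]

# Three one-parameter candidate forms and hypothetical corpus counts.
candidates = {
    "a*x": lambda x: x,
    "a*x**2": lambda x: x * x,
    "a*exp(x)": math.exp,
}
prior_counts = {"a*x": 50, "a*x**2": 30, "a*exp(x)": 20}

post = plausibilities(candidates, prior_counts, xs, ys)
best = max(post, key=post.get)
a_best, _ = fit_scale(candidates[best], xs, ys)
print(best, round(a_best, 3))
```

In this toy setting the prior favors the linear form, but the data outweigh it. A full machine scientist would also have to search over expression structures rather than score a fixed list of candidates.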

For systems for which models have been proposed before, the Bayesian machine scientist is able to uncover new models that are more plausible and more predictive than the old ones, without being more complex. The machine scientist is also able to uncover accurate, closed-form mathematical models for systems for which no closed-form model was known before. In particular, Guimerà and coworkers have applied the Bayesian machine scientist to a 90-year-old problem in turbulence. For this problem they provide closed-form solutions and, although many partial solutions have been proposed over the years, they find that the original approach proposed in the 1930s is the most plausible one so far, outperforming even the models proposed in the last few years.