An environmental scientist might, for example, examine whether exposure to air pollution is associated with lower birth weights in a specific county. Given the complexity of environmental and health data, a researcher could reasonably turn to machine learning to estimate the strength of this relationship. These methods are desirable for their ability to capture complex, nonlinear patterns that are difficult to represent with traditional statistical models. In practice, a model could be trained on existing pollution and health records to estimate how variations in air quality correspond to changes in birth weight outcomes.
While machine-learning techniques are highly effective for prediction, they are far less reliable when the goal is to determine whether two variables are truly associated, and how certain one can be about that conclusion. Some methods address this by providing confidence intervals, which quantify uncertainty. However, researchers at the Massachusetts Institute of Technology (MIT) found that, in spatial settings, these confidence intervals can be deeply misleading. When data vary across geographic locations, standard approaches may report high confidence even when the estimated association is seriously inaccurate.
This problem arises because many variables of interest, such as air pollution, rainfall, or temperature, are spatially structured rather than randomly distributed. Common confidence-interval methods rely on assumptions that break down in such contexts. They often presume that observations are independent and identically distributed, that the statistical model is perfectly specified, and that the data used to train the model closely resemble the data at the location where the association is being estimated. In spatial analyses, none of these assumptions holds reliably, leading to intervals that appear precise but are fundamentally wrong.
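As a rough illustration of this failure mode (not the study's own experiment), the sketch below simulates spatially correlated noise along a one-dimensional "map" and checks how often a textbook i.i.d. confidence interval for a regression slope actually covers the true slope. All numerical settings here are illustrative assumptions.

```python
# Minimal sketch: i.i.d.-based confidence intervals can undercover
# when data are spatially structured. Values below are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, true_slope, n_sims, z = 200, 0.5, 500, 1.96
covered = 0

locations = np.linspace(0.0, 1.0, n)  # points along a 1-D "map"
# Squared-exponential kernel -> smooth, spatially correlated noise.
cov = np.exp(-((locations[:, None] - locations[None, :]) ** 2) / (2 * 0.1 ** 2))
L = np.linalg.cholesky(cov + 1e-8 * np.eye(n))

for _ in range(n_sims):
    x = locations + 0.1 * rng.standard_normal(n)  # covariate varies smoothly in space
    noise = L @ rng.standard_normal(n)            # spatially smooth noise, not i.i.d.
    y = true_slope * x + noise
    # Ordinary least-squares slope and its textbook i.i.d. standard error.
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (n - 2) / (xc @ xc))
    covered += abs(slope - true_slope) <= z * se

print(f"nominal 95% interval, empirical coverage: {covered / n_sims:.2f}")
```

In this toy setup the empirical coverage typically falls well below the nominal 95 per cent, because the i.i.d. standard error ignores the spatial correlation in the noise.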
The consequences of this failure are significant. A model might claim to be 95 per cent confident that it has captured the true relationship between two variables, such as pollution and health outcomes, when in fact it has missed that relationship entirely. This false sense of certainty can mislead scientists, policymakers, and practitioners into trusting results that should instead be treated with caution. Recognising this risk, the MIT researchers set out to develop a method that could produce confidence intervals that remain valid when data vary across space.
Their solution replaces unrealistic assumptions with one that better reflects how many real-world variables behave: spatial smoothness. Instead of assuming that data from different locations are effectively interchangeable, the new method assumes that values change gradually over space. For example, air pollution levels are unlikely to shift dramatically from one city block to the next, instead tapering off as distance from pollution sources increases. According to Tamara Broderick, this assumption aligns far more closely with the structure of environmental and spatial data.
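The core idea of a smoothness (Lipschitz) assumption can be sketched with a simple toy calculation; this is a hedged illustration of the general principle, not the authors' estimator. If a spatial field f satisfies |f(s) - f(t)| <= LIP * dist(s, t), then exact readings at nearby monitors bound its value at an unmonitored site. The constant LIP and the example readings are assumptions made for illustration.

```python
# Hedged sketch of a Lipschitz (smoothness) bound, not the paper's method.
import numpy as np

def lipschitz_interval(sites, values, query, lip):
    """Interval guaranteed to contain f(query) when f is lip-Lipschitz
    and `values` are exact observations of f at `sites`."""
    dists = np.linalg.norm(sites - query, axis=1)
    lower = np.max(values - lip * dists)  # each site gives a lower bound
    upper = np.min(values + lip * dists)  # ... and an upper bound
    return lower, upper

# Hypothetical pollution readings at three monitor locations.
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
values = np.array([12.0, 9.0, 10.0])
print(lipschitz_interval(sites, values, np.array([0.5, 0.5]), lip=4.0))
```

The interval tightens as monitors get closer to the query point or as the assumed smoothness constant shrinks, which is the intuition behind trading the i.i.d. assumption for a spatial-smoothness one.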
When tested in simulations and on real datasets, the new approach consistently produced accurate confidence intervals, even in the presence of noise or measurement error. Other commonly used techniques failed to do so. The findings, presented at the Conference on Neural Information Processing Systems, suggest that researchers in fields such as environmental science, economics, and epidemiology could benefit substantially from adopting methods tailored to spatial data. By providing more trustworthy measures of uncertainty, this work clarifies when results can be genuinely relied upon, thereby improving both scientific understanding and decision-making.
More information: David R. Burt et al., Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association, arXiv. DOI: 10.48550/arXiv.2502.06067
Journal information: arXiv
Provided by Massachusetts Institute of Technology