Gaussian mixture models (GMMs) are widely used probabilistic clustering methods that represent data as mixtures of Gaussian distributions. In predictive mapping, they can be used to assign discrete classes to spatial locations from continuous input variables, with applications in geology, ecology, and remote sensing. When class distributions overlap in feature space, however, standard GMMs may produce fragmented or spatially inconsistent predictions, and spatial context can help improve the coherence of the resulting maps. Existing approaches for incorporating spatial information into GMM-based predictive mapping may rely on complex spatial formulations or offer limited flexibility in controlling the influence of spatial information.
In this paper, we propose a kernel-based spatial adjustment of GMM posterior probabilities for predictive mapping. The method first estimates class posterior probabilities using a standard GMM, then computes spatially regularized probabilities by locally smoothing these posteriors with a spatial kernel, and finally combines the original and spatially adjusted probabilities through a trade-off parameter controlling the influence of spatial context. This provides a simple and flexible way to improve spatial coherence while retaining the feature-based probabilistic structure of the initial GMM. We also investigate posterior-based measures of assignment uncertainty to characterize ambiguous areas in the resulting predictive maps.
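The three steps described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the Gaussian form of the spatial kernel, the convex combination weighted by `alpha`, the inclusion of each location in its own neighbourhood, and the use of Shannon entropy as the uncertainty measure are all assumptions made for illustration, and the names `spatially_adjust_posteriors`, `assignment_entropy`, `bandwidth`, and `alpha` are hypothetical. The input `posteriors` stands for the class probabilities returned by any fitted GMM (for example, `predict_proba` of scikit-learn's `GaussianMixture`).

```python
import numpy as np


def spatially_adjust_posteriors(posteriors, coords, bandwidth=1.0, alpha=0.5):
    """Blend GMM posteriors with their kernel-smoothed spatial average.

    posteriors : (n, K) array of class probabilities from a fitted GMM.
    coords     : (n, d) array of spatial coordinates for the same locations.
    bandwidth  : assumed Gaussian kernel width (hypothetical parameter).
    alpha      : assumed trade-off in [0, 1]; 0 keeps the original GMM
                 posteriors, 1 uses only the spatially smoothed ones.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Pairwise squared Euclidean distances between locations, shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Gaussian kernel weights, normalized so each row sums to one.
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    weights /= weights.sum(axis=1, keepdims=True)

    # Spatially regularized probabilities: local weighted average of posteriors.
    smoothed = weights @ posteriors

    # Convex combination of feature-based and spatially smoothed probabilities.
    return (1.0 - alpha) * posteriors + alpha * smoothed


def assignment_entropy(probs, eps=1e-12):
    """Shannon entropy per location; higher values mean a more ambiguous class."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    return -np.sum(probs * np.log(probs), axis=1)
```

Because the kernel weights are row-normalized, each adjusted row remains a valid probability vector for any `alpha` between 0 and 1. The dense pairwise distance matrix scales quadratically with the number of locations, so a raster-sized application would need a truncated or neighbourhood-based kernel instead.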
The proposed approach is evaluated on synthetic datasets with both simple and complex spatial patterns, and on a real-world case study of surficial geological mapping using digital elevation models (DEMs) and derived topographic attributes.
The results show that the proposed method improves clustering accuracy and spatial coherence relative to the standard GMM and to the spatial GMM variants considered in this study. The assignment uncertainty measures further help identify transition zones and areas prone to misclassification in both synthetic and real-world datasets.