Uncertainty is endemic in geospatial data because the means of recording, processing, and representing spatial information are imperfect. Propagating the inherent uncertainty of geospatial model inputs to uncertainty in model predictions is a critical requirement for model-based impact assessment and risk-conscious policy decision-making. In practice, however, uncertainty analysis of model outputs remains extremely difficult, particularly for complex spatially distributed environmental models, partly because of computational constraints.
In the field of groundwater hydrology, the "stochastic revolution" has produced an enormous number of theoretical publications and greatly influenced our perspective on uncertainty and heterogeneity; it has had relatively little impact, however, on practical modeling. Monte Carlo simulation using simple random (SR) sampling from a multivariate distribution is one of the two most widely used families of methods for uncertainty propagation in hydrogeological flow and transport model predictions, the other being analytical propagation.
Real-life hydrogeological problems, however, involve complex, non-linear, three-dimensional groundwater models with millions of nodes and irregular boundary conditions. The number of Monte Carlo runs required in these cases depends on the number of uncertain parameters and on the relative accuracy required for the distribution of model predictions. In the context of sensitivity studies, inverse modelling, or Monte Carlo analyses, the ensuing computational burden is usually overwhelming and renders such analyses impractical. These tough computational constraints have to be relaxed or removed before meaningful stochastic groundwater modeling applications are possible.
A computationally efficient alternative to classical Monte Carlo simulation based on SR sampling is Latin hypercube (LH) sampling, a form of stratified random sampling. For the same number of simulated input realizations, LH sampling yields a more representative distribution of model outputs, in the sense of smaller sampling variability of their statistics. The ability to generate unbiased LH realizations becomes critical in a spatial context, where random variables are geo-referenced and exhibit spatial correlation, to ensure unbiased outputs of complex models. In this regard, this dissertation offers a detailed analysis of LH sampling and compares it with SR sampling in a hydrogeological context. Additionally, two alternative stratified sampling methods, here named stratified likelihood (SL) sampling and minimum energy (ME) sampling, are proposed in a spatial context and examined, and their efficiency is compared to that of SR and LH sampling in a hydrogeological context; a two-step sampling method is used to also account for the uncertainty related to the particular model at hand. All three stratified sampling methods (accounting for model sensitivity in the second case study) were found in this work to be more efficient than SR sampling.
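The difference between SR and LH sampling can be illustrated with a minimal sketch, which is not the implementation used in this dissertation. It assumes independent uniform inputs on the unit hypercube and a hypothetical two-parameter `toy_model` standing in for a groundwater simulator; the function names, sample sizes, and seeds are illustrative assumptions. Each LH input column places exactly one point in each of `n` equal-probability strata, and the spread of the estimated output mean over repeated experiments is then compared between the two samplers.

```python
import random
import statistics


def simple_random_sample(n, dims, rng):
    """Draw n points uniformly at random from the unit hypercube."""
    return [[rng.random() for _ in range(dims)] for _ in range(n)]


def latin_hypercube_sample(n, dims, rng):
    """Draw n points so that every input has one point in each of n strata."""
    columns = []
    for _ in range(dims):
        # One uniform draw inside each equal-probability stratum [k/n, (k+1)/n).
        column = [(k + rng.random()) / n for k in range(n)]
        # Shuffle so that strata are paired at random across inputs.
        rng.shuffle(column)
        columns.append(column)
    return [[columns[d][i] for d in range(dims)] for i in range(n)]


def toy_model(x):
    """Hypothetical stand-in for an expensive model run (assumption)."""
    return x[0] + x[1] ** 2


def estimator_spread(sampler, n, dims, repeats, seed):
    """Standard deviation of the estimated output mean over repeated experiments."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = sampler(n, dims, rng)
        means.append(statistics.fmean(toy_model(x) for x in sample))
    return statistics.stdev(means)


# Same budget of 20 model runs per experiment for both samplers.
sr_spread = estimator_spread(simple_random_sample, 20, 2, 200, seed=1)
lh_spread = estimator_spread(latin_hypercube_sample, 20, 2, 200, seed=1)
print(f"SR spread of mean: {sr_spread:.4f}")
print(f"LH spread of mean: {lh_spread:.4f}")
```

For this additive toy model the LH estimate of the mean should vary far less between experiments than the SR estimate. This is the sense in which LH sampling is described above as more efficient for the same number of realizations; the size of the gain for a real flow and transport model depends on the model.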
Additionally, this thesis proposes a novel method for extending the application domain of LH sampling to very large regular grids, which are common in environmental models, hydrogeological or otherwise. More specifically, a novel combination of Stein's Latin hypercube sampling with a Monte Carlo simulation method applicable over finely discretized domains is proposed, and its performance is validated in 2D and 3D hydrogeological problems of flow and transport in a mid-heterogeneous porous medium, each consisting of about 1 million nodes. Last, a further novel extension of the proposed LH sampling on large grids is introduced for conditional problems on finely discretized domains. In this case too, the performance of the proposed approach is evaluated in a 3D hydrogeological model of flow and transport. Results indicate that both extensions (conditional and unconditional) of LH sampling on large grids facilitate efficient uncertainty propagation with fewer model runs, owing to more representative model inputs.
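The rank-based idea usually associated with Stein's LH sampling can be sketched as follows, under strong simplifying assumptions. The field generator here is a plain Cholesky factorization on a tiny 1D grid with an exponential covariance. It is only a stand-in: the large-grid simulation method used in this thesis is not specified in this abstract, and a Cholesky factorization would not scale to a million nodes. All names and parameter values (`exponential_covariance`, `rank_based_lh`, 30 nodes, 50 realizations) are hypothetical. At each node, the correlated realizations are ranked and each value is replaced by a standard normal quantile drawn from the stratum matching its rank, so that every node receives a stratified marginal sample while the rank ordering, and with it approximately the spatial correlation, is retained.

```python
import math
import random
from statistics import NormalDist


def exponential_covariance(n_nodes, corr_length):
    """Covariance matrix exp(-|i - j| / corr_length) on a 1D regular grid."""
    return [[math.exp(-abs(i - j) / corr_length) for j in range(n_nodes)]
            for i in range(n_nodes)]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def correlated_gaussian_fields(n_real, low, rng):
    """Simple random realizations of a correlated standard Gaussian field."""
    n = len(low)
    fields = []
    for _ in range(n_real):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        fields.append([sum(low[i][k] * z[k] for k in range(i + 1))
                       for i in range(n)])
    return fields


def rank_based_lh(fields, rng):
    """Replace values node by node with stratified quantiles matching their ranks."""
    n_real, n_nodes = len(fields), len(fields[0])
    std_normal = NormalDist()
    out = [[0.0] * n_nodes for _ in range(n_real)]
    for j in range(n_nodes):
        # Realization indices ordered by their value at node j.
        order = sorted(range(n_real), key=lambda i: fields[i][j])
        for rank, i in enumerate(order):
            # Uniform draw strictly inside stratum [rank/N, (rank+1)/N).
            u = (rank + rng.uniform(1e-9, 1.0 - 1e-9)) / n_real
            out[i][j] = std_normal.inv_cdf(u)
    return out


rng = random.Random(42)
low = cholesky(exponential_covariance(30, 5.0))
fields = correlated_gaussian_fields(50, low, rng)  # SR realizations
lh_fields = rank_based_lh(fields, rng)             # rank-transformed LH realizations
```

In this sketch the expensive step is the generation of the correlated realizations, not the rank transformation, which only requires sorting the realizations at each node.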
Overall, the proposed methodological approaches could reduce the time and computer resources required to perform uncertainty analysis in hydrogeological flow and transport problems. Additionally, since this is the first time that stratified sampling is performed over finely discretized domains, the proposed extensions of LH sampling to large grids could be considered a milestone for future uncertainty analysis efforts. Moreover, all the proposed stratified methods could contribute to a wider application of uncertainty analysis in a Monte Carlo framework for any spatially distributed impact assessment study.