Introduction

The Stochastic Partial Derivative Equations (SPDE) model is a new methodology which enables estimation and simulation in a new formulation.

Most of the Geostatistical standard procedures suffer some severe limitations in presence of large data sets and large number of target sites (referred to as the big n problem). Moreover the traditional methodology for running simulations (i.e. Turning Bands Model) may lead to a prohibitive computing time.

SPDE is an efficient formalism which allows performing Estimation and Simulations. Additionally it allows dealing with large set of inequalities through an efficient Gibbs sampling procedure. It gets rid of concepts such as the Neighborhood (for selecting samples which will serve in the conditioning step) or the number of Turning Bands for the simulation process.

Conversely, SPDE methodology is constrained to Matérn models (usually called K-Bessel):

\[C(h)=\sigma^2\frac{2^{1-\nu}}{\Gamma(\nu)}\left(\sqrt{2\nu}\frac{h}{a}\right)^\nu \mathcal{K}_\nu\left(\sqrt{2\nu}\frac{h}{a}\right)\]

where:

\(\mathcal{K}_\nu\) is the modified Bessel function of order \(2\).
\(\nu\) controls the regularity.

Note that:

\(\nu=1/2\) corresponds to the exponential covariance
\(\nu\rightarrow +\infty\) corresponds to the Gaussian covariance

If \(Z(s)\) is a stationary random function in the space (of dimension \(d\)) with Matérn covariance and regularity \(\nu\), scale parameter \(a\) and sill \(\sigma^2\), them \(Z(s)\) is the solution of the SPDE equation: \[(\kappa^2 - \Delta)^{\alpha/2} Z(s) = \tau\mathcal{W}(s)\]

with:

\(\Delta = \sum_{i=1}^d \frac{\partial^2}{\partial s_i^2}\) is the Laplacian
\(\mathcal{W}(s)\) is a standardized white noise process
\(\kappa = 1/a\)
\(\alpha = \nu + d/2\)
\(\tau = \frac{\sigma\Gamma(\nu+d/2)^{1/2}(4\pi)^{d/4}\kappa^{\nu}}{\Gamma(\nu)^{1/2}}\)

(Lindgren, Rue and Lindström (2011) An explicit link between Gaussian fields and Gaussian random fields: the stochastic partial differential equation approach. Journal of Royal Statistic society)

How to utilize the previous result?

According to Whittle (1954): \[(\kappa^2-\Delta)^{\alpha/2} Z(s) = \tau\mathcal{W}(s)\]

On a regular 2-D grid, we can use Finite Differences (\(\alpha=2\)):

\[\kappa^2 Z_{i,j} -\frac{1}{dx}\left(\frac{Z_{i+1,j}-Z_{i,j}}{dx}-\frac{Z_{i,j}-Z_{i-1,j}}{dx}\right)-\frac{1}{dy}\left(\frac{Z_{i,j+1}-Z_{i,j}}{dy}-\frac{Z_{i,j}-Z_{i,j-1}}{dy}\right)=\tau W_{i,j}\]

with \(W_{i,j}\) is a standardized white noise.

This corresponds to the following sparse linear system \[A\textbf{Z}=\mathbf{W}\]

Turning into covariance and denoting \(\Sigma=\textrm{Cov}(\textbf{Z})\) we have:

\[A\Sigma A^t= \tau^2 I\]

The precision matrix \(Q\) of \(\textbf{Z}\) is then:

\[Q=\Sigma^{-1}=\frac{1}{\tau^2}A^tA\]

As can be seen in the example based on a regular grid, the square matrix Q is symmetric, with as many rows (and columns) as number of grid nodes. Moreover it is sparse: the number of non-zero terms on each line (resp. column) is 13.

How to perform Simple Kriging?

Let \[Z=\left (\begin{array}{c} X\\ Y \end{array} \right)\] where \(X\) is the vector of data and \(Y\) the vector of targets.

The global covariance matrix is: \[Cov(Z)=\Sigma=\left( \begin{array}{cc} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{array} \right) \]

Then Kriging if given by:

\[Y^\star =\Sigma_{YX} \Sigma_{XX}^{-1} X\]

The covariance matrix of the errors:

\[Cov(Y^\star-Y) = \Sigma_{YY} - \Sigma_{YX} \Sigma_{XX}^{-1} \Sigma_{XY}\]

If we now consider the Precision Matrix:

\[Q=\Sigma^{-1}={\left( \begin{array}{cc} Q_{XX} & Q_{XY} \\ Q_{YX} & Q_{YY} \end{array} \right)} \]

Then Kriging estimate can be written:

\[Y^\star = -Q_{YY}^{-1} Q_{YX} X\]

and the Covariance matrix of the errors

\[Cov(Y^\star-Y) = Q_{YY}^{-1}\]

How to perform simulations?

Non-conditional simulations are obtained by Cholesky factorization of \(Q\) \[Q=LL^t\] and resolution of a triangular system \[L^t\textbf{Z}=\textbf{U}\] with \(\textbf{U}\) a standardized Gaussian random vector with independent components

Finally:

Kriging is obtained by computing \(-Q_{YY}^{-1} Q_{YX}X\)
Conditional simulations are obtained by adding to the kriging a residual with precision matrix \(Q_{YY}\)

An interesting remark must now be addressed, regarding the complexity of linear systems involved in the Kriging procedure:

In the traditional geostatistics, we solve system involving \(\Sigma_{XX}\) which has the size of the data set.

In the SPDE framework, we solve system involving \(Q_{YY}\) which has the size of the target.

Nevertheless, \(\Sigma_{XX}\) is dense whereas \(Q_{YY}\) is sparse. In the latter case, we can use adapted algorithms for sparse matrices with complexity \(n^{3/2}\) (proportional to the non zeros terms) whereas the general Cholesky algorithm has complexity \(n^3\).

Irregular Data

In the case of irregular data, the Finite Differences may not be used any more. However the numerical methods involved in Partial Derivative Equations can be stated on the basis of Finite Elements. They need an intermediate step of triangulating (in 2-D or tetrahedralization for 3-D) where both the data and the target sites must be located on vertices.

Then the random function \(Z\) is approximated by \[Z(s)=\sum_{k=1}^K\lambda_k\psi_k(s)\] where \(\psi_k\) is equal to \(1\) at vertex \(k\) and decreases linearly to \(0\) at the neighboring vertices.

the \(\lambda_k\)’s are random
known at data locations
the precision matrix of \(\Lambda\) can be computed
it is sparse
its sparsity is linked to the triangulation

Simulation under Inequality constraints (Gibbs sampler)

The aim is to simulate a Gaussian Random Function at all the vertices of a triangulation knowing that some values are restricted to an interval.

The workflow is as follows:

Initialize the values at all the vertices
Until convergence, simulate each constrained value \(Z(s_i)\) by Gibbs sampler

\[Z(s_i)=Z^\star(s_i)+\sigma_KU\] where \(U\) is a standardized Gaussian innovation, such as \(Z(s_i)\) respects the inequality constraints at location \(s_i\)

The kriging weights \(\beta_j^{i}\) and the conditional variance \(\sigma_K\) are directly read in the precision matrix : \[\beta_j^{(i)}=-\frac{Q_{ij}}{Q_{ii}} \textrm{ and }\sigma_K^2=\frac{1}{Q_{ii}}\]
Perform conditional simulations at unconstrained vertices as previously

How can we turn into non-stationary covariance

We recall the isotropic initial formula:

\[(\kappa^2-\Delta)^{\frac{\alpha}{2}}Z(s)=\tau\mathcal{W}(s)\]

The global anisotropy corresponds to:

\[(\kappa^2-\nabla.H\nabla)^{\frac{\alpha}{2}}Z(s)=\tau\mathcal{W}(s)\]

Finally the local anisotropy (e.g : varying anisotropies) is given by:

\[(\kappa^2(s)-\nabla.H(s)\nabla)^{\frac{\alpha}{2}}Z(s)=\tau(s)\mathcal{W}(s)\]

Application using RGeostats

All the demonstrations will be performed using the package RGeostats (that must be loaded first).

library(RGeostats)

## Loading required package: Rcpp

## Loading RGeostats - Version:11.2.9

constant.define("ASP",1)

This paper describes the main applications applied in 2-D examples (even if all the techniques can be transposed to 3-D). All the procedures will be performed using SPDE and compared to similar techniques performed using the traditional estimation or simulation models.

The next paragraph defines the main parameters of the case study. The grid is created with 400 by 300 nodes (the mesh has a unit size).

ndim = 2
dx = c(1,1)
nx = c(400,300)
grid = db.create(nx=nx,dx=dx)

An isotropic Matérn Model is created with a range defined as a percentage (20%) of the extension of the grid (along X). Its sill is set to 1.

percentage = 20
extends = grid$extends
range   = extends[1] * percentage / 100
model   = model.create(vartype="K-Bessel",range=range, sill=1)

The Model can now be visualized:

plot(model,xlim=c(0,200),title="The Matern Model")

The traditional methods require the definition of a Neighborhood. For a fair comparison, the neighborhood is set to a Unique: all the available data will be used for the estimation (or the conditioning step for the simulation) of each target site.

neigh = neigh.create(ndim=ndim,type=0)

The non-conditional simulations

We first compare SPDE with the traditional techniques in the scope of non-conditional simulations performed on the grid. We fist perform the simulation using the Turning Band method where the key parameter is the number of bands. The first test is performed using 1000 bands.

seed   = 432431
nbtuba = 1000
result.tb = simtub(,grid,model,seed=seed,nbtuba=nbtuba)
title = paste0("Non Conditional Simulation (",nbtuba," bands)")
plot(result.tb,pos.legend=1,title=title)

The reader can see some artifacts on the plot: this is linked to the fact that the number of bands is probably not sufficient. This test is performed again, raising the number of bands by a factor 10 (i.e. 10000 bands). As a counterpart, the computation time is increased almost by the same factor. Note thatm even if the code if provided, the next step is not performed in this paper in order to save computing time.

seed   = 432431
nbtuba = 10000
result.tb = simtub(,grid,model,seed=seed,nbtuba=nbtuba)
title = paste0("Non Conditional Simulation (",nbtuba," bands)")
plot(result.tb,pos.legend=1,title=title)

It is now time to perform the simulations using the SPDE Model. Due to its efficiency, it is worthwhile performing several simulations simultaneously (here 10). Moreover, in order to eliminate the edge effects linked to this methodology, we expand the simulation grid by adding a margin whose dimension is set to twice the range.

seed   = 31415
nbsimu = 10
gext   = c(2*range,2*range)
result.nc = spde(,grid,model,seed=seed,gext=gext,nbsimu=nbsimu)
title = paste0("SPDE: one (out of ",nbsimu,") simulations")
plot(result.nc,pos.legend=1,title=title)

Introducing SPDE

In this paragraph, we consider the usual geostatistical procedures (i.e. estimation and conditional simulations) through the Stochastic Partial Derivative Equation methodology. Once more, we will compare the traditional approach to the one based on the solution of SPDE.

Let us first generate the conditioning information constituted of a set of 300 samples. Their location is uniform within the grid extension while their value is simulated (using the Matérn model).

ndata  = 300
data   = db.create(x1=extends[1]*runif(ndata),
                   x2=extends[2]*runif(ndata))
nbtuba = 1000
data   = simtub(,data,model,nbtuba=nbtuba)
data   = db.rename(data,"Simu.V1.S1","data")
title  = paste0("Data file(",ndata," samples)")
plot(data,name.post=1,pch=19,title=title)

Estimation

First we perform the estimation using the traditional kriging procedure, and considering all the information in the neighborhood of each node of the target grid (Unique neighborhood). In the same time, the standard deviation (square root of the variance) of the estimation error is derived. Even the Dual approach is used, the computing time is quite long when estimating the variable over the 120000 grid nodes.

result.krige = kriging(data,grid,model,neigh,uc="")

The resulting estimation is represented over the grid, overlaying the conditioning data (with their dimension proportional to their value.

title = paste0("Kriging Estimate") 
plot(result.krige,name="Kriging.data.estim",title=title)
plot(data,add=T,bg="white",pch=21)

The corresponding standard deviation of estimation is represented together with the conditioning data location.

title = paste0("Kriging Standard Deviation")
plot(result.krige,name="Kriging.data.stdev",title=title)
plot(data,add=T,name.post=1,bg="white")

In the SPDE approach, the set of data and target must belong to the network on which the Finite Element procedure is run. Then, although the target nodes belong to a grid, a grid approach cannot be used as soon as data are not located exactly on some grid nodes. Then we must regroup data and target in a 2-D triangulated network. Note that, in the 3-D case, we must replace triangles by tetrahedrons.

We must now learn some principles on the triangulation procedure. Let us perform the triangulation based on the 300 data points and visualize it. The triangulation procedure allows a parametrization provided through a switch: the switch nqQ asks for a quality mesh generation (the angle of each triangle must not be too small).

title = paste0("Triangulation with switch 'nqQ'")
a = meshing(data,triswitch="nqQ",title=title,flag.point=FALSE,
            flag.plot=TRUE)

When using the other switch nQa50, we ask for the surface of each triangle to be smaller than 50, hence the generation of a lot of small triangles introducing a large set of additional non-conditioned triangle vertices.

title = paste0("Triangulation with switch 'nqQa50'") 
a = meshing(data,triswitch="nqQa50",title=title,flag.point=FALSE,
            flag.plot=TRUE)

When defining the network of triangles for performing an estimation (or conditional simulations) on the nodes of a regular grid, we embark the set of data points together with all the grid nodes in the triangulation (letting the ability to introduce free additional triangle vertices). Moreover, it is recommended to enlarge the area covered by the network: the radius of this extension is set to twice the range. For efficiency reason, the display has been switched OFF.

title = paste0("Triangulation with extended area") 
a = meshing(data,triswitch="nqQ",dbaux=grid,gext=gext,
    title = title, flag.point=FALSE, flag.plot=FALSE)

Based on the triangulation procedure, we can now run the SPDE estimation. Note that in this first attempt, no extension has been required so that the triangulation ends on the convex hull enclosing the data and the grid. The user can notice a slight reduction in the computing time.

result.kspde1 = spde(data,grid,model,flag.est=TRUE,flag.std=TRUE)

The result is represented in the same manner as the one performed using traditional kriging, overlaying the data information. The result is quite similar to the previous one.

title = paste0("Kriging Estimate (SPDE)")
plot(result.kspde1,name="SPDE.data.estim",title=title)
plot(data,add=TRUE,bg="white",pch=21)

Similarly, we visualize the standard deviation of the estimation error and compare it to the one obtained previously.

title = paste0("Kriging Standard Deviation (SPDE)")
plot(result.kspde1,name="SPDE.data.stdev",title=title)
plot(data,add=T,name.post=1,bg="white")

To better appreciate the resemblance between estimation performed using the traditional kriging of the kriging using SPDE technology, we compare the results node-by-node in a scatter plot. The cloud is essentially located around the first diagonal, although some dispersion can be observed.

title = paste0("Comparing Estimations")
dummy = correlation(result.krige,"*estim","*estim",result.kspde1,title=title,
                    flag.iso=T,flag.same=T,flag.diag=T,
                    xlab="Traditional estimation",ylab="SPDE estimation")

The same procedure is used to compare the standard deviation of estimation when using traditional kriging or SPDE estimation. The comparison is now located close to the diagonal only for low values of the standard deviation of the estimation error; the dispersion increases with the estimation error.

title = paste0("Comparing Standard Deviations")
dummy = correlation(result.krige,"*stdev","*stdev",result.kspde1,title=title,
                    flag.iso=T,flag.same=T,flag.diag=T,
                    xlab="Traditional estimation",ylab="SPDE estimation")

To better understand the origin of the deviation, we can compute and visualize the absolute value of the difference between the two estimations. We easily check the main differences (in absolute value) are located at the periphery of the grid.

result.kspde1 = db.add(result.kspde1,
diff.K = abs(result.kspde1[,"SPDE.data.estim"] -
             result.krige[,"Kriging.data.estim"]))
title = paste0("Difference of Estimations")
plot(result.kspde1,pos.legend=1,zlim=c(0,0.67),title=title)

The same comparison method is used for the standard deviation of the estimation error. Once again, the deviation is clearly located on the edge of the grid.

result.kspde1 = db.add(result.kspde1,
diff.STD = abs(result.kspde1[,"SPDE.data.stdev"] -
               result.krige[,"Kriging.data.stdev"]))
title = paste0("Difference of Standard Deviations")
plot(result.kspde1,pos.legend=1,zlim=c(0,0.16),title=title)

As a consequence of the previous remark, we perform the estimation again enlarging the estimation area (the extension radius is fixed to twice the range).

result.kspde2 = spde(data,grid,model,flag.est=TRUE,
                     flag.std=TRUE,gext=gext)

The comparison is performed again on the basis of the estimation via the scatter plot procedure. The dispersion is now much smaller.

title = paste0("Comparing Estimations")
dummy = correlation(result.krige,"*estim","*estim",result.kspde2,title=title,
                    flag.iso=T,flag.same=T,flag.diag=T,
                    xlab="Traditional estimation",ylab="SPDE estimation")

The comparison is also performed on the standard deviation of the estimation. The dispersion is again very limited, even if the deviation starts again for very large values of the error.

title = paste0("Comparing Standard Deviations")
dummy = correlation(result.krige,"*stdev","*stdev",result.kspde2,title=title,
                    flag.iso=T,flag.same=T,flag.diag=T,
                    xlab="Traditional estimation",ylab="SPDE estimation")

Simulations

As for the Estimation procedure, the simulations are performed using the traditional Turning Bands model and the SPDE methodology.

A conditional simulation using the traditional Turning Bands Model is performed using 1000 bands. It will serve as a reference in this paragraph. Due to time consideration, a single simulation is performed: this time would be proportional to the number of simulations.

seed   = 543243
nbtuba = 1000
nbsimu = 1
result.tbc = simtub(data,grid,model,neigh,uc="",seed=seed,
                    nbsimu=nbsimu,nbtuba=nbtuba)

The result of the simulation is represented.

title = paste0("Conditional Simulation (",nbtuba," bands)",sep="")
plot(result.tbc,pos.legend=1,title=title)

Using the SPDE technology, a set of 100 conditional simulations is performed. One can appreciate the gain in computing time, compared to the traditional Turning Bands Model (even using only a limited number of bands). When performed, each simulation is treated so as to derive on the fly the mean and variance of the simulations.

nbsimu = 100
result.spc = spde(data,grid,model,seed=seed,gext=gext,
                  flag.modif=TRUE, nbsimu=nbsimu)

The mean of the 100 simulation outcomes is represented next: it must be compared to the estimation performed earlier.

title = paste0("Average of ",nbsimu," simulations")
plot(result.spc,pos.legend=1,name="SPDE.mean",title=title)

Similarly the dispersion of the outcomes should be compared to the estimation error.

title = paste0("Standard Deviation of ",nbsimu," simulations")
plot(result.spc,pos.legend=1,name="SPDE.stdev",title=title)

Another comparison is given by the scatter plot between the Estimation and the mean of 100 simulations, both using the SPDE methodology.

title = paste0("Comparing Kriging and Mean of Simulations")
dummy = correlation(result.krige,"*estim","*mean",result.spc,title=title,
                    flag.iso=T,flag.same=T,flag.diag=T,
                    xlab="Direct Estimation",ylab="From Simulations")

The same comparison is performed between the Standard deviation of the Estimation error and the standard deviation of the dispersion of the 100 simulation outcomes.

title = paste0("Comparing Standard Deviation and Dispersion of Simulations")
dummy = correlation(result.krige,"*stdev","*stdev",result.spc,title=title,
                    flag.iso=T,flag.same=T,flag.diag=T,
                    xlab="Direct Estimation",ylab="From Simulations")

Estimation with large data set

The SPDE technology is very efficient to deal with large data sets. It makes a real difference with the traditional technique where kriging is usually considered as limited to neighborhoods of few hundreds of data each. We recall that SPDE has no neighborhood definition: it implicitly corresponds to a Unique neighborhood.

To demonstrate the specificity of SPDE estimation, we consider a case with 10000 data.

ndata.big = 10000
data.big  = db.create(x1=extends[1]*runif(ndata.big),
                      x2=extends[2]*runif(ndata.big))
nbtuba    = 1000
data.big  = simtub(,data.big,model,nbtuba=nbtuba)
data.big  = db.rename(data.big,"Simu.V1.S1","data")

To illustrate such a large data set, we must reduce the size of the symbol used for the sample representation.

title = paste0("Large data set (",ndata.big," samples)")
plot(data.big,name.post=1,cex=0.1,title=title)

The estimation is performed using the SPDE method.

result.kspde.big = spde(data.big,grid,model,flag.est=TRUE,flag.std=TRUE)

The estimated grid is presented in the next figure.

title = paste0("Estimation (SPDE) with large data set") 
plot(result.kspde.big,name="SPDE.data.estim",title=title)
plot(data.big,add=T,name.post=1,cex=0.1)

The standard deviation of the estimation error is presented next:

title = paste0("Standard Deviation (SPDE) with large data set")
plot(result.kspde.big,name="SPDE.data.stdev",title=title)
plot(data.big,add=T,name.post=1,cex=0.1)

Similarly, we now perform 10 conditional simulations and represent one of the outcomes.

nbsimu = 10
result.spc.big = spde(data.big,grid,model,seed=seed,gext=gext,
                      nbsimu=nbsimu)
title = paste0("Conditional simulation (one out of ",nbsimu,") with large data set")
plot(result.spc.big,pos.legend=1,title=title)

SPDE First steps

D. Renard, N. Desassis

4 May 2017