Kriging with Categorical Covariate

Postby niafall » Tue Mar 01, 2022 12:48 pm

Is there a kriging method where a categorical variable (e.g. sediment type - "sand" or "mud") can be used as a covariate?

Many thanks :)
niafall
 

Re: Kriging with Categorical Covariate

Postby Didier Renard » Wed Mar 02, 2022 8:59 am

Hi there,
Again this is a good generic question... with many different possible answers.

The main problem is to "convert" the categorical information into numerical information. It then suffices to add this variable (or set of variables) to your primary (numerical) variable, fit a joint model, and perform any subsequent geostatistical operation (kriging, which becomes cokriging, or simulations).
Below are several thoughts on how to perform this conversion.

When dealing with categorical variables, a usual solution is to consider them as outcomes of a continuous variable through a truncation procedure. The issue is then to devise a relevant scheme for transforming a real value into a categorical one. This transform function can be simple if the categories are derived directly from the continuous variable (e.g. categories defined from a grade as low, middle and high grade). In that case the transform function is perfectly known (even better if you know the truncation values). Sometimes this simple solution does not make sense, e.g. when dealing with lithofacies (sand, shale, conglomerate, ...). Although one could think of a proxy (granulometry, for example), this is obviously too naive as a transformation function.
However, let us conclude this part by saying that if such a function exists, it suffices to invert it... which is not simple though. One possible approach: say facies A corresponds to an underlying continuous variable lying within [LA, UA]; similarly, facies B corresponds to the same underlying variable lying within [LB, UB], and so on. The back transformation then simply consists in drawing a random value in the correct interval, given the facies.
Obviously, this very naive solution does not take any spatial behavior into account (other than the fact that two samples close to each other probably belong to the same facies and will therefore get close values on the real scale).
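
To make the interval idea concrete, here is a minimal sketch in plain Python (not RGeostats). The interval bounds, the facies names and the helper names are hypothetical and chosen only for illustration. Each facies is replaced by a uniform draw within its interval, and applying the forward transform to the drawn value gives back the facies.

```python
import random

# Hypothetical truncation intervals [lower, upper) on the underlying continuous scale.
INTERVALS = {"mud": (0.0, 0.3), "sand": (0.3, 1.0)}

def draw_underlying(facies, rng):
    """Back transform: draw one value of the underlying variable inside the facies interval."""
    lower, upper = INTERVALS[facies]
    return rng.uniform(lower, upper)

def to_facies(value):
    """Forward transform: map a continuous value to the facies whose interval contains it."""
    for facies, (lower, upper) in INTERVALS.items():
        if lower <= value < upper:
            return facies
    raise ValueError(f"value {value} is outside every interval")

rng = random.Random(42)  # fixed seed so the draws are reproducible
samples = ["sand", "mud", "mud", "sand"]
values = [draw_underlying(f, rng) for f in samples]

# Each drawn value falls back into the facies it was drawn for.
recovered = [to_facies(v) for v in values]
```

As noted above, the draws are independent from one sample to the next, so nothing here accounts for spatial correlation.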

Another common way is to translate the categorical information into numerical values using indicators. We do not gain much, except that the result is now numeric. A simple way is to convert each category into its own indicator. But we then end up with multiple indicators (as many as categories), and we have to perform a multivariate fit of the whole set of indicators. Moreover, several properties remain unaddressed:
- the closure: if there are 3 categories in total, each sample must belong to one of them. This means that, at each point, the sum of the indicators must be equal to 1
- the disjunctive characteristic: if a sample belongs to one category, it does not belong to another one (this is not always desired, by the way, e.g. in the case of nested categories). Otherwise it suffices to check that the sum of the indicators is never larger than 1
- the transition: when leaving a category (say A), samples tend to jump into one particular category (say B) rather than another (say C)
However, note that if you only have two disjunctive categories, this becomes a very easy solution that requires a single indicator function.
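
The indicator coding and the closure check can be sketched in a few lines of plain Python (not RGeostats). The data and the function name `indicators` are made up for the example.

```python
def indicators(categories, levels):
    """Return one 0/1 indicator list per level, keyed by the level name."""
    return {lev: [1 if c == lev else 0 for c in categories] for lev in levels}

# Hypothetical sediment type observed at five sample points.
sediment = ["sand", "mud", "mud", "sand", "sand"]
ind = indicators(sediment, ["sand", "mud"])

# Closure: at every sample the indicators sum to exactly 1.
closure_ok = all(
    sum(column[i] for column in ind.values()) == 1 for i in range(len(sediment))
)

# With two disjoint categories a single indicator carries all the information:
# the second one is simply its complement.
sand = ind["sand"]
mud_from_sand = [1 - v for v in sand]
```

In the two-category case raised in the question ("sand" or "mud"), the single `sand` indicator could therefore serve as the secondary numerical variable.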

Another solution that has been successfully tested and used is to resort to the PluriGaussian model. There the transformation is provided through the Truncation Rule, which can be quite elaborate as it may involve more than one underlying variable (as in PGS, for example; note that we usually do not consider more than 2 underlying GRFs). The principle of this method is to fit the characteristics of these underlying Gaussian random functions (GRFs) through the behavior of the simple and cross variograms of the indicators. At this stage, the truncation rule together with the proportions of the different categories is assumed to be known.
This "blind" fitting was initially tedious, as we had to propose a model for the underlying GRFs, transform the variogram of the GRFs into the variogram of the indicators, and check the fit against the experimental variogram of the indicators. This trial-and-error procedure has been replaced by an automatic fitting procedure (variopgs() in RGeostats). Once this step is completed, you have a model describing the spatial characteristics of the underlying GRFs. The last task is then simply to run a Gibbs Sampler (function gibbs()), which will "translate" the categories into numerical values of the GRFs.
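
To illustrate the transformation from the variogram of a GRF to the variogram of an indicator, here is a sketch of the simplest case only: a single standard Gaussian GRF truncated at zero, i.e. two categories with equal proportions. In that case the indicator variogram has the closed form 1/4 - arcsin(rho)/(2*pi), where rho is the GRF correlation at the lag considered. The exponential correlation model and its scale are assumptions made for the example, and the code is plain Python, not what variopgs() does internally.

```python
import math

def correlation_exponential(h, scale):
    """Hypothetical correlation model for the underlying GRF: rho(h) = exp(-h / scale)."""
    return math.exp(-h / scale)

def indicator_variogram_from_rho(rho):
    """Variogram of the indicator 1{Y > 0} at a lag where the GRF correlation is rho.

    Valid only for a standard Gaussian Y truncated at 0 (proportions 0.5 / 0.5).
    """
    return 0.25 - math.asin(rho) / (2.0 * math.pi)

# Model indicator variogram at a few lags; this is the curve that would be
# compared with the experimental indicator variogram during the fit.
lags = [0.0, 5.0, 10.0, 20.0, 40.0]
model_curve = [
    indicator_variogram_from_rho(correlation_exponential(h, scale=10.0)) for h in lags
]
```

The curve starts at 0 and rises towards the sill 0.25 = p(1 - p). For other thresholds, or for several GRFs, the relation involves bivariate Gaussian integrals and has no such simple form.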
Note that this method offers several advantages simultaneously. The model:
- preserves the categories by construction: the values of the GRFs converted back to categories (using the Truncation Rule) give back the correct category for each sample
- preserves the transitions: if close samples belong to categories A and B more frequently than to any other pair, then the underlying GRF values will also be closer
- preserves the proportions of each category (by construction)
- preserves the topology: if categories A and C cannot be adjacent (say that there must be an intermediate category B in between), this is reflected by the Truncation Scheme and will be honored by the transformation described above.
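
As an illustration of how a truncation rule ties proportions and topology together, here is a sketch with a single standard Gaussian variable and three ordered categories A, B, C. The proportions and the function names are hypothetical, and the code is plain Python rather than RGeostats. The thresholds are the Gaussian quantiles of the cumulative proportions, so the proportions are honored by construction, and since B occupies the interval between A and C, a continuous GRF cannot pass from A to C without crossing B.

```python
from statistics import NormalDist

def thresholds_from_proportions(proportions):
    """Map cumulative proportions to thresholds on the standard Gaussian scale."""
    std = NormalDist()
    cumulative, cuts = 0.0, []
    for p in proportions[:-1]:  # the last category needs no upper threshold
        cumulative += p
        cuts.append(std.inv_cdf(cumulative))
    return cuts

def truncate(y, cuts, labels):
    """Truncation rule: return the label of the interval that contains y."""
    for cut, label in zip(cuts, labels):
        if y < cut:
            return label
    return labels[-1]

labels = ["A", "B", "C"]
proportions = [0.3, 0.5, 0.2]  # hypothetical proportions, must sum to 1
cuts = thresholds_from_proportions(proportions)

# Proportions implied by the thresholds match the input proportions.
std = NormalDist()
implied = [std.cdf(cuts[0]), std.cdf(cuts[1]) - std.cdf(cuts[0]), 1.0 - std.cdf(cuts[1])]

categories = [truncate(y, cuts, labels) for y in (-2.0, 0.0, 2.0)]
```

A rule involving two GRFs partitions a plane instead of a line, but the principle is the same.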

These are the 3 main ways to convert your categorical information into numerical information.

However, most of them (specifically the last one) rely on simulations! And you wanted to address an estimation problem (you mentioned Kriging). So I have no better solution to suggest than building conditional simulations.
Let me be more precise about the PGS solution (the third one).
You would define a single GRF model using variopgs. Afterwards, you would run Gibbs (which is a simulation engine) several times, providing a series of numerical outcomes at each sample point (it may be a double series if you run PGS with 2 underlying GRFs). Each series of GRF values must then be considered as a possible set of measurements at the data points. They must be combined with the primary variable at the same data points in order to run a conditional simulation.

Apart from this detail (part of the data changes value for each simulation), the rest of the story is usual: you run several conditional simulations and average them in order to obtain a result close enough to an estimation.
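
The averaging step can be sketched as follows in plain Python. The simulated values are made up, and each inner list stands for one conditional simulation over the same three grid nodes. The node-wise mean plays the role of the estimate. The node-wise standard deviation is added here only as one possible, assumed way to summarize the spread between simulations.

```python
from statistics import mean, pstdev

# Hypothetical output: 4 conditional simulations on a 3-node grid.
simulations = [
    [1.2, 0.8, 2.1],
    [1.0, 1.1, 1.9],
    [1.4, 0.9, 2.3],
    [1.2, 1.2, 1.7],
]

# Regroup the values node by node (one tuple of simulated values per node).
per_node = list(zip(*simulations))

# Node-wise average over the simulations: the approximate estimate.
estimate = [mean(values) for values in per_node]

# Node-wise standard deviation over the simulations: a simple spread measure.
spread = [pstdev(values) for values in per_node]
```

In practice many more simulations would be needed before the average becomes stable.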

Hope this will help.
Didier Renard
 

Re: Kriging with Categorical Covariate

Postby niafall » Mon Mar 07, 2022 1:04 pm

Many thanks, Didier, for another detailed response.

Let me be more precise about the PGS solution (the third one).
You would define a single GRF model using variopgs. Afterwards, you would run Gibbs (which is a simulation engine) several times, providing a series of numerical outcomes at each sample point (it may be a double series if you run PGS with 2 underlying GRFs). Each series of GRF values must then be considered as a possible set of measurements at the data points. They must be combined with the primary variable at the same data points in order to run a conditional simulation.


Are there any example scripts detailing how some/all of these steps might be implemented in RGeostats? Is it possible to approximate some uncertainty quantity from these simulations? (I assume for this second question I would refer to your answer to my question posted elsewhere)

Many thanks :)
niafall
 

Re: Kriging with Categorical Covariate

Postby Fabien Ors » Mon Jun 13, 2022 9:10 am

Sorry for our late answer.
Here is an example script showing how to use the vario.pgs function: http://rgeostats.free.fr/doc/Courses/08 ... ns-PGS.Rmd
Fabien Ors
Site administrator
 

