by Didier Renard » Thu Aug 01, 2013 9:46 pm
Hello Matt
Sorry for the delay in answering your interesting question concerning the selections.
Each Db can contain at most ONE active selection: this corresponds to a variable with a locator set to "sel".
A sample is considered as masked off if the value of the selection variable is equal to 0; otherwise the sample is active.
When a selection variable is defined within a Db, all the procedures using that Db (usually) take this selection into account:
- if the Db is used as input, only the active samples can be used
- if the Db is used as output, only the active samples are processed.
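The masking rule above can be sketched in plain base R, without calling any RGeostats function. The data frame `samples` and its column `sel` are hypothetical stand-ins for a Db and its selection variable; the sketch only illustrates the rule (0 means masked off, anything else means active), not how the package stores a selection internally.

```r
# Hypothetical stand-in for a Db: a plain data frame, not an RGeostats object
samples <- data.frame(
  x1  = c(0.2, 0.5, 0.8, 0.4),
  x2  = c(0.1, 0.6, 0.3, 0.9),
  z1  = c(1.5, 2.0, 0.7, 1.1),
  sel = c(1, 0, 1, 0)   # plays the role of the selection variable
)

# A sample is active when its selection value differs from 0
is_active <- samples$sel != 0

# Only the active samples would be seen by a procedure honouring the selection
active_samples <- samples[is_active, ]
print(active_samples)
```

Here the second and fourth samples are masked off, so only two rows remain.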
For the sake of demonstration, I devised the following small case study.
# Create the input data set (100 random points in a [0,1] 2-D square), called "data" #
data <- db.create(x1=runif(100),x2=runif(100),z1=runif(100))
# create an output grid covering the area, called "grid" #
grid <- db.create(flag.grid=TRUE,nx=c(100,100),dx=c(0.01,0.01))
# Define a model, called "model" #
### This can be done by using the interactive procedure model.input() ###
### I have created a model composed of a cubic isotropic structure with range 0.3 and sill 1 ###
# Define a Neighborhood, called "neigh" #
### This can be done by using the interactive procedure neigh.input() ###
### I have created a Unique neighborhood ###
# Create a simple polygon (triangle) from its vertex coordinates, called "polygon" #
polygon = polygon.create(x=c(0.1,0.9,0.5),y=c(0.1,0.1,0.9))
# Apply the selection on the input and the output Db #
data = db.polygon(data,polygon)
grid = db.polygon(grid,polygon)
# As you can check, a selection has been created in each Db #
# We can now check the usage of the selection in the output file, using the conditional simulations #
plot(simtub(data,grid,model,neigh))
### We can see that only the nodes of the output Db (Grid) located inside the polygon have been processed ###
# We now check the usage of the selection in the input Db, using the kriging test performed for grid node #5000 #
a <- krigtest(data,grid,model,neigh,iech0=5000)
### The result is a list (stored in the object "a") which contains several fields ###
### The field xyz corresponds to the locations of the data information used for this estimation ###
plot(a$xyz)
plot(polygon,add=TRUE)
### By displaying the information used for kriging grid node #5000 (using the Unique Neighborhood), ###
### we can check that it corresponds to the subset of the data which belongs to the polygon ###
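As an independent cross-check of the plot, one can test in base R which points fall inside the triangle. The helper `in_triangle` below is hypothetical (it is not part of RGeostats) and uses a same-side sign test, which may differ from whatever test db.polygon applies internally, notably for points lying exactly on an edge.

```r
# Hypothetical helper (not part of RGeostats): TRUE for points inside the
# triangle with vertices (px[i], py[i]), edges included
in_triangle <- function(x, y, px, py) {
  # Signed-area test: tells on which side of the edge a -> b each point lies
  side <- function(ax, ay, bx, by) (bx - ax) * (y - ay) - (by - ay) * (x - ax)
  d1 <- side(px[1], py[1], px[2], py[2])
  d2 <- side(px[2], py[2], px[3], py[3])
  d3 <- side(px[3], py[3], px[1], py[1])
  # A point is inside when it lies on the same side of all three edges
  (d1 >= 0 & d2 >= 0 & d3 >= 0) | (d1 <= 0 & d2 <= 0 & d3 <= 0)
}

# Same triangle as in the case study
px <- c(0.1, 0.9, 0.5)
py <- c(0.1, 0.1, 0.9)

# Three test points: one inside, one to the left, one above the apex
inside <- in_triangle(c(0.5, 0.05, 0.5), c(0.3, 0.5, 0.95), px, py)
print(inside)
```

Applied to the coordinates stored in a$xyz, such a test should flag every point as inside if the selection was honoured.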
Obviously, the selections applied to the input and output Db do not have to be the same.
Now that you have understood the usage of the selection, I can introduce the concept of DOMAINING.
Suppose that you have assigned a DOMAIN number to each sample of the input Db and to each grid node of the output Db
(this variable must be defined with the locator "dom" in each Db).
You can then run the simulation or estimation procedure several times, specifying the value of the target DOMAIN each time.
This number is provided through the argument "domain", which is available in most of the procedures.
Say you run the estimation with domain=21: a temporary selection is created on the fly in the input Db, where only the samples
whose domain value is equal to 21 are active. At the same time, a selection is created on the fly in the output Db, keeping only
the nodes for which the domain value is equal to 21. The procedure then runs with these two selections.
If the locator "dom" is not created in a Db, the argument "domain" has no impact.
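The on-the-fly selection can be pictured as follows in base R. The function `domain_selection` is a hypothetical illustration, not the package's internal code, and the data frames stand in for Db objects. It mirrors the two rules stated above: only the samples whose domain equals the target stay active, and when no domain variable exists the argument has no effect.

```r
# Hypothetical illustration (not RGeostats code): build the temporary
# selection implied by a target domain value
domain_selection <- function(db, domain) {
  # Without a domain variable, the "domain" argument has no impact:
  # every sample stays active
  if (!("dom" %in% names(db))) return(rep(TRUE, nrow(db)))
  # Otherwise keep only the samples belonging to the target domain
  db$dom == domain
}

# Stand-ins for a Db with and without a domain variable
input  <- data.frame(z1 = c(1.5, 2.0, 0.7, 1.1), dom = c(21, 22, 21, 23))
no_dom <- data.frame(z1 = c(1.5, 2.0, 0.7))

print(domain_selection(input, 21))
print(domain_selection(no_dom, 21))
```

How a domain interacts with a selection already present in the Db is not covered by this sketch.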
Hoping all this information will be of some help.