---
title: "K-Fold"
author: "D. Renard"
date: "July 9, 2020"
output:
  pdf_document: default
  html_document: default
---

# Introduction

This document demonstrates the K-Fold option of the cross-validation.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(RGeostats)
rm(list=ls())
constant.define("asp",1)
```

We create the data set by non-conditional simulation, together with the neighborhood used in the rest of the document.

```{r}
# 100 random points in a 100 x 100 square
db = db.point.init(nech=100,coormax=c(100,100))
model = model.create(vartype="Cubic",range=50)
# Non-conditional simulation at the data points
db = simtub(,db,model)
db = db.rename(db,"Simu*","Data")
plot(db,pch=19,title="Data")
neigh = neigh.create(radius=30,nmaxi=10)
```

# Cross Validation

## Traditional method

The traditional method consists in suppressing each datum in turn and re-estimating it from its neighboring samples. This option is called **Leave One Point Out**.

This cross-validation is performed next. The normalized error is displayed at each datum.

```{r}
res = xvalid(db,model,neigh)
plot(res,name="*stderr*",pch=19,title="Normalized Error")
```

## K-Fold method

Each datum is now assigned a code. For simplicity, the code values are drawn at random from the integers 1 to 5 (*ncode*).

```{r}
ncode = 5
db = db.add(db,code=ceiling(ncode * runif(db$nech)))
# Restore "Data" as the target variable and declare the new variable as the code
db = db.locate(db,"Data","z")
db = db.locate(db,"code","code")
plot(db,name.color="code",title="Code")
```

We perform the cross-validation with the K-Fold option (`flag.code=TRUE`).

```{r}
res = xvalid(db,model,neigh,flag.code=TRUE)
plot(res,name="*stderr*",pch=19,title="Normalized Error")
```

Let us check the results by selecting one specific target sample (sample 17). The first trial is performed using the Leave-One-Point-Out algorithm, whereas the second trial uses the K-Fold option.

```{r}
debug.reference(17)
res = xvalid(db,model,neigh)
res = xvalid(db,model,neigh,flag.code=TRUE)
debug.reference(0)
```
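
To help read the two debug printouts, the sketch below shows in base R which samples remain available to estimate a target under each scheme. It is an illustration only and not the RGeostats implementation. It assumes that the K-Fold option discards every sample sharing the target's code, whereas Leave-One-Point-Out discards only the target itself. The helper `candidates()` and the variables `x`, `y` and `code` are hypothetical names, and the neighborhood is reduced to a plain radius and a maximum count.

```{r}
# Illustrative base-R sketch (not RGeostats internals): which samples remain
# available to estimate a target under each cross-validation scheme.
set.seed(42)
n     <- 100
ncode <- 5
x     <- runif(n, 0, 100)
y     <- runif(n, 0, 100)
code  <- sample.int(ncode, n, replace = TRUE)  # fold label in 1..ncode

# Indices of the candidate neighbors of `target` within `radius`, closest
# first, truncated to `nmaxi`. With kfold = TRUE (assumed behavior), every
# sample sharing the target's code is discarded; otherwise only the target is.
candidates <- function(target, radius = 30, nmaxi = 10, kfold = FALSE) {
  d    <- sqrt((x - x[target])^2 + (y - y[target])^2)
  keep <- if (kfold) code != code[target] else seq_len(n) != target
  idx  <- which(keep & d <= radius)
  head(idx[order(d[idx])], nmaxi)
}

target <- 17
loo <- candidates(target)                # Leave-One-Point-Out
kf  <- candidates(target, kfold = TRUE)  # K-Fold
```

Comparing `code[loo]` with `code[kf]` shows the difference: the first vector may contain the code of the target, the second never does. Under this assumption, the K-Fold neighborhood is drawn from a smaller pool of samples than the Leave-One-Point-Out one.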