The FBMS
package provides functions for Flexible
Bayesian Model Selection and Model Averaging.
Below are provided examples on how to run the algorithm and what the results tell us, we begin by loading the package and a supplied dataset
library(FBMS)
#> Loading required package: fastglm
#> Loading required package: bigmemory
#> Loading required package: GenSA
#> Loading required package: parallel
#> Loading required package: r2r
library(GenSA)
library(fastglm)
data("breastcancer")
bc <- breastcancer[,c(ncol(breastcancer),2:(ncol(breastcancer)-1))]
We need some nonlinear transformations for the algorithm to use, the package offers a selection of these, but you are also able to define your own. Here we create a list of the ones to use, all but one are supplied by the package.
By calling two functions in the package, a list of probabilities for various parts of the algorithm, as well as a list of parameters are created. The list of probabilities needs the list of transformations to be able to create the vector of probabilities for the different transformations
We can use one of the supplied likelihood functions, but here we demonstrate how to create our own, it takes four arguments, the dependent \(y\) variable, the matrix \(X\) containing all independent variables, the model as a logical vector specifying the columns of \(X\), and a list of complexity measures for the features involved in the model
loglik.example <- function (y, x, model, complex, params) {
r <- 20/223
suppressWarnings({mod <- fastglm(as.matrix(x[,model]), y, family=binomial())})
ret <- (-(mod$deviance -2*log(r)*sum(complex$width)))/2
return(list(crit=ret, coefs=mod$coefficients))
}
To be able to calculate the alphas when using for example strategy 3
as per Hubin et al., we need a function for the log likelihood, in this
example we will use the function supplied by the package called
logistic.loglik.alpha
. With that function as a starting
point, you can also create your own function for this. We also adjust
our parameter list to use the third strategy.
We are now ready to run the algorithm, in this vignette we will only
run very few iterations for demonstration purposes, but the only thing
that needs to be changed are the number or populations to visit
T
, the number of iterations per population N
and the number of iterations for the final population
N.final
set.seed(1234)
result <- gmjmcmc(bc, loglik.example, logistic.loglik.alpha, transforms, P=3, N=30, N.final=60, probs, params)
#> Data coerced to matrix type.
#> Intercept added to data.
#> Population 1 begin.New best crit in cur pop: -33.7601529708415
#> New best crit in cur pop: -31.3487134729348
#> New best crit in cur pop: -28.9372739750294
#>
#> Population 1 done.
#> Current best crit: -28.9372739750294
#> Feature importance:
#> ! ##| radius_mean
#> ############!#################| texture_mean
#> ! | perimeter_mean
#> ############!#################| area_mean
#> ############!#################| smoothness_mean
#> ############!#################| compactness_mean
#> ! | concavity_mean
#> ! | concave.points_mean
#> ############!#################| symmetry_mean
#> ############!#################| fractal_dimension_mean
#> ! | radius_se
#> ############!#################| texture_se
#> ! | perimeter_se
#> ############!#################| area_se
#> ! | smoothness_se
#> ############!#################| compactness_se
#> ! | concavity_se
#> ! | concave.points_se
#> ! | symmetry_se
#> ! | fractal_dimension_se
#> ! | radius_worst
#> ! | texture_worst
#> ! | perimeter_worst
#> ! | area_worst
#> ! | smoothness_worst
#> ############!#################| compactness_worst
#> ! #####| concavity_worst
#> ############!#################| concave.points_worst
#> ############!#################| symmetry_worst
#> ! | fractal_dimension_worst
#> Replaced feature radius_mean with (radius_se*compactness_worst)
#> Replaced feature perimeter_mean with exp_dbl(texture_se)
#> Replaced feature concavity_mean with troot(area_se)
#> Replaced feature concave.points_mean with exp_dbl(compactness_se)
#> Replaced feature radius_se with (symmetry_mean*texture_se)
#> Replaced feature perimeter_se with troot(compactness_mean)
#> Replaced feature smoothness_se with (area_mean*texture_se)
#> Replaced feature concavity_se with to3(symmetry_mean)
#> Replaced feature concave.points_se with (symmetry_mean*smoothness_se)
#> Replaced feature symmetry_se with troot(symmetry_worst)
#> Replaced feature fractal_dimension_se with (compactness_mean*area_se)
#> Replaced feature radius_worst with sigmoid(perimeter_se)
#> Replaced feature texture_worst with (smoothness_mean*concavity_mean)
#> Replaced feature perimeter_worst with sin_deg(concave.points_mean)
#> Replaced feature area_worst with sigmoid(concave.points_worst)
#> Replaced feature smoothness_worst with radius_se
#> Replaced feature concavity_worst with (concave.points_se*concave.points_worst)
#> Replaced feature fractal_dimension_worst with (concavity_worst*area_se)
#> Added feature troot(-0.71+0.32*symmetry_worst+0.84*area_se+2.54*smoothness_se+0.4*area_mean+0.04*compactness_worst)
#> Added feature exp_dbl((symmetry_mean*texture_se))
#> Added feature sin_deg(0.99+-0.01*concave.points_worst+0.32*area_se+-1.01*area_mean+0.47*perimeter_se)
#> Added feature (concavity_mean*concavity_se)
#> Added feature exp_dbl(perimeter_se)
#> Added feature p0(texture_mean)
#> Added feature (texture_mean*fractal_dimension_mean)
#> Added feature (texture_se*compactness_mean)
#> Added feature to3(fractal_dimension_worst)
#> Added feature (symmetry_worst*texture_se)
#> Added feature (fractal_dimension_mean*concave.points_worst)
#> Added feature (perimeter_se*smoothness_mean)
#> Added feature (area_worst*smoothness_se)
#> Population 2 begin.New best crit in cur pop: -163.9778858578
#> New best crit in cur pop: -161.566446359914
#> New best crit in cur pop: -156.743567364114
#>
#> Population 2 done.
#> Current best crit: -156.743567364114
#> Feature importance:
#> ! | (radius_se*compactness_worst)
#> ! | texture_mean
#> ! | exp_dbl(texture_se)
#> ############!#################| area_mean
#> ############!#################| smoothness_mean
#> ! | compactness_mean
#> ############!#################| troot(area_se)
#> ! | exp_dbl(compactness_se)
#> ############!#################| symmetry_mean
#> ! | fractal_dimension_mean
#> ############!#################| (symmetry_mean*texture_se)
#> ############!#################| texture_se
#> ############!#################| troot(compactness_mean)
#> ! | area_se
#> ############!#################| (area_mean*texture_se)
#> ############!#################| compactness_se
#> ############!#################| to3(symmetry_mean)
#> ! | (symmetry_mean*smoothness_se)
#> ! | troot(symmetry_worst)
#> ! | (compactness_mean*area_se)
#> ! | sigmoid(perimeter_se)
#> ############!#################| (smoothness_mean*concavity_mean)
#> ! | sin_deg(concave.points_mean)
#> ############!#################| sigmoid(concave.points_worst)
#> ############!#################| radius_se
#> ! | compactness_worst
#> ! | (concave.points_se*concave.points_worst)
#> ! | concave.points_worst
#> ############!#################| symmetry_worst
#> ############!#################| (concavity_worst*area_se)
#> ############!#################| troot(-0.71+0.32*symmetry_worst+0.84*area_se+2.54*smoothness_se+0.4*area_mean+0.04*compactness_worst)
#> ! | exp_dbl((symmetry_mean*texture_se))
#> ! | sin_deg(0.99+-0.01*concave.points_worst+0.32*area_se+-1.01*area_mean+0.47*perimeter_se)
#> ############!#################| (concavity_mean*concavity_se)
#> ! | exp_dbl(perimeter_se)
#> ! | p0(texture_mean)
#> ############!#################| (texture_mean*fractal_dimension_mean)
#> ############!#################| (texture_se*compactness_mean)
#> ! | to3(fractal_dimension_worst)
#> ! | (symmetry_worst*texture_se)
#> ############!#################| (fractal_dimension_mean*concave.points_worst)
#> ############!#################| (perimeter_se*smoothness_mean)
#> ############!#################| (area_worst*smoothness_se)
#> Replaced feature (radius_se*compactness_worst) with ((smoothness_mean*concavity_mean)*(symmetry_mean*texture_se))
#> Replaced feature texture_mean with ((concavity_worst*area_se)*(symmetry_mean*texture_se))
#> Replaced feature exp_dbl(texture_se) with (to3(symmetry_mean)*(fractal_dimension_mean*concave.points_worst))
#> Replaced feature compactness_mean with sigmoid(troot(1.3+-0.65*symmetry_worst+-0.02*area_se+-0.57*smoothness_se+-1.13*area_mean+-1.85*compactness_worst))
#> Replaced feature exp_dbl(compactness_se) with to3(texture_se)
#> Replaced feature fractal_dimension_mean with to3((texture_se*compactness_mean))
#> Replaced feature area_se with ((perimeter_se*smoothness_mean)*(to3(symmetry_mean)*(fractal_dimension_mean*concave.points_worst)))
#> Replaced feature (symmetry_mean*smoothness_se) with (smoothness_worst*area_se)
#> Replaced feature troot(symmetry_worst) with (symmetry_worst*troot(compactness_mean))
#> Replaced feature (compactness_mean*area_se) with ((concavity_worst*area_se)*troot(compactness_mean))
#> Replaced feature sigmoid(perimeter_se) with troot(1+-1.28*symmetry_mean)
#> Replaced feature sin_deg(concave.points_mean) with exp_dbl((texture_mean*fractal_dimension_mean))
#> Replaced feature compactness_worst with sigmoid(troot(area_se))
#> Replaced feature (concave.points_se*concave.points_worst) with ((area_worst*smoothness_se)*radius_mean)
#> Replaced feature concave.points_worst with sin_deg((texture_mean*fractal_dimension_mean))
#> Replaced feature exp_dbl((symmetry_mean*texture_se)) with ((area_worst*smoothness_se)*(texture_mean*fractal_dimension_mean))
#> Replaced feature sin_deg(0.99+-0.01*concave.points_worst+0.32*area_se+-1.01*area_mean+0.47*perimeter_se) with (radius_se*compactness_se)
#> Replaced feature exp_dbl(perimeter_se) with (texture_mean*troot(-2.25+0.95*symmetry_worst+-0.06*area_se+-0.08*smoothness_se+-2.39*area_mean+-0.73*compactness_worst))
#> Replaced feature p0(texture_mean) with compactness_mean
#> Replaced feature to3(fractal_dimension_worst) with (to3(symmetry_mean)*area_mean)
#> Replaced feature (symmetry_worst*texture_se) with to3(symmetry_worst)
#> Population 3 begin. |= |New best crit in cur pop: -279.7269817573
#> |== | |=== | |==== |New best crit in cur pop: -270.081223765728
#> |===== | |====== | |======= |New best crit in cur pop: -245.966828786667
#> |======== | |========= |New best crit in cur pop: -236.321070795041
#> |========== | |=========== | |============ | |============= |New best crit in cur pop: -224.263873305516
#> |============== | |=============== | |================ | |================= | |================== | |=================== | |==================== | |===================== | |====================== | |======================= | |======================== | |========================= | |========================== | |=========================== | |============================ | |============================= | |============================== | |=============================== |New best crit in cur pop: -221.852433807637
#> |================================ | |================================= | |================================== | |=================================== | |==================================== | |===================================== | |====================================== | |======================================= | |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================|New best crit in cur pop: -212.206675815853
#> |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================| |========================================|
#> Population 3 done.
#> Current best crit: -212.206675815853
#> Feature importance:
#> ! | ((smoothness_mean*concavity_mean)*(symmetry_mean*texture_se))
#> ############!#################| ((concavity_worst*area_se)*(symmetry_mean*texture_se))
#> ! | (to3(symmetry_mean)*(fractal_dimension_mean*concave.points_worst))
#> ! | area_mean
#> ############!#################| smoothness_mean
#> ############!#################| sigmoid(troot(1.3+-0.65*symmetry_worst+-0.02*area_se+-0.57*smoothness_se+-1.13*area_mean+-1.85*compactness_worst))
#> ############!#################| troot(area_se)
#> ! | to3(texture_se)
#> ############!#################| symmetry_mean
#> ! | to3((texture_se*compactness_mean))
#> ! | (symmetry_mean*texture_se)
#> ############!#################| texture_se
#> ! | troot(compactness_mean)
#> ! | ((perimeter_se*smoothness_mean)*(to3(symmetry_mean)*(fractal_dimension_mean*concave.points_worst)))
#> ! | (area_mean*texture_se)
#> ! | compactness_se
#> ! | to3(symmetry_mean)
#> ############!#################| (smoothness_worst*area_se)
#> ! | (symmetry_worst*troot(compactness_mean))
#> ############!#################| ((concavity_worst*area_se)*troot(compactness_mean))
#> ! | troot(1+-1.28*symmetry_mean)
#> ! | (smoothness_mean*concavity_mean)
#> ############!#################| exp_dbl((texture_mean*fractal_dimension_mean))
#> ############!#################| sigmoid(concave.points_worst)
#> ############!#################| radius_se
#> ############!#################| sigmoid(troot(area_se))
#> ############!#################| ((area_worst*smoothness_se)*radius_mean)
#> ! | sin_deg((texture_mean*fractal_dimension_mean))
#> ! | symmetry_worst
#> ! | (concavity_worst*area_se)
#> ! | troot(-0.71+0.32*symmetry_worst+0.84*area_se+2.54*smoothness_se+0.4*area_mean+0.04*compactness_worst)
#> ! | ((area_worst*smoothness_se)*(texture_mean*fractal_dimension_mean))
#> ############!#################| (radius_se*compactness_se)
#> ! | (concavity_mean*concavity_se)
#> ############!#################| (texture_mean*troot(-2.25+0.95*symmetry_worst+-0.06*area_se+-0.08*smoothness_se+-2.39*area_mean+-0.73*compactness_worst))
#> ############!#################| compactness_mean
#> ! | (texture_mean*fractal_dimension_mean)
#> ! | (texture_se*compactness_mean)
#> ! | (to3(symmetry_mean)*area_mean)
#> ############!#################| to3(symmetry_worst)
#> ############!#################| (fractal_dimension_mean*concave.points_worst)
#> ############!#################| (perimeter_se*smoothness_mean)
#> ############!#################| (area_worst*smoothness_se)
We can then summarize the results using the supplied function and get a plot of the importance of the parameters in the last population of features