Title: Accumulated Local Effects (ALE) Plots and Partial Dependence (PD) Plots
Description: Visualizes the main effects of individual predictor variables and their second-order interaction effects in black-box supervised learning models. The package creates either Accumulated Local Effects (ALE) plots and/or Partial Dependence (PD) plots, given a fitted supervised learning model.
Authors: Dan Apley
Maintainer: Dan Apley <[email protected]>
License: GPL-2
Version: 1.1
Built: 2025-02-28 04:01:05 UTC
Source: https://github.com/cran/ALEPlot
Visualizes the main effects of individual predictor variables and their second-order interaction effects in black-box supervised learning models. The package creates either Accumulated Local Effects (ALE) plots and/or Partial Dependence (PD) plots, given a fitted supervised learning model.
See the two individual functions ALEPlot and PDPlot that are included in this package.
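As a quick orientation, the short sketch below fits a simple lm model and draws one ALE plot and one PD plot; it mirrors the fuller Examples sections of the two functions, and the simulated data and variable names here are illustrative only, not part of the package.

library(ALEPlot)

## Simulate a small regression data set and fit a linear model (treated as a black box)
N <- 500
x1 <- runif(N); x2 <- runif(N)
y <- x1 + 2*x2^2 + rnorm(N, 0, 0.1)
DAT <- data.frame(y, x1, x2)
fit <- lm(y ~ x1 + I(x2^2), data = DAT)

## Prediction wrapper: takes the model and a data frame, returns a numeric vector
yhat <- function(X.model, newdata) as.numeric(predict(X.model, newdata))

par(mfrow = c(1, 2))
ALEPlot(DAT[, c("x1", "x2")], fit, pred.fun = yhat, J = 2, K = 40)  # ALE main effect of x2
PDPlot(DAT[, c("x1", "x2")], fit, pred.fun = yhat, J = 2, K = 40)   # PD main effect of x2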
Dan Apley
Maintainer: Dan Apley <[email protected]>
Apley, D. W. (2016), "Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models," submitted for publication.
Computes and plots accumulated local effects (ALE) plots for a fitted supervised learning model. The effects can be either a main effect for an individual predictor (length(J) = 1) or a second-order interaction effect for a pair of predictors (length(J) = 2).
ALEPlot(X, X.model, pred.fun, J, K = 40, NA.plot = TRUE)
X: The data frame of predictor variables to which the supervised learning model was fit. The names of the predictor variables must be the same as when the model was fit. The response variable should not be included in X.

X.model: The fitted supervised learning model object (e.g., a tree, random forest, neural network, etc.), typically an object to which a built-in predict method can be applied.

pred.fun: A user-supplied function that will be used to predict the response for X.model at specified predictor values. It must take two arguments, the fitted model object (X.model) and a data frame of predictor values (newdata), and return a numeric vector of predictions with one element per row of newdata; a sketch of such a wrapper follows this argument list.

J: A numeric scalar or two-length vector of indices of the predictors for which the ALE plot will be calculated.

K: A numeric scalar that specifies the number of intervals into which the predictor range is divided when calculating the ALE plot effects. If length(J) = 2, K intervals are used for each of the two predictors.

NA.plot: A logical value that is only used if length(J) = 2. It controls whether cells of the two-dimensional grid that contain no data are plotted (TRUE, the default) or left blank (FALSE).
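The pred.fun argument is usually the only piece the user needs to write. The sketch below is a hedged illustration of the required shape: the randomForest model and the data frame DAT (response y in column 1, predictors in columns 2:4, as in the Examples) are assumptions, not part of this documentation.

## Illustrative pred.fun wrapper (randomForest chosen only as an example model)
library(randomForest)
rf.DAT <- randomForest(y ~ ., data = DAT)

yhat.rf <- function(X.model, newdata) {
  ## must return a numeric vector of length nrow(newdata)
  as.numeric(predict(X.model, newdata))
}

ALE.rf.1 <- ALEPlot(DAT[, 2:4], rf.DAT, pred.fun = yhat.rf, J = 1, K = 40)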
See the Apley (2016) reference paper listed below for details. For J = j (i.e., if the index for a single predictor is specified), the function calculates and returns the ALE main effect of x_j, denoted f_j,ALE(x_j) in Apley (2016), and plots it against x_j. For J = c(j1,j2) (i.e., if the indices for a pair of predictors are specified), the function calculates and returns the ALE second-order interaction effect of (x_j1, x_j2), denoted f_{j1,j2},ALE(x_j1, x_j2) in Apley (2016), and plots it over the (x_j1, x_j2) plane.
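To make the calculation concrete, the following rough sketch computes an ALE main effect for a single numeric predictor by averaging the local prediction differences within each of K intervals and accumulating them. It is illustrative only and is not the package's internal implementation; ALEPlot itself handles details such as repeated predictor values, factor predictors, and proper centering.

## Illustrative (not internal) sketch of an ALE main effect for numeric predictor j
ale_main_sketch <- function(X, X.model, pred.fun, j, K = 40) {
  ## interval endpoints from the empirical quantiles of X[, j]
  z <- unique(quantile(X[, j], probs = seq(0, 1, length.out = K + 1), type = 1))
  K <- length(z) - 1
  ## assign each observation to the interval containing its value of X[, j]
  cell <- cut(X[, j], breaks = z, include.lowest = TRUE, labels = FALSE)
  delta <- numeric(K)
  for (k in 1:K) {
    rows <- which(cell == k)
    if (length(rows) == 0) next
    X.lo <- X[rows, , drop = FALSE]; X.lo[, j] <- z[k]
    X.hi <- X[rows, , drop = FALSE]; X.hi[, j] <- z[k + 1]
    ## average local effect of moving x_j across the k-th interval
    delta[k] <- mean(pred.fun(X.model, X.hi) - pred.fun(X.model, X.lo))
  }
  f <- c(0, cumsum(delta))                                  # accumulate local effects
  list(x.values = as.numeric(z), f.values = f - mean(f))    # simple centering for display
}

## Example use with the lm objects from the Examples section below:
## ALE.sketch <- ale_main_sketch(DAT[, 2:4], lm.DAT, yhat, j = 1, K = 50)
## plot(ALE.sketch$x.values, ALE.sketch$f.values, type = "l")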
K: The same as the input argument K.

f.values: If length(J) = 1, a vector of ALE main effect values evaluated at the predictor values in x.values. If length(J) = 2, a matrix of ALE second-order interaction effect values evaluated at the grid of points defined by the pair of vectors in x.values.

x.values: For numeric predictors, if length(J) = 1, a vector of the predictor values at which the ALE effect is calculated; if length(J) = 2, a list of two such vectors, one for each of the two predictors in J.
Dan Apley
Apley, D. W. (2016), "Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models," submitted for publication.
See PDPlot for partial dependence (PD) plots.
########################################################################
## A transparent example in which the supervised learning model is a linear
## regression (lm), but we will pretend it is black-box
########################################################################

## Generate some data and fit an lm supervised learning model
N = 500
x1 <- runif(N, min = 0, max = 1)
x2 <- runif(N, min = 0, max = 1)
x3 <- runif(N, min = 0, max = 1)
y = x1 + 2*x2^2 + rnorm(N, 0, 0.1)
DAT = data.frame(y, x1, x2, x3)
lm.DAT = lm(y ~ .^2 + I(x1^2) + I(x2^2) + I(x3^2), DAT)

## Define the predictive function (easy in this case, since lm has a built-in
## predict function that suffices)
yhat <- function(X.model, newdata) as.numeric(predict(X.model, newdata))

## Calculate and plot the ALE main and second-order interaction effects of x1, x2, x3
par(mfrow = c(2,3))
ALE.1 = ALEPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=1, K=50, NA.plot = TRUE)
ALE.2 = ALEPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=2, K=50, NA.plot = TRUE)
ALE.3 = ALEPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=3, K=50, NA.plot = TRUE)
ALE.12 = ALEPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=c(1,2), K=20, NA.plot = TRUE)
ALE.13 = ALEPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=c(1,3), K=20, NA.plot = TRUE)
ALE.23 = ALEPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=c(2,3), K=20, NA.plot = TRUE)

## The following manually recreates the same plots produced by the above ALEPlot calls
par(mfrow = c(2,3))
plot(ALE.1$x.values, ALE.1$f.values, type="l", xlab="x1", ylab="ALE main effect for x1")
plot(ALE.2$x.values, ALE.2$f.values, type="l", xlab="x2", ylab="ALE main effect for x2")
plot(ALE.3$x.values, ALE.3$f.values, type="l", xlab="x3", ylab="ALE main effect for x3")
image(ALE.12$x.values[[1]], ALE.12$x.values[[2]], ALE.12$f.values, xlab = "x1", ylab = "x2")
contour(ALE.12$x.values[[1]], ALE.12$x.values[[2]], ALE.12$f.values, add=TRUE, drawlabels=TRUE)
image(ALE.13$x.values[[1]], ALE.13$x.values[[2]], ALE.13$f.values, xlab = "x1", ylab = "x3")
contour(ALE.13$x.values[[1]], ALE.13$x.values[[2]], ALE.13$f.values, add=TRUE, drawlabels=TRUE)
image(ALE.23$x.values[[1]], ALE.23$x.values[[2]], ALE.23$f.values, xlab = "x2", ylab = "x3")
contour(ALE.23$x.values[[1]], ALE.23$x.values[[2]], ALE.23$f.values, add=TRUE, drawlabels=TRUE)

########################################################################
## A larger example in which the supervised learning model is a neural network (nnet)
########################################################################

## Generate some data and fit an nnet supervised learning model
library(nnet)
N = 5000
x1 <- runif(N, min = 0, max = 1)
x2 <- runif(N, min = 0, max = 1)
x3 <- runif(N, min = 0, max = 1)
y = x1 + 2*x2^2 + (x1-0.5)*(x3-0.5) + rnorm(N, 0, 0.1)
DAT = data.frame(y, x1, x2, x3)
nnet.DAT <- nnet(y ~ ., data=DAT, linout=TRUE, skip=FALSE, size=10, decay=0.01,
                 maxit=1000, trace=FALSE)

## Define the predictive function
yhat <- function(X.model, newdata) as.numeric(predict(X.model, newdata, type="raw"))

## Calculate and plot the ALE main and second-order interaction effects of x1, x2, x3
par(mfrow = c(2,3))
ALE.1 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=1, K=50, NA.plot = TRUE)
ALE.2 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=2, K=50, NA.plot = TRUE)
ALE.3 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=3, K=50, NA.plot = TRUE)
ALE.12 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,2), K=20, NA.plot = TRUE)
ALE.13 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,3), K=20, NA.plot = TRUE)
ALE.23 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(2,3), K=20, NA.plot = TRUE)

########################################################################
## A binary classification example in which the supervised learning model is
## a neural network (nnet), and the log-odds of the predicted class probability
## is the function to be plotted
########################################################################

## Generate some data and fit an nnet supervised learning model
library(nnet)
N = 5000
x1 <- runif(N, min = 0, max = 1)
x2 <- runif(N, min = 0, max = 1)
x3 <- runif(N, min = 0, max = 1)
z = -3.21 + 2.81*x1 + 5.62*x2^2 + 2.81*(x1-0.5)*(x3-0.5)  # true log-odds
p = exp(z)/(1+exp(z))
u = runif(N)
y = u < p
DAT = data.frame(y, x1, x2, x3)
nnet.DAT <- nnet(y ~ ., data=DAT, linout=FALSE, skip=FALSE, size=10, decay=0.05,
                 maxit=1000, trace=FALSE)

## Define the ALE function to be the log-odds of the predicted probability that y = TRUE
yhat <- function(X.model, newdata) {
  p.hat = as.numeric(predict(X.model, newdata, type="raw"))
  log(p.hat/(1-p.hat))
}

## Calculate and plot the ALE main and second-order interaction effects of x1, x2, x3
par(mfrow = c(2,3))
ALE.1 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=1, K=50, NA.plot = TRUE)
ALE.2 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=2, K=50, NA.plot = TRUE)
ALE.3 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=3, K=50, NA.plot = TRUE)
ALE.12 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,2), K=20, NA.plot = TRUE)
ALE.13 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,3), K=20, NA.plot = TRUE)
ALE.23 = ALEPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(2,3), K=20, NA.plot = TRUE)
Computes and plots partial dependence (PD) plots for a fitted supervised learning model. The effects can be either a main effect for an individual predictor (length(J) = 1) or a second-order interaction effect for a pair of predictors (length(J) = 2).
PDPlot(X, X.model, pred.fun, J, K)
X: The data frame of predictor variables to which the supervised learning model was fit. The names of the predictor variables must be the same as when the model was fit. The response variable should not be included in X.

X.model: The fitted supervised learning model object (e.g., a tree, random forest, neural network, etc.), typically an object to which a built-in predict method can be applied.

pred.fun: A user-supplied function that will be used to predict the response for X.model at specified predictor values. It has the same form as for ALEPlot: two arguments (the fitted model object and a data frame newdata of predictor values) and, as output, a numeric vector of predictions with one element per row of newdata.

J: A numeric scalar or two-length vector of indices of the predictors for which the PD plot will be calculated.

K: A numeric scalar that represents the number of discrete points at which the PD plot will be calculated. If length(J) = 2, K points are used for each of the two predictors.
This function calculates and plots the partial dependence (PD) plots first introduced in Friedman (2001). See the Apley (2016) reference paper listed below for details. For J = j (i.e., if the index for a single predictor is specified), the function calculates and returns the PD main effect of x_j, denoted f_j,PD(x_j) in Apley (2016), and plots it against x_j. For J = c(j1,j2) (i.e., if the indices for a pair of predictors are specified), the function calculates and returns the PD second-order interaction effect of (x_j1, x_j2), denoted f_{j1,j2},PD(x_j1, x_j2) in Apley (2016), and plots it over the (x_j1, x_j2) plane.
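For comparison with the Details above, the following rough sketch computes a PD main effect for a single numeric predictor: for each grid value, predictor j is fixed at that value for every observation and the predictions are averaged over the data. It is illustrative only and is not the package's internal implementation.

## Illustrative (not internal) sketch of a PD main effect for numeric predictor j
pd_main_sketch <- function(X, X.model, pred.fun, j, K = 40) {
  ## K evaluation points spread over the empirical distribution of X[, j]
  x.grid <- unique(quantile(X[, j], probs = seq(0, 1, length.out = K), type = 1))
  f <- sapply(x.grid, function(x) {
    X.tmp <- X
    X.tmp[, j] <- x                 # fix predictor j at x for every observation
    mean(pred.fun(X.model, X.tmp))  # average prediction over the data
  })
  list(x.values = as.numeric(x.grid), f.values = f - mean(f))  # centered for display
}

## Example use with the lm objects from the Examples section below:
## PD.sketch <- pd_main_sketch(DAT[, 2:4], lm.DAT, yhat, j = 1, K = 50)
## plot(PD.sketch$x.values, PD.sketch$f.values, type = "l")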
f.values: If length(J) = 1, a vector of PD main effect values evaluated at the predictor values in x.values. If length(J) = 2, a matrix of PD second-order interaction effect values evaluated at the grid of points defined by the pair of vectors in x.values.

x.values: For numeric predictors, if length(J) = 1, a vector of the predictor values at which the PD effect is calculated; if length(J) = 2, a list of two such vectors, one for each of the two predictors in J.
Dan Apley
Friedman, J. H., (2001), "Greedy function approximation: A gradient boosting machine," Annals of Statistics, 29(5), pp. 1189-1232.
Apley, D. W. (2016), "Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models," submitted for publication.
See ALEPlot for accumulated local effects (ALE) plots.
########################################################################
## A transparent example in which the supervised learning model is a linear
## regression (lm), but we will pretend it is black-box
########################################################################

## Generate some data and fit an lm supervised learning model
N = 500
x1 <- runif(N, min = 0, max = 1)
x2 <- runif(N, min = 0, max = 1)
x3 <- runif(N, min = 0, max = 1)
y = x1 + 2*x2^2 + rnorm(N, 0, 0.1)
DAT = data.frame(y, x1, x2, x3)
lm.DAT = lm(y ~ .^2 + I(x1^2) + I(x2^2) + I(x3^2), DAT)

## Define the predictive function (easy in this case, since lm has
## a built-in predict function that suffices)
yhat <- function(X.model, newdata) as.numeric(predict(X.model, newdata))

## Calculate and plot the PD main effects and second-order interaction effects of x1, x2, x3
par(mfrow = c(2,3))
PD.1 = PDPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=1, K=50)
PD.2 = PDPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=2, K=50)
PD.3 = PDPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=3, K=50)
PD.12 = PDPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=c(1,2), K=30)
PD.13 = PDPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=c(1,3), K=30)
PD.23 = PDPlot(DAT[,2:4], lm.DAT, pred.fun=yhat, J=c(2,3), K=30)

## The following manually recreates the same plots produced by the above PDPlot calls
par(mfrow = c(2,3))
plot(PD.1$x.values, PD.1$f.values, type="l", xlab="x1", ylab="PD main effect for x1")
plot(PD.2$x.values, PD.2$f.values, type="l", xlab="x2", ylab="PD main effect for x2")
plot(PD.3$x.values, PD.3$f.values, type="l", xlab="x3", ylab="PD main effect for x3")
image(PD.12$x.values[[1]], PD.12$x.values[[2]], PD.12$f.values, xlab = "x1", ylab = "x2")
contour(PD.12$x.values[[1]], PD.12$x.values[[2]], PD.12$f.values, add=TRUE, drawlabels=TRUE)
image(PD.13$x.values[[1]], PD.13$x.values[[2]], PD.13$f.values, xlab = "x1", ylab = "x3")
contour(PD.13$x.values[[1]], PD.13$x.values[[2]], PD.13$f.values, add=TRUE, drawlabels=TRUE)
image(PD.23$x.values[[1]], PD.23$x.values[[2]], PD.23$f.values, xlab = "x2", ylab = "x3")
contour(PD.23$x.values[[1]], PD.23$x.values[[2]], PD.23$f.values, add=TRUE, drawlabels=TRUE)

########################################################################
## A larger example in which the supervised learning model is a neural network (nnet)
########################################################################

## Generate some data and fit an nnet supervised learning model
library(nnet)
N = 5000
x1 <- runif(N, min = 0, max = 1)
x2 <- runif(N, min = 0, max = 1)
x3 <- runif(N, min = 0, max = 1)
y = x1 + 2*x2^2 + (x1-0.5)*(x3-0.5) + rnorm(N, 0, 0.1)
DAT = data.frame(y, x1, x2, x3)
nnet.DAT <- nnet(y ~ ., data=DAT, linout=TRUE, skip=FALSE, size=10, decay=0.01,
                 maxit=1000, trace=FALSE)

## Define the predictive function
yhat <- function(X.model, newdata) as.numeric(predict(X.model, newdata, type="raw"))

## Calculate and plot the PD main and second-order interaction effects of x1, x2, x3
par(mfrow = c(2,3))
PD.1 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=1, K=50)
PD.2 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=2, K=50)
PD.3 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=3, K=50)
PD.12 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,2), K=20)
PD.13 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,3), K=20)
PD.23 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(2,3), K=20)

########################################################################
## A binary classification example in which the supervised learning model is
## a neural network (nnet), and the log-odds of the predicted class
## probability is the function to be plotted
########################################################################

## Generate some data and fit an nnet supervised learning model
library(nnet)
N = 5000
x1 <- runif(N, min = 0, max = 1)
x2 <- runif(N, min = 0, max = 1)
x3 <- runif(N, min = 0, max = 1)
z = -3.21 + 2.81*x1 + 5.62*x2^2 + 2.81*(x1-0.5)*(x3-0.5)  # true log-odds
p = exp(z)/(1+exp(z))
u = runif(N)
y = u < p
DAT = data.frame(y, x1, x2, x3)
nnet.DAT <- nnet(y ~ ., data=DAT, linout=FALSE, skip=FALSE, size=10, decay=0.05,
                 maxit=1000, trace=FALSE)

## Define the PD function to be the log-odds of the predicted probability that y = TRUE
yhat <- function(X.model, newdata) {
  p.hat = as.numeric(predict(X.model, newdata, type="raw"))
  log(p.hat/(1-p.hat))
}

## Calculate and plot the PD main and second-order interaction effects of x1, x2, x3
par(mfrow = c(2,3))
PD.1 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=1, K=50)
PD.2 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=2, K=50)
PD.3 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=3, K=50)
PD.12 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,2), K=20)
PD.13 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(1,3), K=20)
PD.23 = PDPlot(DAT[,2:4], nnet.DAT, pred.fun=yhat, J=c(2,3), K=20)