Title: | Mixed Poisson Regression for Overdispersed Count Data |
---|---|
Description: | Fits mixed Poisson regression models (Poisson-Inverse Gaussian or Negative-Binomial) on data sets with response variables being count data. The models can have varying precision parameter, where a linear regression structure (through a link function) is assumed to hold on the precision parameter. The Expectation-Maximization algorithm for both these models (Poisson Inverse Gaussian and Negative Binomial) is an important contribution of this package. Another important feature of this package is the set of functions to perform global and local influence analysis. See Barreto-Souza and Simas (2016) <doi:10.1007/s11222-015-9601-6> for further details. |
Authors: | Alexandre B. Simas [aut, cre] |
Maintainer: | Alexandre B. Simas <[email protected]> |
License: | GPL-2 | file LICENSE |
Version: | 1.0.0 |
Built: | 2025-03-11 04:50:50 UTC |
Source: | https://github.com/vpnsctl/mixpoissonreg |
School administrators study the attendance behavior of high school juniors at two schools.
data("Attendance")
data("Attendance")
Data frame containing 314 observations on 4 variables.
number of days absent.
gender of the student.
three-level factor indicating the type of instructional program in which the student is enrolled.
standardized math score.
School administrators study the attendance behavior of high school juniors at two schools. Predictors of the number of days of absence include the type of program in which
the student is enrolled and a standardized test in math. Attendance data on 314 high school juniors from two urban high schools.
The response variable of interest is days absent, daysabs
. The variable math
is the standardized math score for each student.
The variable prog
is a three-level factor indicating the type of instructional program in which the student is enrolled.
Data can be obtained from Introduction to Statistical Modeling Github Repository. See also Barreto-Souza and Simas (2020) for further details.
Hughes, M. and Fisher, T. (2020) Introduction to Statistical Modeling.
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, model = "PIG") summary(daysabs_fit) daysabs_fit_red <- mixpoissonreg(daysabs ~ gender + math + prog | prog, data = Attendance, model = "PIG") summary(daysabs_fit_red) # Plot of the fit with all precision covariates plot(daysabs_fit) local_influence_plot(daysabs_fit) # Plot, using ggplot2, of the reduced fit autoplot(daysabs_fit_red) local_influence_autoplot(daysabs_fit_red)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, model = "PIG") summary(daysabs_fit) daysabs_fit_red <- mixpoissonreg(daysabs ~ gender + math + prog | prog, data = Attendance, model = "PIG") summary(daysabs_fit_red) # Plot of the fit with all precision covariates plot(daysabs_fit) local_influence_plot(daysabs_fit) # Plot, using ggplot2, of the reduced fit autoplot(daysabs_fit_red) local_influence_autoplot(daysabs_fit_red)
mixpoissonreg
objectAugment accepts a model object and a dataset and adds information about each observation in the dataset. It includes
predicted values in the .fitted
column, residuals in the .resid
column, and standard errors for the fitted values in a .se.fit
column, if
the type of prediction is 'link'. New columns always begin with a . prefix to avoid overwriting columns in the original dataset.
## S3 method for class 'mixpoissonreg' augment( x, data = stats::model.frame(x), newdata = NULL, type.predict = c("response", "link", "precision", "variance"), type.residuals = c("pearson", "score"), se_fit = FALSE, conf_int = TRUE, pred_int = FALSE, ... )
## S3 method for class 'mixpoissonreg' augment( x, data = stats::model.frame(x), newdata = NULL, type.predict = c("response", "link", "precision", "variance"), type.residuals = c("pearson", "score"), se_fit = FALSE, conf_int = TRUE, pred_int = FALSE, ... )
x |
A |
data |
A |
newdata |
A |
type.predict |
Type of prediction. The options are 'response', 'link', 'precision' and 'variance'. The default is "response". |
type.residuals |
Type of residuals. The options are 'pearson' and 'score'. The default is 'pearson'. |
se_fit |
Logical indicating whether or not a .se.fit column should be added to the augmented output. If TRUE, it only returns a non-NA value if type of prediction is 'link'. |
conf_int |
Logical indicating whether or not confidence intervals for the fitted variable with type chosen from type.predict should be built. The available type options are 'response' and 'link'. |
pred_int |
Logical indicating whether or not prediction intervals for future observations should be built. It only works with type.predict = 'response'. The arguments
|
... |
Additional arguments. Possible additional arguments are |
A tibble::tibble()
with columns:
.cooksd
Cook's distance.
.fitted
Fitted or predicted value.
.fittedlwrconf
Lower bound of the confidence interval, if conf_int = TRUE
.fitteduprconf
Upper bound of the confidence interval, if conf_int = TRUE
.fittedlwrpred
Lower bound of the prediction interval, if pred_int = TRUE
.fitteduprpred
Upper bound of the prediction interval, if pred_int = TRUE
.hat
Diagonal of the hat matrix.
.resid
The chosen residual.
.resfit
The chosen residual of the fitted object.
.se.fit
Standard errors of fitted values, if se_fit = TRUE.
.gencooksd
Generalized Cook's distance.
.lwrenv
Lower bound of the simulated envelope, if the fitted mixpoissonreg
object, was fitted with envelopes > 0.
.mdnenv
Median of the simulated envelope, if the fitted mixpoissonreg
object, was fitted with envelopes > 0.
.uprenv
Upper bound of the simulated envelope, if the fitted mixpoissonreg
object, was fitted with envelopes > 0.
glance.mixpoissonreg
, tidy.mixpoissonreg
, tidy_local_influence.mixpoissonreg
,
autoplot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
mixpoissonreg
ObjectsThis function provides ggplot2-based counterparts to the plots produced by plot.mixpoissonreg
.
Currently there are six plots available. They contain residual analysis and global influence diagnostics. The plots are selectable by
the which
argument. The plots are: Residuals vs. obs. numbers; Normal Q-Q plots, which may contain simulated envelopes, if the fitted object
has simulated envelopes; Cook's distances vs. obs. numbers; Generalized Cook's distances vs. obs. numbers; Cook's distances vs. Generalized Cook's distances;
Response variables vs. fitted means. By default, the first two plots and the last two plots are provided.
If both ncol and nrow are NULL
, the plots will be placed one at a time. To place multiple plots, set the values for nrow
or ncol
.
## S3 method for class 'mixpoissonreg' autoplot( object, which = c(1, 2, 5, 6), title = list("Residuals vs Obs. number", "Normal Q-Q", "Cook's distance", "Generalized Cook's distance", "Cook's dist vs Generalized Cook's dist", "Response vs Fitted means"), title.bold = FALSE, title.size = NULL, title.colour = NULL, label.repel = TRUE, x.axis.col = NULL, y.axis.col = NULL, x.axis.size = NULL, y.axis.size = NULL, cook.plot.type = "linerange", cook.plot.pointshape = NULL, nrow = NULL, ncol = NULL, qqline = TRUE, ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), include.modeltype = TRUE, include.residualtype = FALSE, sub.caption = NULL, sub.caption.col = NULL, sub.caption.size = NULL, sub.caption.face = NULL, sub.caption.hjust = 0.5, env_alpha = 0.5, env_fill = "grey70", gpar_sub.caption = list(fontface = "bold"), colour = "#444444", size = NULL, linetype = NULL, alpha = NULL, fill = NULL, shape = NULL, label = TRUE, label.label = NULL, label.colour = "#000000", label.alpha = NULL, label.size = NULL, label.angle = NULL, label.family = NULL, label.fontface = NULL, label.lineheight = NULL, label.hjust = NULL, label.vjust = NULL, label.n = 3, ad.colour = "#888888", ad.linetype = "dashed", ad.size = 0.2, ... )
## S3 method for class 'mixpoissonreg' autoplot( object, which = c(1, 2, 5, 6), title = list("Residuals vs Obs. number", "Normal Q-Q", "Cook's distance", "Generalized Cook's distance", "Cook's dist vs Generalized Cook's dist", "Response vs Fitted means"), title.bold = FALSE, title.size = NULL, title.colour = NULL, label.repel = TRUE, x.axis.col = NULL, y.axis.col = NULL, x.axis.size = NULL, y.axis.size = NULL, cook.plot.type = "linerange", cook.plot.pointshape = NULL, nrow = NULL, ncol = NULL, qqline = TRUE, ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), include.modeltype = TRUE, include.residualtype = FALSE, sub.caption = NULL, sub.caption.col = NULL, sub.caption.size = NULL, sub.caption.face = NULL, sub.caption.hjust = 0.5, env_alpha = 0.5, env_fill = "grey70", gpar_sub.caption = list(fontface = "bold"), colour = "#444444", size = NULL, linetype = NULL, alpha = NULL, fill = NULL, shape = NULL, label = TRUE, label.label = NULL, label.colour = "#000000", label.alpha = NULL, label.size = NULL, label.angle = NULL, label.family = NULL, label.fontface = NULL, label.lineheight = NULL, label.hjust = NULL, label.vjust = NULL, label.n = 3, ad.colour = "#888888", ad.linetype = "dashed", ad.size = 0.2, ... )
object |
A |
which |
a list or vector indicating which plots should be displayed. If a subset of the plots is required, specify a subset of the numbers 1:6,
see title below for the different kinds. In
plot number 2, 'Normal Q-Q', if the |
title |
titles to appear above the plots; character vector or list of valid graphics annotations. Can be set to "" to suppress all titles. |
title.bold |
logical indicating whether the titles should be bold. The default is FALSE. |
title.size |
numerical indicating the size of the titles. |
title.colour |
title colour. |
label.repel |
Logical flag indicating whether to use ggrepel to place the labels. |
x.axis.col |
colour of the x axis title. |
y.axis.col |
colour of the y axis title. |
x.axis.size |
size of the x axis title. |
y.axis.size |
size of the y axis title. |
cook.plot.type |
character indicating the type of plot for Cook's distance and generalized Cook's distance. Default is "linerange". The options are "linerange" and "points". |
cook.plot.pointshape |
the shape of points if "cook.plot.type" is set to "points". |
nrow |
Number of facet/subplot rows. If both |
ncol |
Number of facet/subplot columns. If both |
qqline |
logical; if |
ask |
logical; if |
include.modeltype |
logical. Indicates whether the model type ('NB' or 'PIG') should be displayed on the titles. |
include.residualtype |
local. Indicates whether the name of the residual ('Pearson' or 'Score') should be displayed on the title of plot 1 (Residuals vs. Index). |
sub.caption |
common title-above the figures if there are more than one. If NULL, as by default, a possible abbreviated version of |
sub.caption.col |
color of subcaption (when one figure at a time). |
sub.caption.size |
size of subcaption (when one figure at a time). |
sub.caption.face |
font face for subcaption, options are: "plain", "bold", "italic" and "bold.italic". |
sub.caption.hjust |
indicates the position of the subcaption (when one figure at a time). The default is 0.5, which indicates that the subcaption is centered, a value 0 places the subcaption at the left side of the plot whereas a value of 1 places the subcaption at the right side of the plot. |
env_alpha |
alpha of the envelope region (when the fitted model has envelopes) |
env_fill |
the colour of the filling in the envelopes. |
gpar_sub.caption |
list of gpar parameters to be used as common title in the case of multiple plots. The title will be given in sub.caption argument. See
the help of |
colour |
line colour. |
size |
point size. |
linetype |
line type. |
alpha |
alpha of the plot. |
fill |
fill colour. |
shape |
point shape. |
label |
Logical value whether to display labels. |
label.label |
vector of labels. If |
label.colour |
Colour for text labels. |
label.alpha |
Alpha for text labels. |
label.size |
Size for text labels. |
label.angle |
Angle for text labels. |
label.family |
Font family for text labels. |
label.fontface |
Fontface for text labels. |
label.lineheight |
Lineheight for text labels. |
label.hjust |
Horizontal adjustment for text labels. |
label.vjust |
Vertical adjustment for text labels. |
label.n |
Number of points to be laeled in each plot, starting with the most extreme. |
ad.colour |
Line colour for additional lines. |
ad.linetype |
Line type for additional lines. |
ad.size |
Fill colour for additional lines. |
... |
other arguments passed to methods. |
Based on autoplot.lm
from the excellent ggfortify package, ggfortify.
sub.caption - by default the function call - is shown as a subtitle (under the x-axis title) on each plot when plots are on separate pages, or as a subtitle in the outer margin when there are multiple plots per page.
Called for its side effects.
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) autoplot(daysabs_fit, which = 1:6) autoplot(daysabs_fit, nrow = 2) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) autoplot(daysabs_fit_ml, which = 2) daysabs_prog <- mixpoissonregML(daysabs ~ prog, data = Attendance) autoplot(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) autoplot(daysabs_fit, which = 1:6) autoplot(daysabs_fit, nrow = 2) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) autoplot(daysabs_fit_ml, which = 2) daysabs_prog <- mixpoissonregML(daysabs ~ prog, data = Attendance) autoplot(daysabs_prog)
mixpoissonreg
Objects.Extract model coefficients of fitted mixed Poisson regression models. The parameters arguments allows one to chose if all coefficients should be extracted,
with parameters = 'all'
; if the coefficients of the mean-related parameters should be extracted, with parameters = 'mean'
; if the coefficients of the
precision-related parameters should be extracted, with parameters = 'precision'
.
## S3 method for class 'mixpoissonreg' coef(object, parameters = c("all", "mean", "precision"), ...)
## S3 method for class 'mixpoissonreg' coef(object, parameters = c("all", "mean", "precision"), ...)
object |
object of class "mixpoissonreg" containing results from the fitted model. |
parameters |
a string to determine which coefficients should be extracted: 'all' extracts all coefficients, 'mean' extracts the coefficients of the mean parameters and 'precision' extracts coefficients of the precision parameters. |
... |
further arguments passed to or from other methods. |
A vector containing the coefficients of a mixpoissonreg object.
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) coef(daysabs_fit) coef(daysabs_fit, parameters = "precision") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) coef(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) coef(daysabs_fit) coef(daysabs_fit, parameters = "precision") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) coef(daysabs_prog)
mixpoissonreg
ObjectsFunction providing the fitted means, linear predictors, precisions or variances for mixed Poisson regression models.
## S3 method for class 'mixpoissonreg' fitted(object, type = c("response", "link", "precision", "variance"), ...)
## S3 method for class 'mixpoissonreg' fitted(object, type = c("response", "link", "precision", "variance"), ...)
object |
object of class "mixpoissonreg" containing results from the fitted model. |
type |
the type of variable to get the fitted values. The default is the "response" type, which provided the estimated values for the means. The type "link" provides the estimates for the linear predictor of the mean. The type "precision" provides estimates for the precision parameters whereas the type "variance" provides estimates for the variances. |
... |
Currently not used. |
A vector containing the fitted values of a mixpoissonreg object.
predict.mixpoissonreg
, summary.mixpoissonreg
,
coef.mixpoissonreg
, vcov.mixpoissonreg
,
plot.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) fitted(daysabs_fit) fitted(daysabs_fit, type = "precision") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) fitted(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) fitted(daysabs_fit) fitted(daysabs_fit, type = "precision") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) fitted(daysabs_prog)
mixpoissonreg
objectGlance accepts a mixpoissonreg
object and returns a
tibble::tibble()
with exactly one row of model summaries.
The summaries are Efron's pseudo-, degrees of freedom, AIC, BIC, log-likelihood,
the type of model used in the fit ('NB' or 'PIG'), the total number of observations and the estimation method.
## S3 method for class 'mixpoissonreg' glance(x, ...)
## S3 method for class 'mixpoissonreg' glance(x, ...)
x |
A |
... |
Additional arguments. Currently not used. |
A tibble::tibble()
with exactly one row and columns:
efron.pseudo.r2
Efron's pseudo-, that is, the squared correlation between the fitted values and the response values.
df.null
Degrees of freedom used by the null model.
logLik
The log-likelihood of the model.
AIC
Akaike's Information Criterion for the model.
BIC
Bayesian Information Criterion for the model.
df.residual
Residual degrees of freedom.
nobs
Number of observations used.
model.type
Type of model fitted, "NB" or "PIG".
est.method
The estimation method of the fitted model, "EM" or "ML".
augment.mixpoissonreg
, tidy.mixpoissonreg
, tidy_local_influence.mixpoissonreg
,
autoplot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
These functions provides global influence diagnostic quantities such as Cook's distance, hat values, generalized Cook's distance (through argument
on cooks.distance.mixpoissonreg
function), likelihood displacement (through argument
on cooks.distance.mixpoissonreg
function) and Q-displacement (through argument
on cooks.distance.mixpoissonreg
function).
## S3 method for class 'mixpoissonreg' hatvalues(model, parameters = c("mean", "precision"), ...) ## S3 method for class 'mixpoissonreg' cooks.distance( model, type = c("CD", "GCD", "GCDmean", "GCDprecision", "LD", "QD"), hat = c("mean", "precision"), ... ) ## S3 method for class 'mixpoissonreg' influence(model, do.coef = TRUE, ...)
## S3 method for class 'mixpoissonreg' hatvalues(model, parameters = c("mean", "precision"), ...) ## S3 method for class 'mixpoissonreg' cooks.distance( model, type = c("CD", "GCD", "GCDmean", "GCDprecision", "LD", "QD"), hat = c("mean", "precision"), ... ) ## S3 method for class 'mixpoissonreg' influence(model, do.coef = TRUE, ...)
model |
a |
parameters |
the parameter to which the hat values will be computed. The options are 'mean' and 'precision'. The default is 'mean'. For hatvalues with respect to the mean the model must be fitted with 'x' set to TRUE, and for hatvalues with respect to the precision the model must be fitted with 'w' set to TRUE. |
... |
Currently not used. |
type |
the type of Cook's distance to be used. The options are 'CD', the standard Cook's distance; 'GCD', the generalized Cook's distance with respect to all parameters, Zhu et al. (2001); 'GCDmean', the generalized Cook's distance with respect to the mean parameters; 'GCDprecision', the generalized Cook's distance with respect to the precision parameters; 'LD', the likelihood displacement (also known as likelihood distance), see Cook and Weisberg (1982); 'QD', the Q-displacement, see Zhu et al. (2001). See 'details' for more informations. For 'GCD', 'GCDmean', 'LD' and 'QD', the model must be fitted with 'x' set to TRUE, and for 'GCD', 'GCDprecision', 'LD' and 'QD', the model must be fitted with 'w' set to TRUE. For 'CD', if 'hat' is set to 'mean', the model must be fitted with 'x' set to TRUE, whereas if 'hat' is set to 'precision', the model must be fitted with 'w' set to TRUE. |
hat |
hat values |
do.coef |
logical indicating if the the approximation to the change of coefficients values after case removal are desired. The model must be fitted with x = TRUE. See details for further explanations. |
For hat values of mixed Poisson regression models, we follow Zhu et al. (2001) to consider the negative of the hessian of the Q-function as weight matrix, and follow Pregibon (1981) to define the 'hat' matrix with respect to this weight matrix. We can consider the hessian of the Q-function with respect to mean-related parameters, which is the default. We can also consider the hessian of the Q-function with respect to the precision-related parameters to give rise to hat values related to the precision parameters.
The Generalized Cook's distance and Q-displacement for EM-based models were defined in Zhu et al. (2001) and computed for mixed Poisson regression models in Barreto-Souza and Simas (2016). We implemented first-order approximation to these quantities to make it computationally feasible. These first-order approximations are available in Barreto-Souza and Simas (2016). We also provide versions of generalized Cook's distance for mean-related or precision-related parameters, whose details can be found in Barreto-Souza and Simas (2016).
In the influence method we provide a 'do.coef' argument that computes first-order approximations to the impact of removal of each case to each parameter, in the same spirit as the 'do.coeff' argument in 'influence.lm'.
The method influence.mixpoissonreg
returns a list containing
hat.mean, a vector containing the hat values with respect to the mean, hat.precision,
a vector containing the hat values with respect to the precision, coefficients.mean (if do.coef=TRUE
), a matrix
containing the first
order approximations for the mean-related coefficients after the removal
of each observation, coefficients.precision (if do.coef=TRUE
), a matrix containing the first
order approximations for the precision-related coefficients after the removal
of each observation, pear.res, a vector containing the Pearson residuals,
score.res, a vector containing the score residuals.
The cooks.distance.mixpoissonreg returns a vector containing the Cook's distances.
The hatvalues.mixpoissonreg returns a vector containing the hat values.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas, 2016)
Cook, D.R. and Weisberg, S. (1982) Residuals and Influence in Regression. (New York: Chapman and Hall, 1982)
Pregibon, D. (1981) Logistic Regression Diagnostics. Ann. Stat. 9, 705-724.
Zhu, H.T., Lee, S.Y., Wei, B.C., Zhu, J. (2001) Case-deletion measures formodels with incomplete data. Biometrika, 88, 727–737.
Local Influence Diagnostics
local_influence(model, ...) local_influence_plot(model, ...)
local_influence(model, ...) local_influence_plot(model, ...)
model |
an object for which the local influence is desired |
... |
further arguments passed to or from other methods. |
local_influence
is a generic function to return local influence diagnostics under different perturbation schemes and different directions.
local_influence_plot
is a generic function to provide friendly plots of such diagnostics.
Local influence diagnostics were first introduced by Cook (1986), where several perturbation schemes were introduced and normal curvatures were obtained. Poon and Poon (1999)
introduced the conformal normal curvature, which has nice properties and takes values on the unit interval . Zhu and Lee (2001) following Cook (1986) and Poon and Poon (1999)
introduced normal and conformal normal curvatures for EM-based models.
The local_influence method returns a list containing the resulting perturbation schemes as elements. The local_influence_plot is called for its side effects.
Cook, R. D. (1986) Assessment of Local Influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, pp.133-169. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1986.tb01398.x
Poon, W.-Y. and Poon, Y.S. (1999) Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 61, pp.51-61. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00162
Zhu, H.-T. and Lee, S.-Y. (2001) Local influence for incomplete data models. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 63, pp.111-126. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00279
local_influence.mixpoissonreg
, local_influence_plot.mixpoissonreg
,
local_influence_autoplot.mixpoissonreg
mixpoissonreg
ObjectsFunction to provide customizable ggplot2-based plots of local influence diagnostics.
## S3 method for class 'mixpoissonreg' local_influence_autoplot( model, which = c(1, 2, 3, 4), title = list("Case Weights Perturbation", "Hidden Variable Perturbation", "Mean Explanatory Perturbation", "Precision Explanatory Perturbation", "Simultaneous Explanatory Perturbation"), title.size = NULL, title.bold = FALSE, title.colour = NULL, x.axis.col = NULL, y.axis.col = NULL, x.axis.size = NULL, y.axis.size = NULL, type.plot = "linerange", curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, label.repel = TRUE, nrow = NULL, ncol = NULL, ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), include.modeltype = TRUE, sub.caption = NULL, sub.caption.col = NULL, sub.caption.size = NULL, sub.caption.face = NULL, sub.caption.hjust = 0.5, gpar_sub.caption = list(fontface = "bold"), detect.influential = TRUE, n.influential = 5, draw.benchmark = FALSE, colour = "#444444", size = NULL, linetype = NULL, alpha = NULL, fill = NULL, shape = NULL, label = TRUE, label.label = NULL, label.colour = "#000000", label.alpha = NULL, label.size = NULL, label.angle = NULL, label.family = NULL, label.fontface = NULL, label.lineheight = NULL, label.hjust = NULL, label.vjust = NULL, ad.colour = "#888888", ad.linetype = "dashed", ad.size = 0.2, ... )
## S3 method for class 'mixpoissonreg' local_influence_autoplot( model, which = c(1, 2, 3, 4), title = list("Case Weights Perturbation", "Hidden Variable Perturbation", "Mean Explanatory Perturbation", "Precision Explanatory Perturbation", "Simultaneous Explanatory Perturbation"), title.size = NULL, title.bold = FALSE, title.colour = NULL, x.axis.col = NULL, y.axis.col = NULL, x.axis.size = NULL, y.axis.size = NULL, type.plot = "linerange", curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, label.repel = TRUE, nrow = NULL, ncol = NULL, ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), include.modeltype = TRUE, sub.caption = NULL, sub.caption.col = NULL, sub.caption.size = NULL, sub.caption.face = NULL, sub.caption.hjust = 0.5, gpar_sub.caption = list(fontface = "bold"), detect.influential = TRUE, n.influential = 5, draw.benchmark = FALSE, colour = "#444444", size = NULL, linetype = NULL, alpha = NULL, fill = NULL, shape = NULL, label = TRUE, label.label = NULL, label.colour = "#000000", label.alpha = NULL, label.size = NULL, label.angle = NULL, label.family = NULL, label.fontface = NULL, label.lineheight = NULL, label.hjust = NULL, label.vjust = NULL, ad.colour = "#888888", ad.linetype = "dashed", ad.size = 0.2, ... )
model |
A |
which |
a list or vector indicating which plots should be displayed. If a subset of the plots is required, specify a subset of the numbers 1:5, see caption below (and the 'Details') for the different kinds. |
title |
titles to appear above the plots; character vector or list of valid graphics annotations. Can be set to "" to suppress all titles. |
title.size |
numerical indicating the size of the titles. |
title.bold |
logical indicating whether the titles should be bold. The default is FALSE. |
title.colour |
title colour. |
x.axis.col |
colour of the x axis title. |
y.axis.col |
colour of the y axis title. |
x.axis.size |
size of the x axis title. |
y.axis.size |
size of the y axis title. |
type.plot |
a character indicating the type of the plots. The default is "linerange". The options are "linerange" and "points". |
curvature |
the curvature to be returned, 'conformal' for the conformal normal curvature (see Zhu and Lee, 2001 and Poon and Poon, 1999) or 'normal' (see Zhu and Lee, 2001 and Cook, 1986). |
direction |
the 'max.eigen' returns the eigenvector associated to the largest eigenvalue of the perturbation matrix. The 'canonical' considers the curvatures under the canonical directions, which is known as "total local curvature" (see Lesaffre and Verbeke, 1998). For conformal normal curvatures both of them coincide. The default is 'canonical'. |
parameters |
the parameter to which the local influence will be computed. The options are 'all', 'mean' and 'precision'. This argument affects the 'case_weights' and 'hidden_variable' perturbation schemes. The default is 'all'. |
mean.covariates |
a list or vector of characters containing the mean-explanatory variables to be used in the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all mean-related covariates. The default is NULL. |
precision.covariates |
a list or vector of characters containing the precision-explanatory variables to be used in the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all precision-related covariates. The default is NULL. |
label.repel |
Logical flag indicating whether to use ggrepel to place the labels. |
nrow |
Number of facet/subplot rows. If both |
ncol |
Number of facet/subplot columns. If both |
ask |
logical; if |
include.modeltype |
logical. Indicates whether the model type ('NB' or 'PIG') should be displayed on the captions. |
sub.caption |
common title-above the figures if there are more than one. If NULL, as by default, a possible abbreviated version of deparse(x$call) is used. |
sub.caption.col |
color of subcaption (when one figure at a time). |
sub.caption.size |
size of subcaption (when one figure at a time). |
sub.caption.face |
font face for subcaption, options are: "plain", "bold", "italic" and "bold.italic". |
sub.caption.hjust |
indicates the position of the subcaption (when one figure at a time). The default is 0.5, which indicates that the subcaption is centered, a value 0 places the subcaption at the left side of the plot whereas a value of 1 places the subcaption at the right side of the plot. |
gpar_sub.caption |
list of gpar parameters to be used as common title in the case of multiple plots. The title will be given in sub.caption argument. See
the help of |
detect.influential |
logical. Indicates whether the benchmark should be used to detect influential observations and identify them on the plot. If there is no benchmark available, the top 'n.influential' observations will be identified in the plot by their indexes. |
n.influential |
interger. The maximum number of influential observations to be identified on the plot. |
draw.benchmark |
logical. Indicates whether a horizontal line identifying the benchmark should be drawn. |
colour |
line colour. |
size |
point size. |
linetype |
line type. |
alpha |
alpha of the plot. |
fill |
fill colour. |
shape |
point shape. |
label |
Logical value whether to display labels. |
label.label |
vector of labels. If |
label.colour |
Colour for text labels. |
label.alpha |
Alpha for text labels. |
label.size |
Size for text labels. |
label.angle |
Angle for text labels. |
label.family |
Font family for text labels. |
label.fontface |
Fontface for text labels. |
label.lineheight |
Lineheight for text labels. |
label.hjust |
Horizontal adjustment for text labels. |
label.vjust |
Vertical adjustment for text labels. |
ad.colour |
Line colour for additional lines. |
ad.linetype |
Line type for additional lines. |
ad.size |
Fill colour for additional lines. |
... |
Currently not used. |
Called for its side effects.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
Cook, R. D. (1986) Assessment of Local Influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, pp.133-169. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1986.tb01398.x
Lesaffre, E. and Verbeke, G. (1998) Local Influence in Linear Mixed Models. Biometrics, 54, pp. 570-582.
Poon, W.-Y. and Poon, Y.S. (2002) Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 61, pp.51-61. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00162
Zhu, H.-T. and Lee, S.-Y. (2001) Local influence for incomplete data models. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 63, pp.111-126. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00279
glance.mixpoissonreg
, augment.mixpoissonreg
, tidy.mixpoissonreg
, autoplot.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) local_influence_autoplot(daysabs_fit) local_influence_autoplot(daysabs_fit, nrow = 2) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) local_influence_autoplot(daysabs_fit_ml, which = 2) daysabs_prog <- mixpoissonreg(daysabs ~ prog | prog, data = Attendance) local_influence_autoplot(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) local_influence_autoplot(daysabs_fit) local_influence_autoplot(daysabs_fit, nrow = 2) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) local_influence_autoplot(daysabs_fit_ml, which = 2) daysabs_prog <- mixpoissonreg(daysabs ~ prog | prog, data = Attendance) local_influence_autoplot(daysabs_prog)
Local influence plots for mixed Poisson regression models. Currently the conformal normal and normal curvatures are available
under several perturbation schemes. The default is the conformal normal curvature since
it takes values on and other nice properties (see Zhu and Lee, 2001 and Poon and Poon, 1999 for further details).
## S3 method for class 'mixpoissonreg' local_influence_plot( model, which = c(1, 2, 3, 4), caption = list("Case Weights Perturbation", "Hidden Variable Perturbation", "Mean Explanatory Perturbation", "Precision Explanatory Perturbation", "Simultaneous Explanatory Perturbation"), sub.caption = NULL, detect.influential = TRUE, n.influential = 5, draw.benchmark = FALSE, lty.benchmark = 2, type_plot = "h", curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, main = "", ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), labels.id = names(stats::residuals(model)), cex.id = 0.75, cex.oma.main = 1.25, cex.caption = 1, include.modeltype = TRUE, ... )
## S3 method for class 'mixpoissonreg' local_influence_plot( model, which = c(1, 2, 3, 4), caption = list("Case Weights Perturbation", "Hidden Variable Perturbation", "Mean Explanatory Perturbation", "Precision Explanatory Perturbation", "Simultaneous Explanatory Perturbation"), sub.caption = NULL, detect.influential = TRUE, n.influential = 5, draw.benchmark = FALSE, lty.benchmark = 2, type_plot = "h", curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, main = "", ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), labels.id = names(stats::residuals(model)), cex.id = 0.75, cex.oma.main = 1.25, cex.caption = 1, include.modeltype = TRUE, ... )
model |
a |
which |
a list or vector indicating which plots should be displayed. If a subset of the plots is required, specify a subset of the numbers 1:5, see caption below (and the 'Details') for the different kinds. |
caption |
captions to appear above the plots; character vector or list of valid graphics annotations. Can be set to "" or NA to suppress all captions. |
sub.caption |
common title-above the figures if there are more than one. If NULL, as by default, a possible abbreviated version of deparse(x$call) is used. |
detect.influential |
logical. Indicates whether the benchmark should be used to detect influential observations and identify them on the plot. If there is no benchmark available, the top 'n.influential' observations will be identified in the plot by their indexes. |
n.influential |
interger. The maximum number of influential observations to be identified on the plot. |
draw.benchmark |
logical. Indicates whether a horizontal line identifying the benchmark should be drawn. |
lty.benchmark |
the line type of the benchmark if drawn. |
type_plot |
what type of plot should be drawn. The default is 'h'. |
curvature |
the curvature to be returned, 'conformal' for the conformal normal curvature (see Zhu and Lee, 2001 and Poon and Poon, 1999) or 'normal' (see Zhu and Lee, 2001 and Cook, 1986). |
direction |
the 'max.eigen' returns the eigenvector associated to the largest eigenvalue of the perturbation matrix. The 'canonical' considers the curvatures under the canonical directions, which is known as "total local curvature" (see Lesaffre and Verbeke, 1998). For conformal normal curvatures both of them coincide. The default is 'canonical'. |
parameters |
the parameter to which the local influence will be computed. The options are 'all', 'mean' and 'precision'. This argument affects the 'case_weights' and 'hidden_variable' perturbation schemes. The default is 'all'. |
mean.covariates |
a list or vector of characters containing the mean-explanatory variables to be used in the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all mean-related covariates. The default is NULL. |
precision.covariates |
a list or vector of characters containing the precision-explanatory variables to be used in the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all precision-related covariates. The default is NULL. |
main |
character; title to be placed at each plot additionally (and above) all captions. |
ask |
logical; if |
labels.id |
vector of labels, from which the labels for extreme points will be chosen. The default uses the observation numbers. |
cex.id |
magnification of point labels. |
cex.oma.main |
controls the size of the sub.caption only if that is above the figures when there is more than one. |
cex.caption |
controls the size of caption. |
include.modeltype |
logical. Indicates whether the model type ('NB' or 'PIG') should be displayed on the captions. |
... |
other graphical arguments to be passed. |
local_influence.mixpoissonreg
provides local influence diagnostics for mixed Poisson regression models for all perturbation schemes considered in
Barreto-Souza and Simas (2016), for normal and conformal normal curvatures. Further, it is also provides results for the canonical directions, which is called
the total local influence (see Lesaffre and Verbeke, 1998), as well as for the direction of largest curvature, which is the direction of the eigenvector of the
perturbation matrix associated to the largest eigenvalue.
local_influence_plot.mixpoissonreg
provides a plot of the local influence diagnostics. Each plot corresponds to a perturbation scheme. The first plot considers
the 'case-weights' perturbation; the second plot considers the 'hidden-variable' perturbation (which was introduced in Barreto-Souza and Simas, 2016); the third plot
considers the mean-explanatory perturbation; the fourth plot considers the precision-explanatory perturbation; the fifth plot considers the simultanous-explanatory perturbation.
For both local_influence.mixpoissonreg
and local_influence_plot.mixpoissonreg
, one can select which covariates will be perturbed in the 'mean-explanatory',
'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes. These are chosen in the 'mean.covariates' and 'precision.covariates' arguments.
If one considers the total local influence, then Zhu and Lee (2001) provides benchmark for influential observations for all perturbation schemes. These are returned as
attributes in the returned list from local_influence.mixpoissonreg
. When using the local_influence_plot.mixpoissonreg
, only points above the benchmark
will be displayed. One can also set the option 'draw_benchmark' to TRUE to plot the benchmark line.
Called for its side effects.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
Cook, R. D. (1986) Assessment of Local Influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, pp.133-169. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1986.tb01398.x
Lesaffre, E. and Verbeke, G. (1998) Local Influence in Linear Mixed Models. Biometrics, 54, pp. 570-582.
Poon, W.-Y. and Poon, Y.S. (1999) Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 61, pp.51-61. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00162
Zhu, H.-T. and Lee, S.-Y. (2001) Local influence for incomplete data models. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 63, pp.111-126. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00279
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) local_influence_plot(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) local_influence_plot(daysabs_fit_ml, which = 2) daysabs_progML <- mixpoissonregML(daysabs ~ prog | prog, data = Attendance) local_influence_plot(daysabs_progML)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) local_influence_plot(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) local_influence_plot(daysabs_fit_ml, which = 2) daysabs_progML <- mixpoissonregML(daysabs ~ prog | prog, data = Attendance) local_influence_plot(daysabs_progML)
This function provides local influence diagnostic quantities. Currently the conformal normal and normal curvatures are available
under several perturbation schemes. The default is the conformal normal curvature since
it takes values on and other nice properties (see Zhu and Lee, 2001 and Poon and Poon, 1999 for further details).
## S3 method for class 'mixpoissonreg' local_influence( model, perturbation = c("case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory"), curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, ... )
## S3 method for class 'mixpoissonreg' local_influence( model, perturbation = c("case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory"), curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, ... )
model |
a |
perturbation |
a list or vector of perturbation schemes to be returned. The currently available schemes are "case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory". See Barreto-Souza and Simas (2016) for further details. |
curvature |
the curvature to be returned, 'conformal' for the conformal normal curvature (see Zhu and Lee, 2001 and Poon and Poon, 1999) or 'normal' (see Zhu and Lee, 2001 and Cook, 1986). |
direction |
the 'max.eigen' returns the eigenvector associated to the largest eigenvalue of the perturbation matrix. The 'canonical' considers the curvatures under the canonical directions, which is known as "total local curvature" (see Lesaffre and Verbeke, 1998). For conformal normal curvatures both of them coincide. The default is 'canonical'. |
parameters |
the parameter to which the local influence will be computed. The options are 'all', 'mean' and 'precision'. This argument affects the 'case_weights' and 'hidden_variable' perturbation schemes. The default is 'all'. |
mean.covariates |
a list or vector of characters containing the mean-explanatory variables to be used in the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all mean-related covariates. The default is NULL. |
precision.covariates |
a list or vector of characters containing the precision-explanatory variables to be used in the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all precision-related covariates. The default is NULL. |
... |
other graphical arguments to be passed. |
local_influence.mixpoissonreg
provides local influence diagnostics for mixed Poisson regression models for all perturbation schemes considered in
Barreto-Souza and Simas (2016), for normal and conformal normal curvatures. Further, it is also provides results for the canonical directions, which is called
the total local influence (see Lesaffre and Verbeke, 1998), as well as for the direction of largest curvature, which is the direction of the eigenvector of the
perturbation matrix associated to the largest eigenvalue.
local_influence_plot.mixpoissonreg
provides a plot of the local influence diagnostics. Each plot corresponds to a perturbation scheme. The first plot considers
the 'case-weights' perturbation; the second plot considers the 'hidden-variable' perturbation (which was introduced in Barreto-Souza and Simas, 2016); the third plot
considers the mean-explanatory perturbation; the fourth plot considers the precision-explanatory perturbation; the fifth plot considers the simultanous-explanatory perturbation.
For both local_influence.mixpoissonreg
and local_influence_plot.mixpoissonreg
, one can select which covariates will be perturbed in the 'mean-explanatory',
'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes. These are chosen in the 'mean.covariates' and 'precision.covariates' arguments.
If one considers the total local influence, then Zhu and Lee (2001) provides benchmark for influential observations for all perturbation schemes. These are returned as
attributes in the returned list from local_influence.mixpoissonreg
. When using the local_influence_plot.mixpoissonreg
, only points above the benchmark
will be displayed. One can also set the option 'draw_benchmark' to TRUE to plot the benchmark line.
a list containing the resulting perturbation schemes as elements. Each returned element has an attribute 'benchmark', which for the conformal normal curvature, it is computed following Zhu and Lee (2001), and for normal curvature it is computed following Verbeke and ... If the 'direction' is 'max.eigen' the 'benchmark' attribute is NA.
The 'mean_explanatory', 'precision_explanatory' and 'simultaneous_explanatory' elements of the list contain an attribute 'covariates' indicating which covariates were used in the perturbation schemes.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
Cook, R. D. (1986) Assessment of Local Influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, pp.133-169. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1986.tb01398.x
Lesaffre, E. and Verbeke, G. (1998) Local Influence in Linear Mixed Models. Biometrics, 54, pp. 570-582.
Poon, W.-Y. and Poon, Y.S. (1999) Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 61, pp.51-61. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00162
Zhu, H.-T. and Lee, S.-Y. (2001) Local influence for incomplete data models. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 63, pp.111-126. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00279
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) local_influence(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) local_influence_plot(daysabs_fit_ml, which = 1) daysabs_progML <- mixpoissonregML(daysabs ~ prog | prog, data = Attendance) local_influence(daysabs_progML)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) local_influence(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) local_influence_plot(daysabs_fit_ml, which = 1) daysabs_progML <- mixpoissonregML(daysabs ~ prog | prog, data = Attendance) local_influence(daysabs_progML)
mixpoissonreg
ObjectsFunction to compute the log-likelihood at the estimated parameters for mixed Poisson regression models.
## S3 method for class 'mixpoissonreg' logLik(object, ...)
## S3 method for class 'mixpoissonreg' logLik(object, ...)
object |
an object of class "mixpoissonreg" containing results from the fitted model. |
... |
further arguments passed to or from other methods. |
Returns an object of class LogLik
containing the log-likelihood
of the fitted mixpoissonreg object.
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) logLik(daysabs_fit) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) logLik(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) logLik(daysabs_fit) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) logLik(daysabs_prog)
Fits mixed Poisson regression models (Poisson-Inverse Gaussian or Negative-Binomial) on data sets with response variables being count data. The models can have varying precision parameter, where a linear regression structure (through a link function) is assumed to hold on the precision parameter. The Expectation-Maximization algorithm for both these models (Poisson Inverse Gaussian and Negative Binomial) is an important contribution of this package. Another important feature of this package is the set of functions to perform global and local influence analysis.
mixpoissonreg( formula, data, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), method = c("EM", "ML"), residual = c("pearson", "score"), y = TRUE, x = TRUE, w = TRUE, envelope = 0, prob = 0.95, model.frame = TRUE, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() ) mixpoissonreg.fit( x, y, w = NULL, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), method = c("EM", "ML"), residual = c("pearson", "score"), envelope = 0, prob = 0.95, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() )
mixpoissonreg( formula, data, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), method = c("EM", "ML"), residual = c("pearson", "score"), y = TRUE, x = TRUE, w = TRUE, envelope = 0, prob = 0.95, model.frame = TRUE, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() ) mixpoissonreg.fit( x, y, w = NULL, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), method = c("EM", "ML"), residual = c("pearson", "score"), envelope = 0, prob = 0.95, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() )
formula |
symbolic description of the model (examples: |
data |
elements expressed in formula. This is usually a data frame composed by:
(i) the observations formed by count data |
link.mean |
optionally, a string containing the link function for the mean. If omitted, the 'log' link function will be used. The possible link functions for the mean are "log" and "sqrt". |
link.precision |
optionally, a string containing the link function the precision parameter. If omitted and the only precision
covariate is the intercept, the 'identity' link function will be used, if omitted and there is a precision covariate other than the
intercept, the 'log' link function will be used. The possible link functions for the precision parameter are "identity" and "inverse.sqrt" (which is |
model |
character ("NB" or "PIG") indicating the type of model to be fitted, with "NB" standing for Negative-Binomial and "PIG" standing for Poisson Inverse Gaussian. The default is "NB". |
method |
estimation method to be chosen between "EM" (Expectation-Maximization) and "ML" (Maximum-Likelihood). The default method is "EM". |
residual |
character indicating the type of residual to be evaluated ("pearson" or "score"). The default is "pearson". Notice that they coincide for Negative-Binomial models. |
y |
For For |
x |
For For |
w |
For For |
envelope |
number of simulations (synthetic data sets) to build envelopes for residuals (with |
prob |
probability indicating the confidence level for the envelopes (default: |
model.frame |
logical indicating whether the model frame should be returned as component of the returned value. |
em_controls |
only used with the 'EM' method. A list containing two elements: |
optim_method |
main optimization algorithm to be used. The available methods are the same as those of |
optim_controls |
a list of control arguments to be passed to the |
Among the regression models with discrete response variables, Poisson regression is the most popular
for modeling count data. See, for instance Sellers and Shmueli (2010).
It is well-known that this model is equidispersed (that is, the mean is equal to the variance),
which in practice may be an unrealistic
assumption. Several models have been introduced in the literature to overcome this problem such as
negative binomial (NB) and Poisson inverse gaussian (PIG) distributions (see Lawless, 1987).
The most common way to do this is to consider a mixed Poisson distribution, which is defined as follows.
Let be a positive random variable (generally being continuous) with distribution
function
,
where
denotes the parameter vector associated to the
distribution. Let
Poisson
, for
some constant
. Therefore
follows a mixed Poisson (MP) distribution with probability
function given by
for . With this,
has an overdispersed distribution and hence it is a natural alternative to the Poisson distribution.
The most common choices for
are gamma and inverse-gaussian distributions,
which yields
following, respectively, NB and PIG distributions.
General properties of the MP distributions can be found in Karlis and Xekalaki (2005) and in the references therein.
In mixpoissonreg
two regression models are implemented, namely, the NB and PIG regression models.
We follow the definitions and notations given in Barreto-Souza and Simas (2016). The mixed Poisson regression model
is defined by assuming is a random sample where
or
for
.
Under this parameterization we have
and
, where
and
for the NB case, and
and
for
the PIG case, with
being the second derivative of the function
.
The following linear relations are assumed
and
where and
are real valued vectors.
The terms
and
represent, respectively, the i-th row of the matrices "x" (
)
and "w" (
) containing covariates in their columns
(
and
may be 1 to handle intercepts).
Therefore, the mixpoissonreg
package handles up to two regression structures
at the same time: one for the mean parameter, one for the precision parameter. The regression structure for
the mean is determined through a formula y ~ x1 + ... + xn
, whereas the regression structure for
the precision parameter is determined through the right-hand side of the formula using the separator "|
". So,
for example, a regression with x1,...,xn
as covariates for the mean and z1,...,zm
as covariates for the precision
parameter corresponds to the formula y ~ x1 + ... + xn | z1 + ... + zm
. If only there is only formula for
the regression structure for the mean, the regression structure for the precision parameter will only have the intercept,
that is, y ~ x1 + ... + xn
is the same as y ~ x1 + ... + xn | 1
.
In general, in this package, the EM-algorithm estimation method obtains estimates closer to the maximum likelihood estimate than the maximum likelihood estimation method, in the sense that the likelihood function evaluated at the EM-algorithm estimate is greater or equal (usually strictly greater) than the likelihood function evaluated at the maximum likelihood estimate. So, unless the processing time is an issue, we strongly recommend the EM-algorithm as the estimation method.
In Barreto-Souza and Simas (2016) two residuals were studied: the pearson residuals
and the score residuals. Both these residuals are implemented in the mixpoissonreg
package. They coincide for NB regression models. They can be accessed via
the residuals method.
It is also noteworthy that all the global and local influence analysis tools developed
in Barreto-Souza and Simas (2016) are implemented in this package. See influence.mixpoissonreg
,
local_influence.mixpoissonreg
, local_influence_plot.mixpoissonreg
and local_influence_autoplot.mixpoissonreg
.
mixpoissonreg
returns an object of class "mixpoissonreg" whereas mixpoissonreg.fit
returns an object of class "mixpoissonreg_fit". Both objects are given by lists containing the outputs from the model fit (Negative-Binomial or Poisson Inverse Gaussian regression).
An object of the class "mixpoissonreg" is a list containing the following elements:
coefficients
- a list with elements "mean" and "precision" containing the estimated coefficients of the model;
call
- the formula used by the model. If using mixpoissonreg.fit
, this returns NULL
.
modelname
- the fitted model, NB or PIG;
modeltype
- the abbreviated model name
residualname
- the name of the chosen residual in the call, 'pearson' or 'score';
niter
- number of iterations of the EM algorithm if method = "EM" and number of iterations
of the optim
function, if method = "ML";
start
- the initial guesses of the parameters
intercept
- vector indicating if the intercept is present in the mean and/or in the precision regressions;
link.mean
- link function of the mean;
link.precision
- link function of the precision parameter;
fitted.values
- a vector of fitted values in the response scale;
fitted.precisions
- a vector of fitted precisions;
efron.pseudo.r2
- Efron's pseudo R^2: the squared correlation between the response variables and the predicted values;
vcov
- covariance matrix of the parameters of the fitted model;
logLik
- log-likelihood at the estimated parameters;
Qfunction
- Q-function at the estimated parameters;
x
- the covariates related to the mean (if x = TRUE);
w
- the covariates related to the precision parameter (if w = TRUE);
y
- the response variables (if y = TRUE);
model
- if requested (the default), the model frame;
formula
- the formula supplied;
nobs
- number of observations
df.null
- the residual degrees of freedom for the model with constant mean and constant precision;
df.residual
- the residual degrees of freedom of the fitted model;
estimation_method
- the estimation method, "EM" or "ML"
residuals
- vector of raw residuals, that is, the response variable minus the fitted means;
std_errors
- the standard errors of the estimated parameters;
envelope
- the numerical envelopes used to build the Q-Q plot with simulated envelopes;
terms
- (only for mixpoissonreg
)the terms
object used;
levels
- (where relevant, only for mixpoissonreg
) the levels of the factors used;
contrasts
- (where relevant, only for mixpoissonreg
) the contrasts used.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
URL:https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1751-5823.2005.tb00250.x (Karlis and Xekalaki; 2005)
DOI:10.2307/3314912 doi:10.2307/3314912(Lawless; 1987)
Sellers, K.F. and Shmueli, G. (2010) A flexible regression model for count data. Ann. Appl. Stat., 4, 943-961
summary.mixpoissonreg
, plot.mixpoissonreg
, autoplot.mixpoissonreg
,
residuals.mixpoissonreg
, predict.mixpoissonreg
,influence.mixpoissonreg
,
cooks.distance.mixpoissonreg
,
local_influence.mixpoissonreg
, local_influence_plot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
# Examples using the Attendance dataset: daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit) # Base R plot of the fit plot(daysabs_fit) # ggplot2 plot of the fit autoplot(daysabs_fit) # plot of local influence measures local_influence_plot(daysabs_fit) # ggplot2 plot of local influence measures local_influence_autoplot(daysabs_fit) # Fitting a reduced model of the sabe type as the previous one daysabs_fit_red <- mixpoissonreg(daysabs ~ gender + math + prog | prog, data = Attendance, model = daysabs_fit$modeltype) # Likelihood ratio test: lmtest::lrtest(daysabs_fit, daysabs_fit_red) # Wald test: lmtest::waldtest(daysabs_fit, daysabs_fit_red) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) summary(daysabs_prog)
# Examples using the Attendance dataset: daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit) # Base R plot of the fit plot(daysabs_fit) # ggplot2 plot of the fit autoplot(daysabs_fit) # plot of local influence measures local_influence_plot(daysabs_fit) # ggplot2 plot of local influence measures local_influence_autoplot(daysabs_fit) # Fitting a reduced model of the sabe type as the previous one daysabs_fit_red <- mixpoissonreg(daysabs ~ gender + math + prog | prog, data = Attendance, model = daysabs_fit$modeltype) # Likelihood ratio test: lmtest::lrtest(daysabs_fit, daysabs_fit_red) # Wald test: lmtest::waldtest(daysabs_fit, daysabs_fit_red) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) summary(daysabs_prog)
Uses maximum likelihood estimators to fit mixed Poisson regression models (Poisson-Inverse Gaussian or Negative-Binomial) on data sets with response variables being count data. The models can have varying precision parameter, where a linear regression structure (through a link function) is assumed to hold on the precision parameter.
mixpoissonregML( formula, data, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), residual = c("pearson", "score"), y = TRUE, x = TRUE, w = TRUE, envelope = 0, prob = 0.95, model.frame = TRUE, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() ) mixpoissonregML.fit( x, y, w = NULL, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), residual = c("pearson", "score"), envelope = 0, prob = 0.95, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() )
mixpoissonregML( formula, data, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), residual = c("pearson", "score"), y = TRUE, x = TRUE, w = TRUE, envelope = 0, prob = 0.95, model.frame = TRUE, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() ) mixpoissonregML.fit( x, y, w = NULL, link.mean = c("log", "sqrt"), link.precision = c("identity", "log", "inverse.sqrt"), model = c("NB", "PIG"), residual = c("pearson", "score"), envelope = 0, prob = 0.95, em_controls = list(maxit = 5000, em_tol = 10^(-5), em_tolgrad = 10^(-2)), optim_method = "L-BFGS-B", optim_controls = list() )
formula |
symbolic description of the model (examples: |
data |
elements expressed in formula. This is usually a data frame composed by:
(i) the observations formed by count data |
link.mean |
optionally, a string containing the link function for the mean. If omitted, the 'log' link function will be used. The possible link functions for the mean are "log" and "sqrt". |
link.precision |
optionally, a string containing the link function the precision parameter. If omitted and the only precision
covariate is the intercept, the 'identity' link function will be used, if omitted and there is a precision covariate other than the
intercept, the 'log' link function will be used. The possible link functions for the precision parameter are "identity" and "inverse.sqrt" (which is |
model |
character ("NB" or "PIG") indicating the type of model to be fitted, with "NB" standing for Negative-Binomial and "PIG" standing for Poisson Inverse Gaussian. The default is "NB". |
residual |
character indicating the type of residual to be evaluated ("pearson" or "score"). The default is "pearson". Notice that they coincide for Negative-Binomial models. |
y |
For For |
x |
For For |
w |
For For |
envelope |
number of simulations (synthetic data sets) to build envelopes for residuals (with |
prob |
probability indicating the confidence level for the envelopes (default: |
model.frame |
logical indicating whether the model frame should be returned as component of the returned value. |
em_controls |
only used with the 'EM' method. A list containing two elements: |
optim_method |
main optimization algorithm to be used. The available methods are the same as those of |
optim_controls |
a list of control arguments to be passed to the |
Among the regression models with discrete response variables, Poisson regression is the most popular
for modeling count data. See, for instance Sellers and Shmueli (2010).
It is well-known that this model is equidispersed (that is, the mean is equal to the variance),
which in practice may be an unrealistic
assumption. Several models have been introduced in the literature to overcome this problem such as
negative binomial (NB) and Poisson inverse gaussian (PIG) distributions (see Lawless, 1987).
The most common way to do this is to consider a mixed Poisson distribution, which is defined as follows.
Let be a positive random variable (generally being continuous) with distribution
function
,
where
denotes the parameter vector associated to the
distribution. Let
Poisson
, for
some constant
. Therefore
follows a mixed Poisson (MP) distribution with probability
function given by
for . With this,
has an overdispersed distribution and hence it is a natural alternative to the Poisson distribution.
The most common choices for
are gamma and inverse-gaussian distributions,
which yields
following, respectively, NB and PIG distributions.
General properties of the MP distributions can be found in Karlis and Xekalaki (2005) and in the references therein.
In mixpoissonreg
two regression models are implemented, namely, the NB and PIG regression models.
We follow the definitions and notations given in Barreto-Souza and Simas (2016). The mixed Poisson regression model
is defined by assuming is a random sample where
or
for
.
Under this parameterization we have
and
, where
and
for the NB case, and
and
for
the PIG case, with
being the second derivative of the function
.
The following linear relations are assumed
and
where and
are real valued vectors.
The terms
and
represent, respectively, the i-th row of the matrices "x" (
)
and "w" (
) containing covariates in their columns
(
and
may be 1 to handle intercepts).
Therefore, the mixpoissonreg
package handles up to two regression structures
at the same time: one for the mean parameter, one for the precision parameter. The regression structure for
the mean is determined through a formula y ~ x1 + ... + xn
, whereas the regression structure for
the precision parameter is determined through the right-hand side of the formula using the separator "|
". So,
for example, a regression with x1,...,xn
as covariates for the mean and z1,...,zm
as covariates for the precision
parameter corresponds to the formula y ~ x1 + ... + xn | z1 + ... + zm
. If only there is only formula for
the regression structure for the mean, the regression structure for the precision parameter will only have the intercept,
that is, y ~ x1 + ... + xn
is the same as y ~ x1 + ... + xn | 1
.
In general, in this package, the EM-algorithm estimation method obtains estimates closer to the maximum likelihood estimate than the maximum likelihood estimation method, in the sense that the likelihood function evaluated at the EM-algorithm estimate is greater or equal (usually strictly greater) than the likelihood function evaluated at the maximum likelihood estimate. So, unless the processing time is an issue, we strongly recommend the EM-algorithm as the estimation method.
In Barreto-Souza and Simas (2016) two residuals were studied: the pearson residuals
and the score residuals. Both these residuals are implemented in the mixpoissonreg
package. They coincide for NB regression models. They can be accessed via
the residuals method.
It is also noteworthy that all the global and local influence analysis tools developed
in Barreto-Souza and Simas (2016) are implemented in this package. See influence.mixpoissonreg
,
local_influence.mixpoissonreg
, local_influence_plot.mixpoissonreg
and local_influence_autoplot.mixpoissonreg
.
mixpoissonregML
returns an object of class "mixpoissonreg" whereas mixpoissonregML.fit
returns an object of class "mixpoissonreg_fit". Both objects are given by lists containing the outputs from the model fit (Negative-Binomial or Poisson Inverse Gaussian regression).
An object of the class "mixpoissonreg" is a list containing the following elements:
coefficients
- a list with elements "mean" and "precision" containing the estimated coefficients of the model;
call
- the formula used by the model. If using mixpoissonreg.fit
, this returns NULL
.
modelname
- the fitted model, NB or PIG;
modeltype
- the abbreviated model name
residualname
- the name of the chosen residual in the call, 'pearson' or 'score';
niter
- number of iterations of the EM algorithm if method = "EM" and number of iterations
of the optim
function, if method = "ML";
start
- the initial guesses of the parameters
intercept
- vector indicating if the intercept is present in the mean and/or in the precision regressions;
link.mean
- link function of the mean;
link.precision
- link function of the precision parameter;
fitted.values
- a vector of fitted values in the response scale;
fitted.precisions
- a vector of fitted precisions;
efron.pseudo.r2
- Efron's pseudo R^2: the squared correlation between the response variables and the predicted values;
vcov
- covariance matrix of the parameters of the fitted model;
logLik
- log-likelihood at the estimated parameters;
Qfunction
- Q-function at the estimated parameters;
x
- the covariates related to the mean (if x = TRUE);
w
- the covariates related to the precision parameter (if w = TRUE);
y
- the response variables (if y = TRUE);
model
- if requested (the default), the model frame;
formula
- the formula supplied;
nobs
- number of observations
df.null
- the residual degrees of freedom for the model with constant mean and constant precision;
df.residual
- the residual degrees of freedom of the fitted model;
estimation_method
- the estimation method, "EM" or "ML"
residuals
- vector of raw residuals, that is, the response variable minus the fitted means;
std_errors
- the standard errors of the estimated parameters;
envelope
- the numerical envelopes used to build the Q-Q plot with simulated envelopes;
terms
- (only for mixpoissonreg
)the terms
object used;
levels
- (where relevant, only for mixpoissonreg
) the levels of the factors used;
contrasts
- (where relevant, only for mixpoissonreg
) the contrasts used.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
URL:https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1751-5823.2005.tb00250.x (Karlis and Xekalaki; 2005)
DOI:10.2307/3314912 doi:10.2307/3314912(Lawless; 1987)
Sellers, K.F. and Shmueli, G. (2010) A flexible regression model for count data. Ann. Appl. Stat., 4, 943-961
summary.mixpoissonreg
, plot.mixpoissonreg
, autoplot.mixpoissonreg
,
residuals.mixpoissonreg
, predict.mixpoissonreg
,influence.mixpoissonreg
,
cooks.distance.mixpoissonreg
,
local_influence.mixpoissonreg
, local_influence_plot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
# Examples using the Attendance dataset: daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit_ml) # Base R plot of the fit plot(daysabs_fit_ml) # ggplot2 plot of the fit autoplot(daysabs_fit_ml) # plot of local influence measures local_influence_plot(daysabs_fit_ml) # ggplot2 plot of local influence measures local_influence_autoplot(daysabs_fit_ml) # Fitting a reduced model of the sabe type as the previous one daysabs_fit_ml_red <- mixpoissonregML(daysabs ~ gender + math + prog | prog, data = Attendance, model = daysabs_fit_ml$modeltype) # Likelihood ratio test: lmtest::lrtest(daysabs_fit_ml, daysabs_fit_ml_red) # Wald test: lmtest::waldtest(daysabs_fit_ml, daysabs_fit_ml_red) daysabs_progML <- mixpoissonregML(daysabs ~ prog, data = Attendance) summary(daysabs_progML)
# Examples using the Attendance dataset: daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit_ml) # Base R plot of the fit plot(daysabs_fit_ml) # ggplot2 plot of the fit autoplot(daysabs_fit_ml) # plot of local influence measures local_influence_plot(daysabs_fit_ml) # ggplot2 plot of local influence measures local_influence_autoplot(daysabs_fit_ml) # Fitting a reduced model of the sabe type as the previous one daysabs_fit_ml_red <- mixpoissonregML(daysabs ~ gender + math + prog | prog, data = Attendance, model = daysabs_fit_ml$modeltype) # Likelihood ratio test: lmtest::lrtest(daysabs_fit_ml, daysabs_fit_ml_red) # Wald test: lmtest::waldtest(daysabs_fit_ml, daysabs_fit_ml_red) daysabs_progML <- mixpoissonregML(daysabs ~ prog, data = Attendance) summary(daysabs_progML)
mixpoissonreg
ObjectsCurrently there are six plots available. They contain residual analysis and global influence diagnostics. The plots are selectable by
the which
argument. The plots are: Residuals vs. obs. numbers; Normal Q-Q plots, which may contain simulated envelopes, if the fitted object
has simulated envelopes; Cook's distances vs. obs. numbers; Generalized Cook's distances vs. obs. numbers; Cook's distances vs. Generalized Cook's distances;
Response variables vs. fitted means. By default, the first two plots and the last two plots are provided.
## S3 method for class 'mixpoissonreg' plot( x, which = c(1, 2, 5, 6), caption = list("Residuals vs Obs. number", "Normal Q-Q", "Cook's distance", "Generalized Cook's distance", "Cook's dist vs Generalized Cook's dist", "Response vs Fitted means"), sub.caption = NULL, qqline = TRUE, col.qqline = "black", main = "", line_col_env = "gray", line_col_median = "black", fill_col_env = "gray", fill_alpha_env = 0.7, ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), labels.id = names(stats::residuals(x)), label.pos = c(4, 2), type.cookplot = "h", id.n = 3, col.id = NULL, cex.id = 0.75, cex.oma.main = 1.25, cex.caption = 1, col.caption = "black", cex.points = 1, col.points = "black", include.modeltype = TRUE, include.residualtype = FALSE, ... )
## S3 method for class 'mixpoissonreg' plot( x, which = c(1, 2, 5, 6), caption = list("Residuals vs Obs. number", "Normal Q-Q", "Cook's distance", "Generalized Cook's distance", "Cook's dist vs Generalized Cook's dist", "Response vs Fitted means"), sub.caption = NULL, qqline = TRUE, col.qqline = "black", main = "", line_col_env = "gray", line_col_median = "black", fill_col_env = "gray", fill_alpha_env = 0.7, ask = prod(graphics::par("mfcol")) < length(which) && grDevices::dev.interactive(), labels.id = names(stats::residuals(x)), label.pos = c(4, 2), type.cookplot = "h", id.n = 3, col.id = NULL, cex.id = 0.75, cex.oma.main = 1.25, cex.caption = 1, col.caption = "black", cex.points = 1, col.points = "black", include.modeltype = TRUE, include.residualtype = FALSE, ... )
x |
object of class "mixpoissonreg" containing results from the fitted model. If the model was fitted with envelope = 0, the Q-Q plot will be produced without envelopes. |
which |
a list or vector indicating which plots should be displayed. If a subset of the plots is required, specify a subset of the numbers 1:6,
see caption below for the different kinds. In
plot number 2, 'Normal Q-Q', if the |
caption |
captions to appear above the plots; character vector or list of valid graphics annotations. Can be set to "" or NA to suppress all captions. |
sub.caption |
common title-above the figures if there are more than one. If NULL, as by default, a possible abbreviated version of |
qqline |
logical; if |
col.qqline |
color of the qqline. |
main |
character; title to be placed at each plot additionally (and above) all captions. |
line_col_env |
line color for the upper and lower quantile curves of the simulated envelopes if the |
line_col_median |
line color for the median curve of the simulated envelopes if the |
fill_col_env |
fill color for the simulated envelopes if the |
fill_alpha_env |
alpha of the envelope region, when the |
ask |
logical; if |
labels.id |
vector of labels, from which the labels for extreme points will be chosen. The default uses the observation numbers. |
label.pos |
positioning of labels, for the left half and right half of the graph respectively, for plots 2 and 6. |
type.cookplot |
character; what type of plot should be drawn for Cook's and Generalized Cook's distances (plots 3 and 4). The default is 'h'. |
id.n |
number of points to be labelled in each plot, starting with the most extreme. |
col.id |
color of point labels. |
cex.id |
magnification of point labels. |
cex.oma.main |
controls the size of the sub.caption only if that is above the figures when there is more than one. |
cex.caption |
controls the size of caption. |
col.caption |
controls the caption color. |
cex.points |
controls the size of the points. |
col.points |
controls the colors of the points. |
include.modeltype |
logical. Indicates whether the model type ('NB' or 'PIG') should be displayed on the captions. |
include.residualtype |
local. Indicates whether the name of the residual ('Pearson' or 'Score') should be displayed on the caption of plot 1 (Residuals vs. Index). |
... |
graphical parameters to be passed. |
The plot
method is implemented following the same structure as the plot.lm, so it will be easy to be used by practitioners that
are familiar with glm
objects.
These plots allows one to perform residuals analsysis and influence diagnostics. There are other global influence functions, see influence.mixpoissonreg
.
See Barreto-Souza and Simas (2016), Cook and Weisberg (1982) and Zhu et al. (2001).
It is called for its side effects.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
Cook, D.R. and Weisberg, S. (1982) Residuals and Influence in Regression. (New York: Chapman and Hall, 1982)
Zhu, H.T., Lee, S.Y., Wei, B.C., Zhu, J. (2001) Case-deletion measures formodels with incomplete data. Biometrika, 88, 727–737.
autoplot.mixpoissonreg
, local_influence_plot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
,
summary.mixpoissonreg
, predict.mixpoissonreg
, influence.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | prog, data = Attendance) plot(daysabs_fit, which = 1:6) plot(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | prog, data = Attendance, envelope = 100) plot(daysabs_fit_ml, which = 2) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) plot(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | prog, data = Attendance) plot(daysabs_fit, which = 1:6) plot(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | prog, data = Attendance, envelope = 100) plot(daysabs_fit_ml, which = 2) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) plot(daysabs_prog)
mixpoissonreg
ObjectsFunction to obtain various predictions based on the fitted mixed Poisson regression models.
## S3 method for class 'mixpoissonreg' predict( object, newdata = NULL, type = c("response", "link", "precision", "variance"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, nsim_pred = 100, nsim_pred_y = 100, ... )
## S3 method for class 'mixpoissonreg' predict( object, newdata = NULL, type = c("response", "link", "precision", "variance"), se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, nsim_pred = 100, nsim_pred_y = 100, ... )
object |
object of class "mixpoissonreg" containing results from the fitted model. |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted response values will be provided. |
type |
the type of prediction. The default is the "response" type, which provided the estimated values for the means. The type "link" provides the estimates for the linear predictor. The type "precision" provides estimates for the precision parameters whereas the type "variance" provides estimates for the variances. |
se.fit |
logical switch indicating if standard errors on the scale of linear predictors should be returned. If |
interval |
Type of interval calculation for the response variables, 'none', 'confidence' or 'prediction'. If 'confidence', the confidence intervals for the means are returned.
If 'prediction', prediction intervals for future response variables are reported. For confidence intervals, the type of the prediction must be 'response' or 'link'.
For prediction intervals the type of prediction must be 'response'. For 'confidence' intervals, when using |
level |
Tolerance/confidence level. The default is set to 0.95. |
nsim_pred |
number of means and predictions to be generated in each step of the simulation. The default is set to 100. |
nsim_pred_y |
number of response variables generated for each pair of mean and precision to compute the prediction intervals. The default is set to 100. |
... |
further arguments passed to or from other methods. |
The se.fit
argument only returns a non-NA vector for type = 'link', that is, on the scale of the linear predictor for the mean parameter. For the response scale,
one can obtain confidence or prediction intervals. It is important to notice that confidence intervals must not be used for future observations as they will underestimate
the uncertainty. In this case prediction intervals should be used. Currently, we do not have closed-form expressions for the prediction interval and, therefore, they
are obtained by simulation and can be computationally-intensive.
A vector containing the predicted values if se.fit=FALSE
, a list with
elements fit and se.fit if se.fit=TRUE
, and a matrix if interval
is set to confidence or prediction.
fitted.mixpoissonreg
, summary.mixpoissonreg
, plot.mixpoissonreg
, autoplot.mixpoissonreg
,
coef.mixpoissonreg
, vcov.mixpoissonreg
,
plot.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) predict(daysabs_fit, interval = "confidence") predict(daysabs_fit, type = "link", se.fit = TRUE) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) predict(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) predict(daysabs_fit, interval = "confidence") predict(daysabs_fit, type = "link", se.fit = TRUE) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) predict(daysabs_prog)
mixpoissonreg
ObjectsFunction to return 'pearson' or 'score' residuals for mixed Poisson regression models.
## S3 method for class 'mixpoissonreg' residuals(object, type = c("pearson", "score"), ...)
## S3 method for class 'mixpoissonreg' residuals(object, type = c("pearson", "score"), ...)
object |
an object of class "mixpoissonreg" containing results from the fitted model. |
type |
the type of residual to be returned. Currently, the options are 'pearson' or 'score'. The default is set to 'pearson'. Notice that these residuals coincide for Negative-Binomial models. |
... |
Currently not used. |
A vector containing the residuals of a mixpoissonreg object.
plot.mixpoissonreg
, predict.mixpoissonreg
,
autoplot.mixpoissonreg
, summary.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) residuals(daysabs_fit) #Score residuals: residuals(daysabs_fit, type = "score") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) residuals(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) residuals(daysabs_fit) #Score residuals: residuals(daysabs_fit, type = "score") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) residuals(daysabs_prog)
mixpoissonreg
Objects.Function providing a summary of results related to mixed Poisson regression models.
## S3 method for class 'mixpoissonreg' summary(object, ...)
## S3 method for class 'mixpoissonreg' summary(object, ...)
object |
an object of class "mixpoissonreg" containing results from the fitted model. |
... |
further arguments passed to or from other methods. |
An object of class summary_mixpoissonreg
containing several
informations of a mixpoissonreg object.
plot.mixpoissonreg
, autoplot.mixpoissonreg
,
local_influence_plot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit_ml) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) summary(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) summary(daysabs_fit_ml) daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) summary(daysabs_prog)
Functions to provide tidy outputs or ggplot2-based plots of local influence diagnostics.
local_influence_autoplot(model, ...) tidy_local_influence(model, ...) local_influence_benchmarks(model, ...)
local_influence_autoplot(model, ...) tidy_local_influence(model, ...) local_influence_benchmarks(model, ...)
model |
A model object for which local influence diagnostics are desired. |
... |
additional arguments to be passed. |
Local influence diagnostics were first introduced by Cook (1986), where several perturbation schemes were introduced and normal curvatures were obtained. Poon and Poon (1999)
introduced the conformal normal curvature, which has nice properties and takes values on the unit interval . Zhu and Lee (2001) following Cook (1986) and Poon and Poon (1999)
introduced normal and conformal normal curvatures for EM-based models.
The tidy_local_influence method returns a tibble containing the resulting perturbation schemes as columns. The local_influence_benchmarks method returns a tibble with the benchmarks as columns. The local_influence_autoplot method is called for its side effects.
Cook, R. D. (1986) Assessment of Local Influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, pp.133-169. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1986.tb01398.x
Poon, W.-Y. and Poon, Y.S. (1999) Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 61, pp.51-61. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00162
Zhu, H.-T. and Lee, S.-Y. (2001) Local influence for incomplete data models. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 63, pp.111-126. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00279
mixpoissonreg
ObjectsFunctions to provide tidy outputs of local influence diagnostics. tidy_local_influence.mixpoissonreg
provides a tibble::tibble()
containing the local influence diagnostics under the chosen perturbation schemes.
local_influence_benchmarks.mixpoissonreg
provides a tibble::tibble()
with a single row and one column
for each selected perturbation scheme containing influential benchmarks for each perturbation scheme.
## S3 method for class 'mixpoissonreg' tidy_local_influence( model, perturbation = c("case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory"), curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, ... ) ## S3 method for class 'mixpoissonreg' local_influence_benchmarks( model, perturbation = c("case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory"), curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, ... )
## S3 method for class 'mixpoissonreg' tidy_local_influence( model, perturbation = c("case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory"), curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, ... ) ## S3 method for class 'mixpoissonreg' local_influence_benchmarks( model, perturbation = c("case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory"), curvature = c("conformal", "normal"), direction = c("canonical", "max.eigen"), parameters = c("all", "mean", "precision"), mean.covariates = NULL, precision.covariates = NULL, ... )
model |
A |
perturbation |
a list or vector of perturbation schemes to be returned. The currently available schemes are "case_weights", "hidden_variable", "mean_explanatory", "precision_explanatory", "simultaneous_explanatory". See Barreto-Souza and Simas (2016) for further details. |
curvature |
the curvature to be returned, 'conformal' for the conformal normal curvature (see Zhu and Lee, 2001 and Poon and Poon, 1999) or 'normal' (see Zhu and Lee, 2001 and Cook, 1986). |
direction |
the 'max.eigen' returns the eigenvector associated to the largest eigenvalue of the perturbation matrix. The 'canonical' considers the curvatures under the canonical directions, which is known as "total local curvature" (see Lesaffre and Verbeke, 1998). For conformal normal curvatures both of them coincide. The default is 'canonical'. |
parameters |
the parameter to which the local influence will be computed. The options are 'all', 'mean' and 'precision'. This argument affects the 'case_weights' and 'hidden_variable' perturbation schemes. The default is 'all'. |
mean.covariates |
a list or vector of characters containing the mean-explanatory variables to be used in the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'mean-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all mean-related covariates. The default is NULL. |
precision.covariates |
a list or vector of characters containing the precision-explanatory variables to be used in the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes. If NULL, the 'precision-explanatory' and 'simultaneous-explanatory' perturbation schemes will be computed by perturbing all precision-related covariates. The default is NULL. |
... |
Currently not used. |
A tibble containing the perturbations schemes as columns.
DOI:10.1007/s11222-015-9601-6 doi:10.1007/s11222-015-9601-6(Barreto-Souza and Simas; 2016)
Cook, R. D. (1986) Assessment of Local Influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, pp.133-169. https://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1986.tb01398.x
Lesaffre, E. and Verbeke, G. (1998) Local Influence in Linear Mixed Models. Biometrics, 54, pp. 570-582.
Poon, W.-Y. and Poon, Y.S. (1999) Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 61, pp.51-61. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00162
Zhu, H.-T. and Lee, S.-Y. (2001) Local influence for incomplete data models. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 63, pp.111-126. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/1467-9868.00279
glance.mixpoissonreg
, augment.mixpoissonreg
, tidy.mixpoissonreg
, autoplot.mixpoissonreg
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) tidy_local_influence(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) tidy_local_influence(daysabs_fit_ml, perturbation = "case_weights") daysabs_prog <- mixpoissonreg(daysabs ~ prog | prog, data = Attendance) tidy_local_influence(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) tidy_local_influence(daysabs_fit) daysabs_fit_ml <- mixpoissonregML(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance, envelope = 100) tidy_local_influence(daysabs_fit_ml, perturbation = "case_weights") daysabs_prog <- mixpoissonreg(daysabs ~ prog | prog, data = Attendance) tidy_local_influence(daysabs_prog)
mixpoissonreg
objectTidy returns a tibble::tibble()
containing
informations on the coefficients of the model, such as the estimated parameters,
standard errors, z-statistics and p-values. Additionally, it may return confidence
intervals for the model parameters.
## S3 method for class 'mixpoissonreg' tidy(x, conf.int = FALSE, conf.level = 0.95, ...)
## S3 method for class 'mixpoissonreg' tidy(x, conf.int = FALSE, conf.level = 0.95, ...)
x |
A |
conf.int |
Logical indicating whether or not to include a confidence interval in the tidied output. Defaults to FALSE. |
conf.level |
The confidence level to use for the confidence interval if conf.int = TRUE. Must be strictly greater than 0 and less than 1. Defaults to 0.95, which corresponds to a 95 percent confidence interval. |
... |
Additional arguments. Currently not used. |
A tibble containing the coefficients of the fitted mixpoissonreg object along with its estimates, std. errors, z-statistics and p-values.
glance.mixpoissonreg
, augment.mixpoissonreg
, tidy_local_influence.mixpoissonreg
,
autoplot.mixpoissonreg
, local_influence_autoplot.mixpoissonreg
mixpoissonreg
ObjectsReturns the variance-covariance matrix of the parameters for fitted mixed Poisson regression models. The parameters
argument
indicates for which parameters the variance-covariance matrix should be computed, namely, 'mean' for mean-relatex parameters or 'precision' for precision-related parameters.
## S3 method for class 'mixpoissonreg' vcov(object, parameters = c("all", "mean", "precision"), ...)
## S3 method for class 'mixpoissonreg' vcov(object, parameters = c("all", "mean", "precision"), ...)
object |
an object of class "mixpoissonreg" containing results from the fitted model. |
parameters |
a string to determine which coefficients should be extracted: 'all' extracts all coefficients, 'mean' extracts the coefficients of the mean parameters and 'precision' extracts coefficients of the precision parameters. |
... |
further arguments passed to or from other methods. |
A matrix containing the covariance matrix of a mixpoissonreg object.
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) vcov(daysabs_fit) vcov(daysabs_fit, parameters = "mean") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) vcov(daysabs_prog)
data("Attendance", package = "mixpoissonreg") daysabs_fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog, data = Attendance) vcov(daysabs_fit) vcov(daysabs_fit, parameters = "mean") daysabs_prog <- mixpoissonreg(daysabs ~ prog, data = Attendance) vcov(daysabs_prog)