Title: | Port of the S+ "Robust Library" |
---|---|
Description: | Methods for robust statistics, a state of the art in the early 2000s, notably for robust regression and robust multivariate analysis. |
Authors: | Jiahui Wang [aut], Ruben Zamar [aut], Alfio Marazzi [aut], Victor Yohai [aut], Matias Salibian-Barrera [aut], Ricardo Maronna [aut], Eric Zivot [aut], David Rocke [aut], Doug Martin [aut], Martin Maechler [aut], Kjell Konis [aut], Valentin Todorov [aut, cre] |
Maintainer: | Valentin Todorov <[email protected]> |
License: | GPL (>=3) |
Version: | 0.7-5 |
Built: | 2024-11-15 05:36:49 UTC |
Source: | https://github.com/valentint/robust |
Compute an analysis of variance table for one or more robust generalized linear model fits.
## S3 method for class 'glmRob'
anova(object, ..., test = c("none", "Chisq", "F", "Cp"))

## S3 method for class 'glmRoblist'
anova(object, ..., test = c("none", "Chisq", "F", "Cp"))
object |
a glmRob object. |
... |
additional glmRob objects. |
test |
a character string specifying the test statistic to be used. Can be one of "F", "Chisq", "Cp" or "none" for no test. |
an anova object.
glmRob, anova, anova.glmRoblist.
data(breslow.dat)
bres.int <- glmRob(sumY ~ Age10 + Base4*Trt, family = poisson(), data = breslow.dat)
anova(bres.int)
bres.main <- glmRob(sumY ~ Age10 + Base4 + Trt, family = poisson(), data = breslow.dat)
anova(bres.main, bres.int)
Compute an analysis of variance table for one or more robust linear model fits.
## S3 method for class 'lmRob'
anova(object, ..., test = c("RF", "RWald"))

## S3 method for class 'lmRoblist'
anova(object, const, ipsi, yc, test = c("RWald", "RF"), ...)
object |
an lmRob object. |
... |
additional arguments required by the generic anova function. If |
const |
a numeric value containing the tuning constant. |
ipsi |
an integer value specifying the psi-function. |
yc |
a numeric value containing the tuning constant. |
test |
a single character value specifying which test should be computed in the Anova table. The possible choices are "RWald" and "RF". |
The default test used by anova is the "RWald" test, which is the Wald test based on robust estimates of the coefficients and covariance matrix. If test is "RF", the robustified F-test is used instead.
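The Wald test described above can be sketched in base R. This is only an illustration under stated assumptions: a classical lm fit on the built-in stackloss data stands in for an lmRob fit, wald_stat is a hypothetical helper that is not part of the package, and the chi-squared reference distribution is an assumption rather than something this page confirms.

```r
# Hypothetical helper (not part of the robust package): Wald statistic for
# testing that the coefficients named in `nm` are all zero.
wald_stat <- function(fit, nm) {
  b <- coef(fit)[nm]                       # coefficients under test
  V <- vcov(fit)[nm, nm, drop = FALSE]     # their estimated covariance
  W <- as.numeric(crossprod(b, solve(V, b)))  # quadratic form b' V^{-1} b
  c(W = W,
    df = length(nm),
    p.value = pchisq(W, df = length(nm), lower.tail = FALSE))
}

# Classical fit used as a stand-in for a robust fit
fit <- lm(stack.loss ~ ., data = stackloss)
res <- wald_stat(fit, "Acid.Conc.")
print(res)
```

With a robust fit, the same quadratic form would be evaluated with the robust coefficient and covariance estimates in place of the classical ones.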
an anova object.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A. (1986). Robust statistics: the approach based on influence functions. John Wiley & Sons.
data(stack.dat)
stack.small <- lmRob(Loss ~ Water.Temp + Acid.Conc., data = stack.dat)
stack.full <- lmRob(Loss ~ ., data = stack.dat)
anova(stack.full)
anova(stack.full, stack.small)
Patients suffering from simple or complex partial seizures were randomized to receive either the antiepileptic drug progabide or a placebo. At each of four successive postrandomization clinic visits, the number of seizures occurring over the previous two weeks was reported.
data(breslow.dat)
A data frame with 59 observations on the following 12 variables.
ID
an integer value specifying the patient identification number.
Y1
an integer value, the number of seizures during the first two week period.
Y2
an integer value, the number of seizures during the second two week period.
Y3
an integer value, the number of seizures during the third two week period.
Y4
an integer value, the number of seizures during the fourth two week period.
Base
an integer value giving the eight-week baseline seizure count.
Age
an integer value giving the age of the patient in years.
Trt
the treatment: a factor with levels placebo and progabide.
Ysum
an integer value, the sum of Y1, Y2, Y3 and Y4.
sumY
an integer value, the sum of Y1, Y2, Y3 and Y4.
Age10
a numeric value, Age divided by 10.
Base4
a numeric value, Base divided by 4.
Breslow, N. E., and Clayton, D. G. (1993), "Approximate Inference in Generalized Linear Mixed Models," Journal of the American Statistical Association, Vol. 88, No. 421, pp. 9-25.
Thrall, P. F., and Vail, S. C. (1990), "Some Covariance Models for Longitudinal Count Data With Overdispersion," Biometrics, Vol. 46, pp. 657-671.
data(breslow.dat)
Compute an estimate of the covariance/correlation matrix and location vector using classical methods.
Its main intention is to return an object compatible with that produced by covRob, but fit using classical methods.
covClassic(data, corr = FALSE, center = TRUE, distance = TRUE, na.action = na.fail, unbiased = TRUE, ...)
data |
a numeric matrix or data frame containing the data. |
corr |
a logical flag. If |
center |
a logical flag or a numeric vector of length |
distance |
a logical flag. If |
na.action |
a function to filter missing data. The default |
unbiased |
logical indicating if an unbiased estimate of the covariance matrix should be computed. If false, the maximum likelihood estimate is computed. |
... |
additional . |
a list with class “covClassic” containing the following elements:
call |
an image of the call that produced the object with all the arguments named. |
cov |
a numeric matrix containing the estimate of the covariance/correlation matrix. |
center |
a numeric vector containing the estimate of the location vector. |
dist |
a numeric vector containing the squared Mahalanobis distances. Only
present if |
corr |
a logical flag. If |
Originally, and in S-PLUS, this function was called cov; it has been renamed because that name masked the function of the same name in the standard package stats.
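The difference between the unbiased and maximum likelihood estimates, and the squared Mahalanobis distances stored in dist, can be illustrated with base R alone. This is a minimal sketch that uses the built-in stackloss data as a stand-in; it does not call covClassic and is not claimed to reproduce its internals.

```r
x <- as.matrix(stackloss)          # built-in data set, used as a stand-in
n <- nrow(x)

center  <- colMeans(x)             # classical location estimate
cov_unb <- cov(x)                  # unbiased estimate (divisor n - 1)
cov_ml  <- cov_unb * (n - 1) / n   # maximum likelihood estimate (divisor n)

# Squared Mahalanobis distance of each row from the center
d2 <- mahalanobis(x, center, cov_unb)
head(d2)
```

The two covariance estimates differ only by the factor (n - 1) / n, so the choice matters mainly for small samples.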
data(stack.dat)
covClassic(stack.dat)
Compute robust estimates of multivariate location and scatter.
covRob(data, corr = FALSE, distance = TRUE, na.action = na.fail, estim = "auto", control = covRob.control(estim, ...), ...)
data |
a numeric matrix or data frame containing the data. |
corr |
a logical flag. If |
distance |
a logical flag. If |
na.action |
a function to filter missing data. The default |
estim |
a character string specifying the robust estimator to be used. The choices are: "mcd" for the Fast MCD algorithm of Rousseeuw and Van Driessen, "weighted" for the Reweighted MCD, "donostah" for the Donoho-Stahel projection based estimator, "M" for the constrained M estimator provided by Rocke, "pairwiseQC" for the orthogonalized quadrant correlation pairwise estimator, and "pairwiseGK" for the Orthogonalized Gnanadesikan-Kettenring pairwise estimator. The default "auto" selects from "donostah", "mcd", and "pairwiseQC" with the goal of producing a good estimate in a reasonable amount of time. |
control |
a list of control parameters to be used in the numerical algorithms. See |
... |
control parameters may be passed directly when |
The covRob function selects a robust covariance estimator that is likely to provide a good estimate in a reasonable amount of time. Presently this selection is based on the problem size. The Donoho-Stahel estimator is used if there are fewer than 1000 observations and fewer than 10 variables, or fewer than 5000 observations and fewer than 5 variables. If there are fewer than 50000 observations and fewer than 20 variables then the MCD is used. For larger problems, the Orthogonalized Quadrant Correlation estimator is used.
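The size-based selection rule can be written out as a small function. This is a sketch of the rule as stated in the paragraph above; choose_estim is a hypothetical name, and the package's internal implementation may differ in detail.

```r
# Hypothetical helper mirroring the documented selection rule for
# estim = "auto"; n is the number of observations, p the number of variables.
choose_estim <- function(n, p) {
  if ((n < 1000 && p < 10) || (n < 5000 && p < 5)) {
    "donostah"       # Donoho-Stahel for small problems
  } else if (n < 50000 && p < 20) {
    "mcd"            # MCD for medium-sized problems
  } else {
    "pairwiseQC"     # orthogonalized quadrant correlation otherwise
  }
}

choose_estim(21, 4)
choose_estim(2000, 8)
choose_estim(100000, 30)
```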
The MCD and Reweighted-MCD estimates (estim = "mcd" and estim = "weighted" respectively) are computed using the covMcd function in the robustbase package. By default, covMcd returns the reweighted estimate; the actual MCD estimate is contained in the components of the output list prefixed with raw.
The M estimate (estim = "M") is computed using the CovMest function in the rrcov package. For historical reasons the Robust Library uses the MCD to compute the initial estimate.
The Donoho-Stahel (estim = "donostah") estimator is computed using the CovSde function provided in the rrcov package.
The pairwise estimators (estim = "pairwisegk" and estim = "pairwiseqc") are computed using the CovOgk function in the rrcov package.
an object of class "covRob" with components:
call |
an image of the call that produced the object with all the arguments named. |
cov |
a numeric matrix containing the final robust estimate of the covariance/correlation matrix. |
center |
a numeric vector containing the final robust estimate of the location vector. |
dist |
a numeric vector containing the squared Mahalanobis distances computed using robust estimates of covariance and location contained in |
raw.cov |
a numeric matrix containing the initial robust estimate of the covariance/correlation matrix. If there is no initial robust estimate then this element is set to |
raw.center |
a numeric vector containing the initial robust estimate of the location vector. If there is no initial robust estimate then this element is set to |
raw.dist |
a numeric vector containing the squared Mahalanobis distances computed using the initial robust estimates of covariance and location contained in |
corr |
a logical flag. If |
estim |
a character string containing the name of the robust estimator. |
control |
a list containing the control parameters used by the robust estimator. |
Version 0.3-8 of the Robust Library: all of the functions originally contributed by the S-Plus Robust Library have been replaced by dependencies on the robustbase and rrcov packages. Computed results may differ from earlier versions of the Robust Library. In particular, the MCD estimators are now adjusted by a small sample size correction factor. Additionally, a bug was fixed where the final MCD covariance estimate produced with estim = "mcd" was not rescaled for consistency.
R. A. Maronna and V. J. Yohai (1995) The Behavior of the Stahel-Donoho Robust Multivariate Estimator. Journal of the American Statistical Association 90 (429), 330–341.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.
D. L. Woodruff and D. M. Rocke (1994) Computable robust estimation of multivariate location and shape on high dimension using compound estimators. Journal of the American Statistical Association, 89, 888–896.
R. A. Maronna and R. H. Zamar (2002) Robust estimates of location and dispersion of high-dimensional datasets. Technometrics 44 (4), 307–317.
CovSde, covMcd, CovOgk, CovMest, covRob.control, covClassic.
data(stackloss)
covRob(stackloss)
This function is used to create a list of control parameters for the underlying robust estimator used in the covRob function.
covRob.control(estim, ...)
estim |
a character vector of length one giving the name of the estimator to generate the control parameters for. |
... |
control parameters appropriate for the robust estimator specified in |
The control parameters are estimator specific. Information on the control parameters (and their default values) can be found in the help files of each of the robust covariance estimators.
a list of control parameters appropriate for the robust estimator given in estim. The value of estim occupies the first element of the list.
This function is a utility function for covRob.
The underlying robust estimators are: CovSde, covMcd and CovOgk. Power-users should consider calling these functions directly.
mcd.control <- covRob.control("mcd", quan = 0.75, ntrial = 1000)
ds.control <- covRob.control("donostah", prob = 0.95)
qc.control <- covRob.control("pairwiseqc")
For a covfm object containing 2 models, this function plots the Mahalanobis distance from the first model on the y-axis and the Mahalanobis distance from the second model on the x-axis.
ddPlot.covfm(x, level = 0.95, strip = "", id.n = 3, ...)
x |
a |
level |
a single numeric value between 0 and 1 giving the chi-squared percent point used to compute the outlyingness threshold. |
strip |
a character string printed in the “strip” at the top of the plot. |
id.n |
a single nonnegative integer specifying the number of extreme points to label in the plot. |
... |
additional arguments are passed to |
If the models can be compared, the plotted trellis object is invisibly returned. Otherwise x is returned invisibly.
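The outlyingness threshold controlled by level can be illustrated in base R. This sketch assumes the cutoff is the square root of the chi-squared quantile with degrees of freedom equal to the number of variables, applied to unsquared Mahalanobis distances; it uses classical estimates on the built-in stackloss data and does not call ddPlot.covfm.

```r
x <- as.matrix(stackloss)      # stand-in data with p = 4 variables
p <- ncol(x)
level <- 0.95

# Assumption: the cutoff is the square root of the chi-squared percent point,
# compared against (unsquared) Mahalanobis distances.
cutoff <- sqrt(qchisq(level, df = p))

# Classical distances; a robust fit would supply its own center and cov
d <- sqrt(mahalanobis(x, colMeans(x), cov(x)))
flagged <- which(d > cutoff)   # indices of observations beyond the cutoff
cutoff
flagged
```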
data(woodmod.dat)
woodm.fm <- fit.models(list(Robust = "covRob", Classical = "covClassic"), data = woodmod.dat)
ddPlot.covfm(woodm.fm, main = "Plot Title", xlab = "x-axis label", ylab = "y-axis label", pch = 4, col = "purple")
Produces side-by-side plots of Mahalanobis distance computed using the location and covariance matrix estimates contained in each element of a covfm object.
distancePlot.covfm(x, level = 0.95, id.n = 3, ...)
x |
a |
level |
a single numeric value between 0 and 1 giving the chi-squared percent point used to compute the outlyingness threshold. |
id.n |
a single nonnegative integer specifying the number of extreme points to label in the plot. |
... |
additional arguments are passed to |
the trellis object is invisibly returned.
data(woodmod.dat)
woodm.fm <- fit.models(list(Robust = "covRob", Classical = "covClassic"), data = woodmod.dat)
distancePlot.covfm(woodm.fm, main = "Plot Title", xlab = "x-axis label", ylab = "y-axis label", pch = 4, col = "purple")
drop1.lmRob is used to investigate a robust Linear Model object by recomputing it, successively omitting each of a number of specified terms.
## S3 method for class 'lmRob'
drop1(object, scope, scale, keep, fast = FALSE, ...)
object |
an lmRob object. |
scope |
an optional |
scale |
a single numeric value containing a residual scale estimate. If missing, the scale estimate in |
keep |
a character vector of names of components that should be saved for each subset model. Only names from the set |
fast |
a logical value. If |
... |
additional arguments required by the generic drop1 function. |
This function is a method for the generic function drop1 for class "lmRob".
An anova object is constructed, consisting of the term labels, the degrees of freedom, and Robust Final Prediction Errors (RFPE) for each subset model. If keep is missing, the anova object is returned. If keep is present, a list with components "anova" and "keep" is returned. In this case, the "keep" component is a matrix of mode "list", with a column for each subset model, and a row for each component kept.
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
drop1(stack.rob)
When there are 3 or more variables in the data, this function produces a matrix with ellipses drawn in the upper triangle. The ellipse in cell $(i, j)$ of the plot is drawn to be a contour of a standard bivariate normal with correlation $\rho_{ij}$. One ellipse is drawn in each cell for each model in the covfm object. When there are 2 variables in the data, this function produces a scatter plot of the data with an overlaid 95% confidence ellipse for each model in the covfm object.
ellipsesPlot.covfm(x, ...)
x |
a |
... |
additional arguments are ignored. |
x is invisibly returned.
data(woodmod.dat)
woodm.fm <- fit.models(list(Robust = "covRob", Classical = "covClassic"), data = woodmod.dat)
ellipsesPlot.covfm(woodm.fm)
Maximum-likelihood fitting of univariate distributions.
fitdstn(x, densfun, ...)
x |
a numeric vector containing the sample. |
densfun |
a character string naming the distribution. Distributions ‘gamma’, ‘lognormal’, and ‘weibull’ are supported. |
... |
additional arguments are ignored. |
This function relies on the fitdistr function for the computations. The returned object is modified to support plotting and comparison.
a list with class “fitdstn” containing the following elements:
estimate |
a named numeric vector containing the parameter estimates. |
sd |
a named numeric vector containing the standard deviations of the parameter estimates. |
vcov |
a numeric matrix containing the variance-covariance matrix of the estimated parameter vector. |
n |
a single numeric value indicating the number of sample points in |
loglik |
a single numeric value giving the maximized log-likelihood. |
call |
the matched call. |
densfun |
the character string |
x |
the data provided in |
The print method displays the estimated parameters and their standard errors (in parentheses).
An important goal here is the comparison with robust fits to the same distributions; see fitdstnRob.
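To show what the estimate, sd and loglik components represent, here is a base R sketch of a maximum likelihood fit for the lognormal case, where the estimates have a closed form. It runs on simulated data and does not call fitdstn or fitdistr; the variable names and the printed layout are illustrative assumptions.

```r
set.seed(1)
x <- rlnorm(200, meanlog = 1, sdlog = 0.5)   # simulated sample
n <- length(x)

lx <- log(x)
meanlog <- mean(lx)
sdlog   <- sqrt(mean((lx - meanlog)^2))       # ML uses divisor n, not n - 1

estimate <- c(meanlog = meanlog, sdlog = sdlog)
# Asymptotic standard errors from the inverse Fisher information
se <- c(meanlog = sdlog / sqrt(n), sdlog = sdlog / sqrt(2 * n))
# Log-likelihood evaluated at the ML estimates
loglik <- sum(dlnorm(x, meanlog, sdlog, log = TRUE))

# Print each estimate followed by its standard error in parentheses
cat(sprintf("%s: %.4f (%.4f)\n", names(estimate), estimate, se), sep = "")
```

A robust fit to the same sample could then be compared against these classical values parameter by parameter.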
fitdistr, which provides many more choices for densfun.
Robust Fitting of Univariate Distributions.
fitdstnRob(x, densfun, ...)
x |
A numeric vector containing the sample. |
densfun |
a character string naming the distribution. Distributions ‘gamma’, ‘lognormal’, and ‘weibull’ are recognized. |
... |
additional arguments are passed to the fitting functions. |
a list with class “fitdstn” containing the following elements:
estimate |
a named numeric vector containing the parameter estimates. |
sd |
a named numeric vector containing the standard deviations of the parameter estimates. |
vcov |
a numeric matrix containing the variance-covariance matrix of the estimated parameter vector. |
mu |
a single numeric value containing an estimate of the mean. |
V.mu |
a single numeric value containing the variance of the estimated mean. |
control |
a list containing the control parameters used by the estimator. |
call |
the matched call. |
densfun |
the character string |
x |
the data provided in |
The print method displays the estimated parameters and their standard errors (in parentheses).
gammaRob, lognormRob, weibullRob.
For the classical counterparts, see fitdstn.
Robust estimation of gamma distribution parameters
gammaRob(x, estim = c("M", "tdmean"), control = gammaRob.control(estim, ...), ...)
x |
a numeric vector containing the sample. |
estim |
a character string specifying which estimator to use. |
control |
a list of control parameters appropriate for the estimator in |
... |
control parameters may also be given here. |
a list with class “fitdstn” containing the following elements:
estimate |
a named numeric vector containing the parameter estimates. |
sd |
a named numeric vector containing the standard deviations of the parameter estimates. |
vcov |
a numeric matrix containing the variance-covariance matrix of the estimated parameter vector. |
mu |
a single numeric value containing an estimate of the mean. |
V.mu |
a single numeric value containing the variance of the estimated mean. |
control |
a list containing the control parameters used by the estimator. |
The print method displays the estimated parameters and their standard errors (in parentheses).
Create a list of control parameters for the gammaRob function.
gammaRob.control(estim, ...)
estim |
a character string specifying the estimator. |
... |
control parameters appropriate for the estimator given in |
a list of control parameters appropriate for the specified estimator.
Generates a random dataset with some amount of contamination.
gen.data(coeff, n = 100, eps = 0.1, sig = 3, snr = 1/20, seed = 837)
coeff |
a numeric vector of length 3 containing the true coefficients. |
n |
a positive integer giving the number of observations in the data set. |
eps |
a numeric value between 0 and 0.5 specifying the fraction of contamination. |
sig |
a positive numeric value giving the standard deviation of the uncontaminated data. |
snr |
a positive numeric value giving the signal to noise ratio, well not really. |
seed |
an integer value giving the seed for the random number generator. |
a data frame with n rows and 4 columns. The regressors are generated as: rnorm(n,1), rnorm(n,1)^3, exp(rnorm(n,1)). It also generates an unused vector x4.
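The regressor construction described above can be sketched in base R. The column names x1 to x3 are assumptions chosen for illustration, and the response and the contamination mechanism are left out because this page does not specify them; the draws are not claimed to match those of gen.data.

```r
n <- 100
set.seed(837)  # same value as the documented default seed; draws need not match

# Regressors as described above; the response and the contamination
# mechanism are not documented here and are deliberately omitted.
x1 <- rnorm(n, 1)
x2 <- rnorm(n, 1)^3
x3 <- exp(rnorm(n, 1))

# Column names are assumed for illustration
regressors <- data.frame(x1 = x1, x2 = x2, x3 = x3)
summary(regressors)
```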
Produces an object of class glmRob which is a Robust Generalized Linear Model fit.
glmRob(formula, family = binomial(), data, weights, subset, na.action, method = "cubif", model = TRUE, x = FALSE, y = TRUE, control = glmRob.control, contrasts = NULL, ...)
formula |
a formula expression as for other regression models, of the form response ~ predictors. See the documentation of |
family |
a family object - only |
data |
an optional data frame in which to interpret the variables occurring in the formula. |
weights |
an optional vector of weights to be used in the fitting process. Should be |
subset |
an expression specifying the subset of the data to which the model is fit. This can be a logical vector (which is replicated to have length equal to the number of observations), a numeric vector indicating which observations are included, or a character vector of the row names to be included. By default all observations are used. |
na.action |
a function to filter missing data. This is applied to the |
method |
a character vector indicating the fitting method. The choices are |
model |
a logical flag. If |
x |
a logical flag. If |
y |
a logical flag. If |
contrasts |
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. The names of the list should be the names of the corresponding variables, and the elements should either be contrast-type matrices (matrices with as many rows as levels of the factor and with columns linearly independent of each other and of a column of ones), or else they should be functions that compute such contrast matrices. |
control |
a list of iteration and algorithmic constants to control the conditionally unbiased bounded influence robust fit. See |
... |
control arguments may be specified directly. |
a list with class glmRob containing the robust generalized linear model fit. See glmRob.object for details.
Copas, J. B. (1988). Binary Regression Models for Contaminated Data. JRSS 50, 225-265.
Kunsch, L., Stefanski L. and Carroll, R. (1989). Conditionally Unbiased Bounded-Influence Estimation in General Regression Models, with Applications to Generalized Linear Models. JASA 84, 460-466.
Carroll, R. J. and Pederson, S. (1993). On Robustness in the Logistic Regression Model. JRSS 55, 693-706.
Marazzi, A. (1993). Algorithms, routines and S functions for robust statistics. Wadsworth & Brooks/Cole, Pacific Grove, CA.
glmRob.control, glmRob.object, glmRob.cubif.control, glmRob.mallows.control, glmRob.misclass.control, glm.
data(breslow.dat)
glmRob(sumY ~ Age10 + Base4*Trt, family = poisson(), data = breslow.dat, method = "cubif")
Generates a list of control parameters for glmRob. The main purpose of this function is to implement the default behaviour for glmRob. Use the functions listed in the See Also section to generate control lists for the different robust estimators.
glmRob.control(method, ...)
method |
a character vector specifying which estimator the control parameters should be generated for. The choices are |
... |
additional arguments are included in the control (if appropriate for the estimator specified by |
a list of control parameters appropriate for the fitting method specified by the method argument.
glmRob.cubif.control, glmRob.mallows.control, glmRob.misclass.control.
Robustly fit a generalized linear model using a conditionally unbiased bounded influence (“cubif”) estimator. This function is called by the high-level function glmRob when method = "cubif" (the default) is specified.
glmRob.cubif(x, y, intercept = FALSE, offset = 0, family = binomial(), null.dev = TRUE, control)
x |
a numeric model matrix. |
y |
either a numeric vector containing the response or, in the case of the binomial family, a two-column numeric matrix containing the number of successes and failures. |
intercept |
a logical value. If |
offset |
a numeric vector containing the offset. |
family |
a family object. |
null.dev |
a logical value. If |
control |
a list of control parameters. See |
See glmRob.object.
Kunsch, L., Stefanski L. and Carroll, R. (1989). Conditionally Unbiased Bounded-Influence Estimation in General Regression Models, with Applications to Generalized Linear Models. JASA 84, 460–466.
Marazzi, A. (1993). Algorithms, routines and S functions for robust statistics. Wadsworth & Brooks/Cole, Pacific Grove, CA.
Allows users to set parameters for glmRob.
glmRob.cubif.control(epsilon = 0.001, maxit = 50, bpar = 2, cpar = 1.5, trc = FALSE, ...)
epsilon |
a positive numeric value specifying the convergence threshold for the parameters. |
maxit |
a positive integer giving the maximum number of iterations. |
bpar |
bpar |
cpar |
a single positive numeric value specifying the tuning constant for the initial estimate. This is the truncation value for the likelihood equation for the initial estimate. It determines the starting point of the iterative algorithm to calculate the final estimate. |
trc |
a logical value. If |
... |
additional arguments are ignored. |
a list is returned containing the values specified in the Arguments section.
Computes the Mallows Type Estimator provided by glmRob.
glmRob.mallows(x, y, control, offset, null.dev, family, Terms)
x |
model matrix |
y |
a numeric vector of Bernoulli responses. |
control |
control parameters. |
offset |
offset |
null.dev |
a logical value. If |
family |
a binomial family object. |
Terms |
the |
a list similar to glmRob.object.
glmRob
data(mallows.dat)
glmRob(y ~ a + b + c, data = mallows.dat, family = binomial(), method = 'mallows')
Allows users to set parameters for glmRob.
glmRob.mallows.control(wt.fn = wt.carroll, wt.tuning = 8, ...)
wt.fn |
a weight function that might depend on a tuning constant. This function will be evaluated at the square root of the robust Mahalanobis distances of the covariates divided by their dimension. |
wt.tuning |
a tuning constant for |
... |
additional arguments are ignored. |
a list is returned, consisting of these parameters packaged to be used by glmRob(). The values for glmRob.mallows.control() can be supplied directly in a call to glmRob(). These values are filtered through glmRob.mallows.control() inside glmRob().
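How a weight function and tuning constant turn covariate distances into observation weights can be sketched in base R. Everything here is an illustration under assumptions: wt_example is a hypothetical bisquare-type weight and is not claimed to match wt.carroll, classical rather than robust Mahalanobis distances are used, and "divided by their dimension" is read as dividing the squared distance by the number of covariates before taking the square root.

```r
# Hypothetical bisquare-type weight function (illustration only; not claimed
# to match the package's wt.carroll).
wt_example <- function(u, b) ifelse(abs(u) <= b, (1 - (u / b)^2)^3, 0)

x <- as.matrix(stackloss[, 1:3])   # stand-in covariates
p <- ncol(x)

# Classical distances are used for brevity; the package uses robust ones.
d2 <- mahalanobis(x, colMeans(x), cov(x))
u  <- sqrt(d2 / p)                 # assumed reading of "divided by their dimension"
w  <- wt_example(u, b = 8)         # 8 is the documented default for wt.tuning
round(w, 3)
```

Observations whose covariates lie far from the bulk of the data receive weights closer to zero, which is how a Mallows-type estimator limits the influence of leverage points.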
Computes the consistent misclassification estimate provided in glmRob.
glmRob.misclass(x, y, control, offset, null.dev, family, Terms)
x |
model matrix. |
y |
response. |
control |
control parameters. |
offset |
offset. |
null.dev |
a logical value. |
family |
a binomial family object. |
Terms |
the Terms object computed in glmRob. |
a list similar to glmRob.object.
data(leuk.dat)
glmRob(y ~ ag + wbc, data = leuk.dat, family = binomial(), method = 'misclass')
Allows users to set parameters for glmRob.
glmRob.misclass.control(mc.gamma = 0.01, mc.maxit = 30, mc.trc = FALSE, mc.tol = 0.001, mc.initial = NULL, ...)
mc.gamma |
a real number between 0 and 1 that represents the probability of misclassification of a response variable. |
mc.maxit |
maximum number of iterations. |
mc.trc |
a logical value indicating whether a trace of the current parameter values is printed to the screen while the algorithm iterates. |
mc.tol |
convergence threshold. |
mc.initial |
a vector of initial values to start the iterations. If omitted, the coefficients resulting from a non-robust glm fit are used. |
... |
additional arguments are ignored. |
a list containing the parameters packaged to be used by glmRob. The values for glmRob.misclass.control can be supplied directly in a call to glmRob. These values are filtered through glmRob.misclass.control inside glmRob.
These are objects of class glmRob which represent the robust fit of a generalized linear regression model, as estimated by glmRob().
coefficients |
the coefficients of the |
linear.predictors |
the linear fit, given by the product of the model matrix and the coefficients. |
fitted.values |
the fitted mean values, obtained by transforming
|
residuals |
the residuals from the final fit; also known as working residuals, they are typically not interpretable. |
deviance |
up to a constant, minus twice the log-likelihood evaluated at the final
|
null.deviance |
the deviance corresponding to the model with no predictors. |
family |
a 3 element character vector giving the name of the family, the link and the variance function. |
rank |
the number of linearly independent columns in the model matrix. |
df.residuals |
the number of degrees of freedom of the residuals. |
call |
a copy of the call that produced the object. |
assign |
the same as the |
contrasts |
the same as the |
terms |
the same as the |
ni |
vector of the number of repetitions on the dependent variable. If the model
is poisson then |
weights |
weights from the final fit. |
iter |
number of iterations used to compute the estimates. |
y |
the dependent variable. |
anova, coefficients, deviance, fitted.values, family, formula, plot, print, residuals, summary.
The following components must be included in a legitimate "glmRob" object. Residuals, fitted values, and coefficients should be extracted by the generic functions of the same name, rather than by the "$" operator. The family function returns the entire family object used in the fitting, and deviance can be used to extract the deviance of the fit.
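The extractor-based access pattern recommended above can be sketched with a classical glm fit from the stats package, used here as a stand-in because it answers to the same generics. The assumption is that a glmRob fit is queried the same way; the model and data are chosen only for illustration.

```r
# Classical glm fit as a stand-in for a glmRob fit
fit <- glm(stack.loss ~ Air.Flow, family = poisson(), data = stackloss)

beta <- coef(fit)        # instead of fit$coefficients
mu   <- fitted(fit)      # instead of fit$fitted.values
res  <- residuals(fit)   # residual type depends on the method's default
dev  <- deviance(fit)    # deviance of the fit
fam  <- family(fit)      # the full family object

# Fitted means are the inverse link applied to the linear predictor
eta <- drop(model.matrix(fit) %*% beta)
all.equal(unname(mu), unname(fam$linkinv(eta)))
```

Using the extractors keeps code independent of how a particular class stores its components internally.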
An exmaple data set for the misclassification fitter in glmRob.
data(leuk.dat)
A data frame with 33 observations on the following 3 variables.
wbc
a numeric vector.
ag
a numeric vector.
y
a numeric vector.
Don't know - if you know please email the package maintainer.
data(leuk.dat)
Performs a robust linear regression with high breakdown point and high efficiency.
lmRob(formula, data, weights, subset, na.action, model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, nrep = NULL, control = lmRob.control(...), ...)
formula |
a |
data |
a |
weights |
vector of observation weights; if supplied, the algorithm fits to minimize the sum of a function of the square root of the weights multiplied into the residuals. The length of |
subset |
expression saying which subset of the rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of observations), or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default. |
na.action |
a function to filter missing data. This is applied to the |
model |
a logical flag: if |
x |
a logical flag: if |
y |
a logical flag: if |
contrasts |
a list giving contrasts for some or all of the factors appearing in the model formula. The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels. |
nrep |
the number of random subsamples to be drawn. If |
control |
a list of control parameters to be used in the numerical algorithms. See |
... |
additional arguments are passed to the control functions. |
By default, the lmRob
function automatically chooses an appropriate algorithm to compute a final robust estimate with high breakdown point and high efficiency. The final robust estimate is computed based on an initial estimate with high breakdown point. For the initial estimation, the alternate M-S estimate is used if there are any factor variables in the predictor matrix, and an S-estimate is used otherwise. To compute the S-estimate, a random resampling or a fast procedure is used unless the data set is small, in which case exhaustive resampling is employed. See lmRob.control
for how to choose between the different algorithms.
a list describing the regression. Note that the solution returned here is an approximation to the true solution based upon a random algorithm (except when "Exhaustive"
resampling is chosen). Hence you will get (slightly) different answers each time if you make the same call with a different seed. See lmRob.control
for how to set the seed, and see lmRob.object
for a complete description of the object returned.
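Because the result can depend on the seed, fixing it makes a fit repeatable. The sketch below assumes the seed is supplied through lmRob.control(), as in the usage shown for that function; the seed value 2024 and the object names are arbitrary choices for illustration.

```r
library(robust)

data(stack.dat)

# Fix the seed used by the resampling step through lmRob.control()
ctrl <- lmRob.control(seed = 2024)

fit.a <- lmRob(Loss ~ ., data = stack.dat, control = ctrl)
fit.b <- lmRob(Loss ~ ., data = stack.dat, control = ctrl)

# Same data, same call, same seed: the coefficient estimates should agree
same.coef <- isTRUE(all.equal(coef(fit.a), coef(fit.b)))
print(same.coef)
```

With a data set as small as stack.dat the fit may not involve random resampling at all, so the comparison is mainly a template for larger problems.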
Gervini, D., and Yohai, V. J. (1999). A class of robust and fully efficient regression estimates; mimeo, Universidad de Buenos Aires.
Marazzi, A. (1993). Algorithms, routines, and S functions for robust statistics. Wadsworth & Brooks/Cole, Pacific Grove, CA.
Maronna, R. A., and Yohai, V. J. (2000). Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference 89, 197–214.
Pena, D., and Yohai, V. (1999). A Fast Procedure for Outlier Diagnostics in Large Regression Problems. Journal of the American Statistical Association 94, 434–445.
Yohai, V. (1987). High breakdown-point and high efficiency robust estimates for regression. Annals of Statistics 15, 642–656.
Yohai, V., Stahel, W. A., and Zamar, R. H. (1991). A procedure for robust estimation and inference in linear regression; in Stahel, W. A. and Weisberg, S. W., Eds., Directions in robust statistics and diagnostics, Part II. Springer-Verlag.
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
Allows the users to set values affecting the estimation procedure for
robust regression in lmRob
.
lmRob.control(tlo = 1e-4, tua = 1.5e-06, mxr = 50, mxf = 50, mxs = 50, tl = 1e-06, estim = "Final", initial.alg = "Auto", final.alg = "MM", seed = 1313, level = 0.1, efficiency = 0.9, weight = c("Optimal", "Optimal"), trace = TRUE)
tlo |
the relative tolerance in the iterative algorithms. |
tua |
the tolerance used for the determination of pseudo-rank. |
mxr |
the maximum number of iterations in the refinement step. |
mxf |
the maximum number of iterations for computing final coefficient estimates. |
mxs |
the maximum number of iterations for computing scale estimate. |
tl |
the tolerance for scale denominators. If a scale estimate becomes less than |
estim |
parameter that determines the type of estimator to be computed. If |
initial.alg |
parameter that determines the algorithm for initial estimates. Valid choices are |
final.alg |
parameter that determines the type of the final estimates. Valid choices are |
seed |
seed parameter used in the random sampling and genetic algorithm for the computation of initial estimates. |
weight |
a character vector that determines the type of loss functions to be used. The first determines the loss function used for the initial estimates, and the second determines the loss function used for the final M-estimates. Valid choices are |
level |
the level of significance of the test for bias of the final MM-estimates, if desired later on. |
efficiency |
the asymptotic efficiency of the final estimate. |
trace |
a logical flag: if |
a list containing the values used for each of the control parameters.
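Since the return value is an ordinary list, it can be inspected before it is handed to lmRob. A minimal sketch, assuming the list elements carry the same names as the arguments in the usage above; the particular values are arbitrary.

```r
library(robust)

# Request a higher asymptotic efficiency and more final iterations
ctrl <- lmRob.control(efficiency = 0.95, mxf = 100)

# Parameters that were not specified keep their defaults
ctrl$efficiency
ctrl$mxf
ctrl$seed
```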
data(stack.dat)
my.control <- lmRob.control(weight = c("Bisquare", "Optimal"))
stack.bo <- lmRob(Loss ~ ., data = stack.dat, control = my.control)
These are the basic computing engines called by lmRob
used to robustly fit linear models. These functions are not intended to be used directly.
lmRob.fit(x, y, x1.idx = NULL, nrep = NULL, robust.control = NULL, ...)
lmRob.wfit(x, y, w, x1.idx = NULL, nrep = NULL, robust.control = NULL, ...)
x |
a numeric matrix containing the design matrix. |
y |
a numeric vector containing the linear model response. |
w |
a numeric vector containing the weights. |
x1.idx |
a numeric vector containing the indices of columns of the design matrix arising from the coding of factor variables. |
nrep |
the number of random subsamples to be drawn. If |
robust.control |
a list of control parameters to be used in the numerical algorithms. See |
... |
additional arguments. |
Fits a robust linear model with high breakdown point and high efficiency estimates. This function is used by lmRob
, but is not intended to be called directly by users.
lmRob.fit.compute(x, y, x1.idx = NULL, nrep = NULL, robust.control = NULL, ...)
x |
a numeric matrix containing the design matrix. |
y |
a numeric vector containing the linear model response. |
x1.idx |
a numeric vector containing the indices of columns of the design matrix arising from the coding of factor variables. |
nrep |
the number of random subsamples to be drawn. If |
robust.control |
a list of control parameters to be used in the numerical algorithms. See |
... |
additional arguments. |
an object of class "lmRob"
. See lmRob.object
for a complete description of the object returned.
Gervini, D., and Yohai, V. J. (1999). A class of robust and fully efficient regression estimates, mimeo, Universidad de Buenos Aires.
Marazzi, A. (1993). Algorithms, routines, and S functions for robust statistics. Wadsworth & Brooks/Cole, Pacific Grove, CA.
Maronna, R. A., and Yohai, V. J. (1999). Robust regression with both continuous and categorical predictors, mimeo, Universidad de Buenos Aires.
Yohai, V. (1987). High breakdown-point and high efficiency robust estimates for regression, Annals of Statistics, 15, 642-656.
Yohai, V., Stahel, W. A., and Zamar, R. H. (1991). A procedure for robust estimation and inference in linear regression, in Stahel, W. A. and Weisberg, S. W., Eds., Directions in robust statistics and diagnostics, Part II. Springer-Verlag.
These are objects of class lmRob
which represent the robust fit of a linear regression model, as estimated by the lmRob
function.
coefficients |
vector of coefficients for the robust regression. If |
T.coefficients |
the vector of coefficients for the initial estimate, if |
scale |
the scale estimate computed using the initial estimates. |
residuals |
the residual vector corresponding to the estimates returned in |
T.residuals |
the residual vector corresponding to the estimates returned in |
fitted.values |
the fitted values corresponding to the estimates returned in |
T.fitted.values |
the fitted values corresponding to the estimates returned in |
cov |
the estimated covariance matrix of the estimates in |
T.cov |
the estimated covariance matrix of the estimates in |
rank |
the rank of the design matrix |
iter.refinement |
the number of iterations required to refine the initial estimates. |
df.residuals |
the degrees of freedom in the residuals (the number of rows in |
est |
a character string that specifies the type of estimates returned. If |
control |
a list of control parameters, passed to the function |
genetic.control |
a list of control parameters, passed to the function |
dev |
the robust deviance if final MM-estimates are returned. |
T.dev |
the robust deviance corresponding to the initial S-estimates, if applicable. |
r.squared |
the fraction of variation in |
T.r.squared |
the fraction of variation in |
M.weights |
the robust estimate weights corresponding to the final MM-estimates in |
T.M.weights |
the robust estimate weights corresponding to the initial S-estimates in |
iter.final.coef |
the number of iterations required to compute the final MM-estimates of the coefficients, if applicable. |
call |
an image of the call that produced the object, but with the arguments all named and with the actual formula included as the |
assign |
the same as the |
contrasts |
the same as the |
terms |
the same as the |
This class of objects is returned from the lmRob
function.
add1
, anova
, coef
, deviance
, drop1
, fitted
, formula
, labels
, plot
, print
, residuals
, summary
, update
.
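The generic functions listed above cover the returned (final) estimates; the remaining components are read from the list directly. The following sketch assumes a default fit, so that final estimates are returned and the initial estimates are kept in the T.-prefixed components described above.

```r
library(robust)

data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)

# Generic extractors for the returned estimates
coef(stack.rob)
head(fitted(stack.rob))
head(residuals(stack.rob))

# Components without a generic extractor are read from the list
stack.rob$est             # type of estimates returned
stack.rob$scale           # scale estimate computed from the initial estimates
stack.rob$T.coefficients  # initial estimates, kept when final ones are returned
```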
The following components must be included in a legitimate "lmRob"
object:
Computes the robust Final Prediction Errors (FPE) for a robust regression fit using M-estimates.
lmRob.RFPE(object, scale = NULL)
object |
an lmRob object. |
scale |
a numeric value specifying the scale estimate used to compute the robust FPE. Usually this should
be the scale estimate from an encompassing model. If |
a single numeric value giving the robust final prediction error.
lmRob
,
step.lmRob
,
drop1.lmRob
.
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
lmRob.RFPE(stack.rob)
Robust estimation of lognormal distribution parameters.
lognormRob(x, estim = c("tdmean"), control = lognormRob.control(estim, ...), ...)
x |
a numeric vector containing the sample. |
estim |
a character string specifying which estimator to use. |
control |
a list of control parameters appropriate for the estimator in |
... |
control parameters may also be given here. |
a list with class “fitdstn” containing the following elements:
estimate |
a named numeric vector containing the parameter estimates. |
sd |
a named numeric vector containing the standard deviations of the parameter estimates. Missing in current implementation. |
vcov |
a numeric matrix containing the variance-covariance matrix of the estimated parameter vector. Missing in current implementation. |
mu |
a single numeric value containing an estimate of the mean. |
V.mu |
a single numeric value containing the variance of the estimated mean. |
control |
a list containing the control parameters used by the estimator. |
The print
method displays the estimated parameters and their
standard errors (in parentheses).
lognormRob.control
, fitdstnRob
.
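This entry has no example, so here is a minimal sketch on simulated data. The sample, the appended outliers, and the object names are made up for illustration; only the lognormRob call and the documented components of its return value are taken from the description above.

```r
library(robust)

set.seed(1)
# Simulated lognormal sample with a few gross outliers appended
x <- c(rlnorm(200, meanlog = 1, sdlog = 0.5), 50, 60, 75)

fit <- lognormRob(x, estim = "tdmean")

fit$estimate  # named vector of parameter estimates
fit$mu        # estimate of the mean
```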
Create a list of control parameters for the lognormRob
function.
lognormRob.control(estim, ...)
estim |
a character string specifying the estimator. |
... |
control parameters appropriate for the estimator given in |
a list of control parameters appropriate for the specified estimator.
Test for bias between least-squares and robust MM linear regression estimates.
lsRobTest(object, test = c("T2", "T1"), ...)
object |
an |
test |
either |
... |
additional arguments are ignored. |
rob.fit <- lmRob(stack.loss ~ ., data = stackloss)
lsRobTest(rob.fit)
lsRobTest(rob.fit, test = "T1")
An example data set for the mallows fitter in glmRob.
data(mallows.dat)
A data frame with 70 observations on the following 4 variables.
y
a numeric vector.
a
a numeric vector.
b
a numeric vector.
c
a numeric vector.
Don't know - if you know please email the package maintainer.
data(mallows.dat)
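Beyond loading the data, one might fit it as sketched below. This is an assumption rather than a documented example: it presumes that y is a binary response, that a, b and c are the predictors, and that glmRob selects the Mallows-type fitter through method = "mallows".

```r
library(robust)

data(mallows.dat)

# Assumed usage: binary response y with numeric predictors a, b and c
mallows.rob <- glmRob(y ~ a + b + c, family = binomial(),
                      data = mallows.dat, method = "mallows")
mallows.rob
```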
Plot the estimated densities over a histogram of the data.
overlaidDenPlot.fdfm(x, trunc = 1.0 - 1e-3, ...)
x |
an |
trunc |
if non NULL, the maximum x-value of the plot is the
largest |
... |
additional arguments are passed to the plotting functions. |
x
is invisibly returned.
data(los, package = "robustbase")

## Not run:
los.fm <- fit.models(c(Robust = "fitdstnRob", MLE = "fitdstn"),
                     x = los, densfun = "gamma")

los.fm <- fit.models(c(Robust = "fitdstnRob", MLE = "fitdstn"),
                     x = los, densfun = "weibull")

overlaidDenPlot.fdfm(los.fm, xlab = "x-axis label", ylab = "y-axis label",
                     main = "Plot Title")
## End(Not run)
Generic plot method for objects with classes “covfm”, “covRob”, and “covClassic”.
## S3 method for class 'covfm'
plot(x, which.plots = c(4, 3, 5), ...)

## S3 method for class 'covRob'
plot(x, which.plots = c(4, 3, 5), ...)

## S3 method for class 'covClassic'
plot(x, which.plots = c(4, 3, 5), ...)
x |
an object of class "covClassic", "covRob", or "covfm". |
which.plots |
either "ask", "all", or an integer vector specifying which plots to draw. If which.plots is an integer vector, use the plot numbers given here (or in the "ask" menu). The plot options are (2) Eigenvalues of Covariance Estimate, (3) Sqrt of Mahalanobis Distances, (4) Ellipses Matrix, and (5) Distance - Distance Plot. |
... |
additional arguments are passed to the plot subfunctions. |
The actual plot functions are only implemented for "fit.models" objects. When this method is dispatched on an object of class "covClassic" or "covRob", the object is cast as a "fit.models" object containing a single element and plotted with plot.covfm
. The actual plotting is done by the subfunctions listed in the See Also section.
x
is invisibly returned.
The requested plots are drawn on a graphics device.
plot
,
covClassic
,
covRob
,
fit.models
,
ddPlot.covfm
,
ellipsesPlot.covfm
,
screePlot.covfm
,
distancePlot.covfm
.
data(woodmod.dat)

woodm.cov <- covClassic(woodmod.dat)
woodm.covRob <- covRob(woodmod.dat)

plot(woodm.cov)
plot(woodm.covRob)

woodm.fm <- fit.models(list(Robust = "covRob", Classical = "covClassic"),
                       data = woodmod.dat)
plot(woodm.fm)
Comparison plots for fitted univariate distributions.
## S3 method for class 'fdfm'
plot(x, which.plots = 2:3, ...)
x |
an |
which.plots |
either "ask", "all", or an integer vector specifying which plots to draw. In the latter case, use the plot numbers given in the "ask" menu. |
... |
additional arguments are passed to the plotting functions. |
x
is invisibly returned.
data(los, package = "robustbase")
los.fm <- fit.models(c(Robust = "fitdstnRob", MLE = "fitdstn"),
                     x = los, densfun = "gamma")
plot(los.fm)
Creates a set of plots useful for assessing a robustly fitted generalized linear model. The plot options are (2) Deviance Residuals vs. Predicted Values, (3) Response vs. Predicted Values, (4) Normal QQ Plot of Pearson Residuals, (5) QQ Plot of Deviance Residuals, (6) Standardized Deviance Residuals vs. Robust Distances, (7) Standardized Deviance Residuals vs. Index (Time), and (8) Sqrt of abs(Deviance Residuals) vs. Fitted Values.
## S3 method for class 'glmRob'
plot(x, which.plots = c(2, 5, 7, 6), ...)
x |
a glmRob object. |
which.plots |
either "ask", "all", or an integer vector specifying which plots to draw. If |
... |
additional arguments are passed to the plotting subfunctions, which are listed in the See Also section. |
This function casts the glmRob object as a glmfm object containing a single model.
The actual plotting is then done by the function
plot.glmfm
.
x
is invisibly returned.
The selected plots are drawn on a graphics device.
Atkinson, A. C. (1985). Plots, Transformations and Regression. New York: Oxford University Press.
plot
,
glmRob
,
plot.glmfm
.
Creates a set of plots useful for assessing a robustly fitted linear model. The plot options are (2) Normal QQ-Plot of Residuals, (3) Estimated Kernel Density of Residuals, (4) Robust Residuals vs Robust Distances, (5) Residuals vs Fitted Values, (6) Sqrt of abs(Residuals) vs Fitted Values, (7) Response vs Fitted Values, (8) Standardized Residuals vs Index (Time), (9) Overlaid Normal QQ-Plot of Residuals, and (10) Overlaid Estimated Density of Residuals. For simple linear regression models there is also the option to draw side-by-side plots of the fit over a scatter plot of the data.
## S3 method for class 'lmRob'
plot(x, which.plots = c(5, 2, 6, 4), ...)
x |
an lmRob object. |
which.plots |
either "ask", "all", or an integer vector specifying which plots to draw. If |
... |
additional arguments are passed to the plotting subfunctions, which are listed in the See Also section. |
This function casts the lmRob object as an lmfm object containing a single model.
The actual plotting is then done by the function plot.lmfm
.
x
is invisibly returned.
The selected plots are drawn on a graphics device.
Atkinson, A. C. (1985). Plots, Transformations and Regression. New York: Oxford University Press.
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
plot(stack.rob, which.plots = 6)
Obtains predictions and optionally estimates standard errors of those predictions from a fitted robust generalized linear model object.
## S3 method for class 'glmRob'
predict(object, newdata, type = c("link", "response", "terms"),
        se.fit = FALSE, terms = labels(object), dispersion = NULL, ...)
object |
a glmRob object. |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. |
type |
a character string specifying the type of prediction. The choices are "link" for predictions on the scale of the linear predictor, "response" for predictions on the scale of the response, and "terms", which returns a matrix giving the fitted values for each term in the model formula on the scale of the linear predictor. |
se.fit |
a logical value. If |
terms |
when |
dispersion |
the dispersion of the generalized linear model fit to be assumed in computing the standard errors. If omitted, that returned by 'summary' applied to the object is used. |
... |
additional arguments required by the generic predict method. |
If se.fit = FALSE
, a vector or matrix of predictions. Otherwise a list with components:
fit |
Predictions |
se.fit |
Estimated standard errors |
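To make the difference between the prediction types concrete, the sketch below requests predictions on both the link and the response scale for the breslow.dat fit used in the examples. The object names eta and mu are illustrative.

```r
library(robust)

data(breslow.dat)
bres.rob <- glmRob(sumY ~ Age10 + Base4 * Trt, family = poisson(),
                   data = breslow.dat)

eta <- predict(bres.rob, type = "link")      # scale of the linear predictor
mu  <- predict(bres.rob, type = "response")  # scale of the response (counts)

head(cbind(link = eta, response = mu))
```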
data(breslow.dat)
bres.rob <- glmRob(sumY ~ Age10 + Base4 * Trt, family = poisson(),
                   data = breslow.dat)
predict(bres.rob)
Extracts the fitted values from an lmRob
object and returns a matrix of predictions.
## S3 method for class 'lmRob'
predict(object, newdata, type = "response", se.fit = FALSE,
        terms = labels(object), ...)
object |
an lmRob object. |
newdata |
a data frame containing the values at which predictions are required. This argument can be missing, in which case predictions are made at the same values used to compute the object. Only those predictors referred to in the right side of the formula in object need be present by name in |
type |
a single character value specifying the type of prediction. The only choice is "response". If "response" is selected, the predictions are on the scale of the response. |
se.fit |
a logical value. If |
terms |
this argument is presently unused. |
... |
additional arguments required by the generic |
a vector of predictions, or a list consisting of the predictions and their standard errors if se.fit = TRUE
.
predict
can produce incorrect predictions when the newdata
argument is used if the formula in object
involves data-dependent transformations, such as poly(Age, 3)
or sqrt(Age - min(Age))
.
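One plausible way to see the problem, using base R only: a data-dependent term such as sqrt(Age - min(Age)) takes different values when it is re-evaluated on newdata alone, because the minimum changes. The Age values below are made up for illustration, and the sketch does not reproduce the internals of predict.

```r
# Base R only; Age is a made-up variable used for illustration
Age  <- c(21, 25, 30, 42, 57, 63)
full <- data.frame(Age = Age)
new  <- full[4:6, , drop = FALSE]

# Transformation evaluated on the fitting data, then restricted to rows 4-6
at.fit <- sqrt(full$Age - min(full$Age))[4:6]

# The same expression re-evaluated on newdata alone uses a different minimum
at.new <- sqrt(new$Age - min(new$Age))

rbind(at.fit, at.new)
```

The two rows differ even though they describe the same observations, so coefficients estimated on the first scale would be applied to values on the second.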
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
predict(stack.rob)
predict(stack.rob, newdata = stack.dat[c(1,2,4,21), ], se.fit = TRUE)
Side-by-side quantile-quantile plots of the sample versus estimated quantiles.
qqPlot.fdfm(x, qqline = TRUE, ...)
x |
an |
qqline |
a logical value. If |
... |
additional arguments are passed to |
data(los, package = "robustbase")
los.fm <- fit.models(c(Robust = "fitdstnRob", MLE = "fitdstn"),
                     x = los, densfun = "gamma")
qqPlot.fdfm(los.fm, xlab = "x-axis label", ylab = "y-axis label",
            main = "Plot Title", pch = 4, col = "purple")
Computes a robust bootstrap estimate of the standard error for each coefficient estimate in a robustly fitted linear model. This function is called by summary.lmRob
and is not intended to be called directly by users.
rb.lmRob(lmRob.object, M = 1000, seed = 99, fixed = TRUE)
lmRob.object |
an lmRob object. |
M |
a positive integer giving the number of bootstrap subsamples. |
seed |
a positive integer specifying the seed for the random number generator. |
fixed |
a logical value. This should be set to |
a numeric vector of robust bootstrap standard error estimates.
Residuals methods for glmRob
objects.
## S3 method for class 'glmRob'
residuals(object, type = c("deviance", "pearson", "working", "response"), ...)
object |
a |
type |
the type of residuals to be returned. |
... |
additional arguments are ignored. |
a numeric vector containing the residuals.
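A short sketch of the type argument, collecting all four residual types for the breslow.dat fit used elsewhere in this manual. The names types and res are illustrative.

```r
library(robust)

data(breslow.dat)
bres.rob <- glmRob(sumY ~ Age10 + Base4 * Trt, family = poisson(),
                   data = breslow.dat)

# One column per residual type listed in the usage above
types <- c("deviance", "pearson", "working", "response")
res <- sapply(types, function(tp) residuals(bres.rob, type = tp))

head(res)
```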
Draws overlaid screeplots for the models in a covfm
object.
screePlot.covfm(x, npcs, strip = "", ...)
x |
a |
npcs |
a positive integer value specifying the number of components to be plotted. |
strip |
a character string printed in the “strip” at the top of the plot. |
... |
additional arguments are passed to |
the trellis
object is invisibly returned.
data(woodmod.dat)
woodm.fm <- fit.models(list(Robust = "covRob", Classical = "covClassic"),
                       data = woodmod.dat)
screePlot.covfm(woodm.fm, main = "Plot Title", xlab = "x-axis label",
                ylab = "y-axis label", pch = 4:5)
These data are from the operation of a plant for the oxidation of ammonia to nitric acid, measured on 21 consecutive days.
data(stack.dat)
This data frame contains the following variables:
Loss
the percentage of ammonia lost (times 10).
Air.Flow
air flow into the plant.
Water.Temp
cooling water inlet temperature.
Acid.Conc.
acid concentration as a percentage (coded by subtracting 50 and then multiplying by 10).
Brownlee, K.A. (1965). Statistical Theory and Methodology in Science and Engineering. New York: John Wiley & Sons, Inc.
data(stack.dat)
stack.dat
Performs stepwise model selection on a robustly fitted linear model. Presently only the backward stepwise procedure is implemented.
step.lmRob(object, scope, scale, direction = c("both", "backward", "forward"), trace = TRUE, keep = NULL, steps = 1000, fast = FALSE, ...)
object |
an |
scope |
either a formula or a list with elements |
scale |
a single numeric value containing a residual scale estimate. If missing, the scale estimate in |
direction |
a character value specifying the mode of stepwise search. The possibilities are "both", "backward", and "forward", with a default of "backward". Presently only "backward" stepwise searches are implemented. |
trace |
a logical value. If |
keep |
a filter function whose input is a fitted model object and the associated AIC statistic, and whose output is arbitrary. Typically keep will select a subset of the components of the object and return them. The default is not to keep anything. |
steps |
an integer value specifying the maximum number of steps to be considered. The default is 1000 (essentially as many as required). It is typically used to stop the process early. |
fast |
a logical value. If |
... |
additional arguments required by the generic step function. |
Presently only backward stepwise selection is supported. During each step the Robust Final Prediction Error (as computed by the function lmRob.RFPE
) is calculated for the current model and for each sub-model achievable by deleting a single term. The function then either steps to the sub-model with the lowest Robust Final Prediction Error or, if the current model has the lowest Robust Final Prediction Error, terminates. The scale estimate from object
is used to compute the Robust Final Prediction Error throughout the procedure unless the scale
argument is provided in which case the user specified value is used.
the model with the lowest Robust Final Prediction Error encountered during the stepwise procedure is returned. Additionally, an anova
element corresponding to the steps taken in the search is appended to the returned object. If a keep
function was provided then the kept values can be found in the keep
element of the returned object.
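The single backward step described above can be sketched by hand. This is an illustration built from lmRob and lmRob.RFPE, not the code used inside step.lmRob; the names full, s, predictors and rfpe are hypothetical, and each sub-model is simply refitted from scratch.

```r
library(robust)

data(stack.dat)
full <- lmRob(Loss ~ ., data = stack.dat)

# Scale estimate of the starting model, reused for every candidate sub-model
s <- full$scale
predictors <- setdiff(names(stack.dat), "Loss")

# RFPE of the current model and of each model with one term deleted
rfpe <- c("<none>" = lmRob.RFPE(full, scale = s))
for (tm in predictors) {
  f   <- reformulate(setdiff(predictors, tm), response = "Loss")
  sub <- lmRob(f, data = stack.dat)
  rfpe[paste("-", tm)] <- lmRob.RFPE(sub, scale = s)
}

rfpe
names(which.min(rfpe))  # "<none>" would mean the search stops here
```

Holding the scale fixed at the value from the starting model keeps the RFPE values comparable across sub-models, which is the role of the scale argument.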
lmRob
,
lmRob.RFPE
,
drop1.lmRob
.
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)

## The default behavior is to try dropping all terms ##
step.lmRob(stack.rob)

## Keep Water.Temp in the model ##
my.scope <- list(lower = . ~ Water.Temp, upper = . ~ .)
step.lmRob(stack.rob, scope = my.scope)
The generic summary method for objects of class "covClassic", "covRob", and "covfm".
## S3 method for class 'covClassic'
summary(object, ...)

## S3 method for class 'covRob'
summary(object, ...)

## S3 method for class 'covfm'
summary(object, ...)
object |
an object of class "covClassic", "covRob", or "covfm". |
... |
additional arguments for the summary method. |
an object of class "summary.covClassic", "summary.covRob", or "summary.covfm" respectively. Objects of class "summary.covClassic" and "summary.covRob" have the following components. Objects of class "summary.covfm" are lists whose elements are "summary.covClassic" and "summary.covRob" objects.
call |
an image of the call that produced the object with all the arguments named. |
cov |
a numeric matrix containing the estimate of the covariance/correlation matrix. |
center |
a numeric vector containing the estimate of the location vector. |
evals |
a numeric vector containing the eigenvalues of the covariance/correlation matrix. |
dist |
a numeric vector containing the Mahalanobis distances. Only present if |
corr |
a logical flag. If |
summary
,
covClassic
,
covRob
,
fit.models
.
data(woodmod.dat)
woodm.cov <- covClassic(woodmod.dat)
## IGNORE_RDIFF_BEGIN
summary(woodm.cov)
## IGNORE_RDIFF_END

woodm.covRob <- covRob(woodmod.dat)
summary(woodm.covRob)

woodm.fm <- fit.models(list(Robust = "covRob", Classical = "covClassic"),
                       data = woodmod.dat)
summary(woodm.fm)
Compute a summary of the robustly fitted generalized linear model.
## S3 method for class 'glmRob'
summary(object, correlation = TRUE, ...)
object |
a glmRob object. |
correlation |
a logical value. If |
... |
additional arguments required by the generic |
The summary is returned in a list of class summary.glmRob.
data(breslow.dat)
bres.rob <- glmRob(sumY ~ Age10 + Base4*Trt, family = poisson(),
                   data = breslow.dat)
bres.sum <- summary(bres.rob)
bres.sum
Compute a summary of the robustly fitted linear model.
## S3 method for class 'lmRob'
summary(object, correlation = FALSE, bootstrap.se = FALSE, ...)
object |
an lmRob object. |
correlation |
a logical value. If |
bootstrap.se |
a logical value. If |
... |
additional arguments required by the generic |
The summary is returned in a list of class summary.lmRob and contains the following components:
sigma |
a single numeric value containing the residual scale estimate. |
df |
a numeric vector of length 3 containing integer values: the rank of the model matrix, the residual degrees of freedom, and the number of coefficients in the model. |
cov.unscaled |
the unscaled covariance matrix; i.e., the matrix that, when multiplied by the estimate of the error variance, yields the estimated covariance matrix for the coefficients. |
correlation |
the correlation coefficient matrix for the coefficients in the model. |
... |
the remaining components are the same as the corresponding components in an |
data(stack.dat)
stack.rob <- lmRob(Loss ~ ., data = stack.dat)
stack.sum <- summary(stack.rob)
stack.sum
stack.bse <- summary(stack.rob, bootstrap.se = TRUE)
stack.bse
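The relationship among sigma, cov.unscaled, and the coefficient covariance matrix described above can be checked on an ordinary least squares fit. This sketch uses base R's lm and the built-in stackloss data as a stand-in; it assumes, as the component descriptions suggest, that summary.lmRob follows the same convention as summary.lm, but it does not call lmRob itself.

```r
# Classical analogue using lm(); summary.lm exposes the same three components.
fit <- lm(stack.loss ~ ., data = stackloss)
s   <- summary(fit)

# df: rank of the model matrix, residual degrees of freedom,
# and number of coefficients in the model
s$df

# Unscaled covariance times the error variance estimate gives the
# estimated covariance matrix of the coefficients
V <- s$sigma^2 * s$cov.unscaled
stopifnot(all.equal(V, vcov(fit)))

# Standard errors are the square roots of its diagonal
se <- sqrt(diag(V))
se
```

The same arithmetic applied to the sigma and cov.unscaled components of a summary.lmRob object would give the corresponding robust quantities, provided the convention assumed here holds.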
Conducts a test for bias of robust MM-estimates and least squares (LS) estimates against S-estimates, or a permutation test of the slope estimate in a straight-line fit.
test.lmRob(object, type = "bias", level = NULL, n.permute = 99)
object |
an object of class |
type |
character string. Valid choices are |
level |
the level of the test for bias of MM-estimate. By default, the |
n.permute |
a positive integer value specifying the number of permutations to use. |
the p-value of the permutation test, or an object of class "biasMM" representing the bias test, in which case the following components are included:
mm |
a list describing the test of bias for final MM-estimates, with the following components: |
ls |
a list describing the test of bias for LS-estimates, with the following components: |
level |
the level of the test for bias of MM-estimate. |
Yohai, V., Stahel, W. A., and Zamar, R. H. (1991). A procedure for robust estimation and inference in linear regression, in Stahel, W. A. and Weisberg, S. W., Eds., Directions in robust statistics and diagnostics, Part II. Springer-Verlag.
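For readers unfamiliar with permutation tests of a slope, the general idea can be sketched in base R. This is a generic illustration, not the procedure implemented in test.lmRob: it uses the ordinary least squares slope rather than a robust estimate, and perm_slope_test is a hypothetical helper introduced only for this example.

```r
# Generic permutation test for the slope in a straight-line fit y ~ x.
# NOT the code used by test.lmRob(); the helper name is hypothetical.
perm_slope_test <- function(x, y, n.permute = 99) {
  slope <- function(x, y) cov(x, y) / var(x)   # OLS slope estimate
  obs   <- slope(x, y)

  # Re-estimate the slope after randomly permuting the response,
  # which breaks any association between x and y
  perm <- replicate(n.permute, slope(x, sample(y)))

  # Two-sided p-value; the observed fit counts as one arrangement
  (1 + sum(abs(perm) >= abs(obs))) / (n.permute + 1)
}

set.seed(1)
x <- stackloss$Air.Flow
y <- stackloss$stack.loss
p <- perm_slope_test(x, y, n.permute = 99)
p
```

With n.permute = 99 the smallest p-value this sketch can return is 1/100, which is one reason to choose n.permute so that n.permute + 1 is a round number.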
A method for the generic update function for objects inheriting from class lmRob. See update for the general behavior of this function and for the interpretation of the arguments.
## S3 method for class 'lmRob'
update(object, formula., ..., evaluate = TRUE)
object |
an lmRob object. |
formula. |
a modeling formula, such as |
evaluate |
a logical value. If |
... |
additional arguments passed to the generic update function. |
If formula. is missing, update.lmRob alternates between the initial estimates and final estimates. Otherwise (when formula. is present), update.lmRob functions just like update.default.
either a new updated object, or else an unevaluated expression for creating such an object.
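The two possible return values can be illustrated with the generic update on an ordinary lm fit. This sketch uses base R only and the built-in stackloss data; it relies on the statement above that update.lmRob behaves like update.default when formula. is supplied, and it does not exercise the lmRob-specific behavior for a missing formula.

```r
fit <- lm(stack.loss ~ ., data = stackloss)

# evaluate = TRUE (the default): refit and return the updated model object
fit2 <- update(fit, . ~ . - Acid.Conc.)

# evaluate = FALSE: return the unevaluated call instead of refitting
cl <- update(fit, . ~ . - Acid.Conc., evaluate = FALSE)
stopifnot(is.call(cl))

# Evaluating the call later reproduces the refitted model
fit3 <- eval(cl)
stopifnot(all.equal(coef(fit2), coef(fit3)))
```

Requesting the unevaluated call is useful for inspecting or modifying the refit before it is run.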
Robust estimation of Weibull distribution parameters.
weibullRob(x, estim = c("M", "tdmean"), control = weibullRob.control(estim, ...), ...)
x |
a numeric vector containing the sample. |
estim |
a character string specifying which estimator to use. |
control |
a list of control parameters appropriate for the estimator in |
... |
control parameters may also be given here. |
a list of class “fitdstn” containing the following elements:
estimate |
a named numeric vector containing the parameter estimates. |
sd |
a named numeric vector containing the standard deviations of the parameter estimates. |
vcov |
a numeric matrix containing the variance-covariance matrix of the estimated parameter vector. |
mu |
a single numeric value containing an estimate of the mean. |
V.mu |
a single numeric value containing the variance of the estimated mean. |
control |
a list containing the control parameters used by the estimator. |
The print method displays the estimated parameters and their standard errors (in parentheses).
weibullRob.control, fitdstnRob.
Create a list of control parameters for the weibullRob function.
weibullRob.control(estim, ...)
estim |
a character string specifying the estimator. |
... |
control parameters appropriate for the estimator given in |
a list of control parameters appropriate for the specified estimator.
These functions compute the weights used by lmRob and its associated methods.
psi.weight(x, ips = 1, xk = 1.06)
rho.weight(x, ips = 1, xk = 1.06)
psp.weight(x, ips = 1, xk = 1.06)
chi.weight(x, ips = 1, xk = 1.06)
x |
a numeric vector. |
ips |
integer determining the weight function:
,
,
,
,
which is currently only available for |
xk |
a numeric value specifying the tuning constant. |
See the section “Theoretical Details”, pp. 58-59, in chapter 2 of ‘Robust.pdf’.
a numeric vector, say r, of the same length as x, containing the function values.
x <- seq(-4, 4, length.out = 401)
f.x <- cbind(psi = psi.weight(x), psp = psp.weight(x),
             chi = chi.weight(x), rho = rho.weight(x))
es <- expression(psi(x), {psi*minute}(x), chi(x), rho(x))
leg <- as.expression(lapply(seq_along(es), function(i)
  substitute(C == E, list(C = colnames(f.x)[i], E = es[[i]]))))
matplot(x, f.x, type = "l", lwd = 1.5,
        main = "psi.weight(.) etc -- 'optimal'")
abline(h = 0, v = 0, lwd = 2, col = "#D3D3D380") # semi-transparent gray
legend("bottom", leg, inset = .01, lty = 1:4, col = 1:4, lwd = 1.5,
       bg = "#FFFFFFC0")
The explanatory variables from the Modified Data on Wood Specific Gravity analyzed in Rousseeuw and Leroy (1987).
Note that data(wood, package="robustbase") contains the same data, and additionally the y-variable.
data(woodmod.dat)
This data frame contains the following variables:
number of fibers per square millimeter in Springwood (coded by dividing by 1000).
number of fibers per square millimeter in Summerwood (coded by dividing by 10000).
fraction of Springwood.
fraction of light absorption by Springwood.
fraction of light absorption by Summerwood.
Rousseeuw, P. J., and Leroy, A. M. (1987). Robust Regression and Outlier Detection. New York: Wiley.
data(woodmod.dat)
woodmod.dat
data(wood, package = "robustbase")
stopifnot(data.matrix(woodmod.dat) == data.matrix(wood[, 1:5]))