[R-sig-ME] standardized coefficients in glmer model
Ben Bolker
bbolker at gmail.com
Sun Dec 12 03:56:46 CET 2010
What I meant is that in the linear model case, what you get when you
calculate the standardized regression coefficients is expected change in
(y/(sd y)) per unit change in (x/(sd x)). With GLM you don't get this,
because of the link function. The natural analogue would (I think) be
expected change in (link(y)/sd(link(y))), or something like that
[because the assumption is that the relationship is linear on the linear
predictor scale], but naively calculating sd(link(y)) won't work,
because link(y) is often infinite (for binary data, logit(0) = -Inf and
logit(1) = +Inf).
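
To make the linear-model statement concrete, here is a minimal sketch in
base R on simulated data. The variable names (x, y, ybin) and all the
simulation settings are my own assumptions, not anything from the thread.
It checks that b * sd(x) / sd(y) matches the slope from a refit on scale()d
variables, and then shows why the naive sd(link(y)) breaks down for a
binary response.

```r
set.seed(1)
n <- 200
x <- rnorm(n, mean = 10, sd = 3)          # hypothetical predictor
y <- 2 + 0.5 * x + rnorm(n)               # hypothetical Gaussian response

## standardized slope computed from the raw fit: b * sd(x) / sd(y)
b_raw    <- coef(lm(y ~ x))[["x"]]
beta_std <- b_raw * sd(x) / sd(y)

## the same quantity obtained by refitting on scaled variables
xs <- as.numeric(scale(x))
ys <- as.numeric(scale(y))
beta_scaled <- coef(lm(ys ~ xs))[["xs"]]
all.equal(beta_std, beta_scaled)

## binary response: the logit of an observed 0 or 1 is infinite,
## so its standard deviation is not a finite number
ybin    <- rbinom(n, size = 1, prob = plogis(-5 + 0.5 * x))
link_y  <- qlogis(ybin)                   # -Inf for 0, Inf for 1
sd_link <- sd(link_y)
is.finite(sd_link)
```

The first half is just the LM identity; the second half only illustrates
the problem and does not propose a fix for it.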
A bit of googling suggests (as with many other times when one wants to
generalize from linear models to GLMs -- e.g. R^2 values) that the
answer is not obvious ...
<http://goliath.ecnext.com/coms2/gi_0199-762729/Six-approaches-to-calculating-standardized.html>
<http://www.nd.edu/~rwilliam/stats3/L06.pdf>
Again, for your purposes I think (?) the most important thing is that
you have the scales of the betas standardized correctly with respect to
each other (as opposed to standardizing the response variable), which
isn't a problem.
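
As a rough illustration of standardizing only the predictors, here is a
sketch that uses plain glm() on simulated data rather than glmer(), so
there is no random effect. The data frame, the column names, and the effect
sizes are all made up for the example. It compares raw coefficients
multiplied by the sd of their design-matrix columns against a refit on
scale()d predictors, leaving the 0/1 response untouched, and then computes
the odds-ratio comparison that David Duffy suggests further down the thread.

```r
set.seed(2)
n <- 500
dat <- data.frame(
  body_mass = rnorm(n, mean = 1000, sd = 150),   # hypothetical units
  sex       = factor(sample(c("F", "M"), n, replace = TRUE))
)
eta <- -0.5 + 0.004 * (dat$body_mass - 1000) + 0.8 * (dat$sex == "M")
dat$intact <- rbinom(n, size = 1, prob = plogis(eta))

## raw fit: coefficients are per unit of body mass and per F-to-M switch
fit_raw <- glm(intact ~ body_mass + sex, family = binomial, data = dat)

## scale the predictors only; the binary response is left alone
dat$body_mass_s <- as.numeric(scale(dat$body_mass))
dat$sex_s       <- as.numeric(scale(as.numeric(dat$sex)))
fit_std <- glm(intact ~ body_mass_s + sex_s, family = binomial, data = dat)

## raw coefficient times sd of its design-matrix column vs. the refit
b_raw <- coef(fit_raw)[-1]
sd_x  <- apply(model.matrix(fit_raw)[, -1, drop = FALSE], 2, sd)
cbind(raw_times_sd = b_raw * sd_x, refit = coef(fit_std)[-1])

## odds ratios: sex vs. an interquartile-range change in body mass
or_sex  <- exp(b_raw[["sexM"]])
or_mass <- exp(b_raw[["body_mass"]] * IQR(dat$body_mass))
c(sex = or_sex, body_mass_IQR = or_mass)
```

The two columns should agree up to numeric fuzz because the scaled model is
only a reparameterization of the raw one. This holds for main effects as
written here; with an interaction term the bookkeeping is less simple.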
On 10-12-11 09:18 PM, Leeuwen, Casper van wrote:
> Dear list, David and Ben,
>
> Ben Bolker:
> Thanks so much for the awesome function: that was exactly what I was
> looking for, and it works perfectly on my dataset. It even includes
> estimates for the interactions.
> However, I'm sorry, but I don't understand your remark that the scaling
> by sd(y) "may not generalize to the GLM case from the LM case?". I think it
> doesn't matter to scale the y-values for my x-estimates, but do you mean
> this would be different for a GLM model than for a LM model? Do you
> think the scaling of the y-values is incorrect if the regression is
> non-linear?
>
>
>
> David Duffy:
>
> Thanks a lot for the suggestion. My data are not human but bird body
> mass: essentially the same, but no BMI. If I understand you correctly,
> you say it doesn't make sense to compare estimates between a binomial
> term (sex) and a (continuous) covariate (body mass)?
>
> Should I somehow construct a binomial variable from the body mass to be
> able to compare the estimates?
>
>
>
> Thanks,
>
> Casper
>
>
> ------------------------------------------------------------------------
> *From:* David Duffy [mailto:davidD at qimr.edu.au]
> *Sent:* Sat 12/11/2010 21:29
> *To:* Leeuwen, Casper van
> *Cc:* r-sig-mixed-models at r-project.org
> *Subject:* Re: [R-sig-ME] standardized coefficients in glmer model
>
> On Sat, 11 Dec 2010, Leeuwen, Casper van wrote:
>
>> model <- glmer (intact_binomial ~
>> species
>> + sex
>> + retention_time
>> + body_mass
>> + body_mass * retention_time
>> + (1 | individual)
>> , family = binomial (link = "logit")
>> )
>> summary(model)
>>
>> summary() returns effects sizes given as coefficients of the different
>> factors. However, I would like to indicate the importance of the
>> different terms in the model, to determine the relative importance of
>> for instance sex versus body_mass: which one is more important in
>> explaining my dependent variable?
>
> Given this is a logistic regression, there are various more or less
> unsatisfactory equivalents of an R2. You might be better off just
> comparing effect sizes eg odds ratio (exp(beta)) for sex versus that for
> the difference between the first and third quartiles of BMI or
> from say BMI=20 to BMI=25 and BMI=30, presuming this is human data.
>
> --
> | David Duffy (MBBS PhD) ,-_|\
> | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
> | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
> | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
>
> ------------------------------------------------------------------------
>
> *From:* Ben Bolker [mailto:bbolker at gmail.com]
> *Sent:* Sat 12/11/2010 17:09
> *To:* Leeuwen, Casper van; r-sig-mixed-models at r-project.org
> *Subject:* Re: [R-sig-ME] standardized coefficients in glmer model
>
> On 10-12-11 02:21 AM, Leeuwen, Casper van wrote:
>> Dear R-list,
>
>> I'm running a mixed effect logistic regression with both factors and
>> covariates, an interaction and a random factor.
>
>> model <- glmer (intact_binomial ~ species + sex + retention_time +
>> body_mass + body_mass * retention_time + (1 | individual) , family =
>> binomial (link = "logit") )
>> summary(model)
>
>> summary() returns effects sizes given as coefficients of the
>> different factors. However, I would like to indicate the importance
>> of the different terms in the model, to determine the relative
>> importance of for instance sex versus body_mass: which one is more
>> important in explaining my dependent variable?
>
> If all your variables were numeric (which sex is not) then
>
> model_scaled <- glmer(...,data=scale(mydata))
>
> would work: looking at lm.beta and Make.Z in the QuantPsyc package (you
> didn't tell us where lm.beta() came from ...), Make.Z seems (as far as I
> can tell) to replicate the behavior of the built-in scale() function.
> But that approach won't work properly for factors with more than two
> levels ...
>
> Here's lm.beta:
>
> lm.beta
> function (MOD)
> {
> b <- summary(MOD)$coef[-1, 1]
> sx <- sd(MOD$model[-1])
> sy <- sd(MOD$model[1])
> beta <- b * sx/sy
> return(beta)
> }
>
> Here's a translation into lmer-land:
>
> lm.beta.lmer <- function(mod) {
>    b <- fixef(mod)[-1]               ## fixed-effect coefs, sans intercept
>    sd.x <- apply(mod@X[,-1],2,sd)    ## pull out model (design) matrix,
>                                      ## drop intercept column, calculate
>                                      ## sd of remaining columns
>    sd.y <- sd(mod@y)                 ## sd of response
>    b*sd.x/sd.y
> }
>
> Here's an example, using the Orthodont data from the nlme package:
>
> library(nlme)
> data(Orthodont)
> dat <- as.data.frame(Orthodont)
> detach(package:nlme)
>
> library(lme4)
> fm2 <- lmer(distance ~ age + Sex + (age|Subject), data = dat)
> lm.beta.lmer(fm2)
>
> For this example (which like yours has Sex, a two-level factor, as its
> only non-numeric predictor) we can show that we get the same answer (up
> to numeric fuzz) by scale()ing:
>
> pdat <- with(dat,cbind(distance,age,s=as.numeric(Sex)))
> pdat <- scale(pdat)
> dat2 <- data.frame(pdat,Subject=dat$Subject)
>
> fm3 <- lmer(distance ~ age + s + (age|Subject), data = dat2)
> fixef(fm3)
>
> The only remaining question I have is whether it makes sense to scale
> by sd(y) in this case -- may not generalize to the GLM case from the LM
> case? But you should have the correct *relative* magnitudes of
> parameters in any case.
>
> good luck,
> Ben Bolker
>