Skip to content

glmer warning : “Downdated VtV is not positive definite” - Is it because the response variable is too uniform? #483

@bjce

Description

@bjce

I hope that I'm not asking a silly question but I looked at similar issues (including opened issues #173 and #396) and couldn't find a solution to this problem. Thanks in advance for the time taken to look at the problem.

I have data that look like the simulation below. In the data there are 3 different groups (g1, g2 and g3) of subjects which needed either 1, 2 or 3 procedures (p1, p2, p3) respectively according to the gravity of the disease varying between subject. Hence, there is only p1 for the group g1; p1, p2 for the group g2; p1, p2, p3 for the group g3. Each measurement was performed at the time of each procedure. What I'm trying to do is the run to fit a mixed model logistic regression (sorry if it's not the correct nomenclature: I'm happy to learn if it's wrong) to look for difference in the proportion of patients at p1 who got a certain treatment (y1 a dichotomous variable) and to control for the id the patient by setting it as a random variable.

With this simulated data, the output of

mod_glmer = glmer(y1 ~ procedure + (1|id), data = data_g2, family = binomial)

gives me the following error:

Error in length(value <- as.numeric(value)) == 1L : Downdated VtV is not positive definite

My guess is that it has to do with the fact that at p1 there are only 1's when running the following:

table(data_g2$y1[data_g2$procedure == "p1"])

whereas there are both 0's and 1's at p2 when running the following:

table(data_g2$y1[data_g2$procedure == "p2"])

However I also ran the a similar "standard" logistic regression and don't get the error message:

mod_glm = glm(y1 ~ procedure, data = data_g2, family = binomial)

What should I do to demonstrate that the proportion of y1 is different in p1 compared to p2 using glmer?

P.S.: if you simulate the data, run several time as it involves generation of random number (I get the message around once every 2 runs if at random, it gives me an error each time with set.seed(1). Try another number n in set.seed(n) if you don't get the error message)

Simulation of the data

library(lubridate)
library(dplyr)
library(lme4)

set.seed(1)

# define the number of subjects in each groups
n_g3 = 20
n_g2 = n_g3 * 10
n_g1 = n_g3 * 10

# 3 different groups
id_p1 = paste0("ID",1:c(n_g1 + n_g2 + n_g3))
id_p2 = paste0("ID",c(n_g1+1):c(n_g1 + n_g2 + n_g3))
id_p3 = paste0("ID",c(n_g1 + n_g2+1):c(n_g1 + n_g2 + n_g3))
id = append(append(id_p1, id_p2), id_p3)

# 3 different groups
groups = c(rep("g1", n_g1), rep("g2", n_g2), rep("g3", n_g3), rep("g2", n_g2), rep("g3", n_g3), rep("g3", n_g3))

# 1st, 2nd or 3rd procedure
procedure = c(rep("p1", n_g1), rep("p1", n_g2), rep("p1", n_g3), rep("p2", n_g2), rep("p2", n_g3), rep("p3", n_g3))

# measurement n??2 dichotomous
y1 = c(rep(1, n_g1 + n_g2 + n_g3),
  (rbinom(c(n_g2 + n_g3),1,0.8)),
  (rbinom(n_g3,1,0.3))
)

data = data.frame(id, groups, procedure, y1)

### create subset g2 of data to make comparisons within the group with 2 procedures
data_g2 = data[which(data$groups == "g2"),]

table(data_g2$y1[data_g2$procedure == "p1"])
table(data_g2$y1[data_g2$procedure == "p2"])

mod_glm = glm(y1 ~ procedure, data = data_g2, family = binomial)
summary(mod)

mod_glmer = glmer(y1 ~ procedure + (1|id), data = data_g2, family = binomial)
summary(mod_glmer)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions