Tuesday, 19 September 2017

Create a data.frame of variables depending on each other

I want to create a data frame whose variables depend on each other. I can accomplish this with the following dplyr-style code:

library(dplyr)  # provides %>% and mutate()

# probability that each variable is 1
p.1 <- .1
p.2 <- .3
p.3 <- .6
l <- 1e2  # number of rows

df <- data.frame(
  var.1 =
    sample(0:1, l, prob = c(1 - p.1, p.1), replace = TRUE))

df <- df %>%
  mutate(
    var.2 = ifelse(
      var.1 == 1, 0,
      sample(0:1, l, prob = c(1 - p.2, p.2), replace = TRUE)),
    var.3 = ifelse(
      var.1 == 1 | var.2 == 1, 0,
      sample(0:1, l, prob = c(1 - p.3, p.3), replace = TRUE))
  )

It would be even nicer to create the data frame in one step, but this does not work because var.1 is not found:

df <- data.frame(
  var.1 =
    sample(0:1, l, prob = c(1 - p.1, p.1), replace = TRUE),
  var.2 = ifelse(
    var.1 == 1, 0,
    sample(0:1, l, prob = c(1 - p.2, p.2), replace = TRUE)
    ),
  var.3 = ifelse(
    var.1 == 1 | var.2 == 1, 0,
    sample(0:1, l, prob = c(1 - p.3, p.3), replace = TRUE)
    )
  )
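
A possible explanation and workaround, offered as a sketch rather than a definitive answer: data.frame() evaluates all of its arguments in the calling environment, so var.1 does not exist yet when var.2 is computed. tibble() from the tibble package builds its columns one after another, so a later column can refer to an earlier one. The example below assumes the tibble package is installed; the set.seed() call is my addition and is only there for reproducibility.

```r
library(tibble)

set.seed(1)  # assumption: fixed seed, only so the result is reproducible

p.1 <- .1
p.2 <- .3
p.3 <- .6
l <- 1e2

# tibble() evaluates columns sequentially, so var.2 can see var.1
# and var.3 can see both var.1 and var.2.
df <- tibble(
  var.1 = sample(0:1, l, prob = c(1 - p.1, p.1), replace = TRUE),
  var.2 = ifelse(
    var.1 == 1, 0,
    sample(0:1, l, prob = c(1 - p.2, p.2), replace = TRUE)),
  var.3 = ifelse(
    var.1 == 1 | var.2 == 1, 0,
    sample(0:1, l, prob = c(1 - p.3, p.3), replace = TRUE))
)
```

If a plain data.frame is needed afterwards, as.data.frame(df) converts the result.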

Another attempt, creating an empty data frame first, does not work either. It throws the error "Error in mutate_impl(.data, dots) : Column var.1 must be length 0 (the number of rows) or one, not 100":

df <- data.frame()
df <- df %>%
  mutate(
    var.1 =
      sample(0:1, l, prob = c(1 - p.1, p.1), replace = TRUE),
    var.2 = ifelse(
      var.1 == 1, 0,
      sample(0:1, l, prob = c(1 - p.2, p.2), replace = TRUE)
      ),
    var.3 = ifelse(
      var.1 == 1 | var.2 == 1, 0,
      sample(0:1, l, prob = c(1 - p.3, p.3), replace = TRUE)
      )
  )
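
The error message suggests why this fails: an empty data.frame() has zero rows, and mutate() adds columns without changing the number of rows. One way around it, as a sketch only, is to start from a data frame that already has l rows. The placeholder column id is a hypothetical helper that I introduce here and drop at the end.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the result is reproducible

p.1 <- .1
p.2 <- .3
p.3 <- .6
l <- 1e2

# Seed the data frame with l rows so that mutate() can add length-l columns.
df <- data.frame(id = seq_len(l)) %>%
  mutate(
    var.1 = sample(0:1, l, prob = c(1 - p.1, p.1), replace = TRUE),
    var.2 = ifelse(
      var.1 == 1, 0,
      sample(0:1, l, prob = c(1 - p.2, p.2), replace = TRUE)),
    var.3 = ifelse(
      var.1 == 1 | var.2 == 1, 0,
      sample(0:1, l, prob = c(1 - p.3, p.3), replace = TRUE))
  ) %>%
  select(-id)  # drop the hypothetical placeholder column
```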

In practice I have many more variables, so I am looking for a more economical solution to this task.
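
To make the goal concrete, here is a minimal base R sketch of what a more economical version could look like. It assumes the pattern above holds for every variable: each one is forced to 0 wherever any earlier variable is already 1, and is otherwise drawn with its own probability. The function name make_exclusive and the probability vector p are hypothetical names of mine, not part of any package.

```r
set.seed(1)  # assumption: fixed seed, only so the result is reproducible

p <- c(.1, .3, .6)  # one probability per variable, in order
l <- 1e2            # number of rows

# Hypothetical helper: builds one 0/1 column per entry of p.
make_exclusive <- function(p, n) {
  taken <- logical(n)  # TRUE where an earlier variable is already 1
  cols <- vector("list", length(p))
  for (k in seq_along(p)) {
    draw <- sample(0:1, n, prob = c(1 - p[k], p[k]), replace = TRUE)
    draw[taken] <- 0L              # force 0 where an earlier variable is 1
    taken <- taken | draw == 1L    # remember the rows that are now "used"
    cols[[k]] <- draw
  }
  names(cols) <- paste0("var.", seq_along(p))
  as.data.frame(cols)
}

df <- make_exclusive(p, l)
```

Adding another variable then only requires appending a probability to p.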
