Tuesday, 22 October 2019

New column / mutate based on existing column

I want to add a new column to a data frame df based on a condition on an existing column, e.g.:

df$TScore = as.factor(0)
df$TScore = 
  if_else(df$test_score >= '8.0', 'high',
      if_else(!is.na(df$test_score), 'low', 'NA'))

The problem is that TScore is sometimes what I expect, i.e. 'high' when the score is 8 or greater, but in other cases it is wrong. Is there an error in the code above?
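One possible cause, assuming test_score is stored as a number, is the quoted '8.0' in the comparison: R then converts the numbers to text and compares them as strings, so a score of 10 sorts before '8.0'. The sketch below uses made-up scores (not the real data) to show the two comparisons side by side.

```r
# Hypothetical scores, assumed numeric for illustration only
test_score <- c(7.5, 8, 9, 10, NA)

# Comparing against the string '8.0' coerces the numbers to character,
# so the comparison is made on text rather than on numeric value
as_text <- test_score >= '8.0'

# Comparing against the number 8 gives the intended numeric comparison
as_number <- test_score >= 8

print(data.frame(test_score, as_text, as_number))
```

If test_score is itself a character or factor column, it would need converting first, e.g. with as.numeric(as.character(df$test_score)). Note also that 'NA' in quotes is the literal string "NA", not a missing value; NA_character_ is the missing value for character vectors.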

I am also struggling to write this using dplyr. So far I have:

df$TScore =   df %>%
                filter(test_score >= 8) %>%
                    mutate(TScore = 'high')

As expected, the dimensions do not match, and I get the following error:

Error in `$<-.data.frame`(`*tmp*`, appScore, value = list(cluster3 = c(1L,  : replacement has 126 rows, data has 236

Any advice would be greatly appreciated.
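The row mismatch comes from filter(), which drops rows, and from assigning a whole data frame to a single column. A minimal sketch of one dplyr way to do this is to call mutate() on the full data frame and put the conditions inside case_when(), so every row is kept. The data below is hypothetical, and test_score is assumed to be numeric.

```r
library(dplyr)

# Hypothetical data standing in for the real df
df <- tibble(test_score = c(7.5, 8, 9, 10, NA))

# mutate() works on all rows, so the new column always has the right length
df <- df %>%
  mutate(
    TScore = case_when(
      test_score >= 8    ~ "high",        # numeric comparison, no quotes
      !is.na(test_score) ~ "low",         # any other non-missing score
      TRUE               ~ NA_character_  # missing scores stay missing
    )
  )

print(df)
```

The conditions in case_when() are checked in order, so the second branch only applies to scores below 8. The result is assigned back to df as a whole rather than to df$TScore.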
