Monday, 30 November 2020

R loop to copy content from two columns, according to a third column, and output into a fourth column

I am aware of the solutions to similar problems and have read through all those I found (hence the code provided below), but unfortunately I couldn't make them work properly. I am also very unfamiliar with loops...

What I'm trying to do: I have a dataset data and want to fill column Cov_length with values from either column length_L1 or column length_L2, depending on the value in a third column, Language: A) If Language is L1 in a given row, copy that row's value from length_L1 into Cov_length. B) Correspondingly, if Language is L2, copy that row's value from length_L2 into Cov_length.

Here is some example data:

data111 <- data.frame(Language = c("L1","L1", "L2", "L1", "L2", "L2", "L2", "L1"),
                      Length_L1 = c(4, 7, 3, 12, 10, 5, 5, 7),
                      Length_L2 = c(5, 2, 9, 7, 3, 3, 4, 10),
                      Cov_length = c(0, 0, 0, 0, 0, 0, 0, 0))
> data111
  Language Length_L1 Length_L2 Cov_length
1       L1         4         5          0
2       L1         7         2          0
3       L2         3         9          0
4       L1        12         7          0
5       L2        10         3          0
6       L2         5         3          0
7       L2         5         4          0
8       L1         7        10          0

Here are two solutions I have tried. The first one just runs without error but doesn't do anything (values remain zeros in Cov_length).

for (i in 1:length(data$Language)) {
  if (i == "L1") {data$Cov_length [i] <- data$length_L1 [i] }
  else if (i == "L2") {data$Cov_length [i] <- data$length_L2 [i] }
  else {}
}
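
A likely reason the loop above changes nothing: i is the row index (1, 2, 3, ...), so i == "L1" is never TRUE and neither branch runs. The comparison has to be made against the value of Language in row i. Below is a minimal sketch using the example data; the name data111 and the capitalized columns Length_L1 and Length_L2 are taken from the example and may differ from the real dataset.

```r
# Example data from the question (column names as in data111).
data111 <- data.frame(Language = c("L1", "L1", "L2", "L1", "L2", "L2", "L2", "L1"),
                      Length_L1 = c(4, 7, 3, 12, 10, 5, 5, 7),
                      Length_L2 = c(5, 2, 9, 7, 3, 3, 4, 10),
                      Cov_length = c(0, 0, 0, 0, 0, 0, 0, 0))

# seq_len(nrow(...)) also behaves correctly for a data frame with zero rows.
for (i in seq_len(nrow(data111))) {
  # Compare the value stored in the Language column, not the index i itself.
  if (data111$Language[i] == "L1") {
    data111$Cov_length[i] <- data111$Length_L1[i]
  } else if (data111$Language[i] == "L2") {
    data111$Cov_length[i] <- data111$Length_L2[i]
  }
}

data111$Cov_length
```

Note that this sketch assumes Language contains no missing values; an NA in that column would make the if () condition fail with an error.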

And this second solution just takes all the values from column length_L1 instead of actually selecting values between the two columns.

require(base)  
data %>% 
  mutate (Cov_length = ifelse(Language == "L1", paste(length_L1), paste(length_L2))) 
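
Some observations on the attempt above, offered as possibilities rather than a diagnosis: require(base) does not load dplyr, which is the package that provides mutate() and %>%; the result of the pipe is printed but never assigned back to data; and paste() converts the numbers to character strings. A vectorized version in base R, sketched here on the example data (data111 and its column names are taken from the example and may differ from the real dataset), avoids the loop entirely:

```r
# Example data from the question (column names as in data111).
data111 <- data.frame(Language = c("L1", "L1", "L2", "L1", "L2", "L2", "L2", "L1"),
                      Length_L1 = c(4, 7, 3, 12, 10, 5, 5, 7),
                      Length_L2 = c(5, 2, 9, 7, 3, 3, 4, 10),
                      Cov_length = c(0, 0, 0, 0, 0, 0, 0, 0))

# ifelse() works element by element: where Language is "L1" it takes the
# value from Length_L1, otherwise the value from Length_L2.
# Assigning the result back to the column is what actually stores it.
data111$Cov_length <- ifelse(data111$Language == "L1",
                             data111$Length_L1,
                             data111$Length_L2)

# Quick check of which values Language really contains (stray spaces or
# different capitalization would make the "L1" comparison fail silently).
language_counts <- table(data111$Language, useNA = "ifany")

data111$Cov_length
```

In this sketch every row whose Language is not exactly "L1" receives the Length_L2 value, so checking the counts in language_counts first helps confirm that only L1 and L2 occur.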

I have to do this in a number of places in my data, and the dataset has 8,000 observations with the L1/L2 values in random order, so I can't do it efficiently by hand. Any advice would be helpful. Thanks!
