Tuesday, 24 August 2021

R: fill in cells with values from different rows

I’m trying to fill the NAs in one row with values from other rows. The rows are “linked” by a case number. I want to write a loop with an if condition that goes through the entire data frame and does this, but I don’t think I grasp the R language well enough. Can anybody help me?

The data frame:

CASE <- c(1, 2, 3, 4, 5, 6)
SERIAL <- c("AB", NA, NA, "CD", NA, NA)
REF <- c(NA, 1, 1, NA, 4, 4)
PA <- c(4, NA, NA, 2, NA, NA)
PE <- c(NA, 2, NA, NA, 1, NA)
PE2 <- c(NA, NA, 3, NA, NA, 3)

df <- data.frame(CASE, SERIAL, REF, PA, PE, PE2)

  CASE SERIAL REF PA PE PE2
    1     AB  NA  4  NA  NA
    2   <NA>   1 NA   2  NA
    3   <NA>   1 NA  NA   3
    4     CD  NA  2  NA  NA
    5   <NA>   4 NA   1  NA
    6   <NA>   4 NA  NA   3

In the row with CASE = 1, I want to fill the empty PE and PE2 with the values from the rows below that reference it (REF = 1). Likewise, in the row with CASE = 4, I want to fill the empty PE and PE2 with the values from the rows below that reference it (REF = 4). The rows without a serial number only serve to fill rows 1 and 4, so to speak. There is no way to collect the data directly into the corresponding rows. I tried this for loop, but I am not sure how to reference the values correctly:

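for (i in seq_len(nrow(df))) {
  # Rows without a serial number only carry values for the row they reference
  if (is.na(df$SERIAL[i])) {              # NA must be tested with is.na(), not ==
    target <- which(df$CASE == df$REF[i]) # row whose CASE matches this row's REF
    if (!is.na(df$PE[i]))  df$PE[target]  <- df$PE[i]
    if (!is.na(df$PE2[i])) df$PE2[target] <- df$PE2[i]
  }
}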
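
The same fill can probably be done without an explicit row loop. Below is a minimal sketch in base R, assuming each REF value matches exactly one CASE and that a value from a helper row may overwrite whatever is already in the referenced row. The names fill_from_refs and result are made up for illustration and are not part of the original attempt.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  CASE   = c(1, 2, 3, 4, 5, 6),
  SERIAL = c("AB", NA, NA, "CD", NA, NA),
  REF    = c(NA, 1, 1, NA, 4, 4),
  PA     = c(4, NA, NA, 2, NA, NA),
  PE     = c(NA, 2, NA, NA, 1, NA),
  PE2    = c(NA, NA, 3, NA, NA, 3)
)

# Hypothetical helper: copy the non-missing values of one column
# from the helper rows into the rows they reference.
# Assumption: every REF points at exactly one CASE.
fill_from_refs <- function(data, col) {
  # helper rows that reference another row and carry a value in `col`
  src <- which(!is.na(data$REF) & !is.na(data[[col]]))
  # row index of the case each helper row points at
  dest <- match(data$REF[src], data$CASE)
  # ignore references to cases that do not exist
  ok <- !is.na(dest)
  data[[col]][dest[ok]] <- data[[col]][src[ok]]
  data
}

for (col in c("PE", "PE2")) {
  df <- fill_from_refs(df, col)
}

# Keep only the rows that have a serial number
result <- df[!is.na(df$SERIAL), ]
print(result)
```

match() returns, for each REF, the position of the row with that CASE, so the assignment writes straight into the referenced rows. Filtering on SERIAL afterwards drops the helper rows once their values have been copied.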
