Monday, 25 July 2016

Subsetting on two conditions in a loop

I have a data.frame in R that I want to subset on two conditions: rows whose value in a is not duplicated should be kept, and where a value is duplicated, only the row with b == 1 should be returned. Instead of the expected five rows, I get all seven rows of this sample df back. What is the cause?

a <- c(rep("A", 2), "B", rep("C",2), "D", "E")
b <- c(1,2,1,1,2,1,2)
df <- data.frame(a,b)

result <- data.frame()
for (i in seq_along(df$a)) {
  if (duplicated2(df$a)[i] == FALSE) {
    result <- rbind(result, df[i,])
  } else if (duplicated2(df$a)[i] == TRUE && df$b == 1) {
    result <- rbind(result, df[i,])
  }
}

I'm new to programming and to R, so maybe I'm getting something basic wrong. Is there an easier way to do this?
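
A likely cause is the second condition: `df$b == 1` compares the whole column and returns seven values, while `&&` expects a single TRUE or FALSE. Older versions of R silently use only the first element (TRUE here, because `b[1]` is 1), so every duplicated row passes; recent versions raise an error instead. The sketch below indexes `b` with `[i]` and then shows a vectorized alternative without a loop. Note that `duplicated2()` is not a base R function and is not defined in the post, so the definition used here is an assumption: it flags every member of a duplicated group, including the first occurrence.

```r
# Sample data from the question.
a <- c(rep("A", 2), "B", rep("C", 2), "D", "E")
b <- c(1, 2, 1, 1, 2, 1, 2)
df <- data.frame(a, b)

# ASSUMPTION: hypothetical stand-in for the undefined duplicated2().
# It marks all rows of a duplicated group, not just the later occurrences.
duplicated2 <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

# Compute the duplicate flags once instead of on every iteration.
dup <- duplicated2(df$a)

# Loop version: df$b[i] is a single value, so the condition has length one.
result <- data.frame()
for (i in seq_along(df$a)) {
  if (!dup[i] || df$b[i] == 1) {
    result <- rbind(result, df[i, ])
  }
}

# Vectorized version: keep rows that are not duplicated or that have b == 1.
# The elementwise operator | is used here because both sides are vectors.
result_vec <- df[!dup | df$b == 1, ]
```

Both `result` and `result_vec` should hold the same five rows (original rows 1, 3, 4, 6 and 7). The vectorized form is usually preferable, since growing a data.frame with `rbind()` inside a loop copies it on every iteration.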
