Thursday, 22 February 2018

Using nested for-if-else on a data frame in R

I am trying to use an if-else statement nested in a for loop to go through a data frame and build a new data frame based on conditions from the first one.

In this data frame I would like to compare each row N with row N+1.

If the values in columns 1 and 2 match,

and the absolute differences between row N and row N+1 in column 3 and in column 4 are each less than or equal to 1,

then I would like to write a new row in place of row N+1 (dropping row N)

that has the same values in columns 1 and 2 as row N+1,

the minimum of the column 3 values of rows N and N+1,

and the maximum of the column 4 values of rows N and N+1.

Example:

aaa <- c(rep("cat",4), "dog", "dog")
bbb <- c("fit", rep("fat",2), rep("fat", 3))
ccc <- c(6,5,6,9,9,9)
ddd <- c(11,10,10,22,23,24)
df <- data.frame(aaa,bbb,ccc,ddd)

Go from this:

 aaa bbb ccc ddd
 cat fit   6  11
 cat fat   5  10
 cat fat   6  10
 cat fat   9  22
 dog fat   9  23
 dog fat   9  24

To the desired output:

 aaa bbb ccc ddd
 cat fit   6  11
 cat fat   5  10
 cat fat   9  22
 dog fat   9  24

My attempt is this:

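result <- df                        # start from a copy so column types are kept
keep <- rep(TRUE, nrow(df))         # rows that survive in the final output
# seq_len(nrow(df) - 1) runs 1..n-1; 1:nrow(df)-1 parses as (1:nrow(df)) - 1 and starts at 0
for (i in seq_len(nrow(df) - 1)) {
  # && is the scalar "and" meant for if() conditions
  if (df[i, 1] == df[i + 1, 1] &&
      df[i, 2] == df[i + 1, 2] &&
      abs(df[i, 3] - df[i + 1, 3]) <= 1 &&
      abs(df[i, 4] - df[i + 1, 4]) <= 1) {
    # merge row i into row i+1: minimum of column 3, maximum of column 4
    result[i + 1, 3] <- min(result[i, 3], df[i + 1, 3])
    result[i + 1, 4] <- max(result[i, 4], df[i + 1, 4])
    keep[i] <- FALSE                # row i is absorbed by row i+1
  }
}
result <- result[keep, ]            # drop the absorbed rows
result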
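
For comparison with the loop above, the same rule can probably be expressed without a loop. This is only a sketch under some assumptions: the names `hit`, `grp`, `last` and `merged` are made up for illustration, the data frame is rebuilt with `stringsAsFactors = FALSE` so the text columns compare as plain strings, and a run of several consecutive matching rows is assumed to collapse into a single row.

```r
# Example data, same values as in the question
df <- data.frame(
  aaa = c(rep("cat", 4), "dog", "dog"),
  bbb = c("fit", rep("fat", 5)),
  ccc = c(6, 5, 6, 9, 9, 9),
  ddd = c(11, 10, 10, 22, 23, 24),
  stringsAsFactors = FALSE
)

n <- nrow(df)

# hit[N] is TRUE when row N and row N+1 satisfy all four conditions
hit <- df$aaa[-n] == df$aaa[-1] &
  df$bbb[-n] == df$bbb[-1] &
  abs(df$ccc[-n] - df$ccc[-1]) <= 1 &
  abs(df$ddd[-n] - df$ddd[-1]) <= 1

# Rows linked by a hit share a group id; a new group starts after every miss
grp <- cumsum(c(TRUE, !hit))

# Keep the last row of each group (the "N+1" row), then overwrite
# column 3 with the group minimum and column 4 with the group maximum
last <- !duplicated(grp, fromLast = TRUE)
merged <- df[last, ]
merged$ccc <- as.numeric(tapply(df$ccc, grp, min))
merged$ddd <- as.numeric(tapply(df$ddd, grp, max))
rownames(merged) <- NULL
merged
```

On the example data this should give the four rows listed under the desired output. Whether chains of more than two matching rows should really be merged into one row is not stated in the question, so that part is a guess.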
