Being still rather new to R, I struggle with the following problem: I am looking at several particles moving along the x axis (in reality this is in 3D, but this simplifies matters for our purpose, here). I have a data frame with the each particle's ID and their corresponding position at a given time point. Here's an example:
x.position1 <- c(5, NA, 4, 7, 1, NA, 2, NA, NA, 3)
x.position2 <- c(6, NA, 8, 7, 2, 1, 2, NA, NA, 1)
x.position3 <- c(6, 2, 7, 7, 4, 3, 1, NA, NA, 6)
x.position4 <- c(7, 4, 9, 7, 5, 5, 0, 0, 5, 7)
x.position5 <- c(9, 5, NA, 7, 6, NA, 2, 3, 8, 11)
particule.ID <- c(1:10)
df <- data.frame(particule.ID, x.position1, x.position2, x.position3, x.position4, x.position5)
df
particule.ID x.position1 x.position2 x.position3 x.position4 x.position5
1 1 5 6 6 7 9
2 2 NA NA 2 4 5
3 3 4 8 7 9 NA
4 4 7 7 7 7 7
5 5 1 2 4 5 6
6 6 NA 1 3 5 NA
7 7 2 2 1 0 2
8 8 NA NA NA 0 3
9 9 NA NA NA 5 8
10 10 3 1 6 7 11
My aim is to calculate the displacement of each particle at each time point i. This displacement is therefore xi - x1. This newly calculated displacement is to be placed in a newly created column.
Here's the script I originally wrote for this:
for (i in 1:5){ # iterate for each time point i
df$Disp <- df[,2+i-1]-df[,2] # create a new column with the calculated displacement for time point i
nam.Disp <- paste("Disp", i, sep = "") #rename new column Disp+time point number
names(df)[names(df) == 'Disp'] <- nam.Disp
}
df
particule.ID x.position1 x.position2 x.position3 x.position4 x.position5 Disp1 Disp2 Disp3 Disp4 Disp5
1 1 5 6 6 7 9 0 1 1 2 4
2 2 NA NA 2 4 5 NA NA NA NA NA
3 3 4 8 7 9 NA 0 4 3 5 NA
4 4 7 7 7 7 7 0 0 0 0 0
5 5 1 2 4 5 6 0 1 3 4 5
6 6 NA 1 3 5 NA NA NA NA NA NA
7 7 2 2 1 0 2 0 0 -1 -2 0
8 8 NA NA NA 0 3 NA NA NA NA NA
9 9 NA NA NA 5 8 NA NA NA NA NA
10 10 3 1 6 7 11 0 -2 3 4 8
However, as you may notice in the data frame, sometimes a particle is not detected at i=1, or later. This means that I get a NA value. Therefore, including another loop with IF so that if that 1st time point is NA, R would go to the next time point until it found a non NA value to substract. I threfore came up with this, using ifelse instead of IF, since the latter can only deal with one value while my input is actually a column:
for (i in 1:5){ # iterate for each time point i
for (j in 1:5){ # if first time point has no value (NA) scan the row for next time point until an object is detected
ifelse(!is.na(df[,2+j-1]),
df$Disp <- (df[,2+i-1]-df[,2+j-1]), # create a new column with the calculated displacement for i time point
next) # if time point is NA go to next j (next fixed initial time point to test)
}
nam.Disp <- paste("Disp", i, sep = "") #rename new column Disp+time point number
names(df)[names(df) == 'Disp'] <- nam.Disp
}
df
particule.ID x.position1 x.position2 x.position3 x.position4 x.position5 Disp1 Disp2 Disp3 Disp4 Disp5
1 1 5 6 6 7 9 -4 -3 -3 -2 0
2 2 NA NA 2 4 5 NA NA -3 -1 0
3 3 4 8 7 9 NA NA NA NA NA NA
4 4 7 7 7 7 7 0 0 0 0 0
5 5 1 2 4 5 6 -5 -4 -2 -1 0
6 6 NA 1 3 5 NA NA NA NA NA NA
7 7 2 2 1 0 2 0 0 -1 -2 0
8 8 NA NA NA 0 3 NA NA NA -3 0
9 9 NA NA NA 5 8 NA NA NA -3 0
10 10 3 1 6 7 11 -8 -10 -5 -4 0
Somehow this return wrong values. It looks like the computation occured backwards: Disp1 = x5-x1, Disp2 = x5- x2, Disp3 = x5-x3 etc... while what I expected was: Disp1 = x1-x1, Disp2 = x2-x1, Disp3 = x3-x1 etc. How can this inclusion of the new for loop and the ifelse function have caused this? What am I doing wrong? There may be a way to do this manually, but since in reality I have at least 60 time points if not more, the task would be daunting.
Also, if you think there's a much smarter way to do this, please do share! And if I forgot to include important details that would help you understand better, just let me know.
Many thanks!
Flo
Aucun commentaire:
Enregistrer un commentaire