I’m trying to create a function which sums actual and inferred values from one column to create another. My data is of the form:
Nest <- c("a","b","c","d","e","a","c","a","d","c","b")
Age <- c(5,5,4,6,5,7,6,9,10,8,10)
Brood <- c(4,3,4,4,3,4,3,3,4,3,1)
df <- data.frame(Nest, Age, Brood)
What I am trying to do is sum brood across all days up until the current age, such that 1 day with 4 chicks is worth 4, 2 days with 3 chicks each are worth 6, and so on. This requires the function to impute values for days with no data. If one or more chicks have died between visits (i.e. there is a reduction in Brood), the function needs to assume they died on the middle day between visits. We can assume that the brood size on the first visit is correct for all previous days. Brood size can only decrease, not increase.
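To make the imputation rule concrete, here is a small worked sketch for nest b only (visited at age 5 with 3 chicks and at age 10 with 1 chick). It rests on one assumption about "the middle day": the earlier count is kept through floor((5 + 10) / 2) = day 7, and the later count applies from day 8 onward. The names `switch_day` and `daily` are illustrative only.

```r
# Nest b: two visits, brood drops from 3 to 1 between them.
ages  <- c(5, 10)
brood <- c(3, 1)

# ASSUMPTION: the earlier count holds through the floor of the midpoint
# between the two visit ages; the later count applies after that.
switch_day <- floor(mean(ages))                  # 7

# One imputed brood size per day, from day 1 to the last visit.
daily <- c(rep(brood[1], switch_day),            # days 1..7 at 3 chicks
           rep(brood[2], ages[2] - switch_day))  # days 8..10 at 1 chick

cumsum(daily)[ages]   # running totals on the two visit days: 15 24
```

Under this reading the totals for nest b are 15 at age 5 and 24 at age 10.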
The correct output for the above data would be:
df$Sum.Br <- c(20,15,16,24,15,28,23,35,40,29,24)
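One possible way to get these totals without nesting `ifelse` calls is to work on the gaps between consecutive visits within each nest. The sketch below is not a definitive implementation: `sum_brood` is a hypothetical helper name, and it assumes that for a gap of `g` days the first `floor(g / 2)` days keep the earlier brood size and the remaining `ceiling(g / 2)` days take the later one. It also assumes no missing values and at most one visit per age within a nest.

```r
Nest  <- c("a","b","c","d","e","a","c","a","d","c","b")
Age   <- c(5,5,4,6,5,7,6,9,10,8,10)
Brood <- c(4,3,4,4,3,4,3,3,4,3,1)
df <- data.frame(Nest, Age, Brood)

# Hypothetical helper: running sum of daily brood sizes for ONE nest.
sum_brood <- function(age, brood) {
  o <- order(age)                  # put visits in chronological order
  a <- age[o]
  b <- brood[o]
  gap     <- diff(a)               # days between consecutive visits
  earlier <- b[-length(b)]         # brood size at the start of each gap
  later   <- b[-1]                 # brood size at the end of each gap
  # ASSUMPTION: first half of each gap (rounded down) uses the earlier
  # count, the rest (including the visit day itself) uses the later count.
  step  <- floor(gap / 2) * earlier + ceiling(gap / 2) * later
  # The first visit back-fills all previous days with its own brood size.
  total <- cumsum(c(a[1] * b[1], step))
  out <- numeric(length(age))
  out[o] <- total                  # restore the original row order
  out
}

# Apply the helper nest by nest, writing results back to the matching rows.
df$Sum.Br <- NA_real_
for (idx in split(seq_len(nrow(df)), df$Nest)) {
  df$Sum.Br[idx] <- sum_brood(df$Age[idx], df$Brood[idx])
}
df$Sum.Br
```

With the data above this gives 20 15 16 24 15 28 23 35 40 29 24. Note that averaging the two counts over the gap, i.e. multiplying the gap by `(x[1] + x[2]) / 2`, would give 25 rather than 24 for nest b at age 10, because that gap has an odd number of days.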
I have tried to achieve this with a series of ifelse commands wrapped within transform:
tmp <- transform(df, Sum.Br = ave(Brood, Nest, FUN = function(x)
  c(df$Age*x[1],
    ifelse(x[2] == x[1],
           df$Age*x[2],
           df$Age[x[1]]*x[1] + (df$Age[x[2]]-df$Age[x[1]])*((x[1]+x[2])/2)),
    ifelse(x[3] == x[2],
           ifelse(x[2] == x[1],
                  df$Age*x[3],
                  df$Age[x[1]]*x[1] + (df$Age[x[2]]-df$Age[x[1]])*((x[1]+x[2])/2) + (df$Age[x[3]]-df$Age[x[2]])*x[3]),
           ifelse(x[2] == x[1],
                  df$Age[x[2]]*x[2] + (df$Age[x[3]]-df$Age[x[2]])*((x[2]+x[3])/2),
                  df$Age[x[1]]*x[1] + (df$Age[x[2]]-df$Age[x[1]])*((x[1]+x[2])/2) + (df$Age[x[3]]-df$Age[x[2]])*((x[2]+x[3])/2))))))
but after 3 repeats the code is getting long and error-prone (and I’m not even sure this is all correct!).
Can anyone see a simpler way of doing this? Thanks!