I'm interested in reshaping a dataframe, but instead of using standard dcast functions like the mean, I'd like to use a custom function. Specifically, I'm interested in using an ifelse statement to assign binary values.
Here's a reproducible example:
# dataframe that includes extraneous information
df <- data.frame(sale_id=c(1,1,1,2,2,2,3,3,4,5),project_id=c(501,502,503,501,502,503,501,502,504,505),
sale_year=c(1990,1991,1993,1990,1992,1990,1991,1993,1990,1992),
var1=c(5,4,3,6,5,4,4,7,2,9),var2=c(7,3,4,8,5,8,2,3,5,7))
# list of the variables I actually need (I don't need 'sale_year')
varlist <- c("var1","var2")
# selecting out id variables and variables I'm interested in manipulating
dfvars <- df[,c("sale_id","project_id",varlist)]
# melt dataframe
library(reshape2)
mdata <- melt(dfvars, id=c('sale_id','project_id'))
# create custom ifelse function, assign '1' if mean is above a critical value, and '0' if not
funx <- function(u){ifelse(mean(u)>5,1,0)}
# cast data using this function
cdata <- dcast(mdata, sale_id~variable, funx)
It works if I just use a standard function, like mean (ex):
cdata <- dcast(mdata, sale_id~variable, mean)
But with my ifelse() function, I get an error about data types (logical vs. double), which doesn't make sense to me, since the result of "mean(u) > 5" should be returning a logical result (TRUE or FALSE), to then be used by the ifelse() part.
Aucun commentaire:
Enregistrer un commentaire