mardi 3 mars 2015

Function to assign cell value to subsequent NA-cells (same column)

Thanks for taking your time to look at my problem! Am new to the forum and relatively new to R, but I'll do my best to formulate the question clearly.


I have a big set of tree sample data with an irregular number of rows per individual. In the "class" variable column (here column 2) the first row of each individual has a value (1, 2, 3 or 4) and the subsequent ones are NA. I'd like to assign the first value of each individual to the respective subsequent NA-cells (belonging to the same individual).


Reproducible example dataframe (edited):



test <- cbind(c(1, 2, 3, NA, 4, NA, NA, NA, 5, NA, 6), c(3, 4, 3, NA, 1, NA, NA, NA, 2, NA, 1))
colnames(test) <- c("ID", "class")

ID class
[1,] 1 3
[2,] 2 4
[3,] 3 3
[4,] NA NA
[5,] 4 1
[6,] NA NA
[7,] NA NA
[8,] NA NA
[9,] 5 2
[10,] NA NA
[11,] 6 1


The result I am looking for is this:



ID class
[1,] 1 3
[2,] 2 4
[3,] 3 3
[4,] NA 3
[5,] 4 1
[6,] NA 1
[7,] NA 1
[8,] NA 1
[9,] 5 2
[10,] NA 2
[11,] 6 1


I copied the last solution from this topic How to substitute several NA with values within the DF using if-else in R? and tried to adapt it to my needs like this



test2 <- as.data.frame(t(apply(test["class"], 1, function(x)
if(is.na(x[1]) == FALSE && all(is.na(head(x[1], -1)[-1])))
replace(x, is.na(x), x[1]) else x)))


but it gives me the error "dim(x) must have positive length". I tried many other versions and it gives me all sorts of errors, I don't even know where to start. How can I improve it?


Aucun commentaire:

Enregistrer un commentaire