Monday, July 6, 2015

R: Remove counties with > 3 NAs in the Yield column, and use na.spline for counties with <= 3 NAs

I have a data.frame "df" with 5 columns: "year", "state", "county", "fips" (state-county identifier), and "Yield".

A number of counties contain NA values for Yield. I initially eliminated every county with any NA value using this code:

Data <- df %>% group_by(fips) %>% filter(!any(is.na(Yield)))

I now need to eliminate only those counties that contain more than 3 NAs, i.e. a per-county NA count > 3.
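
One way this count-based filter could be written, as a minimal sketch rather than a definitive answer: replace `!any(is.na(Yield))` with a per-group count, `sum(is.na(Yield)) <= 3`. The toy data below is hypothetical (made-up fips codes and yields) and only carries the columns needed for the illustration.

```r
library(dplyr)

# Hypothetical toy data: county 1001 has 1 NA, county 1003 has 4 NAs
df <- data.frame(
  year  = rep(2001:2006, times = 2),
  fips  = rep(c(1001, 1003), each = 6),
  Yield = c(10, NA, 12, 13, 14, 15,
            NA, NA, 22, NA, NA, 25)
)

# Keep a county only if it has at most 3 missing yields;
# sum(is.na(Yield)) is evaluated once per fips group
kept <- df %>%
  group_by(fips) %>%
  filter(sum(is.na(Yield)) <= 3) %>%
  ungroup()

unique(kept$fips)
```

Because `filter()` runs inside `group_by(fips)`, the condition is a single TRUE/FALSE per county, so a county is kept or dropped as a whole.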

For those counties with 3 or fewer NAs (NA <= 3), I use the na.spline function from the zoo package:

v <- na.spline(df$Yield)
df$Yield <- v

So far I have the following for removing all counties with NA > 3 and using the spline to fill the NAs in the remaining counties:

if (length(df$Yield[is.na(df$Yield)]) < 3) {
  na.spline(df$Yield)
} else {
  df %>% group_by(fips) %>% filter(!any(is.na(Yield)))
}

This is clearly not working. Any insight would be greatly appreciated.
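
A likely reason the attempt fails: the `if` tests the NA count of the whole `Yield` column once, not per county, and neither branch assigns its result. A possible alternative, sketched under assumptions, is to do both steps inside one grouped pipeline. The toy data is hypothetical, and sorting by `year` before interpolating is my assumption about how the series should be ordered.

```r
library(dplyr)
library(zoo)

# Hypothetical toy data: county 1001 has 1 NA, county 1003 has 4 NAs
df <- data.frame(
  year  = rep(2001:2006, times = 2),
  fips  = rep(c(1001, 1003), each = 6),
  Yield = c(10, NA, 12, 13, 14, 15,
            NA, NA, 22, NA, NA, 25)
)

Data <- df %>%
  group_by(fips) %>%
  # drop every county with more than 3 missing yields
  filter(sum(is.na(Yield)) <= 3) %>%
  # assumption: rows should be in year order within each county
  arrange(year, .by_group = TRUE) %>%
  # spline-fill the remaining NAs separately for each county;
  # na.rm = FALSE keeps the vector length unchanged for mutate()
  mutate(Yield = na.spline(Yield, na.rm = FALSE)) %>%
  ungroup()
```

Running `na.spline()` inside `mutate()` on the grouped data keeps the interpolation from using values that belong to a neighbouring county, which `na.spline(df$Yield)` on the full column would do.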
