An if-statement returns a "missing value" error even though the value is perfectly healthy.
I wanted to write a simple script to delete rows in a dataset if one of their entries contains a certain tag. I assign an indicator variable in a new column (containsMR) and then iterate over the rows using a for-loop. If the indicator is TRUE, the row should be removed.
The indicators get assigned correctly; so far, so good. The interesting part: the loop's if-statement seems to have trouble reading the values, because it returns "Error in if (data$containsMR[i]) { : missing value where TRUE/FALSE needed".
Given the correct (and complete) assignment of indicator variables, this surprises me. What is even weirder: some, but not all, of the rows with a positive indicator are removed (checked with printouts and table(data$containsMR)).
And now the really weird stuff: if I run the same loop one more time, it removes the rest of the rows (as it should), but returns the same error. So, theoretically, I could just run the loop twice, ignore the errors and walk away with the result I wanted. That's just really not the point of what I'm doing.
Bugfixes tried:
- changed for- to while-loop
- changed indicators (and if-statement) to integer (0,1)
- ran the script in RStudio and R console
- changed variable names, included/excluded definitions (e.g. adding the proxy variable row_number instead of calling it in one line)
# Script to delete all rows containing "MR" in column "EXAM_CODE"
# import file
data <- read.csv("C:\\ScriptingTest\\ablations 0114.csv")
# add indicator column
for (i in 1:nrow(data)){
  data$containsMR[i] <- ifelse(grepl("MR", toString(data$EXAM_CODE[i])), TRUE, FALSE)
}
# remove rows with positive indicator
row_number <- nrow(data)
for (i in 1:row_number){
  if (data$containsMR[i]){
    data <- data[-c(i),]
  }
}
# export csv
write.csv(data, "C:\\ScriptingTest\\export.csv")
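A likely explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: every deletion shortens the data frame, while i keeps counting up to the original row_number. Late indices then point past the last row and yield NA, which if() cannot evaluate. Deleting row i also moves the next row into position i, so that row is never checked, which would explain why only some flagged rows disappear on the first run. The toy data frame below is a hypothetical stand-in for the CSV, and the names shrunk, past_end and filtered are made up for illustration.

```r
# Toy stand-in for the CSV (hypothetical values, not the real data)
data <- data.frame(EXAM_CODE = c("MR01", "CT02", "MR03", "MR04", "XR05"),
                   stringsAsFactors = FALSE)

# grepl() is vectorized, so the indicator column needs no loop
data$containsMR <- grepl("MR", data$EXAM_CODE)

# Why the loop can fail: after one deletion the frame has 4 rows,
# but the loop index still runs up to the original 5.
shrunk <- data[-1, ]
past_end <- shrunk$containsMR[nrow(data)]  # index 5 in a 4-row frame
print(is.na(past_end))                     # this NA is what if() rejects

# Alternative without a loop: keep only the rows whose indicator is FALSE
filtered <- data[!data$containsMR, ]
print(filtered$EXAM_CODE)
```

Because the filter is applied to all rows at once, the row count never changes while indices are still in use, so neither the skipped rows nor the NA index can occur.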