In R, I need to compare the first 8 characters of one column, colA (Longitude.x), with the first 8 characters of a second column, colB (X.x). If the 8 characters are identical, I want to write the value of colA (Longitude.x) to a new column, colC (XCoord). In other words, if colA contains a longitude value of -122.23538 and colB contains an X value of -122.235873, I want colC to take the colA value, -122.23538, because the first 8 characters (-122.235) match.
colA (Longitude.x) and colB (X.x) are both type double when first read into R, so I have converted them to characters with the following code:
schools_merge$Longitude.x[] <- lapply(schools_merge$Longitude.x[], as.character)
schools_merge$X.x[] <- lapply(schools_merge$X.x[], as.character)
The class and type of both colA and B become "list."
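For comparison, here is a minimal sketch of why the columns end up as lists, and what calling as.character() directly on the column gives instead. The small data frame is a hypothetical stand-in for schools_merge built from values shown in this post, and lon_chr and as_list are illustrative names.

```r
# Hypothetical stand-in for schools_merge, using values from the post
schools_merge <- data.frame(
  Longitude.x = c(-120.76288, -122.23538, -122.19604),
  X.x         = c(-120.763628, -122.235873, -122.197942)
)

# lapply() always returns a list, one element per value
as_list <- lapply(schools_merge$Longitude.x, as.character)
class(as_list)   # "list"

# as.character() is already vectorised and returns a character vector
lon_chr <- as.character(schools_merge$Longitude.x)
class(lon_chr)   # "character"
lon_chr[2]       # "-122.23538"
```

With a plain character vector, substr() and == operate element by element over the whole column.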
I have tried the following code to write a new colC (XCoord):
schools_merge$XCoord <- if(substr(schools_merge$X.x,1,8) == substr(schools_merge$Longitude.x,1,8)) "yes" else "no"
While this code runs, it returns a warning--
Warning message:
In if (substr(schools_merge$X.x, 1, 8) == substr(schools_merge$Longitude.x,
: the condition has length > 1 and only the first element will be used
--and not the desired outcome (for example, the second element in each list should result in a "yes" for colC (XCoord) because characters 1-8 of the number -122.23538 are equal to characters 1-8 of -122.235873).
head(schools_merge$XCoord)
head(schools_merge$Longitude.x)
head(schools_merge$X.x)
> head(schools_merge$XCoord)
[1] "no" "no" "no" "no" "no" "no"
> head(schools_merge$Longitude.x)
[[1]]
[1] "-120.76288"
[[2]]
[1] "-122.23538"
[[3]]
[1] "-122.19604"
[[4]]
[1] "-122.09222"
[[5]]
[1] "-121.77057"
[[6]]
[1] "-122.21629"
> head(schools_merge$X.x)
[[1]]
[1] "-120.763628"
[[2]]
[1] "-122.235873"
[[3]]
[1] "-122.197942"
[[4]]
[1] "-122.092998"
[[5]]
[1] "-121.770702"
[[6]]
[1] "-122.216899"
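The warning says that if() looks only at the first element of the condition, so the single result from row 1 is recycled into every row. A vectorised comparison with ifelse() is one possible way around this. The sketch below assumes the six values printed above and rebuilds them in a hypothetical data frame; same_prefix, lon_chr and x_chr are illustrative names, and using NA for non-matching rows is an assumption about the desired result.

```r
# Hypothetical stand-in for schools_merge, using the six rows printed above
schools_merge <- data.frame(
  Longitude.x = c(-120.76288, -122.23538, -122.19604,
                  -122.09222, -121.77057, -122.21629),
  X.x         = c(-120.763628, -122.235873, -122.197942,
                  -122.092998, -121.770702, -122.216899)
)

# Character versions used only for the comparison
lon_chr <- as.character(schools_merge$Longitude.x)
x_chr   <- as.character(schools_merge$X.x)

# Element-wise test: one TRUE/FALSE per row
same_prefix <- substr(lon_chr, 1, 8) == substr(x_chr, 1, 8)

# Keep the numeric longitude where the prefixes match, NA otherwise
schools_merge$XCoord <- ifelse(same_prefix, schools_merge$Longitude.x, NA_real_)

same_prefix   # FALSE TRUE FALSE TRUE TRUE TRUE
```

Row 1 does not match ("-120.762" against "-120.763"), which is consistent with if() returning "no" for every row. Replacing the last two arguments of ifelse() with "yes" and "no" would give the flag column from the original attempt.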
The possibilities I can think of are: 1) What I am assuming counts as a character (i.e. '-' and '.' and all digits) is incorrect, but I have tried several different numbers of characters to compare and I still get the same result, with head() returning either all "yes" or all "no"; or 2) I may need to convert the columns to vectors rather than to character. Any help is much appreciated!
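Both possibilities can be checked on a couple of values. This is a minimal sketch, assuming list columns shaped like the head() output in this post; lon_list and x_list are hypothetical stand-ins for the two columns.

```r
# Hypothetical list columns shaped like the head() output in the post
lon_list <- list("-120.76288", "-122.23538")
x_list   <- list("-120.763628", "-122.235873")

# unlist() flattens each list into a plain character vector
lon_chr <- unlist(lon_list)
x_chr   <- unlist(x_list)

# '-' and '.' each count as one character
nchar(lon_chr)          # 10 10
substr(lon_chr, 1, 8)   # "-120.762" "-122.235"
substr(x_chr, 1, 8)     # "-120.763" "-122.235"

# The comparison itself is element-wise
substr(lon_chr, 1, 8) == substr(x_chr, 1, 8)   # FALSE TRUE
```

The character counting matches the assumption in 1), and the comparison gives one result per row, so the single recycled value appears to come from if() rather than from substr().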
Thank you, Anna