Thursday, April 19, 2018

Convert vector to state abbreviations using grep

My question is similar to, but different from, this one: "State name to abbreviation in R".

I have a data frame with a column of state names, like so:

name <- c("al's", "jim's", "joe's", "savoy", "wilhelm", "slim's", "looking glass", "eats", "bravo", "marx", "judah") 
state <- c("va", "fla", "calif", "tex", "mass", "ny", "ill", "in", "ri", "ariz", "ohio")
df <- data.frame(name, state)

As you can see, they are not standardized at all. What I would like to do is convert the state column in df to the two-letter state abbreviations, like so:

state_final <- c("va", "fl", "ca", "tx", "ma", "ny", "il", "in", "ri", "az", "oh")

The issue is that I tried using grep with R's built-in state.name dataset, but I keep getting errors. Basically, I want to look up each entry of df$state in state.name. If it matches, I want to use the abbreviation for the matched state name. If it is already an abbreviation, like "va", it should be left unchanged.

So, for example, if an entry in df$state was "Calif", I want to look in state.name, match with "California" and return the abbreviation "ca". The code I used is below:

state_final <- ifelse(grep(df$state, state.name, fixed = T, perl = T), state.abb[grep(df$state, state.name, fixed = T, perl = T)], "")   

Clearly I am missing something here. I was wondering if anyone had any suggestions. Any help would be much appreciated.
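
For reference, a few things in the call above are likely to cause trouble: grep() uses only a single pattern, so passing the whole df$state vector does not match element by element; fixed = T and perl = T cannot be combined (perl is ignored with a warning); and grep() returns indices into state.name, not a logical vector the length of df$state, so ifelse() does not line up with the rows. Below is a minimal sketch of one possible alternative, not necessarily the best one. It assumes every non-abbreviation entry is a unique prefix of a full state name (true for the sample data, but not for something like "new"), and it swaps grep for pmatch(). The helper name to_abbr is made up for illustration.

```r
# Hypothetical helper: map messy state strings to lower-case two-letter codes.
# Assumption: entries are either valid abbreviations or unique prefixes of
# a name in state.name; anything else (or anything ambiguous) becomes NA.
to_abbr <- function(x) {
  x <- tolower(trimws(as.character(x)))   # also handles factor columns
  abb <- tolower(state.abb)
  out <- rep(NA_character_, length(x))

  # Entries that are already abbreviations are kept as they are.
  is_abb <- x %in% abb
  out[is_abb] <- x[is_abb]

  # Remaining entries: exact match first, then unique partial (prefix) match.
  # duplicates.ok = TRUE lets the same state appear more than once in x.
  idx <- pmatch(x[!is_abb], tolower(state.name), duplicates.ok = TRUE)
  out[!is_abb] <- abb[idx]
  out
}

state <- c("va", "fla", "calif", "tex", "mass", "ny", "ill", "in", "ri", "ariz", "ohio")
state_fixed <- to_abbr(state)
print(state_fixed)
# [1] "va" "fl" "ca" "tx" "ma" "ny" "il" "in" "ri" "az" "oh"
```

Checking for existing abbreviations first matters for entries such as "in", which is both the code for Indiana and a prefix of "indiana". Ambiguous prefixes come back as NA rather than a guess, so they can be inspected and fixed by hand.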
