Thursday, August 15, 2019

Passing a string through multiple filters for matching

I'm working with MSA (metropolitan statistical area) data, splitting the MSA strings into cities and then putting them back together. To get the strings to match properly I need to filter on more than one column: a city should match an MSA only if the state matches as well, so the city strings have to be filtered by state first. I could create a column for each city matched to an MSA, but I'm looking for something more efficient.

> testdf <- data.frame(col1 = c('Dallas,Fort Worth,Arlington','Houston,The Woodlands,Sugar Land','Atlanta,Sandy Springs,Roswell'),
+                      col2 = c('TX','TX','GA'),
+                      stringsAsFactors = FALSE)
> df <- data.frame(col1 = c('Arlington','Houston','Arlington','Atlanta'),
+                  col2 = c('TX','TX','VA','GA'),
+                  stringsAsFactors = FALSE)
> testdf
                              col1 col2
1      Dallas,Fort Worth,Arlington   TX
2 Houston,The Woodlands,Sugar Land   TX
3    Atlanta,Sandy Springs,Roswell   GA
> df
       col1 col2
1 Arlington   TX
2   Houston   TX
3 Arlington   VA
4   Atlanta   GA

Looking for:

       col1 col2                              MSA
1 Arlington   TX      Dallas,Fort Worth,Arlington
2   Houston   TX Houston,The Woodlands,Sugar Land
3 Arlington   VA                               NA
4   Atlanta   GA    Atlanta,Sandy Springs,Roswell

I'm pretty lost on how to even ask this question, so please let me know if I have a duplicate here. If it is a duplicate, please provide guidance on how to ask better.
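
For what it's worth, here is a minimal base R sketch of one way the matching could be done. It is not necessarily the most efficient approach. It assumes the city names inside each MSA string are separated by plain commas with no surrounding spaces, and that no city or state value contains the "|" character used to build the combined key. The names lookup, key_df, key_lookup and result are made up for illustration. The idea is to expand testdf to one row per (city, state) pair while remembering the full MSA string, then look up each row of df on city and state together.

```r
testdf <- data.frame(
  col1 = c("Dallas,Fort Worth,Arlington",
           "Houston,The Woodlands,Sugar Land",
           "Atlanta,Sandy Springs,Roswell"),
  col2 = c("TX", "TX", "GA"),
  stringsAsFactors = FALSE
)
df <- data.frame(
  col1 = c("Arlington", "Houston", "Arlington", "Atlanta"),
  col2 = c("TX", "TX", "VA", "GA"),
  stringsAsFactors = FALSE
)

# Split each MSA string into its cities (assumes a bare "," separator).
cities <- strsplit(testdf$col1, ",", fixed = TRUE)

# Long lookup table: one row per city, carrying its state and full MSA string.
lookup <- data.frame(
  col1 = unlist(cities),
  col2 = rep(testdf$col2, lengths(cities)),
  MSA  = rep(testdf$col1, lengths(cities)),
  stringsAsFactors = FALSE
)

# Combined city|state keys, so a city only matches within its own state.
key_df     <- paste(df$col1, df$col2, sep = "|")
key_lookup <- paste(lookup$col1, lookup$col2, sep = "|")

# match() returns NA where no key is found and keeps the row order of df.
result <- df
result$MSA <- lookup$MSA[match(key_df, key_lookup)]
result
```

Using match() on a combined key keeps the rows of df in their original order. A left join with merge(df, lookup, by = c("col1", "col2"), all.x = TRUE) should give the same pairs, but merge() sorts on the key columns by default, so the rows may come back in a different order.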
