I'm new to R and I could use some help.
I have two different datasets with people's names.
The first dataset looks something like this:
NAME
JOSE SANTOS
MIRIAM RIOS
JULIANA SILVA
The second one looks like this:
NAME
TIAGO MELO
JOSE FRAGOSO SANTOS
JULIANA DOS SANTOS SILVA
MARIANA RIOS
Then I created a key using the first letter of the first name + the last name so I could merge the two datasets and see if the person from dataset1 is present in dataset2. It looks something like this:
KEY        NAME_DB1        NAME_DB2
J SANTOS   JOSE SANTOS     JOSE FRAGOSO SANTOS
M RIOS     MIRIAM RIOS     MARIANA RIOS
J SILVA    JULIANA SILVA   JULIANA DOS SANTOS SILVA
T MELO     NA              TIAGO MELO
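For reference, here is a minimal base R sketch of how a key like this could be built and merged. The data frame names `db1` and `db2` and the helper `make_key` are made up for illustration, and the sketch assumes every name has at least two words separated by single spaces.

```r
# Sample data copied from the tables above
db1 <- data.frame(NAME = c("JOSE SANTOS", "MIRIAM RIOS", "JULIANA SILVA"),
                  stringsAsFactors = FALSE)
db2 <- data.frame(NAME = c("TIAGO MELO", "JOSE FRAGOSO SANTOS",
                           "JULIANA DOS SANTOS SILVA", "MARIANA RIOS"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: first n letters of the first word + the last word
make_key <- function(name, n = 1) {
  first <- sub(" .*$", "", name)  # drop everything from the first space on
  last  <- sub("^.* ", "", name)  # drop everything up to the last space
  paste(substr(first, 1, n), last)
}

db1$KEY <- make_key(db1$NAME)
db2$KEY <- make_key(db2$NAME)

# Full outer join: keep keys that appear in only one dataset as NA rows.
# Both inputs have a NAME column, so the suffixes produce NAME_DB1 / NAME_DB2.
merged <- merge(db1, db2, by = "KEY", all = TRUE,
                suffixes = c("_DB1", "_DB2"))
print(merged)
```

Calling `make_key(db1$NAME, n = 2)` would give the two-letter variant of the same key.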
Then I created another key that gets the first 2 letters from the first name + last name because M RIOS could stand for either Miriam Rios or Mariana Rios.
KEY2        NAME_DB1        NAME_DB2
JO SANTOS   JOSE SANTOS     JOSE FRAGOSO SANTOS
MI RIOS     MIRIAM RIOS     NA
JU SILVA    JULIANA SILVA   JULIANA DOS SANTOS SILVA
TI MELO     NA              TIAGO MELO
MA RIOS     NA              MARIANA RIOS
What I need to do now is merge those 2 datasets by these 2 keys. The logic I need to follow is:
If LAST NAME == "RIOS", use the second key (first 2 letters + last name). If LAST NAME is anything other than "RIOS", use the first key (first letter + last name).
I don't know how to do that. I can't use the second key (two letters + last name) for everything because some names in my datasets start with a single initial, for example R D RODRIGUES.
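One possible way to express that rule in base R is to build a single key column whose length depends on the last name, and then merge on that one column. This is only a sketch: `conditional_key` and `long_key_surnames` are hypothetical names, and it assumes the surnames needing the longer key can be listed by hand.

```r
# Sample data copied from the question
db1 <- data.frame(NAME = c("JOSE SANTOS", "MIRIAM RIOS", "JULIANA SILVA"),
                  stringsAsFactors = FALSE)
db2 <- data.frame(NAME = c("TIAGO MELO", "JOSE FRAGOSO SANTOS",
                           "JULIANA DOS SANTOS SILVA", "MARIANA RIOS"),
                  stringsAsFactors = FALSE)

# Assumption: surnames that need the 2-letter key are listed manually
long_key_surnames <- c("RIOS")

# Hypothetical helper: 2 letters of the first name for listed surnames,
# 1 letter for everything else
conditional_key <- function(name, long_surnames) {
  first <- sub(" .*$", "", name)  # first word
  last  <- sub("^.* ", "", name)  # last word
  n <- ifelse(last %in% long_surnames, 2, 1)
  paste(substr(first, 1, n), last)
}

db1$KEY <- conditional_key(db1$NAME, long_key_surnames)
db2$KEY <- conditional_key(db2$NAME, long_key_surnames)

# Full outer join on the combined key
merged <- merge(db1, db2, by = "KEY", all = TRUE,
                suffixes = c("_DB1", "_DB2"))
print(merged)
```

With this rule, MIRIAM RIOS and MARIANA RIOS get different keys (MI RIOS and MA RIOS), while a name such as R D RODRIGUES still gets the one-letter key R RODRIGUES.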