Tuesday, 16 April 2019

Multiple conditions in a data frame, the dplyr way

I am working with early 20th-century French census data, at the household level. Each household has a household_chief (always in position 1). When a household is based on a couple, the wife is always in second position.

id_houseold <- c(1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5)
members <- c("household_chief", "wife", "child", "child", "household_chief", "wife", "household_chief", "household_chief", "wife", "child", "household_chief", "child")
birthplace <- c("Paris", "Paris", "Paris", "Paris", "Paris", "Bordeaux", "Nantes", "Paris", "Paris", "Nantes", "Nantes,", "Nantes")
data <- data.frame(id_houseold, members, birthplace)

I have numbered the members of each household by their position:

library(dplyr)
data <- data %>%
  group_by(id_houseold) %>%
  mutate(position_in_menage = 1:n())
data

Here is my result:

   id_houseold members         birthplace position_in_menage
         <dbl> <fct>           <fct>                   <int>
 1           1 household_chief Paris                       1
 2           1 wife            Paris                       2
 3           1 child           Paris                       3
 4           1 child           Paris                       4
 5           2 household_chief Paris                       1
 6           2 wife            Bordeaux                    2
 7           3 household_chief Nantes                      1
 8           4 household_chief Paris                       1
 9           4 wife            Paris                       2
10           4 child           Nantes                      3
11           5 household_chief Nantes,                     1
12           5 child           Nantes                      2

What I want to know, using the dplyr package:

Which households are made up of couples (with or without children) born in the same place?
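One possible way to approach this, offered as a minimal sketch rather than a definitive answer: keep only the households that contain a "wife" row, then compare the birthplace at position 1 with the birthplace at position 2. It relies on the assumption stated above that the household_chief is always in position 1 and the wife in position 2. The names couples_same_place and same_birthplace are hypothetical, and stringsAsFactors = FALSE is added here so that birthplaces are compared as plain strings.

```r
library(dplyr)

# Same example data as in the question (column names kept as-is).
id_houseold <- c(1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5)
members <- c("household_chief", "wife", "child", "child", "household_chief",
             "wife", "household_chief", "household_chief", "wife", "child",
             "household_chief", "child")
birthplace <- c("Paris", "Paris", "Paris", "Paris", "Paris", "Bordeaux",
                "Nantes", "Paris", "Paris", "Nantes", "Nantes,", "Nantes")
data <- data.frame(id_houseold, members, birthplace, stringsAsFactors = FALSE)

couples_same_place <- data %>%
  group_by(id_houseold) %>%
  mutate(position_in_menage = row_number()) %>%
  # Keep only households built on a couple, i.e. those with a "wife" row.
  filter(any(members == "wife")) %>%
  # Assumption: chief is at position 1 and wife at position 2.
  summarise(
    same_birthplace = birthplace[position_in_menage == 1] ==
      birthplace[position_in_menage == 2]
  ) %>%
  filter(same_birthplace)

couples_same_place$id_houseold
# households 1 and 4
```

Household 2 is dropped because the couple was born in different places (Paris and Bordeaux); households 3 and 5 are dropped because they have no wife. The children's birthplaces are ignored on purpose, since the question only asks about the couple.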
