Thursday, 25 February 2021

How to create a dummy variable based on other columns' values in R?

I am removing duplicates from a scraped dataset. I want to create a dummy variable indicating whether an observation has at least one other observation that is identical to it in all variables, or in all variables but one.

Here's an example of my dataset:

Postcode nrooms price sqm
76 1 259 30
75 5 380 120
75 5 400 120
75 2 450 80
76 1 259 30

Here's the dummy I want:

Postcode nrooms price sqm dummy
76 1 259 30 1
75 5 380 120 1
75 5 400 120 1
75 2 450 80 0
76 1 259 30 1

The first and last rows have the same values for all characteristics, and the second and third rows have the same values for all characteristics but one (the price).
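
For reference, here is a minimal base R sketch of one way such a dummy could be computed. It is an illustration under stated assumptions, not a definitive answer: the data frame name `df` and the helper `is_dup` are hypothetical names introduced here, and "identical in all conditions but one" is read as "identical after dropping any single column". A row that matches another on every column also matches after one column is dropped, so both cases fall under the same check.

```r
# Example data from the question (the name `df` is an assumption).
df <- data.frame(
  Postcode = c(76, 75, 75, 75, 76),
  nrooms   = c(1, 5, 5, 2, 1),
  price    = c(259, 380, 400, 450, 259),
  sqm      = c(30, 120, 120, 80, 30)
)

# Hypothetical helper: TRUE for every row of `d` that has at least one twin,
# counting the first occurrence as well as the later ones.
is_dup <- function(d) duplicated(d) | duplicated(d, fromLast = TRUE)

# Drop one column at a time and flag rows that are duplicated on the
# remaining columns. The flags are computed before `dummy` is added,
# so the new column does not take part in the comparison.
flags <- lapply(seq_along(df), function(j) is_dup(df[-j]))

# A row gets dummy = 1 if it is flagged for at least one dropped column.
df$dummy <- as.integer(Reduce(`|`, flags))

print(df)
```

If exact duplicates need to be told apart from near-duplicates, `is_dup()` applied to the four original columns alone flags only the rows that match on everything.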

Could someone help me with this?

Thanks!
