I want to index duplicates with respect to certain variables in R in a separate, new variable. Let's assume that I have the following dataset:
a <- seq(from=0, to=1, by=.4)
b <- seq(from=0, to=1, by=.4)
c <- seq(from=0, to=1, by=.4)
d <- seq(from=0, to=1, by=.4)
df <- expand.grid(a=a, b=b, c=c, d=d)
> df[1:20,]
a b c d
1 0.0 0.0 0.0 0
2 0.4 0.0 0.0 0
3 0.8 0.0 0.0 0
4 0.0 0.4 0.0 0
5 0.4 0.4 0.0 0
6 0.8 0.4 0.0 0
7 0.0 0.8 0.0 0
8 0.4 0.8 0.0 0
9 0.8 0.8 0.0 0
10 0.0 0.0 0.4 0
11 0.4 0.0 0.4 0
12 0.8 0.0 0.4 0
13 0.0 0.4 0.4 0
14 0.4 0.4 0.4 0
15 0.8 0.4 0.4 0
16 0.0 0.8 0.4 0
17 0.4 0.8 0.4 0
18 0.8 0.8 0.4 0
19 0.0 0.0 0.8 0
20 0.4 0.0 0.8 0
In this case, the first and the tenth rows are identical with respect to a and b. How can I assign a value, e.g. "0.00-0.00", to a new variable for all rows that have this combination (row 19 as well), and do the same for all other combinations (e.g. rows 2, 11 and 20)?
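To make the desired result concrete, here is a minimal sketch of one possible approach in base R. It assumes the label should be the two values formatted to two decimals and joined by a hyphen; the column names `ab_key` and `ab_id` are hypothetical and chosen only for illustration.

```r
# Rebuild the example grid from above (a, b, c, d all take 0, 0.4, 0.8).
vals <- seq(from = 0, to = 1, by = 0.4)
df <- expand.grid(a = vals, b = vals, c = vals, d = vals)

# Label each row by its (a, b) combination, e.g. "0.00-0.00".
# `ab_key` is a hypothetical column name used only in this sketch.
df$ab_key <- paste(sprintf("%.2f", df$a), sprintf("%.2f", df$b), sep = "-")

# Optional: an integer index per distinct combination, numbered in
# order of first appearance. `ab_id` is likewise a hypothetical name.
df$ab_id <- as.integer(factor(df$ab_key, levels = unique(df$ab_key)))

# Rows 1, 10 and 19 share the same a and b, so they get the same label.
df[c(1, 10, 19), ]
```

Formatting with `sprintf` before pasting avoids labels like "0-0" and keeps the key stable for values such as 0.4 and 0.8. If only a group number is needed, the `ab_id` column alone may be enough.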
Thanks a lot in advance!