mardi 4 avril 2017

Group rows with for loop based on if statement and add new column

I have got the following code and I need to adjust it as the output is not 100% correct:

df = read.table(text='colA   colB   colC
                10  11   7
                10  34   7
                10  89   7
                10  21   7
                9   8    0
                9   11   0
                9   21   0
                2   23   5
                2   21   5
                2   56   5
                1   45   0
                1   23   0
                22  14   3
                22  19   3
                22  90   3
                11  19   2
                11  45   2
                80  13   2
                80  17   2
                99  15   1
                55  20   1', header = TRUE)

df$ID <-NA

df[1,'ID']<-1

val=1

for (i in 2:nrow(df)){

  if (df[i,'colC']!=0 & (df[i,'colC']!=df[i-1,'colC'])) {val=val+1} 
  df[i,'ID']<-val

}

Output as per above code:

> df
   colA colB colC ID
1    10   11    7  1
2    10   34    7  1
3    10   89    7  1
4    10   21    7  1
5     9    8    0  1
6     9   11    0  1
7     9   21    0  1
8     2   23    5  2
9     2   21    5  2
10    2   56    5  2
11    1   45    0  2
12    1   23    0  2
13   22   14    3  3
14   22   19    3  3
15   22   90    3  3
16   11   19    2  4
17   11   45    2  4
18   80   13    2  4
19   80   17    2  4
20   99   15    1  5
21   55   20    1  5

What is wrong are the last 4 ID values, which should group the rows based on colC but because there are no 0s between each distinct group, colC values with same number but diverse colA are grouped together and this is not what I want.

Desired output (see last 4 ID values):

> df
   colA colB colC ID
1    10   11    7  1
2    10   34    7  1
3    10   89    7  1
4    10   21    7  1
5     9    8    0  1
6     9   11    0  1
7     9   21    0  1
8     2   23    5  2
9     2   21    5  2
10    2   56    5  2
11    1   45    0  2
12    1   23    0  2
13   22   14    3  3
14   22   19    3  3
15   22   90    3  3
16   11   19    2  4
17   11   45    2  4
18   80   13    2  5
19   80   17    2  5
20   99   15    1  6
21   55   20    1  7

How can I solve this? Thanks

Aucun commentaire:

Enregistrer un commentaire