Tuesday, March 2, 2021

Create one sum column based on two conditions

I have a data frame df:

> df
         date group    x
    1  197302     A 0.53
    2  197303     A 0.60
    3  197304     A 0.57
    4  197302     B 0.68
    5  197303     B 0.71
    6  197304     B 0.65
    7  197302     C 0.16
    8  197303     C 0.25
    9  197304     C 0.22
    10 197302     D 0.31
    11 197303     D 0.39
    12 197304     D 0.36

I want to create a new column 'x.total' where some of the x-values are summed based on two conditions:

  1. Groups A and B should only be summed with each other, and groups C and D should only be summed with each other.
  2. I only want to sum x for dates that are the same. This means that x shouldn't be summed if the date is 197302 for group A and 197303 for group B.

By following these conditions, the results should end up looking like this:

     date group    x x.total
1  197302     A 0.53    1.21
2  197303     A 0.60    1.31
3  197304     A 0.57    1.22
4  197302     B 0.68    1.21
5  197303     B 0.71    1.31
6  197304     B 0.65    1.22
7  197302     C 0.16    0.47
8  197303     C 0.25    0.64
9  197304     C 0.22    0.58
10 197302     D 0.31    0.47
11 197303     D 0.39    0.64
12 197304     D 0.36    0.58

Does anyone know how I can do that?
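
One possible approach, sketched in base R: first map each group to its pair (condition 1), then sum x within each combination of pair and date (condition 2). The helper column `pair` and the labels "AB" and "CD" are assumptions introduced for illustration; they are not part of the original data.

```r
# Rebuild the example data from the question
df <- data.frame(
  date  = rep(c(197302, 197303, 197304), times = 4),
  group = rep(c("A", "B", "C", "D"), each = 3),
  x     = c(0.53, 0.60, 0.57, 0.68, 0.71, 0.65,
            0.16, 0.25, 0.22, 0.31, 0.39, 0.36)
)

# Condition 1: A and B belong together, C and D belong together.
# `pair` is a hypothetical helper column used only for grouping.
df$pair <- ifelse(df$group %in% c("A", "B"), "AB", "CD")

# Condition 2: sum x within each (pair, date) combination.
# ave() returns a vector as long as x, so every row keeps its position.
df$x.total <- ave(df$x, df$pair, df$date, FUN = sum)

# Drop the helper column again
df$pair <- NULL

print(df)
```

Because `ave()` writes the group sum back to every row of the group, both rows that share a pair and a date receive the same x.total. With dplyr, the same idea could probably be expressed with `group_by(pair, date)` followed by `mutate(x.total = sum(x))`.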
