jeudi 26 novembre 2020

How do I sum certain partially-duplicated rows but not all duplicated rows (based on a conditional statement)?

My current dataset looks something like this:

INVOICE_ID  MONTH  AMOUNT
1            2     -500
1            2     -500
1            2     1000
1            3     30
1            3     10
1            3     -30
2            5     40
2            5     60
3            4     50
....

I want to convert this to look like:

INVOICE_ID  MONTH  AMOUNT
1            2      0
1            3     10
2            5      40
2            5      60
3            4      50

So for rows that have an INVOICE_ID and MONTH that match a row with the same INVOICE_ID and MONTH and a negative AMOUNT, I want to consolidate these rows into one row that is the sum of the AMOUNTs in all matching rows. However, for rows that have an INVOICE_ID and MONTH that match rows with the same INVOICE_ID and MONTH but only positive AMOUNTs, I want to leave this data untouched.

I'm not a great R user, so this feels above my level and I'm not sure how to approach it. Any help would be greatly appreciated!

Aucun commentaire:

Enregistrer un commentaire