I have this DataFrame:
import pandas as pd

df = pd.DataFrame({'user':[1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,4],
'date':['1995-09-01','1995-09-02','1995-10-03','1995-10-04','1995-10-05','1995-11-07','1995-11-08','1995-11-09','1995-11-10','1995-11-15','1995-12-18','1995-12-19','1995-12-20','1995-12-23','1995-12-26','1995-12-27'],
'dc':['1995-09-02','1995-09-02','1995-10-02','1995-10-05','1995-10-05','1995-11-05','1995-11-05','1995-11-10','1995-11-10','1995-11-10','1995-12-10','1995-12-23','1995-12-23','1995-12-23','1995-12-23','1995-12-23'],
'tp':['s','c','f','s','c','c','f','s','c','s','f','s','s','c','s','f'],
'vt':['0','1','0','0','1','0','0','0','1','0','0','0','0','1','0','0'],
'c1':['1','5','0','2','3','9','3','2','0','5','5','6','4','0','6','0'],
'c2':['3','4','0','2','5','3','8','4','0','6','2','7','0','0','8','0'],
'c3':['5','5','2','5','6','4','2','4','4','6','3','4','3','8','2','7']})
df
which gives:
user date dc tp vt c1 c2 c3
1 1995-09-01 1995-09-02 s 0 1 3 5
1 1995-09-02 1995-09-02 c 1 5 4 5
1 1995-10-03 1995-10-02 f 0 0 0 2
2 1995-10-04 1995-10-05 s 0 2 2 5
2 1995-10-05 1995-10-05 c 1 3 5 6
2 1995-11-07 1995-11-05 c 0 9 3 4
2 1995-11-08 1995-11-05 f 0 3 8 2
3 1995-11-09 1995-11-10 s 0 2 4 4
3 1995-11-10 1995-11-10 c 1 0 0 4
3 1995-11-15 1995-11-10 s 0 5 6 6
3 1995-12-18 1995-12-10 f 0 5 2 3
4 1995-12-19 1995-12-23 s 0 6 7 4
4 1995-12-20 1995-12-23 s 0 4 0 3
4 1995-12-23 1995-12-23 c 1 0 0 8
4 1995-12-26 1995-12-23 s 0 6 8 2
4 1995-12-27 1995-12-23 f 0 0 0 7
I want to create a new column df['dc2']. Grouping by user, df['dc2'] should equal df['dc'] by default. However, if the user has an entry that meets the conditions 'tp'='c' & 'vt'=1 & 'c1'=0 & 'c2'=0, then df['dc2'] should take the date of the preceding entry (the previous row of that user) instead.
For example, for user 3, the entry with 'tp'='c' & 'vt'=1 has 'c1'=0 and 'c2'=0, so the value of df['dc2'] for user 3 should be '1995-11-09' instead of '1995-11-10'.
Likewise, for user 4, the entry with 'tp'='c' & 'vt'=1 has 'c1'=0 and 'c2'=0, so df['dc2'] for user 4 should be '1995-12-20' instead of '1995-12-23'.
Here is the desired result:
user date dc dc2 tp vt c1 c2 c3
1 1995-09-01 1995-09-02 1995-09-02 s 0 1 3 5
1 1995-09-02 1995-09-02 1995-09-02 c 1 5 4 5
1 1995-10-03 1995-10-02 1995-10-02 f 0 0 0 2
2 1995-10-04 1995-10-05 1995-10-05 s 0 2 2 5
2 1995-10-05 1995-10-05 1995-10-05 c 1 3 5 6
2 1995-11-07 1995-11-05 1995-11-05 c 0 9 3 4
2 1995-11-08 1995-11-05 1995-11-05 f 0 3 8 2
3 1995-11-09 1995-11-10 1995-11-09 s 0 2 4 4
3 1995-11-10 1995-11-10 1995-11-09 c 1 0 0 4
3 1995-11-15 1995-11-10 1995-11-09 s 0 5 6 6
3 1995-12-18 1995-12-10 1995-12-09 f 0 5 2 3
4 1995-12-19 1995-12-23 1995-12-20 s 0 6 7 4
4 1995-12-20 1995-12-23 1995-12-20 s 0 4 0 3
4 1995-12-23 1995-12-23 1995-12-20 c 1 0 0 8
4 1995-12-26 1995-12-23 1995-12-20 s 0 6 8 2
4 1995-12-27 1995-12-23 1995-12-20 f 0 0 0 7
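One possible way to get this table is sketched below. It rests on an assumption about the intent that the question does not state outright: for a user with a matching 'c' row, every 'dc' of that user is shifted by the gap between that row's 'dc' and the previous row's 'date'. That reading reproduces the desired table above, including the '1995-12-09' value in user 3's last row. The helper names (flagged, prev_date, offset, user_offset) are only illustrative, and the comparisons use strings because 'vt', 'c1' and 'c2' are stored as strings in the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,4],
    'date': ['1995-09-01','1995-09-02','1995-10-03','1995-10-04',
             '1995-10-05','1995-11-07','1995-11-08','1995-11-09',
             '1995-11-10','1995-11-15','1995-12-18','1995-12-19',
             '1995-12-20','1995-12-23','1995-12-26','1995-12-27'],
    'dc': ['1995-09-02','1995-09-02','1995-10-02','1995-10-05',
           '1995-10-05','1995-11-05','1995-11-05','1995-11-10',
           '1995-11-10','1995-11-10','1995-12-10','1995-12-23',
           '1995-12-23','1995-12-23','1995-12-23','1995-12-23'],
    'tp': ['s','c','f','s','c','c','f','s','c','s','f','s','s','c','s','f'],
    'vt': ['0','1','0','0','1','0','0','0','1','0','0','0','0','1','0','0'],
    'c1': ['1','5','0','2','3','9','3','2','0','5','5','6','4','0','6','0'],
    'c2': ['3','4','0','2','5','3','8','4','0','6','2','7','0','0','8','0'],
    'c3': ['5','5','2','5','6','4','2','4','4','6','3','4','3','8','2','7'],
})

# Work on parsed dates so that date arithmetic is possible.
date = pd.to_datetime(df['date'])
dc = pd.to_datetime(df['dc'])

# Rows meeting the condition (values are strings in the sample data).
flagged = (df['tp'].eq('c') & df['vt'].eq('1')
           & df['c1'].eq('0') & df['c2'].eq('0'))

# Date of the previous row within the same user.
prev_date = date.groupby(df['user']).shift()

# Gap between the previous row's date and dc, kept only on flagged rows.
offset = (prev_date - dc).where(flagged)

# Broadcast each user's first flagged gap to all of that user's rows;
# users without a flagged row get a zero shift (assumption, see above).
user_offset = (offset.groupby(df['user'])
                     .transform('first')
                     .fillna(pd.Timedelta(0)))

df['dc2'] = (dc + user_offset).dt.strftime('%Y-%m-%d')
print(df[['user', 'date', 'dc', 'dc2']])
```

If the intent is instead to replace only the rows whose 'dc' equals the flagged row's 'dc', user 3's last row would keep '1995-12-10', so the shift-based reading is worth confirming against real data.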