Thursday, February 14, 2019

Conditional statement between columns and rows in dataframe

I'd like to create a column whose value is based on an if-statement between values in the same row and, if needed, in the rows above. I have a constant A and a DataFrame df:

A = 0.5
          FID_1          b          c        d            e
75907       nan 33021647.00   27014.12 27014.12        1.00
75858 159510.00 32888862.00   16532.64 28797.05        0.57
75859 159510.00 32888862.00   12264.41 28797.05        0.43
75795       nan 32869718.00   24218.16 24218.16        1.00
75518       nan 32574894.00   13304.45 13304.45        1.00

I'd like to create another column called f that tells me whether the value in e is greater than A for the given value in b. If it is, then the value is 1.

Example for the above df:

          FID_1          b          c        d            e    f
75907       nan 33021647.00   27014.12 27014.12        1.00    0
75858 159510.00 32888862.00   16532.64 28797.05        0.57    1
75859 159510.00 32888862.00   12264.41 28797.05        0.43    0
75795       nan 32869718.00   24218.16 24218.16        1.00    0
75518       nan 32574894.00   13304.45 13304.45        1.00    0

It gets trickier if I change the value of A to 0.6. In this case, for each value in b, I'd like to check whether the first row with that value has a value in e greater than A. If not, I'd like to move to the second row with the same value in b, sum the values in e up to that row, and check whether the sum is greater than A. With A = 0.6, the df looks like this:

          FID_1          b          c        d            e    f
75907       nan 33021647.00   27014.12 27014.12        1.00    0
75858 159510.00 32888862.00   16532.64 28797.05        0.57    0
75859 159510.00 32888862.00   12264.41 28797.05        0.43    1
75795       nan 32869718.00   24218.16 24218.16        1.00    0
75518       nan 32574894.00   13304.45 13304.45        1.00    0

In this case, the code adds 0.57 and 0.43 and sets f to 1 in the third row of df.

If that sum were still not greater than A, the code would continue to the third, fourth, ... row with the same value in b, if such rows exist.

This is the code for creating the e column:

df['e'] = df.apply(lambda row: row.c / row.d, axis=1)

I tried something similar for the f column, but I do not know how to include the if-statement in the same code.
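
One way the rule described above might be expressed is with a running sum of e within each value of b, flagging the first row where that sum exceeds A. This is only a sketch under stated assumptions: the helper name flag_first_crossing is made up for illustration, and it assumes that rows with a missing FID_1 always get f = 0. That second assumption is inferred from the example tables (they show f = 0 for those rows even though e = 1.00) and is not stated explicitly in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question.
df = pd.DataFrame(
    {
        "FID_1": [np.nan, 159510.0, 159510.0, np.nan, np.nan],
        "b": [33021647.0, 32888862.0, 32888862.0, 32869718.0, 32574894.0],
        "c": [27014.12, 16532.64, 12264.41, 24218.16, 13304.45],
        "d": [27014.12, 28797.05, 28797.05, 24218.16, 13304.45],
    },
    index=[75907, 75858, 75859, 75795, 75518],
)

# Same result as the row-wise apply, computed column-wise.
df["e"] = df["c"] / df["d"]


def flag_first_crossing(frame, threshold):
    """Hypothetical helper: return 1 on the first row of each b-group
    whose running sum of e exceeds threshold, and 0 everywhere else."""
    # Running sum of e within each group of equal b, in row order.
    running = frame.groupby("b")["e"].cumsum()
    crossed = running > threshold
    # Keep only the first crossing per group: the count of crossings
    # seen so far within the group is exactly 1 on that row.
    crossings_so_far = crossed.astype(int).groupby(frame["b"]).cumsum()
    first_crossing = crossed & (crossings_so_far == 1)
    # ASSUMPTION (inferred from the examples): rows without FID_1 get 0.
    has_fid = frame["FID_1"].notna()
    return (first_crossing & has_fid).astype(int)


A = 0.6
df["f"] = flag_first_crossing(df, A)
print(df)
```

With the sample data, this gives f = 1 on row 75858 for A = 0.5 and on row 75859 for A = 0.6, matching both example tables. If single-row groups should be flagged as well, dropping the has_fid mask would be the place to change it.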
