Tuesday, 22 November 2016

Separating if conditions for which there can be some overlapping cases

Given a pandas dataframe wb, which looks like this (in Excel, before bringing it into pandas with read_csv()):

[image: screenshot of the data as shown in Excel]

The values in column ad_tag_name come in groups of 3 rows. I want to append _level2 to the value in the second row of each group, and _level3 to the value in the third row of each group, so I end up with something like:

[image: screenshot of the desired result, with the _level2 and _level3 suffixes]

I have decided to use the modulo operation, with the logic that "if the row number divides evenly by both 2 and 3, then append _level3; if it divides evenly only by 2, then append _level2. Otherwise, leave it alone."

for index, elem in enumerate(wb['ad_requests']):
    if np.mod(index+1,2) == 0 and np.mod(index+1,3) == 0:
        wb.at[index,'\xef\xbb\xbf"ad_tag_name"'] = wb.at[index,'\xef\xbb\xbf"ad_tag_name"'] + "_level3"
    elif np.mod(index+1,3) == 0:
        wb.at[index,'\xef\xbb\xbf"ad_tag_name"'] = wb.at[index,'\xef\xbb\xbf"ad_tag_name"'] + "_level3"
    elif np.mod(index+1,2) == 0:
        wb.at[index,'\xef\xbb\xbf"ad_tag_name"'] = wb.at[index,'\xef\xbb\xbf"ad_tag_name"'] + "_level2"

Yet when I save the resulting CSV and examine it, I see:

[image: screenshot of the resulting CSV]

The pattern is: no suffix, _level2, _level3, _level2, no suffix, _level3, no suffix, _level2, _level3, and then this repeats. So it's correct in 7 out of 9 cases (rows 4 and 5 are swapped), but really that is an accident. I don't like the fact that there may be some overlap between the ifs/elifs I have defined, and I am sure it is this flawed logic that is at the root of the problem.
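
To see where the overlap comes from, the branch logic can be replayed on row positions alone, without the dataframe. This is only a minimal sketch: `suffix_from_conditions` is a hypothetical helper name that is not in the original code, and it assumes rows are numbered from 1, as `index+1` is in the loop above.

```python
def suffix_from_conditions(position):
    """Mirror the if/elif chain above for a 1-based row position."""
    if position % 2 == 0 and position % 3 == 0:
        return "_level3"
    elif position % 3 == 0:
        return "_level3"
    elif position % 2 == 0:
        return "_level2"
    return ""

# Suffix the conditions produce for rows 1..9.
produced = [suffix_from_conditions(p) for p in range(1, 10)]

# Suffix wanted for groups of 3: none, _level2, _level3, repeated.
wanted = ["", "_level2", "_level3"] * 3

# 1-based positions where the produced and wanted suffixes disagree.
mismatches = [p for p, (got, want) in enumerate(zip(produced, wanted), start=1)
              if got != want]
print(produced)
print(mismatches)
```

Divisibility by 2 follows a cycle of 2 rows, while the groups follow a cycle of 3, so the two fall out of step from row 4 onward.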

How can we rewrite the conditions so that they properly implement the logic I have in mind?
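
One possible way to avoid the overlap, sketched here under the assumption that every group has exactly 3 rows and that the rows are already in group order, is to test only the position within the group, that is, the 0-based index modulo 3. The names `SUFFIXES` and `add_level_suffixes` and the sample tag names are hypothetical and used only for illustration.

```python
# Suffix for each position within a group of 3 (0-based index modulo 3).
SUFFIXES = {0: "", 1: "_level2", 2: "_level3"}

def add_level_suffixes(names):
    """Return a new list with the suffix chosen by position within each group of 3."""
    return [name + SUFFIXES[i % 3] for i, name in enumerate(names)]

# Hypothetical sample data: two groups of 3 identical tag names.
tags = ["tag_a"] * 3 + ["tag_b"] * 3
result = add_level_suffixes(tags)
print(result)
```

Because each remainder 0, 1, 2 maps to exactly one suffix, no two branches can match the same row. In the loop above, the same lookup could presumably replace the if/elif chain, with `index % 3` selecting the suffix that is appended through `wb.at`.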

Python: 2.7.10 Pandas: 0.18.0
