This one might be a bit different from the regular-expression questions already on StackOverflow. I hope it is not a duplicate.
I have 1000 columns, of which 400 contain the word "ball", such as "a - ball", "b-ball", and "ball - c". The word "ball" can appear anywhere in the column name. The code I use is
m = df.columns.str.contains(r"K11\s|ball")
df_masked = df.loc[:, ~m]
The above mask drops all the columns containing "ball", as well as any column containing K11 followed by whitespace and whatever comes after it, such as "K11 - tennis".
My question concerns the second part of the expression: I want to drop all the columns containing "ball" except the columns that have "R15 - balls".
My guess is that I could build an if statement, with or without a regular expression, that checks whether a column name contains "ball" and, if so, whether it also contains "R15". If yes, keep the column; if not, drop it. The check would then be repeated for each of the remaining columns.
How can I do that?
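One possible way to express the keep/drop rule described above, without an explicit loop, is to build separate boolean masks and combine them. This is only a sketch: the column names and the single-row DataFrame are made up for illustration, and it assumes that any column name containing "R15" should be exempt from the "ball" rule. The commented alternative uses a negative lookahead to fold the same rule into one pattern.

```python
import pandas as pd

# Hypothetical column names, for illustration only
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6]],
    columns=["a - ball", "b-ball", "ball - c", "R15 - balls", "K11 - tennis", "other"],
)

cols = df.columns
has_ball = cols.str.contains("ball")    # any name containing "ball"
has_k11 = cols.str.contains(r"K11\s")   # "K11" followed by whitespace
is_r15 = cols.str.contains("R15")       # assumed exemption: names containing "R15"

# Drop K11 columns, and "ball" columns unless they are also R15 columns
drop = has_k11 | (has_ball & ~is_r15)
df_masked = df.loc[:, ~drop]
kept = list(df_masked.columns)

# Alternative: one regex, where the lookahead rejects names containing "R15"
drop_regex = cols.str.contains(r"K11\s|^(?!.*R15).*ball")
kept_regex = list(df.loc[:, ~drop_regex].columns)
```

Separate masks tend to be easier to read and adjust than a single pattern. Note that both versions are case-sensitive, so a name such as "k11 - tennis" would need `case=False` or a different pattern.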