Thursday, June 4, 2020

How can I provide dynamically varying OR conditions for filtering a dataframe?

I have a dataframe and I use multiple conditions to filter it. The code looks roughly like the following (recency.py).

import pandas as pd

def recency(config):
    # Substrings to exclude; each position is tied to a column below.
    i = config.exclude_list

    sales = pd.read_excel(config.input_path + "klm sales UPD.xlsx")

    # Rows that match any exclude pattern, or whose Net Amount is below 10.
    df_exclude = sales[(sales["Product"].str.contains(i[0]) == True)
                     | (sales["Product"].str.contains(i[1]) == True)
                     | (sales["Customer"].str.contains(i[2]) == True)
                     | (sales["Customer"].str.contains(i[3]) == True)
                     | (sales["Customer"].str.contains(i[4]) == True)
                     | (sales["Customer"].str.contains(i[5]) == True)
                     | (sales["Customer"].str.contains(i[6]) == True)
                     | (sales["Remarks"].str.contains(i[7]) == True)
                     | (sales["Net Amount"] < 10)]

Here is the config class (config.py):

class config:

    input_path = "/home/ram/Downloads/Data_Science/"
    exclude_list = ["SCRAP ","GFT","B-MART HOME APPLIANCES","4 ELECTRONICS & APPLIANCES","DISCOUNT CORNER","CASH CUSTOMER","CASH BILL","GIFT"]

And here is the main file (main.py):

from config import config
from recency import recency  # module name matches recency.py (imports are case-sensitive)
recency(config)

config is a Python file in which I have defined a class (named config) that holds exclude_list, and this is what I pass to the recency function from the main program. The problem is that exclude_list currently has 8 items, and this might change in future. I would like to change the recency function (recency.py) so that when I create another config file with more (or fewer) than 8 items in the exclude list, I can still use the same recency function to get df_exclude. What are the possible ways to do this in Python?

If anything is not clear in this question, please let me know in the comments.
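
One possible direction, as a minimal sketch rather than a definitive answer: since each entry in exclude_list is tied to a specific column by its position, the list could be replaced by a mapping from column name to substrings, and the OR conditions built in a loop. The names EXCLUDE_PATTERNS, build_exclude_mask and min_net_amount are hypothetical and not part of the original code. The sketch also assumes the entries are meant as literal substrings (hence re.escape), whereas the original passes them to str.contains as regular expressions.

```python
from functools import reduce
import operator
import re

import pandas as pd

# Hypothetical config layout: one list of substrings per column,
# so entries can be added or removed without touching the function.
EXCLUDE_PATTERNS = {
    "Product": ["SCRAP ", "GFT"],
    "Customer": ["CASH CUSTOMER", "CASH BILL"],
    "Remarks": ["GIFT"],
}


def build_exclude_mask(sales, patterns, min_net_amount=10):
    """Return a boolean Series that is True for rows to exclude."""
    # Start with the numeric condition from the original filter.
    masks = [sales["Net Amount"] < min_net_amount]
    for column, substrings in patterns.items():
        if not substrings:
            continue  # an empty list adds no condition for this column
        # Assumption: entries are literal text, so escape regex characters
        # and join them into a single "a|b|c" alternation per column.
        regex = "|".join(re.escape(s) for s in substrings)
        # na=False treats missing cells as non-matches.
        masks.append(sales[column].str.contains(regex, na=False))
    # OR all per-column masks together, however many there are.
    return reduce(operator.or_, masks)
```

With something like this, the filter inside recency could become df_exclude = sales[build_exclude_mask(sales, config.exclude_patterns)], where exclude_patterns is a hypothetical attribute that would replace exclude_list in the config class.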
