Tuesday, 22 June 2021

TypeError: 'in <string>' requires string as left operand, not dict_keys

I want to categorize data based on certain keywords that appear in a column. The pseudocode is:

  • The program checks whether any keyword in the data dictionary appears in the dataframe column
  • If one does, fill a new column with the category from the data dictionary
  • If none does, fill the new column with "others"

Problem: So far the code works, but it only categorizes a value when the value is exactly the same as a keyword in the data dictionary. For example:

  • if the data is "scotland", it is categorized as "scotland".
  • if the data is "I love scotland", it should be categorized as "scotland" too, but the current program categorizes it as "others".
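
The difference between the two cases comes down to what `in` tests. A minimal sketch, using a hypothetical one-entry dictionary rather than the full one from the code below: membership in a dict compares the whole string against each key, while membership in a string looks for a substring.

```python
# Hypothetical one-entry lookup table, for illustration only.
city_dict = {"scotland": "scotland"}

text = "i love scotland"

# Membership in a dict compares the whole string against each key.
exact_match = text in city_dict

# Membership in a str looks for the left operand anywhere inside it.
substring_match = "scotland" in text

print(exact_match, substring_match)  # False True
```

So the substring test has to be written with the keyword on the left and the data on the right, one keyword at a time.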

Code:

import pandas as pd
data = {'country': ['cheshire','scotland', 'scot', 'scot54','sctland is my country', 'here is Cambrgeshire','Cambridgeshire','County of Cambridgeshire Tourism Website','berkshire']}  
  
# Create DataFrame  
df = pd.DataFrame(data)  
print(df)

def func(a):
    scotland_dict = {k:"scotland" for k in ['scotland','scot','sctland']}
    cambridgeshire_dict = {k:"cambridgeshire" for k in ['Cambrgeshire','cambridgeshire','idgeshire']}

    city_dict = {**scotland_dict ,**cambridgeshire_dict }

    if  a.lower() in city_dict.keys():
        return city_dict[a.lower()]
    elif "cheshire" in a.lower():
        return "cheshire"
    else:
        return "others"

df["city"] = df["country"].apply(lambda x: func(x))
print(df["city"])

Current output:

0          cheshire
1          scotland
2          scotland
3            others
4            others
5            others
6    cambridgeshire
7            others
8            others

Expected output:

0          cheshire
1          scotland
2          scotland
3          scotland
4          scotland
5    cambridgeshire
6    cambridgeshire
7    cambridgeshire
8            others
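
One possible way to reach this expected output is sketched below. It is not necessarily the intended design: it assumes the keywords can all be lowercased (the original 'Cambrgeshire' key has a capital C and so can never match `a.lower()`), and it folds "cheshire" into the same table. The names `KEYWORD_TO_CITY` and `categorize` are made up for this illustration, and a plain list stands in for the dataframe column.

```python
# Keyword -> category. Keywords are lowercased (an assumption) so that they
# can match the lowercased input; "cheshire" is folded into the same table.
KEYWORD_TO_CITY = {
    "scotland": "scotland",
    "scot": "scotland",
    "sctland": "scotland",
    "cambrgeshire": "cambridgeshire",
    "cambridgeshire": "cambridgeshire",
    "idgeshire": "cambridgeshire",
    "cheshire": "cheshire",
}


def categorize(text):
    """Return the category of the first keyword found inside text, else 'others'."""
    lowered = text.lower()
    for keyword, city in KEYWORD_TO_CITY.items():
        # Substring test: the keyword (a str) is the left operand.
        if keyword in lowered:
            return city
    return "others"


# Same sample values as the 'country' column in the question.
countries = [
    "cheshire", "scotland", "scot", "scot54", "sctland is my country",
    "here is Cambrgeshire", "Cambridgeshire",
    "County of Cambridgeshire Tourism Website", "berkshire",
]
cities = [categorize(c) for c in countries]
print(cities)
```

With the dataframe from the question, the same function could be applied as `df["city"] = df["country"].apply(categorize)`. Note that the first matching keyword wins, and that a short keyword such as "scot" will also match inside unrelated words (for example "mascot"), so the keyword list may need tightening for real data.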

Update: what I've tried:

    if  city_dict.keys() in a.lower():
        return city_dict[a.lower()]
    elif "cheshire" in a.lower():
...

Error:

Exception has occurred: TypeError
'in <string>' requires string as left operand, not dict_keys
  File "/home/abyres/testa.py", line 22, in func
    if  city_dict.keys() in a.lower():
...
