Sunday, 31 May 2020

Error "The truth value of a DataFrame is ambiguous" when splitting a string into two columns if two conditions are met

I am trying to split the string in column['first'] when both of the following conditions are met.

  1. column['first'] contains the word 'floor' or 'floors'
  2. column['second'] is empty

However, I received an error message.

The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Below is my code:

#boolean series for condition 2: True where values in column['second'] are empty

only_first_token = pd.isna(results_threshold_50_split_ownership['second'])
print(len(only_first_token))
print(type(only_first_token))

#boolean series for condition 1: True where values in column['first'] contain the string floor or floors

first_token_contain_floor = results_threshold_50_split_ownership['first'].str.contains('floors|floor', case=False)
print(len(first_token_contain_floor))
print(type(first_token_contain_floor))

#if both conditions are met, the string in column['first'] should be split into column['first'] and column['second']

if results_threshold_50_split_ownership[(only_first_token) & (first_token_contain_floor)]:
    results_threshold_50_split_ownership.first.str.split('Floors|Floor', expand=True)

print(results_threshold_50_split_ownership['first'])

I have read some answers here and have already changed the code a few times. I made sure both boolean Series have the same length (1016). I can also successfully select the rows that fulfil the two conditions with the same expression if I remove the if. So I don't understand why it is ambiguous.
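For reference, here is a minimal sketch of what seems to be going on, assuming a small made-up DataFrame named df in place of results_threshold_50_split_ownership (the sample rows are hypothetical). Indexing with the combined mask returns a DataFrame of the matching rows, and an if statement asks Python for a single True/False from it, which pandas refuses to guess. The sketch applies the split to the masked rows with .loc instead of using if. Note also that str.split returns a new object rather than modifying the column, so its result has to be assigned back.

```python
import pandas as pd

# Hypothetical sample data standing in for results_threshold_50_split_ownership
df = pd.DataFrame({
    "first": ["3rd Floor Acme Ltd", "Ground Floors Beta Inc",
              "Gamma LLC", "5th Floor Delta"],
    "second": [None, None, None, "already set"],
})

# Same two conditions as in the question, combined row by row
only_first_token = df["second"].isna()
first_token_contain_floor = df["first"].str.contains(
    "floors|floor", case=False, na=False
)
mask = only_first_token & first_token_contain_floor

# df[mask] is a DataFrame (many rows), so asking for one truth value fails
try:
    bool(df[mask])
except ValueError as err:
    ambiguous_error = str(err)

# Split only the rows selected by the mask, at the first match
parts = df.loc[mask, "first"].str.split("Floors|Floor", n=1, expand=True)

# Write both halves back; the assignment aligns on the row index
df.loc[mask, "first"] = parts[0].str.strip()
df.loc[mask, "second"] = parts[1].str.strip()

print(df)
```

If a single yes/no answer is really what is wanted (for example "does any row match?"), mask.any() or not df[mask].empty gives one. Bracket access df['first'] is also safer than df.first, because on pandas versions where DataFrame.first is a method the attribute form does not return the column.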

Any help would be much appreciated. Many thanks.
