Thursday, June 25, 2020

Apply if/else logic to dataframe in function: ValueError: The truth value of a Series is ambiguous

I want to create a new column in a dataframe based on if/then logic. The rules for the actual problem are the output of a CART tree, so they are fairly complex. When I try to apply the function to my dataframe, I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I am pretty sure this is because the 'if' logic is trying to evaluate the input as a whole Series rather than row by row. I just can't figure out the solution.
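A minimal sketch of that suspicion, using a small made-up Series (s, mask, raised and message are illustrative names, not part of the real problem): comparing a Series to a scalar gives a boolean Series, and 'if' then asks pandas for a single True/False, which it refuses to give.

```python
import pandas as pd

s = pd.Series([-1.0, 0.2, 0.9])   # illustrative values, not the question's data
mask = s <= 0.5                   # element-wise comparison -> boolean Series

try:
    if mask:                      # 'if' needs one True/False, not three
        pass
    raised = False
except ValueError as err:
    raised = True
    message = str(err)            # "The truth value of a Series is ambiguous. ..."
```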

To replicate:

import pandas as pd
import numpy as np
np.random.seed(1)

#create sample dataframe
df_test = pd.DataFrame({"llflag": np.random.normal(0,1,100)})

#sample if/else logic
def tree1(df):
    if df['llflag'] <= 0.5:
        return 4
    else:
        return 3

#attempt to apply function to df (this line raises the ValueError above)
df_test['testRR'] = df_test.apply(tree1(df_test), axis=1)

I get the same error with:

df_test['testRR'] = df_test.apply(lambda x: tree1(df_test), axis=1)

What am I missing? Thanks in advance.
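One possible way around this, sketched on the assumption that tree1 is meant to see one row at a time: both attempts above call tree1(df_test) on the whole dataframe, so the 'if' still receives a full Series. Passing the function itself to apply lets pandas call it once per row. The second column, testRR_vec, is a hypothetical name used here only to show a vectorized equivalent for a single split with np.where.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
df_test = pd.DataFrame({"llflag": np.random.normal(0, 1, 100)})

def tree1(row):
    # With axis=1, `row` is one row of the dataframe, so row["llflag"]
    # is a scalar and the comparison yields a single bool.
    if row["llflag"] <= 0.5:
        return 4
    return 3

# Pass the function object, not the result of calling it.
df_test["testRR"] = df_test.apply(tree1, axis=1)

# Vectorized alternative for the same rule (hypothetical column name).
df_test["testRR_vec"] = np.where(df_test["llflag"] <= 0.5, 4, 3)
```

For deeper trees with many branches, the row-wise version keeps the if/else structure readable, while the vectorized form is usually faster on large frames; which one fits better depends on how the tree rules are generated.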
