I am new to Python and need some assistance. I am trying to write a simple function that takes an int value from a column in my DataFrame (seconds_left) and creates an additional column of string labels by binning those values. I am referencing a separate NumPy array (bins2) that contains the cut points for the bins.
I created a 1D NumPy array with the cut points for each bin, which I'd like to use as my boundaries. These cut points come from the Sturges binning method applied to one column, 'seconds_left'. Note that the seconds_left column in the DataFrame holds int values while the array holds continuous (float) values. I'm not sure whether that is okay.
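On the int-versus-float point, a minimal sketch suggests the mix is harmless, since NumPy compares integers against float edges directly. The sample values below are made up for illustration, and `np.histogram_bin_edges` is used here only as a plot-free stand-in for the `plt.hist` call:

```python
import numpy as np

# Hypothetical sample data standing in for the integer seconds_left column.
seconds_left = np.array([0, 5, 18, 19, 37, 38, 150, 299])

# Same estimator and range as in the question, but without drawing a histogram.
edges = np.histogram_bin_edges(seconds_left, bins='sturges', range=(0, 300))

# Integer values compare against the float edges without any conversion.
in_first_bin = (seconds_left >= edges[0]) & (seconds_left < edges[1])

print(edges.dtype, in_first_bin.dtype)
```

The number of edges depends on the data, so the cut points from this toy sample will not match the 18.75-wide bins from the real column.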
import matplotlib.pyplot as plt

# Turn the Series into a flat NumPy array
sec_left_nparray = dfb.loc[:, 'seconds_left'].values
# Bin cut points based on the Sturges estimator
(n2, bins2, patches2) = plt.hist(sec_left_nparray, bins='sturges', range=(0, 300), density=True)
# Binning function
def bin_sec_left(row):
    if ((row >= bins2[0]) & (row < bins2[1])):
        return '0-18.75'
    elif ((row >= bins2[1]) & (row < bins2[2])):
        return '18.75 to 37.5'
    else:
        return 'NA'
# Add the new column to the DataFrame
df['sec_left_bin'] = df['seconds_left'].apply[bin_sec_left]
I simply want to return a DataFrame with the added string column computed from the values in my array. I am trying to use the array elements as the boundaries in the conditional statements. However, I keep getting the error "'method' object is not subscriptable". Any idea what I am doing wrong? Thanks in advance.
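For reference, the error message is consistent with `.apply[bin_sec_left]` using square brackets, which tries to index the method instead of calling it. A minimal sketch of the likely fix follows, assuming a small made-up DataFrame and only the first three cut points from the question. The `pd.cut` call at the end is an alternative not used in the original code, and the column name `sec_left_cut` is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical data; the edges mimic the first cut points from the question.
df = pd.DataFrame({'seconds_left': [0, 10, 18, 19, 37, 40]})
bins2 = np.array([0.0, 18.75, 37.5])

def bin_sec_left(value):
    """Map a single seconds_left value to its bin label."""
    if bins2[0] <= value < bins2[1]:
        return '0-18.75'
    elif bins2[1] <= value < bins2[2]:
        return '18.75 to 37.5'
    return 'NA'

# Parentheses call the method; square brackets would try to index it.
df['sec_left_bin'] = df['seconds_left'].apply(bin_sec_left)

# Alternative: let pandas do the binning. right=False makes each bin
# closed on the left and open on the right, matching the >= / < tests above.
df['sec_left_cut'] = pd.cut(
    df['seconds_left'],
    bins=bins2,
    right=False,
    labels=['0-18.75', '18.75 to 37.5'],
)

print(df)
```

With `pd.cut`, values outside the given edges become NaN instead of the string 'NA', so the two columns differ in how they mark out-of-range rows.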