I am fairly new to Python and come from R, where I would handle this with as.factor and categorize based on the number.
Earlier I tried using replace and .loc to assign a category value in a new column according to a condition, but the code ran without producing the result I wanted.
Eventually I wrote the following very simple loop:
g['Category'] = ""
for i in g['NumFloorsGroup']:
    if i == '0-9' or i == '10-19':
        g['Category'] = 'LowFl'
    elif i == '50~':
        g['Category'] = 'HighFl'
    else:
        g['Category'] = 'NormalFl'
When I run it, though, every row ends up as 'LowFl' and the other categories are never assigned. I feel like I am missing something.
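For reference, here is a minimal sketch of what seems to be going on and one possible way around it. Assigning a scalar to g['Category'] writes that value to every row, so each pass of the loop overwrites the whole column and only the branch taken for the last row survives. The small DataFrame below is a hypothetical stand-in for g (only the group labels '0-9', '10-19' and '50~' come from my data; '20-29' is made up), and the dictionary-based mapping is just one approach, not necessarily the best one.

```python
import pandas as pd

# Hypothetical stand-in for g; the real frame has 596 rows.
g = pd.DataFrame({
    "NumFloorsGroup": pd.Categorical(["0-9", "10-19", "20-29", "50~", "0-9"]),
})

# Labels that get a special category; everything else falls back to 'NormalFl'.
mapping = {"0-9": "LowFl", "10-19": "LowFl", "50~": "HighFl"}

# Convert the categorical column to plain strings, look each label up in the
# mapping (unlisted labels become NaN), then fill those gaps with the default.
g["Category"] = g["NumFloorsGroup"].astype(str).map(mapping).fillna("NormalFl")

print(g)
```

This computes the whole column in one step from NumFloorsGroup instead of assigning inside a loop, so each row gets the category for its own group.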
The data info is as follows:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 596 entries, 128 to 595
Data columns (total 4 columns):
YearBuilt         596 non-null int64
NumFloorsGroup    596 non-null category
Count             596 non-null int64
Category          596 non-null object
dtypes: category(1), int64(2), object(1)
Any comment will be helpful!