I have a pandas DataFrame containing activity count data from a Philips Actiwatch. When there is no activity count for 60 minutes or more, the user was probably not wearing the device, and that range should be removed.
How do I detect such periods (each line is 1 minute) in my DataFrame and remove the complete period? That is, if the activity count is 0 for 59 consecutive lines or fewer, nothing should happen, but if it is 0 for 60 consecutive lines or more (say, 80 lines), those values should be set to NaN.
The CSV file with the data can be found here: http://ift.tt/28Ye7QT
It does not work as it stands, but this is where I got stuck:
import pandas as pd

# remove all data where Activity = 0 for 60 or more consecutive minutes:
zero_count = 0
for n in range(len(data)):
    if pd.isnull(data['Activity'].loc[n]):
        continue
    elif data['Activity'].loc[n] > 0:
        continue
    elif data['Activity'].loc[n] == 0:
        while data['Activity'].loc[n] == 0:
            zero_count = zero_count + 1
            if zero_count >= 60:
                # TODO: set the last zero_count lines to NaN.
                zero_count = 0
                break
            else:
                zero_count = 0
                break
    else:
        print("Non-wear detection error")
        break
What I was trying to do is check each line: if it is 0, add 1 to zero_count; when a non-zero value is read, check whether zero_count is 60 or more. If it is, set the whole range to NaN and reset zero_count. If it is below 60, just reset zero_count without setting any data to NaN.
I hope someone understands what I am trying to do and can either 1) make the code above work, or 2) suggest a better way of doing it.
Thanks to everyone who is even reading this post.
Best regards,
Rob
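
One possible approach, offered as a sketch rather than a tested solution for the actual Actiwatch file: label each run of consecutive equal zero/non-zero states, measure the length of every run, and mask the zero runs that are long enough. The function name mask_nonwear and the parameter min_run are made up for this illustration; the threshold of 60 follows the "60 lines or more" rule in the question, and the toy Series stands in for the real data, which was not inspected.

```python
import numpy as np
import pandas as pd


def mask_nonwear(activity, min_run=60):
    """Return a float copy of `activity` in which every run of at least
    `min_run` consecutive zeros is replaced by NaN.

    `mask_nonwear` and `min_run` are hypothetical names for this sketch.
    """
    is_zero = activity.eq(0)
    # A new run starts wherever the zero/non-zero state changes, so the
    # cumulative sum of those changes gives each run its own id.
    run_id = is_zero.ne(is_zero.shift()).cumsum()
    # Length of the run that each row belongs to.
    run_length = is_zero.groupby(run_id).transform('size')
    # Series.mask sets values to NaN where the condition is True.
    return activity.astype(float).mask(is_zero & (run_length >= min_run))


# Toy data: a 59-minute zero run (kept) and an 80-minute zero run (masked).
toy = pd.Series([5] + [0] * 59 + [3] + [0] * 80 + [7])
cleaned = mask_nonwear(toy)
```

Assuming the column is called 'Activity' as in the question, this could be applied with data['Activity'] = mask_nonwear(data['Activity']). Existing NaN values are not counted as zeros here, so they split a zero run in two; whether that is the desired behaviour depends on the data.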