I have a dataset that looks like this (just a sample, which also shows how the orange-highlighted columns should look):
The two row colors only emphasize that the rows belong to different serial_num values. The orange highlight on num_of_fails and run_number only means that these are new columns I created.
Everything has to be computed per serial_num. In other words, in the sample dataset above, the serial_num is 846 and it has 2 runs, as seen in run_number. If there is another serial_num, let's say 847, then its run_number starts again at 1.
Additionally, num_of_fails increases by 1 for each fail, up to a total of 2. The num_of_fails counter restarts at 0 once it has reached 2, when a brand new run starts, or when the serial_num changes.
Here is my code:
df["num_of_fails"] = np.nan
df["run_number"] = np.nan
filter_list = ['846', '847']
sample_df = df[df.serial_num.isin(filter_list)]
sample_df_group = sample_df.groupby('serial_num')
num_of_fails = 0
for name, group in sample_df_group:
    if group.iloc[2] == False
        num_of_fails = 1
    if (group.iloc[2] == False and group.iloc[10] == 1):
        num_of_fails = 2
    else:
        num_of_fails 0
However, I get this error:
  File "<ipython-input-72-bfd213949d82>", line 3
    if group.iloc[2] == False
                             ^
SyntaxError: invalid syntax
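For reference, the SyntaxError itself comes from the missing colon at the end of that `if` line. Note also that `group.iloc[2]` selects a whole row of the group by position, so comparing it to `False` yields a Series, which an `if` cannot evaluate as a single true/false value. A minimal sketch of a well-formed version follows; it assumes a boolean column named `pass_fail` in which `False` means a fail, and the small `group` frame is a made-up stand-in for one serial_num group, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for one serial_num group; the real columns may differ.
group = pd.DataFrame({
    "serial_num": ["846", "846", "846"],
    "pass_fail": [True, False, False],  # assumption: False means the row failed
})

num_of_fails = 0
# Iterate over the values of one column instead of comparing a whole row.
for passed in group["pass_fail"]:
    if not passed:  # the trailing colon is what the original line was missing
        num_of_fails += 1

print(num_of_fails)  # 2
```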
I don't know if I am starting this logic correctly, so that num_of_fails and run_number are populated based on the current serial_num and its pass_fail value.
Any advice?
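One possible way to express the rules described above is sketched below. It rests on several assumptions that the question does not confirm: `pass_fail` is boolean with `False` meaning a fail, rows are already in chronological order within each serial_num, a pass leaves the fail count unchanged, and a new run begins on the row after num_of_fails has reached 2. The helper name `add_counters` and the sample values are hypothetical.

```python
import pandas as pd

def add_counters(df):
    """Return a copy of df with run_number and num_of_fails filled in.

    Assumptions (not confirmed by the question): pass_fail is boolean,
    False means a fail, and a run ends once two fails have been counted.
    """
    run_number, num_of_fails = {}, {}
    # Counters restart for every serial_num because each group is handled alone.
    for _, group in df.groupby("serial_num", sort=False):
        run, fails = 1, 0
        for idx, passed in group["pass_fail"].items():
            if fails == 2:  # previous row closed the run: start a new one
                run, fails = run + 1, 0
            if not passed:
                fails += 1
            run_number[idx] = run
            num_of_fails[idx] = fails
    out = df.copy()
    # Series built from the dicts are aligned on the original row index.
    out["run_number"] = pd.Series(run_number)
    out["num_of_fails"] = pd.Series(num_of_fails)
    return out

# Made-up sample data for illustration only.
sample = pd.DataFrame({
    "serial_num": ["846"] * 5 + ["847"] * 2,
    "pass_fail": [True, False, False, True, False, False, True],
})
result = add_counters(sample)
print(result)
```

With this sample, serial_num 846 gets run_number 1, 1, 1, 2, 2 and num_of_fails 0, 1, 2, 0, 1, while 847 starts over with run_number 1. If a run is defined differently in the real data, only the `if fails == 2` condition would need to change.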