I have two data frames, data1 and data3:
import pandas as pd

data0 = {
    'state': ['CA', 'CA', 'OH'],
    'year': [2012, 2014, 2010],
    's': [2000, 4000, 5000]
}
data1 = pd.DataFrame(data0)
data2 = {
    'state': ['CA', 'CA', 'OH'],
    'year': [2012, 2014, 2010],
    's': [2000, 4000, None]
}
data3 = pd.DataFrame(data2)
First I want to count s by state and year:
# count() skips missing values; to_frame() names the resulting column
data11 = data1.groupby(['state', 'year'])['s'].count().to_frame('result1')
data33 = data3.groupby(['state', 'year'])['s'].count().to_frame('result2')
The question is how to write a statement that:
i) prints "all rows matched" if every count in data11 (column result1) equals the corresponding count in data33 (column result2), without showing the matching rows;
ii) otherwise prints "the following rows failed" and shows the failing rows from both data11 and data33.
Thanks!
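One possible way to do the check described in i) and ii) is sketched below. It assumes the two count tables should be aligned on their (state, year) index, and it treats a group that exists in only one table as a failure. The names `merged` and `failed` are made up for this illustration and are not part of the original code.

```python
import pandas as pd

data1 = pd.DataFrame({
    'state': ['CA', 'CA', 'OH'],
    'year': [2012, 2014, 2010],
    's': [2000, 4000, 5000],
})
data3 = pd.DataFrame({
    'state': ['CA', 'CA', 'OH'],
    'year': [2012, 2014, 2010],
    's': [2000, 4000, None],
})

# Non-null counts of s per (state, year), one named column per table.
data11 = data1.groupby(['state', 'year'])['s'].count().to_frame('result1')
data33 = data3.groupby(['state', 'year'])['s'].count().to_frame('result2')

# Outer join on the shared index so groups missing from either side are kept
# (assumption: a group present in only one table should count as a failure).
merged = data11.join(data33, how='outer')

# A row fails when the counts differ; NaN from a missing group never compares equal.
failed = merged[merged['result1'].ne(merged['result2'])]

if failed.empty:
    print("all rows matched")
else:
    print("the following rows failed")
    print(failed)
```

With the sample data, the OH / 2010 group should be the only one reported, because count() ignores the missing value of s in data3.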