I'm looking to condense a data frame based on conditions across multiple variables, and I'm not sure of the easiest way to do it. I suspect it will need some kind of custom function, but I don't have much experience writing functions.
Basically, my data frame currently looks like this:
chainID  teamID  statID            startType  endType
1        Team A  Effective Pass    TO         TO
1        Team A  Effective Pass    TO         TO
1        Team A  Effective Pass    TO         TO
1        Team A  Effective Pass    TO         TO
1        Team A  Ineffective Pass  TO         TO
2        Team B  Effective Pass    TO         SH
2        Team B  Entry             TO         SH
2        Team B  Effective Pass    TO         SH
2        Team B  Shot              TO         SH
3        Team A  Effective Pass    ST         TO
3        Team A  Entry             ST         TO
3        Team A  Ineffective Pass  ST         TO
4        Team B  Effective Pass    TO         ST
4        Team B  Effective Pass    TO         ST
4        Team B  Ineffective Pass  TO         ST
5        Team B  Effective Pass    TO         SH
5        Team B  Entry             TO         SH
5        Team B  Goal              TO         SH
6        Team B  Effective Pass    CB         TO
6        Team B  Effective Pass    CB         TO
6        Team B  Ineffective Pass  CB         TO
7        Team A  Effective Pass    TO         ST
7        Team A  Ineffective Pass  TO         ST
What I'm looking to do is this: whenever the word Entry appears in the statID column for any chainID, I want to keep that row and the last row for that chainID, whilst removing all the other rows for that particular chainID (see chainID 2 and 5). In addition, if Entry appears in statID for a chainID but the last row of that chainID isn't a Goal or Shot, I also want to keep all the rows of the next chainID (see chainID 3 and 4). Every chainID that matches neither rule should be dropped. E.g. the expected output is:
chainID  teamID  statID            startType  endType
2        Team B  Entry             TO         SH
2        Team B  Shot              TO         SH
3        Team A  Entry             ST         TO
3        Team A  Ineffective Pass  ST         TO
4        Team B  Effective Pass    TO         ST
4        Team B  Effective Pass    TO         ST
4        Team B  Ineffective Pass  TO         ST
5        Team B  Entry             TO         SH
5        Team B  Goal              TO         SH
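
To make the two rules concrete, here is a minimal sketch of the logic in plain Python. It is only an illustration, not a definitive solution: the function name condense and the carry flag are made up for this example, the rows are assumed to be already sorted by chainID, and "next chainID" is taken to mean the next group of rows in that order. It also assumes that if the next chain contains its own Entry, the Entry rule applies to it rather than keeping all of its rows. The same grouping idea should carry over to a grouped filter in a data frame library.

```python
from itertools import groupby

COLUMNS = ("chainID", "teamID", "statID", "startType", "endType")

# The sample data from the question, one tuple per row.
DATA = [
    (1, "Team A", "Effective Pass", "TO", "TO"),
    (1, "Team A", "Effective Pass", "TO", "TO"),
    (1, "Team A", "Effective Pass", "TO", "TO"),
    (1, "Team A", "Effective Pass", "TO", "TO"),
    (1, "Team A", "Ineffective Pass", "TO", "TO"),
    (2, "Team B", "Effective Pass", "TO", "SH"),
    (2, "Team B", "Entry", "TO", "SH"),
    (2, "Team B", "Effective Pass", "TO", "SH"),
    (2, "Team B", "Shot", "TO", "SH"),
    (3, "Team A", "Effective Pass", "ST", "TO"),
    (3, "Team A", "Entry", "ST", "TO"),
    (3, "Team A", "Ineffective Pass", "ST", "TO"),
    (4, "Team B", "Effective Pass", "TO", "ST"),
    (4, "Team B", "Effective Pass", "TO", "ST"),
    (4, "Team B", "Ineffective Pass", "TO", "ST"),
    (5, "Team B", "Effective Pass", "TO", "SH"),
    (5, "Team B", "Entry", "TO", "SH"),
    (5, "Team B", "Goal", "TO", "SH"),
    (6, "Team B", "Effective Pass", "CB", "TO"),
    (6, "Team B", "Effective Pass", "CB", "TO"),
    (6, "Team B", "Ineffective Pass", "CB", "TO"),
    (7, "Team A", "Effective Pass", "TO", "ST"),
    (7, "Team A", "Ineffective Pass", "TO", "ST"),
]


def condense(rows):
    """Apply the two rules from the question to rows sorted by chainID."""
    kept = []
    # carry is True when the previous chain had an Entry but did not end
    # in a Goal or Shot, so the current chain must be kept in full.
    carry = False
    for _, group in groupby(rows, key=lambda r: r["chainID"]):
        chain = list(group)
        entry_idx = {i for i, r in enumerate(chain) if r["statID"] == "Entry"}
        if entry_idx:
            # Rule 1: keep every Entry row plus the last row of the chain.
            keep = sorted(entry_idx | {len(chain) - 1})
            kept.extend(chain[i] for i in keep)
            carry = chain[-1]["statID"] not in ("Goal", "Shot")
        else:
            # Rule 2: keep the whole chain only if it follows an unfinished Entry chain.
            if carry:
                kept.extend(chain)
            carry = False
    return kept


rows = [dict(zip(COLUMNS, record)) for record in DATA]
result = condense(rows)
for r in result:
    print(*(r[c] for c in COLUMNS))
```

With the sample data, chains 1, 6 and 7 are dropped because they neither contain an Entry nor follow a chain whose Entry failed to end in a Goal or Shot.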