Wednesday, 26 June 2019

Condensing a data frame using multiple arguments from certain variables in R

I want to condense a data frame based on conditions across several variables, and I'm not sure of the simplest way to do it. I suspect it needs a custom function, but I don't have much experience writing functions.

Basically, my data frame currently looks like this:

chainID     teamID        statID        startType       endType        

1           Team A     Effective Pass      TO              TO
1           Team A     Effective Pass      TO              TO
1           Team A     Effective Pass      TO              TO
1           Team A     Effective Pass      TO              TO
1           Team A     Ineffective Pass    TO              TO
2           Team B     Effective Pass      TO              SH
2           Team B     Entry               TO              SH
2           Team B     Effective Pass      TO              SH
2           Team B     Shot                TO              SH
3           Team A     Effective Pass      ST              TO
3           Team A     Entry               ST              TO
3           Team A     Ineffective Pass    ST              TO
4           Team B     Effective Pass      TO              ST
4           Team B     Effective Pass      TO              ST
4           Team B     Ineffective Pass    TO              ST
5           Team B     Effective Pass      TO              SH
5           Team B     Entry               TO              SH
5           Team B     Goal                TO              SH
6           Team B     Effective Pass      CB              TO
6           Team B     Effective Pass      CB              TO
6           Team B     Ineffective Pass    CB              TO
7           Team A     Effective Pass      TO              ST
7           Team A     Ineffective Pass    TO              ST

Whenever Entry appears in the statID column for a chainID, I want to keep that row and the last row for that chainID, and remove all other rows for that chainID (see chainID 2 and 5). In addition, if Entry appears for a chainID but the last row of that chain isn't a Goal or Shot, I also want to keep all the rows of the next chainID (see chainID 3 and 4). All other chains should be dropped. For example:

chainID     teamID        statID        startType       endType        

2           Team B     Entry               TO              SH
2           Team B     Shot                TO              SH
3           Team A     Entry               ST              TO
3           Team A     Ineffective Pass    ST              TO
4           Team B     Effective Pass      TO              ST
4           Team B     Effective Pass      TO              ST
4           Team B     Ineffective Pass    TO              ST
5           Team B     Entry               TO              SH
5           Team B     Goal                TO              SH
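
One possible way to get from the first table to the second is sketched below in base R. This is only a sketch under a few assumptions that the question does not state: "next chainID" is taken to mean the next chain in order of appearance, a follow-on chain that contains its own Entry is handled by the Entry rule rather than kept whole, and the function name `condense_chains` is made up for illustration.

```r
# Rebuild the sample data from the question (23 rows, 7 chains).
n <- c(5, 4, 3, 3, 3, 3, 2)  # number of rows in each chain
df <- data.frame(
  chainID   = rep(1:7, times = n),
  teamID    = rep(c("Team A", "Team B", "Team A", "Team B",
                    "Team B", "Team B", "Team A"), times = n),
  statID    = c("Effective Pass", "Effective Pass", "Effective Pass",
                "Effective Pass", "Ineffective Pass",
                "Effective Pass", "Entry", "Effective Pass", "Shot",
                "Effective Pass", "Entry", "Ineffective Pass",
                "Effective Pass", "Effective Pass", "Ineffective Pass",
                "Effective Pass", "Entry", "Goal",
                "Effective Pass", "Effective Pass", "Ineffective Pass",
                "Effective Pass", "Ineffective Pass"),
  startType = rep(c("TO", "TO", "ST", "TO", "TO", "CB", "TO"), times = n),
  endType   = rep(c("TO", "SH", "TO", "ST", "SH", "TO", "ST"), times = n),
  stringsAsFactors = FALSE
)

# Hypothetical helper: walk through the chains in order and mark rows to keep.
condense_chains <- function(df) {
  keep  <- logical(nrow(df))  # all FALSE to start with
  carry <- FALSE              # TRUE if the previous chain had an Entry
                              # but did not end in a Shot or Goal
  for (id in unique(df$chainID)) {
    rows      <- which(df$chainID == id)
    stats     <- df$statID[rows]
    has_entry <- any(stats == "Entry")
    if (has_entry) {
      # Keep the Entry row(s) and the last row of this chain.
      keep[rows[stats == "Entry"]] <- TRUE
      keep[rows[length(rows)]]     <- TRUE
    } else if (carry) {
      # Assumption: a follow-on chain without its own Entry is kept whole.
      keep[rows] <- TRUE
    }
    carry <- has_entry && !(stats[length(stats)] %in% c("Shot", "Goal"))
  }
  df[keep, , drop = FALSE]
}

result <- condense_chains(df)
print(result)
```

On the sample data this keeps the nine rows listed in the desired output above. The loop is written for readability rather than speed; on a large data frame a grouped approach (for example with dplyr or data.table) would probably be faster, but the logic would be the same.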

