Saturday, January 12, 2019

Create a list with sequential counting of a unique set of teams' previous opponents using Pandas cumcount

(Dataset posted at the end)

How do I create two columns that track the opponents' opponents' winning percentage (accumulated wins / accumulated games played), updated sequentially as the games are played?

Example: Team A plays Team B

Team A has previously played Team C and Team D
Team B has previously played Team E and Team F

So the dataset looks like this:

HomeTeam  AwayTeam  HomeWin  AwayWin  TotalGamesPlayedByHomeTeam  TotalGamesPlayedByAwayTeam  TotalWinsHomeTeam  TotalWinsAwayTeam
Team A    Team C    0        1        0                           0                           0                  0
Team D    Team A    0        1        0                           1                           0                  0
Team B    Team E    1        0        0                           0                           0                  0
Team F    Team B    0        1        0                           1                           0                  1
Team A    Team B    1        0        2                           2                           1                  2
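For reference, the running-total columns in this table can be rebuilt from the four base columns alone. The following is only a sketch, assuming the rows are already in chronological order; the helper names (`long_df`, `games_before`, `wins_before`) are made up for illustration. It reshapes the games to one row per team per game, then uses `groupby(...).cumcount()` for games already played and a shifted `cumsum()` for wins already earned.

```python
import pandas as pd

# Toy data from the table above (base columns only).
games = pd.DataFrame({
    "HomeTeam": ["Team A", "Team D", "Team B", "Team F", "Team A"],
    "AwayTeam": ["Team C", "Team A", "Team E", "Team B", "Team B"],
    "HomeWin":  [0, 0, 1, 0, 1],
    "AwayWin":  [1, 1, 0, 1, 0],
})

# Long format: one row per (game, team). Names here are hypothetical helpers.
home = pd.DataFrame({"game": games.index, "team": games["HomeTeam"],
                     "won": games["HomeWin"], "side": "Home"})
away = pd.DataFrame({"game": games.index, "team": games["AwayTeam"],
                     "won": games["AwayWin"], "side": "Away"})
long_df = pd.concat([home, away], ignore_index=True).sort_values("game", kind="stable")

# cumcount() numbers each team's appearances 0, 1, 2, ... in game order,
# which equals the number of games that team played BEFORE the current one.
long_df["games_before"] = long_df.groupby("team").cumcount()
# Running wins including the current game, minus the current result,
# gives the wins accumulated BEFORE the current game.
long_df["wins_before"] = long_df.groupby("team")["won"].cumsum() - long_df["won"]

# Map the per-team values back onto the one-row-per-game frame.
for side in ("Home", "Away"):
    part = long_df[long_df["side"] == side].set_index("game")
    games[f"TotalGamesPlayedBy{side}Team"] = part["games_before"]
    games[f"TotalWins{side}Team"] = part["wins_before"]

print(games)
```

On the five example rows this should reproduce the four total columns shown in the table.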

So two columns get added to the dataset: HT_OOR (HomeTeam's Opponents' Opponents' Record) and AT_OOR (AwayTeam's Opponents' Opponents' Record). HT_OOR tracks the AWAY team's previous opponents, and AT_OOR tracks the HOME team's previous opponents.

So in my example, for the game between Team A and Team B, HT_OOR and AT_OOR would be as follows:

HT_OOR= (Team_E_TotalWins + Team_F_TotalWins) / (Team_E_TotalGamesPlayed+Team_F_TotalGamesPlayed)

AT_OOR= (Team_C_TotalWins + Team_D_TotalWins) / (Team_C_TotalGamesPlayed+Team_D_TotalGamesPlayed)
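To make the two formulas concrete, here is the arithmetic for the Team A vs Team B game, with each opponent's (wins, games) record read off the four earlier rows of the example. The `record` dictionary and the `pooled_win_pct` helper are hypothetical names used only for this illustration, and returning 0.0 when no games have been played is an assumption.

```python
# (wins, games played) for each team before the Team A vs Team B game,
# taken from the first four rows of the example.
record = {
    "Team C": (1, 1),  # beat Team A
    "Team D": (0, 1),  # lost to Team A
    "Team E": (0, 1),  # lost to Team B
    "Team F": (0, 1),  # lost to Team B
}

def pooled_win_pct(teams):
    """Sum of wins divided by sum of games over the given teams."""
    wins = sum(record[t][0] for t in teams)
    games = sum(record[t][1] for t in teams)
    return wins / games if games else 0.0  # assumption: 0.0 when no games yet

ht_oor = pooled_win_pct(["Team E", "Team F"])  # away team's (Team B) previous opponents
at_oor = pooled_win_pct(["Team C", "Team D"])  # home team's (Team A) previous opponents
print(ht_oor, at_oor)
```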

The important thing with this method is to always use each team's most recent stats before the current game, since the totals are counted up sequentially.

Another method would be to group by HomeWin and AwayWin and by HomeTeam and AwayTeam, then iterate through the whole list, counting each team's appearances and its losses separately.
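A plain loop is one way to sketch the iterate-and-count idea. This is not a definitive solution, only an illustration under a few assumptions: rows are in chronological order, the record is 0.0 when the opponents have no games yet, and a team's record is read before the current game is counted. The dictionaries `wins`, `played`, and `opponents` are hypothetical helpers, not part of the dataset.

```python
from collections import defaultdict

import pandas as pd

# Toy data from the example (base columns only).
games = pd.DataFrame({
    "HomeTeam": ["Team A", "Team D", "Team B", "Team F", "Team A"],
    "AwayTeam": ["Team C", "Team A", "Team E", "Team B", "Team B"],
    "HomeWin":  [0, 0, 1, 0, 1],
    "AwayWin":  [1, 1, 0, 1, 0],
})

wins = defaultdict(int)       # accumulated wins per team
played = defaultdict(int)     # accumulated games per team
opponents = defaultdict(set)  # unique previous opponents per team

def pooled_record(teams):
    """Accumulated wins / accumulated games over a set of teams."""
    total_games = sum(played[t] for t in teams)
    total_wins = sum(wins[t] for t in teams)
    return total_wins / total_games if total_games else 0.0  # assumption

ht_oor, at_oor = [], []
for row in games.itertuples(index=False):
    home, away = row.HomeTeam, row.AwayTeam
    # Read the stats BEFORE this game is counted.
    ht_oor.append(pooled_record(opponents[away]))  # away team's previous opponents
    at_oor.append(pooled_record(opponents[home]))  # home team's previous opponents
    # Now update the running totals with this game's result.
    played[home] += 1
    played[away] += 1
    wins[home] += row.HomeWin
    wins[away] += row.AwayWin
    opponents[home].add(away)
    opponents[away].add(home)

games["HT_OOR"] = ht_oor
games["AT_OOR"] = at_oor
print(games[["HomeTeam", "AwayTeam", "HT_OOR", "AT_OOR"]])
```

For the last row (Team A vs Team B) this gives HT_OOR = 0.0 and AT_OOR = 0.5. Using a set for each team's opponents means a team met twice is only pooled once; whether repeat opponents should count once or per meeting is a choice the question leaves open.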

The result for row 5 would be as follows:

HT_OOR    AT_OOR
  0        0.5

Since each row represents a unique game between two teams, the tracking columns (HT_OOR, AT_OOR) have to be computed with respect to the HomeTeam and AwayTeam in that row.

How do you specify these conditions to make it work? Is it possible? If anything is unclear, please ask.

Dataset

Variables:

"a_" = Away Team, "h_" = Home Team

HTWins = Total previous wins, home team

ATWins = Total previous wins, away team

a_games = Total games played by away team

h_games = Total games played by home team

h_Won = Home team win

Dataset

    h_WinPerc  a_WinPerc  h_teamID  a_teamID  h_team               a_team               HTWins  ATWins  h_Won  a_Won  a_games  h_games  game_id
    0.0        0.0        52        10        Winnipeg Jets        Toronto Maple Leafs  0       0       0      1      1        1        2017020001
    0.0        0.0        5         19        Pittsburgh Penguins  St.Louis Blues       0       0       0      1      1        1        2017020002
    0.0        0.0        22        20        Edmonton Oilers      Calgary Flames       0       0       1      0      1        1        2017020003
    0.0        0.0        28        4         San Jose Sharks      Philadelphia Flyers  0       0       0      1      1        1        2017020004
    0.0        0.0        6         18        Boston Bruins        Nashville Predators  0       0       1      0      1        1        2017020005
    0.0        0.0        7         8         Buffalo Sabres       Montréal Canadiens   0       0       1      0      1        1        2017020006
    0.0        0.0        3         21        New York Rangers     Colorado Avalanche   0       0       0      1      1        1        2017020007
    0.0        0.0        9         15        Ottawa Senators      Washington Capitals  0       0       1      0      1        1        2017020008
    0.0        0.0        17        30        Detroit Red Wings    Minnesota Wild       0       0       1      0      1        1        2017020009
    0.0        0.0        16        5         Chicago Blackhawks   Pittsburgh Penguins  0       0       1      0      1        2        2017020010
    0.0        0.0        24        53        Anaheim Ducks        Arizona Coyotes      0       0       1      0      1        1        2017020011
    0.0        1.0        26        4         Los Angeles Kings    Philadelphia Flyers  0       1       1      0      1        2        2017020012

The expected output is as follows.

HT_OOR (HomeTeam's Opponents' Opponents' Record): accumulated wins / accumulated games played

AT_OOR (AwayTeam's Opponents' Opponents' Record): accumulated wins / accumulated games played
