vendredi 20 août 2021

Change column values based on the max value of another column

I've been trying to solve this issue for a long time now and I don't know what to do. Hopefully someone could help me out.

I have large dataframe with multiple groups such as the example below. I've had to edit column names so the data and what I'm trying might not make much sense. In the end I want to apply this to every "team" in the large sample. "Teams" either have one, none or multiple captains. What I'm trying to do is:

  • Calculate the distance from each player's country to the captain's country as well as the captain's distance to the opposing captain.
  • If there are multiple or no captains, the captain is determined based on the assistant dummy.
    • If there is no assistant dummy, captain is the player with a captain.dummy "Yes" and the closest distance to the opposing captain.
    • If there are multiple such cases, captain is the one from these with the highest salary proportion. In that case there may be multiple captains.

Note that the "dummy" is a string "Yes"/"No".

# A tibble: 12 x 11
# Groups:   team [1]
     team captain.dummy player.country player.country.x player.country.y assistant.dummy salary.proportion opposing.player.x opposing.player.y opposing.player.country opposing.captain   
    <int> <fct>         <fct>                     <dbl>            <dbl> <fct>                       <dbl>             <dbl>             <dbl> <fct>                   <fct>                  
 1 445841 Yes           Poland                    0.645           -0.408 No                          16.4              0.645            -0.408 Poland                  Player name
 2 445841 Yes           Poland                    0.645           -0.408 No                           4.1              0.645            -0.408 Poland                  Player name
 3 445841 No            Poland                    0.645           -0.408 No                           5.89             0.645            -0.408 Poland                  Player name
 4 445841 Yes           France                    1.91             0.542 No                           8.55             0.645            -0.408 Poland                  Player name
 5 445841 Yes           Poland                    0.645           -0.408 No                           5.11             0.645            -0.408 Poland                  Player name
 6 445841 Yes           Spain                     1.44             0.648 No                           6.15             0.645            -0.408 Poland                  Player name
 7 445841 Yes           Poland                    0.645           -0.408 No                           4.1              0.645            -0.408 Poland                  Player name
 8 445841 Yes           United Kingdom            2.35             0.488 No                          12.5              0.645            -0.408 Poland                  Player name
 9 445841 Yes           Poland                    0.645           -0.408 No                          16.4              0.645            -0.408 Poland                  Player name
10 445841 Yes           China                    -0.130            0.601 No                           6.51             0.645            -0.408 Poland                  Player name
11 445841 Yes           Poland                    0.645           -0.408 No                          10.2              0.645            -0.408 Poland                  Player name
12 445841 Yes           Poland                    0.645           -0.408 No                           4                0.645            -0.408 Poland                  Player name

I have tried to make a long sequence of mutate() -> ifelse() -> ifelse() (kind of like the player.distance variable) that would account for every scenario, but it all I can get is the following (correct) vector and an outcome that takes the first value and applies it to every captain. This results in every player having captain.dummy "Yes."

[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE

This is (one of the many alterations of) what I've tried so far. I have longer sequences but this message is already getting too long so I omitted them.

df <- subset(df, team == 445841) 
df <- df %>% select(team, captain.dummy, player.country, player.country.x, player.country.y, assistant.dummy, salary.proportion, opposing.player.x, opposing.player.y, opposing.player.country, opposing.player)

df <- df %>%
   group_by(team) %>%
   mutate(
     player.distance = ifelse(captain.dummy != "1",
                              0,
                              ifelse(player.country.x %% opposing.player.x == 0 &
                                     player.country.y %% opposing.player.y == 0,
                                           0,
                                           sqrt((opposing.player.x - player.country.x)^2 + 
                                                 (opposing.player.y - player.country.x)^2)
                              )
                       )
  ) %>%
mutate(
   captain.dummy = ifelse(captain.dummy == "Yes" & player.distance == min(player.distance[captain.dummy == "Yes"]),
                                          "Yes",
                                          "No")
  )  %>%
mutate(
   if(sum(captain.dummy == "Yes") > 1 & sum(assistant.dummy == "Yes") == 1) {
      captain.dummy = ifelse(captain.dummy == "Yes" & assistant.dummy == "Yes",
                                          "Yes",
                                          "No")
   }
) %>%
mutate(
   if(sum(captain.dummy == "Yes") > 1 & sum(assistant.dummy == "Yes") == 0) {
              captain.dummy = ifelse(captain.dummy == "Yes" & salary.proportion == salary.proportion[salary.proportion == max(salary.proportion),
                                          "Yes",
                                          "No")
              
   }
)

So what I'm trying to get in the case of this "team" is the following. Then eventually I want to calculate the distance between the captain and opposing captain.

     team captain.dummy player.country player.country.x player.country.y assistant.dummy salary.proportion opposing.player.x opposing.player.y opposing.player.country opposing.captain   
    <int> <fct>         <fct>                     <dbl>            <dbl> <fct>                       <dbl>             <dbl>             <dbl> <fct>                   <fct>                  
 1 445841 Yes           Poland                    0.645           -0.408 No                          16.4              0.645            -0.408 Poland                  Player name
 2 445841 No            Poland                    0.645           -0.408 No                           4.1              0.645            -0.408 Poland                  Player name
 3 445841 No            Poland                    0.645           -0.408 No                           5.89             0.645            -0.408 Poland                  Player name
 4 445841 No            France                    1.91             0.542 No                           8.55             0.645            -0.408 Poland                  Player name
 5 445841 No            Poland                    0.645           -0.408 No                           5.11             0.645            -0.408 Poland                  Player name
 6 445841 No            Spain                     1.44             0.648 No                           6.15             0.645            -0.408 Poland                  Player name
 7 445841 No            Poland                    0.645           -0.408 No                           4.1              0.645            -0.408 Poland                  Player name
 8 445841 No            United Kingdom            2.35             0.488 No                          12.5              0.645            -0.408 Poland                  Player name
 9 445841 Yes           Poland                    0.645           -0.408 No                          16.4              0.645            -0.408 Poland                  Player name
10 445841 No            China                    -0.130            0.601 No                           6.51             0.645            -0.408 Poland                  Player name
11 445841 No            Poland                    0.645           -0.408 No                          10.2              0.645            -0.408 Poland                  Player name
12 445841 No            Poland                    0.645           -0.408 No                           4                0.645            -0.408 Poland                  Player name

Aucun commentaire:

Enregistrer un commentaire