Monday, 22 February 2021

How to Create New Columns Based on Matching Row Categories?

I'm trying to add new columns to my dataframe based on matching values in rows. A sample of my starting data is:

    ex <- structure(list(
      reg_desc = c("1-Northeast Region", "1-Northeast Region",
                   "1-Northeast Region", "1-Northeast Region",
                   "1-Northeast Region"),
      state = c("04-Connecticut", "05-Maine", "04-Connecticut",
                "05-Maine", NA),
      trigger_city = c("14860-Bridgeport-Stamford-Norwalk",
                       "12620-Bangor", NA, NA, NA),
      Category = c("M", "M", "S", "S", "R"),
      Cred_Fac = c(0, 0, 0.317804971641414, 0, 1),
      Mean = c(50323.3311111111, 48944.4266666667, 44220.8220792079,
               43724.1495, 50492.0654351396)
    ), row.names = c(1L, 7L, 118L, 119L, 136L), class = "data.frame")


Now I'd like to append values from matching rows as new columns whenever the state and region match. The Category column marks each row's level: M for metropolitan, S for state, and R for region. I'd like to append the state and region information to every row that sits at a lower level, so metropolitan rows get both their state's and their region's values, and state rows get their region's values. My final output would look like:

    hi1 <- data.frame(
      reg_desc = c("1-Northeast Region", "1-Northeast Region",
                   "1-Northeast Region", "1-Northeast Region",
                   "1-Northeast Region"),
      state = c("04-Connecticut", "05-Maine", "04-Connecticut",
                "05-Maine", NA),
      trigger_city = c("14860-Bridgeport-Stamford-Norwalk",
                       "12620-Bangor", NA, NA, NA),
      Category = c("M", "M", "S", "S", "R"),
      Cred_Fac = c(0, 0, 0.317804971641414, 0, 1),
      Mean = c(50323.3311111111, 48944.4266666667, 44220.8220792079,
               43724.1495, 50492.0654351396),
      State_Cred_Fac = c(0.317805, 0.000000, NA, NA, NA),
      Mean_State = c(44220.82, 43724.15, NA, NA, NA),
      Reg_Cred_Fac = c(1.000000, 1.000000, 1.000000, 1.000000, NA),
      Mean_Region = c(50492.07, 50492.07, 50492.07, 50492.07, NA)
    )


Is there a way to get a result like this without doing it manually? Thanks in advance.
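For reference, here is a minimal base R sketch of one way the lookup described above might be done. It assumes that each state has exactly one "S" row and each region exactly one "R" row, and that `reg_desc` plus `state` identifies a state uniquely. The helper names `states`, `regions`, `s_idx`, `r_idx`, and `out` are made up for illustration.

```r
# Sample data from the question
ex <- structure(list(
  reg_desc = rep("1-Northeast Region", 5),
  state = c("04-Connecticut", "05-Maine", "04-Connecticut", "05-Maine", NA),
  trigger_city = c("14860-Bridgeport-Stamford-Norwalk", "12620-Bangor",
                   NA, NA, NA),
  Category = c("M", "M", "S", "S", "R"),
  Cred_Fac = c(0, 0, 0.317804971641414, 0, 1),
  Mean = c(50323.3311111111, 48944.4266666667, 44220.8220792079,
           43724.1495, 50492.0654351396)
), row.names = c(1L, 7L, 118L, 119L, 136L), class = "data.frame")

# State-level and region-level rows serve as lookup tables
# (assumption: one "S" row per state, one "R" row per region)
states  <- ex[ex$Category == "S", ]
regions <- ex[ex$Category == "R", ]

# For every row, find the position of its state / region in the lookup tables
s_idx <- match(paste(ex$reg_desc, ex$state),
               paste(states$reg_desc, states$state))
r_idx <- match(ex$reg_desc, regions$reg_desc)

# Only rows below a given level receive that level's values:
# state values go to "M" rows, region values go to "M" and "S" rows
s_idx[ex$Category != "M"] <- NA
r_idx[ex$Category == "R"] <- NA

# Indexing with NA yields NA, which leaves the other rows empty
out <- ex
out$State_Cred_Fac <- states$Cred_Fac[s_idx]
out$Mean_State     <- states$Mean[s_idx]
out$Reg_Cred_Fac   <- regions$Cred_Fac[r_idx]
out$Mean_Region    <- regions$Mean[r_idx]

print(out)
```

The same idea could also be written as two left joins (for example with `merge()`), but `match()` keeps the original row order and row names without extra steps.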
