I have a data frame called hispn and a vector of quintiles called qq_hispn. hispn has two columns of interest: "FAMINC17", which is family income, and "Stimulants".
I'm trying to create a new column called "Stim_Income" that takes a different value for each of the 5 income ranges, depending on whether the person is on a stimulant. If they are in the 0-20% income range and are on a stimulant, the value is 1. If they are in that range and not on a stimulant, the value is 6. The values should be 2 and 7 for 20-40%, 3 and 8 for 40-60%, etc. This will allow me to compute a prescription prevalence (1/6, 2/7, etc.) for each quintile.
I came up with this very amateur method. Can anyone tell me why it is not working?
for (i in 1:5) {
  for (j in nrow(hispn)) {
    if ( (hispn[j,"FAMINC17"]>qq_hispn[i])&&(hispn[j,"FAMINC17"]<=qq_hispn[i+1])&&(hispn[j,"Stimulants"]==1) ) {
      hispn[j,"Stim_Income"]<-i
    } else if ( (hispn[j,"FAMINC17"]>qq_hispn[i])&&(hispn[j,"FAMINC17"]<=qq_hispn[i+1])&&(hispn[j,"Stimulants"]==0) ) {
      hispn[j,"Stim_Income"]<-(i+5)
    }
  }
}
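One likely culprit in the loop above: for (j in nrow(hispn)) iterates over a single value (the number of rows), so only the last row is ever visited and every other row stays NA. The strict > on the lower bound can also skip the row holding the minimum income if qq_hispn[1] is that minimum. Here is a minimal sketch of the same loop with both points addressed. The toy data frame is made up for illustration, and it assumes qq_hispn holds six break points (0%, 20%, ..., 100%), which the question does not state explicitly.

```r
# Hypothetical toy data standing in for hispn; the real data will differ.
hispn <- data.frame(
  FAMINC17   = c(500, 1500, 2500, 3500, 4500, 5500, 6500, 7500, 8500, 9500),
  Stimulants = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
)

# Assumption: qq_hispn holds the six quintile break points (0%, 20%, ..., 100%).
qq_hispn <- quantile(hispn$FAMINC17, probs = seq(0, 1, by = 0.2))

hispn$Stim_Income <- NA_real_
for (i in 1:5) {
  for (j in seq_len(nrow(hispn))) {  # visit every row, not just row nrow(hispn)
    inc <- hispn[j, "FAMINC17"]
    # Let the first band include its lower bound so the minimum is not skipped.
    above <- if (i == 1) inc >= qq_hispn[i] else inc > qq_hispn[i]
    if (above && inc <= qq_hispn[i + 1]) {
      # Codes 1..5 for rows on a stimulant, 6..10 for rows that are not.
      hispn[j, "Stim_Income"] <- if (hispn[j, "Stimulants"] == 1) i else i + 5
    }
  }
}
```

Rows with a missing income or stimulant value would still make the if condition fail with an error, so they would need to be filtered or handled separately.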
I tried to implement the code that Michelle linked in the comments, but it returned an error.
hispn %>%
  mutate(Stim_Income = case_when (
    FAMINC17>qq_hispn[1] & FAMINC17<=qq_hispn[2] & Stimulants==1 ~ 1
    FAMINC17>qq_hispn[1] & FAMINC17<=qq_hispn[2] & Stimulants==0 ~ 6
    FAMINC17>qq_hispn[2] & FAMINC17<=qq_hispn[3] & Stimulants==1 ~ 2
    FAMINC17>qq_hispn[2] & FAMINC17<=qq_hispn[3] & Stimulants==0 ~ 7
    FAMINC17>qq_hispn[3] & FAMINC17<=qq_hispn[4] & Stimulants==1 ~ 3
    FAMINC17>qq_hispn[3] & FAMINC17<=qq_hispn[4] & Stimulants==0 ~ 8
    FAMINC17>qq_hispn[4] & FAMINC17<=qq_hispn[5] & Stimulants==1 ~ 4
    FAMINC17>qq_hispn[4] & FAMINC17<=qq_hispn[5] & Stimulants==0 ~ 9
    FAMINC17>qq_hispn[5] & FAMINC17<=qq_hispn[6] & Stimulants==1 ~ 5
    FAMINC17>qq_hispn[5] & FAMINC17<=qq_hispn[6] & Stimulants==0 ~ 10
  )
)
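The error from the case_when attempt is most likely a parse error: each condition ~ value pair is a separate argument, so the pairs need commas between them. The pipeline's result also has to be assigned back to hispn, otherwise the new column is only printed. Below is a shorter sketch of the same idea that uses cut() to get the quintile index instead of spelling out ten conditions. It assumes dplyr is installed, that qq_hispn holds six distinct break points, and it uses a made-up toy data frame. The intermediate column name quintile is hypothetical.

```r
library(dplyr)

# Hypothetical toy data standing in for hispn; the real data will differ.
hispn <- data.frame(
  FAMINC17   = c(500, 1500, 2500, 3500, 4500, 5500, 6500, 7500, 8500, 9500),
  Stimulants = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
)

# Assumption: qq_hispn holds the six quintile break points (0%, 20%, ..., 100%).
qq_hispn <- quantile(hispn$FAMINC17, probs = seq(0, 1, by = 0.2))

hispn <- hispn %>%
  mutate(
    # Quintile index 1..5; include.lowest keeps the minimum income in band 1.
    quintile = cut(FAMINC17, breaks = qq_hispn,
                   include.lowest = TRUE, labels = FALSE),
    # Note the comma separating the two case_when conditions.
    Stim_Income = case_when(
      Stimulants == 1 ~ quintile,
      Stimulants == 0 ~ quintile + 5L
    )
  )
```

cut() stops with an error when the break points are not unique, which can happen when many rows share the same income, so this sketch would need adjusting for data with heavy ties.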
Another user asked for reproducible data and an example output.
m1<- matrix(0,ncol=2,nrow=5)
m1[1,1]=1000
m1[2,1]=1000
m1[3,1]=1000
m1[4,1]=1000
m1[5,1]=10000
m1[3,2]=1
      [,1] [,2]
[1,]  1000    0
[2,]  1000    0
[3,]  1000    1
[4,]  1000    0
[5,] 10000    0
And here is the new column with the information of interest, as it would look if the for loop had worked. Instead, I got a column of NA.
      [,1] [,2] [,3]
[1,]  1000    0    6
[2,]  1000    0    6
[3,]  1000    1    5
[4,]  1000    0    6
[5,] 10000    0    7
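For the per-quintile ratio described at the top (the count of code 1 over the count of code 6, code 2 over code 7, and so on), one possible sketch in base R is below. The vector of codes is made up for illustration, and prevalence here is taken to mean the share of each quintile on a stimulant, which is an interpretation rather than something the question defines.

```r
# Hypothetical Stim_Income codes, as the coding scheme would produce them.
stim_income <- c(1, 6, 6, 2, 7, 3, 8, 8, 4, 9, 5, 10, 10, 6)

# Fix the levels so codes with zero rows still get a count of 0.
counts <- table(factor(stim_income, levels = 1:10))

on_stim  <- as.numeric(counts[1:5])   # codes 1..5: on a stimulant
off_stim <- as.numeric(counts[6:10])  # codes 6..10: not on a stimulant

ratio      <- on_stim / off_stim               # the 1/6, 2/7, ... ratio
prevalence <- on_stim / (on_stim + off_stim)   # share of each quintile on a stimulant
names(ratio) <- names(prevalence) <- paste0("Q", 1:5)
```

The ratio is an odds-style figure; if the goal is the proportion of people on a stimulant within each quintile, prevalence is probably the quantity to report.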