I have a problem with a piece of code. To start, here is a dataset:
df <- data.frame(PatientID = c("0002" ,"0002", "0005", "0005" ,"0009" ,"0009" ,"0018", "0018" ,"0039" ,"0039" , "0043" ,"0043", "0046", "0046" ,"0048" ,"0048"),
sex= c("F", "F", "M", "M", "F", "F", "M", "M","F", "F", "M", "M", "M", "M", "F", "F"),
A1 = c( 961.810 , 929.466 , 978.166, 1005.820 , 925.752 , 969.469 ,943.398 , 965.292 , 996.404 , 967.047 , NA , 893.428 , 921.606 , 976.192 , 929.590 , 950.493),
B1 = c(998.988 , NA , 998.680 , NA , 1020.560 , 955.540 , 911.606 , 964.039 , 988.087 , 902.367 , 959.338 ,1029.050 , 987.374 ,1066.400 ,957.512 , 917.597),
C1 = c( 987.140 , 961.810 , 929.466 , 978.166, 969.469 , 943.398 ,936.034, 965.292 , 996.404 , 920.610 , 967.047, 913.517 , 893.428 , 921.606 , 929.590 ,950.493),
D1 = c( 961.810 , 929.466 , 978.166, 1005.820 , 925.752 , 969.469 ,943.398 , 965.292 , 996.404 , 967.047 , NA , 893.428 , 921.606 , 976.192 , 929.590 , 950.493),
E1 = c(1006.330, 1028.070 , 954.274 ,1005.910 ,949.969 , 992.820 ,934.407 , 948.913 , 961.375 ,955.296 , 961.128 ,998.119 ,1009.110 , 994.891 ,1000.170 ,982.763),
G1= c(987.140 , 961.810 , 929.466 , 978.166, 969.469 , 943.398 ,936.034, 965.292 , 996.404 , 920.610 , 967.047, 913.517 , 893.428 , 921.606 , 929.590 ,950.493),
A2 = c(NA , 977.146 , NA , 964.315 ,NA , 952.311 , NA , NA , 947.465 , 902.852 , NA ,NA , 930.141 ,1007.790 , NA , 999.414),
B2 = c(998.988 , NA , 998.680 , NA , NA , 955.540 , NA , 964.039 , 988.087 , 902.367 , NA ,1029.050 , NA ,1066.400 ,NA , 917.597),
C2 = c( NA , NA , NA , NA, 969.469 , NA ,936.034, 965.292 , NA , 920.610 , 967.047, NA , 893.428 , 921.606 , 929.590 ,950.493),
D2 = c( 961.810 , NA , 978.166, NA , 925.752 , NA ,943.398 , 965.292 , NA , 967.047 , NA , 893.428 , 921.606 , 976.192 , NA , 950.493),
E2 = c(1006.330, 1028.070 , NA ,1005.910 ,949.969 , 992.820 ,934.407 , 948.913 , 961.375 ,955.296 , NA ,998.119 ,NA , 994.891 ,1000.170 ,982.763),
G2= c(NA , 958.990 , 924.680 , 955.927 , NA , NA ,973.348 , 984.392 , NA , NA , 995.368 , 994.997 , 979.454 , 952.605 ,NA , 956.507), stringsAsFactors = FALSE)
To contextualize the problem: I have 2 sets of metrics, one defined at visit 1 (A1, B1, C1, D1, E1, G1) and the same metrics repeated at visit 2 (A2, B2, C2, D2, E2, G2). To diagnose someone at visit 1, I use the following code:
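library(dplyr)
cols <- 3:8  # visit 1 metrics: A1, B1, C1, D1, E1, G1
df$sex <- as.factor(df$sex)
# 'Yes' when at least 3 metrics exceed the sex-specific threshold (F: 1004, M: 986)
df %>% mutate(Diagnosis = ifelse(sex == "F" & (rowSums(df[cols] > 1004, na.rm = TRUE) >= 3), 'Yes',
ifelse(sex == "M" & (rowSums(df[cols] > 986, na.rm = TRUE) >= 3), 'Yes', 'No'))) -> df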
This piece of code does exactly what I want. As you can see, I have one threshold for females (1004) and one for males (986). According to this rule, a patient with 3 or more metrics above the threshold gets a 'Yes' in the diagnosis.
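The same rule can also be written so that each row is compared against its own sex-specific threshold, which avoids the nested ifelse. This is only a sketch in base R: the helper name `diagnose_visit` and its arguments `thresholds` and `min_n` are hypothetical and not part of the original code.

```r
# Hypothetical helper (not from the original post): returns "Yes" for rows with
# at least `min_n` metrics above the threshold that applies to the row's sex.
diagnose_visit <- function(data, cols,
                           thresholds = c(F = 1004, M = 986),
                           min_n = 3) {
  # Look up one threshold per row; the names of `thresholds` are the sex codes
  thr <- unname(thresholds[as.character(data$sex)])
  # The matrix is compared column by column, so every metric in a row is
  # checked against that row's threshold; missing values stay NA
  elevated <- as.matrix(data[cols]) > thr
  # NA comparisons are ignored when counting, as in the original code
  unname(ifelse(rowSums(elevated, na.rm = TRUE) >= min_n, "Yes", "No"))
}
```

Called as `df$Diagnosis <- diagnose_visit(df, 3:8)`, it should give the same labels as the nested version for rows whose sex is F or M.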
Now, the problem comes with visit 2. In this case, the diagnosis has 4 options: the patient can be diagnosed with "ongoing", "resolved", "new onset" or "never" disease.
In theory, the solution should be as easy as applying this piece of code:
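cols <- 9:14  # visit 2 metrics: A2, B2, C2, D2, E2, G2
df$sex <- as.factor(df$sex)
# stored as Diagnosis2 so that the visit 1 Diagnosis column is not overwritten
df %>% mutate(Diagnosis2 = ifelse(sex == "F" & (rowSums(df[cols] > 1004, na.rm = TRUE) >= 3), 'Yes',
ifelse(sex == "M" & (rowSums(df[cols] > 986, na.rm = TRUE) >= 3), 'Yes', 'No'))) -> df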
and then a really simple ifelse, where:
- if yes at visit 1 and yes at visit 2 = ongoing
- if yes at visit 1 and no at visit 2 = resolved
- if no at visit 1 and yes at visit 2 = new onset
- if no at visit 1 and no at visit 2 = never
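The four combinations above could be sketched as a small base R function. The name `combine_visits` is hypothetical, and the sketch assumes both inputs are character vectors containing only "Yes" and "No".

```r
# Hypothetical helper: maps the visit 1 and visit 2 Yes/No labels
# to the four outcomes listed above.
combine_visits <- function(v1, v2) {
  ifelse(v1 == "Yes" & v2 == "Yes", "ongoing",
  ifelse(v1 == "Yes" & v2 == "No",  "resolved",
  ifelse(v1 == "No"  & v2 == "Yes", "new onset",
                                    "never")))
}

combine_visits(c("Yes", "Yes", "No", "No"),
               c("Yes", "No", "Yes", "No"))
```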
However, the situation is a bit more complex: there is an additional option called "NPA" (not possible to assess), because there are two exceptions. To make a reliable judgement, we need to see what happened with the metrics that were elevated. I created simplified examples to illustrate each of the exceptions:
A) For example, this patient has C1, D1 and E1 elevated at visit 1. However, C2 is NA, so this patient would be an NPA at visit 2.
df <- data.frame(PatientID = c("112"),
sex= c("F"),
A1 = c( 961.810),
B1 = c(998.988),
C1 = c( 1019.330),
D1 = c( 1046.0),
E1 = c(1006.330),
G1= c(987.140 ),
A2 = c(NA ),
B2 = c(998.988),
C2 = c( NA ),
D2 = c( 961.810),
E2 = c(1006.330),
G2= c(NA), stringsAsFactors = FALSE)
B) In this case, C1, D1 and E1 are elevated at visit 1 and C2 is NA, but A2 is elevated. So, despite C2 being missing, this patient presents with a clear "yes" at visit 2, which together with the "yes" at visit 1 makes this an "ongoing" case.
df <- data.frame(PatientID = c("112"),
sex= c("F"),
A1 = c( 961.810),
B1 = c(998.988),
C1 = c( 1019.330),
D1 = c( 1046.0),
E1 = c(1006.330),
G1= c(987.140 ),
A2 = c(1800.810),
B2 = c(998.988),
C2 = c( NA ),
D2 = c( 961.810),
E2 = c(1006.330),
G2= c(NA), stringsAsFactors = FALSE)
How could I code this? Sorry, I know this is a bit rambling! Thanks!
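One possible reading of the two exceptions can be sketched in base R as follows. It rests on assumptions that the post does not confirm: visit 2 is a clear "Yes" when at least 3 observed metrics are above the threshold; otherwise it is "NPA" if any metric that was elevated at visit 1 is NA at visit 2; otherwise it is "No". The function name `assess_visit2` and its arguments are hypothetical, and `cols1` and `cols2` are assumed to list the metrics in the same order.

```r
# Hypothetical helper implementing one reading of exceptions A and B.
assess_visit2 <- function(data, cols1, cols2,
                          thresholds = c(F = 1004, M = 986),
                          min_n = 3) {
  # One threshold per row, looked up from the row's sex
  thr <- unname(thresholds[as.character(data$sex)])
  up1 <- as.matrix(data[cols1]) > thr  # elevated at visit 1 (NA if missing)
  up2 <- as.matrix(data[cols2]) > thr  # elevated at visit 2 (NA if missing)

  v1 <- ifelse(rowSums(up1, na.rm = TRUE) >= min_n, "Yes", "No")

  # Exception B (assumed): enough observed elevated metrics is a clear "Yes"
  clear_yes2 <- rowSums(up2, na.rm = TRUE) >= min_n
  # Exception A (assumed): a metric elevated at visit 1 is missing at visit 2
  lost <- rowSums(!is.na(up1) & up1 & is.na(up2)) > 0
  v2 <- ifelse(clear_yes2, "Yes", ifelse(lost, "NPA", "No"))

  unname(ifelse(v2 == "NPA", "NPA",
         ifelse(v1 == "Yes" & v2 == "Yes", "ongoing",
         ifelse(v1 == "Yes", "resolved",
         ifelse(v2 == "Yes", "new onset", "never")))))
}
```

With the first dataset this could be called as `df$Status <- assess_visit2(df, cols1 = 3:8, cols2 = 9:14)`. Note that in example B as written only A2 and E2 exceed 1004, so with `min_n = 3` this sketch would not treat visit 2 as a clear "yes"; the cut-off for that case may need to be defined differently.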