Thursday, 29 July 2021

Create a new factor level (new row) based on data from other rows with conditional statements

df <- data.frame(PatientID = c("0002", "0002", "0005", "0005", "0009", "0009", "0018", "0018", "0039", "0039", "0043", "0043", "0046", "0046", "0048", "0048"),
                 Timepoint = c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B"),
                 A = c(NA, 977.146, NA, 964.315, NA, 952.311, NA, 950.797, 947.465, 902.852, 985.124, NA, 930.141, 1007.790, 1027.110, 999.414),
                 B = c(998.988, NA, 998.680, NA, 1020.560, 955.540, 911.606, 964.039, 988.087, 902.367, 959.338, 1029.050, 987.374, 1066.400, 957.512, 917.597),
                 C = c(987.140, 961.810, 929.466, 978.166, 969.469, 943.398, 936.034, 965.292, 996.404, 920.610, 967.047, 913.517, 893.428, 921.606, 929.590, 950.493),
                 D = c(961.810, 929.466, 978.166, 1005.820, 925.752, 969.469, 943.398, 965.292, 996.404, 967.047, NA, 893.428, 921.606, 976.192, 929.590, 950.493),
                 E = c(1006.330, 1028.070, 954.274, 1005.910, 949.969, 992.820, 934.407, 948.913, 961.375, 955.296, 961.128, 998.119, 1009.110, 994.891, 1000.170, 982.763),
                 G = c(NA, 958.990, 924.680, 955.927, NA, 949.384, 973.348, 984.392, 943.894, 961.468, 995.368, 994.997, 979.454, 952.605, NA, 956.507),
                 stringsAsFactors = FALSE)

Based on this data frame, I need to create an extra factor level for the variable df$Timepoint, that is, one new row per patient. The variable already has levels A and B, so let's call the new level X. For each measurement column, the X row is filled according to the following conditions:

  • For df$A: if the value at Timepoint B is > 999, X takes the Timepoint B value; otherwise (if it is ≤ 999), X takes the Timepoint A value.

  • For df$B: if the value at Timepoint B is > 986, X takes the Timepoint B value; otherwise X takes the Timepoint A value.

  • For df$C: if the value at Timepoint B is > 1000, X takes the Timepoint B value; otherwise X takes the Timepoint A value.

  • For df$D: if the value at Timepoint B is > 1030, X takes the Timepoint B value; otherwise X takes the Timepoint A value.

  • For df$E: if the value at Timepoint B is > 800, X takes the Timepoint B value; otherwise X takes the Timepoint A value.

  • For df$G: if the value at Timepoint B is > 950, X takes the Timepoint B value; otherwise X takes the Timepoint A value.
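One possible way to express these rules in base R is sketched below; it is not a definitive answer. The names `thresholds`, `add_level_x` and `df_small` are made up for illustration, and `df_small` holds only two patients and two columns copied from the data above to keep the example short. The question does not say what should happen when the Timepoint B value is NA, so the sketch assumes X then falls back to the Timepoint A value.

```r
# Per-column cut-offs taken from the conditions listed above
thresholds <- c(A = 999, B = 986, C = 1000, D = 1030, E = 800, G = 950)

# Hypothetical helper: appends one "X" row per patient
add_level_x <- function(df, thresholds) {
  cols <- intersect(names(thresholds), names(df))

  # Split by timepoint and align the B rows to the A rows by patient
  a <- df[df$Timepoint == "A", ]
  b <- df[df$Timepoint == "B", ]
  b <- b[match(a$PatientID, b$PatientID), ]

  x <- a
  x$Timepoint <- "X"
  for (col in cols) {
    # ASSUMPTION: an NA at Timepoint B counts as "not above the cut-off",
    # so X falls back to the Timepoint A value
    use_b <- !is.na(b[[col]]) & b[[col]] > thresholds[[col]]
    x[[col]] <- ifelse(use_b, b[[col]], a[[col]])
  }

  out <- rbind(df, x)
  out <- out[order(out$PatientID, out$Timepoint), ]
  rownames(out) <- NULL
  out$Timepoint <- factor(out$Timepoint, levels = c("A", "B", "X"))
  out
}

# Small subset of the question's data (patients 0002 and 0046, columns A and B)
df_small <- data.frame(
  PatientID = c("0002", "0002", "0046", "0046"),
  Timepoint = c("A", "B", "A", "B"),
  A = c(NA, 977.146, 930.141, 1007.790),
  B = c(998.988, NA, 987.374, 1066.400),
  stringsAsFactors = FALSE
)

res <- add_level_x(df_small, thresholds)
print(res)
```

Calling `add_level_x(df, thresholds)` on the full data frame should work the same way, since only the columns named in `thresholds` are touched. If an NA at Timepoint B should instead produce NA in the X row, dropping the `!is.na(...)` guard would give that behaviour, because `ifelse` propagates NA from its test.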

The new dataframe would look like this:

[Image: expected output data frame]

Thanks in advance! Best
