Monday, 4 December 2017

compare values in R with a function

This is a follow-up to a previous post: compare values in R when within a string of letters

I need to check whether the value of CURRENT_ID equals N or N + 1, where N is the number that follows DISPLAY_BOUNDARY in whichever of CURRENT_TEXT_1 or CURRENT_TEXT_2 contains DISPLAY_BOUNDARY. If it does, the OUTPUT column should be 1; otherwise it should be 0.

Here are some example lines of my datafile (df) and the output I would like to obtain:

 PARTICIPANT     ITEM   CONDITION      CURRENT_TEXT_1               CURRENT_TEXT_2                 CURRENT_ID            OUTPUT
 ppt01          1         1            DISPLAY_BOUNDARY 1 the       iaRegion 4 rd 0 x width 333    7                     0
 ppt01          3         1            iaRegion 2 rd 0 x width 1    DISPLAY_BOUNDARY 9 a           11                    0
 ppt01          4         2            DISPLAY_BOUNDARY 2 aware     iaRegion 6 rd 0 x width 768    3                     1
 ppt01          6         3            DISPLAY_BOUNDARY 3 door      iaRegion 8 rd 0 x width 534    4                     1
 ppt01          9         4            DISPLAY_BOUNDARY 6 in        iaRegion 9 rd 0 x width 924    5                     0
 ppt01          48        5            DISPLAY_BOUNDARY 6 the       iaRegion 10 rd 0 x width 712   8                     0
 ppt02          3         4            iaRegion 14 rd 0 x width 756 DISPLAY_BOUNDARY 15 put        17                    0
 ppt02          7         5            iaRegion 1 rd 0 x width 334  DISPLAY_BOUNDARY 1 where       3                     0
 ppt02          8         6            DISPLAY_BOUNDARY 3 At        iaRegion 2 rd 0 x width 215    5                     0
 ppt02          35        2            iaRegion 3 rd 0 x width 524  DISPLAY_BOUNDARY 1 outside     2                     1
 ppt03          10        1            iaRegion 11 rd 0 x width 190 DISPLAY_BOUNDARY 2 school      4                     0
 ppt03          56        1            DISPLAY_BOUNDARY 8 blue      iaRegion 11 red 0 x width 383  9                     1

Is it correct to do:

ct3 <- as.numeric(gsub("DISPLAY_BOUNDARY ([0-9]+).*","\\1",df$CURRENT_TEXT_1))  
ct4 <- as.numeric(gsub("DISPLAY_BOUNDARY ([0-9]+).*","\\1",df$CURRENT_TEXT_2))

df$OUTPUT <- as.numeric(mapply(function(id,x1,x2,x3,x4) id %in% c(x1+1,x2+1,x1,x2), as.numeric(df$CURRENT_ID),ct3,ct4,ct3,ct4))

I am not sure whether function(id,x1,x2,x3,x4) needs x3 and x4, or whether ct3 and ct4 have to be passed twice, as in ct3,ct4,ct3,ct4.
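For reference, here is a minimal sketch of one way to do this, not necessarily the only or best one. mapply pairs each vector after the function with one function argument by position, so a function that only uses id, x1 and x2 needs ct3 and ct4 passed once each. The sketch also avoids the coercion warning that as.numeric gives when gsub finds no match and returns the original string. The helper name boundary_number and the four-row data frame are made up for illustration (the rows are copied from the example above), and the pattern assumes DISPLAY_BOUNDARY is at the start of the cell, as in the example rows.

```r
# Small reproducible subset of the example rows (not the full data file).
df <- data.frame(
  CURRENT_TEXT_1 = c("DISPLAY_BOUNDARY 1 the", "iaRegion 2 rd 0 x width 1",
                     "DISPLAY_BOUNDARY 2 aware", "iaRegion 3 rd 0 x width 524"),
  CURRENT_TEXT_2 = c("iaRegion 4 rd 0 x width 333", "DISPLAY_BOUNDARY 9 a",
                     "iaRegion 6 rd 0 x width 768", "DISPLAY_BOUNDARY 1 outside"),
  CURRENT_ID     = c(7, 11, 3, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the number after DISPLAY_BOUNDARY, or NA when the
# cell does not start with DISPLAY_BOUNDARY (assumption: it is always first).
boundary_number <- function(x) {
  x <- as.character(x)
  has_boundary <- grepl("^DISPLAY_BOUNDARY [0-9]+", x)
  out <- rep(NA_real_, length(x))
  out[has_boundary] <- as.numeric(
    sub("^DISPLAY_BOUNDARY ([0-9]+).*", "\\1", x[has_boundary])
  )
  out
}

ct3 <- boundary_number(df$CURRENT_TEXT_1)
ct4 <- boundary_number(df$CURRENT_TEXT_2)
id  <- as.numeric(df$CURRENT_ID)

# Vectorised version: the difference is 0 or 1 when the ID equals the
# boundary number or the boundary number plus 1. %in% returns FALSE for NA.
df$OUTPUT <- as.numeric((id - ct3) %in% c(0, 1) | (id - ct4) %in% c(0, 1))

# Equivalent mapply version with three arguments: id, ct3 and ct4 once each.
out_mapply <- as.numeric(mapply(
  function(id, x1, x2) id %in% c(x1, x1 + 1, x2, x2 + 1),
  id, ct3, ct4
))
```

On these four rows both versions give 0, 0, 1, 1, which matches the OUTPUT column in the example above.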
