Tuesday, October 31, 2017

How can I get the count of visits per participant in R, taking into account the hour of each appointment?

I have a dataset like the one below, but with many more participants. Just copy and paste the following code into R:

d <- structure(list(id = c(33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L), VisitDate = c("10/12/14", "10/12/14", "10/13/14", "10/14/14", "11/7/14", "11/7/14", "11/8/12", "11/8/14", "11/9/12", "4/17/13", "5/29/15", "10/26/12", "10/29/12", "11/7/13", "2/15/17", "2/9/15", "3/6/17", "3/7/13", "4/8/16", "4/8/16", "7/28/14", "9/14/12", "9/18/15", "9/18/15"), VisitHours = c(13L, 15L, 10L, 11L, 10L, 9L, 13L, 11L, 11L, 22L, 9L, 16L, 14L, 10L, 11L, 10L, 9L, 14L, 13L, 14L, 13L, 10L, 10L, 14L)), .Names = c("id", "VisitDate", "VisitHours"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -24L), spec = structure(list(cols = structure(list(id = structure(list(), class = c("collector_integer", "collector")), VisitDate = structure(list(), class = c("collector_character", "collector")), VisitHours = structure(list(), class = c("collector_integer", "collector"))), .Names = c("id", "VisitDate", "VisitHours")), default = structure(list(), class = c("collector_guess", "collector"))), .Names = c("cols", "default"), class = "col_spec"))

Here, there are two participants, with ids 33 and 45, who had appointments or visits on different dates and at different times. Basically, each VisitDate should count as one visit, unless the difference between VisitHours on the same day is 3 or greater, in which case the visits count separately. If the difference is less than 3, I want them to count as just one visit.
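To make the rule concrete, here is a minimal sketch for a single participant on a single day. The helper name `count_day_visits` and its `min_gap` argument are made up for illustration, and it assumes that each visit is compared with the previous visit of that day after sorting by hour (the question does not say how three or more visits on one day should be chained).

```r
# Hypothetical helper: number of visits for one participant on one day.
# Assumption: hours are sorted and each visit is compared with the one before it.
count_day_visits <- function(hours, min_gap = 3) {
  hours <- sort(hours)
  # The first visit always counts; every gap of min_gap or more adds one.
  1L + sum(diff(hours) >= min_gap)
}

count_day_visits(c(13L, 15L))  # participant 33 on 10/12/14: gap of 2, so 1 visit
count_day_visits(c(10L, 14L))  # participant 45 on 9/18/15: gap of 4, so 2 visits
```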

First, I would like to create a variable that counts the number of visits. For example, participant 33 had two visits on 10/12/14, but they were very close to each other (one at 13 and the other at 15; see column VisitHours), so they should count as only one visit. On the other hand, participant 45 had two visits on 9/18/15 that were far apart (one at 10 and the other at 14; see column VisitHours), so they should count as two visits even though they were on the same day, because the difference between VisitHours is 3 or greater. The data frame below shows what this should look like:

structure(list(id = c(33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L), VisitDate = c("10/12/14", "10/12/14", "10/13/14", "10/14/14", "11/7/14", "11/7/14", "11/8/12", "11/8/14", "11/9/12", "4/17/13", "5/29/15", "10/26/12", "10/29/12", "11/7/13", "2/15/17", "2/9/15", "3/6/17", "3/7/13", "4/8/16", "4/8/16", "7/28/14", "9/14/12", "9/18/15", "9/18/15"), VisitHours = c(13L, 15L, 10L, 11L, 10L, 9L, 13L, 11L, 11L, 22L, 9L, 16L, 14L, 10L, 11L, 10L, 9L, 14L, 13L, 14L, 13L, 10L, 10L, 14L), CountVisits = c(1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L)), .Names = c("id", "VisitDate", "VisitHours", "CountVisits"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -24L), spec = structure(list(cols = structure(list(id = structure(list(), class = c("collector_integer", "collector")), VisitDate = structure(list(), class = c("collector_character", "collector")), VisitHours = structure(list(), class = c("collector_integer", "collector")), CountVisits = structure(list(), class = c("collector_integer", "collector"))), .Names = c("id", "VisitDate", "VisitHours", "CountVisits")), default = structure(list(), class = c("collector_guess", "collector"))), .Names = c("cols", "default"), class = "col_spec"))
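One possible way to build such a flag column in base R is sketched below on a few rows taken from the sample. The names `d_small` and `flag_new_visit` are hypothetical, and the sketch assumes that visits are compared in hour order within each id and date. Under that assumption the 1 lands on the earliest hour of each cluster, so for 11/7/14 it sits on the 9 o'clock row rather than on the first listed row; the per-participant totals are not affected.

```r
# A few rows copied from the sample data (d_small is a made-up name).
d_small <- data.frame(
  id         = c(33L, 33L, 33L, 33L, 33L, 45L, 45L, 45L, 45L),
  VisitDate  = c("10/12/14", "10/12/14", "11/7/14", "11/7/14", "5/29/15",
                 "4/8/16", "4/8/16", "9/18/15", "9/18/15"),
  VisitHours = c(13L, 15L, 10L, 9L, 9L, 13L, 14L, 10L, 14L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 1 for a row that starts a new visit, 0 otherwise.
flag_new_visit <- function(hours, min_gap = 3) {
  ord <- order(hours)                  # compare visits in hour order
  flags <- integer(length(hours))
  flags[ord] <- c(1L, as.integer(diff(hours[ord]) >= min_gap))
  flags                                # returned in the original row order
}

# Apply the helper within each id/date combination.
d_small$CountVisits <- ave(d_small$VisitHours,
                           paste(d_small$id, d_small$VisitDate),
                           FUN = flag_new_visit)
d_small
```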

In the end, I want just one row per participant, holding the sum of the visits calculated in the previous data frame:

d1 <- structure(list(id = c(33L, 45L), CountAllVisits = c(8L, 12L)), .Names = c("id",  "CountAllVisits"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,  -2L), spec = structure(list(cols = structure(list(id = structure(list(), class = c("collector_integer", "collector")), CountAllVisits = structure(list(), class = c("collector_integer", "collector"))), .Names = c("id", "CountAllVisits")), default = structure(list(), class = c("collector_guess", "collector"))), .Names = c("cols", "default"), class = "col_spec"))
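A minimal base R sketch of the final summary, again on a handful of rows from the sample. `d_small`, `per_day` and `totals` are made-up names, and the 3-hour threshold is applied to consecutive visits in hour order, which is an assumption about how the rule should be read.

```r
# A few rows copied from the sample data (d_small is a made-up name).
d_small <- data.frame(
  id         = c(33L, 33L, 33L, 33L, 33L, 33L, 45L, 45L, 45L, 45L),
  VisitDate  = c("10/12/14", "10/12/14", "11/7/14", "11/7/14", "11/8/12",
                 "11/8/14", "4/8/16", "4/8/16", "9/18/15", "9/18/15"),
  VisitHours = c(13L, 15L, 10L, 9L, 13L, 11L, 13L, 14L, 10L, 14L),
  stringsAsFactors = FALSE
)

# Step 1: visits per participant and day (1, plus 1 for every gap of 3+ hours).
per_day <- aggregate(VisitHours ~ id + VisitDate, data = d_small,
                     FUN = function(h) 1L + sum(diff(sort(h)) >= 3))
names(per_day)[3] <- "Visits"

# Step 2: one row per participant with the total number of visits.
totals <- aggregate(Visits ~ id, data = per_day, FUN = sum)
names(totals)[2] <- "CountAllVisits"
totals
```

Note that 11/8/12 and 11/8/14 are different dates, so this sketch counts them as two visits. Applied literally to the full sample, the rule would therefore give 9 visits for id 33 (and 12 for id 45), whereas the example output above marks the 11/8/14 row as 0 and lists 8.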

Thank you!
