Tuesday, 25 June 2019

How to use a for loop to create a new variable based on differences between POSIXct timestamps in log files

I am trying to loop through a log file dataset to add a variable that stores a server session number for every observation. For the first line, I want to create a new variable 'session_number' with value 1. For each following line, I want a new session number if the 'ResearchNumber' differs from the line before. If the 'ResearchNumber' is the same, I want to check whether the difference in the POSIXct variable 'DateTimeTag' is larger than 1800 seconds (30 minutes). If it is, I want a new session number (the previous one increased by 1). In all other cases, the session number should be the same as on the previous line. To summarize: I want to create session numbers per participant, starting a new session after more than 30 minutes of inactivity.

I have tried several things, but my code below doesn't seem to loop over all the lines, and in the other solutions I tried the time difference isn't calculated correctly.
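One possible reason for the time difference looking wrong: subtracting two POSIXct values returns a difftime whose unit is chosen automatically, so the number is not always in seconds. A minimal sketch, using two made-up timestamps (`t1` and `t2` are illustrative, not from my data):

```r
# Two illustrative timestamps, 45 minutes apart
t1 <- as.POSIXct("2014-09-29 10:00:00", tz = "GMT")
t2 <- as.POSIXct("2014-09-29 10:45:00", tz = "GMT")

# Plain subtraction picks the unit automatically (minutes here),
# so the stored number is 45 rather than 2700
auto_diff <- t2 - t1
units(auto_diff)

# Requesting seconds explicitly makes a comparison against 1800 unambiguous
gap_secs <- as.numeric(difftime(t2, t1, units = "secs"))
gap_secs > 1800
```

With an explicit `units = "secs"`, the comparison against a threshold in seconds means the same thing for every pair of rows.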

I hope someone can help me fix this problem. All help is appreciated!


# create example data

ResearchNumber <- c("AL001","AL002","AL003")

# convert all timestamps in one call so the time zone is kept consistently
DateTimeTag <- as.POSIXct(
  c('2014-09-29 10:35:40',
    '2014-09-29 10:35:42',
    '2014-09-29 10:38:18'),
  tz='GMT'
)

logdata <- data.frame(ResearchNumber, DateTimeTag)


# loop through logdata to add variable to every observation with a server session number

linecount <- 1
for (lines in logdata) {
  if (linecount == 1) {
    session_number <- 1
    logdata$session_number <- session_number
    datetime <- logdata$DateTimeTag
    participantbefore <- logdata$ResearchNumber
    linecount <- (linecount + 1)
  } 
  else if (linecount > 1) {
    difference <- (logdata$DateTimeTag - datetime)
    if (logdata$ResearchNumber != participantbefore) {
      logdata$session_number <- (session_number + 1)
      participantbefore <- logdata$ResearchNumber
      session_number <- (session_number + 1)
      datetime <- logdata$DateTimeTag
    }
    else if (difference > 1800) {
      logdata$session_number <- (session_number + 1)
      participantbefore <- logdata$ResearchNumber
      session_number <- (session_number + 1)
      datetime <- logdata$DateTimeTag
    }
    else {
      logdata$session_number <- (session_number)
      participantbefore <- logdata$ResearchNumber
      datetime <- logdata$DateTimeTag
    }
  }
}
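For comparison, here is a minimal sketch of how the rule described above could be written. Note that `for (lines in logdata)` iterates over the columns of the data frame, not its rows, and that `logdata$DateTimeTag` inside the loop refers to the whole column; indexing rows with `seq_len(nrow(...))` avoids both. The data frame `example_log` and its rows are made up for illustration so that both rules are exercised, and the sketch assumes the rows are already ordered by participant and time.

```r
# Hypothetical example data (not the real log): two participants,
# with one gap of more than 30 minutes for AL001
example_log <- data.frame(
  ResearchNumber = c("AL001", "AL001", "AL001", "AL002", "AL002"),
  DateTimeTag = as.POSIXct(
    c("2014-09-29 10:35:40", "2014-09-29 10:40:00", "2014-09-29 11:30:00",
      "2014-09-29 11:31:00", "2014-09-29 11:35:00"),
    tz = "GMT"
  ),
  stringsAsFactors = FALSE
)

n <- nrow(example_log)
session <- integer(n)
session[1] <- 1L

# Loop over row indices 2..n and compare each row with the one before it
for (i in seq_len(n)[-1]) {
  same_participant <-
    example_log$ResearchNumber[i] == example_log$ResearchNumber[i - 1]
  gap_secs <- as.numeric(difftime(example_log$DateTimeTag[i],
                                  example_log$DateTimeTag[i - 1],
                                  units = "secs"))
  if (!same_participant || gap_secs > 1800) {
    session[i] <- session[i - 1] + 1L   # new participant or long inactivity
  } else {
    session[i] <- session[i - 1]        # same session as the previous row
  }
}
example_log$session_number <- session

# Vectorized equivalent of the same rule, without an explicit loop
new_participant <- c(TRUE, example_log$ResearchNumber[-1] !=
                             example_log$ResearchNumber[-n])
gaps <- c(0, diff(as.numeric(example_log$DateTimeTag)))
session_vectorized <- cumsum(new_participant | gaps > 1800)
```

Both versions count a session boundary at every change of participant or gap above 1800 seconds; the vectorized form is likely to be faster on a large log file.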
