jeudi 9 janvier 2020

Subset All Rows Based on First Value of Column

I have a subsetting problem that I am stuck on. This is a snippet of the data:

UniqueID MonthYear FirstObs
ABC123   OCT-18    1
ABC123   NOV-18    0
ABC123   JAN-19    0
ABC123   FEB-19    0
DEF446   MAY-19    1
DEF456   JUN-19    0
DEF456   JUL-19    0
GHI789   OCT-18    1
GHI789   NOV-18    0

The dataset is quite large and has a series of rows that look like the above sample. I want to be able to write a subsetting formula that extracts every row with identical chunks of UniqueIDs together that start with the FirstObs=1 and put them together based on the month from which they originated. I would have something like the following:

Subset1 (all uniqueIDs that originated in October)
UniqueID MonthYear FirstObs
ABC123   OCT-18    1
ABC123   NOV-18    0
ABC123   JAN-19    0
ABC123   FEB-19    0
GHI789   OCT-18    1
GHI789   NOV-18    0


Subset2 (all uniqueIDs that originated in May)
UniqueID MonthYear FirstObs    
DEF446   MAY-19    1
DEF456   JUN-19    0
DEF456   JUL-19    0

Ideally, I would have one subset for every chunk that starts with FirstObs=1 from every month. I know I need to use some series of ifelse and subset functions, but I'm not sure how to go about using them in the best way.

Aucun commentaire:

Enregistrer un commentaire