Here is my data frame:
New <- data.frame(
  ID = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4),
  DC = c("Qualx", "lmx", "lmx", "lmx", "lmx", "Qualx", "Qualx", "Qualx",
         "lmx", "lmx", "lmx", "Qualx", "Qualx", "Qualx", "Qualx", "Qualx",
         "lmx", "Qualx", "Qualx", "Qualx")
)
Now I would like to group by (ID, DC) and then extract counts or frequencies (in percent * 100 format).
My approach using dplyr:
New1 <- New %>% group_by(ID, DC) %>% mutate(count = n()) %>% mutate(freq = count / sum(count))
However, my 'freq' column seems to be displaying wrong information.
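A possible reason the values look wrong, shown on a small made-up data frame (`toy` is hypothetical, not my real data): after group_by(ID, DC), every row in a group carries the same count n, so sum(count) inside that group is n * n and freq comes out as 1 / n instead of a share of anything.

```r
library(dplyr)

# Hypothetical toy data: group (1, "a") has 2 rows, the other groups have 1 row
toy <- data.frame(ID = c(1, 1, 1, 2), DC = c("a", "a", "b", "b"))

out <- toy %>%
  group_by(ID, DC) %>%
  mutate(count = n()) %>%                # same n repeated on every row of the group
  mutate(freq = count / sum(count)) %>%  # sum(count) is n * n, so freq is 1 / n
  ungroup()

print(out)
```

If that is what is happening, the grouping probably has to be dropped or changed before the division.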
Once I do get my frequency values, I would like to mutate again, and get another column based on ifelse - something like:
%>% mutate(n = ifelse(freq == .5, DC, 'Unknown'))
However, when I perform the above operation, I keep running into various errors.
I also tried:
D_F_P <- New %>% group_by(ID,DC) %>% table() %>% data.frame() %>% mutate(freq = Freq / sum(Freq))%>%mutate(assign = ifelse(freq == .1, DC, 'Unknown'))
The above operation provides a numeric value for 'assign' column instead of returning the string value present in DC column, like this:
  ID    DC Freq freq  assign
1  1   lmx    5 0.25 Unknown
2  2   lmx    2 0.10       1
3  3   lmx    1 0.05 Unknown
4  4   lmx    0 0.00 Unknown
5  1 Qualx    4 0.20 Unknown
6  2 Qualx    5 0.25 Unknown
7  3 Qualx    1 0.05 Unknown
8  4 Qualx    2 0.10       2
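The 1 and 2 in the assign column look like factor level codes rather than strings. A minimal sketch of that effect, assuming DC became a factor when the table was converted with data.frame() (the vectors `dc` and `hit` below are made up for illustration):

```r
# Hypothetical factor standing in for the DC column; levels are fixed explicitly
dc  <- factor(c("lmx", "Qualx"), levels = c("lmx", "Qualx"))
hit <- c(TRUE, FALSE)

# ifelse() drops the factor attributes, so the integer code is returned
as_codes <- ifelse(hit, dc, "Unknown")

# Converting to character first keeps the label
as_labels <- ifelse(hit, as.character(dc), "Unknown")

print(as_codes)
print(as_labels)
```

Under that assumption, wrapping DC in as.character() inside ifelse() may be enough to get the labels back.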
Instead I want it to display
  ID    DC Freq freq  assign
1  1   lmx    5 0.25 Unknown
2  2   lmx    2 0.10     lmx
3  3   lmx    1 0.05 Unknown
4  4   lmx    0 0.00 Unknown
5  1 Qualx    4 0.20 Unknown
6  2 Qualx    5 0.25 Unknown
7  3 Qualx    1 0.05 Unknown
8  4 Qualx    2 0.10   Qualx
My main goal is to group by (ID, DC), then get frequencies (percentage * 100), then use an ifelse statement that returns the values in the DC column. Any help would be appreciated. You don't have to use my approach; any dplyr-based approach would also be appreciated. Thank you.
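For reference, a rough sketch of the kind of pipeline I have in mind, not a confirmed solution. Assumptions: the frequency is taken relative to all rows (as in the table output), the 0.1 threshold is only the value from my example, DC is kept as character via stringsAsFactors = FALSE, and near() is used in place of == to avoid floating-point surprises. The column names Freq, freq and assign are just the ones from the output above.

```r
library(dplyr)

# Same data as above; stringsAsFactors = FALSE is an assumption to keep DC as text
New <- data.frame(
  ID = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4),
  DC = c("Qualx", "lmx", "lmx", "lmx", "lmx", "Qualx", "Qualx", "Qualx",
         "lmx", "lmx", "lmx", "Qualx", "Qualx", "Qualx", "Qualx", "Qualx",
         "lmx", "Qualx", "Qualx", "Qualx"),
  stringsAsFactors = FALSE
)

result <- New %>%
  count(ID, DC, name = "Freq") %>%       # one row per observed (ID, DC) pair
  mutate(freq = Freq / sum(Freq)) %>%    # share of all rows, since count() returns ungrouped data here
  mutate(assign = ifelse(near(freq, 0.1), as.character(DC), "Unknown"))

print(result)
```

Unlike table(), count() only returns combinations that occur in the data, so the (ID 4, lmx) row with a zero count would be missing from this result.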