I have the following problem. Here is some example data:
Product_id  product_type  views  inventory
1           producttype1  Y      Y
2           producttype2  N      N
3           producttype3  Y      Y
4           producttype4  N      N
5           producttype5  Y      Y
6           producttype6  N      N
7           producttype7  Y      Y
8           producttype1  N      N
9           producttype2  Y      Y
10          producttype3  N      N
11          producttype4  Y      Y
12          producttype5  N      N
13          producttype6  Y      Y
14          producttype7  N      N
15          producttype7  Y      Y
I have a population of 10 million rows, from which I am trying to extract a 10% sample, grouped by product_type and views. If the resulting sample has 500k rows or fewer, I can keep it as it is, but if it has more than 500k rows, I have to reduce it to 500k. This is the code I wrote to group the data and extract the 10% sample:
library(dplyr)

# Draw a 10% sample within each (product_type, views) group
sampledData <- MPSSAMPLE %>%
  group_by(product_type, views) %>%
  sample_frac(0.10)
Can anyone help me with the conditions?
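One possible way to express the condition, as a minimal sketch rather than a definitive answer: take the grouped 10% sample first, then check its row count and draw a second simple random sample only when it exceeds the cap. The toy MPSSAMPLE built here, the name max_rows, and the scaled-down cap of 500 (standing in for 500k) are assumptions made for illustration only.

```r
library(dplyr)

set.seed(42)  # only so that the sketch is reproducible

# Hypothetical stand-in for MPSSAMPLE (the real table has about 10 million rows)
MPSSAMPLE <- data.frame(
  product_id   = 1:10000,
  product_type = paste0("producttype", rep(1:7, length.out = 10000)),
  views        = rep(c("Y", "N"), length.out = 10000),
  inventory    = rep(c("Y", "N"), length.out = 10000)
)

# Hypothetical name; scaled-down cap standing in for 500000
max_rows <- 500

# Step 1: 10% sample within each (product_type, views) group
sampledData <- MPSSAMPLE %>%
  group_by(product_type, views) %>%
  sample_frac(0.10) %>%
  ungroup()

# Step 2: only if the sample exceeds the cap, reduce it to max_rows rows
if (nrow(sampledData) > max_rows) {
  sampledData <- sampledData %>% sample_n(max_rows)
}

nrow(sampledData)
```

With the real data, max_rows would be set to 500000. Note that the second draw in step 2 is a simple random sample over the whole result, so the group proportions are preserved only approximately; if exact per-group proportions matter, an alternative would be to lower the sampling fraction in step 1 instead.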