I'm not very experienced with R, so I don't know whether this is a simple problem. I want to group IDs into ranges based on the sum of their values, so that the first range makes up about 60% of the total sum. Here is the data frame, DF:
ID Val
98 2
98 1
98 4
3 11
3 6
3 8
3 1
24 3
24 2
46 1
46 2
59 6
I would first sort DF by ID, then find the range of IDs whose Val sums to roughly 60% of the total and group them. The remaining IDs would be grouped in 10% steps (10%, 10%, 10%, 10%), or in uneven steps such as 10%, 10%, 20% or 5%, 15%, 10%, 10%. The resulting data frame would look like this:
ID Val
3-24 31 # (11+6+8+1+3+2) ~ 66% of the total sum of `Val` column
46-59 9 # (1+2+6) ~ 19% of the total sum of `Val` column
98 7 # (2+1+4) ~ 15% of the total sum of `Val` column
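One possible way to get a table of this shape is sketched below in base R. It is only an illustration under stated assumptions: the cut points in `breaks` (60%, then two bins of 20%) are a hypothetical choice picked to match the example, and the helper names `tot`, `cum_before`, `grp` and `res` are not from the original post. Each ID is assigned to the bin in which its cumulative share starts, so the first range may overshoot 60% slightly.

```r
# Example data from the question
DF <- data.frame(
  ID  = c(98, 98, 98, 3, 3, 3, 3, 24, 24, 46, 46, 59),
  Val = c(2, 1, 4, 11, 6, 8, 1, 3, 2, 1, 2, 6)
)

# Total Val per ID, sorted by ID
tot <- aggregate(Val ~ ID, data = DF, FUN = sum)
tot <- tot[order(tot$ID), ]

# Share of the grand total that lies before each ID
cum_before <- (cumsum(tot$Val) - tot$Val) / sum(tot$Val)

# Assumed cut points: first bin covers 60%, then two bins of 20%
breaks <- c(0, 0.6, 0.8, 1)

# Bin index for each ID, based on where its cumulative share starts
grp <- findInterval(cum_before, breaks)

# Collapse each bin into one row with an ID range label and the summed Val
res <- do.call(rbind, lapply(split(tot, grp), function(g) {
  label <- if (nrow(g) > 1) {
    paste(min(g$ID), max(g$ID), sep = "-")
  } else {
    as.character(g$ID)
  }
  data.frame(ID = label, Val = sum(g$Val), stringsAsFactors = FALSE)
}))
rownames(res) <- NULL

# Percentage of the grand total covered by each range
res$Pct <- round(100 * res$Val / sum(res$Val))
print(res)
```

For the example data this groups the IDs as 3-24 (Val 31), 46-59 (Val 9) and 98 (Val 7). Changing `breaks` to, say, `c(0, 0.6, 0.7, 0.8, 0.9, 1)` would give 10% steps instead, which for this small data set puts 46 and 59 into separate rows.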
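I could try something like this:
# sort the rows by ID
DF <- DF[order(DF$ID), ]
# 60% of the total of Val (not of ID)
perce <- sum(DF$Val) * 60 / 100
# running total of Val over the sorted rows
running <- 0
for (i in seq_len(nrow(DF))) {
  running <- running + DF$Val[i]
  if (running >= perce) {
    # rows 1..i reach 60% of the total; collect their IDs
    ids <- unique(DF$ID[1:i])
    # ...
    # put those IDs in a range that constitutes 60%
    break
  }
}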
I don't know whether this is possible.
Thanks, Domnick