samedi 7 octobre 2017

Make a range of ID's based on sum of values in R

Being not so well versed with r, don't know if this is a simple problem or not. I want to create a range of ID's based on their sum of values which make up 60%(or approx) of total sum. here is the dataframe. DF

ID     Val
98     2
98     1
98     4
3     11
3      6
3      8
3      1
24     3
24     2
46     1
46     2
59     6

Such that I would first sort the DF by ID and then check for which range of ID's the Value sums upto 60% and group them and for rest, group them by 10%,10%,10%,10%(or it could be random 10%,10%,20% or 5%,15%,10%,10%). such that my dataframe would look like

ID     Val
3-24   35           # (11+6+8+1+3+2) ~ 62% of the total sum of `Val` column
46-59  9            # (1+2+6) = 18% of the total sum of `Val` column
98     7            # (2+1+4) =14% of the total sum of `Val` column

I could try this

DF=DF[with(DF, order(DF$ID)), ]
perce = sum(DF$ID)*60/100
for(i in 1:dim(DF)[1]){
     if(DF$Val == perce){
      ID=which(DF$ID)
       .
       .
       .
put those ID's in a range that constitutes 60%

       }
    }

I don't know if this could be possible or not.?

Thanks Domnick

Aucun commentaire:

Enregistrer un commentaire