Wednesday, August 7, 2019

Create a new group based on conditions applied to rows of different columns

I want to create a new column that assigns rows to groups of five members each. Members of a group must belong to the same area, and the distance between their zip codes should be less than five.

I tried to do this with an if condition inside a loop:

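```python
# Assumes df1 has a default RangeIndex and an existing numeric 'rank' column.
for i in range(len(df1)):
    j = i + 1
    while j < len(df1):
        # Compare the absolute zip (pincode) distance between rows i and j.
        if abs(df1.loc[j, 'pincode'] - df1.loc[i, 'pincode']) <= 5:
            # .loc avoids the chained assignment df1['rank'][j] = ...
            df1.loc[j, 'rank'] = df1.loc[i, 'rank'] + 1
            break
        print(i, j)
        j = j + 1  # advance to the next candidate row (was unreachable after break)
```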
# Dataset with expected outcome

    ID,Location,zip,rank,group id
    1,Area1,15000,1,1
    2,Area1,15000,2,1
    3,Area1,15000,3,1
    4,Area1,15000,4,1
    5,Area1,15000,5,1
    6,Area1,15001,1,2
    7,Area1,15002,2,2
    8,Area1,15003,3,2
    9,Area1,15004,4,2
    10,Area1,15000,5,2
    11,Area1,15000,1,3
    12,Area1,15000,2,3
    13,Area1,15000,3,3
    14,Area1,15000,4,3
    15,Area2,15000,1,4
    16,Area2,15000,2,4
    17,Area2,15000,3,4
    18,Area2,15000,4,4
    19,Area2,15000,5,4
    20,Area2,15000,1,5
    21,Area2,15500,1,6
    22,Area2,15000,2,6
    23,Area2,15000,3,6
    24,Area2,15000,4,6

The first three columns (ID, Location, zip) are the input data, and I am trying to create the rank and group id columns.
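One possible reading of the rule can be sketched as a single greedy pass in row order: a row joins the current group only if the group has fewer than five members, the area matches, and every zip code in the group stays within a distance of less than five. This is only a sketch under those assumptions, not a confirmed solution. The function name `assign_groups` and the parameters `group_size` and `max_distance` are hypothetical, and the comparison is strict (`<`), following the description rather than the `<=` used in the loop above. Under this reading, a row with zip 15500 would not share a group with rows whose zip is 15000, so the last rows of the expected table would come out differently.

```python
def assign_groups(locations, zips, group_size=5, max_distance=5):
    """Greedily assign a (rank, group id) pair to each row, in row order."""
    ranks, group_ids = [], []
    group_id = 0
    current_location = None
    lo = hi = None  # smallest and largest zip in the current group
    count = 0       # number of members in the current group

    for location, zip_code in zip(locations, zips):
        # A row fits if the group is open, not full, in the same area,
        # and the zip spread of the group stays below max_distance.
        fits = (
            count > 0
            and count < group_size
            and location == current_location
            and max(hi, zip_code) - min(lo, zip_code) < max_distance
        )
        if fits:
            count += 1
            lo, hi = min(lo, zip_code), max(hi, zip_code)
        else:
            # Open a new group starting at this row.
            group_id += 1
            count = 1
            current_location = location
            lo = hi = zip_code
        ranks.append(count)          # rank = position inside the group
        group_ids.append(group_id)
    return ranks, group_ids


# Small made-up sample, not the full dataset from the question.
locations = ["Area1"] * 7 + ["Area2"] * 3
zips = [15000, 15000, 15001, 15002, 15003, 15004, 15000, 15000, 15500, 15000]
ranks, group_ids = assign_groups(locations, zips)
print(list(zip(ranks, group_ids)))
```

With a DataFrame, the two returned lists could presumably be assigned back as columns, for example `df1['rank'], df1['group id'] = assign_groups(df1['Location'], df1['zip'])`, assuming the column names shown in the dataset.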
