Tuesday, 16 July 2019

How can I count how many objects are in the overlap of two different subsets?

I have one category defined by height and weight, and a different category defined by whether someone is a twin and how many siblings they have. Both are built with np.where. I want to count how many rows fall into both categories at the same time (like the center of a Venn diagram).

I'm importing columns of a CSV file. This is what the table looks like:

    Child  Inches  Weight Twin  Siblings
0     A      53     100    Y         3
1     B      54     110    N         4
2     C      56     120    Y         2
3     D      58     165    Y         1
4     E      60     150    N         1
5     F      62     160    N         1
6     H      65     165    N         3

import pandas as pd
import numpy as np

file = pd.read_csv(r'~/Downloads/Test3 CVS_Sheet1.csv')
#%%
height = file["Inches"]
weight = file["Weight"]
twin = file["Twin"]
siblings = file["Siblings"]
#%%
area1 = np.where((height <= 60) & (weight <= 150))[0]
#%%
#has two or more siblings (and is a twin)
group_a = np.where((siblings >= 2) & (twin == 'Y'))[0]

#has two or more siblings (and is not a twin)
group_b = np.where((siblings >= 2) & (twin == 'N'))[0]

#has only one sibling (and is twin)
group_c = np.where((siblings == 1) & (twin == 'Y'))[0]

#has only one sibling (and is not a twin)
group_d = np.where((siblings == 1) & (twin == 'N'))[0]
#%%
for i in area1:
    if group_a==True:
        print("in area1 there are", len(i), "children in group_a")
    elif group_b==True:
        print("in area1 there are", len(i), "children in group_b")  
    elif group_c==True:
        print("in area1 there are", len(i), "children in group_c")
    elif group_d==True:
        print("in area1 there are", len(i), "children in group_d")


I get the error: "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
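The error most likely comes from lines such as `if group_a==True:`. `group_a` is an array of row positions, so the comparison is done element by element and yields another array, which `if` cannot reduce to a single True or False. A minimal sketch reproducing this, assuming `group_a` holds the positions `[0, 2]` that the table above would give:

```python
import numpy as np

# Row positions as np.where(...)[0] would return them (assumed values for illustration)
group_a = np.array([0, 2])

# Comparing an array to True is elementwise, so the result is itself an array
comparison = group_a == True

try:
    if comparison:  # an array with more than one element has no single truth value
        pass
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

Separately, each `i` in `for i in area1:` is a single integer, so `len(i)` would raise a TypeError even once the comparison is fixed.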

I'm hoping for an output like:

"in area1 there are 2 children in group_a"
"in area1 there are 1 children in group_b"
"in area1 there are 0 children in group_c"
"in area1 there are 1 children in group_d"
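One possible way to get these counts is to intersect the index arrays with `np.intersect1d` rather than looping over `area1`. The following is only a sketch: the table is inlined through `io.StringIO` so the example runs without the CSV file, and the names `df`, `groups` and `counts` are introduced here for illustration.

```python
import io

import numpy as np
import pandas as pd

# Inline copy of the table from the question (stands in for the CSV file)
csv_text = """Child,Inches,Weight,Twin,Siblings
A,53,100,Y,3
B,54,110,N,4
C,56,120,Y,2
D,58,165,Y,1
E,60,150,N,1
F,62,160,N,1
H,65,165,N,3
"""
df = pd.read_csv(io.StringIO(csv_text))

# Row positions of children at most 60 inches tall and at most 150 in weight
area1 = np.where((df["Inches"] <= 60) & (df["Weight"] <= 150))[0]

# Row positions for each sibling/twin combination
groups = {
    "group_a": np.where((df["Siblings"] >= 2) & (df["Twin"] == "Y"))[0],
    "group_b": np.where((df["Siblings"] >= 2) & (df["Twin"] == "N"))[0],
    "group_c": np.where((df["Siblings"] == 1) & (df["Twin"] == "Y"))[0],
    "group_d": np.where((df["Siblings"] == 1) & (df["Twin"] == "N"))[0],
}

# The overlap of two subsets is the set of positions present in both arrays
counts = {name: len(np.intersect1d(area1, idx)) for name, idx in groups.items()}

for name, n in counts.items():
    print("in area1 there are", n, "children in", name)
```

With the sample table this gives 2, 1, 0 and 1 for group_a through group_d. An alternative would be to keep the boolean masks (without `np.where`) and count with `(mask1 & mask2).sum()`.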

Thanks in advance!
