I have this DataFrame (df) that looks like this:
+-----------------+-----------+----------------+---------------------+--------------+-------------+
| Gene | Gene name | Tissue | Cell type | Level | Reliability |
+-----------------+-----------+----------------+---------------------+--------------+-------------+
| ENSG00000001561 | ENPP4 | adipose tissue | adipocytes | Low | Approved |
| ENSG00000001561 | ENPP4 | adrenal gland | glandular cells | High | Approved |
| ENSG00000001561 | ENPP4 | appendix | glandular cells | Medium | Approved |
| ENSG00000001561 | ENPP4 | appendix | lymphoid tissue | Low | Approved |
| ENSG00000001561 | ENPP4 | bone marrow | hematopoietic cells | Medium | Approved |
| ENSG00000002586 | CD99 | adipose tissue | adipocytes | Low | Supported |
| ENSG00000002586 | CD99 | adrenal gland | glandular cells | Medium | Supported |
| ENSG00000002586 | CD99 | appendix | glandular cells | Not detected | Supported |
| ENSG00000002586 | CD99 | appendix | lymphoid tissue | Not detected | Supported |
| ENSG00000002586 | CD99 | bone marrow | hematopoietic cells | High | Supported |
| ENSG00000002586 | CD99 | breast | adipocytes | Not detected | Supported |
| ENSG00000003056 | M6PR | adipose tissue | adipocytes | High | Approved |
| ENSG00000003056 | M6PR | adrenal gland | glandular cells | High | Approved |
| ENSG00000003056 | M6PR | appendix | glandular cells | High | Approved |
| ENSG00000003056 | M6PR | appendix | lymphoid tissue | High | Approved |
| ENSG00000003056 | M6PR | bone marrow | hematopoietic cells | High | Approved |
+-----------------+-----------+----------------+---------------------+--------------+-------------+
Expected output:
+-----------+--------+-------------------------------+
| Gene name | Level | Tissue |
+-----------+--------+-------------------------------+
| ENPP4 | Low | adipose tissue, appendix |
| ENPP4 | High | adrenal gland, bronchus |
| ENPP4 | Medium | appendix, breast, bone marrow |
| CD99 | Low | adipose tissue, appendix |
| CD99 | High | bone marrow |
| CD99 | Medium | adrenal gland |
| ... | ... | ... |
+-----------+--------+-------------------------------+
Code used (adapted from "multiple if else conditions in pandas dataframe and derive multiple columns"):
    def text_df(df):
        if (df[df['Level'].str.match('High')]):
            return (df.assign(Level='High') + df['Tissue'].astype(str))
        elif (df[df['Level'].str.match('Medium')]):
            return (df.assign(Level='Medium') + df['Tissue'].astype(str))
        elif (df[df['Level'].str.match('Low')]):
            return (df.assign(Level='Low') + df['Tissue'].astype(str))

    df = df.apply(text_df, axis = 1)
Error: KeyError: ('Level', 'occurred at index 172')
I can't understand what I am doing wrong. Any suggestions?
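One possible direction, offered as a sketch rather than a definitive fix: with `axis=1`, `apply` passes each row to `text_df` as a Series, so whole-DataFrame operations such as boolean-mask indexing and `.assign` do not behave there as they would on `df`. The expected output looks like a grouped aggregation, so `groupby` plus a string join may be a better fit than row-wise `if`/`elif`. The sample below is a subset of the table above; the helper name `join_unique` is made up for illustration, and dropping "Not detected" rows is an assumption based on the levels shown in the expected output.

```python
import pandas as pd

# Subset of the rows shown in the question (only the columns that are needed).
df = pd.DataFrame(
    {
        "Gene name": ["ENPP4", "ENPP4", "ENPP4", "ENPP4", "ENPP4",
                      "CD99", "CD99", "CD99"],
        "Tissue": ["adipose tissue", "adrenal gland", "appendix", "appendix",
                   "bone marrow", "adipose tissue", "adrenal gland", "appendix"],
        "Level": ["Low", "High", "Medium", "Low", "Medium",
                  "Low", "Medium", "Not detected"],
    }
)


def join_unique(tissues: pd.Series) -> str:
    """Hypothetical helper: join the distinct tissues, keeping first-seen order."""
    return ", ".join(tissues.drop_duplicates())


# Assumption: only these three levels are wanted in the result.
wanted_levels = ["High", "Medium", "Low"]

result = (
    df[df["Level"].isin(wanted_levels)]
    # sort=False keeps groups in order of first appearance instead of sorting keys.
    .groupby(["Gene name", "Level"], sort=False)["Tissue"]
    .agg(join_unique)
    .reset_index()
)

print(result)
```

This gives one row per (Gene name, Level) pair with the tissues collapsed into a comma-separated string. If "Not detected" should also appear in the output, the `isin` filter can simply be removed.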