Goal: Categorize each ID (row) into its top 3 prioritized categories.
I have the following DataFrame (white columns) and I'm trying to create the columns highlighted in yellow.
Below is the logic I want to implement in code:
To fill in the 'Cat_Priority_1' column:
1. If 'Cat_1' is not null, then 'Cat_Priority_1' = the value in 'Cat_1'.
2. If 'Cat_1' is null, check 'Cat_2'. If 'Cat_2' is not null, then 'Cat_Priority_1' = the value in 'Cat_2'.
3. If 'Cat_2' is also null, check 'Cat_3'. If 'Cat_3' is not null, then 'Cat_Priority_1' = the value in 'Cat_3'.
4. If 'Cat_3' is also null, check 'Cat_4'.
... and so on, all the way through 'Cat_6'.
To fill in the 'Cat_Priority_2' column: this column follows the same logic as 'Cat_Priority_1', EXCEPT that 'Cat_Priority_2' cannot equal 'Cat_Priority_1'. Please see the screenshot of the table above for examples.
To fill in the 'Cat_Priority_3' column: this column follows the same logic as 'Cat_Priority_1', EXCEPT that 'Cat_Priority_3' cannot equal 'Cat_Priority_1' or 'Cat_Priority_2'. Please see the screenshot of the table above for examples.
Logic Caveat #1: If 'Cat_1' through 'Cat_6' are all NaN, then 'Cat_Priority_1' equals 'Cat_1', 'Cat_Priority_2' equals 'Cat_2', and 'Cat_Priority_3' equals 'Cat_3' (that is, all three stay NaN).
Logic Caveat #2: Only two of the columns have values populated. See row 5 in the table above for an example.
Logic Caveat #3: Only one of the columns has a value populated. See row 6 in the table above for an example.
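For reference, the rules and caveats above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the sample values in df_mock are made up (the real table is only in the screenshot), the helper name top_three is hypothetical, and any priority slot that cannot be filled is assumed to stay NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df_mock is only shown as a screenshot.
df_mock = pd.DataFrame({
    "Cat_1": ["A", np.nan, np.nan, np.nan],
    "Cat_2": ["B", "B", np.nan, np.nan],
    "Cat_3": ["A", np.nan, np.nan, np.nan],
    "Cat_4": ["C", "D", "E", np.nan],
    "Cat_5": [np.nan, "B", np.nan, np.nan],
    "Cat_6": ["D", np.nan, np.nan, np.nan],
})

cat_cols = [f"Cat_{i}" for i in range(1, 7)]
priority_cols = [f"Cat_Priority_{i}" for i in range(1, 4)]

def top_three(row):
    """Return the first three distinct non-null values of a row, in column order."""
    # dict.fromkeys drops repeated values while keeping first-seen order
    values = list(dict.fromkeys(row.dropna()))[:3]
    # Assumption: slots with no remaining category are left as NaN (caveats #1 to #3)
    values += [np.nan] * (3 - len(values))
    return pd.Series(values, index=priority_cols)

priorities = df_mock[cat_cols].apply(top_three, axis=1)
df_mock = df_mock.join(priorities)
print(df_mock[priority_cols])
```

With this made-up data, the first row has "A" repeated in 'Cat_1' and 'Cat_3', so its three priorities come from 'Cat_1', 'Cat_2' and 'Cat_4'; the last row is all NaN and stays NaN in every priority column.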
Below is some code I started with. Unfortunately, it did not work; see the table output it produced below:
df_mock['Prioritize_1'] = df_mock.apply(lambda x: x.first_valid_index(), axis=1)
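A possible reason the attempt above falls short: Series.first_valid_index() returns the index label of the first non-null entry (here, a column name such as 'Cat_2'), not the value stored there, and it returns None when every entry is null. The small sketch below illustrates this on a single made-up row; the row contents are an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical single row, standing in for one row of df_mock
row = pd.Series({"Cat_1": np.nan, "Cat_2": "B", "Cat_3": "C"})

# first_valid_index() gives the label of the first non-null entry
label = row.first_valid_index()

# Look the value up by that label; fall back to NaN if the row is all null
value = row[label] if label is not None else np.nan

print(label)
print(value)
```

Looking the value up by label this way would only cover 'Cat_Priority_1'; the "cannot equal an earlier priority" rule for the second and third columns still needs separate handling.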
Any help is greatly appreciated!