I have a large dataset of poultry. A subset of it looks like this:
```
> poultry
  Bird_ID weight_grams Year_weighed Alive
1 1_brown          855         2021     A
2 1_brown          850         2019     A
3 2_brown          852         2021     A
4 2_brown          848         2020     A
5 3_brown          851         2021     D
6 3_brown          850         2020     A
7 4_brown          620         2018     D
8 4_brown          580         2015     A
```
I want to add a column called 'status' to my poultry data frame. For each bird, this column should indicate:
(i) 'New Big' if the bird weighed <850 grams in at least one of the years it was weighed, weighed ≥850 grams in another year, and has only 'A' in the Alive column, meaning it is still alive.
(ii) 'Big' if the bird weighed ≥850 grams in every year it was weighed and has only 'A' in the Alive column, meaning it is still alive.
(iii) 'Dead' if the bird has at least one 'D' in the Alive column.
My desired output looks like this:
```
> output
  Bird_ID weight_grams Year_weighed Alive  Status
1 1_brown          855         2021     A     Big
2 1_brown          850         2019     A     Big
3 2_brown          852         2021     A New Big
4 2_brown          848         2020     A New Big
5 3_brown          851         2021     D    Dead
6 3_brown          850         2020     A    Dead
7 4_brown          620         2018     D    Dead
8 4_brown          580         2015     A    Dead
```
I have tried to work with dplyr pipes as follows:
```
library(dplyr)

poultry %>%
  group_by(Bird_ID) %>%
  mutate(status = if((min(weight_grams) < 850) & (max(weight_grams) >= 850) & (Alive == "A")) 'New Big' else if ((Alive == "A") & (min(weight_grams) >= 850)) 'Big' else 'Dead')
```
Unfortunately, this produces several warning messages that I don't understand. I would be grateful for any pointers.
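A likely cause, offered as a sketch rather than a definitive diagnosis: `if` expects a single TRUE/FALSE, but `Alive == "A"` yields one value per row in the group, so the condition has length greater than one. Older versions of R warn and use only the first element; as far as I know, R 4.2.0 and later treat this as an error. One possible workaround is to collapse each condition to a single value per bird with `any()` and `all()`, and pick the label with `dplyr::case_when()`. The sample data below is rebuilt from the printout above. Two assumptions are not stated in the rules: a living bird that never reached 850 grams gets `NA`, and the order of the years is ignored (as in the original `min()`/`max()` attempt).

```r
library(dplyr)

# Sample data rebuilt from the printed subset above
poultry <- data.frame(
  Bird_ID      = c("1_brown", "1_brown", "2_brown", "2_brown",
                   "3_brown", "3_brown", "4_brown", "4_brown"),
  weight_grams = c(855, 850, 852, 848, 851, 850, 620, 580),
  Year_weighed = c(2021, 2019, 2021, 2020, 2021, 2020, 2018, 2015),
  Alive        = c("A", "A", "A", "A", "D", "A", "D", "A")
)

output <- poultry %>%
  group_by(Bird_ID) %>%
  mutate(Status = case_when(
    # Each condition is a single TRUE/FALSE per bird, checked in order
    any(Alive == "D")        ~ "Dead",     # at least one 'D' record
    all(weight_grams >= 850) ~ "Big",      # alive and never below 850 g
    any(weight_grams >= 850) ~ "New Big",  # alive, below 850 g in some year, at or above in another
    TRUE                     ~ NA_character_  # assumption: alive but never reached 850 g
  )) %>%
  ungroup()

print(output)
```

Because every condition is length one within a group, `mutate()` recycles the chosen label to all rows of that bird. If "grew to" must mean the lighter weighing came in an earlier year, the `"New Big"` condition would need to compare weights after sorting by `Year_weighed`, which this sketch does not do.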