I have a large dataframe (my_DF) with 4 important columns: ID (1-->100), YEAR (2000, 2001, 2002, 2003, 2004, 2005), MONTH (January --> December), LENGHT (continuous values from 0.1 to 1.0). A thing like:
YEAR MONTH ID LENGHT
1 2000 january S1 0.2
2 2000 january S1 0.3
3 2000 january S1 0.1
4 2000 january S2 0.5
5 2000 january S2 0.3
6 2000 february S1 0.9
7 2000 february S1 0.4
8 2000 february S1 0.6
9 2000 february S3 0.4
10 2000 february S3 0.3
11 2000 march S1 0.7
...
I need to add a new column to the dataframe, filled with the median lenght for each unique situation, so for each ID, YEAR and MONTH values.
I succeded in obtain the values I was looking for, thanks to aggregate:
agg <- aggregate(my_DF["LENGHT"], by = list(my_DF$YEAR, my_DF$MONTH, my_DF$ID), median)
In this way I obtain the values I wanted, but, of course, this just create a new dataframe. I'm failing in finding a rapid way to paste the values of the dataframe "agg" in a new column of dataframe "my_DF", basing on the YEAR, MONTH and ID corrispondence.
For instance, I want obtain a thing like:
YEAR MONTH ID LENGHT MONTHLY_LENGHT_MEDIAN
1 2000 january S1 0.2 0.2
2 2000 january S1 0.3 0.2
3 2000 january S1 0.1 0.2
4 2000 january S2 0.5 0.4
5 2000 january S2 0.3 0.4
6 2000 february S1 0.9 0.6
7 2000 february S1 0.4 0.6
8 2000 february S1 0.6 0.6
9 2000 february S3 0.4 0.35
10 2000 february S3 0.3 0.35
11 2000 march S1 0.7 0.7
So, I'm wondering if a conditional command is suitable in my case (if, ifelse...).
Unfortunately, I'm not good with these commands... How could I do? Thanks for helping!
Aucun commentaire:
Enregistrer un commentaire