This is probably straightforward, but I am struggling big time.
I have a data frame with employee counts for different industries between 1999 and 2019.
fyear industry employees
1 1999 Agriculture 132.260
2 2000 Agriculture 154.590
3 2001 Agriculture 147.725
4 2002 Agriculture 142.098
5 2003 Agriculture 77.169
6 2004 Agriculture 82.979
7 2005 Agriculture 99.625
8 2006 Agriculture 98.195
9 2007 Agriculture 95.193
10 2008 Agriculture 104.459
11 2009 Agriculture 182.930
12 2010 Agriculture 180.648
13 2011 Agriculture 173.408
14 2012 Agriculture 181.483
15 2013 Agriculture 109.842
16 2014 Agriculture 90.177
17 2015 Agriculture 92.067
18 2016 Agriculture 83.568
19 2017 Agriculture 70.251
20 2018 Agriculture 65.082
21 2019 Agriculture 82.754
22 1999 Aircraft 653.194
23 2000 Aircraft 692.918
24 2001 Aircraft 666.751
25 2002 Aircraft 633.565
26 2003 Aircraft 687.611
27 2004 Aircraft 701.827
28 2005 Aircraft 725.825
29 2006 Aircraft 751.171
30 2007 Aircraft 744.060
31 2008 Aircraft 750.319
32 2009 Aircraft 677.598
33 2010 Aircraft 690.605
34 2011 Aircraft 712.501
35 2012 Aircraft 716.985
36 2013 Aircraft 709.918
I am trying to create some growth variables:
df$employeegrowth <- df$employees / lag(df$employees) - 1
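This naturally causes problems for every 1999 row, because the 1999 value of one industry gets divided by the last value of the previous industry. I would like to replace the growth value in those rows with NA.
I am trying to solve this with an if statement: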
df$employeegrowth <- if(df$fyear == "1999") {
df$employeegrowth <- "NA"
}
But this substitutes every value in the employee growth column with NA.
I do not want to delete the entire row as the other columns contain valuable information.
Could someone point me in the right direction on this?
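For reference, here is a minimal base R sketch of the kind of replacement being asked about. It is only one possible approach, not a definitive answer. It relies on two facts about R: `if` expects a single TRUE/FALSE rather than a whole column (recent R versions raise an error for a longer condition, older ones only look at the first element), and `"NA"` in quotes is a character string, not a missing value. The toy data frame below reuses a few rows from the post. The helper vector `prev` and the column `employeegrowth2` are hypothetical names introduced for illustration, and the rows are assumed to be sorted by year within each industry.

```r
# Toy data in the same shape as the question (values copied from the post).
df <- data.frame(
  fyear     = c(1999, 2000, 2001, 1999, 2000, 2001),
  industry  = c("Agriculture", "Agriculture", "Agriculture",
                "Aircraft", "Aircraft", "Aircraft"),
  employees = c(132.260, 154.590, 147.725, 653.194, 692.918, 666.751)
)

# Previous row's value; the first element has no predecessor, so it is NA.
prev <- c(NA, head(df$employees, -1))
df$employeegrowth <- df$employees / prev - 1

# Vectorised replacement: a logical index picks only the 1999 rows and
# assigns a real NA (not the string "NA"), leaving the other rows alone.
df$employeegrowth[df$fyear == 1999] <- NA

# Alternative: compute the lag within each industry, so the first year of
# every industry is NA without hard-coding 1999.
df$employeegrowth2 <- ave(
  df$employees, df$industry,
  FUN = function(x) x / c(NA, head(x, -1)) - 1
)

print(df)
```

Only the growth column is touched, so the rows and their other columns stay in place. The grouped version is the safer choice if some industries start in a year other than 1999.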