Thursday, 17 September 2020

Deleting specific column/row values with if conditions

This is probably straightforward, but I am struggling big time.

I have a data frame with employee counts for different industries between 1999 and 2019.

     fyear           industry      employees
1    1999            Agriculture   132.260
2    2000            Agriculture   154.590
3    2001            Agriculture   147.725
4    2002            Agriculture   142.098
5    2003            Agriculture    77.169
6    2004            Agriculture    82.979
7    2005            Agriculture    99.625
8    2006            Agriculture    98.195
9    2007            Agriculture    95.193
10   2008            Agriculture   104.459
11   2009            Agriculture   182.930
12   2010            Agriculture   180.648
13   2011            Agriculture   173.408
14   2012            Agriculture   181.483
15   2013            Agriculture   109.842
16   2014            Agriculture    90.177
17   2015            Agriculture    92.067
18   2016            Agriculture    83.568
19   2017            Agriculture    70.251
20   2018            Agriculture    65.082
21   2019            Agriculture    82.754
22   1999               Aircraft   653.194
23   2000               Aircraft   692.918
24   2001               Aircraft   666.751
25   2002               Aircraft   633.565
26   2003               Aircraft   687.611
27   2004               Aircraft   701.827
28   2005               Aircraft   725.825
29   2006               Aircraft   751.171
30   2007               Aircraft   744.060
31   2008               Aircraft   750.319
32   2009               Aircraft   677.598
33   2010               Aircraft   690.605
34   2011               Aircraft   712.501
35   2012               Aircraft   716.985
36   2013               Aircraft   709.918

I am trying to create some growth variables:

df$employeegrowth <- df$employees / lag(df$employees) - 1

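This naturally causes issues for every 1999 row, because the lagged value there comes from the previous industry (or does not exist at all). I would like to replace the growth value in those rows with NA.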

I am trying to solve this issue with an if formula:

df$employeegrowth <- if(df$fyear == "1999") {
  df$employeegrowth <- "NA"
}

But this substitutes every value in the employee growth column with NA.

I do not want to delete the entire row as the other columns contain valuable information.

Could someone point me in the right direction on this?
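
For what it's worth, here is a minimal sketch of one possible direction, not a definitive answer. The likely problem is that if() is not vectorised: it expects a single TRUE/FALSE rather than one per row, and the quoted "NA" is a character string rather than a missing value. Logical indexing replaces only the matching rows. The sample data below is a cut-down copy of the table above, the previous-row shift is written in base R instead of lag() so the snippet has no package dependency, and the column name employeegrowth2 is only there for illustration.

```r
# Small sample in the same shape as the data frame in the question
df <- data.frame(
  fyear     = c(1999, 2000, 2001, 1999, 2000, 2001),
  industry  = c("Agriculture", "Agriculture", "Agriculture",
                "Aircraft", "Aircraft", "Aircraft"),
  employees = c(132.260, 154.590, 147.725, 653.194, 692.918, 666.751)
)

# Previous row's value: shift the vector down by one, padding with NA
prev <- c(NA, head(df$employees, -1))
df$employeegrowth <- df$employees / prev - 1

# Vectorised replacement: the logical index selects only the 1999 rows,
# and NA (unquoted) keeps the column numeric
df$employeegrowth[df$fyear == 1999] <- NA

# Alternative (illustrative column name): compute the lag within each
# industry, so the first year of every industry becomes NA on its own
df$employeegrowth2 <- ave(
  df$employees, df$industry,
  FUN = function(x) x / c(NA, head(x, -1)) - 1
)
```

Only the employeegrowth cell is blanked in the 1999 rows, so the other columns stay intact. The grouped version does not hard-code 1999, which may matter if some industries start in a later year.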
