Wednesday, February 27, 2019

R: using `for` and `if` to run a function on numeric vars only

I have a four-column dataframe with date, var1_share, var2_share, and total. I want to multiply each share column by total to create new variables containing the raw values of var1 and var2. The code below (a bit verbose) constructs the dataframe that contains the share variables:

df<- data.frame(dt= seq.Date(from = as.Date('2019-01-01'), 
    to= as.Date('2019-01-10'), by= 'day'),
    var1= round(runif(10, 3, 12), digits = 1), 
    var2= round(runif(10, 3, 12), digits = 1))
df$total<- apply(df[2:3], 1, sum)
# divide every non-date column by the row total, then reattach the date column
ratio<- lapply(df[-1], function(x) x/df$total)
df<- data.frame(df[1], ratio)
colnames(df)<- c('date', 'var1_share', 'var2_share', 'total')

The final dataframe should look like this:

> df
         date var1_share var2_share total
1  2019-01-01  0.5862069  0.4137931     1
2  2019-01-02  0.6461538  0.3538462     1
3  2019-01-03  0.3591549  0.6408451     1
4  2019-01-04  0.7581699  0.2418301     1
5  2019-01-05  0.3989071  0.6010929     1
6  2019-01-06  0.5132743  0.4867257     1
7  2019-01-07  0.5230769  0.4769231     1
8  2019-01-08  0.4969325  0.5030675     1
9  2019-01-09  0.5034965  0.4965035     1
10 2019-01-10  0.3254438  0.6745562     1

I have nested an if statement within a for loop, hoping to return a new dataframe called share. I've used is.numeric so that the loop skips date and only multiplies the share variables. However, when I run it, share contains only the date, not the desired result of date, each variable in its own column, and the total column. See the code below:

for (i in df){
  share<- if(is.numeric(i)){
      i * df$total
    } else i
  share<- data.frame(share)
}

> share
1  2019-01-01
2  2019-01-02
3  2019-01-03

How do I adjust this loop so that share returns a dataframe containing date, the raw values of var1 and var2, and total?
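
For reference, here is a minimal sketch of one way the loop might be adjusted. It is not a definitive answer. It assumes that total holds the raw row total rather than the normalised 1 shown in the printout above, and the small example data and the name num_cols are made up for illustration. The idea is to loop over column names instead of column values, and to assign into a copy of df so that each pass updates one column rather than replacing share as a whole:

```r
# Hypothetical example data: shares plus a raw (non-normalised) total.
df <- data.frame(
  date       = seq.Date(as.Date("2019-01-01"), as.Date("2019-01-03"), by = "day"),
  var1_share = c(0.25, 0.50, 0.75),
  var2_share = c(0.75, 0.50, 0.25),
  total      = c(8, 10, 12)
)

# Start from a copy so that date and total carry over unchanged.
share <- df

# Names of the numeric columns, leaving out total itself
# (is.numeric() is FALSE for Date columns, so date is skipped).
num_cols <- setdiff(names(df)[sapply(df, is.numeric)], "total")

# Loop over column names and write each result back into share.
for (col in num_cols) {
  share[[col]] <- df[[col]] * df$total
}

share
```

The columns keep their *_share names here even though they now hold raw values, so they may need renaming afterwards. Under the same assumptions, the loop could also be replaced by the single vectorised line share[num_cols] <- df[num_cols] * df$total.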
