Friday, 8 July 2016

Renaming and adding variables to a list of datasets in R

I've got a list of datasets, and I want to make a few changes to these datasets using R.

First, if variable "mac_sector" exists, I want to rename it to "sector".

Second, if there is no variable called "mac_sector" or "sector", I want to create a new column called "sector" with the value "total" in every row.

Lastly, I want to rearrange the columns so that "sector" is the 3rd column in each dataset.
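
The three steps above can be sketched as a single base R function applied to one data frame. This is a minimal sketch, not a definitive implementation: the function name standardise_sector and the toy data frames are hypothetical, and it assumes a dataset never contains both "mac_sector" and "sector" at the same time.

```r
# Hypothetical helper: apply the three steps to one data frame.
standardise_sector <- function(df) {
  # Step 1: rename mac_sector to sector if it is present
  if ("mac_sector" %in% names(df)) {
    names(df)[names(df) == "mac_sector"] <- "sector"
  }
  # Step 2: if there is still no sector column, fill one with "total"
  if (!"sector" %in% names(df)) {
    df$sector <- rep("total", nrow(df))
  }
  # Step 3: move sector to position 3 (or last, if there are fewer than 3 columns)
  others <- setdiff(names(df), "sector")
  df[, append(others, "sector", after = min(2, length(others))), drop = FALSE]
}

# Toy data (made up for illustration): one with mac_sector, one without
a <- data.frame(id = 1:2, year = c(2015, 2016), value = c(10, 20),
                mac_sector = c("x", "y"))
b <- data.frame(id = 1:2, year = c(2015, 2016), value = c(5, 6))

names(standardise_sector(a))
names(standardise_sector(b))
```

Keeping the logic in a function that takes and returns a data frame means it can be tried on small examples like these before it is run over real files.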

I wrote the script below, but I'm not confident in it (parts of my first draft were not even valid R), so I'm hoping that some of you may be able to help me with this.

I also want to save these changes back to the respective datasets, but I have no idea how to go about that in this particular case. (I know of the save() command, but I feel like it wouldn't work here.)
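
One possible way to save the changes is to write each modified data frame back to the file it was read from. The sketch below shows that read, modify, overwrite pattern. The names update_files, rename_sector and demo_file are hypothetical. The default reader and writer assume the readstata13 package, which to the best of my knowledge provides save.dta13() alongside read.dta13(); the demo swaps in readRDS() and saveRDS() so that it runs without Stata files or extra packages.

```r
# Hypothetical helper: read each file, apply `transform`, write the result back.
# The defaults assume the readstata13 package; they are only evaluated if used.
update_files <- function(files, transform,
                         read_fn  = function(f) readstata13::read.dta13(f, nonint.factors = TRUE),
                         write_fn = function(df, f) readstata13::save.dta13(df, file = f)) {
  for (f in files) {
    df <- read_fn(f)
    df <- transform(df)
    write_fn(df, f)  # overwrites the original file
  }
  invisible(files)
}

# Demo on a temporary .rds file (made-up data)
demo_dir <- file.path(tempdir(), "sector_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_file <- file.path(demo_dir, "a.rds")
saveRDS(data.frame(id = 1:2, mac_sector = c("x", "y")), demo_file)

# A stand-in transform that only does the rename step
rename_sector <- function(df) {
  names(df)[names(df) == "mac_sector"] <- "sector"
  df
}

update_files(demo_file, rename_sector, read_fn = readRDS, write_fn = saveRDS)
names(readRDS(demo_file))
```

For the .dta files, the call would presumably be update_files(mylist, some_transform) with the defaults left in place. Because this overwrites the originals, it is worth keeping a backup copy of the folder first.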

    library(readstata13)  # provides read.dta13()

    setwd("C:\\Users\\files")
    # list.files() takes a regular expression, not a glob
    mylist <- list.files(pattern = "\\.dta$")

    # Loop through all of the datasets in C:\\Users\\files
    for (f in mylist) {

      # Read the dataset into R
      df <- read.dta13(f, nonint.factors = TRUE)

      # If column mac_sector exists, rename it to sector
      if ("mac_sector" %in% names(df)) {
        names(df)[names(df) == "mac_sector"] <- "sector"
      }

      # If there is still no column called sector, create it with the value "total"
      if (!"sector" %in% names(df)) {
        df$sector <- rep("total", nrow(df))
      }

      # Rearrange the columns so that sector is placed 3rd
      others <- setdiff(names(df), "sector")
      df <- df[, append(others, "sector", after = 2), drop = FALSE]
    }
