Tuesday, March 7, 2017

Extract characters and rename columns in R

I'm an R newbie. I exported data from a database and am trying to rename the columns.

Example existing names (one site per water quality parameter) are in quotes below. There are 6 possible parameters at each site and 40 sites; I would like to rename the columns based on parameter and site. Site names are 3-7 characters long and always occur after the last period in the column name. My dataset has 240 columns and 47,714 rows (rows are time stamps of hourly continuous data). I want to be able to reuse the code for other exports from this database that have the same format and parameters, but possibly different sites.

For example (existing name <--- desired name):

  1. "Water.Temp.Water.Temp.BUBU" | "Water.Temp.Temperature.BUBU" <--- Temp.BUBU
  2. "Water.Temp.Field.Visits.KNF_DUP" <--- FVTemp.KNF_DUP
  3. "Sp.Cond.TempCorrected_nodrift.LOD_DUP" <--- SpCnd.LOD_DUP
  4. "Sp.Cond.TempCorrected.PFM" <--- SpC.PFM
  5. "Sp.Cond.Field.Visits.CC7" <--- FVSpC.CC7
  6. "Cond.Conductivity.TM02Dup" <--- Cond.TM02Dup
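Since the site name always follows the last period, one possible way to pull it out in base R is a greedy regular expression with sub(). This is only a sketch using some of the example names above; the variable names old_names and site are made up for illustration.

```r
# Example names taken from the list above.
old_names <- c(
  "Water.Temp.Water.Temp.BUBU",
  "Water.Temp.Field.Visits.KNF_DUP",
  "Sp.Cond.TempCorrected_nodrift.LOD_DUP",
  "Cond.Conductivity.TM02Dup"
)

# The greedy ".*" runs up to the last literal period, so sub() deletes
# everything up to and including it and leaves only the site name.
site <- sub(".*\\.", "", old_names)
site
```

Because the pattern is anchored on the last period rather than on a character count, it should not matter whether the site name is 3 or 7 characters long.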

I cannot figure out how to write a contains() test in an if statement, or how to extract the site name from a string that has multiple periods when the number of characters to take from the end differs between column names. I am also wondering whether a for loop over colnames() is the best solution.

# grepl(..., fixed = TRUE) stands in for "contains"; sub() takes the site name
for (i in seq_along(colnames(AQexport1))) {
  old  <- colnames(AQexport1)[i]
  site <- sub(".*\\.", "", old)  # text after the last period
  if (grepl("Water.Temp.W", old, fixed = TRUE) || grepl("Water.Temp.T", old, fixed = TRUE)) {
    colnames(AQexport1)[i] <- paste0("Temp.", site)
  } else if (grepl("Water.Temp.F", old, fixed = TRUE)) {
    colnames(AQexport1)[i] <- paste0("FVTemp.", site)
  } else if (grepl("nodrift", old, fixed = TRUE)) {
    colnames(AQexport1)[i] <- paste0("SpCnd.", site)
  } else if (grepl("Sp.Cond.T", old, fixed = TRUE)) {
    colnames(AQexport1)[i] <- paste0("SpC.", site)
  }
  # continue else if statements
}
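As for whether a loop over colnames() is the best approach: one alternative is to treat the column names as a single character vector and apply an ordered table of "contains" rules to all of them at once. The sketch below assumes base R only; the function name rename_wq_columns and the exact rule patterns are hypothetical, chosen to match the six example names in the question, and would need adjusting for a real export.

```r
# Hypothetical helper: maps export column names to "<prefix>.<site>" names.
rename_wq_columns <- function(old_names) {
  # Rules are checked in order and the first match wins, so "nodrift"
  # must come before the more general "Sp.Cond.TempCorrected".
  rules <- c(
    "Water.Temp.Water.Temp"  = "Temp",
    "Water.Temp.Temperature" = "Temp",
    "Water.Temp.Field"       = "FVTemp",
    "nodrift"                = "SpCnd",
    "Sp.Cond.TempCorrected"  = "SpC",
    "Sp.Cond.Field"          = "FVSpC",
    "Cond.Conductivity"      = "Cond"
  )
  site <- sub(".*\\.", "", old_names)  # text after the last period
  prefix <- rep(NA_character_, length(old_names))
  for (pattern in names(rules)) {
    # fixed = TRUE treats the pattern as a literal substring ("contains").
    hit <- is.na(prefix) & grepl(pattern, old_names, fixed = TRUE)
    prefix[hit] <- rules[[pattern]]
  }
  # Columns that match no rule keep their original name.
  ifelse(is.na(prefix), old_names, paste(prefix, site, sep = "."))
}

example <- c("Water.Temp.Temperature.BUBU",
             "Sp.Cond.TempCorrected_nodrift.LOD_DUP",
             "Sp.Cond.TempCorrected.PFM",
             "Sp.Cond.Field.Visits.CC7",
             "Date.Time")
new_names <- rename_wq_columns(example)
new_names

# Applied to the data frame from the question, this would be:
# colnames(AQexport1) <- rename_wq_columns(colnames(AQexport1))
```

The loop here runs over the handful of rules rather than over the 240 columns, and nothing in it depends on which sites are present, so the same function should carry over to other exports with the same naming format. One caveat: if an export contained both "Water.Temp.Water.Temp.X" and "Water.Temp.Temperature.X" for the same site, both would map to "Temp.X" and the names would collide.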
