Wednesday, 16 June 2021

string is not being pulled correctly using my ifelse statement

I am trying to create a new column in a data frame based on partial string matching against another column, essentially what was done in this question: "Create new column in dataframe based on partial string matching other column". However, my strings are not being matched.

#To show you how I built the dataframe in case this is causing my issue

library(dplyr)

#Load data file

LIT_raw <- read.csv("BIASK_Terr_Data Extraction_Master JJT_EBCleaning-2021-06-10.csv", header = TRUE, stringsAsFactors = FALSE)

#check import

head(LIT_raw)
names(LIT_raw)

#make my dataframe
#subset for "Focus", which is whether the study was research and monitoring or management, and then other relevant columns

LITsub <- LIT_raw[58:135, (names(LIT_raw) %in% c("Year", "Citation", "Focus", "StudyArchetype1", "StudyArchetype2", "Design_IK_WS_roles1", "Design_IK_WS_roles2", "Implementa_IK_roles1", "Implementa_IK_roles2", "Implementa_WS_roles1", "Implementa_WS_roles2", "Analysis_IK_WS_roles1", "Analysis_IK_WS_roles2", "RepDesc_IK_WS_roles1", "RepDesc_IK_WS_roles2"))]

length(LITsub)
names(LITsub)
LITsub$Focus

#[1] "Year" "Citation" "Focus" "Design_IK_WS_roles1"
#[5] "Design_IK_WS_roles2" "Implementa_IK_roles1" "Implementa_IK_roles2" "Implementa_WS_roles1"
#[9] "Implementa_WS_roles2" "Analysis_IK_WS_roles1" "Analysis_IK_WS_roles2" "RepDesc_IK_WS_roles1"
#[13] "RepDesc_IK_WS_roles2" "StudyArchetype1" "StudyArchetype2"

#make new column called Design_IK_WS_comb (and insert this new column after Design_IK_WS_roles2), combine Design_IK_WS_roles1 Design_IK_WS_roles2 with commas between entries and put the result in Design_IK_WS_comb

LITsub$Design_IK_WS_comb = paste(LITsub$Design_IK_WS_roles1, LITsub$Design_IK_WS_roles2, sep=",")

#make new column called Design_ifhowbraid
#If column Design_IK_WS_comb contains one of the phrases "IK and WS used to assess the same things", "IK and WS used to assess different things" or "No weaving practices - only IK or WS engaged in analysis", copy that phrase into Design_ifhowbraid
#If column Design_IK_WS_comb contains none of these phrases, put "CodeManually" in Design_ifhowbraid

LITsub$Design_ifhowbraid <- ifelse(grepl("IK and WS used to assess the same things", LITsub$Design_IK_WS_comb, ignore.case = TRUE), "IK and WS used to assess the same things",
  ifelse(grepl("IK and WS used to assess different things", LITsub$Design_IK_WS_comb, ignore.case = TRUE), "IK and WS used to assess different things",
  ifelse(grepl("No weaving practices - only IK or WS engaged in analysis", LITsub$Design_IK_WS_comb, ignore.case = TRUE), "No weaving practices - only IK or WS engaged in analysis", "CodeManually")))
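
To check the construct in isolation, here is a minimal sketch of the same nested ifelse()/grepl() logic on a small vector. The values in `comb` are invented for illustration and are not taken from the real data file; the point is only that the construct returns the phrase whenever that exact wording is present.

```r
# Hypothetical toy values standing in for LITsub$Design_IK_WS_comb.
comb <- c(
  "Multiple,IK and WS used to assess the same things",
  "ik and ws used to assess different things,NA",
  "Something else entirely"
)

# Same nested ifelse()/grepl() structure as in the question.
result <- ifelse(grepl("IK and WS used to assess the same things", comb, ignore.case = TRUE),
  "IK and WS used to assess the same things",
  ifelse(grepl("IK and WS used to assess different things", comb, ignore.case = TRUE),
    "IK and WS used to assess different things",
    ifelse(grepl("No weaving practices - only IK or WS engaged in analysis", comb, ignore.case = TRUE),
      "No weaving practices - only IK or WS engaged in analysis",
      "CodeManually")))

print(result)
```

If this toy case behaves as expected, the likelier culprit is the wording of the phrases in the data rather than the ifelse() nesting.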

#The code completes, but then when I go in to check

LITsub$Design_ifhowbraid

#All of the rows say "CodeManually" even though there are exact matches in Design_IK_WS_comb. For example, if I execute

LITsub$Design_IK_WS_comb

#One of the rows that comes up is:
#[65] "Multiple,Use IK as local scale expertise, IK used in formulating research questions and hypotheses, Western science informing IK methods, IK and WS used for different things ".
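
A quick check on the row quoted above, assuming it is representative of the data: the row says "used for different things", while the search phrase says "used to assess different things", so none of the three phrases appears in it literally. The sketch below tests each phrase against that one string; `row65`, `patterns`, `hits` and `looser` are names made up for this check.

```r
# Row 65 exactly as printed in the question (note the trailing space).
row65 <- "Multiple,Use IK as local scale expertise, IK used in formulating research questions and hypotheses, Western science informing IK methods, IK and WS used for different things "

# The three phrases searched for in the nested ifelse() call.
patterns <- c(
  "IK and WS used to assess the same things",
  "IK and WS used to assess different things",
  "No weaving practices - only IK or WS engaged in analysis"
)

# Literal, case-insensitive test: lower-case both sides and use fixed = TRUE
# so that no character in the phrase is interpreted as a regular expression.
hits <- vapply(
  patterns,
  function(p) grepl(tolower(p), tolower(row65), fixed = TRUE),
  logical(1)
)
print(hits)

# The wording that actually occurs in this row.
looser <- grepl("IK and WS used for different things", row65, fixed = TRUE)
print(looser)
```

If the other rows differ from the search phrases in the same way, every row would fall through to "CodeManually", which would match what I am seeing.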

#Also, if I want to rename the dataframe now that it has all these new columns added, how would I do that?
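
For reference, a minimal sketch of the two things "rename" could mean here, using base R only. The small data frame stands in for the real LITsub, and `LITbraid` and `Design_braiding` are made-up names used purely for illustration.

```r
# Hypothetical stand-in for LITsub with one of the new columns.
LITsub <- data.frame(
  Year = c(2019, 2020),
  Design_ifhowbraid = c("CodeManually", "CodeManually"),
  stringsAsFactors = FALSE
)

# Reading 1: give the whole data frame a new name by assigning it to a new object.
LITbraid <- LITsub

# Reading 2: rename a single column by matching on its current name.
names(LITbraid)[names(LITbraid) == "Design_ifhowbraid"] <- "Design_braiding"

names(LITbraid)
```

The original LITsub object is left unchanged by either step, so it can be removed with rm(LITsub) once the copy is confirmed to be correct.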

With thanks,
