I am trying to create a new column in a data frame based on partial string matching against another column, essentially what was done in "Create new column in dataframe based on partial string matching other column". However, my strings are clearly not matching correctly.
#To show you how I built the dataframe in case this is causing my issue
library(dplyr)
#Load data file
LIT_raw <- read.csv("BIASK_Terr_Data Extraction_Master JJT_EBCleaning-2021-06-10.csv", header = TRUE, stringsAsFactors = FALSE)
#check import
head(LIT_raw)
names(LIT_raw)
##make my dataframe
#subset for "Focus" (whether the study was research and monitoring, or management) plus the other relevant columns
LITsub <- LIT_raw[58:135, (names(LIT_raw) %in% c("Year", "Citation", "Focus", "StudyArchetype1", "StudyArchetype2", "Design_IK_WS_roles1", "Design_IK_WS_roles2", "Implementa_IK_roles1", "Implementa_IK_roles2", "Implementa_WS_roles1", "Implementa_WS_roles2", "Analysis_IK_WS_roles1", "Analysis_IK_WS_roles2", "RepDesc_IK_WS_roles1", "RepDesc_IK_WS_roles2"))]
length(LITsub)
names(LITsub)
LITsub$Focus
#[1] "Year" "Citation" "Focus" "Design_IK_WS_roles1"
#[5] "Design_IK_WS_roles2" "Implementa_IK_roles1" "Implementa_IK_roles2" "Implementa_WS_roles1"
#[9] "Implementa_WS_roles2" "Analysis_IK_WS_roles1" "Analysis_IK_WS_roles2" "RepDesc_IK_WS_roles1"
#[13] "RepDesc_IK_WS_roles2" "StudyArchetype1" "StudyArchetype2"
#make new column called Design_IK_WS_comb (and insert this new column after Design_IK_WS_roles2), combine Design_IK_WS_roles1 Design_IK_WS_roles2 with commas between entries and put the result in Design_IK_WS_comb
LITsub$Design_IK_WS_comb = paste(LITsub$Design_IK_WS_roles1, LITsub$Design_IK_WS_roles2, sep=",")
#make new column called Design_ifhowbraid
#if Design_IK_WS_comb contains "IK and WS used to assess the same things", "IK and WS used to assess different things" or "No weaving practices - only IK or WS engaged in analysis", copy that phrase into Design_ifhowbraid
#if Design_IK_WS_comb contains none of these phrases, put "CodeManually" in Design_ifhowbraid
LITsub$Design_ifhowbraid <- ifelse(grepl("IK and WS used to assess the same things", LITsub$Design_IK_WS_comb, ignore.case = TRUE), "IK and WS used to assess the same things",
  ifelse(grepl("IK and WS used to assess different things", LITsub$Design_IK_WS_comb, ignore.case = TRUE), "IK and WS used to assess different things",
    ifelse(grepl("No weaving practices - only IK or WS engaged in analysis", LITsub$Design_IK_WS_comb, ignore.case = TRUE), "No weaving practices - only IK or WS engaged in analysis", "CodeManually")))
#The code completes, but then when I go in to check
LITsub$Design_ifhowbraid
#All of the rows say "CodeManually" even though there are exact matches in Design_IK_WS_comb. For example, if I execute
LITsub$Design_IK_WS_comb
#One of the rows/results that comes up is
#[65] "Multiple,Use IK as local scale expertise, IK used in formulating research questions and hypotheses, Western science informing IK methods, IK and WS used for different things ".
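One thing worth checking in the example row above: it contains "IK and WS used for different things", while the pattern passed to grepl() is "IK and WS used to assess different things". The two differ ("used for" versus "used to assess"), so there is no exact substring for grepl() to find. The following is a minimal sketch of that comparison on toy data. The vector `comb` and the looser regular expressions are assumptions for illustration, not the real data or a confirmed fix.

```r
# Toy stand-in for LITsub$Design_IK_WS_comb (hypothetical values)
comb <- c(
  "Multiple,IK and WS used for different things ",
  "IK and WS used to assess the same things,NA",
  "Other,NA"
)

# The original pattern is not a substring of the first element
strict <- grepl("IK and WS used to assess different things", comb, ignore.case = TRUE)

# Looser patterns that tolerate either wording between "used" and the key phrase
same <- grepl("IK and WS used .*same things", comb, ignore.case = TRUE)
different <- grepl("IK and WS used .*different things", comb, ignore.case = TRUE)

# Same nested ifelse() structure as in the question, driven by the looser matches
braid <- ifelse(same, "IK and WS used to assess the same things",
  ifelse(different, "IK and WS used to assess different things", "CodeManually"))

print(strict)
print(braid)
```

Running unique() on the real Design_IK_WS_comb column would show the exact wording present in the data, which is what the patterns need to match.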
#Also, how would I rename the dataframe now that all these new columns have been added to it?
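For the renaming question, a minimal base R sketch on a toy data frame follows. The names `LIT_coded` and `Design_braiding` are hypothetical, chosen only for illustration. Since it is not clear whether the data frame itself or one of its columns should be renamed, both are shown.

```r
# Toy data frame standing in for LITsub (hypothetical values)
LITsub <- data.frame(
  Focus = c("a", "b"),
  Design_ifhowbraid = c("x", "y"),
  stringsAsFactors = FALSE
)

# "Renaming" a data frame: assign it to a new name, then drop the old one
LIT_coded <- LITsub
rm(LITsub)

# Renaming a single column by looking up its current name
names(LIT_coded)[names(LIT_coded) == "Design_ifhowbraid"] <- "Design_braiding"

print(names(LIT_coded))
```

Because dplyr is already loaded in the script, `rename(LIT_coded, Design_braiding = Design_ifhowbraid)` would be an equivalent way to rename the column.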
With thanks,