I'm trying to build a matrix that I can eventually use to compute inter-rater reliability. I want to populate it with TRUE/FALSE (or 1/0) depending on whether a given string is present in the row with the matching ID in a second or third matrix. I've included what the result should look like at the bottom.
Below is my reproducible example, including the two existing matrices and what I've tried so far to produce the final matrix. I've gotten as far as confirming that I'm selecting the correct columns for a given source matrix (see the output with "m1" in every column that has "m1" in its name). What I haven't figured out is the next step: matching the id column between m1.mat and the final matrix, reliability.ex. In Excel this would be something like a VLOOKUP, but when I search for VLOOKUP equivalents in R I only find join/merge functions, which I don't think will work for what I need, though I may be wrong. I tried doing all of this in Excel but got stuck, and I would rather have it in R anyway.
library(stringr)  # loaded in my attempt; not used in the code shown here
set.seed(327)
ids <- sample(1:1000, 5)
m.cols <- c("id", "IP1", "IP2", "IP3", "IP4", "IP5")
m1.mat <- matrix(data=NA, nrow=5, ncol=6)
colnames(m1.mat) <- m.cols
m1.mat[1,] <- c(ids[1], "abc", "ghi", NA, NA, NA)
m1.mat[2,] <- c(ids[2], "def", NA, NA, NA, NA)
m1.mat[3,] <- c(ids[3], "mno", "jkl", NA, NA, NA)
m1.mat[4,] <- c(ids[4], "ghi", "abc", NA, NA, NA)
m1.mat[5,] <- c(ids[5], "abc", "def", "ghi", "jkl", "mno")
m2.mat <- matrix(data=NA, nrow=5, ncol=6)
colnames(m2.mat) <- m.cols
m2.mat[1,] <- c(ids[1], "def", "ghi", NA, NA, NA)
m2.mat[2,] <- c(ids[2], "def", "mno", NA, NA, NA)
m2.mat[3,] <- c(ids[3], "mno", "jkl", "abc", NA, NA)
m2.mat[4,] <- c(ids[4], "ghi", "abc", NA, NA, NA)
m2.mat[5,] <- c(ids[5], "abc", "def", "ghi", "jkl", "mno")
reliability.ex <- matrix(data=NA, nrow=5, ncol=11)
ex.cols <- c("id", "abc_m1", "abc_m2", "def_m1", "def_m2", "ghi_m1", "ghi_m2", "jkl_m1", "jkl_m2", "mno_m1", "mno_m2")
colnames(reliability.ex) <- ex.cols
reliability.ex[,1] <- ids
# Flag the columns whose names contain "m1"
ip.indx <- grepl('m1', colnames(reliability.ex))
# Placeholder fill: write "m1" into every m1 column to confirm the column selection
for (i in 1:nrow(reliability.ex)) {
  for (j in 1:ncol(reliability.ex)) {
    if (ip.indx[j]) {
      reliability.ex[i, j] <- "m1"
    }
  }
}
Below are the matrices based on the above code:
> m1.mat
     id    IP1   IP2   IP3   IP4   IP5
[1,] "345" "abc" "ghi" NA    NA    NA
[2,] "615" "def" NA    NA    NA    NA
[3,] "478" "mno" "jkl" NA    NA    NA
[4,] "792" "ghi" "abc" NA    NA    NA
[5,] "881" "abc" "def" "ghi" "jkl" "mno"
> m2.mat
     id    IP1   IP2   IP3   IP4   IP5
[1,] "345" "def" "ghi" NA    NA    NA
[2,] "615" "def" "mno" NA    NA    NA
[3,] "478" "mno" "jkl" "abc" NA    NA
[4,] "792" "ghi" "abc" NA    NA    NA
[5,] "881" "abc" "def" "ghi" "jkl" "mno"
> reliability.ex
     id    abc_m1 abc_m2 def_m1 def_m2 ghi_m1 ghi_m2 jkl_m1 jkl_m2 mno_m1 mno_m2
[1,] "345" "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA
[2,] "615" "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA
[3,] "478" "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA
[4,] "792" "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA
[5,] "881" "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA     "m1"   NA
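For the VLOOKUP-style step, one base R option that might fit is match(), which returns the position of each value in a lookup vector. The sketch below uses toy data; the names lookup, wanted.ids and row.pos are made up for illustration and are not part of the example above.

```r
# Toy lookup table (hypothetical values): ids stored in a different order
# than the ids we want to look up.
lookup <- matrix(c("792", "345", "615",
                   "ghi", "abc", "def"),
                 ncol = 2,
                 dimnames = list(NULL, c("id", "IP1")))
wanted.ids <- c("345", "615", "792")

# For each wanted id, match() gives the row of lookup holding that id
# (NA if the id is absent), much like VLOOKUP finds a row by key.
row.pos <- match(wanted.ids, lookup[, "id"])

# Use those positions to pull the values in the wanted order.
looked.up <- lookup[row.pos, "IP1"]
print(row.pos)
print(looked.up)
```

Because match() works on positions, it does not require the two matrices to list the ids in the same order.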
And this is what I want to be able to produce instead of what is currently named reliability.ex:
> reliability.desired
     id    abc_m1 abc_m2 def_m1 def_m2 ghi_m1 ghi_m2 jkl_m1 jkl_m2 mno_m1 mno_m2
[1,] "345" "1"    "0"    "0"    "1"    "1"    "1"    "0"    "0"    "0"    "0"
[2,] "615" "0"    "0"    "1"    "1"    "0"    "0"    "0"    "0"    "0"    "1"
[3,] "478" "0"    "1"    "0"    "0"    "0"    "0"    "1"    "1"    "1"    "1"
[4,] "792" "1"    "1"    "0"    "0"    "1"    "1"    "0"    "0"    "0"    "0"
[5,] "881" "1"    "1"    "1"    "1"    "1"    "1"    "1"    "1"    "1"    "1"
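A minimal sketch of one way this table could be produced in base R is below. It is only an illustration under some assumptions: the ids are hard-coded from the printed output rather than sampled, and the names raters, has.code and reliability.new are hypothetical helpers, not part of the attempt above.

```r
# Rebuild the two rater matrices with the ids shown in the printed output.
m.cols <- c("id", "IP1", "IP2", "IP3", "IP4", "IP5")
m1.mat <- rbind(c("345", "abc", "ghi", NA, NA, NA),
                c("615", "def", NA, NA, NA, NA),
                c("478", "mno", "jkl", NA, NA, NA),
                c("792", "ghi", "abc", NA, NA, NA),
                c("881", "abc", "def", "ghi", "jkl", "mno"))
m2.mat <- rbind(c("345", "def", "ghi", NA, NA, NA),
                c("615", "def", "mno", NA, NA, NA),
                c("478", "mno", "jkl", "abc", NA, NA),
                c("792", "ghi", "abc", NA, NA, NA),
                c("881", "abc", "def", "ghi", "jkl", "mno"))
colnames(m1.mat) <- colnames(m2.mat) <- m.cols

codes  <- c("abc", "def", "ghi", "jkl", "mno")
raters <- list(m1 = m1.mat, m2 = m2.mat)  # hypothetical container for the sources
ids    <- m1.mat[, "id"]

# has.code() (hypothetical helper): for each id, 1 if `code` appears in any
# IP column of that id's row in `mat`, otherwise 0.
has.code <- function(mat, code, ids) {
  rows <- match(ids, mat[, "id"])        # line rows up by id, not by position
  ip   <- mat[rows, -1, drop = FALSE]    # drop the id column, keep IP1..IP5
  as.integer(rowSums(ip == code, na.rm = TRUE) > 0)
}

# Column names follow the same code_rater pattern as ex.cols.
ex.cols <- c("id", paste(rep(codes, each = 2), names(raters), sep = "_"))
reliability.new <- matrix(NA_character_, nrow = length(ids),
                          ncol = length(ex.cols),
                          dimnames = list(NULL, ex.cols))
reliability.new[, "id"] <- ids

# Fill one column per (code, rater) pair; values are stored as "1"/"0"
# because the matrix is character (it also holds the ids).
for (code in codes) {
  for (r in names(raters)) {
    reliability.new[, paste(code, r, sep = "_")] <- has.code(raters[[r]], code, ids)
  }
}
print(reliability.new)
```

Keeping the ids in a separate vector and the indicators in a numeric matrix might be more convenient for the reliability step itself, since a character matrix has to be converted before any arithmetic.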
Any help is appreciated! I'm still figuring out R.