Monday, 5 August 2019

Remove rows of matrix on multiple conditions

This problem is best addressed by example.

Setup

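# Build a 9 x 9 example; assigning letters makes the whole matrix character.
Mat1 <- matrix(nrow = 9, ncol = 9)
colnames(Mat1) <- c("Name", "Strategy.Assets", "Jan.94", "Jan.95", "Jan.96", "Jan.97", "1", "2", "3")
Mat1[, 1] <- letters[1:9]
Mat1[, 2] <- c(20, 30, 40, 50, 60, 30, 30, 40, 50)
Mat1[, 3:6] <- rnorm(36, 0, 1)  # random, so these values differ between runs
# Columns "1", "2", "3": first three columns of the correlation matrix
Mat1[, 7] <- c(0, 0, 0, 0, 0, 0, 0, 0, 0)
Mat1[, 8] <- c(0.95, 0.8, 0, 0, 0, 0, 0, 0, 0)
Mat1[, 9] <- c(0.95, 0.6, 0.7, 0, 0, 0, 0, 0, 0)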

Mat1



Treat columns "1", "2" and "3" as the beginning of the correlation matrix (it should be 9 by 9, but I am only showing the first 3 columns).

For each row, I need to check whether columns "1", "2" and "3" contain a value >= 0.95 and, if so, record its position. In this case the value 0.95 appears at entry m = 1, n = 2 of the correlation matrix. I then need to go to the column "Strategy.Assets" and compare the values for rows 1 and 2 (in this case 20 and 30). After this, I need to omit the row with the lower value (row 1, as 20 is less than 30).

I need to repeat this process for all rows in the correlation matrix.

Notice that entry (1,3) of the correlation matrix also equals 0.95. However, as row 1 has already been removed (in the first iteration), the loop does not need to process this entry.
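
One way to find the positions described above is `which()` with `arr.ind = TRUE`, which returns the row and column of every entry that meets a condition. The following is only a sketch: `corr` is a hypothetical numeric copy of the first three rows of columns "1", "2" and "3", not an object from the setup code.

```r
# Hypothetical numeric copy of rows 1-3 of columns "1", "2", "3" (filled column by column)
corr <- matrix(c(0,    0,   0,
                 0.95, 0.8, 0,
                 0.95, 0.6, 0.7),
               nrow = 3,
               dimnames = list(NULL, c("1", "2", "3")))

# One row per entry >= 0.95, with its row index and column index
hits <- which(corr >= 0.95, arr.ind = TRUE)
print(hits)
```

Each row of `hits` gives a pair (m, n) whose "Strategy.Assets" values would then be compared.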

> Mat1
      Name Strategy.Assets Jan.94               Jan.95               Jan.96                Jan.97               1   2      3     
 [1,] "a"  "20"            "2.73748468138839"   "0.324252987935552"  "0.298858667829153"   "0.491399365053435"  "0" "0.95" "0.95"
 [2,] "b"  "30"            "1.10842864788104"   "1.08246654235009"   "0.101101014064615"   "-0.027943739783141" "0" "0.8"  "0.6" 
 [3,] "c"  "40"            "-0.909300523946026" "0.165680975177448"  "0.369117390404421"   "-0.539831669474995" "0" "0"    "0.7" 
 [4,] "d"  "50"            "0.020300058103183"  "1.2487927105618"    "-0.262119117432464"  "-1.19709346846802"  "0" "0"    "0"   
 [5,] "e"  "60"            "-1.2741234771257"   "0.467062075091042"  "-1.84544534028566"   "0.737963009590861"  "0" "0"    "0"   
 [6,] "f"  "30"            "-0.109189282015503" "0.365438517062692"  "-0.687077248724174"  "-1.33503513711636"  "0" "0"    "0"   
 [7,] "g"  "30"            "-1.02922335962633"  "-0.338738643996438" "-0.243365754619073"  "-0.263558724170233" "0" "0"    "0"   
 [8,] "h"  "40"            "-0.666421298986536" "-1.32579626054673"  "-1.19934398000762"   "0.662649874793231"  "0" "0"    "0"   
 [9,] "i"  "50"            "2.49328945984711"   "-0.476787387353059" "-0.0349993434823028" "-0.906745892615347" "0" "0"    "0" 


Then my desired output is:

output <- Mat1[2:9,1:6]
> output
     Name Strategy.Assets Jan.94               Jan.95               Jan.96                Jan.97              
[1,] "b"  "30"            "1.10842864788104"   "1.08246654235009"   "0.101101014064615"   "-0.027943739783141"
[2,] "c"  "40"            "-0.909300523946026" "0.165680975177448"  "0.369117390404421"   "-0.539831669474995"
[3,] "d"  "50"            "0.020300058103183"  "1.2487927105618"    "-0.262119117432464"  "-1.19709346846802" 
[4,] "e"  "60"            "-1.2741234771257"   "0.467062075091042"  "-1.84544534028566"   "0.737963009590861" 
[5,] "f"  "30"            "-0.109189282015503" "0.365438517062692"  "-0.687077248724174"  "-1.33503513711636" 
[6,] "g"  "30"            "-1.02922335962633"  "-0.338738643996438" "-0.243365754619073"  "-0.263558724170233"
[7,] "h"  "40"            "-0.666421298986536" "-1.32579626054673"  "-1.19934398000762"   "0.662649874793231" 
[8,] "i"  "50"            "2.49328945984711"   "-0.476787387353059" "-0.0349993434823028" "-0.906745892615347"



I need to be able to apply this function to every matrix in a set. The matrices differ in size, but the final column before the correlation matrix is always "Jan.97".
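
A minimal sketch of such a function follows, under some assumptions that the question does not settle: the name `drop_correlated_rows` and its arguments are hypothetical, correlation column j is taken to refer to row j, pairs are processed in row order, and when the two "Strategy.Assets" values are equal the second row of the pair is dropped.

```r
# Rebuild the example matrix so the sketch is self-contained
set.seed(1)  # only to make the random columns repeatable in this sketch
Mat1 <- matrix(nrow = 9, ncol = 9)
colnames(Mat1) <- c("Name", "Strategy.Assets", "Jan.94", "Jan.95",
                    "Jan.96", "Jan.97", "1", "2", "3")
Mat1[, 1] <- letters[1:9]
Mat1[, 2] <- c(20, 30, 40, 50, 60, 30, 30, 40, 50)
Mat1[, 3:6] <- rnorm(36, 0, 1)
Mat1[, 7] <- rep(0, 9)
Mat1[, 8] <- c(0.95, 0.8, rep(0, 7))
Mat1[, 9] <- c(0.95, 0.6, 0.7, rep(0, 6))

# Hypothetical helper: drops the lower-asset row of each highly correlated pair
drop_correlated_rows <- function(mat, last_col = "Jan.97", threshold = 0.95) {
  k <- match(last_col, colnames(mat))      # position of the last data column
  stopifnot(!is.na(k), k < ncol(mat))
  corr <- mat[, (k + 1):ncol(mat), drop = FALSE]
  storage.mode(corr) <- "double"           # the matrix is stored as character
  assets <- as.numeric(mat[, "Strategy.Assets"])
  keep <- rep(TRUE, nrow(mat))

  hits <- which(corr >= threshold, arr.ind = TRUE)
  hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]
  for (h in seq_len(nrow(hits))) {
    i <- hits[h, "row"]
    j <- hits[h, "col"]
    # Skip the diagonal and any pair with a row that is already removed
    if (i == j || !keep[i] || !keep[j]) next
    # Assumption: on a tie, row j is the one dropped
    if (assets[i] < assets[j]) keep[i] <- FALSE else keep[j] <- FALSE
  }
  mat[keep, seq_len(k), drop = FALSE]      # also drops the correlation columns
}

# Apply to a set of matrices of different sizes
mats <- list(Mat1, Mat1[1:4, ])
results <- lapply(mats, drop_correlated_rows)
print(results[[1]][, 1:2])
```

For `Mat1`, the first result contains the same rows and columns as `Mat1[2:9, 1:6]`. Because the correlation block is located by searching for "Jan.97" rather than by a fixed column number, the same call should work for matrices of any size that follow this layout.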
