Thursday, 9 July 2020

for/if loop to add a factor column based on several criteria assigns only the last cluster

I have a dataframe like this:

> str(All_intron_IRratio)
'data.frame':   210845 obs. of  10 variables:
 $ coordinates  : Factor w/ 210830 levels "chr1:10000137-10003234",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Strand       : Factor w/ 2 levels "-","+": 1 1 1 2 2 2 2 2 2 2 ...
 $ Name         : Factor w/ 21376 levels "0610005C13Rik/ENSMUSG00000085214/clean",..: 14522 14522 14522 4326 4326 4326 4326 4326 4326 4326 ...
 $ E11_IRratio  : num  0.017 0.01993 0.02159 0.05614 0.00981 ...
 $ E14_IRratio  : num  0.01033 0.01355 0.04758 0.01527 0.00927 ...
 $ E18_IRratio  : num  0.00368 0.00736 0.00989 0.02 0.01283 ...
 $ Adult_IRratio: num  0.0334 0.0238 0.0224 0.0127 0.0065 ...
 $ E14vsE11     : num  -0.006673 -0.006384 0.02599 -0.040872 -0.000546 ...
 $ E18vsE14     : num  -0.00664 -0.00619 -0.03769 0.00473 0.00356 ...
 $ AdultvsE18   : num  0.02969 0.01648 0.01249 -0.00732 -0.00633 ...

I want to add a factor column that clusters the rows based on the values of E14vsE11, E18vsE14 and AdultvsE18. For that I have written several filters:

All_intron_IRratio <- read.csv("All_intron_IRratio.csv", stringsAsFactors = FALSE)
filter1 <- All_intron_IRratio$E14vsE11 > 0.005 & All_intron_IRratio$E18vsE14 >0.005 & All_intron_IRratio$AdultvsE18 >0.005
filter2 <- All_intron_IRratio$E14vsE11 > 0.005 & All_intron_IRratio$E18vsE14 >0.005 & All_intron_IRratio$AdultvsE18 < (-0.005)
filter3 <- All_intron_IRratio$E14vsE11 < (-0.005) & All_intron_IRratio$E18vsE14 < (-0.005) & All_intron_IRratio$AdultvsE18 < (-0.005)
filter4 <- All_intron_IRratio$E14vsE11 < (-0.005) & All_intron_IRratio$E18vsE14 < (-0.005) & All_intron_IRratio$AdultvsE18 > 0.005
filter5 <- abs(All_intron_IRratio$E14vsE11) <= 0.005 & abs(All_intron_IRratio$E18vsE14) <= 0.005 & abs(All_intron_IRratio$AdultvsE18) <= 0.005
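
Each of these filters is a logical vector with one element per row, not a single TRUE/FALSE value. A minimal sketch of that on a tiny made-up data frame (the name `toy` and its values are hypothetical and not taken from my file):

```r
# Hypothetical three-row stand-in for All_intron_IRratio
toy <- data.frame(
  E14vsE11   = c(0.010, -0.010, 0.001),
  E18vsE14   = c(0.020, -0.020, 0.002),
  AdultvsE18 = c(0.030, -0.030, 0.003)
)

# Same shape as filter1: an element-wise comparison, one result per row
toy_filter1 <- toy$E14vsE11 > 0.005 & toy$E18vsE14 > 0.005 & toy$AdultvsE18 > 0.005

length(toy_filter1)  # one element per row
sum(toy_filter1)     # number of rows that satisfy the filter
isTRUE(toy_filter1)  # isTRUE() is only TRUE for a single TRUE value
```

Running `sum()` on each real filter is a quick way to check how many rows each cluster should contain.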


Then I used a loop to assign the clusters. I did not get any error, but in the output file every row has the same value, the last one (cluster6 in my output). I have checked my original file: if I select the rows manually, I should get several different clusters.

for (i in (1:nrow(All_intron_IRratio))){
  if (isTRUE(filter1)){
    All_intron_IRratio$cluster <- "cluster1"
  } else if (isTRUE(filter2)){ 
    All_intron_IRratio$cluster <- "cluster2"
  } else if (isTRUE(filter3)){
    All_intron_IRratio$cluster <- "cluster3"
  } else if (isTRUE(filter4)){
    All_intron_IRratio$cluster <- "cluster4"
  } else if (isTRUE(filter5)){
    All_intron_IRratio$cluster <- "cluster5"
  } else{
    All_intron_IRratio$cluster <- "cluster6"
  }
}
write.csv(All_intron_IRratio, file = "All_intron_IRratio_clustered.csv")
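
For comparison, here is a minimal sketch of a vectorized assignment without a loop. In the loop above, `i` is never used, `isTRUE()` returns FALSE for any logical vector longer than one element, and each assignment overwrites the whole `cluster` column, so every pass ends in the `else` branch. The sketch assumes the goal is one label per row; the `toy` data frame, the `thr` variable and the `f1` to `f5` names are hypothetical and only for illustration.

```r
# Hypothetical toy data; each row is chosen to match a different rule
toy <- data.frame(
  E14vsE11   = c(0.010, -0.010, 0.001,  0.010),
  E18vsE14   = c(0.020, -0.020, 0.002, -0.020),
  AdultvsE18 = c(0.030, -0.030, 0.003,  0.000)
)
thr <- 0.005  # same threshold as in the filters

# One logical vector per cluster, evaluated for all rows at once
f1 <- with(toy, E14vsE11 > thr & E18vsE14 > thr & AdultvsE18 > thr)
f2 <- with(toy, E14vsE11 > thr & E18vsE14 > thr & AdultvsE18 < (-thr))
f3 <- with(toy, E14vsE11 < (-thr) & E18vsE14 < (-thr) & AdultvsE18 < (-thr))
f4 <- with(toy, E14vsE11 < (-thr) & E18vsE14 < (-thr) & AdultvsE18 > thr)
f5 <- with(toy, abs(E14vsE11) <= thr & abs(E18vsE14) <= thr & abs(AdultvsE18) <= thr)

# Start with the default label, then overwrite only the matching rows
toy$cluster <- "cluster6"
toy$cluster[f1] <- "cluster1"
toy$cluster[f2] <- "cluster2"
toy$cluster[f3] <- "cluster3"
toy$cluster[f4] <- "cluster4"
toy$cluster[f5] <- "cluster5"
toy$cluster <- factor(toy$cluster)

table(toy$cluster)  # rows per cluster
```

Logical indexing such as `toy$cluster[f1] <- "cluster1"` changes only the rows where the filter is TRUE, which is what the `if`/`else` chain was meant to do row by row.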

Where have I made the mistake?

Thanks a lot,
