Tuesday, 27 July 2021

ifelse condition always evaluates to "true"

I think I have the same issue as the one posted in "R Programming: Condition giving always TRUE", but I do not know how to apply that answer to my situation.

The ifelse statement below checks whether REF matches major_allele (in either direction, i.e. REF contained in major_allele or major_allele contained in REF) and then does the same for ALT and minor_allele. If both match, I create three new columns. If either fails, a nested ifelse checks the other possibility (REF against minor_allele and ALT against major_allele, again in either direction), and the three new columns are filled differently.

The problem is that the first condition always evaluates to TRUE, as shown by the rows marked with asterisks. The answer at the link above says that ifelse always produces a vector, so I am guessing that it only looks at the first observation, evaluates to TRUE, and then applies that result to all rows. I would like it to go row by row instead. I also tried two separate if statements and got the same problem, except that the result then comes only from the second if statement (see the second example below; the last column shows the difference most clearly).

 for (i in files) {
   rsid.tmp <- read.table(i, header = TRUE, sep = " ", stringsAsFactors = FALSE, fill = TRUE)

   ifelse(
     ((sapply(lapply(rsid.tmp$REF, grepl, rsid.tmp$major_allele), any)) ||
        (sapply(lapply(rsid.tmp$major_allele, grepl, rsid.tmp$REF), any))) &&
       ((sapply(lapply(rsid.tmp$ALT, grepl, rsid.tmp$minor_allele), any)) ||
          (sapply(lapply(rsid.tmp$minor_allele, grepl, rsid.tmp$ALT), any))),
     {
       rsid.tmp$new_gwas_a1 <- rsid.tmp$REF
       rsid.tmp$new_gwas_a2 <- rsid.tmp$ALT
       rsid.tmp$new_minor_allele_frequency <- rsid.tmp$minor_allele_frequency
     },
     ifelse(
       ((sapply(lapply(rsid.tmp$minor_allele, grepl, rsid.tmp$REF), any)) ||
          (sapply(lapply(rsid.tmp$REF, grepl, rsid.tmp$minor_allele), any))) &&
         ((sapply(lapply(rsid.tmp$major_allele, grepl, rsid.tmp$ALT), any)) ||
            (sapply(lapply(rsid.tmp$ALT, grepl, rsid.tmp$major_allele), any))),
       {
         rsid.tmp$new_gwas_a2 <- rsid.tmp$REF
         rsid.tmp$new_gwas_a1 <- rsid.tmp$ALT
         rsid.tmp$new_minor_allele_frequency <- (1 - rsid.tmp$minor_allele_frequency)
       },
     )
   )

   new_file <- sub("(\\.txt)$", "_updated\\1", i)
   write.table(rsid.tmp, new_file, quote = FALSE, row.names = FALSE)
 }

REF ALT minor_allele_frequency  minor_allele    major_allele   new_gwas_a1  new_gwas_a2  new_minor_allele_frequency
A   G   0.000219914 G   A   A   G   0.000219914
C   T   0.0144844   T   C   C   T   0.0144844
C   T   0.0445486   T   C   C   T   0.0445486
C   T   0.00647968  T   C   C   T   0.00647968
C   T   0.222656    T   C   C   T   0.222656
**A G,T 0.12189     A   G   A   G,T 0.12189
A   G,T 0.305252    A   G   A   G,T 0.305252**
C   T   0.00210762  T   C   C   T   0.00210762
C   A   0.00139373  A   C   C   A   0.00139373
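
One possible reading of why the first condition comes out TRUE for every row: `lapply(rsid.tmp$REF, grepl, rsid.tmp$major_allele)` searches for each REF value in the entire major_allele column, so `any` is TRUE as soon as some row anywhere in the file contains that allele. The sketch below uses a three-row toy vector instead of the real file (the names `whole_column` and `row_wise` are illustrative only) and contrasts that pattern with a per-row comparison via `mapply`. Separately, `||` and `&&` are meant for length-one conditions and do not work element by element the way `|` and `&` do.

```r
# Toy values taken from three of the rows shown above
REF          <- c("A", "C", "A")
major_allele <- c("A", "C", "G")

# Pattern from the question: each REF value is searched for in the
# WHOLE major_allele column, then collapsed with any()
whole_column <- sapply(lapply(REF, grepl, major_allele), any)

# Per-row alternative: REF[i] is searched for only in major_allele[i]
row_wise <- mapply(grepl, REF, major_allele, USE.NAMES = FALSE)

print(whole_column)
print(row_wise)
```

With these toy values the third row differs between the two results: "A" occurs somewhere in the column, but not in the third row itself.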

 

Below is the example with two if statements:

 for (i in files) {
   rsid.tmp <- read.table(i, header = TRUE, sep = " ", stringsAsFactors = FALSE, fill = TRUE)

   if (((sapply(lapply(rsid.tmp$REF, grepl, rsid.tmp$major_allele), any)) ||
          (sapply(lapply(rsid.tmp$major_allele, grepl, rsid.tmp$REF), any))) &&
         ((sapply(lapply(rsid.tmp$ALT, grepl, rsid.tmp$minor_allele), any)) ||
            (sapply(lapply(rsid.tmp$minor_allele, grepl, rsid.tmp$ALT), any)))) {
     rsid.tmp$new_gwas_a1 <- rsid.tmp$REF
     rsid.tmp$new_gwas_a2 <- rsid.tmp$ALT
     rsid.tmp$new_minor_allele_frequency <- rsid.tmp$minor_allele_frequency
   }

   if (((sapply(lapply(rsid.tmp$minor_allele, grepl, rsid.tmp$REF), any)) ||
          (sapply(lapply(rsid.tmp$REF, grepl, rsid.tmp$minor_allele), any))) &&
         ((sapply(lapply(rsid.tmp$major_allele, grepl, rsid.tmp$ALT), any)) ||
            (sapply(lapply(rsid.tmp$ALT, grepl, rsid.tmp$major_allele), any)))) {
     rsid.tmp$new_gwas_a2 <- rsid.tmp$REF
     rsid.tmp$new_gwas_a1 <- rsid.tmp$ALT
     rsid.tmp$new_minor_allele_frequency <- (1 - rsid.tmp$minor_allele_frequency)
   }

   new_file <- sub("(\\.txt)$", "_updated\\1", i)
   write.table(rsid.tmp, new_file, quote = FALSE, row.names = FALSE)
 }

 REF    ALT minor_allele_frequency  minor_allele    major_allele    new_gwas_a1 new_gwas_a2 new_minor_allele_frequency
 A  G   0.000219914 G   A   G   A   0.999780086
 C  T   0.0144844   T   C   T   C   0.9855156
 C  T   0.0445486   T   C   T   C   0.9554514
 C  T   0.00647968  T   C   T   C   0.99352032
 C  T   0.222656    T   C   T   C   0.777344
 A  G,T 0.12189 A   G   G,T A   0.87811
 A  G,T 0.305252    A   G   G,T A   0.694748
 C  T   0.00210762  T   C   T   C   0.99789238
 C  A   0.00139373  A   C   A   C   0.99860627
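
A minimal sketch of the row-by-row behaviour described above, under some assumptions: the data frame is a three-row toy copy of the rows shown (not the real files), `row_match` is a hypothetical helper rather than an existing function, and alleles are compared as literal substrings (`fixed = TRUE`) so that "G" matches "G,T". Rows that satisfy neither condition are assumed to get NA. It builds two logical vectors, one value per row, and lets `ifelse` pick the value for each row instead of assigning whole columns inside the condition.

```r
# Toy data copied from three of the rows shown in the question
rsid.tmp <- data.frame(
  REF = c("A", "C", "A"),
  ALT = c("G", "T", "G,T"),
  minor_allele_frequency = c(0.000219914, 0.0144844, 0.12189),
  minor_allele = c("G", "T", "A"),
  major_allele = c("A", "C", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for row i when x[i] contains y[i] or y[i] contains x[i].
# Inside mapply both arguments have length one, so || is safe here.
row_match <- function(x, y) {
  mapply(function(a, b) grepl(a, b, fixed = TRUE) || grepl(b, a, fixed = TRUE),
         x, y, USE.NAMES = FALSE)
}

# One logical value per row; & is the element-wise operator
same    <- row_match(rsid.tmp$REF, rsid.tmp$major_allele) &
  row_match(rsid.tmp$ALT, rsid.tmp$minor_allele)
flipped <- row_match(rsid.tmp$REF, rsid.tmp$minor_allele) &
  row_match(rsid.tmp$ALT, rsid.tmp$major_allele)

# ifelse returns a vector, so assign its result to each new column
rsid.tmp$new_gwas_a1 <- ifelse(same, rsid.tmp$REF,
                               ifelse(flipped, rsid.tmp$ALT, NA_character_))
rsid.tmp$new_gwas_a2 <- ifelse(same, rsid.tmp$ALT,
                               ifelse(flipped, rsid.tmp$REF, NA_character_))
rsid.tmp$new_minor_allele_frequency <- ifelse(
  same, rsid.tmp$minor_allele_frequency,
  ifelse(flipped, 1 - rsid.tmp$minor_allele_frequency, NA_real_)
)

print(rsid.tmp)
```

In this toy data the first two rows keep REF/ALT and the original frequency, while the third row (REF A, ALT G,T, minor allele A) is swapped and gets 1 minus the frequency. Inside the `for` loop, this would take the place of the ifelse or if blocks between `read.table` and `write.table`.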
