Friday, 2 November 2018

Recoding in R using a complicated statement

I have genotype data with 40,000 rows (SNPs) and 500 columns (humans) that looks like this:

AA AG GG GA AA
CC CG CC GC GG
AC CC CA CA CC

This example shows only 3 SNPs and 5 humans.

I need to convert the letters to numbers using the keys given next. Note that the three letters A, C and G cannot all occur in one row: a row contains only A and C, or A and G, or C and G.

If A is present anywhere in the row, the key is:

AA = 0
AG = 1
GG = 2
AC = 1
CC = 2

If A is not present in the row, the key is:

CC = 0
CG = 1
GG = 2

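Notice that CC is 2 in one case and 0 in the other. The order of the letters within a pair does not matter: GA is coded like AG, CA like AC, and GC like CG.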
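One way to read the two keys is as a single rule: pick a reference letter for the row ("A" if A occurs anywhere in the row, otherwise "C") and count how many of the two letters in each pair differ from it. Below is a minimal sketch of that reading for a single row. The function name `recode_row` is hypothetical, and the sketch assumes every entry is exactly two letters with no missing values.

```r
# Hypothetical helper: recode one row (one SNP) of two-letter genotypes.
recode_row <- function(genos) {
  # Reference letter: "A" if any genotype in the row contains A, otherwise "C".
  ref <- if (any(grepl("A", genos, fixed = TRUE))) "A" else "C"
  # Split each genotype into its first and second letter.
  first  <- substr(genos, 1, 1)
  second <- substr(genos, 2, 2)
  # Score = number of letters (0, 1 or 2) that differ from the reference.
  (first != ref) + (second != ref)
}

recode_row(c("AA", "AG", "GG", "GA", "AA"))  # 0 1 2 1 0
recode_row(c("CC", "CG", "CC", "GC", "GG"))  # 0 1 0 1 2
```

Because the score only counts letters, AG and GA get the same value without needing separate entries in a lookup table.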

So the example should end up looking like this:

0 1 2 1 0
0 1 0 1 2
1 2 1 1 2
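For the whole table at once, a vectorised sketch along the following lines may be enough. It assumes the data is held in a character matrix (here called `geno`, a hypothetical name) with SNPs in rows and humans in columns, that every cell is exactly two letters, and that there are no missing values. A data frame would first need `as.matrix()`.

```r
# The three example SNPs as a character matrix (rows = SNPs, columns = humans).
geno <- matrix(
  c("AA", "AG", "GG", "GA", "AA",
    "CC", "CG", "CC", "GC", "GG",
    "AC", "CC", "CA", "CA", "CC"),
  nrow = 3, byrow = TRUE
)

# First and second letter of every cell; dim() is set explicitly so that
# both results are certainly matrices of the same shape as geno.
a1 <- substr(geno, 1, 1); dim(a1) <- dim(geno)
a2 <- substr(geno, 2, 2); dim(a2) <- dim(geno)

# One reference letter per row: "A" if A occurs in that row, otherwise "C".
has_A <- rowSums(a1 == "A" | a2 == "A") > 0
ref   <- ifelse(has_A, "A", "C")

# ref has one entry per row and is recycled down the columns,
# so cell [i, j] is always compared with ref[i].
coded <- (a1 != ref) + (a2 != ref)
coded
```

On the example this gives the three rows 0 1 2 1 0, 0 1 0 1 2 and 1 2 1 1 2. Working on whole matrices avoids an explicit loop over the 40,000 rows.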

How can I do this in R for all rows and columns?

Thank you!
