I have genotype data with 40,000 rows (SNPs) and 500 columns (humans) that looks like this:
AA AG GG GA AA
CC CG CC GC GG
AC CC CA CA CC
This example shows only 3 SNPs and 5 humans.
I need to convert the letters to numbers using the keys below. Note that the three letters A, C and G cannot all occur in one row: a row contains only A and C, or A and G, or C and G.
If A is present in the row, the key is:
AA = 0
AG = 1
GG = 2
AC = 1
CC = 2
If A is not present in the row, the key is:
CC = 0
CG = 1
GG = 2
Notice that CC is 2 in one case and 0 in the other.
So the example above should become:
0 1 2 1 0
0 1 0 1 2
1 2 1 1 2
How can I do this in R for all rows and columns?
Thank you!
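One possible approach, as a minimal sketch rather than a definitive answer: for each row, pick a reference letter (A if the row contains A, otherwise C) and count how many letters of each genotype differ from it. This assumes the data is a character matrix of two-letter genotypes, here called `geno`, and that reversed pairs such as GA, CA and GC count the same as AG, AC and CG, as in the example. The names `geno`, `recode_snp` and `coded` are made up for illustration.

```r
# Toy data from the question: rows are SNPs, columns are humans.
geno <- matrix(
  c("AA", "AG", "GG", "GA", "AA",
    "CC", "CG", "CC", "GC", "GG",
    "AC", "CC", "CA", "CA", "CC"),
  nrow = 3, byrow = TRUE
)

# Hypothetical helper: recode one SNP (one row of two-letter genotypes).
recode_snp <- function(row) {
  # Reference letter: A if it occurs anywhere in the row, otherwise C.
  ref <- if (any(grepl("A", row, fixed = TRUE))) "A" else "C"
  # Number of non-reference letters in each genotype: 0, 1 or 2.
  2L - (substr(row, 1, 1) == ref) - (substr(row, 2, 2) == ref)
}

# apply() returns one column per input row, so transpose the result back.
coded <- t(apply(geno, 1, recode_snp))
print(coded)
```

For the three example rows this reproduces the expected numbers shown in the question. The same `apply()` call should work unchanged on the full 40,000 x 500 matrix, although that has not been timed here.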