I am following up on an earlier post of mine that got no answers, probably because I was not specific enough (I hope this is OK!). I think the best way to do what I want is an IF-style condition, but I am stuck on how to write it, as I need a very specific set of rules.
I have a data matrix of samples (columns) and genes (rows). Each set of five columns belongs to one sample type: for example, the first five columns are one time point repeated 5 times, the next five columns are the second time point, and so on.
I would like to pick out the genes that change from one time point to another, but only if the change is at least 50 counts. So if a gene changed by 45 counts between two time points (for example), it would be rejected. Is there any way of doing this, and if so, would somebody be kind enough to share some code? A vector of TRUE and FALSE would be a great start, but I would then like to build a data matrix from the TRUE rows, so that I end up with only the genes that change by at least 50 counts (in either direction, up or down).
Please see the example data matrix below. Many thanks for your time!
**X51378P3 X51378P4 X48275P5 X48277P1 X48277P2** X28046 X23154 X23156 X23157 X23241 **X8657 X10459 X8302 X8726 X8727** X8309 X5260 X47471 X51394 X18
ENSMUSG00000042096 0 2 0 1 3 2 13 5 3 6 238 211 149 182 214 843 831 1072 815 971
ENSMUSG00000033208 91 47 100 41 79 764 848 744 491 671 2361 2888 2323 2297 2778 4613 6634 6603 5477 4924
ENSMUSG00000021750 46 51 28 28 34 89 90 81 88 73 9083 6238 3876 6754 7066 11727 10135 16857 10669 12581
ENSMUSG00000041205 290 141 156 122 146 431 432 377 310 388 1514 1714 1363 1428 1677 1492 2036 1465 1573 1585
ENSMUSG00000026556 4260 3486 3545 2315 3090 2818 2039 2204 2139 2241 807 973 689 787 1094 466 660 460 457 579
ENSMUSG00000032908 112 77 78 76 98 399 286 359 218 282 1451 1266 897 1183 1416 1881 2243 2281 1862 2144
ENSMUSG00000045246 7 4 11 7 11 13 29 36 19 14 762 958 810 905 720 2950 2390 2916 2684 2878
ENSMUSG00000023019 159 108 104 96 116 68 94 94 62 132 878 1039 774 941 829 3164 3191 3405 2671 3019
ENSMUSG00000029054 9 1 13 2 4 27 39 49 13 35 1834 2277 1054 1744 2449 3905 4228 3240 2941 3489
ENSMUSG00000010476 9380 8541 8906 5609 7406 4478 4422 4865 3739 4003 886 1473 979 956 1199 247 380 434 297 375
ENSMUSG00000020788 79 109 93 53 91 124 163 212 128 135 3561 3396 1944 3128 3754 6632 6844 5198 5595 6646
ENSMUSG00000047945 18196 14417 16349 10746 14262 19114 13732 13902 12339 13406 4224 7321 4514 5056 6271 702 899 630 883 741
ENSMUSG00000022096 183 120 156 76 159 384 205 160 225 189 2466 2488 1958 2504 2921 2955 3255 3218 2442 2928
ENSMUSG00000020734 233 85 157 150 108 183 204 253 187 182 5854 4614 2719 4949 6563 12011 14573 10291 9136 12527
If you look at ENSMUSG00000029054, the average of the first 5 columns is 5.8 and the average of the second 5 (which represent the next sample type) is 32.6, so the difference between the two is 26.8. What I would like to do is filter this matrix so that it keeps only the genes whose average changes by at least 50 between sample types.
What I am truly stuck on is how to express this: I want to set a specific minimum delta between sample types, and also to say that the first 5 columns are actually the same condition, so that I take the mean of those values, compare it to the mean of the next 5 values, and so on.
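To make the rule concrete, here is a minimal sketch of what I think I am after, written in plain Python only to pin down the logic. The function names (`group_means`, `changes_enough`, `filter_genes`) are made up for illustration, and it assumes that "change" means the absolute difference between the means of two adjacent time points, with a gene kept if any adjacent pair differs by 50 or more. The two rows are the first ten columns (two time points) of genes from the matrix above.

```python
from statistics import mean

GROUP_SIZE = 5   # replicates per time point (as described in the post)
MIN_DELTA = 50   # minimum change in mean counts (as described in the post)


def group_means(counts, group_size=GROUP_SIZE):
    """Return the mean of each consecutive block of `group_size` columns."""
    if len(counts) % group_size != 0:
        raise ValueError("row length is not a multiple of group_size")
    return [mean(counts[i:i + group_size])
            for i in range(0, len(counts), group_size)]


def changes_enough(counts, min_delta=MIN_DELTA, group_size=GROUP_SIZE):
    """True if any two adjacent time points differ by >= min_delta in mean.

    Assumption: only adjacent time points are compared, and the direction
    of the change (up or down) is ignored by taking the absolute value.
    """
    means = group_means(counts, group_size)
    return any(abs(b - a) >= min_delta for a, b in zip(means, means[1:]))


def filter_genes(matrix, min_delta=MIN_DELTA, group_size=GROUP_SIZE):
    """Keep only the rows (genes) that pass the minimum-change rule."""
    return {gene: counts for gene, counts in matrix.items()
            if changes_enough(counts, min_delta, group_size)}


# First two time points (10 columns) of two genes from the example matrix.
matrix = {
    "ENSMUSG00000029054": [9, 1, 13, 2, 4, 27, 39, 49, 13, 35],
    "ENSMUSG00000033208": [91, 47, 100, 41, 79, 764, 848, 744, 491, 671],
}

kept = filter_genes(matrix)
for gene, counts in kept.items():
    print(gene, group_means(counts))
```

With these two rows, ENSMUSG00000029054 is dropped (means 5.8 and 32.6, a change of 26.8) and ENSMUSG00000033208 is kept. If the comparison should instead be between every pair of time points, or always against the first time point, only `changes_enough` would need to change.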
Many thanks again all!