I'm writing a program that takes a file name and a number (n) as command-line arguments. It reads a DNA sequence from the file, then finds and prints the subsequences of length n with the highest frequency of C and G. Here's the code:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Scanner;

public class CGIslands {
    public static void main(String args[]) {
        // check arguments
        if(args.length != 2) {
            System.err.println("ERROR: expected file name and an integer");
            System.exit(1);
        }
        int n = Integer.parseInt(args[1]);
        String fileName = args[0];
        Scanner s = null;
        String sequence = "";
        // read in sequence
        try {
            s = new Scanner(new File(fileName));
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        while(s.hasNext()) {
            sequence = sequence + s.next();
        }
        s.close();
        int numOfSubseq = sequence.length() - n + 1;
        double CGCount = 0;
        Double[] frequencyArr = new Double[numOfSubseq];
        Double[] countArr = new Double[numOfSubseq];
        for (int i = 0; i < numOfSubseq; i++) {
            CGCount = 0;
            for (int j = 0; j < n; j++) {
                if (sequence.charAt(i + j) == 'C' || sequence.charAt(i + j) == 'G') {
                    CGCount++;
                }
            }
            countArr[i] = CGCount;
            frequencyArr[i] = CGCount / n;
        }
        // print output
        System.out.println("n = " + n);
        System.out.format("Highest frequency: %.0f / %d = %.2f%%\n", Collections.max(Arrays.asList(countArr)), n, Collections.max(Arrays.asList(frequencyArr)) * 100);
        System.out.println("CG Islands:");
        for (int i = 0; i < numOfSubseq; i++) {
            System.out.println(countArr[i] + " " + Collections.max(Arrays.asList(countArr)));
            if (countArr[i] == Collections.max(Arrays.asList(countArr))) {
                System.out.format("%d thru %d: %s\n", i + 1, i + n, sequence.substring(i, i + n));
            }
        }
    }
}
Command line arguments:
inputFile.txt 4
inputFile.txt:
CCAATACCGT
The part I'm having trouble with is here:
    for (int i = 0; i < numOfSubseq; i++) {
        if (countArr[i] == Collections.max(Arrays.asList(countArr))) {
            System.out.format("%d thru %d: %s\n", i + 1, i + n, sequence.substring(i, i + n));
        }
    }
I've printed both sides of the comparison, and for two values of i (the last two subsequences) they both show 3.0, yet the body of the if statement runs only once.
Output:
n = 4
Highest frequency: 3 / 4 = 75.00%
CG Islands:
6 thru 9: ACCG
Expected:
n = 4
Highest frequency: 3 / 4 = 75.00%
CG Islands:
6 thru 9: ACCG
7 thru 10: CCGT
Any ideas why the second one isn't printed even though the condition is still true?
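One possible explanation, offered as an assumption rather than a confirmed diagnosis: countArr is a Double[] (boxed objects), so == compares object references rather than numeric values. Collections.max returns one of the list's own elements, so at least that element is identical to it, while another element holding the same value 3.0 may be a different object. The sketch below tries to reproduce this in isolation. The class and method names (BoxedDoubleComparison, countByReference, countByValue) are hypothetical and not part of the program above.

```java
import java.util.Arrays;
import java.util.Collections;

public class BoxedDoubleComparison {
    // Counts elements that are the same object as the maximum (reference comparison with ==).
    static int countByReference(Double[] values) {
        Double max = Collections.max(Arrays.asList(values));
        int matches = 0;
        for (Double v : values) {
            if (v == max) {
                matches++;
            }
        }
        return matches;
    }

    // Counts elements whose numeric value equals the maximum (comparison with equals()).
    static int countByValue(Double[] values) {
        Double max = Collections.max(Arrays.asList(values));
        int matches = 0;
        for (Double v : values) {
            if (v.equals(max)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // CG counts for the seven windows of CCAATACCGT with n = 4
        double[] counts = {2, 1, 0, 1, 2, 3, 3};
        Double[] boxed = new Double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            boxed[i] = counts[i]; // autoboxing, as in countArr[i] = CGCount
        }
        System.out.println("matches with ==:       " + countByReference(boxed));
        System.out.println("matches with equals(): " + countByValue(boxed));
    }
}
```

If the two counts differ when this runs, that would point to the comparison rather than the data. In that case, comparing with equals(), or storing the counts in a primitive int[] or double[] and computing the maximum once before the loop, would be options to try.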