
Finding a Spliced Motif

A subsequence of a string is a collection of symbols contained in order (though not necessarily contiguously) in the string (e.g., ACG is a subsequence of TATGCTAAGATC). The indices of a subsequence are the positions in the string at which the symbols of the subsequence appear; thus, the indices of ACG in TATGCTAAGATC can be represented by (2, 5, 9).

As a substring can have multiple locations, a subsequence can have multiple collections of indices, and the same index can be reused in more than one appearance of the subsequence; for example, ACG is a subsequence of AACCGGTT in 8 different ways.
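The count of 8 can be checked with a short dynamic-programming pass. The following is a minimal sketch, not part of the original solution: the class name `SubsequenceCountSketch` and the method `countOccurrences` are hypothetical names chosen for illustration, and it assumes the count fits in a `long`, which may not hold for long, repetitive strings.

```java
class SubsequenceCountSketch {
    // Counts the distinct index collections at which t appears as a subsequence of s.
    // ways[j] holds the number of ways to match the first j symbols of t
    // using the symbols of s scanned so far.
    static long countOccurrences(String s, String t) {
        long[] ways = new long[t.length() + 1];
        ways[0] = 1; // the empty prefix of t is matched in exactly one way
        for (int i = 0; i < s.length(); i++) {
            // Walk j downwards so that s.charAt(i) is used at most once per match
            for (int j = t.length(); j >= 1; j--) {
                if (s.charAt(i) == t.charAt(j - 1)) {
                    ways[j] += ways[j - 1];
                }
            }
        }
        return ways[t.length()];
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("AACCGGTT", "ACG")); // prints 8
    }
}
```

Each of the two A's can pair with either C and either G, which gives 2 × 2 × 2 = 8 and agrees with the count above.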
Given: Two DNA strings s and t (each of length at most 1 kbp) in FASTA format.
Sample input


Return: One collection of indices of s in which the symbols of t appear as a subsequence of s. If multiple solutions exist, you may return any one.
Sample output

3 8 10




import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

public class Finding_a_Spliced_Motif {

    public static void main(String[] args) {
        ArrayList<String> fasta = BufferedReader2("C:/Users/Administrator/Desktop/rosalind_sseq.txt", "fasta");
        ArrayList<Integer> index = new ArrayList<>();
        // Two-pointer scan
        int i = 0; // position in the first sequence (the main sequence s)
        int j = 0; // position in the second sequence (the subsequence t)
        // The check on i stops the scan if t turns out not to be a subsequence of s
        while (j < fasta.get(1).length() && i < fasta.get(0).length()) {
            if (fasta.get(1).charAt(j) == fasta.get(0).charAt(i)) {
                index.add(i + 1); // report 1-based positions
                j++; // advance in the subsequence
            }
            i++; // always advance in the main sequence
        }
        for (int k = 0; k < index.size(); k++) {
            System.out.print(index.get(k) + " ");
        }
    }

    // Reads a FASTA file and returns the header lines when choose is "tag",
    // otherwise the sequences (one string per record, line breaks removed).
    public static ArrayList<String> BufferedReader2(String path, String choose) {
        ArrayList<String> tag = new ArrayList<>();
        ArrayList<String> fasta = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            StringBuilder sb = new StringBuilder();
            String line = reader.readLine();
            while (line != null) {
                if (line.startsWith(">")) {
                    // A header line starts a new record
                    tag.add(line);
                    // Store the sequence collected for the previous record, if any
                    if (sb.length() != 0) {
                        fasta.add(sb.toString());
                        sb.setLength(0); // clear the StringBuilder
                    }
                } else {
                    sb.append(line); // join sequence lines without line breaks
                }
                line = reader.readLine(); // read next line
            }
            fasta.add(sb.toString()); // store the last record
        } catch (IOException e) {
            e.printStackTrace();
        }
        if (choose.equals("tag")) {
            return tag;
        }
        return fasta;
    }
}


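The core idea of the two-pointer technique is that, instead of accessing an array or collection through a single pointer during traversal, you scan with two pointers that move either in the same direction or in opposite directions. In other words, the technique exploits the fact that the data is ordered, which simplifies the computation in some cases. The key to implementing it is deciding when each pointer advances and when the scan stops. In this problem the main-sequence pointer i advances on every step, while the subsequence pointer j advances only when the two bases are equal: fasta.get(1).charAt(j) == fasta.get(0).charAt(i). The scan stops once j has moved past the last symbol of the subsequence.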
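The scan described above can be isolated from the file handling. The following is a minimal sketch under stated assumptions: the class name `SplicedMotifSketch` and the method `findIndices` are hypothetical names introduced for illustration, the input strings are passed in directly rather than read from FASTA, and returning an empty list when t is not a subsequence of s is a choice made here, not something the problem statement specifies.

```java
import java.util.ArrayList;
import java.util.List;

class SplicedMotifSketch {
    // Returns the 1-based indices of the leftmost occurrence of t as a
    // subsequence of s, or an empty list if t is not a subsequence of s.
    static List<Integer> findIndices(String s, String t) {
        List<Integer> indices = new ArrayList<>();
        int i = 0; // pointer into the main sequence s
        int j = 0; // pointer into the subsequence t
        while (i < s.length() && j < t.length()) {
            if (s.charAt(i) == t.charAt(j)) {
                indices.add(i + 1); // record the 1-based position in s
                j++;                // move on to the next symbol of t
            }
            i++; // the main-sequence pointer advances on every step
        }
        if (j < t.length()) {
            indices.clear(); // s ran out before all of t was matched
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(findIndices("TATGCTAAGATC", "ACG")); // prints [2, 5, 9]
    }
}
```

Because each symbol of t is matched at the earliest possible position, this always finds the leftmost collection of indices. The problem accepts any valid collection, so the result may differ from the sample output while still being correct. Both pointers only move forward, so the scan takes time linear in the length of s.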

