Genetic engineering

“Multidimensional spaces are completely out of style these days, unlike genetics problems” — thought physicist Woll and changed his subject of study to bioinformatics. Analysing results of sequencing he faced the following problem concerning DNA sequences. We will further think of a DNA sequence as an arbitrary string of uppercase letters “A”, “C”, “G” and “T” (of course, this is a simplified interpretation).

Let w be a long DNA sequence and s1, s2, …, sm be a collection of short DNA sequences. We say that the collection filters w iff w can be covered with sequences from the collection. Occurrences at different positions of the string may overlap or even contain each other. More formally: denote by |w| the length of w, and let the symbols of w be numbered from 1 to |w|. Then for each position i in w there exists a pair of indices l, r (1 ≤ l ≤ i ≤ r ≤ |w|) such that the substring w[l … r] equals one of the elements s1, s2, …, sm of the collection.

Woll wants to calculate the number of DNA sequences of a given length filtered by a given collection, but he doesn’t know how to deal with it. Help him! Your task is to find the number of different DNA sequences of length n filtered by the collection {si}.

The answer may be very large, so output it modulo 1000000009.

Input
The first line contains two integers n and m (1 ≤ n ≤ 1000, 1 ≤ m ≤ 10): the length of the string and the number of sequences in the collection, respectively.

Next m lines contain the collection sequences si, one per line. Each si is a nonempty string of length not greater than 10. All the strings consist of uppercase letters “A”, “C”, “G”, “T”. The collection may contain identical strings.

Output
Output should contain a single integer: the number of strings filtered by the collection modulo 1000000009 (10^9 + 9).

Example
Input
2 1
A
Output
1
Input
6 2
CAT
TACT
Output
2
Note
In the first sample, a string has to be filtered by “A”. Clearly, there is only one such string: “AA”.

In the second sample, there exist exactly two different strings satisfying the condition: “CATCAT” (covered by two occurrences of “CAT”) and “CATACT” (covered by “CAT” and an overlapping “TACT”).
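The definition of "filtered" can be checked directly on small inputs. The following is a minimal brute-force sketch, not part of the original write-up: `is_filtered` and `filtered_strings` are hypothetical helper names, and the enumeration over all 4^n strings is only practical for small n (roughly n ≤ 10).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if every position of w lies inside some occurrence of a pattern.
bool is_filtered(const std::string& w, const std::vector<std::string>& patterns) {
    std::vector<bool> covered(w.size(), false);
    for (const std::string& p : patterns) {
        // Try every start position l at which p still fits inside w.
        for (std::size_t l = 0; l + p.size() <= w.size(); ++l) {
            if (w.compare(l, p.size(), p) == 0) {
                for (std::size_t i = l; i < l + p.size(); ++i) covered[i] = true;
            }
        }
    }
    for (bool c : covered) {
        if (!c) return false;
    }
    return true;
}

// Enumerates all 4^n strings over "ACGT" and keeps the filtered ones.
std::vector<std::string> filtered_strings(int n, const std::vector<std::string>& patterns) {
    const std::string acgt = "ACGT";
    std::vector<std::string> result;
    std::string w(n, 'A');
    const long long total = 1LL << (2 * n);  // 4^n candidate strings
    for (long long mask = 0; mask < total; ++mask) {
        long long x = mask;
        for (int i = 0; i < n; ++i) {  // decode mask as base-4 digits
            w[i] = acgt[x & 3];
            x >>= 2;
        }
        if (is_filtered(w, patterns)) result.push_back(w);
    }
    return result;
}
```

Under these assumptions, `filtered_strings(2, {"A"})` yields only "AA", and `filtered_strings(6, {"CAT", "TACT"})` yields two strings, which agrees with both samples.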


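Reference book: 《挑战程序设计竞赛》 (Programming Contest Challenge Book)
**Problem summary:** Given m patterns, count the strings of length n in which every character belongs to at least one occurrence of a pattern.
Approach: Aho-Corasick automaton + dynamic programming
Build the AC automaton and compute max_match for every node: the length of the longest pattern that is a suffix of the string spelled by that node.
Define dp[i][j][k] as the number of ways to build a string of length i that ends at trie node j and whose last k characters are not yet covered by any pattern.
Transition: let cur be the current node; reading character c moves to the next node nxt.
(1) If nxt->max_match > k, the matched pattern covers the new character together with the whole uncovered tail:
dp[length+1][nxt][0] += dp[length][cur][k]
(2) Otherwise the uncovered tail grows to k+1: dp[length+1][nxt][k+1] += dp[length][cur][k]
A tail longer than the longest pattern can never be covered, so such states are dropped. The answer is the sum of dp[n][j][0] over all nodes j.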
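To illustrate the two transition cases, here is a small sketch that traces k (the length of the uncovered tail) along one concrete string. It replaces the automaton with a naive suffix check, so `longest_suffix_match` and `trace_uncovered_tail` are hypothetical helpers for illustration only, not the method used in the solution.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest pattern that is a suffix of `prefix` (0 if none).
// Naive stand-in for the automaton's max_match value.
int longest_suffix_match(const std::string& prefix,
                         const std::vector<std::string>& patterns) {
    int best = 0;
    for (const std::string& p : patterns) {
        if (p.size() <= prefix.size() &&
            prefix.compare(prefix.size() - p.size(), p.size(), p) == 0) {
            best = std::max(best, static_cast<int>(p.size()));
        }
    }
    return best;
}

// Returns the value of k after each character of w has been appended.
std::vector<int> trace_uncovered_tail(const std::string& w,
                                      const std::vector<std::string>& patterns) {
    std::vector<int> ks;
    int k = 0;  // uncovered tail length before reading the next character
    for (std::size_t i = 0; i < w.size(); ++i) {
        int match = longest_suffix_match(w.substr(0, i + 1), patterns);
        // Case (1): the match spans the old tail plus the new character.
        // Case (2): otherwise the tail grows by one.
        k = (match > k) ? 0 : k + 1;
        ks.push_back(k);
    }
    return ks;
}
```

For the patterns "CAT" and "TACT", tracing "CATACT" gives k = 1, 2, 0, 1, 2, 0: the tail is reset once by "CAT" and once by "TACT", and the string ends with k = 0, so it is counted. Tracing "CATACA" ends with k = 3, so it is not.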

#include <iostream>
#include <cstdio>
#include <string>
#include <vector>
#include <queue>
#include <algorithm>
#include <map>
using namespace std;

const int MAXN = 1e3 + 5;
const int MAXM = 15;
const int MOD = 1e9 + 9;

// dp[i][j][k] := number of strings of length i that end at trie node j
// and whose last k characters are not yet covered
int dp[MAXN][MAXN][MAXM];
bool visited[MAXN][MAXN][MAXM];

struct Node
{
    map<char, Node *> next;
    Node *fail;
    vector<int> match;
    static int _node_size;
    int id;
    int max_match; // length of the longest pattern that is a suffix of this node
    Node() : fail(NULL)
    {
        id = _node_size++;
        max_match = 0;
    }
};
int Node::_node_size = 0;

// Build the Aho-Corasick automaton for the given patterns.
Node *build(vector<string> patterns)
{
    Node *root = new Node();
    root->fail = root;
    for (int i = 0; i < (int)patterns.size(); i++)
    {
        Node *p = root;
        for (auto c : patterns[i])
        {
            if (p->next[c] == 0)
                p->next[c] = new Node();
            p = p->next[c];
        }
        p->match.push_back(i);
        p->max_match = patterns[i].size();
    }
    // Missing root transitions loop back to the root; root children fail to the root.
    queue<Node *> que;
    for (int i = 0; i < 128; i++)
    {
        if (!root->next[i])
        {
            root->next[i] = root;
        }
        else
        {
            root->next[i]->fail = root;
            que.push(root->next[i]);
        }
    }
    // BFS: compute failure links and propagate max_match / match along them.
    while (!que.empty())
    {
        Node *p = que.front();
        que.pop();
        for (auto a : p->next)
        {
            int i = a.first;
            Node *np = a.second;
            if (!np)
                continue;
            que.push(np);
            Node *f = p->fail;
            while (!f->next[i])
                f = f->fail;
            np->fail = f->next[i];
            np->max_match = max(np->max_match, np->fail->max_match);
            np->match.insert(np->match.end(), np->fail->match.begin(), np->fail->match.end());
        }
    }
    return root;
}

// Follow failure links until a transition on c exists.
Node *next_node(Node *p, char c)
{
    while (!p->next[c])
        p = p->fail;
    return p->next[c];
}

Node *next_node(Node *p, string query)
{
    for (char c : query)
    {
        p = next_node(p, c);
    }
    return p;
}

struct State
{
    int length;
    Node *node;
    int k;
    State(int length, Node *node, int k) : length(length), node(node), k(k) {}
};

int main()
{
    const string acgt = "ACGT";
    int n, m;
    cin >> n >> m;
    vector<string> dna(m);
    for (int i = 0; i < m; ++i)
    {
        cin >> dna[i];
    }
    auto root = build(dna);
    dp[0][0][0] = 1;
    visited[0][0][0] = 1;
    // BFS over reachable states (length, node, k); states are visited in order of length.
    queue<State> que;
    que.push(State(0, root, 0));
    while (!que.empty())
    {
        State s = que.front();
        que.pop();
        int length = s.length, k = s.k;
        Node *cur = s.node;
        if (length == n)
            continue;
        for (auto c : acgt)
        {
            auto nxt = next_node(cur, c);
            // A pattern of length >= k+1 ends here, so the tail of k+1 characters is covered: reset to 0
            if (nxt->max_match > k)
            {
                dp[length + 1][nxt->id][0] += dp[length][cur->id][k];
                dp[length + 1][nxt->id][0] %= MOD;
                if (!visited[length + 1][nxt->id][0])
                {
                    visited[length + 1][nxt->id][0] = true;
                    que.push(State(length + 1, nxt, 0));
                }
            }
            // Otherwise the uncovered tail grows to k+1
            else
            {
                // Patterns have length at most 10, so a tail of 10 or more can never be covered
                if (k >= 9)
                    continue;
                dp[length + 1][nxt->id][k + 1] += dp[length][cur->id][k];
                dp[length + 1][nxt->id][k + 1] %= MOD;
                if (!visited[length + 1][nxt->id][k + 1])
                {
                    visited[length + 1][nxt->id][k + 1] = true;
                    que.push(State(length + 1, nxt, k + 1));
                }
            }
        }
    }
    // Sum over all nodes with no uncovered tail.
    int ans = 0;
    for (int i = 0; i < Node::_node_size; ++i)
    {
        ans += dp[n][i][0];
        ans %= MOD;
    }
    cout << ans << endl;
    return 0;
}
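The same recurrence can also be written more compactly. The sketch below is an alternative formulation rather than the author's code: it stores the automaton in arrays, fills in all missing transitions during the BFS, and keeps only two DP layers, so memory is proportional to (number of nodes) × (longest pattern length). The names `count_filtered`, `code`, `go` and `longest` are assumptions introduced for illustration.

```cpp
#include <algorithm>
#include <array>
#include <queue>
#include <string>
#include <vector>

const int MOD = 1000000009;

// Maps 'A', 'C', 'G', 'T' to 0..3 (input is assumed to contain only these letters).
int code(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;
    }
}

long long count_filtered(int n, const std::vector<std::string>& patterns) {
    const std::array<int, 4> kEmpty = {-1, -1, -1, -1};
    std::vector<std::array<int, 4>> go(1, kEmpty);  // go[v][c]: transition from v on c
    std::vector<int> fail(1, 0), longest(1, 0);     // longest[v]: longest pattern ending at v
    int max_len = 0;
    for (const std::string& p : patterns) {         // insert patterns into the trie
        int v = 0;
        for (char ch : p) {
            int c = code(ch);
            if (go[v][c] == -1) {
                int fresh = static_cast<int>(go.size());
                go.push_back(kEmpty);
                fail.push_back(0);
                longest.push_back(0);
                go[v][c] = fresh;
            }
            v = go[v][c];
        }
        longest[v] = std::max(longest[v], static_cast<int>(p.size()));
        max_len = std::max(max_len, static_cast<int>(p.size()));
    }
    std::queue<int> q;                              // BFS to set failure links
    for (int c = 0; c < 4; ++c) {
        if (go[0][c] == -1) go[0][c] = 0;
        else q.push(go[0][c]);
    }
    while (!q.empty()) {
        int v = q.front();
        q.pop();
        longest[v] = std::max(longest[v], longest[fail[v]]);
        for (int c = 0; c < 4; ++c) {
            int u = go[v][c];
            if (u == -1) {
                go[v][c] = go[fail[v]][c];          // borrow the transition of the fail node
            } else {
                fail[u] = go[fail[v]][c];
                q.push(u);
            }
        }
    }
    int nodes = static_cast<int>(go.size());
    // cur[v][k]: strings of the current length ending at node v with k uncovered characters.
    std::vector<std::vector<int>> cur(nodes, std::vector<int>(max_len, 0));
    cur[0][0] = 1;
    for (int i = 0; i < n; ++i) {
        std::vector<std::vector<int>> nxt(nodes, std::vector<int>(max_len, 0));
        for (int v = 0; v < nodes; ++v) {
            for (int k = 0; k < max_len; ++k) {
                int ways = cur[v][k];
                if (ways == 0) continue;
                for (int c = 0; c < 4; ++c) {
                    int u = go[v][c];
                    int nk = (longest[u] > k) ? 0 : k + 1;
                    if (nk >= max_len) continue;    // tail can no longer be covered
                    nxt[u][nk] = (nxt[u][nk] + ways) % MOD;
                }
            }
        }
        cur.swap(nxt);
    }
    long long ans = 0;
    for (int v = 0; v < nodes; ++v) ans = (ans + cur[v][0]) % MOD;
    return ans;
}
```

With this sketch, `count_filtered(2, {"A"})` returns 1 and `count_filtered(6, {"CAT", "TACT"})` returns 2, matching the two samples.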
