BLAST-Related Concepts
1. Basic Concepts
Similarity
In a sequence alignment, similarity is the proportion of identical or similar bases (or amino acid residues) between the query sequence and the target sequence, out of all aligned bases or residues. It is a quantitative judgment.

Homology
Homology describes distinct sequences that arose from a common ancestor through divergent evolution. Two proteins can be called homologous only if they share a common ancestor in evolutionary terms. It is a qualitative judgment.

Relationship between similarity and homology
When similarity is above 50%, it is relatively easy to infer that the query sequence and the target sequence may be homologous.

Sequence similarity comparison and homology analysis
Sequence similarity analysis computes the degree of similarity between the sequence under study and another sequence. Commonly used software packages include BLAST and FASTA.

Biological basis of local similarity alignment
When we obtain a BLAST result, we need to look at these two metrics.

fastacmd -d db_name -s p38398

From: http://www.biostatistician.cn/thread-467-1-1.html

Using BLASTClust to Make Non-redundant Sequence Sets
BLASTClust is a program within the standalone BLAST package used to cluster either protein or nucleotide sequences. The program begins with pairwise matches and places a sequence in a cluster if the sequence matches at least one sequence already in the cluster. In the case of proteins, the blastp algorithm is used to compute the pairwise matches; in the case of nucleotide sequences, the Megablast algorithm is used.

In the simplest case, BLASTClust takes as input a file containing concatenated FASTA-format sequences, each with a unique identifier at the start of the definition line. BLASTClust formats the input sequences to produce a temporary BLAST database, performs the clustering, and removes the database at completion. Hence, there is no need to run formatdb in advance to use BLASTClust.

The output of BLASTClust is a file with one cluster per line, each line listing sequence identifiers separated by spaces. The clusters are sorted from the largest cluster to the smallest.

BLASTClust accepts a number of parameters that control the stringency of clustering, including thresholds for score density, percent identity, and alignment length.

The BLASTClust program has a number of applications, the simplest of which is to create a non-redundant set of sequences from a source database. As an example, one might have a library of a few thousand short nucleotide sequence reads and wish to replace these with a non-redundant set.
To produce the non-redundant set, one might use:

    blastclust -i infile -o outfile -p F -L .9 -b T -S 95

The sequences in "infile" will be clustered and the results will be written to "outfile". The input sequences are identified as nucleotide (-p F); "-p T", or protein, is the default. To register a pairwise match, two sequences will need to be 95% identical (-S 95) over an area covering 90% of the length (-L .9) of each sequence (-b T). Using "-b F" instead of "-b T" would enforce the alignment length threshold on only one member of a sequence pair.

The parameter "S", used here to specify the percent identity, can also be used to specify, instead, a "score density." The latter is equivalent to the BLAST score divided by the alignment length. If "S" is given as a number between 0 and 3, it is interpreted as a score density threshold; otherwise it is interpreted as a percent identity threshold.

To create a stringent non-redundant protein sequence set, use the following command line:

    blastclust -i infile -o outfile -p T -L 1 -b T -S 100

In this case, only sequences which are identical will be clustered together. The "blastclust.txt" file in the standalone BLAST package details the full range of BLASTClust parameters.
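Once BLASTClust has written its output file, a non-redundant set still has to be derived from the clusters. The sketch below is one possible way to do that in Python, relying only on the output format described above (one cluster per line, identifiers separated by spaces). The function names `parse_clusters` and `pick_representatives` are hypothetical, and taking the first identifier on each line as the cluster's representative is an assumption made for illustration, not something BLASTClust prescribes.

```python
from typing import List


def parse_clusters(text: str) -> List[List[str]]:
    """Parse BLASTClust-style output: one cluster per line, IDs separated by spaces."""
    clusters = []
    for line in text.splitlines():
        ids = line.split()
        if ids:  # ignore blank lines
            clusters.append(ids)
    return clusters


def pick_representatives(clusters: List[List[str]]) -> List[str]:
    """Keep one identifier per cluster (assumption: the first one listed)."""
    return [cluster[0] for cluster in clusters]


if __name__ == "__main__":
    # Made-up identifiers standing in for the contents of "outfile".
    example_output = "read1 read4 read7\nread2 read3\nread5\n"
    clusters = parse_clusters(example_output)
    print(pick_representatives(clusters))
```

In practice the text would be read from the file passed to `-o`, and the selected identifiers could then be used to pull the matching records out of the original FASTA file.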