Here is the problem I am facing:

Converting a VCF file to PLINK format.

Method 1: vcftools

[lyc@200server ~]$ vcftools --vcf Rice.recode.vcf --plink --out output

It reported the following problems:

VCFtools - 0.1.17

(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:

--vcf Rice.recode.vcf

--out output

--plink

Warning: Expected at least 2 parts in FORMAT entry: ID=RNC,Number=2,Type=Character,Description="Reason for No Call in GT: . = n/a, M = Missing data, P = Partial data, I = gVCF input site is non-called, D = insufficient Depth of coverage, - = unrepresentable overlapping deletion, L = Lost/unrepresentable allele (other than deletion), U = multiple Unphased variants present, O = multiple Overlapping variants present, 1 = site is Monoallelic, no assertion about presence of REF or ALT allele">

After filtering, kept 141 out of 141 Individuals

Writing PLINK PED and MAP files ...

Unrecognized values used for CHROM: Chr1 - Replacing with 0.

Unrecognized values used for CHROM: Chr2 - Replacing with 0.

Unrecognized values used for CHROM: Chr3 - Replacing with 0.

Unrecognized values used for CHROM: Chr4 - Replacing with 0.

Unrecognized values used for CHROM: Chr5 - Replacing with 0.

Unrecognized values used for CHROM: Chr6 - Replacing with 0.

Unrecognized values used for CHROM: Chr7 - Replacing with 0.

Unrecognized values used for CHROM: Chr8 - Replacing with 0.

Unrecognized values used for CHROM: Chr9 - Replacing with 0.

Unrecognized values used for CHROM: Chr10 - Replacing with 0.

Unrecognized values used for CHROM: Chr11 - Replacing with 0.

Unrecognized values used for CHROM: Chr12 - Replacing with 0.

Unrecognized values used for CHROM: ChrUn - Replacing with 0.

Unrecognized values used for CHROM: ChrSy - Replacing with 0.

Unrecognized values used for CHROM: chrC - Replacing with 0.

Done.

After filtering, kept 7186300 out of a possible 7186300 Sites

Run Time = 1647.00 seconds
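The repeated "Unrecognized values used for CHROM: ... Replacing with 0" messages mean that every variant ends up on chromosome 0 in output.map, apparently because of the "Chr" prefix. One possible workaround, sketched here, is to strip that prefix from the CHROM column before converting. The function name strip_chr_prefix and the output file name are made up for illustration. ChrUn, ChrSy and chrC would still be non-numeric afterwards (Un, Sy, C) and would need separate handling.

```shell
# Hypothetical helper (not part of vcftools or PLINK): print a VCF with any
# leading "Chr"/"chr" removed from the CHROM column and from ##contig IDs.
# Assumes an uncompressed, tab-delimited VCF.
strip_chr_prefix() {
  awk 'BEGIN { FS = OFS = "\t" }
       # contig header lines: rename the ID to match the data lines
       /^##contig=<ID=[Cc]hr/ { sub(/ID=[Cc]hr/, "ID="); print; next }
       # all other header lines pass through unchanged
       /^#/ { print; next }
       # data lines: strip the prefix from column 1 only
       { sub(/^[Cc]hr/, "", $1); print }
      ' "$1"
}

# Example usage (file names are placeholders):
# strip_chr_prefix Rice.recode.vcf > Rice.nochr.vcf
```

With numeric names such as 1 to 12 in the first column, the conversion should no longer need to fall back to chromosome 0 for the main chromosomes.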

[lyc@200server ~]$ ls

file.log output.ped prettify

file-temporary.bed output-temporary.bed Rice.recode.vcf

file-temporary.bim output-temporary.bim test

file-temporary.fam output-temporary.fam toy.map

LICENSE plink toy.ped

output.log plink_linux_x86_64_20201019.zip

output.map ppp

[lyc@200server ~]$

Method 2: plink

plink --vcf Rice.recode.vcf --recode --out file

plink --vcf Rice.recode.vcf --recode --out output --double-id

plink --vcf Rice.recode.vcf --recode --out output --const-fid family_id

All three give the same result:

773821 MB RAM detected; reserving 386910 MB for main workspace.

Error: Line 38 of .vcf file has a GT half-call.

Use --vcf-half-call to specify how these should be processed.
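A half-call is a genotype in which only one of the two alleles is known, such as ./1 or 0/. in the GT field. To see which lines contain such genotypes before choosing a --vcf-half-call mode, something like the sketch below could be used. The function name find_half_calls is hypothetical, and the script assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT subfield.

```shell
# Hypothetical helper: for every VCF data line that contains at least one
# half-called genotype, print "<file line number><TAB><number of half-calls>".
find_half_calls() {
  awk -F'\t' '
    /^#/ { next }                       # skip header lines (NR still counts them)
    {
      n = 0
      for (i = 10; i <= NF; i++) {      # sample columns start at field 10
        split($i, parts, ":")           # GT is assumed to be the first subfield
        if (parts[1] ~ "^(\\.[/|][0-9]+|[0-9]+[/|]\\.)$") n++
      }
      if (n > 0) print NR "\t" n
    }' "$1"
}

# Example usage: show the first few affected lines.
# find_half_calls Rice.recode.vcf | head
```

The line numbers count header lines too, so they should be comparable to the "Line 38" that PLINK reports.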

I was at a loss.

Before, my directory contained these files:

file.log LICENSE plink Rice.recode.vcf

file-temporary.bed output.log plink_linux_x86_64_20201019.zip test

file-temporary.bim output.map ppp toy.map

file-temporary.fam output.ped prettify toy.ped

Now it contains these:

file.log output.ped prettify

file-temporary.bed output-temporary.bed Rice.recode.vcf

file-temporary.bim output-temporary.bim test

file-temporary.fam output-temporary.fam toy.map

LICENSE plink toy.ped

output.log plink_linux_x86_64_20201019.zip

output.map ppp

Then I came across this page:

https://www.cog-genomics.org/plink/1.9/input

So I installed PLINK 2 (plink2) to replace plink:

http://www.cog-genomics.org/plink/2.0/

Installing PLINK on Linux:

https://blog.csdn.net/qq_40605470/article/details/108882992

But then I ran into a new problem:

[lyc@200server ~]$ plink2 --vcf Rice.recode.vcf --recode --out ccc

PLINK v2.00a3LM 64-bit Intel (2 Mar 2021) www.cog-genomics.org/plink/2.0/

(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to ccc.log.

Options in effect:

--export ped

--out ccc

--vcf Rice.recode.vcf

Start time: Thu Mar 18 20:20:58 2021

773821 MiB RAM detected; reserving 386910 MiB for main workspace.

Using up to 80 threads (change this with --threads).

--vcf: 7146k variants scanned.

Error: Invalid chromosome code 'ChrUn' on line 7146662 of --vcf file.

(Use --allow-extra-chr to force it to be accepted.)

End time: Thu Mar 18 20:21:32 2021

And once again only temporary files were left behind:

[lyc@200server ~]$ ls

ccc.log plink2

ccc-temporary.psam plink2_linux_x86_64_20210302.zip

file.log plink2.log

file-temporary.bed plink_linux_x86_64_20201019.zip

file-temporary.bim ppp

file-temporary.fam prettify

LICENSE Rice.recode.vcf

output.log test

output.map toy.map

output.ped toy.ped

output-temporary.bed transform.log

output-temporary.bim transform-temporary.bed

output-temporary.fam transform-temporary.bim

plink transform-temporary.fam

which means the run failed again.

So now I probably also have to deal with the invalid chromosome code on that line.

I then added --allow-extra-chr:

[lyc@200server ~]$ plink2 --vcf Rice.recode.vcf --recode --out ccc --allow-extra-chr

PLINK v2.00a3LM 64-bit Intel (2 Mar 2021) www.cog-genomics.org/plink/2.0/

(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to ccc.log.

Options in effect:

--allow-extra-chr

--export ped

--out ccc

--vcf Rice.recode.vcf

Start time: Fri Mar 19 16:31:21 2021

773821 MiB RAM detected; reserving 386910 MiB for main workspace.

Using up to 80 threads (change this with --threads).

--vcf: 7186300 variants scanned.

Error: Line 38 of --vcf file has a GT half-call.

Use --vcf-half-call to specify how these should be processed.

End time: Fri Mar 19 16:32:26 2021

Ha, back to square one.

But as I understood it, plink2 was not supposed to have this problem.

[lyc@200server ~]$ plink2 --vcf Rice.recode.vcf --recode --out ccc --allow-extra-chr --vcf-half-call

PLINK v2.00a3LM 64-bit Intel (2 Mar 2021) www.cog-genomics.org/plink/2.0/

(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to ccc.log.

Options in effect:

--allow-extra-chr

--export ped

--out ccc

--vcf Rice.recode.vcf

--vcf-half-call

Start time: Fri Mar 19 16:33:16 2021

Error: Missing --vcf-half-call argument.

For more info, try "plink2 --help <flag name>" or "plink2 --help | more".

[lyc@200server ~]$ plink --vcf Rice.recode.vcf --allow-extra-chr --recode --vcf-half-call'missing' --out eee

PLINK v1.90b6.21 64-bit (19 Oct 2020) www.cog-genomics.org/plink/1.9/

(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to eee.log.

Options in effect:

--allow-extra-chr

--out eee

--recode

--vcf Rice.recode.vcf

--vcf-half-callmissing

Error: Unrecognized flag ('--vcf-half-callmissing').

For more information, try "plink --help <flag name>" or "plink --help | more".

It turned out that a space was missing between --vcf-half-call and its argument.

[lyc@200server ~]$ plink --vcf Rice.recode.vcf --allow-extra-chr --recode --vcf-half-call 'haploid' --out eee

PLINK v1.90b6.21 64-bit (19 Oct 2020) www.cog-genomics.org/plink/1.9/

(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to eee.log.

Options in effect:

--allow-extra-chr

--out eee

--recode

--vcf Rice.recode.vcf

--vcf-half-call haploid

773821 MB RAM detected; reserving 386910 MB for main workspace.

--vcf: eee-temporary.bed + eee-temporary.bim + eee-temporary.fam written.

7186300 variants loaded from .bim file.

141 people (0 males, 0 females, 141 ambiguous) loaded from .fam.

Ambiguous sex IDs written to eee.nosex .

Using 1 thread (no multithreaded calculations invoked).

Before main variant filters, 141 founders and 0 nonfounders present.

Calculating allele frequencies... done.

Total genotyping rate is 0.878065.

7186300 variants and 141 people pass filters and QC.

Note: No phenotypes present.

--recode ped to eee.ped + eee.map ... done.

[lyc@200server ~]$ plink --file eee --allow-extra-chr --make-bed --out rice

PLINK v1.90b6.21 64-bit (19 Oct 2020) www.cog-genomics.org/plink/1.9/

(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to rice.log.

Options in effect:

--allow-extra-chr

--file eee

--make-bed

--out rice

773821 MB RAM detected; reserving 386910 MB for main workspace.

.ped scan complete (for binary autoconversion).

Performing single-pass .bed write (7186300 variants, 141 people).

--file: rice-temporary.bed + rice-temporary.bim + rice-temporary.fam written.

7186300 variants loaded from .bim file.

141 people (0 males, 0 females, 141 ambiguous) loaded from .fam.

Ambiguous sex IDs written to rice.nosex .

Using 1 thread (no multithreaded calculations invoked).

Before main variant filters, 141 founders and 0 nonfounders present.

Calculating allele frequencies... done.

Total genotyping rate is 0.878065.

7186300 variants and 141 people pass filters and QC.

Note: No phenotypes present.

--make-bed to rice.bed + rice.bim + rice.fam ... done.

This time it finally worked.
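One way to sanity-check the result is to compare the size of rice.bed with what the PLINK 1 binary format implies: 3 header bytes plus, for each variant, one byte per four samples (rounded up). The helper name expected_bed_size is an assumption for illustration; the counts are the ones reported in the log above.

```shell
# Hypothetical helper: expected size in bytes of a variant-major PLINK 1 .bed
# file, given the number of samples and the number of variants.
expected_bed_size() {
  samples=$1
  variants=$2
  bytes_per_variant=$(( (samples + 3) / 4 ))   # 2 bits per genotype, rounded up
  echo $(( 3 + variants * bytes_per_variant )) # 3 leading magic/mode bytes
}

# Counts taken from the log: 141 people, 7186300 variants.
expected_bed_size 141 7186300

# Compare with the actual size of the file, for example:
# wc -c < rice.bed
```

If the two numbers differ, the .bed file is probably truncated or was written with a different sample or variant count than the .fam and .bim files.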

The remaining question concerns the first conversion: I do not know which mode is the right one.

I simply picked one and used it.
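For reference, the PLINK 1.9 documentation lists four --vcf-half-call modes: error (the default, which stops as seen above), haploid, missing and reference. The toy function below only illustrates one reading of those descriptions, applied to a single GT string. resolve_half_call is a made-up name and nothing in it calls PLINK.

```shell
# Illustration only: how each --vcf-half-call mode would treat one half-call.
# resolve_half_call is a hypothetical helper, not a PLINK command.
resolve_half_call() {
  mode=$1
  gt=$2
  known=$(printf '%s' "$gt" | tr -d './|')   # the one allele that was called
  case "$mode" in
    haploid|h)   printf '%s/%s\n' "$known" "$known" ;;  # treated as haploid/homozygous
    missing|m)   printf './.\n' ;;                      # whole genotype set to missing
    reference|r) printf '0/%s\n' "$known" ;;            # missing part becomes REF
    *)           echo "error: GT half-call" >&2; return 1 ;;  # default: stop
  esac
}

# Example usage:
# resolve_half_call haploid './1'
# resolve_half_call missing './1'
# resolve_half_call reference './1'
```

Which mode is appropriate depends on how the half-calls arose in the variant caller, so this is a decision about the data rather than about PLINK.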

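I found that plink2 still has the same problem. One answer I read seemed to say that the version is not new enough, and that the improved build released early this year might fix it.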

[lyc@200server ~]$ plink2 --vcf Rice.recode.vcf --allow-extra-chr --make-bed --out test

PLINK v1.90b6.21 64-bit (19 Oct 2020) www.cog-genomics.org/plink/1.9/

(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3

Logging to test.log.

Options in effect:

--allow-extra-chr

--make-bed

--out test

--vcf Rice.recode.vcf

773821 MB RAM detected; reserving 386910 MB for main workspace.

Error: Line 38 of .vcf file has a GT half-call.

Use --vcf-half-call to specify how these should be processed.
