Fine-tuning FunASR (finetune)

Relevant file paths
/root/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/

This directory contains finetune.yaml, which you can open and edit by hand to change the fine-tuning parameters.

tree /root/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/
/root/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/
├── README.md
├── am.mvn
├── config.yaml
├── configuration.json
├── decoding.yaml
├── example
│ └── asr_example.wav
├── fig
│ └── struct.png
├── finetune.yaml
├── lm
│ ├── lm.pb
│ └── lm.yaml
├── model.pb
├── seg_dict
└── tokens.txt

3 directories, 13 files
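
Before editing finetune.yaml it can help to confirm that the model download is complete. The following is a minimal sketch, not part of FunASR or ModelScope: the helper name missing_model_files and the EXPECTED_FILES list are made up for illustration, and the list is copied from the tree output above.

```python
from pathlib import Path

# File names taken from the tree output above (top-level files only).
EXPECTED_FILES = [
    "am.mvn", "config.yaml", "configuration.json", "decoding.yaml",
    "finetune.yaml", "model.pb", "seg_dict", "tokens.txt",
]

def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]

model_dir = Path(
    "/root/.cache/modelscope/hub/damo/"
    "speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
)
print("missing files:", missing_model_files(model_dir))

finetune_yaml = model_dir / "finetune.yaml"
if finetune_yaml.is_file():
    # Print the raw text so the parameters can be reviewed before editing.
    print(finetune_yaml.read_text(encoding="utf-8"))
```

An empty "missing files" list means every top-level file from the tree output is present.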

Official documentation references

here1
here2

The two links differ slightly. The second explains things more clearly; the first is mostly code, and that code is more complete than funasr's finetune.py.

Building a custom task
here3
I have not tried this in practice yet.

Printing logs

Method credit: Li Zerui (李泽瑞), Alibaba
from modelscope.utils.logger import get_logger
logger = get_logger()

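You can use this logger to print log messages. It is used the same way as logging, since it is just a thin wrapper around it. Inside funasr, this logger is handy for printing intermediate results.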
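
Because the logger is used the same way as logging, the calls can be tried out with the standard library alone. The sketch below uses a hypothetical make_logger helper as a stand-in for get_logger(); it is not the ModelScope implementation, and the logger name and message format are assumptions. With ModelScope installed, swapping in get_logger() should leave the logger.info / logger.warning calls unchanged.

```python
import logging

def make_logger(name="finetune_debug"):
    """Hypothetical stand-in for get_logger(), built on stdlib logging."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger

logger = make_logger()
# %-style arguments are only formatted if the record is actually emitted.
logger.info("batch size: %d", 32)
logger.warning("unexpected dtype: %s", "float32")
```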

Multi-GPU training

python finetune.py trains on a single GPU by default.
To try multiple GPUs:

NCCL_P2P_DISABLE=1 CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
NCCL_P2P_DISABLE=1 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node 8 finetune.py > log.txt 2>&1
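
The two commands differ only in the GPU list and the process count, so they can be generated from one list of GPU ids. This is a rough sketch: build_launch and launch are hypothetical helper names, and the environment variables and flags are copied from the commands above. It uses sys.executable (the current interpreter) where the shell commands use python.

```python
import os
import subprocess
import sys

def build_launch(gpu_ids, script="finetune.py"):
    """Return (env, argv) mirroring the shell commands above."""
    env = dict(os.environ)
    env["NCCL_P2P_DISABLE"] = "1"
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = [
        sys.executable, "-m", "torch.distributed.launch",
        "--nproc_per_node", str(len(gpu_ids)),  # one process per visible GPU
        script,
    ]
    return env, argv

def launch(gpu_ids, log_path="log.txt"):
    """Run the training script, sending stdout and stderr to log_path."""
    env, argv = build_launch(gpu_ids)
    with open(log_path, "w") as log:
        subprocess.run(argv, env=env, stdout=log,
                       stderr=subprocess.STDOUT, check=True)
```

Calling launch([0, 1]) would correspond to the two-GPU command, and launch(list(range(8))) to the eight-GPU one.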

Debugging with log output

Token processing file: funasr/datasets/large_datasets/utils/tokenize.py
Training iteration file: funasr/train/trainer.py, line 568

for iiter, (keys, batch) in enumerate(
    reporter.measure_iter_time(iterator, "iter_time"), 1
):

Log output

Create the logger at the top of the file:

from modelscope.utils.logger import get_logger
logger = get_logger()

Then print whatever you need inside the loop:

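    assert isinstance(batch, dict), type(batch)

    # Debugging aid: inspect the dtype of the text tensor in every batch.
    text = batch["text"]
    if text.dtype != torch.int64 and text.dtype != torch.int32:
        # Unexpected dtype: log the iteration index and the sample keys.
        logger.info("iiter: " + str(iiter))
        logger.info("text.dtype: " + str(text.dtype))
        logger.info("sample keys: {}".format(keys))
    else:
        logger.info("text.dtype: " + str(text.dtype))
        # Debugging only: this skips the rest of the loop body for the batch.
        continue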
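
The same check can be pulled into a small function so it can be tried outside trainer.py. This is a sketch under assumptions: check_text_dtype is a hypothetical helper, not a FunASR function, and it compares str(dtype) against the strings "torch.int64" and "torch.int32" (the string forms of those torch dtypes) so that the snippet itself does not need to import torch.

```python
import logging

logger = logging.getLogger("trainer_debug")

# String forms of the integer dtypes the check above accepts.
INT_DTYPES = {"torch.int64", "torch.int32"}

def check_text_dtype(iiter, keys, batch):
    """Log details for a batch whose "text" entry is not int32/int64.

    Returns True when the dtype is acceptable, False otherwise.
    """
    dtype_name = str(batch["text"].dtype)
    if dtype_name in INT_DTYPES:
        return True
    logger.warning(
        "iiter=%d text.dtype=%s sample keys=%s", iiter, dtype_name, keys
    )
    return False
```

Unlike the version above, this logs only the offending batches and leaves the decision to skip a batch to the caller.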

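Where funasr is located
There are two copies: the git clone of the project, and the package installed by pip/conda. If log statements added to the git clone have no effect, edit the corresponding files in the pip/conda installed package instead, and they will take effect.

pip/conda location: /root/anaconda3/envs/<your-env>/lib/python3.*/site-packages/funasr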
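
To find out which of the two copies Python actually imports, the interpreter can be asked directly. A minimal sketch using the standard library's importlib.util.find_spec; package_location is a hypothetical helper name.

```python
import importlib.util

def package_location(name):
    """Return the file a top-level `import name` would load, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# Prints None when funasr is not installed in the current environment.
print("funasr is imported from:", package_location("funasr"))
```

If the printed path is under site-packages, edits to the git clone will not be picked up, and that installed copy is the one to modify.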
