1. References

How to do speech recognition with Kaldi?

Kaldi official website

Kaldi repository

2. Background

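Kaldi is one of the most widely used open-source speech recognition toolkits. It aims to provide flexible, extensible components for speech signal processing, speech recognition, speaker recognition, and deep neural networks. Its decoding algorithms are built on weighted finite-state transducers (WFSTs). The core is written in C++, with bash and Python scripts layered on top as tooling. The quality of a real-time recognition system depends on recognition performance, and a recognizer consists of feature extraction, an acoustic model, a language model, and a decoder. Kaldi bundles nearly all the tools needed to build such a recognizer, and it supports training on GPUs.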

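Training speech models is generally a complex process: with the same dataset and the same model, results obtained by different people can differ considerably.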

2.1 TDNN

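TDNN and x-vector

[DL] Time-delay neural network (TDNN)

**Time-delay neural networks (TDNN)** were introduced in the 1989 paper "Phoneme recognition using time-delay neural networks". The paper uses a TDNN to recognize phonemes and reports 98.5% accuracy on the three voiced stops "B", "D", and "G", compared with 93.7% for an HMM.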

2.2 x-vector

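Speaker recognition: x-vector

Speaker recognition with X-Vector

The workflow for using x-vectors

Reading speaker recognition algorithms: x-vector

The x-vector paper was published at ICASSP 2018, and Daniel Povey, a core Kaldi developer, is one of its authors. The paper is:

X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION

X-vectors are a mainstream baseline framework in speaker recognition. Thanks to the statistics pooling layer in the network, an x-vector model accepts input of arbitrary length and maps it to a fixed-length representation. In addition, training uses data augmentation with added noise and reverberation, which makes the model more robust to these kinds of interference.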
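The effect of statistics pooling can be illustrated with a toy sketch. It assumes frame-level features arrive as text, one frame per line, and computes the per-dimension mean and standard deviation, so any number of frames maps to a vector of fixed length. The function name `stats_pool` is hypothetical, and this is not how Kaldi implements the layer.

```shell
#!/bin/sh
# Toy statistics pooling (illustration only, not Kaldi code).
# Input:  T lines with D numbers each (one frame per line).
# Output: one line with 2*D numbers: D means followed by D standard deviations.
stats_pool() {
  awk '
    {
      for (i = 1; i <= NF; i++) { sum[i] += $i; sumsq[i] += $i * $i }
      if (NF > dim) dim = NF
      frames++
    }
    END {
      if (frames == 0) exit 1
      out = ""
      for (i = 1; i <= dim; i++) out = out sprintf("%.4f ", sum[i] / frames)
      for (i = 1; i <= dim; i++) {
        mean = sum[i] / frames
        var = sumsq[i] / frames - mean * mean
        if (var < 0) var = 0          # guard against rounding error
        out = out sprintf("%.4f ", sqrt(var))
      }
      sub(/ $/, "", out)              # drop the trailing space
      print out
    }'
}

# Two inputs of different length (2 and 3 frames) both yield 4 numbers.
printf '1 2\n3 4\n' | stats_pool
printf '1 2\n3 4\n5 6\n' | stats_pool
```

Both calls print four values even though the inputs have different numbers of frames, which is the property that lets the network handle utterances of any duration.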

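3. Key steps

A step-by-step Kaldi speech recognition guide (2): configuring the open-source speech recognition toolkit Kaldi on Ubuntu 20.04

Compiling and installing Kaldi, and fixing build errors

3.1 Preparation

3.2 Install dependencies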

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install git
sudo apt-get install bc
sudo apt-get install g++
sudo apt-get install zlib1g-dev make automake autoconf bzip2 libtool subversion
sudo apt-get install libatlas3-base
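Before moving on, it can help to confirm that the command-line tools installed above are actually on the PATH. The following is a minimal sketch, not a replacement for Kaldi's own dependency check; the helper name `missing_tools` and the particular list of commands are assumptions made for illustration.

```shell
#!/bin/sh
# Print the names of the given commands that are NOT found on the PATH.
# missing_tools is a hypothetical helper, not part of Kaldi.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "${missing# }"
}

# Assumed list: the executables provided by the packages installed above.
result=$(missing_tools git bc g++ make automake autoconf bzip2 svn)
if [ -n "$result" ]; then
  echo "missing: $result"
else
  echo "all listed tools found"
fi
```

Any name printed after "missing:" points to a package that still needs to be installed with apt-get.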

3.3 Download the source code

git clone https://github.com/kaldi-asr/kaldi.git
cd kaldi/tools

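3.4 Check dependencies

Use the check_dependencies.sh script that ships with Kaldi to verify that all required dependencies are installed, and install whatever it reports as missing.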

extras/check_dependencies.sh

Install the missing dependencies

sudo apt-get install sox gfortran python2.7

Install MKL

extras/install_mkl.sh
extras/install_mkl.sh: Configuring ld runtime bindings
+ echo '/opt/intel/lib/intel64
/opt/intel/mkl/lib/intel64'
+ ldconfig
extras/install_mkl.sh: MKL package intel-mkl-64bit-2019.2-057 was successfully installed

Install OpenFst

Re-run configure

cd openfst-1.7.2/
sudo ./configure

Build OpenFst

cd ..
sudo make openfst

3.5 Check dependencies (again)

extras/check_dependencies.sh
extras/check_dependencies.sh: all OK.

3.6 Build tools

Once the dependency check passes, the tools and environment needed for the build are in place, so the next step is to build tools.

cd kaldi/tools
sudo make -j 8

3.7 Build src

cd ../src
./configure --shared
make depend -j
make -j 12
mkdir /mnt/d/MyDocuments/cache/kaldi/src/lib
The version of configure script matches kaldi.mk version. Good.
make -C base
make[1]: Entering directory '/mnt/d/MyDocuments/cache/kaldi/src/base'
...
...
...
make[1]: Leaving directory '/mnt/d/MyDocuments/cache/kaldi/src/latbin'
Done

3.8 Run the yesno example

cd ..
cd egs/yesno/s5
./run.sh
Preparing train and test data
Dictionary preparation succeeded
utils/prepare_lang.sh --position-dependent-phones false data/local/dict <SIL> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK
Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/optional_silence.txt is OK
Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/nonsilence_phones.txt is OK
Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.
Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/lexicon.txt is OK
Checking data/local/dict/extra_questions.txt ...
--> data/local/dict/extra_questions.txt is empty (this is OK)
--> SUCCESS [validating dictionary directory data/local/dict]
**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK
Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK
Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK
Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt
Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK
Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK
Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK
Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK
Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK
Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 3 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK
Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 3 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK
Checking data/lang/phones/extra_questions.{txt, int} ...
Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK
Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK
Checking topo ...
Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK
--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
Preparing language models for test
arpa2fst --disambig-symbol=#0 --read-symbol-table=data/lang_test_tg/words.txt input/task.arpabo data/lang_test_tg/G.fst
LOG (arpa2fst[5.5.1056~1-f6f4c]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.1056~1-f6f4c]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.1056~1-f6f4c]:RemoveRedundantStates():arpa-lm-compiler.cc:359) Reduced num-states from 1 to 1
fstisstochastic data/lang_test_tg/G.fst
1.20397 1.20397
Succeeded in formatting data.
steps/make_mfcc.sh --nj 1 data/train_yesno exp/make_mfcc/train_yesno mfcc
utils/validate_data_dir.sh: WARNING: you have only one speaker.  This probably a bad idea.
   Search for the word 'bold' in http://kaldi-asr.org/doc/data_prep.html
   for more information.
utils/validate_data_dir.sh: Successfully validated data-directory data/train_yesno
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for train_yesno
steps/compute_cmvn_stats.sh data/train_yesno exp/make_mfcc/train_yesno mfcc
Succeeded creating CMVN stats for train_yesno
fix_data_dir.sh: kept all 31 utterances.
fix_data_dir.sh: old files are kept in data/train_yesno/.backup
steps/make_mfcc.sh --nj 1 data/test_yesno exp/make_mfcc/test_yesno mfcc
utils/validate_data_dir.sh: WARNING: you have only one speaker.  This probably a bad idea.
   Search for the word 'bold' in http://kaldi-asr.org/doc/data_prep.html
   for more information.
utils/validate_data_dir.sh: Successfully validated data-directory data/test_yesno
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: It seems not all of the feature files were successfully procesed (29 != 31); consider using utils/fix_data_dir.sh data/test_yesno
steps/make_mfcc.sh: Less than 95% the features were successfully generated. Probably a serious error.
steps/compute_cmvn_stats.sh data/test_yesno exp/make_mfcc/test_yesno mfcc
Succeeded creating CMVN stats for test_yesno
fix_data_dir.sh: kept 29 utterances out of 31
fix_data_dir.sh: old files are kept in data/test_yesno/.backup
steps/train_mono.sh --nj 1 --cmd utils/run.pl --totgauss 400 data/train_yesno data/lang exp/mono0a
steps/train_mono.sh: Initializing monophone system.
steps/train_mono.sh: Compiling training graphs
steps/train_mono.sh: Aligning data equally (pass 0)
steps/train_mono.sh: Pass 1
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 2
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 3
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 4
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 5
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 6
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 7
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 8
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 9
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 10
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 11
steps/train_mono.sh: Pass 12
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 13
steps/train_mono.sh: Pass 14
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 15
steps/train_mono.sh: Pass 16
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 17
steps/train_mono.sh: Pass 18
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 19
steps/train_mono.sh: Pass 20
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 21
steps/train_mono.sh: Pass 22
steps/train_mono.sh: Pass 23
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 24
steps/train_mono.sh: Pass 25
steps/train_mono.sh: Pass 26
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 27
steps/train_mono.sh: Pass 28
steps/train_mono.sh: Pass 29
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 30
steps/train_mono.sh: Pass 31
steps/train_mono.sh: Pass 32
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 33
steps/train_mono.sh: Pass 34
steps/train_mono.sh: Pass 35
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 36
steps/train_mono.sh: Pass 37
steps/train_mono.sh: Pass 38
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 39
steps/diagnostic/analyze_alignments.sh --cmd utils/run.pl data/lang exp/mono0a
run.pl: job failed, log is in exp/mono0a/log/analyze_alignments.log
steps/diagnostic/analyze_alignments.sh: analyze_phone_length_stats.py failed, but ignoring the error (it's just for diagnostics)
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono0a/log/analyze_alignments.log
1 warnings in exp/mono0a/log/update.*.log
exp/mono0a: nj=1 align prob=-81.88 over 0.05h [retry=0.0%, fail=0.0%] states=11 gauss=371
steps/train_mono.sh: Done training monophone system in exp/mono0a
tree-info exp/mono0a/tree
tree-info exp/mono0a/tree
fstminimizeencoded
fstpushspecial
fsttablecompose data/lang_test_tg/L_disambig.fst data/lang_test_tg/G.fst
fstdeterminizestar --use-log=true
fstisstochastic data/lang_test_tg/tmp/LG.fst
0.534295 0.533859
[info]: LG not stochastic.
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_tg/phones/disambig.int --write-disambig-syms=data/lang_test_tg/tmp/disambig_ilabels_1_0.int data/lang_test_tg/tmp/ilabels_1_0.11190 data/lang_test_tg/tmp/LG.fst
fstisstochastic data/lang_test_tg/tmp/CLG_1_0.fst
0.534295 0.533859
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/mono0a/graph_tgpr/disambig_tid.int --transition-scale=1.0 data/lang_test_tg/tmp/ilabels_1_0 exp/mono0a/tree exp/mono0a/final.mdl
fstrmsymbols exp/mono0a/graph_tgpr/disambig_tid.int
fsttablecompose exp/mono0a/graph_tgpr/Ha.fst data/lang_test_tg/tmp/CLG_1_0.fst
fstdeterminizestar --use-log=true
fstrmepslocal
fstminimizeencoded
fstisstochastic exp/mono0a/graph_tgpr/HCLGa.fst
0.5342 -0.000422432
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono0a/final.mdl exp/mono0a/graph_tgpr/HCLGa.fst
steps/decode.sh --nj 1 --cmd utils/run.pl exp/mono0a/graph_tgpr data/test_yesno exp/mono0a/decode_test_yesno
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd utils/run.pl exp/mono0a/graph_tgpr exp/mono0a/decode_test_yesno
run.pl: job failed, log is in exp/mono0a/decode_test_yesno/log/analyze_alignments.log
local/score.sh --cmd utils/run.pl data/test_yesno exp/mono0a/graph_tgpr exp/mono0a/decode_test_yesno
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0

3.9 Successful run

%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0
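The %WER line reports the word error rate as the number of insertions, deletions, and substitutions divided by the number of reference words (here 0 errors out of 232 words). As a rough sketch of that arithmetic, the helper below recomputes the percentage from such a line. The name `wer_from_line` is hypothetical, and the parser assumes exactly the field layout shown above.

```shell
#!/bin/sh
# Recompute WER (%) from a scoring line in the layout shown above:
#   %WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] <path>
# wer_from_line is a hypothetical helper, not a Kaldi script.
wer_from_line() {
  printf '%s\n' "$1" | awk '{
    gsub(/,/, "")                       # drop commas so counts are clean fields
    total = $6; ins = $7; del = $9; subs = $11
    if (total <= 0) exit 1              # avoid dividing by zero
    printf "%.2f\n", 100 * (ins + del + subs) / total
  }'
}

wer_from_line '%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0'
```

For the yesno result the recomputed value is 0.00, matching the figure Kaldi printed.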

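4. Speaker recognition system

This section builds a speaker recognition system on the AISHELL dataset using the x-vector recipe in Kaldi.

The system has three parts:

  1. Front-end preprocessing: MFCC feature extraction, VAD, and data augmentation (adding reverberation and different types of noise).
  2. A TDNN-based feature extractor that produces the speaker representation, also called an embedding or x-vector.
  3. Back-end processing: LDA reduces the dimensionality of the speaker representations, and a PLDA model is trained to score the test trials.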
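The three parts map naturally onto the stage-gated script layout that Kaldi recipes commonly use, where a stage variable lets a long run resume partway through. The skeleton below is only a sketch of that layout: the function name `run_pipeline` and the stage numbering are assumptions, and the echo lines are placeholders for the real recipe commands, which are not reproduced here.

```shell
#!/bin/sh
# Skeleton of a stage-gated pipeline (placeholders only, no real Kaldi calls).
# run_pipeline and the stage numbers are hypothetical.
run_pipeline() {
  stage=${1:-0}   # start from this stage; earlier stages are skipped
  if [ "$stage" -le 0 ]; then
    echo "stage 0: front end (MFCC, VAD, augmentation)"
  fi
  if [ "$stage" -le 1 ]; then
    echo "stage 1: train the TDNN and extract x-vectors"
  fi
  if [ "$stage" -le 2 ]; then
    echo "stage 2: back end (LDA, PLDA training, scoring)"
  fi
}

run_pipeline 0
```

Calling `run_pipeline 2` would skip the front end and the network and run only the back end, which is useful when scoring fails after hours of feature extraction and training.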

5. FAQ

Q: c++: fatal error: Killed signal terminated program cc1plus

Resolving "c++: fatal error: Killed signal terminated program cc1plus"

c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: arpa-to-const-arpa.o] Error 1
make[1]: Leaving directory '/mnt/d/MyDocuments/cache/kaldi/src/lmbin'
make: *** [Makefile:168: lmbin] Error 2
make: *** Waiting for unfinished jobs....
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: apply-cmvn.o] Error 1
make[1]: *** Waiting for unfinished jobs....
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: feat-to-dim.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: ivector-subtract-global-mean.o] Error 1
make[1]: *** Waiting for unfinished jobs....
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: compose-transforms.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: minimize-lattice.o] Error 1
make[1]: *** Waiting for unfinished jobs....
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: ivector-mean.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: compute-and-process-kaldi-pitch-feats.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: paste-vectors.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: extend-transform-dim.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: extract-feature-segments.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: multiply-vectors.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: paste-feats.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: compute-cmvn-stats-two-channel.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: subset-feats.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: compute-fbank-feats.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: fmpe-acc-stats.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: compute-vad-from-frame-likes.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: fmpe-apply-transform.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: compute-mfcc-feats.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: agglomerative-cluster.o] Error 1
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: determinize-lattice-pruned.o] Error 1
make[1]: Leaving directory '/mnt/d/MyDocuments/cache/kaldi/src/featbin'
make: *** [Makefile:168: featbin] Error 2
make[1]: Leaving directory '/mnt/d/MyDocuments/cache/kaldi/src/ivectorbin'
make: *** [Makefile:168: ivectorbin] Error 2
ar -cr kaldi-nnet.a nnet-nnet.o nnet-component.o nnet-loss.o nnet-pdf-prior.o nnet-randomizer.o
ranlib kaldi-nnet.a
c++ -shared -o libkaldi-nnet.so -Wl,--as-needed  -Wl,-soname=libkaldi-nnet.so,--whole-archive kaldi-nnet.a -Wl,--no-whole-archive -Wl,-rpath=/mnt/d/MyDocuments/cache/kaldi/tools/openfst-1.7.2/lib -rdynamic   -Wl,-rpath=/mnt/d/MyDocuments/cache/kaldi/src/lib  ../cudamatrix/libkaldi-cudamatrix.so  ../hmm/libkaldi-hmm.so  ../tree/libkaldi-tree.so  ../util/libkaldi-util.so  ../matrix/libkaldi-matrix.so  ../base/libkaldi-base.so /mnt/d/MyDocuments/cache/kaldi/tools/openfst-1.7.2/lib/libfst.so -L/opt/intel/mkl/lib/intel64 -Wl,-rpath=/opt/intel/mkl/lib/intel64 -l:libmkl_intel_lp64.so -l:libmkl_core.so -l:libmkl_sequential.so -ldl -lpthread -lm -lm -lpthread -ldl
ln -sf /mnt/d/MyDocuments/cache/kaldi/src/nnet/libkaldi-nnet.so /mnt/d/MyDocuments/cache/kaldi/src/lib/libkaldi-nnet.so
make[1]: Leaving directory '/mnt/d/MyDocuments/cache/kaldi/src/nnet'
make[1]: Leaving directory '/mnt/d/MyDocuments/cache/kaldi/src/lat'
make: *** [Makefile:164: lat] Error 2
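Cause:
The machine ran out of memory, so the compiler processes were killed and the build was interrupted.
Solution:
Reduce the number of parallel jobs that make runs. The failing build above used make -j 12; a smaller value lowers peak memory use, for example:
make -j 2
Other options:
(1) Add more memory
(2) Add swap space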
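Choosing a smaller job count can be sketched as a small calculation from available memory and core count. The helper name `suggest_jobs` and the budget of roughly 2 GB per compile job are assumptions for illustration, not figures published by Kaldi, so the budget should be adjusted to what the build actually needs.

```shell
#!/bin/sh
# Suggest a `make -j` value from available memory (kB) and CPU core count.
# ASSUMPTION: about 2 GB of RAM per compile job (rough rule of thumb).
# suggest_jobs is a hypothetical helper, not part of Kaldi.
suggest_jobs() {
  mem_kb=$1
  cpus=$2
  per_job_kb=${3:-2097152}              # optional third argument overrides the budget
  jobs=$((mem_kb / per_job_kb))
  if [ "$jobs" -gt "$cpus" ]; then jobs=$cpus; fi   # never exceed the core count
  if [ "$jobs" -lt 1 ]; then jobs=1; fi             # always allow one job
  echo "$jobs"
}

# Usage on Linux: read MemAvailable from /proc/meminfo and the core count from nproc.
if [ -r /proc/meminfo ]; then
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  echo "suggested: make -j $(suggest_jobs "${avail_kb:-0}" "$(nproc 2>/dev/null || echo 1)")"
fi
```

With 8 GB available and 12 cores, this sketch suggests 4 jobs rather than 12, trading build time for a lower risk of the compiler being killed.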
