wav2letter++: First Training Log

The w2l GitHub repository has a demo:
https://github.com/facebookresearch/wav2letter/tree/master/tutorials/1-librispeech_clean
I followed the demo to train a model.
One command differs from the instructions:
wav2letter/tutorials/librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
has to be changed to
wav2letter/tutorials/1-librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
Even so, running it still failed:

wav2letter/tutorials/1-librispeech_clean/prepare_data.py: line 22:
Copyright (c) Facebook, Inc. and its affiliates.
All rights reserved.
This source code is licensed under the BSD-style license found in the
LICENSE file in the root directory of this source tree.
---------
Script to package original Mini Librispeech datasets into a form readable in
wav2letter++ pipelines
[If you haven't downloaded the datasets] Please download all the original datasets
in a folder on your own
> wget -qO- http://www.openslr.org/resources/12/train-clean-100.tar.gz | tar xvz
> wget -qO- http://www.openslr.org/resources/12/dev-clean.tar.gz | tar xvz
> wget -qO- http://www.openslr.org/resources/12/test-clean.tar.gz | tar xvz
Command : prepare_data.py --src [...]/LibriSpeech/ --dst [...]
Replace [...] with appropriate paths
: File name too long
from: can't read /var/mail/__future__

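The message complained that a name was too long, and I had indeed, out of habit, put the directory inside a fairly deep subdirectory. So I set the directories up again as the tutorial describes, but it still failed. After some analysis I concluded that my system launches Python 2 by default, so I prefixed the command with python3.7 by hand and it ran normally. (Both messages are typical of a shell trying to interpret a Python file itself: the docstring is taken as one very long command name and "from" is the mail utility, so naming the interpreter explicitly sidesteps them.)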

python3.7 wav2letter/tutorials/1-librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
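
A small guard can make an interpreter mismatch visible before a script gets far. The sketch below is only an illustration: the function name check_interpreter and the (3, 6) threshold are assumptions made for this example, not something the wav2letter tutorial specifies. It uses % formatting so that it also runs under Python 2.7.

```python
import sys

def check_interpreter(required=(3, 6), current=None):
    """Return (ok, message) saying whether `current` satisfies `required`.

    `required` is a hypothetical minimum version chosen for this example.
    `current` defaults to the running interpreter; it is a parameter only so
    the check can be tried against other version tuples.
    """
    current = tuple(current or sys.version_info[:3])
    ok = current[:len(required)] >= tuple(required)
    found = ".".join(str(part) for part in current)
    needed = ".".join(str(part) for part in required)
    if ok:
        return True, "Python %s is new enough (need >= %s)" % (found, needed)
    return False, "Python %s is too old (need >= %s)" % (found, needed)

# Report on whichever interpreter is executing this file.
print(check_interpreter()[1])
```

Running the same file with python and with python3.7 shows at a glance which interpreter each command name resolves to.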

I then made the change permanent in bashrc:

alias python='/usr/local/bin/python3.7'
alias pip='/usr/local/bin/pip3.7'

Running the training command then failed:
wav2letter/build/Train train --flagsfile wav2letter/tutorials/1-librispeech_clean/train.cfg

terminate called after throwing an instance of 'std::runtime_error'
  what():  loadSound: unknown format or could not open stream
*** Aborted at 1569812537 (unix time) try "date -d @1569812537" if you are using GNU date ***
PC: @     0x7f37686eb428 gsignal
*** SIGABRT (@0x3e8000034a6) received by PID 13478 (TID 0x7f377400c800) from PID 13478; stack trace: ***
    @     0x7f3769565390 (unknown)
    @     0x7f37686eb428 gsignal
    @     0x7f37686ed02a abort
    @     0x7f376903084d __gnu_cxx::__verbose_terminate_handler()
    @     0x7f376902e6b6 (unknown)
    @     0x7f376902e701 std::terminate()
    @     0x7f376902e919 __cxa_throw
    @           0x5d1ca1 w2l::loadSound<>()
    @           0x5d1e60 w2l::loadSound<>()
    @           0x5e230a w2l::W2lListFilesDataset::getLoaderData()
    @           0x5d77f4 w2l::W2lDataset::getFeatureData()
    @           0x5d8da9 w2l::W2lDataset::getFeatureDataAndPrefetch()
    @           0x5d90de w2l::W2lDataset::get()
    @           0x4821c4 _ZZ4mainENKUlSt10shared_ptrIN2fl6ModuleEES_IN3w2l17SequenceCriterionEES_INS3_10W2lDatasetEES_INS0_19FirstOrderOptimizerEES9_ddbiE3_clES2_S5_S7_S9_S9_ddbi.constprop.11419
    @           0x41b318 main
    @     0x7f37686d6830 __libc_start_main
    @           0x47d7b9 _start
Aborted

I reinstalled libsndfile following its default instructions:

git clone git://github.com/erikd/libsndfile.git
cd libsndfile
./autogen.sh
./configure --enable-werror
make
make check
sudo make install

Then I ran:

sndfile-play /w2l/LibriSpeech/train-clean-100/103/1240/103-1240-0009.flac

The audio played back fine, yet training still failed with the same error.
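
Since "unknown format or could not open stream" can also come from a missing or damaged file, it may be worth ruling that out cheaply before rebuilding any library. The sketch below only looks at the first four bytes of each file and does not call libsndfile at all. The names MAGIC, sniff_audio and check_list are made up for this example, and the table is deliberately small (a FLAC file that starts with an ID3 tag, for instance, would be reported as unknown).

```python
import os

# Leading bytes of a few audio containers; illustrative, not exhaustive.
MAGIC = {
    b"fLaC": "flac",
    b"OggS": "ogg",
    b"RIFF": "wav (RIFF)",
}

def sniff_audio(path):
    """Return a short diagnosis for `path` based on its first 4 bytes."""
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as handle:
        head = handle.read(4)
    if len(head) < 4:
        return "too short"
    return MAGIC.get(head, "unknown")

def check_list(paths):
    """Map every path to its diagnosis so that bad entries stand out."""
    return {path: sniff_audio(path) for path in paths}
```

If every file in the training list comes back as flac, as the successful sndfile-play run already suggests, the problem lies in how the library used by wav2letter was built rather than in the data.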

https://github.com/facebookresearch/wav2letter/issues/241
https://github.com/facebookresearch/wav2letter/issues/360
Both issues mention this problem, but neither explains how to install a build with Ogg/Opus support. So I searched the libsndfile source tree:
grep -rn Ogg
and found this in README.md:

  • ENABLE_EXTERNAL_LIBS - enable Ogg, Vorbis and FLAC support. This option is
    available and set to ON if all dependency libraries were found.

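So I forced the option ON:
cmake … -DENABLE_EXTERNAL_LIBS=ON
After the build it still did not seem to work. Thinking it over calmly, the README says the option is switched ON automatically once all the dependency libraries are found, so the real key was to get hold of Opus. I found the official site, downloaded the package and unpacked it.
Build and install: ./configure && make && make install
Configuring libsndfile then printed: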

-- Could NOT find Sndio (missing:  SNDIO_LIBRARY SNDIO_INCLUDE_DIR)
-- Found Opus: /usr/local/lib/libopus.so (found version "1.3.1")
-- Could NOT find Speex (missing:  SPEEX_LIBRARY SPEEX_INCLUDE_DIR)
-- Checking processor clipping capabilities...
-- Checking processor clipping capabilities... none
-- The following features have been disabled:
 * BUILD_SHARED_LIBS , build shared libraries
 * ENABLE_EXPERIMENTAL , enable experimental code
 * ENABLE_CPU_CLIP , Enable tricky cpu specific clipper
 * ENABLE_BOW_DOCS , enable black-on-white html docs
-- Configuring done
-- Generating done
-- Build files have been written to: /home/changshengwu/devpath/machinelearning/wav2letter/libsndfile/CMakeBuild
-- The following features have been enabled:
 * BUILD_SHARED_LIBS , build shared libraries
 * ENABLE_EXTERNAL_LIBS , enable FLAC, Vorbis, and Opus codecs
 * BUILD_REGTEST , build regtest
 * ENABLE_CPACK , enable CPack support
 * ENABLE_PACKAGE_CONFIG , generate and install package config file
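
Configure logs like this are long, and the "Found" / "Could NOT find" lines are easy to overlook. As a rough sketch, the following pulls them out of saved cmake output. The helper names and the default codec list (Ogg, Vorbis, FLAC, Opus) are assumptions for this example, based only on the log lines quoted in this post, not on libsndfile's build scripts.

```python
import re

# Patterns for the two kinds of lines seen in the configure output above.
FOUND = re.compile(r"^-- Found (\w+): (\S+)")
MISSING = re.compile(r"^-- Could NOT find (\w+)")

def summarize_configure(log_text):
    """Return ({package: location}, [packages reported as not found])."""
    found, missing = {}, []
    for line in log_text.splitlines():
        match = FOUND.match(line)
        if match:
            found[match.group(1)] = match.group(2)
            continue
        match = MISSING.match(line)
        if match:
            missing.append(match.group(1))
    return found, missing

def missing_codecs(found, needed=("Ogg", "Vorbis", "FLAC", "Opus")):
    """List the codec packages from `needed` that did not show up as found."""
    return [name for name in needed if name not in found]
```

An empty list from missing_codecs is the situation in which, according to the README, ENABLE_EXTERNAL_LIBS should end up ON without being forced.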

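At this point, however, the build itself failed with "‘round@@GLIBC_2.2.5’ error when compling with BUILD_SHARED_LIBS=ON", and nothing I tried would fix it. Because the autogen.sh and configure logs showed python still being picked up from /usr/bin/python, that is Python 2.7, I worried about a Python compatibility problem and made the switch system-wide: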

sudo mv /usr/bin/python /usr/bin/python.bak.bak
sudo ln -s /usr/local/bin/python3.7 /usr/bin/python

The build still failed. I filed an issue on GitHub and then read the libsndfile README more carefully:

Configuring CMake

You can pass additional options with /D<parameter>=<value> when you run cmake command. Some useful system options:

  • CMAKE_C_FLAGS - additional C compiler flags
  • CMAKE_BUILD_TYPE - configuration type, DEBUG, RELEASE, RELWITHDEBINFO or MINSIZEREL. DEBUG is default
  • CMAKE_INSTALL_PREFIX - build install location, the same as --prefix option of configure script

Useful libsndfile options:

  • BUILD_SHARED_LIBS - build shared library (DLL under Windows) when ON, build static library othervise. This option is ON by default.
  • BUILD_PROGRAMS - build libsndfile's utilities from programs/ directory, ON by default.
  • BUILD_EXAMPLES - build examples, ON by default.
  • BUILD_TESTING - build tests. Then you can run tests with ctest command, ON by default. Setting BUILD_SHARED_LIBS to ON disables this option.
  • ENABLE_EXTERNAL_LIBS - enable Ogg, Vorbis and FLAC support. This option is available and set to ON if all dependency libraries were found.
  • ENABLE_CPU_CLIP - enable tricky cpu specific clipper. Enabled and set to ON when CPU clips negative\positive. Don't touch it if you are not sure
  • ENABLE_BOW_DOCS - enable black-on-white documentation theme, OFF by default.
  • ENABLE_EXPERIMENTAL - enable experimental code. Don't use it if you are not sure. This option is OFF by default.
  • ENABLE_CPACK - enable CPack support. This option is ON by default.
  • ENABLE_PACKAGE_CONFIG - generate and install package config file. This option is ON by default.
  • ENABLE_STATIC_RUNTIME - enable static runtime on Windows platform, OFF by default.
  • ENABLE_COMPATIBLE_LIBSNDFILE_NAME - set DLL name to libsndfile-1.dll (canonical name) on Windows platform, sndfile.dll otherwise, OFF by default. Library name can be different depending on platform. The well known DLL name on Windows platform is libsndfile-1.dll, because the only way to build Windows library before was MinGW toolchain with Autotools. This name is native for MinGW ecosystem, Autotools constructs it using MinGW platform rules from sndfile target. But when you build with CMake using native Windows compiler, the name is sndfile.dll. This is name for native Windows platform, because Windows has no library naming rules. It is preffered because you can search library using package manager or CMake's find_library command on any platform using the same sndfile name.

Deprecated options:

  • DISABLE_EXTERNAL_LIBS - disable Ogg, Vorbis and FLAC support. Replaced by ENABLE_EXTERNAL_LIBS
  • DISABLE_CPU_CLIP - disable tricky cpu specific clipper. Replaced by ENABLE_CPU_CLIP
  • BUILD_STATIC_LIBS - build static library. Use BUILD_SHARED_LIBS instead

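Going over the configuration again, I took BUILD_SHARED_LIBS = ON to be aimed mainly at Windows (the README mentions DLLs). I also checked /usr/local/lib and /usr/local/include, and the libsndfile .so and header files were both in place. By now I was close to losing my mind.
Next I decided to reconfigure and rebuild wav2letter itself. The cmake log afterwards looked normal: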

-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.8")
-- Found Ogg: /usr/lib/x86_64-linux-gnu/libogg.so (found version "1.3.2")
-- Required SndFile dependency Ogg found.
-- Found Vorbis: /usr/lib/x86_64-linux-gnu/libvorbis.so (found version "1.3.5")
-- Required SndFile dependency Vorbis found.
-- Found VorbisEnc: /usr/lib/x86_64-linux-gnu/libvorbisenc.so (found version "1.3.5")
-- Required SndFile dependency VorbisEnc found.
-- Found FLAC: /usr/lib/x86_64-linux-gnu/libFLAC.so (found version "1.3.1")
-- Required SndFile dependency FLAC found.
-- Found SNDFILE: /usr/local/include
-- Found libsndfile: (lib: /usr/local/lib/libsndfile.so include: /usr/local/include
-- libsndfile found.

This time, once the build finished, Train no longer reported the error and finally ran through.

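Training

Training printed no progress output at all. In CPU mode it took roughly 40 hours on my machine. At the end, the training directory librispeech_clean_trainlogs contained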

 001_config  001_log  001_model_last.bin  001_model_lists#dev-clean.lst.bin  001_perf

Judging from decode.cfg, 001_model_lists#dev-clean.lst.bin is the file used for decoding, so it should be our training result.

Decoding

Following the tutorial, I went on to run decode:

./wav2letter/build/Decoder --flagsfile wav2letter/tutorials/1-librispeech_clean/decode.cfg

This version logs extremely verbosely, and at the beginning there were many messages like these:

Falling back to using letters as targets for the unknown word: valleyed
Falling back to using letters as targets for the unknown word: woodbegirt
Falling back to using letters as targets for the unknown word: citadelled
Falling back to using letters as targets for the unknown word: dedalos
Falling back to using letters as targets for the unknown word: hazewrapped
Falling back to using letters as targets for the unknown word: chiaroscurists
Falling back to using letters as targets for the unknown word: chiaroscurist
Falling back to using letters as targets for the unknown word: crampness
Falling back to using letters as targets for the unknown word: angor
Falling back to using letters as targets for the unknown word: greeing
Falling back to using letters as targets for the unknown word: million'd
Falling back to using letters as targets for the unknown word: sharp'st
.......
Skipping unknown entry: 'semon's'
Skipping unknown entry: 'battleax'
Falling back to using letters as targets for the unknown word: andella
Falling back to using letters as targets for the unknown word: andella
Skipping unknown entry: 'andella'
Skipping unknown entry: 'andella'
Falling back to using letters as targets for the unknown word: kaffar's
Skipping unknown entry: 'kaffar's'
Falling back to using letters as targets for the unknown word: thel
....
|T|: and henry might return to england at any moment
|P|: and henry might return to england at any moment
[sample: test-clean-61-70968-0048, WER: 0%, LER: 0%, slice WER: 17.9706%, slice LER: 8.60217%, progress (slice 1): 99.542%]
|T|: i love thee with a love i seemed to lose with my lost saints i love thee with the breath smiles tears of all my life and if god choose i shall but love thee better after death
|P|: i love thee with a love i seemed to lose with my lost saints i love thee with the breath smiles tears of all my length and if god choose i shall but love the better after death
[sample: test-clean-908-31957-0025, WER: 5.26316%, LER: 3.42857%, slice WER: 18.705%, slice LER: 8.92016%, progress (slice 3): 98.0153%]
|T|: ain't they the greatest
|P|: and then the gratis
[sample: test-clean-4992-41806-0010, WER: 75%, LER: 30.4348%, slice WER: 17.9879%, slice LER: 8.60928%, progress (slice 1): 99.6947%]
|T|: on she hurried until sweeping down to the lagoon and the island lo the cotton lay before her
|P|: on she hurried until sweeping down to the lagoon and the island lo the cotton lay before her
[sample: test-clean-1995-1837-0027, WER: 0%, LER: 0%, slice WER: 17.9634%, slice LER: 8.59807%, progress (slice 1): 99.8473%]
|T|: the christmas holidays came and she and anne returned to the parsonage and to that happy home circle in which alone their natures expanded amongst all other people they shrivelled up more or less
|P|: the christmas holidays came and she and ann returned to the parsonage and that happy home circle in which alone their natures expanded amongst all other people that from all that more lush
[sample: test-clean-3575-170457-0039, WER: 23.5294%, LER: 11.2821%, slice WER: 18.7173%, slice LER: 8.9266%, progress (slice 3): 98.1679%]
|T|: in strict accuracy nothing should be included under the head of conspicuous waste but such expenditure as is incurred on the ground of an invidious pecuniary comparison
|P|: in strict accuracy nothing should be included under the head of conspicuous waste but such expenditure as is in court on the ground of an envious pecuniary comparison
[sample: test-clean-3570-5696-0009, WER: 11.1111%, LER: 4.7619%, slice WER: 17.9495%, slice LER: 8.58897%, progress (slice 1): 100%]

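Because my earlier flashlight tests had thrown many errors, I lost confidence, cancelled the decode, spent a long time fiddling with arrayfire and flashlight, and finally ran it all again. The log above is in fact normal: |T| is the target (reference) transcript, |P| is the predicted transcript (the hypothesis), the sample fields give the WER and LER of that single utterance, and slice WER and slice LER give the running result accumulated over the slice of the test set being decoded.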
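
To make the per-sample numbers concrete, here is a minimal sketch of word and letter error rate as an edit distance divided by the reference length. It is not wav2letter's own code; it assumes plain whitespace word splitting and treats each character, including the space, as one letter token. The sample is the third one from the log above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,             # delete r from the reference
                current[j - 1] + 1,          # insert h from the hypothesis
                previous[j - 1] + (r != h),  # match or substitute
            ))
        previous = current
    return previous[-1]

def error_rate(ref, hyp):
    """Edit distance as a percentage of the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)

target = "ain't they the greatest"
prediction = "and then the gratis"
wer = error_rate(target.split(), prediction.split())  # word level
ler = error_rate(list(target), list(prediction))      # letter level
print("WER: %.4f%%, LER: %.4f%%" % (wer, ler))
# prints "WER: 75.0000%, LER: 30.4348%"
```

Three of the four words are wrong, and 7 letter edits over 23 reference characters gives 30.4348%, which agrees with the WER and LER the decoder printed for this sample.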

There is another way to check: go into the runtime directory of the test, where you will see three files:

-rw-rw-r--  1 352708 10月  8 15:54 lists#test-clean.lst.hyp
-rw-rw-r--  1 941511 10月  8 15:54 lists#test-clean.lst.log
-rw-rw-r--  1 359475 10月  8 15:54 lists#test-clean.lst.ref

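The .hyp file holds the decoded hypotheses, the .log file the log of the decoding run, and the .ref file the reference transcripts of the test set. The last line of the log file gives the result for the whole test: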

[Decode lists/test-clean.lst (2620 samples) in 182.909s (actual decoding time 0.275s/sample) -- WER: 18.5008, LER: 8.82783]
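
When comparing several runs it is handy to pull these numbers out of the log automatically. The following is a sketch that assumes the summary line always has the shape shown above; the pattern was written against this one line only, and the name parse_summary is made up for this example.

```python
import re

# Matches the final "[Decode ... -- WER: ..., LER: ...]" line shown above.
SUMMARY = re.compile(
    r"\[Decode (?P<list>\S+) \((?P<samples>\d+) samples\) in (?P<seconds>[\d.]+)s"
    r".*WER: (?P<wer>[\d.]+), LER: (?P<ler>[\d.]+)\]"
)

def parse_summary(line):
    """Return the summary fields as a dict, or None if the line differs."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return {
        "list": match.group("list"),
        "samples": int(match.group("samples")),
        "seconds": float(match.group("seconds")),
        "wer": float(match.group("wer")),
        "ler": float(match.group("ler")),
    }
```

Calling parse_summary on the last line of lists#test-clean.lst.log should then give the WER and LER as numbers that can be tabulated across experiments.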

About the flashlight make test failures

The flashlight make test run failed as follows:

Start  1: AutogradTest
1/11 Test #1: AutogradTest .....................***Failed 70.15 sec
Start 2: OptimTest
2/11 Test #2: OptimTest ........................***Failed 0.04 sec
Start 3: ModuleTest
3/11 Test #3: ModuleTest .......................***Failed 1.01 sec
Start 4: SerializationTest
4/11 Test #4: SerializationTest ................***Failed 2.79 sec
Start 5: NNUtilsTest
5/11 Test #5: NNUtilsTest ...................... Passed 0.04 sec
Start 6: DatasetTest
6/11 Test #6: DatasetTest ...................... Passed 0.84 sec
Start 7: DatasetUtilsTest
7/11 Test #7: DatasetUtilsTest ................. Passed 0.01 sec
Start 8: MeterTest
8/11 Test #8: MeterTest ........................ Passed 0.03 sec
Start 9: AllReduceTest
9/11 Test #9: AllReduceTest ....................***Exception: Other 0.51 sec
Start 10: ContribModuleTest
10/11 Test #10: ContribModuleTest ................***Failed 0.17 sec
Start 11: ContribSerializationTest
11/11 Test #11: ContribSerializationTest ......... Passed 0.04 sec
45% tests passed, 6 tests failed out of 11
Total Test time (real) = 75.65 sec
The following tests FAILED:
1 - AutogradTest (Failed)
2 - OptimTest (Failed)
3 - ModuleTest (Failed)
4 - SerializationTest (Failed)
9 - AllReduceTest (OTHER_FAULT)
10 - ContribModuleTest (Failed)
Errors while running CTest
Makefile:71: recipe for target 'test' failed
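
A summary like this names the failing tests but not the reasons. As a sketch, the per-test lines can be parsed to get the list of binaries worth running one at a time. The regular expression is an assumption fitted to the lines quoted above rather than a documented CTest format, and failed_tests is a name made up for this example.

```python
import re

# Fitted to lines such as
#   "1/11 Test #1: AutogradTest ......***Failed 70.15 sec"
RESULT = re.compile(
    r"Test\s+#(\d+):\s+(\S+)\s+\.+\s*(\*\*\*)?([A-Za-z: ]+?)\s+([\d.]+) sec"
)

def failed_tests(ctest_output):
    """Return (test name, status) for every test that did not pass."""
    failed = []
    for line in ctest_output.splitlines():
        match = RESULT.search(line)
        if match and match.group(4) != "Passed":
            failed.append((match.group(2), match.group(4)))
    return failed
```

Each returned name corresponds to an executable of the same name, which can be launched directly to see the underlying error messages.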

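At first I did not know what to do, so I traced the cause level by level and found that each test can be run on its own from the build/test directory: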

-rwxrwxr-x  1 2883824 10月  8 14:35 AllReduceTest*
-rwxrwxr-x  1 3202496 10月  8 14:35 AutogradTest*
drwxrwxr-x 14 4096 10月  8 14:23 CMakeFiles/
-rw-rw-r--  1 1032 10月  8 14:22 cmake_install.cmake
-rwxrwxr-x  1 2813208 10月  8 14:29 ContribModuleTest*
-rwxrwxr-x  1 2804752 10月  8 14:29 ContribSerializationTest*
-rw-rw-r--  1 811 10月  8 14:22 CTestTestfile.cmake
-rwxrwxr-x  1 2947960 10月  8 14:34 DatasetTest*
-rwxrwxr-x  1 2866632 10月  8 14:29 DatasetUtilsTest*
drwxrwxr-x  4 4096 10月  8 14:22 googletest/
-rw-rw-r--  1  489359 10月  8 14:22 Makefile
-rwxrwxr-x  1  2892104 10月  8 14:32 MeterTest*
-rwxrwxr-x  1  2856544 10月  8 14:29 ModuleTest*
-rwxrwxr-x  1 2798672 10月  8 14:35 NNUtilsTest*
-rwxrwxr-x  1 2861760 10月  8 14:32 OptimTest*
-rwxrwxr-x  1 2938736 10月  8 14:31 SerializationTest*

So I ran AutogradTest by itself, which produced a detailed log:

Value of: jacobianTestImpl(func_weightNorm_in, in, 1E-1)
Actual: false
Expected: true
[ FAILED ] AutogradTest.WeightNormConv (2669 ms)
[ RUN ] AutogradTest.Rnn
unknown file: Failure
C++ exception with description "rnn not yet implemented on CPU" thrown in the test body.
[ FAILED ] AutogradTest.Rnn (0 ms)
[ RUN ] AutogradTest.Lstm
unknown file: Failure
C++ exception with description "rnn not yet implemented on CPU" thrown in the test body.
[ FAILED ] AutogradTest.Lstm (0 ms)
[ RUN ] AutogradTest.Gru
unknown file: Failure
C++ exception with description "rnn not yet implemented on CPU" thrown in the test body.
[ FAILED ] AutogradTest.Gru (0 ms)
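
With many failures it helps to group them by cause. The sketch below collects, for each exception description, the test cases that threw it. It assumes the "[ RUN ]" and "C++ exception with description" lines look like the ones quoted above; group_failures is a name made up for this example.

```python
import re
from collections import defaultdict

RUN = re.compile(r"^\[\s*RUN\s*\]\s+(\S+)")
EXCEPTION = re.compile(r'C\+\+ exception with description "(.*)" thrown')

def group_failures(gtest_output):
    """Map each exception description to the test cases that raised it."""
    current = None
    reasons = defaultdict(list)
    for line in gtest_output.splitlines():
        match = RUN.match(line)
        if match:
            current = match.group(1)  # remember which test is running
            continue
        match = EXCEPTION.search(line)
        if match and current:
            reasons[match.group(1)].append(current)
    return dict(reasons)
```

On the excerpt above this puts AutogradTest.Rnn, AutogradTest.Lstm and AutogradTest.Gru under the single reason "rnn not yet implemented on CPU", which shows at once that they share one cause.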

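So RNNs are simply not supported on the CPU backend at the moment. The w2l model here is a CNN, however, so this does not affect me. Along the way I also had to install spdlog for arrayfire. After all this tinkering, the first end-to-end run was finally complete.

Summary

  1. Watch how Ubuntu manages Python: 14.04 and 16.04 both launch Python 2.7 by default, while newer software is increasingly unfriendly to Python 2. This post switched to Python 3.7 in a fairly crude way; if that causes problems later I will revisit it, but for now it meets the need.
  2. libsndfile was installed with dependencies missing. The key step is to install Opus, after which both libsndfile and wav2letter have to be rebuilt.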
