[Speech Recognition] DNN-based recognition in Julius (v4.4), a Japanese speech recognition system (May 8: recognition results updated)

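There is very little material about Julius available in China, so here is a supplement. Julius was last updated in September 2016, when DNN-based recognition was added, but in practice I found that many prerequisites are not stated on the homepage. Below is a walkthrough of 00readme-DNN. The English of the original has quite a few grammatical problems, so the original text is quoted as-is and each passage is followed by my rendering after `//`.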

A. Julius and DNN-HMM
======================

From 4.4, Julius can perform DNN-HMM based recognition in two ways:

1. standalone: directly compute DNN for HMM inside Julius (>= 4.4) // 1. Standalone: the DNN for the HMM is computed directly inside Julius (version >= 4.4) [this post covers only this part]

2. network: receive state probabilities calculated by other process
via socket (<= 4.3.1)

Both are described below.

A.1. Standalone mode
=====================

From version 4.4, Julius is capable of performing DNN-HMM based recognition by itself. It can read a DNN definition along with a HMM, and can compute the network against input (spliced) feature vectors and output the node scores of output layer for each frame, which will be used as output probabilities of corresponding HMM states in the
HMM. All computation will be done in a single process.

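// From version 4.4, Julius can perform DNN-HMM recognition on its own. It reads a DNN definition together with an HMM, computes the network on the input (spliced) feature vectors, and outputs the output-layer node scores for each frame. These scores are used as the output probabilities of the corresponding HMM states. All computation is done in a single process.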

Note that the current implementation is very simple and limited. Only basic functions are implemented for NN. Any number of hidden layers can be defined, but the number of the nodes in the hidden layers should be the same. No batch computation is performed: all frame-wise. SIMD instruction (Intel AVX) is used to speed up the computation. Only tested on Windows and Ubuntu on Intel PC. See "libsent/src/phmm/calc_dnn.c" for the actual implementation.

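// Note that the current implementation is very simple and limited; only basic NN functions are implemented. Any number of hidden layers can be defined, but every hidden layer should have the same number of nodes. There is no batch computation: everything is computed frame by frame. SIMD instructions (Intel AVX) are used to speed up the computation. It has only been tested on Windows and Ubuntu on Intel PCs. See "libsent/src/phmm/calc_dnn.c" for the actual implementation.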
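
To make the "frame-wise, equal-sized hidden layers" description concrete, here is a minimal NumPy sketch of a forward pass over spliced frames. It is not Julius's code (that lives in calc_dnn.c). The sigmoid activation, the log-softmax output, the edge padding by frame repetition, and all sizes and names are assumptions made purely for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_frame(spliced, hidden, output):
    """Return log-scores of the output layer for ONE spliced input frame.

    hidden: list of (W, b) pairs; every hidden layer has the same node count
    output: (W, b) of the output layer, one row per HMM state
    Activations (sigmoid, log-softmax) are assumptions, not taken from the README.
    """
    h = spliced
    for W, b in hidden:
        h = sigmoid(W @ h + b)
    W, b = output
    z = W @ h + b
    z = z - z.max()                      # numerical stability
    return z - np.log(np.exp(z).sum())   # log-softmax

# Hypothetical toy sizes.
rng = np.random.default_rng(0)
feat_dim, context, n_hidden, n_states = 4, 3, 8, 5
in_dim = feat_dim * context

hidden = [
    (rng.standard_normal((n_hidden, in_dim)).astype(np.float32),
     np.zeros(n_hidden, np.float32)),
    (rng.standard_normal((n_hidden, n_hidden)).astype(np.float32),
     np.zeros(n_hidden, np.float32)),
]
output = (rng.standard_normal((n_states, n_hidden)).astype(np.float32),
          np.zeros(n_states, np.float32))

frames = rng.standard_normal((10, feat_dim)).astype(np.float32)

# Splice each frame with its neighbours; edges are padded by repetition.
half = context // 2
padded = np.concatenate([np.repeat(frames[:1], half, axis=0),
                         frames,
                         np.repeat(frames[-1:], half, axis=0)])

# No batching: one forward pass per frame, as the README describes.
scores = np.stack([
    forward_frame(padded[t:t + context].reshape(-1), hidden, output)
    for t in range(len(frames))
])
```

Each row of `scores` holds one value per output node for that frame, which is the quantity the README says is used as the output probability of the corresponding HMM state.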

To run, you need // you need the following:

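1) an HMM AM (GMM defs are ignored, only its structure is used) // an HMM acoustic model (its GMM definitions are ignored; only the structure is used)
2) a DNN definition that corresponds to 1) // a DNN definition that matches the model in 1)
3) ".dnnconf" configuration file (text) // a ".dnnconf" configuration file (plain text)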

The .dnnconf file specifies the parameters, options, DNN definition files, and other parameters all relating to DNN computation. A sample file is located in the top directory of Julius archive as "Sample.dnnconf".

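// The ".dnnconf" file specifies the parameters, options, DNN definition files and other settings related to DNN computation. A sample is provided as "Sample.dnnconf" in the top directory of the Julius archive.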

The matrix/vector definitions should be given in ".npy" format(i. e. python's "NumPy.save" format). Only 32bit-float little endian datatype is acceptable.

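// The matrices and vectors must be given in ".npy" format (that is, the format written by Python's NumPy save function). Only the 32-bit float, little-endian data type is accepted.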
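
A minimal sketch of writing a matrix in that form with NumPy. The file name `W1.npy`, the helper `save_for_julius` and the matrix shape are hypothetical. The only point is the explicit `"<f4"` dtype (little-endian 32-bit float), since NumPy defaults to 64-bit floats.

```python
import os
import tempfile
import numpy as np

def save_for_julius(path, array):
    """Save an array as a little-endian 32-bit float .npy file."""
    np.save(path, np.asarray(array, dtype="<f4"))

tmpdir = tempfile.mkdtemp()
w_path = os.path.join(tmpdir, "W1.npy")   # hypothetical file name

# NumPy produces float64 by default, which the README says is not accepted.
weights = np.random.default_rng(0).standard_normal((8, 12))
save_for_julius(w_path, weights)

# Read the file back to confirm what was actually written.
loaded = np.load(w_path)
print(loaded.dtype, loaded.shape)
```

Checking `loaded.dtype` before pointing the .dnnconf at a file is a cheap way to catch a matrix that was exported as float64 by accident.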

To prepare a model for DNN-HMM, note that the orders are important. The order of the output nodes in the DNN should be the order of HMM state definition id. If not, Julius won't work properly.

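// When preparing a DNN-HMM model, order matters: the output nodes of the DNN must be in the same order as the HMM state definition ids. Otherwise Julius will not work properly.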
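
If the training toolkit emitted the output nodes in a different order, the output layer can be permuted before saving. The sketch below assumes one weight row per output node, 0-based state ids, and a known list `state_ids` mapping each current node to its HMM state id. All three are assumptions for illustration, and `reorder_output_layer` is a hypothetical helper.

```python
import numpy as np

def reorder_output_layer(W, b, state_ids):
    """Permute output-layer rows so that row k corresponds to HMM state id k.

    state_ids[i] is the HMM state id that output node i currently represents.
    """
    state_ids = np.asarray(state_ids)
    if sorted(state_ids.tolist()) != list(range(len(state_ids))):
        raise ValueError("state_ids must be a permutation of 0..N-1")
    order = np.argsort(state_ids)   # order[k] = current node index of state k
    return W[order], b[order]

# Toy output layer: 4 output nodes, 3 inputs.
W = np.arange(12, dtype=np.float32).reshape(4, 3)
b = np.array([10, 11, 12, 13], dtype=np.float32)
state_ids = [2, 0, 3, 1]            # node 0 is state 2, node 1 is state 0, ...

W2, b2 = reorder_output_layer(W, b, state_ids)
```

After the call, `b2[0]` is the bias of the node that represented state 0, and so on. Any state prior vector used with the model would need the same permutation.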

Julius uses SIMD instruction for internal DNN computation. For Intel CPU, dispatch function for several Intel SIMD instruction sets (SSE, AVX and FMA) are implemented. You need gcc-4.7 or later to compile all the codes. They are all compiled and built-in into Julius, and will be determined which one to use at run time. Run "julius -setting" and see which code will be used on your cpu. AVX can be run on Sandy Bridge, and FMA on Haswell, later one will run faster. And for ARM architecture, you can enable NEON SIMD codes by adding "--enable-neon" to configure.

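// Julius uses SIMD instructions for its internal DNN computation. For Intel CPUs, dispatch code is implemented for several instruction sets (SSE, AVX and FMA). You need gcc 4.7 or later to compile all of it. All variants are compiled into Julius, and the one to use is chosen at run time. Run "julius -setting" to see which code will be used on your CPU. AVX runs on Sandy Bridge, FMA on Haswell, and the later instruction set runs faster. On ARM, you can enable the NEON SIMD code by passing "--enable-neon" to configure.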
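
"julius -setting" remains the authoritative check. As a rough complement on x86 Linux, the CPU flags in /proc/cpuinfo show which of these instruction sets the machine offers at all. The helper below is a hypothetical sketch that only parses text; the sample string is made up.

```python
def simd_support(cpuinfo_text):
    """Report whether sse/avx/fma appear in the first 'flags' line of cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            break
    return {name: name in flags for name in ("sse", "avx", "fma")}

# Made-up sample resembling an x86 Linux /proc/cpuinfo entry.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx\n"
result = simd_support(sample)
print(result)
```

On a real x86 Linux machine the input would be the contents of /proc/cpuinfo, for example `simd_support(open("/proc/cpuinfo").read())`. ARM kernels report a "Features" line instead, so this sketch does not cover NEON.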

--------------------------------

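My impression is that this update adds a fairly limited set of features. I hit an error on my first attempt and could not find the cause; only after reading these README files closely did I notice how many restrictions there are. Keep them in mind.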

--------------------------------

May 8 update:

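[Important] The version 4.4 DNN-HMM acoustic model (.SID) needs a file that the old version (4.3) did not have:

julius.dnnconf: feature conversion configuration file for the DNN (Julius standalone) version

With 4.4, make sure you use it; otherwise Julius keeps complaining that the input features are wrong.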

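The run on a 32-bit server has finished: roughly 20,000 utterances took about 35 hours. Compared with the 4.3 results, the output does differ, and the handful of utterances I spot-checked look better with 4.4. I will post the recognition rate once it has been computed.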
