Speech Recognition with PocketSphinx on ROS Kinetic

- Prerequisites
- Installing ROS-PocketSphinx
- Launching and testing
- A few more remarks

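Most tutorials on calling pocketsphinx from ROS for speech recognition do not cover the Kinetic release. Of the few that do, the post by 倔强不倒翁 comes closest, but it still did not solve the problem for me. Most of this article draws on 倔强不倒翁's tutorial and on another tutorial that I can no longer find; my thanks to both!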

Prerequisites

First, install the required libraries and components:

sudo apt-get install ros-kinetic-audio-common libasound2 gstreamer0.10-* gstreamer1.0-pocketsphinx

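Install libsphinxbase1 (this and the next two packages are installed from .deb files that you need to have downloaded already):

sudo dpkg -i libsphinxbase1_0.8-6_amd64.deb

Install libpocketsphinx1:

sudo dpkg -i libpocketsphinx1_0.8-5_amd64.deb

Install gstreamer0.10-pocketsphinx:

sudo dpkg -i gstreamer0.10-pocketsphinx_0.8-5_amd64.deb

This last one may be unnecessary, since gstreamer1.0-pocketsphinx was already installed above, but I installed both versions, so I can't say for sure. Sorry (・＿・)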

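Installing ROS-PocketSphinx

This repo currently supports ROS only up to Jade, and the sphinx version it uses is itself quite old, but I have not seen a better option so far. There are several alternatives on GitHub. I only tried the one published by CMU itself, and it did not help. I will try the others when I have time.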

cd ~/catkin_ws/src
git clone https://github.com/mikeferguson/pocketsphinx

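Launching and testing

Start by testing the microphone: confirm that it really picks up sound and that there is not too much interference.
I won't demonstrate how to do that here... (￣▽￣)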
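For readers who want a quick check anyway, here is a minimal sketch, not part of the original tutorial. It assumes you have first recorded a few seconds of speech as a 16-bit WAV file, for example with ALSA's `arecord -d 3 -f S16_LE -r 16000 test.wav`. The helper name `wav_rms` is made up for this illustration.

```python
import math
import struct
import wave


def wav_rms(path):
    """Return the RMS level of a 16-bit PCM WAV file, scaled to 0.0-1.0."""
    wav = wave.open(path, "rb")
    try:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        frames = wav.readframes(wav.getnframes())
    finally:
        wav.close()

    count = len(frames) // 2  # two bytes per sample
    if count == 0:
        return 0.0

    # Little-endian signed 16-bit samples (all channels interleaved).
    samples = struct.unpack("<%dh" % count, frames)
    mean_square = sum(s * s for s in samples) / float(count)
    return math.sqrt(mean_square) / 32768.0
```

Calling `wav_rms("test.wav")` on a recording of silence and on a recording of speech should give clearly different values. If both are close to zero the microphone is probably not capturing, and if the silent recording is already high the background noise is likely to hurt recognition.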

The command to launch the demo directly is:

roslaunch pocketsphinx robocup.launch

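However, this is guaranteed to fail with an error (please forgive the borrowed screenshot; I did not capture one when it happened to me (>ω･* )ﾉ).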

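I am not sure of the exact cause of this error. My guess is that it is a configuration error arising from a combination of factors: the pocketsphinx version used in the repo is too old and still relies on the DMP file format that current pocketsphinx has abandoned, the hmm setting is missing, and the code does not match ROS Kinetic. In fact the gap between ROS Kinetic and Indigo is not that large; quite a few programs that work on Indigo also work on Kinetic after a rebuild and some minor changes. In any case, the next step is to eliminate this error by modifying recognizer.py and the launch file.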

First, the changes to recognizer.py.
Add the hmm parameter in __init__:

    def __init__(self):
        # Start node
        rospy.init_node("recognizer")

        self._device_name_param = "~mic_name"  # Find the name of your microphone by typing pacmd list-sources in the terminal
        self._lm_param = "~lm"
        self._dic_param = "~dict"
        self._hmm_param = "~hmm"  # added: the hmm parameter

In start_recognizer, comment out the set_property('configured', True) call:

    def start_recognizer(self):
        rospy.loginfo("Starting recognizer... ")

        self.pipeline = gst.parse_launch(self.launch_config)
        self.asr = self.pipeline.get_by_name('asr')
        self.asr.connect('partial_result', self.asr_partial_result)
        self.asr.connect('result', self.asr_result)
        # Commented out so that the settings supplied by the launch file are used:
        # self.asr.set_property('configured', True)
        self.asr.set_property('dsratio', 1)

Still in start_recognizer, read the hmm parameter and apply it:

        # Configure language model
        if rospy.has_param(self._lm_param):
            lm = rospy.get_param(self._lm_param)
        else:
            rospy.logerr('Recognizer not started. Please specify a language model file.')
            return

        if rospy.has_param(self._dic_param):
            dic = rospy.get_param(self._dic_param)
        else:
            rospy.logerr('Recognizer not started. Please specify a dictionary.')
            return

        # Read the hmm parameter
        if rospy.has_param(self._hmm_param):
            hmm = rospy.get_param(self._hmm_param)
        else:
            rospy.logerr('Recognizer not started. Please specify a hmm.')
            return

        self.asr.set_property('lm', lm)
        self.asr.set_property('dict', dic)
        self.asr.set_property('hmm', hmm)  # apply the hmm parameter

        self.bus = self.pipeline.get_bus()
        self.bus.add_signal_watch()
        self.bus_id = self.bus.connect('message::application', self.application_message)
        self.pipeline.set_state(gst.STATE_PLAYING)
        self.started = True
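Since the hmm value is a directory path that differs from machine to machine, a wrong path is an easy mistake to make. The following is a minimal sketch, not part of the original recognizer.py, of a check that could run before the hmm is handed to the recognizer. The helper `missing_hmm_files` is hypothetical, and the list of expected file names is an assumption based on the layout of the en-us model; other acoustic models may contain a different set.

```python
import os

# Assumed file names, based on the en-us acoustic model layout.
# Adjust this tuple if your model directory looks different.
REQUIRED_HMM_FILES = ("mdef", "means", "variances", "transition_matrices")


def missing_hmm_files(hmm_dir, required=REQUIRED_HMM_FILES):
    """Return the names from `required` that are absent from hmm_dir.

    If hmm_dir is not a directory at all, every required name is reported.
    """
    if not os.path.isdir(hmm_dir):
        return list(required)
    return [name for name in required
            if not os.path.isfile(os.path.join(hmm_dir, name))]
```

In the recognizer this could be called with the value read from the hmm parameter; if the returned list is non-empty, log it with rospy.logerr and return early, in the same style as the existing lm and dict checks.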

Next, modify the launch file by adding the hmm parameter:

    <param name="hmm" value="/usr/local/lib/python3.5/dist-packages/pocketsphinx/model/en-us"/>

Taking voice_cmd.launch from the demo as an example:

<launch>

  <node name="recognizer" pkg="pocketsphinx" type="recognizer.py">
    <param name="lm" value="$(find pocketsphinx)/demo/voice_cmd.lm"/>
    <param name="dict" value="$(find pocketsphinx)/demo/voice_cmd.dic"/>
    <param name="hmm" value="/usr/local/lib/python3.5/dist-packages/pocketsphinx/model/en-us"/>
  </node>

  <node name="voice_cmd_vel" pkg="pocketsphinx" type="voice_cmd_vel.py" output="screen"/>

</launch>

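The hmm I use here is the English model bundled with the pocketsphinx package installed through pip, and its recognition rate is quite good. With the tidigits model you can only recognize the English digits 0-9 plus "Oh"♂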
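The hmm path in the launch file depends on where pip installed the package, so it may differ on your machine. Here is a minimal sketch for looking it up. It assumes the model sits at model/en-us inside the installed package directory, as in the path used in this post, and the helper name `find_pip_model_dir` is made up for this illustration. Run it with the same Python interpreter whose pip installed pocketsphinx.

```python
import importlib
import os


def find_pip_model_dir(package="pocketsphinx", model="en-us"):
    """Return <package dir>/model/<model> if that directory exists, else None."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None  # package is not installed for this interpreter

    module_file = getattr(module, "__file__", None)
    if not module_file:
        return None

    package_dir = os.path.dirname(os.path.abspath(module_file))
    candidate = os.path.join(package_dir, "model", model)
    return candidate if os.path.isdir(candidate) else None
```

Printing `find_pip_model_dir()` gives the value to paste into the hmm param of the launch file, or None if the package or the model directory could not be found.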
After that, testing only requires running:

roslaunch pocketsphinx voice_cmd.launch

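Then start gazebo, stage, a fake robot, or the real TurtleBot.
Say your command loudly into the microphone: FORWARD─=≡Σ(((つ•̀ω•́)つ
and watch your TurtleBot obediently run around the floor ヾ(๑╹◡╹)ﾉ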

A few more remarks:

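1. The lm, dic, and hmm should be a matching set; otherwise you will get all kinds of recognition errors.
2. If you want to build your own dictionary, please go to Sphinx and create it yourself.
3. The language model file does not have to be in DMP format; lm works too, but the lm.bin format does not.
4. English recognition accuracy depends heavily on how standard your pronunciation is. If you cannot reach 80% accuracy, please reflect on your own pronunciation.
5. Sphinx's Chinese support is mediocre; for Chinese speech recognition, go with iFLYTEK (科大讯飞).
6. Microphone quality, distance, and ambient noise all affect recognition.
7. Is there any way to highlight a specific line inside a code block (´･_･`)?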
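As a partial check for remark 1, here is a minimal sketch that lists words present in the language model but missing from the dictionary. It assumes a plain-text ARPA .lm file (unigrams listed under the `\1-grams:` section) and a .dic file with one `WORD PHONEMES...` entry per line, where alternate pronunciations are written as `WORD(2)`. The function names are made up for this illustration, and the check does not cover whether the hmm matches the other two files.

```python
def lm_unigrams(lm_lines):
    """Collect the words listed in the 1-grams section of an ARPA model."""
    words = set()
    in_unigrams = False
    for line in lm_lines:
        line = line.strip()
        if line.startswith("\\"):
            # Section markers start with a backslash, e.g. \data\ or \2-grams:
            in_unigrams = (line == "\\1-grams:")
            continue
        if in_unigrams and line:
            fields = line.split()  # log-probability, word, optional backoff
            if len(fields) >= 2:
                words.add(fields[1])
    return words


def dict_words(dic_lines):
    """Collect the words defined in a pronunciation dictionary."""
    words = set()
    for line in dic_lines:
        fields = line.split()
        if fields:
            # Strip the "(2)" suffix used for alternate pronunciations.
            words.add(fields[0].split("(")[0])
    return words


def words_missing_from_dict(lm_lines, dic_lines):
    """Return lm words that have no dictionary entry, sorted."""
    # Sentence markers and the unknown-word token are not dictionary words.
    ignore = set(["<s>", "</s>", "<unk>"])
    return sorted(lm_unigrams(lm_lines) - dict_words(dic_lines) - ignore)
```

To use it on real files, pass the opened file objects, for example the demo's voice_cmd.lm and voice_cmd.dic. An empty result means every word in the language model has a pronunciation.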