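Background

When placing outbound calls to users through AXE-mode privacy numbers, we found that not every privacy-number provider offers a configurable call-answered callback.
We therefore need a provider-independent way to detect when the user picks up, for scenarios such as starting a recording or playing a welcome prompt.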

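Goal

Before bringing in a trained speech model, use waveform analysis to accurately identify the ringback beeps ("du-du-du") and color ring back tones, covering more than 90% of cases.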

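Research

VAD:
Voice Activity Detection (VAD), also called speech endpoint detection or speech boundary detection, aims to identify and remove long silent periods from an audio stream, saving channel resources without degrading service quality. It is an important component of IP telephony. Silence suppression saves valuable bandwidth and helps reduce the end-to-end latency perceived by users.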
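
As a rough illustration of the idea, a frame can be classified as silent or active by comparing its RMS level with a threshold. This is a minimal sketch in plain Java, not the TarsosDSP implementation; the class name `EnergyVad` and any threshold passed to it (for example -40 dB) are assumptions made for the example.

```java
/** Minimal energy-based VAD sketch; the names and thresholds here are illustrative assumptions. */
final class EnergyVad {

    /** RMS level of a frame in dB relative to full scale (samples expected in [-1, 1]). */
    static double rmsDb(float[] frame) {
        double sumSquares = 0.0;
        for (float s : frame) {
            sumSquares += (double) s * s;
        }
        double rms = Math.sqrt(sumSquares / frame.length);
        // Clamp to a tiny value so an all-zero frame does not produce log10(0).
        return 20.0 * Math.log10(Math.max(rms, 1e-10));
    }

    /** A frame counts as silence when its RMS level is below the given threshold in dB. */
    static boolean isSilence(float[] frame, double thresholdDb) {
        return rmsDb(frame) < thresholdDb;
    }
}
```

With 8 kHz audio, a 100 ms frame holds 800 samples, so calling `EnergyVad.isSilence(frame, -40.0)` once per frame yields the kind of per-100 ms silence flag the detection logic works on.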

TarsosDSP:
Git repository: https://github.com/JorenSix/TarsosDSP
TarsosDSP is a Java library for audio processing. Its aim is to provide an easy-to-use interface to practical music processing algorithms implemented, as simply as possible, in pure Java and without any other external dependencies. The library tries to hit the sweet spot between being capable enough to get real tasks done but compact and simple enough to serve as a demonstration of how DSP algorithms work. TarsosDSP features an implementation of a percussion onset detector and a number of pitch detection algorithms: YIN, the McLeod Pitch method and a "Dynamic Wavelet Algorithm Pitch Tracking" algorithm. Also included is a Goertzel DTMF decoding algorithm, a time stretch algorithm (WSOLA), resampling, filters, simple synthesis, some audio effects, and a pitch shifting algorithm.

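Ringback tone:
Indicates that the called party is being rung. It uses a 450±25 Hz AC source at a send level of -10±3 dBm and is an interrupted tone with a 5 s cycle: 1 s on, 4 s off, the same cadence as the ringing signal.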
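
To make the 450±25 Hz criterion concrete, the sketch below checks whether most of a frame's energy sits in that band, using the Goertzel recurrence in plain Java. It is an illustration only and is not what the implementation later in this article does (that code uses TarsosDSP pitch detection). The class name `RingbackToneSketch`, the 5 Hz scan step and the 0.5 threshold are assumptions.

```java
/** Sketch of 450 Hz ringback-tone detection for one frame; names and thresholds are assumptions. */
final class RingbackToneSketch {

    /** Squared magnitude of the frame's spectrum at targetHz, via the Goertzel recurrence. */
    static double goertzelPower(float[] frame, double sampleRate, double targetHz) {
        double coeff = 2.0 * Math.cos(2.0 * Math.PI * targetHz / sampleRate);
        double s1 = 0.0;
        double s2 = 0.0;
        for (float x : frame) {
            double s0 = x + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    /** Share of the frame's energy at targetHz: near 1 for a pure tone at targetHz, near 0 otherwise. */
    static double toneRatio(float[] frame, double sampleRate, double targetHz) {
        double energy = 0.0;
        for (float x : frame) {
            energy += (double) x * x;
        }
        if (energy == 0.0) {
            return 0.0; // an all-zero frame contains no tone
        }
        return 2.0 * goertzelPower(frame, sampleRate, targetHz) / (frame.length * energy);
    }

    /** Scans 425..475 Hz in 5 Hz steps (assumed step) and applies an assumed 0.5 threshold. */
    static boolean isRingbackFrame(float[] frame, double sampleRate) {
        double best = 0.0;
        for (double hz = 425.0; hz <= 475.0; hz += 5.0) {
            best = Math.max(best, toneRatio(frame, sampleRate, hz));
        }
        return best >= 0.5;
    }
}
```

Applied to consecutive 100 ms frames, a standard ringback tone should then show up as about 10 tone frames followed by about 40 silent frames.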

Color ring back tone:
A continuous, uninterrupted music waveform.

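Approach

Analysis of the waveform shows three segments, from left to right:

The prompt "Please enter the four-digit extension number, ending with the # key"
The ringing beeps ("du-du-du")
The user speaking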

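Detection therefore takes three steps:
1. Skip a fixed duration to bypass the extension-number prompt.
2. Examine the first active segment after the silence and match it against the color ring back tone or the beep characteristics.
3. Find the moment the signal breaks out of that pattern; that is the moment the user picks up.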
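
The three steps can be sketched on a pre-labelled sequence of 100 ms frames. This is a deliberately simplified illustration, not the implementation shown later: it assumes each frame has already been labelled `'S'` (silence), `'T'` (450 Hz tone) or `'O'` (other activity), and it decides between beep and music from the first active frame alone. The class name `PickupSketch` and the label alphabet are hypothetical.

```java
/** Simplified sketch of the three-step pick-up search over labelled 100 ms frames (hypothetical names). */
final class PickupSketch {

    /**
     * @param labels     one character per frame: 'S' silence, 'T' 450 Hz tone, 'O' other activity
     * @param skipFrames number of leading frames covered by the extension-number prompt (non-negative)
     * @return index of the first frame that breaks the ringing pattern, or -1 if none is found
     */
    static int findPickupFrame(String labels, int skipFrames) {
        // Step 1: skip the prompt, then ignore the silence that precedes the first activity.
        int i = skipFrames;
        while (i < labels.length() && labels.charAt(i) == 'S') {
            i++;
        }
        if (i >= labels.length()) {
            return -1;
        }
        // Step 2: classify the ringing by the first active frame (tone means beeps, otherwise music).
        boolean beepMode = labels.charAt(i) == 'T';
        // Step 3: look for the first frame that no longer fits the pattern.
        for (; i < labels.length(); i++) {
            char c = labels.charAt(i);
            if (beepMode && c == 'O') {
                return i; // activity that is not the 450 Hz tone: the user is speaking
            }
            if (!beepMode && c == 'S') {
                return i; // continuous music stopped: the call was answered
            }
        }
        return -1;
    }
}
```

Multiplying the returned index by 100 gives the pick-up offset in milliseconds from the start of the audio.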

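Implementation

The implementation uses the silence detection and pitch detection capabilities provided by TarsosDSP.
Note that you need to add the dependency yourself; the TarsosDSP package is available from the Git repository listed in the Research section above.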

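Usage:

public static void main(String[] args) {
    // Arguments: file path, sample rate (Hz), bit depth, initial skip (ms), fallback silence timeout (ms)
    PickUp pickUp = new PickUp("xxx.wav", 8000, 16, 1000, 4500);
    pickUp.start();
    System.exit(0);
}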

PickUp:

package xxx;

import be.tarsos.dsp.AudioDispatcher;
import be.tarsos.dsp.AudioEvent;
import be.tarsos.dsp.AudioProcessor;
import be.tarsos.dsp.SilenceDetector;
import be.tarsos.dsp.io.TarsosDSPAudioFloatConverter;
import be.tarsos.dsp.io.TarsosDSPAudioFormat;
import be.tarsos.dsp.io.UniversalAudioInputStream;
import be.tarsos.dsp.pitch.PitchDetectionHandler;
import be.tarsos.dsp.pitch.PitchDetectionResult;
import be.tarsos.dsp.pitch.PitchProcessor;
import java.io.*;
import java.util.concurrent.ConcurrentLinkedQueue;

public class PickUp {

    public enum RingbackType {
        UNCHECK,
        DU_NORMALITY,
        DU_OTHER,
        SONG;
    }

    private ConcurrentLinkedQueue<byte[]> audioQueue = new ConcurrentLinkedQueue<>();
    // Whether reading/detection has finished; volatile because two threads access it
    private volatile boolean isFinishReadFile = false;
    private String filePath;
    private String fileName;
    private int readLength = 1600;      // number of bytes in 100 ms of audio
    private int noinputTimeout = 1000;  // how many ms to skip at the start
    private int silenceMaxTimes = 10;   // number of consecutive silent 100 ms frames that ends detection
    private float sampleRate = 8000;    // sample rate
    private int sampleSizeInBits = 16;  // bit depth

    /**
     * User pick-up detection.
     * Detection method: 1. beeps are detected by their 450 Hz frequency;
     * 2. color ring back tones are detected by continuous activity.
     *
     * @param filePath         file path
     * @param sampleRate       sample rate
     * @param sampleSizeInBits bit depth
     * @param noinputTimeout   how long to skip (ms) before detection starts
     * @param silenceTimeout   how long a silence (ms) ends detection by default (fallback)
     */
    public PickUp(String filePath, float sampleRate, int sampleSizeInBits, int noinputTimeout, int silenceTimeout) {
        this.filePath = filePath;
        this.sampleRate = sampleRate;
        this.sampleSizeInBits = sampleSizeInBits;
        // Compute the number of bytes in 100 ms of audio from the parameters
        this.readLength = (int) sampleRate * (sampleSizeInBits / 8) / 10;
        this.noinputTimeout = noinputTimeout;
        // Convert the timeout into a number of 100 ms frames
        this.silenceMaxTimes = silenceTimeout / 100;
    }

    public void start() {
        File audioFile = new File(this.filePath);
        FileInputStream fis;
        try {
            audioQueue.clear();
            fileName = audioFile.getName();
            isFinishReadFile = false;
            Thread sttThread = new Thread(vadRunbale);
            sttThread.start();
            fis = new FileInputStream(audioFile);
            byte[] byteArr = new byte[this.readLength];
            // Skip the 44-byte WAV header
            fis.skip(44);
            while (fis.read(byteArr) != -1) {
                audioQueue.add(byteArr.clone());
            }
            while (!audioQueue.isEmpty() && !isFinishReadFile) {
                Thread.sleep(2000);
            }
            isFinishReadFile = true;
            fis.close();
            while (sttThread.isAlive()) {
                Thread.sleep(2000);
            }
            // Invoke the callback here
            System.out.println("Finished normally");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    private Runnable vadRunbale = new Runnable() {
        volatile int countHZ = 0;     // number of pitch estimates so far
        volatile int count450HZ = 0;  // number of pitch estimates close to 450 Hz

        @Override
        public void run() {
            RingbackType ringbackType = RingbackType.UNCHECK;
            int currentPartTime = 0, silenceTimes = 0, firstActiveTimes = 0, differentCount = 0;
            try {
                // Use TarsosDSP to detect silence
                TarsosDSPAudioFormat tdspFormat = new TarsosDSPAudioFormat(sampleRate, sampleSizeInBits, 1, true, false);
                float[] voiceFloatArr = new float[readLength / tdspFormat.getFrameSize()];
                while (!isFinishReadFile) {
                    // The loop ends once the finished flag is set; an empty queue just waits for more data
                    byte[] data = audioQueue.poll();
                    if (data == null) {
                        Thread.sleep(50);
                        continue;
                    }
                    TarsosDSPAudioFloatConverter.getConverter(tdspFormat).toFloatArray(data.clone(), voiceFloatArr);
                    SilenceDetector silenceDetector = new SilenceDetector();
                    boolean isSlience = silenceDetector.isSilence(voiceFloatArr);
                    // Check for silence once per 100 ms frame, after the initial skip
                    if ((currentPartTime += 100) >= noinputTimeout) {
                        boolean checkHZ = false;
                        if (isSlience) {
                            if (firstActiveTimes == 0) {
                                System.out.println("Silence before first activity, ignored");
                                continue;
                            }
                            System.out.println("Silence detected " + ringbackType);
                            // Stop once consecutive silence reaches the maximum;
                            // there is no need to wait for the rest of the file
                            if (++silenceTimes >= silenceMaxTimes) {
                                isFinishReadFile = true;
                            }
                            switch (ringbackType) {
                                case UNCHECK:
                                    if (countHZ == count450HZ) {
                                        if (countHZ <= 11) {
                                            ringbackType = RingbackType.DU_NORMALITY;
                                            // The Chinese standard is 1 s of beep followed by 4 s of silence
                                            silenceMaxTimes = 41;
                                        } else {
                                            ringbackType = RingbackType.DU_OTHER;
                                            checkHZ = true;
                                        }
                                    }
                                    break;
                                case DU_OTHER:
                                    // Exit after 3 consecutive frames that break the pattern
                                    if (countHZ != count450HZ) {
                                        differentCount++;
                                        count450HZ = countHZ;
                                    } else {
                                        differentCount = 0;
                                    }
                                    if (differentCount >= 3) {
                                        isFinishReadFile = true;
                                    }
                                    // Beep mode keeps the frequency check on
                                    checkHZ = true;
                                    break;
                                case SONG:
                                    // The continuous music was interrupted
                                    isFinishReadFile = true;
                                    break;
                                default:
                                    break;
                            }
                        } else {
                            System.out.println("Active " + ringbackType);
                            switch (ringbackType) {
                                case UNCHECK:
                                    firstActiveTimes++;
                                    // First activity lasting 2 s or more is classified as music
                                    if (firstActiveTimes >= 20) {
                                        ringbackType = RingbackType.SONG;
                                    }
                                    // The first activity starts the frequency check
                                    checkHZ = true;
                                    break;
                                case DU_NORMALITY:
                                    // Activity after less than 3.5 s of silence breaks the 4 s gap
                                    if (silenceTimes != 0 && silenceTimes < 35) {
                                        isFinishReadFile = true;
                                    }
                                    // Intentional fall-through
                                case DU_OTHER:
                                    // Exit after 3 consecutive frames that break the pattern
                                    if (countHZ != count450HZ) {
                                        differentCount++;
                                        count450HZ = countHZ;
                                    } else {
                                        differentCount = 0;
                                    }
                                    if (differentCount >= 3) {
                                        isFinishReadFile = true;
                                    }
                                    // Beep mode keeps the frequency check on
                                    checkHZ = true;
                                    break;
                                default:
                                    break;
                            }
                            // Reset the silence counter
                            silenceTimes = 0;
                        }
                        // Frequency check
                        if (checkHZ && !isFinishReadFile) {
                            AudioDispatcher dispatcher = new AudioDispatcher(
                                    new UniversalAudioInputStream(new ByteArrayInputStream(data), tdspFormat), data.length, 0);
                            // Use the configured sample rate instead of a hard-coded 8000
                            AudioProcessor audioProcessor = new PitchProcessor(
                                    PitchProcessor.PitchEstimationAlgorithm.FFT_YIN, sampleRate, data.length,
                                    new PitchDetectionHandler() {
                                        @Override
                                        public void handlePitch(PitchDetectionResult pitchDetectionResult, AudioEvent audioEvent) {
                                            countHZ++;
                                            float pitch = pitchDetectionResult.getPitch();
                                            System.out.println(pitch + "HZ");
                                            if (pitch > 445 && pitch < 455) {
                                                count450HZ++;
                                            }
                                        }
                                    });
                            dispatcher.addAudioProcessor(audioProcessor);
                            dispatcher.run();
                        }
                    }
                }
                System.out.println(fileName + " exited, position " + currentPartTime / 10 + "     " + ringbackType);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    };
}
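
The `start()` method skips a fixed 44 bytes and takes the sample rate and bit depth as constructor arguments. As a possible refinement, those values could be read from the file itself. The sketch below assumes a canonical 44-byte PCM WAV header with no extra chunks, which not every recording will have; the class name `WavHeaderSketch` and its method are hypothetical and not part of the code above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: derive the 100 ms frame size from a canonical 44-byte PCM WAV header (assumed layout). */
final class WavHeaderSketch {

    /** Returns the number of bytes in 100 ms of audio described by the header. */
    static int frameBytesPer100Ms(byte[] header) {
        if (header.length < 44) {
            throw new IllegalArgumentException("expected at least 44 header bytes");
        }
        // WAV header fields are little-endian.
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        int channels = buf.getShort(22);       // offset 22: number of channels
        int sampleRate = buf.getInt(24);       // offset 24: samples per second
        int bitsPerSample = buf.getShort(34);  // offset 34: bits per sample
        return sampleRate * channels * (bitsPerSample / 8) / 10;
    }
}
```

For 8 kHz, 16-bit mono audio this gives 1600 bytes, matching the default `readLength` in `PickUp`.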

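Results

Ringback tone

Logs are printed every 0.1 s. The frequency feature matches expectations, and so does the cadence of 1 s on, 4 s off.

Color ring back tone

Logs are printed every 0.1 s. The audio is identified as music, and the point where the music pattern ends matches the actual pick-up time (corresponding to the color ring back tone waveform above).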
