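Implementing Speech-to-Text on iOS
I'm currently working on an IM app in which friends can send each other voice messages, and a long press on a message should transcribe it to text. We previously used Alibaba's NUI.framework, but it frequently produced duplicated text in its output, and even multi-channel control could not get rid of the problem, so the experience was poor. I therefore decided to switch to Apple's own implementation; after all, Siri is that capable! This implementation handles both local and remote audio: you only need to store the corresponding path in the data model, and the class works out which kind it is.
Prerequisites: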
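Add two key-value pairs to Info.plist, each with a suitable description string:
1. Privacy - Speech Recognition Usage Description (used when requesting speech recognition)
2. Privacy - Microphone Usage Description (used when requesting microphone input authorization)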
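In the Info.plist source view, these two entries correspond to the raw keys NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription. A minimal fragment might look like this; the description strings are placeholders to replace with your own wording:

```xml
<!-- Shown to the user when the app asks for speech recognition permission -->
<key>NSSpeechRecognitionUsageDescription</key>
<string>Voice messages are transcribed to text when you long-press them.</string>
<!-- Shown to the user when the app asks for microphone access -->
<key>NSMicrophoneUsageDescription</key>
<string>The microphone is used to record voice messages.</string>
```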
Import the framework:
#import <Speech/Speech.h>
The header and implementation files follow.
Header file: NSVoice2Text.h
//
//  NSVoice2Text.h
//  Speech to text
//
//  Created by wise on 2021/10/13.
//

#import <Foundation/Foundation.h>
#import <Speech/Speech.h>

NS_ASSUME_NONNULL_BEGIN

// Values follow the same order as SFSpeechRecognizerAuthorizationStatus.
typedef NS_ENUM(NSUInteger, NSVoice2TextAuthorationStatus) {
    NSVoice2TextAuthorizationStatusNotDetermined, // speech recognition not yet authorized
    NSVoice2TextAuthorizationStatusDenied,        // the user denied speech recognition
    NSVoice2TextAuthorizationStatusRestricted,    // speech recognition is restricted on this device
    NSVoice2TextAuthorizationStatusAuthorized,    // speech recognition may be used
};

// One audio item waiting to be transcribed.
@interface NSVoiceModel : NSObject
@property (nonatomic, copy) NSString *path;      // local file path or http(s) URL string
@property (nonatomic, assign) NSInteger taskId;  // ties the task to your own message
@property (nonatomic, assign) BOOL isRunning;
@property (nonatomic, assign) BOOL isInQueue;
@end

// The outcome of one transcription task.
@interface NSVoice2TextFinal : NSObject
@property (nonatomic, copy) NSString *value;
@property (nonatomic, assign) NSInteger taskId;
@property (nonatomic, strong, nullable) NSError *error;
@end

@interface NSVoice2Text : NSObject

+ (BOOL)isRunning;

// Permission
+ (void)voice2TextRequestAuthorationStatus:(void (^)(NSVoice2TextAuthorationStatus status))requestBlock;

+ (void)voice2TextGotter:(NSArray<NSVoiceModel *> *)glist
       runningModelBlock:(void (^__nullable)(NSVoiceModel *amodel))runningModelBlock
            resultsBlock:(void (^)(NSVoice2TextFinal *finalValue))resultsBlock
                  rtaget:(id)rtaget;

@end

NS_ASSUME_NONNULL_END
Implementation file: NSVoice2Text.m
//
//  NSVoice2Text.m
//  Speech to text
//
//  Created by wise on 2021/10/13.
//

#import "NSVoice2Text.h"
#import "NSMutableTaskQueue.h" // not shown in this article; nothing below references it

typedef void (^VoiceConversionResultsBlock)(NSVoice2TextFinal *finalValue);

@interface NSVoiceModel ()
@property (nonatomic, weak) id taskTarget;
@property (nonatomic, copy) VoiceConversionResultsBlock voiceConversionBlock;
@property (nonatomic, copy) void (^voiceConversionRunningBlock)(NSVoiceModel *md);
@end

@implementation NSVoiceModel
@end

@implementation NSVoice2TextFinal
@end

static NSVoice2Text *v2text = nil;

@interface NSVoice2Text () <SFSpeechRecognizerDelegate> {
    NSMutableArray<NSVoiceModel *> *taskList; // pending tasks; the first object is the current one
}
@property (nonatomic, assign) NSVoice2TextAuthorationStatus authorationStatus;
@property (nonatomic, strong) SFSpeechRecognizer *speechRecognizer; // speech recognizer
@end

@implementation NSVoice2Text

- (instancetype)init {
    self = [super init];
    if (self) {
        taskList = [NSMutableArray arrayWithCapacity:0];
    }
    return self;
}

+ (instancetype)shareInstance {
    if (!v2text) {
        v2text = [[NSVoice2Text alloc] init];
    }
    return v2text;
}

+ (void)releaseInstance {
    if (v2text) {
        v2text = nil;
    }
}

- (SFSpeechRecognizer *)speechRecognizer {
    if (_speechRecognizer == nil) {
        NSLocale *cale = [[NSLocale alloc] initWithLocaleIdentifier:@"zh-CN"];
        _speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:cale];
        _speechRecognizer.delegate = self;
    }
    return _speechRecognizer;
}

+ (BOOL)isRunning {
    NSVoiceModel *md = [[NSVoice2Text shareInstance]->taskList firstObject];
    return md.isRunning;
}

// Reports the outcome of md, removes it from the queue, and starts the next task.
- (void)finishTask:(NSVoiceModel *)md value:(NSString *)value error:(NSError *)error {
    NSVoice2TextFinal *el = [[NSVoice2TextFinal alloc] init];
    el.taskId = md.taskId;
    el.value = value;
    el.error = error;
    if (md.voiceConversionBlock) {
        md.voiceConversionBlock(el);
    }
    [taskList removeObject:md];
    [self resume];
}

// Starts the first task in the queue unless it has already been started.
- (void)resume {
    NSVoiceModel *md = [taskList firstObject];
    if (md && !md.isInQueue) {
        md.isInQueue = YES;
        if (md.voiceConversionRunningBlock) {
            md.voiceConversionRunningBlock(md);
        }
        if (md.path.length > 0 && !md.isRunning) {
            md.isRunning = YES;
            // Paths that start with http:// or https:// are treated as remote URLs,
            // everything else as a local file path.
            NSString *text = @"^https?://.*";
            NSPredicate *regextest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", text];
            BOOL flag = [regextest evaluateWithObject:md.path];
            if (flag) {
                [self startVoiceConversionWithURL:md.path];
            } else {
                [self startVoiceConversionWithFilePath:md.path];
            }
        } else {
            // Report the failure and drop the task so the rest of the queue is not blocked.
            NSError *error = [NSError errorWithDomain:@"NSVoice2Text"
                                                 code:404
                                             userInfo:@{NSLocalizedDescriptionKey : @"Audio path is invalid or empty"}];
            [self finishTask:md value:nil error:error];
        }
    }
}

// Adds md to the queue unless a task with the same taskId is already queued.
- (void)addItToTask:(NSVoiceModel *)md {
    __block BOOL contained = NO;
    [taskList enumerateObjectsUsingBlock:^(NSVoiceModel * _Nonnull obj, NSUInteger idx, BOOL * _Nonnull stop) {
        if (obj.taskId == md.taskId) {
            contained = YES;
            *stop = YES;
        }
    }];
    if (!contained) {
        [taskList addObject:md];
    }
}

+ (void)voice2TextRequestAuthorationStatus:(void (^)(NSVoice2TextAuthorationStatus status))requestBlock {
    // Send the speech authorization request (first check that the device supports speech recognition).
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        // The system does not guarantee the main queue here, and callers usually update UI.
        dispatch_async(dispatch_get_main_queue(), ^{
            NSVoice2TextAuthorationStatus mapped = (NSVoice2TextAuthorationStatus)status;
            [[NSVoice2Text shareInstance] setAuthorationStatus:mapped];
            requestBlock(mapped);
        });
    }];
}

+ (void)voice2TextGotter:(NSArray<NSVoiceModel *> *)glist
       runningModelBlock:(void (^__nullable)(NSVoiceModel *amodel))runningModelBlock
            resultsBlock:(void (^)(NSVoice2TextFinal *finalValue))resultsBlock
                  rtaget:(id)rtaget {
    [glist enumerateObjectsUsingBlock:^(NSVoiceModel * _Nonnull obj, NSUInteger idx, BOOL * _Nonnull stop) {
        [obj setVoiceConversionRunningBlock:runningModelBlock];
        [obj setVoiceConversionBlock:resultsBlock];
        [obj setTaskTarget:rtaget];
        [[NSVoice2Text shareInstance] addItToTask:obj];
    }];
    [[NSVoice2Text shareInstance] resume];
}

- (void)startVoiceConversionWithFilePath:(NSString *)path {
    [self startVoiceConversion:[NSURL fileURLWithPath:path]];
}

- (void)startVoiceConversionWithURL:(NSString *)url {
    [self startVoiceConversion:[NSURL URLWithString:url]];
}

#pragma mark - private methods

/// Start the conversion
- (void)startVoiceConversion:(NSURL *)url {
    __weak typeof(taskList) weakTaskList = taskList;
    __weak typeof(self) this = self;
    SFSpeechURLRecognitionRequest *recognitionRequest = [[SFSpeechURLRecognitionRequest alloc] initWithURL:url];
    NSLocale *cale = [[NSLocale alloc] initWithLocaleIdentifier:@"zh-CN"];
    SFSpeechRecognizer *sp = [[SFSpeechRecognizer alloc] initWithLocale:cale];
    NSOperationQueue *otherQuene = [[NSOperationQueue alloc] init];
    [sp setQueue:otherQuene];
    [sp recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        // This handler runs on otherQuene; go back to the main queue before
        // touching taskList or calling the caller's blocks.
        dispatch_async(dispatch_get_main_queue(), ^{
            NSVoiceModel *md = [weakTaskList firstObject];
            if (!md.taskTarget) {
                // The owner is gone: drop the task without reporting and move on.
                if (md) {
                    [weakTaskList removeObject:md];
                }
                [this resume];
                return;
            }
            if (error || !result) {
                [this finishTask:md value:nil error:error];
            } else if ([result isFinal]) { // only the final result is reported
                NSString *str = [[result bestTranscription] formattedString];
                [this finishTask:md value:str error:nil];
            }
        });
    }];
}

@end
The class queues transcription tasks internally, so you can pass in data models at any time.
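The result handler in the implementation only acts on the final result. As an optional refinement that the original code does not include, the request can be told not to report partial results at all, and the recognizer can be checked for availability before a task is started. A rough sketch; the helper name `MakeFileRequest` is hypothetical:

```objc
#import <Foundation/Foundation.h>
#import <Speech/Speech.h>

NS_ASSUME_NONNULL_BEGIN

// Hypothetical helper, not part of NSVoice2Text: builds a request for one audio
// URL, or returns nil when the recognizer cannot be used right now.
static SFSpeechURLRecognitionRequest * _Nullable MakeFileRequest(SFSpeechRecognizer * _Nullable recognizer,
                                                                 NSURL *audioURL) {
    // initWithLocale: returns nil for an unsupported locale, and isAvailable
    // is NO while recognition is temporarily unavailable.
    if (recognizer == nil || ![recognizer isAvailable]) {
        return nil;
    }
    SFSpeechURLRecognitionRequest *request =
        [[SFSpeechURLRecognitionRequest alloc] initWithURL:audioURL];
    // Only the final transcription is needed, so skip intermediate callbacks.
    request.shouldReportPartialResults = NO;
    return request;
}

NS_ASSUME_NONNULL_END
```

When the helper returns nil, the task can be reported as failed straight away instead of waiting for a handler that may never be called.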
Code walkthrough:
1. Requesting permission
+ (void)voice2TextRequestAuthorationStatus:(void (^)(NSVoice2TextAuthorationStatus status))requestBlock;
Requests the privacy permission. The feature can only be used after the user grants it.
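The four status values each call for a different reaction in the UI. One way to turn them into a user-facing message is sketched below; the helper name `MessageForStatus` and the message strings are hypothetical:

```objc
#import <Foundation/Foundation.h>
#import "NSVoice2Text.h"

NS_ASSUME_NONNULL_BEGIN

// Hypothetical helper: returns a message to show for a status,
// or nil when recognition may proceed.
static NSString * _Nullable MessageForStatus(NSVoice2TextAuthorationStatus status) {
    switch (status) {
        case NSVoice2TextAuthorizationStatusAuthorized:
            return nil;
        case NSVoice2TextAuthorizationStatusDenied:
            return @"Speech recognition was denied. You can enable it in Settings.";
        case NSVoice2TextAuthorizationStatusRestricted:
            return @"Speech recognition is restricted on this device.";
        case NSVoice2TextAuthorizationStatusNotDetermined:
            return @"Speech recognition permission has not been requested yet.";
    }
    // Defensive fallback for values outside the enum.
    return @"Speech recognition is unavailable.";
}

NS_ASSUME_NONNULL_END
```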
2. Passing in the audio file path
+ (void)voice2TextGotter:(NSArray <NSVoiceModel *>*)glist runningModelBlock:(void (^__nullable)(NSVoiceModel *amodel))runningModelBlock resultsBlock:(void (^)(NSVoice2TextFinal *finalValue))resultsBlock rtaget:(id)rtaget
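Audio is passed in as NSVoiceModel objects. Map each of your audio files to one model; taskId ties the model back to your own data. See the header file for the definitions.
2.1 runningModelBlock: because tasks are queued, this block reports whichever item is currently being processed. The page can use it to show a "Transcribing…" label.
2.2 resultsBlock: delivers the transcribed text as an NSVoice2TextFinal. You only need to handle the result inside this block.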
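Because tasks are queued, several messages can be submitted in one call. A sketch under stated assumptions: the function `TranscribeMessages` and the dictionary that maps message IDs to audio paths are hypothetical, and a dictionary does not guarantee the order in which the tasks are queued:

```objc
#import <Foundation/Foundation.h>
#import "NSVoice2Text.h"

// Hypothetical helper: pathsByMessageId maps message IDs to audio paths.
// owner is stored weakly by the queue, so it must stay alive until all tasks finish.
static void TranscribeMessages(NSDictionary<NSNumber *, NSString *> *pathsByMessageId, id owner) {
    NSMutableArray<NSVoiceModel *> *models = [NSMutableArray array];
    [pathsByMessageId enumerateKeysAndObjectsUsingBlock:^(NSNumber *messageId, NSString *path, BOOL *stop) {
        NSVoiceModel *model = [[NSVoiceModel alloc] init];
        model.taskId = messageId.integerValue; // must be unique: duplicates are skipped
        model.path = path;
        [models addObject:model];
    }];

    [NSVoice2Text voice2TextGotter:models
                 runningModelBlock:^(NSVoiceModel *amodel) {
        NSLog(@"Transcribing message %ld", (long)amodel.taskId);
    }
                      resultsBlock:^(NSVoice2TextFinal *finalValue) {
        if (finalValue.error) {
            NSLog(@"Message %ld failed: %@", (long)finalValue.taskId, finalValue.error);
        } else {
            NSLog(@"Message %ld: %@", (long)finalValue.taskId, finalValue.value);
        }
    }
                            rtaget:owner];
}
```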
3. Complete usage:
// bmodel, weakSelf and showToastMessageThenHide: come from the calling view controller.
[NSVoice2Text voice2TextRequestAuthorationStatus:^(NSVoice2TextAuthorationStatus status) {
    if (status == NSVoice2TextAuthorizationStatusAuthorized) {
        NSVoiceModel *md = [[NSVoiceModel alloc] init];
        [md setTaskId:[bmodel.messageId integerValue]];
        [md setPath:bmodel.audioFilePath];
        [NSVoice2Text voice2TextGotter:@[md] runningModelBlock:^(NSVoiceModel * _Nonnull amodel) {
            NSString *taskId = [@(amodel.taskId) stringValue];
            // Use taskId to find the matching UI and show "Transcribing…"
        } resultsBlock:^(NSVoice2TextFinal * _Nonnull finalValue) {
            if (!finalValue.error) {
                NSString *taskId = [@(finalValue.taskId) stringValue];
                NSString *text = [finalValue value];
                // Use taskId to find the matching UI; transcription finished and text holds the result
            } else {
                NSString *taskId = [@(finalValue.taskId) stringValue];
                // Transcription failed for this taskId; find the matching UI and show "Transcription failed" or similar
            }
        } rtaget:weakSelf];
    } else {
        [weakSelf showToastMessageThenHide:@"Speech recognition is not authorized"];
    }
}];
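The class hands http(s) paths straight to the recognizer. Apple's documentation describes SFSpeechURLRecognitionRequest as a request for a recorded audio file, so if a remote URL does not work in your setup, or you would rather cache the audio, one option is to download it first and pass the local path to NSVoiceModel. A sketch, assuming a hypothetical helper named `DownloadAudio`:

```objc
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

// Hypothetical helper: downloads remote audio into the temporary directory
// and reports the local file path, or nil together with an error.
static void DownloadAudio(NSURL *remoteURL,
                          void (^completion)(NSString * _Nullable localPath, NSError * _Nullable error)) {
    NSURLSessionDownloadTask *task =
        [[NSURLSession sharedSession] downloadTaskWithURL:remoteURL
                                        completionHandler:^(NSURL * _Nullable location,
                                                            NSURLResponse * _Nullable response,
                                                            NSError * _Nullable error) {
        if (error != nil || location == nil) {
            completion(nil, error);
            return;
        }
        // The downloaded file is removed once this handler returns, so move it first.
        NSString *name = [NSUUID UUID].UUIDString;
        if (remoteURL.pathExtension.length > 0) {
            name = [name stringByAppendingPathExtension:remoteURL.pathExtension];
        }
        NSURL *destination =
            [[NSURL fileURLWithPath:NSTemporaryDirectory()] URLByAppendingPathComponent:name];
        NSError *moveError = nil;
        BOOL moved = [[NSFileManager defaultManager] moveItemAtURL:location
                                                             toURL:destination
                                                             error:&moveError];
        completion(moved ? destination.path : nil, moveError);
    }];
    [task resume];
}

NS_ASSUME_NONNULL_END
```

The completion block is called on the session's background queue, so hop to the main queue before creating the NSVoiceModel and calling voice2TextGotter.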