Unity FACEGOOD Audio2Face: Driving Facial BlendShapes from Audio
Contents
Introduction to Audio2Face
Using It in Unity
Introduction to Audio2Face
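Amid the metaverse boom, and to bring AI digital humans into more fields, FACEGOOD has open-sourced its speech-driven lip-sync algorithm. The repository is at:
https://github.com/FACEGOOD/FACEGOOD-Audio2Face
The technique converts audio data in real time into weight data that drives a digital human's facial BlendShapes. Unlike ARKit, which defines 52 BlendShapes, it outputs 116. The ARKit values can be derived from them using the following correspondence: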
ARKit | Voice2Face
eyeBlinkLeft | eye_blink2_l
eyeLookDownLeft | eye_lookDown2_l
eyeLookInLeft | eye_lookRight_l
eyeLookOutLeft | eye_lookLeft_l
eyeLookUpLeft | eye_lookUp_l
eyeSquintLeft | eye_shutTight_l
eyeWideLeft | max(eye_downLidRaise_l, eye_upLidRaise_l)
eyeBlinkRight | eye_blink2_r
eyeLookDownRight | eye_lookDown2_r
eyeLookInRight | eye_lookRight_r
eyeLookOutRight | eye_lookLeft_r
eyeLookUpRight | eye_lookUp_r
eyeSquintRight | eye_shutTight_r
eyeWideRight | max(eye_downLidRaise_r, eye_upLidRaise_r)
jawForward | jaw_thrust_c
jawLeft | jaw_sideways_l
jawRight | jaw_sideways_r
jawOpen | mouth_stretch_c
mouthClose | mouth_chew_c
mouthFunnel | max(mouth_funnel_dl, mouth_funnel_dr, mouth_funnel_ul, mouth_funnel_ur)
mouthPucker | max(mouth_pucker_l, mouth_pucker_r)
mouthLeft | mouth_sideways_l
mouthRight | mouth_sideways_r
mouthSmileLeft | mouth_lipCornerPull_l
mouthSmileRight | mouth_lipCornerPull_r
mouthFrownLeft | max(mouth_lipCornerDepress_l, mouth_lipCornerDepressFix_l)
mouthFrownRight | max(mouth_lipCornerDepress_r, mouth_lipCornerDepressFix_r)
mouthDimpleLeft | mouth_dimple_l
mouthDimpleRight | mouth_dimple_r
mouthStretchLeft | mouth_lipStretch_l
mouthStretchRight | mouth_lipStretch_r
mouthRollLower | max(mouth_suck_dl, mouth_suck_dr)
mouthRollUpper | max(mouth_suck_ul, mouth_suck_ur)
mouthShrugLower | mouth_chinRaise_d
mouthShrugUpper | mouth_chinRaise_u
mouthPressLeft | mouth_press_l
mouthPressRight | mouth_press_r
mouthLowerDownLeft | mouth_lowerLipDepress_l
mouthLowerDownRight | mouth_lowerLipDepress_r
mouthUpperUpLeft | mouth_upperLipRaise_l
mouthUpperUpRight | mouth_upperLipRaise_r
browDownLeft | brow_lower_l
browDownRight | brow_lower_r
browInnerUp | brow_raise_c
browOuterUpLeft | brow_raise_l
browOuterUpRight | brow_raise_r
cheekPuff | max(cheek_puff_l, cheek_puff_r)
cheekSquintLeft | cheek_up
cheekSquintRight | cheek_up
noseSneerLeft | nose_out_l
noseSneerRight | nose_out_r
tongueOut |
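The correspondence above can be expressed as a lookup table in which each ARKit name lists one or more source channels and the result is their maximum. The following is a minimal sketch in Python, the language of the server side described in this article. It covers only a few rows of the table, and the names `ARKIT_FROM_VOICE2FACE` and `to_arkit` are illustrative, not part of the FACEGOOD project. Treating a missing channel as 0.0 is also an assumption.

```python
from typing import Dict, Tuple

# Subset of the table above: ARKit name -> source channels (combined with max).
ARKIT_FROM_VOICE2FACE: Dict[str, Tuple[str, ...]] = {
    "eyeBlinkLeft": ("eye_blink2_l",),
    "jawOpen": ("mouth_stretch_c",),
    "mouthPucker": ("mouth_pucker_l", "mouth_pucker_r"),
    "mouthFunnel": (
        "mouth_funnel_dl", "mouth_funnel_dr",
        "mouth_funnel_ul", "mouth_funnel_ur",
    ),
    "cheekPuff": ("cheek_puff_l", "cheek_puff_r"),
}


def to_arkit(weights: Dict[str, float]) -> Dict[str, float]:
    """Convert named source weights into ARKit weights using the max rule.

    Channels missing from `weights` are treated as 0.0 (an assumption).
    """
    return {
        arkit_name: max(weights.get(src, 0.0) for src in sources)
        for arkit_name, sources in ARKIT_FROM_VOICE2FACE.items()
    }


sample = {"mouth_stretch_c": 0.5, "mouth_pucker_l": 0.25, "mouth_pucker_r": 0.75}
arkit = to_arkit(sample)
print(arkit["jawOpen"], arkit["mouthPucker"], arkit["eyeBlinkLeft"])
```

Keeping the mapping as data rather than as a long run of assignments makes it easier to compare against the table row by row.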
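The generated output consists of 116 decimal values, each in the range -1 to 1:
[Figure: sample of the generated weight data]
These 116 values correspond, in order, to the following 116 BlendShape names: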
brow_lower_l
tongue_Scale__X
tongue_Scale_Y
tongue_Scale__Y
tongue_Scale_Z
tongue_Scale__Z
nose_out_l
nose_out_r
tongue_u
tongue_u_u
brow_raise_d
cheek_suck_r
mouth_stretch_u
tongue_u_d
tooth_d_d
tongue_d
tooth_r
tooth_d_u
cheek_UP
eye_blink1_l
eye_blink1_r
eye_blink2_l
eye_blink2_r
eye_lidTight_l
eye_lidTight_r
eye_shutTight_l
eye_shutTight_r
brow_lower_r
eye_upperLidRaise_l
eye_upperLidRaise_r
eye_downLidRaise_l
eye_downLidRaise_r
jaw_sideways_l
jaw_sideways_r
jaw_thrust_c
mouth_chew_c
mouth_chinRaise_d
mouth_chinRaise_u
brow_raise_c
mouth_dimple_l
mouth_dimple_r
mouth_funnel_dl
mouth_funnel_dr
mouth_funnel_ul
mouth_funnel_ur
mouth_lipCornerDepressFix_l
mouth_lipCornerDepressFix_r
mouth_lipCornerDepress_l
mouth_lipCornerDepress_r
brow_raise_l
mouth_lipCornerPullOpen_l
mouth_lipCornerPullOpen_r
mouth_lipCornerPull_l
mouth_lipCornerPull_r
mouth_lipStretchOpen_l
mouth_lipStretchOpen_r
mouth_lipStretch_l
mouth_lipStretch_r
mouth_lowerLipDepress_l
mouth_lowerLipDepress_r
brow_raise_r
mouth_lowerLipProtrude_c
mouth_oh_c
mouth_oo_c
mouth_pressFix_c
mouth_press_l
mouth_press_r
mouth_pucker_l
mouth_pucker_r
mouth_screamFix_c
mouth_sideways_l
cheek_puff_l
mouth_sideways_r
mouth_stretch_c
mouth_suck_dl
mouth_suck_dr
mouth_suck_ul
mouth_suck_ur
mouth_upperLipRaise_l
mouth_upperLipRaise_r
nose_wrinkle_l
nose_wrinkle_r
cheek_puff_r
tooth_l
eye_lookDown1_l
eye_lookDown2_l
eye_lookLeft_l
eye_lookRight_l
eye_lookUp_l
eye_lookDown1_r
eye_lookDown2_r
eye_lookLeft_r
eye_lookRight_r
cheek_raise_l
eye_lookUp_r
tongue_Rot_1X
tongue_Rot__1X
tongue_Rot_2X
tongue_Rot__2X
tongue_Rot_3X
tongue_Rot__3X
tongue_Rot_1Y
tongue_Rot__1Y
tongue_Rot_2Y
cheek_raise_r
tongue_Rot__2Y
tongue_Rot_3Y
tongue_Rot__3Y
tongue_Rot_1Z
tongue_Rot__1Z
tongue_Rot_2Z
tongue_Rot__2Z
tongue_Rot_3Z
tongue_Rot__3Z
tongue_Scale_X
cheek_suck_l
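Because the values are matched to names purely by position, it can be safer to look indices up by name than to hard-code them. The sketch below shows the idea in Python. To stay short, `NAMES` holds only the first eight names from the list above; a real table would hold all 116 in the same order. `INDEX_OF` and `label_frame` are hypothetical helper names.

```python
from typing import Dict, List, Sequence

# First eight names from the list above; a full table would hold all 116, in order.
NAMES: List[str] = [
    "brow_lower_l", "tongue_Scale__X", "tongue_Scale_Y", "tongue_Scale__Y",
    "tongue_Scale_Z", "tongue_Scale__Z", "nose_out_l", "nose_out_r",
]

# Zero-based position of every channel name.
INDEX_OF: Dict[str, int] = {name: i for i, name in enumerate(NAMES)}


def label_frame(values: Sequence[float], names: Sequence[str] = NAMES) -> Dict[str, float]:
    """Pair one frame of weights with the channel names; lengths must match."""
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))


frame = label_frame([0.1, 0.0, 0.0, 0.0, 0.0, 0.0, -0.2, 0.3])
print(INDEX_OF["nose_out_l"], frame["nose_out_r"])
```

The length check catches a truncated or misaligned frame early, before a wrong value is applied to the wrong BlendShape.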
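Using It in Unity
One approach is to build a Python service. The Unity client records audio from the microphone and sends the audio data to the Python server; the server converts it into BlendShape weight data and returns it to the Unity client, which uses it to drive the face. Note that BlendShape weights in Unity are not in the range [-1, 1], so the values must be remapped.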
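The article does not specify how the weights travel from the server back to the client. One simple option, sketched here in Python, is a fixed-size binary frame of 116 little-endian 32-bit floats built with the standard `struct` module. The wire format and the names `pack_frame` and `unpack_frame` are assumptions made for illustration, not the FACEGOOD project's protocol.

```python
import struct
from typing import List, Sequence

CHANNELS = 116                    # weights per frame, as stated in the article
FRAME_FORMAT = f"<{CHANNELS}f"    # assumed wire format: little-endian float32
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)


def pack_frame(weights: Sequence[float]) -> bytes:
    """Serialize one frame of weights into a fixed-size byte string."""
    if len(weights) != CHANNELS:
        raise ValueError(f"expected {CHANNELS} weights, got {len(weights)}")
    return struct.pack(FRAME_FORMAT, *weights)


def unpack_frame(payload: bytes) -> List[float]:
    """Deserialize a byte string produced by pack_frame."""
    if len(payload) != FRAME_SIZE:
        raise ValueError(f"expected {FRAME_SIZE} bytes, got {len(payload)}")
    return list(struct.unpack(FRAME_FORMAT, payload))


frame = [0.0] * CHANNELS
frame[73] = 0.5  # index 73 is mouth_stretch_c in the list above
payload = pack_frame(frame)
print(FRAME_SIZE, unpack_frame(payload)[73])
```

A fixed frame size lets the client read exactly `FRAME_SIZE` bytes per frame from a stream without a separate length prefix. On the Unity side, a `BinaryReader` reads little-endian values, so it would match this layout.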
For example:
// Map [-1, 1] to [-100, 100]
private float Remap(float v)
{
    return v * 100f;
}
Below is a BlendShape weight data file generated from a test audio clip. Each line contains 116 weight values. To test with it, place it in the StreamingAssets folder.
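Before loading such a file in Unity, it can be worth checking that every line really holds 116 values. The following Python sketch assumes the values are separated by whitespace, one frame per line; `parse_weight_lines` is a hypothetical helper name, and the sample data is synthetic.

```python
import io
from typing import Iterable, List

CHANNELS = 116  # weights per line, as stated in the article


def parse_weight_lines(lines: Iterable[str], channels: int = CHANNELS) -> List[List[float]]:
    """Parse whitespace-separated weights, one frame per line."""
    frames: List[List[float]] = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # splits on any whitespace and ignores repeats
        if not fields:
            continue           # skip blank lines
        if len(fields) != channels:
            raise ValueError(
                f"line {line_no}: expected {channels} values, got {len(fields)}"
            )
        frames.append([float(field) for field in fields])
    return frames


# Synthetic two-frame sample with a blank line in between.
sample = io.StringIO(
    " ".join(["0.25"] * CHANNELS) + "\n\n" + " ".join(["-1"] * CHANNELS) + "\n"
)
frames = parse_weight_lines(sample)
print(len(frames), frames[0][0], frames[1][115])
```

The same function accepts an open file object, since iterating over a text file yields its lines.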
Test model:
Test code:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using UnityEngine;

public class TEST : MonoBehaviour
{
    private Coroutine coroutine;
    private SkinnedMeshRenderer smr;
    // One inner list per line of the weight file (116 values each).
    private readonly List<List<float>> valueList = new List<List<float>>();

    private IEnumerator Start()
    {
        smr = GetComponent<SkinnedMeshRenderer>();
        string path = Path.Combine(Application.streamingAssetsPath, "weight.txt");
        using (StreamReader streamReader = new StreamReader(path))
        {
            string content;
            while ((content = streamReader.ReadLine()) != null)
            {
                // Drop empty entries so repeated spaces do not shift the indices.
                string[] splitArray = content.Trim().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                if (splitArray.Length == 0) continue; // skip blank lines
                List<float> list = new List<float>();
                for (int i = 0; i < splitArray.Length; i++)
                {
                    // Parse with the invariant culture so "0.5" works on every locale.
                    float.TryParse(splitArray[i], NumberStyles.Float, CultureInfo.InvariantCulture, out float result);
                    list.Add(result);
                }
                valueList.Add(list);
                yield return null; // spread loading across frames
            }
        }
    }

    private IEnumerator ExecuteCoroutine()
    {
        for (int i = 0; i < valueList.Count; i++)
        {
            List<float> list = valueList[i];
            // First argument: BlendShape index on the test model.
            // list[n]: zero-based position in the 116-name list above.
            smr.SetBlendShapeWeight(0, Remap(list[49])); // brow_raise_l
            smr.SetBlendShapeWeight(1, Remap(list[60])); // brow_raise_r
            smr.SetBlendShapeWeight(2, Remap(list[25])); // eye_shutTight_l
            smr.SetBlendShapeWeight(3, Remap(list[26])); // eye_shutTight_r
            smr.SetBlendShapeWeight(4, Remap(list[87])); // eye_lookRight_l
            smr.SetBlendShapeWeight(5, Remap(list[86])); // eye_lookLeft_l
            smr.SetBlendShapeWeight(6, Remap(list[92])); // eye_lookRight_r
            smr.SetBlendShapeWeight(7, Remap(list[91])); // eye_lookLeft_r
            smr.SetBlendShapeWeight(8, Remap(list[88])); // eye_lookUp_l
            smr.SetBlendShapeWeight(9, Remap(list[94])); // eye_lookUp_r
            smr.SetBlendShapeWeight(10, Remap(list[85])); // eye_lookDown2_l
            smr.SetBlendShapeWeight(11, Remap(list[90])); // eye_lookDown2_r
            smr.SetBlendShapeWeight(12, Mathf.Max(Remap(list[71]), Remap(list[82]))); // cheek_puff_l, cheek_puff_r
            smr.SetBlendShapeWeight(13, Remap(list[18])); // cheek_UP
            smr.SetBlendShapeWeight(14, Remap(list[18])); // cheek_UP
            smr.SetBlendShapeWeight(15, Remap(list[6])); // nose_out_l
            smr.SetBlendShapeWeight(16, Remap(list[7])); // nose_out_r
            smr.SetBlendShapeWeight(17, Remap(list[70])); // mouth_sideways_l
            smr.SetBlendShapeWeight(18, Remap(list[72])); // mouth_sideways_r
            smr.SetBlendShapeWeight(19, Mathf.Max(Remap(list[67]), Remap(list[68]))); // mouth_pucker_l, mouth_pucker_r
            smr.SetBlendShapeWeight(20, Mathf.Max(Remap(list[41]), Remap(list[42]), Remap(list[43]), Remap(list[44]))); // mouth_funnel_dl, dr, ul, ur
            smr.SetBlendShapeWeight(21, Remap(list[52])); // mouth_lipCornerPull_l
            smr.SetBlendShapeWeight(22, Remap(list[53])); // mouth_lipCornerPull_r
            smr.SetBlendShapeWeight(23, Mathf.Max(Remap(list[47]), Remap(list[45]))); // mouth_lipCornerDepress_l, mouth_lipCornerDepressFix_l
            smr.SetBlendShapeWeight(24, Mathf.Max(Remap(list[48]), Remap(list[46]))); // mouth_lipCornerDepress_r, mouth_lipCornerDepressFix_r
            smr.SetBlendShapeWeight(25, Remap(list[39])); // mouth_dimple_l
            smr.SetBlendShapeWeight(26, Remap(list[40])); // mouth_dimple_r
            smr.SetBlendShapeWeight(27, Remap(list[65])); // mouth_press_l
            smr.SetBlendShapeWeight(28, Remap(list[66])); // mouth_press_r
            smr.SetBlendShapeWeight(29, Remap(list[36])); // mouth_chinRaise_d
            smr.SetBlendShapeWeight(30, Remap(list[37])); // mouth_chinRaise_u
            smr.SetBlendShapeWeight(31, Remap(list[56])); // mouth_lipStretch_l
            smr.SetBlendShapeWeight(32, Remap(list[57])); // mouth_lipStretch_r
            smr.SetBlendShapeWeight(33, Remap(list[78])); // mouth_upperLipRaise_l
            smr.SetBlendShapeWeight(34, Remap(list[79])); // mouth_upperLipRaise_r
            smr.SetBlendShapeWeight(35, Remap(list[58])); // mouth_lowerLipDepress_l
            smr.SetBlendShapeWeight(36, Remap(list[59])); // mouth_lowerLipDepress_r
            smr.SetBlendShapeWeight(37, Mathf.Max(Remap(list[76]), Remap(list[77]))); // mouth_suck_ul, mouth_suck_ur
            smr.SetBlendShapeWeight(38, Mathf.Max(Remap(list[74]), Remap(list[75]))); // mouth_suck_dl, mouth_suck_dr
            smr.SetBlendShapeWeight(39, Remap(list[35])); // mouth_chew_c
            smr.SetBlendShapeWeight(40, Remap(list[34])); // jaw_thrust_c
            smr.SetBlendShapeWeight(41, Remap(list[73])); // mouth_stretch_c
            smr.SetBlendShapeWeight(42, Remap(list[32])); // jaw_sideways_l
            smr.SetBlendShapeWeight(43, Remap(list[33])); // jaw_sideways_r
            smr.SetBlendShapeWeight(44, Remap(list[38])); // brow_raise_c
            smr.SetBlendShapeWeight(45, Remap(list[22])); // eye_blink2_r
            smr.SetBlendShapeWeight(46, Remap(list[21])); // eye_blink2_l
            smr.SetBlendShapeWeight(47, Remap(list[0])); // brow_lower_l
            smr.SetBlendShapeWeight(48, Remap(list[27])); // brow_lower_r
            smr.SetBlendShapeWeight(49, Mathf.Max(Remap(list[31]), Remap(list[29]))); // eye_downLidRaise_r, eye_upperLidRaise_r
            smr.SetBlendShapeWeight(50, Mathf.Max(Remap(list[30]), Remap(list[28]))); // eye_downLidRaise_l, eye_upperLidRaise_l
            yield return new WaitForSeconds(.07f); // one line of weights every 0.07 s
        }
        coroutine = null;
    }

    // Map [-1, 1] to [-100, 100]
    private float Remap(float v)
    {
        return v * 100f;
    }

    private void OnGUI()
    {
        // "Begin" is available only while nothing is playing.
        GUI.enabled = coroutine == null;
        if (GUILayout.Button("Begin", GUILayout.Width(200f), GUILayout.Height(50f)))
        {
            coroutine = StartCoroutine(ExecuteCoroutine());
        }
        // "Stop" is available only during playback.
        GUI.enabled = coroutine != null;
        if (GUILayout.Button("Stop", GUILayout.Width(200f), GUILayout.Height(50f)))
        {
            StopCoroutine(coroutine);
            coroutine = null;
        }
    }
}
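The test code advances one line of weights per 0.07-second wait. A coroutine wait can only resume on a rendered frame, so small delays may accumulate and the face may drift behind the audio. One alternative is to pick the line to display from the elapsed playback time. The logic is shown below as a Python sketch to keep it language-neutral; `frame_index_at` is a hypothetical name, and the 0.07-second interval is simply the value used in the test code.

```python
def frame_index_at(elapsed_seconds: float, frame_count: int,
                   frame_interval: float = 0.07) -> int:
    """Return the index of the weight frame to show after `elapsed_seconds`.

    The result is clamped to the valid range [0, frame_count - 1].
    """
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    if frame_interval <= 0:
        raise ValueError("frame_interval must be positive")
    index = int(elapsed_seconds / frame_interval)
    return min(max(index, 0), frame_count - 1)


print(frame_index_at(0.0, 100), frame_index_at(0.5, 100), frame_index_at(60.0, 100))
```

In Unity, the elapsed time could come from the audio playback position (for example `AudioSource.time`), so that the facial animation follows the sound rather than its own timer.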