Contents

Introduction to Audio2Face

Using It in Unity


Introduction to Audio2Face

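Amid the metaverse boom, and to help AI digital humans reach more fields, FACEGOOD has open-sourced its audio-driven lip-sync algorithm. The repository is at: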

https://github.com/FACEGOOD/FACEGOOD-Audio2Face

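The technique converts audio data in real time into weight data that drives a digital human's facial BlendShapes. Unlike ARKit, which defines 52 BlendShapes, it outputs as many as 116. The ARKit values can be derived from them through the following correspondence: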

ARKit Voice2Face
eyeBlinkLeft eye_blink2_l
eyeLookDownLeft eye_lookDown2_l
eyeLookInLeft eye_lookRight_l
eyeLookOutLeft eye_lookLeft_l
eyeLookUpLeft eye_lookUp_l
eyeSquintLeft eye_shutTight_l
eyeWideLeft max(eye_downLidRaise_l,eye_upperLidRaise_l)
eyeBlinkRight eye_blink2_r
eyeLookDownRight eye_lookDown2_r
eyeLookInRight eye_lookRight_r
eyeLookOutRight eye_lookLeft_r
eyeLookUpRight eye_lookUp_r
eyeSquintRight eye_shutTight_r
eyeWideRight max(eye_downLidRaise_r,eye_upperLidRaise_r)
jawForward jaw_thrust_c
jawLeft jaw_sideways_l
jawRight jaw_sideways_r
jawOpen mouth_stretch_c
mouthClose mouth_chew_c
mouthFunnel max(mouth_funnel_dl,mouth_funnel_dr,mouth_funnel_ul,mouth_funnel_ur)
mouthPucker max(mouth_pucker_l,mouth_pucker_r)
mouthLeft mouth_sideways_l
mouthRight mouth_sideways_r
mouthSmileLeft mouth_lipCornerPull_l
mouthSmileRight mouth_lipCornerPull_r
mouthFrownLeft max(mouth_lipCornerDepress_l,mouth_lipCornerDepressFix_l)
mouthFrownRight max(mouth_lipCornerDepress_r,mouth_lipCornerDepressFix_r)
mouthDimpleLeft mouth_dimple_l
mouthDimpleRight mouth_dimple_r
mouthStretchLeft mouth_lipStretch_l
mouthStretchRight mouth_lipStretch_r
mouthRollLower max(mouth_suck_dl,mouth_suck_dr)
mouthRollUpper max(mouth_suck_ul,mouth_suck_ur)
mouthShrugLower mouth_chinRaise_d
mouthShrugUpper mouth_chinRaise_u
mouthPressLeft mouth_press_l
mouthPressRight mouth_press_r
mouthLowerDownLeft mouth_lowerLipDepress_l
mouthLowerDownRight mouth_lowerLipDepress_r
mouthUpperUpLeft mouth_upperLipRaise_l
mouthUpperUpRight mouth_upperLipRaise_r
browDownLeft brow_lower_l
browDownRight brow_lower_r
browInnerUp brow_raise_c
browOuterUpLeft brow_raise_l
browOuterUpRight brow_raise_r
cheekPuff max(cheek_puff_l,cheek_puff_r)
cheekSquintLeft cheek_UP
cheekSquintRight cheek_UP
noseSneerLeft nose_out_l
noseSneerRight nose_out_r
tongueOut
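
The correspondence above can be expressed as data rather than as hand-written assignments. The following is a minimal sketch in plain C#, not FACEGOOD's own code: the class `ArkitMapping`, the method `Resolve`, and the name-keyed `weights` dictionary are hypothetical names introduced for illustration, and only three rows of the table are filled in.

```csharp
using System;
using System.Collections.Generic;

public static class ArkitMapping
{
    // A few rows copied from the table above; the remaining rows follow the same pattern.
    private static readonly Dictionary<string, string[]> Sources = new Dictionary<string, string[]>
    {
        { "jawOpen",     new[] { "mouth_stretch_c" } },
        { "mouthPucker", new[] { "mouth_pucker_l", "mouth_pucker_r" } },
        { "mouthFunnel", new[] { "mouth_funnel_dl", "mouth_funnel_dr", "mouth_funnel_ul", "mouth_funnel_ur" } },
    };

    // Hypothetical helper: 'weights' maps each source BlendShape name to its current value.
    // An ARKit value is the maximum over all of its source BlendShapes.
    public static float Resolve(string arkitName, IReadOnlyDictionary<string, float> weights)
    {
        float best = float.NegativeInfinity;
        foreach (string source in Sources[arkitName])
        {
            best = Math.Max(best, weights[source]);
        }
        return best;
    }
}
```

For instance, with mouth_pucker_l = 0.1 and mouth_pucker_r = 0.4, `Resolve("mouthPucker", weights)` returns 0.4.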

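The generated output is shown in the figure below: 116 decimal values, each in the range -1 to 1:

These 116 values correspond, in order, to the following 116 BlendShape names: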

brow_lower_l
tongue_Scale__X
tongue_Scale_Y
tongue_Scale__Y
tongue_Scale_Z
tongue_Scale__Z
nose_out_l
nose_out_r
tongue_u
tongue_u_u
brow_raise_d
cheek_suck_r
mouth_stretch_u
tongue_u_d
tooth_d_d
tongue_d
tooth_r
tooth_d_u
cheek_UP
eye_blink1_l
eye_blink1_r
eye_blink2_l
eye_blink2_r
eye_lidTight_l
eye_lidTight_r
eye_shutTight_l
eye_shutTight_r
brow_lower_r
eye_upperLidRaise_l
eye_upperLidRaise_r
eye_downLidRaise_l
eye_downLidRaise_r
jaw_sideways_l
jaw_sideways_r
jaw_thrust_c
mouth_chew_c
mouth_chinRaise_d
mouth_chinRaise_u
brow_raise_c
mouth_dimple_l
mouth_dimple_r
mouth_funnel_dl
mouth_funnel_dr
mouth_funnel_ul
mouth_funnel_ur
mouth_lipCornerDepressFix_l
mouth_lipCornerDepressFix_r
mouth_lipCornerDepress_l
mouth_lipCornerDepress_r
brow_raise_l
mouth_lipCornerPullOpen_l
mouth_lipCornerPullOpen_r
mouth_lipCornerPull_l
mouth_lipCornerPull_r
mouth_lipStretchOpen_l
mouth_lipStretchOpen_r
mouth_lipStretch_l
mouth_lipStretch_r
mouth_lowerLipDepress_l
mouth_lowerLipDepress_r
brow_raise_r
mouth_lowerLipProtrude_c
mouth_oh_c
mouth_oo_c
mouth_pressFix_c
mouth_press_l
mouth_press_r
mouth_pucker_l
mouth_pucker_r
mouth_screamFix_c
mouth_sideways_l
cheek_puff_l
mouth_sideways_r
mouth_stretch_c
mouth_suck_dl
mouth_suck_dr
mouth_suck_ul
mouth_suck_ur
mouth_upperLipRaise_l
mouth_upperLipRaise_r
nose_wrinkle_l
nose_wrinkle_r
cheek_puff_r
tooth_l
eye_lookDown1_l
eye_lookDown2_l
eye_lookLeft_l
eye_lookRight_l
eye_lookUp_l
eye_lookDown1_r
eye_lookDown2_r
eye_lookLeft_r
eye_lookRight_r
cheek_raise_l
eye_lookUp_r
tongue_Rot_1X
tongue_Rot__1X
tongue_Rot_2X
tongue_Rot__2X
tongue_Rot_3X
tongue_Rot__3X
tongue_Rot_1Y
tongue_Rot__1Y
tongue_Rot_2Y
cheek_raise_r
tongue_Rot__2Y
tongue_Rot_3Y
tongue_Rot__3Y
tongue_Rot_1Z
tongue_Rot__1Z
tongue_Rot_2Z
tongue_Rot__2Z
tongue_Rot_3Z
tongue_Rot__3Z
tongue_Scale_X
cheek_suck_l
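
Because a value's position in the output is the only link to its name, it may help to build a name-to-index lookup once instead of hard-coding numeric indices. A minimal sketch follows; `BlendShapeIndex` and `Build` are hypothetical names, and the caller is assumed to pass the 116 names in exactly the order listed above.

```csharp
using System;
using System.Collections.Generic;

public static class BlendShapeIndex
{
    // Builds a name -> zero-based position lookup from the ordered name list.
    public static Dictionary<string, int> Build(IReadOnlyList<string> orderedNames)
    {
        var lookup = new Dictionary<string, int>(orderedNames.Count);
        for (int i = 0; i < orderedNames.Count; i++)
        {
            lookup[orderedNames[i]] = i;
        }
        return lookup;
    }
}
```

With the full list passed in order, `lookup["brow_lower_l"]` is 0 and `lookup["cheek_suck_l"]` is 115.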

Using It in Unity

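One approach is to build a Python service. The Unity client turns on the microphone and records audio, then sends the audio data to the Python server. The server converts it into BlendShape weight data and returns the result to the Unity client, which uses it to drive the face. Note that BlendShape weights in Unity do not use the range [-1,1], so the values must be remapped.

For example: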

// Map [-1,1] to [-100,100]
private float Remap(float v)
{
    return v * 100f;
}
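
For the upload half of the round trip described above, the recorded samples have to be serialized before they can be sent. The article does not specify a wire format, so the following is only a sketch under one assumption: that the server accepts 16-bit little-endian PCM. `PcmEncoder` and `ToPcm16` are hypothetical names; the input is an array of float samples in [-1,1], which is the form Unity's `AudioClip.GetData` provides.

```csharp
using System;

public static class PcmEncoder
{
    // Converts float samples in [-1, 1] to 16-bit little-endian PCM bytes.
    // Assumption: the receiving service expects this format; adjust if it does not.
    public static byte[] ToPcm16(float[] samples)
    {
        var bytes = new byte[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            // Clamp first so out-of-range samples cannot overflow the 16-bit range.
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            short value = (short)Math.Round(clamped * short.MaxValue);
            bytes[2 * i] = (byte)(value & 0xFF);            // low byte
            bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF); // high byte
        }
        return bytes;
    }
}
```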

Below is a BlendShape weight data file generated from a test audio clip. Each line contains 116 weight values. To test with it, place it in the StreamingAssets folder.
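
Since every line is expected to hold exactly 116 values, it can be worth validating each line as it is parsed. The sketch below is plain C# and assumes the values are separated by spaces or tabs and written with a `.` decimal point; `WeightLineParser` and `TryParseLine` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class WeightLineParser
{
    public const int ExpectedCount = 116;

    // Parses one whitespace-separated line of weights.
    // Returns false if the value count is not 116 or any token is not a number.
    public static bool TryParseLine(string line, out float[] weights)
    {
        string[] parts = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        weights = new float[parts.Length];
        if (parts.Length != ExpectedCount)
        {
            return false;
        }
        for (int i = 0; i < parts.Length; i++)
        {
            // Invariant culture keeps "0.5" parseable on locales that use a decimal comma.
            if (!float.TryParse(parts[i], NumberStyles.Float, CultureInfo.InvariantCulture, out weights[i]))
            {
                return false;
            }
        }
        return true;
    }
}
```

Rejecting malformed lines up front avoids an index-out-of-range error later, when individual weights are read by position.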

Test model:

Test code:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using UnityEngine;

public class TEST : MonoBehaviour
{
    private Coroutine coroutine;
    private SkinnedMeshRenderer smr;
    // One entry per line of weight.txt; each entry holds the 116 weights of one frame.
    private readonly List<List<float>> valueList = new List<List<float>>();

    private IEnumerator Start()
    {
        smr = GetComponent<SkinnedMeshRenderer>();
        string path = Path.Combine(Application.streamingAssetsPath, "weight.txt");
        using (StreamReader streamReader = new StreamReader(path))
        {
            string content;
            while ((content = streamReader.ReadLine()) != null)
            {
                // Split on spaces, ignoring repeated separators.
                string[] splitArray = content.Trim().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                if (splitArray.Length == 0)
                {
                    continue; // Skip blank lines.
                }
                List<float> list = new List<float>(splitArray.Length);
                for (int i = 0; i < splitArray.Length; i++)
                {
                    // Parse with the invariant culture so "0.5" reads the same on every locale.
                    float.TryParse(splitArray[i], NumberStyles.Float, CultureInfo.InvariantCulture, out float result);
                    list.Add(result);
                }
                valueList.Add(list);
                yield return null; // Read one line per frame.
            }
        }
    }

    // Plays the loaded frames back. The first argument of SetBlendShapeWeight is the
    // BlendShape index on the test model; the list index is the position in the 116-value output.
    private IEnumerator ExecuteCoroutine()
    {
        for (int i = 0; i < valueList.Count; i++)
        {
            List<float> list = valueList[i];
            smr.SetBlendShapeWeight(0, Remap(list[49]));    // brow_raise_l
            smr.SetBlendShapeWeight(1, Remap(list[60]));    // brow_raise_r
            smr.SetBlendShapeWeight(2, Remap(list[25]));    // eye_shutTight_l
            smr.SetBlendShapeWeight(3, Remap(list[26]));    // eye_shutTight_r
            smr.SetBlendShapeWeight(4, Remap(list[87]));    // eye_lookRight_l
            smr.SetBlendShapeWeight(5, Remap(list[86]));    // eye_lookLeft_l
            smr.SetBlendShapeWeight(6, Remap(list[92]));    // eye_lookRight_r
            smr.SetBlendShapeWeight(7, Remap(list[91]));    // eye_lookLeft_r
            smr.SetBlendShapeWeight(8, Remap(list[88]));    // eye_lookUp_l
            smr.SetBlendShapeWeight(9, Remap(list[94]));    // eye_lookUp_r
            smr.SetBlendShapeWeight(10, Remap(list[85]));   // eye_lookDown2_l
            smr.SetBlendShapeWeight(11, Remap(list[90]));   // eye_lookDown2_r
            smr.SetBlendShapeWeight(12, Mathf.Max(Remap(list[71]), Remap(list[82])));   // cheek_puff_l cheek_puff_r
            smr.SetBlendShapeWeight(13, Remap(list[18]));   // cheek_UP
            smr.SetBlendShapeWeight(14, Remap(list[18]));   // cheek_UP
            smr.SetBlendShapeWeight(15, Remap(list[6]));    // nose_out_l
            smr.SetBlendShapeWeight(16, Remap(list[7]));    // nose_out_r
            smr.SetBlendShapeWeight(17, Remap(list[70]));   // mouth_sideways_l
            smr.SetBlendShapeWeight(18, Remap(list[72]));   // mouth_sideways_r
            smr.SetBlendShapeWeight(19, Mathf.Max(Remap(list[67]), Remap(list[68])));   // mouth_pucker_l mouth_pucker_r
            smr.SetBlendShapeWeight(20, Mathf.Max(Remap(list[41]), Remap(list[42]), Remap(list[43]), Remap(list[44])));   // mouth_funnel_dl dr ul ur
            smr.SetBlendShapeWeight(21, Remap(list[52]));   // mouth_lipCornerPull_l
            smr.SetBlendShapeWeight(22, Remap(list[53]));   // mouth_lipCornerPull_r
            smr.SetBlendShapeWeight(23, Mathf.Max(Remap(list[47]), Remap(list[45])));   // mouth_lipCornerDepress_l mouth_lipCornerDepressFix_l
            smr.SetBlendShapeWeight(24, Mathf.Max(Remap(list[48]), Remap(list[46])));   // mouth_lipCornerDepress_r mouth_lipCornerDepressFix_r
            smr.SetBlendShapeWeight(25, Remap(list[39]));   // mouth_dimple_l
            smr.SetBlendShapeWeight(26, Remap(list[40]));   // mouth_dimple_r
            smr.SetBlendShapeWeight(27, Remap(list[65]));   // mouth_press_l
            smr.SetBlendShapeWeight(28, Remap(list[66]));   // mouth_press_r
            smr.SetBlendShapeWeight(29, Remap(list[36]));   // mouth_chinRaise_d
            smr.SetBlendShapeWeight(30, Remap(list[37]));   // mouth_chinRaise_u
            smr.SetBlendShapeWeight(31, Remap(list[56]));   // mouth_lipStretch_l
            smr.SetBlendShapeWeight(32, Remap(list[57]));   // mouth_lipStretch_r
            smr.SetBlendShapeWeight(33, Remap(list[78]));   // mouth_upperLipRaise_l
            smr.SetBlendShapeWeight(34, Remap(list[79]));   // mouth_upperLipRaise_r
            smr.SetBlendShapeWeight(35, Remap(list[58]));   // mouth_lowerLipDepress_l
            smr.SetBlendShapeWeight(36, Remap(list[59]));   // mouth_lowerLipDepress_r
            smr.SetBlendShapeWeight(37, Mathf.Max(Remap(list[76]), Remap(list[77])));   // mouth_suck_ul mouth_suck_ur
            smr.SetBlendShapeWeight(38, Mathf.Max(Remap(list[74]), Remap(list[75])));   // mouth_suck_dl mouth_suck_dr
            smr.SetBlendShapeWeight(39, Remap(list[35]));   // mouth_chew_c
            smr.SetBlendShapeWeight(40, Remap(list[34]));   // jaw_thrust_c
            smr.SetBlendShapeWeight(41, Remap(list[73]));   // mouth_stretch_c
            smr.SetBlendShapeWeight(42, Remap(list[32]));   // jaw_sideways_l
            smr.SetBlendShapeWeight(43, Remap(list[33]));   // jaw_sideways_r
            smr.SetBlendShapeWeight(44, Remap(list[38]));   // brow_raise_c
            smr.SetBlendShapeWeight(45, Remap(list[22]));   // eye_blink2_r
            smr.SetBlendShapeWeight(46, Remap(list[21]));   // eye_blink2_l
            smr.SetBlendShapeWeight(47, Remap(list[0]));    // brow_lower_l
            smr.SetBlendShapeWeight(48, Remap(list[27]));   // brow_lower_r
            smr.SetBlendShapeWeight(49, Mathf.Max(Remap(list[31]), Remap(list[29])));   // eye_downLidRaise_r eye_upperLidRaise_r
            smr.SetBlendShapeWeight(50, Mathf.Max(Remap(list[30]), Remap(list[28])));   // eye_downLidRaise_l eye_upperLidRaise_l
            yield return new WaitForSeconds(.07f);
        }
        coroutine = null;
    }

    // Map [-1,1] to [-100,100]
    private float Remap(float v)
    {
        return v * 100f;
    }

    private void OnGUI()
    {
        // "Begin" is available only while nothing is playing.
        GUI.enabled = coroutine == null;
        if (GUILayout.Button("Begin", GUILayout.Width(200f), GUILayout.Height(50f)))
        {
            coroutine = StartCoroutine(ExecuteCoroutine());
        }
        // "Stop" is available only during playback.
        GUI.enabled = coroutine != null;
        if (GUILayout.Button("Stop", GUILayout.Width(200f), GUILayout.Height(50f)))
        {
            StopCoroutine(coroutine);
            coroutine = null;
        }
    }
}
