CONTROLLABLE AVATAR

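AI Digital Human Presenter Generator

Produce professional AI presenter videos for commercial use. Control gestures precisely, clone voices, and produce virtual spokesperson videos at scale, without hiring actors or renting a recording studio.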

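Digital Human Gesture and Pose Preset Simulator

Built on a 3D pose skeleton engine, the simulator lets you choose and preview a wide range of presenter gestures and motion rhythms.

⚡ Real-time monitor for 3D skeleton gestures and poses

Natural Explanation (Steady Explain)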

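30+ Preset Micro-Actions

A rich action library spanning five categories: natural explanation, gesture interaction, emotional expression, body posture, and natural physiological actions.

✦

High-Freedom Body Language

The digital human's movements either sync automatically with the spoken narration or are selected by the system to best fit the context.

✦

Smart Action Assignment

There is no need to specify every action by hand; the system can add body language that fits the current scene based on the content.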

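Controllable Digital Human Workbench: Core Controls and Configuration (Workbench Real Controls)

The live demo workbench provides a full set of fine-tuning parameters, so that the digital human delivers your script with the poses and actions that best match your business intent: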

🖼️

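First-Frame Portrait Generation and Positioning

Generate the first-frame portrait from a text prompt or upload a local photo. Visual drag-to-position and smart Top-Weighted composition cropping lock in the digital human's pose and viewing angle.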

🎭

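Action Driving and Multi-Shot Concurrency

Includes 3D skeleton action presets in three categories: regular, extreme, and dance. Select multiple presets and launch concurrent multi-threaded GPU generation of multiple shots with one click.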

🎙️

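Voice Cloning and Audio Driving

Supports zero-shot voice cloning or direct upload of local audio. A lip-synced voice track is generated from your text and matched to the digital human's mouth shapes and emotion.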

⚙️

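Frame Specifications and Rendering Parameters

Configure aspect ratios such as 16:9 and 9:16 and resolutions of 720p or 1080p, and adjust frame rate and inference steps to balance inference efficiency against image quality.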
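
To make the framing controls above more concrete, here is a minimal Python sketch of how an aspect ratio and resolution label could resolve to pixel dimensions, and how a top-weighted crop box might be computed for an uploaded portrait. This is not Cuevo's implementation: the function names, the `top_weight` parameter, and the reading of "Top-Weighted" as "keep most of the vertical slack below the crop so the head stays in frame" are all assumptions made for illustration.

```python
# Illustrative only: names and the 0.2 default are assumptions, not Cuevo's API.

# Short-side pixel counts for the resolution labels mentioned on this page.
SHORT_SIDE = {"720p": 720, "1080p": 1080}


def frame_size(aspect: str, resolution: str) -> tuple[int, int]:
    """Return (width, height) in pixels, e.g. ("16:9", "720p") -> (1280, 720)."""
    w_ratio, h_ratio = (int(part) for part in aspect.split(":"))
    short = SHORT_SIDE[resolution]
    if w_ratio >= h_ratio:
        # Landscape or square: the height is the short side.
        width, height = short * w_ratio // h_ratio, short
    else:
        # Portrait: the width is the short side.
        width, height = short, short * h_ratio // w_ratio
    # Round down to even numbers, which many video encoders expect.
    return width - width % 2, height - height % 2


def top_weighted_crop(src_w: int, src_h: int, aspect: str,
                      top_weight: float = 0.2) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest crop with the target aspect.

    When the source is taller than the target, only `top_weight` of the spare
    height is trimmed from the top, so the crop stays biased toward the head.
    """
    w_ratio, h_ratio = (int(part) for part in aspect.split(":"))
    if src_w * h_ratio > src_h * w_ratio:
        # Source is too wide: keep the full height and trim both sides equally.
        crop_w, crop_h = src_h * w_ratio // h_ratio, src_h
        left, top = (src_w - crop_w) // 2, 0
    else:
        # Source is too tall: keep the full width and trim mostly from the bottom.
        crop_w, crop_h = src_w, src_w * h_ratio // w_ratio
        left, top = 0, round((src_h - crop_h) * top_weight)
    return left, top, left + crop_w, top + crop_h
```

Under these assumptions, `frame_size("9:16", "1080p")` gives a 1080 x 1920 portrait frame, and a 1000 x 2000 upload cropped to 16:9 keeps a 1000 x 562 band that starts 288 pixels from the top.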

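Complete Action Preset Dictionary

All of the actions below come from the system's 3D pose preset library. Each preset contains a precise limb-skeleton sequence and an action description, and is consumed directly by the video rendering engine.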

👤 Natural explanation

Keep a relaxed smile and make restrained gestures while speaking.

GRP: core | READY ⚡

👤 Emphasis

Raise your hand to emphasize an important point, then return to a neutral explaining state.

GRP: core | READY ⚡

👤 Look at the teleprompter

Look down briefly at the teleprompter text, then look back at the camera.

GRP: core | READY ⚡

👤 Drink water

Pick up the water glass and take a sip before continuing to talk.

GRP: core | READY ⚡

👤 Sneeze

Sneeze gently, cover it with your hand, then recover.

GRP: core | READY ⚡

👤 Clear throat

Clear your throat with a slight cough and continue presenting.

GRP: core | READY ⚡

👤 Laugh briefly

Laugh briefly at something funny.

GRP: core | READY ⚡

👤 Adjust your glasses

Adjust your glasses before continuing.

GRP: core | READY ⚡

👤 Tidy hair

Tidy the hair around your ears and continue.

GRP: core | READY ⚡

👤 Rub temples

Briefly rub your temples, as if thinking.

GRP: core | READY ⚡

👤 Shrug

Shrug to express frustration, then continue explaining.

GRP: core | READY ⚡

👤 Spread your hands to explain

Spread your hands slightly to explain.

GRP: core | READY ⚡

👤 List two points

Count off the first and second points with your hands.

GRP: core | READY ⚡

👤 Opening greeting

Face the camera and wave hello naturally.

GRP: core | READY ⚡

👤 Wave at the end

Finish with a smile and a wave.

GRP: core | READY ⚡

👤 Transition hand clap

Clap your hands lightly, as if switching to the next point.

GRP: core | READY ⚡

👤 Look at side screen

Look briefly to the side, then back to the camera.

GRP: core | READY ⚡

👤 Nod continuously

Nod continuously to indicate approval or confirmation.

GRP: core | READY ⚡

👤 Shake your head gently

Shake your head to signal disagreement and continue explaining.

GRP: core | READY ⚡

👤 Surprise reaction

Show a brief expression of surprise.

GRP: core | READY ⚡

👤 Hold back a yawn

Briefly suppress a small yawn, then recover.

GRP: core | READY ⚡

👤 Sniff

Sniff lightly and continue as if nothing happened.

GRP: core | READY ⚡

👤 Cover your mouth and cough lightly

Cover your mouth with your hand and cough lightly before continuing.

GRP: core | READY ⚡

👤 Lean forward

Lean forward slightly, as if to emphasize the point.

GRP: core | READY ⚡

👤 Lean back slightly

Lean back briefly to relax, then return to the presenting position.

GRP: core | READY ⚡

👤 Touch your chin and think

Touch your chin as if thinking about what to say.

GRP: core | READY ⚡

👤 Thumbs up

Give a brief thumbs up to express a recommendation.

GRP: core | READY ⚡

👤 Point your index finger up as a reminder

Raise your index finger to remind the audience of a key point.

GRP: core | READY ⚡

👤 Air quotes with both hands

Make an air-quote gesture with both hands.

GRP: core | READY ⚡

👤 Put your hands on your chest

Put your hands on your chest to express sincerity.

GRP: core | READY ⚡

👤 Deep breath adjustment

Take a slight deep breath before continuing.

GRP: core | READY ⚡

👤 Move your neck

Move your neck slightly and continue.

GRP: core | READY ⚡

👤 Tidy collar

Straighten your collar or neckline.

GRP: core | READY ⚡

👤 Listen carefully

Listen as if receiving feedback from the floor director, then nod and continue.

GRP: core | READY ⚡

👤 Cross your arms and think

Fold your arms briefly to think, then unfold them.

GRP: core | READY ⚡

👤 Blink in confusion

Look a little confused, then relieved.

GRP: core | READY ⚡

👤 Tilt your head and smile

Tilt your head slightly and smile, then return to an upright position.

GRP: core | READY ⚡

* The backend ships with 30+ built-in action commands, with more presets being added; body movements can be driven flexibly through prompts.
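
The dictionary entries above share one shape: a name, a description, a group label, and a readiness badge. The sketch below models that shape in Python and shows one hypothetical way a preset description could be attached to a script line as a prompt. The class, field, and function names are assumptions, the bracketed prompt format is invented for illustration, and the skeleton sequence that each real preset carries is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionPreset:
    """One dictionary entry; field names are illustrative, not Cuevo's schema."""
    name: str               # display name shown on the card
    description: str        # motion description shown under the name
    group: str = "core"     # the "GRP" label on each card
    status: str = "READY"   # the readiness badge on each card


# Three entries copied from the dictionary above.
PRESETS = [
    ActionPreset("Opening greeting", "Face the camera and wave hello naturally."),
    ActionPreset("Thumbs up", "Give a brief thumbs up to express a recommendation."),
    ActionPreset("Wave at the end", "Finish with a smile and a wave."),
]


def find_presets(presets, group="core", status="READY"):
    """Return the presets that belong to `group` and carry `status`."""
    return [p for p in presets if p.group == group and p.status == status]


def get_preset(name, presets=PRESETS):
    """Look a preset up by name, ignoring case and surrounding whitespace."""
    wanted = name.strip().lower()
    for preset in presets:
        if preset.name.lower() == wanted:
            return preset
    raise KeyError(f"unknown preset: {name!r}")


def build_prompt(script_line, preset):
    """Hypothetical prompt format: the script line plus the action description."""
    return f"{script_line} [action: {preset.description}]"
```

For example, `build_prompt("Welcome back!", get_preset("opening greeting"))` pairs the line with the greeting wave description.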

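Digital Human Generation and Rendering Pipeline

Built on an end-to-end AI video generation architecture. From script parsing, voice cloning, and audio track alignment to concurrent multi-threaded rendering on a GPU compute cluster, the pipeline delivers finished videos efficiently and without quality loss.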

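01 SCRIPT & PROMPT

Script and Prompt Input

Enter the script you want the digital human to read, or provide a prompt for generating the first-frame portrait. The system automatically parses the text's semantics and segments it along the speech timeline.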

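02 VOICE & LIP-SYNC

Voice Cloning and Lip-Sync Synthesis

High-fidelity voiceover is generated with zero-shot voice cloning or a curated built-in voice, and a lip-sync alignment algorithm maps speech features to lip and facial muscle trajectories with millisecond precision.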

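03 POSE & GPU RENDER

3D Pose Driving and Concurrent GPU Rendering

The selected 3D skeleton gestures and poses are merged with the voice track, and the GPU rendering cluster is scheduled concurrently to batch-generate finished digital human videos with smooth, natural action transitions.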
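
The timing side of the three stages above can be pictured with a small Python sketch: split the script into sentences, spread a known audio duration across them, and convert millisecond boundaries into frame indices. This is a simplified stand-in rather than the pipeline's actual method. In particular, allocating time in proportion to character count is an assumption made here; a production system would more likely take timestamps from the speech synthesis or alignment step. All function names are hypothetical.

```python
import re


def split_script(script: str) -> list[str]:
    """Split after sentence-ending punctuation (Latin or CJK), keeping the mark."""
    parts = re.split(r"(?<=[.!?。！？])\s*", script.strip())
    return [part for part in parts if part]


def ms_to_frame(ms: int, fps: int) -> int:
    """Index of the video frame that is on screen at `ms` milliseconds."""
    return ms * fps // 1000


def build_timeline(sentences: list[str], audio_ms: int, fps: int) -> list[dict]:
    """Give each sentence a time span proportional to its character count."""
    total_chars = sum(len(s) for s in sentences)
    if total_chars == 0:
        return []
    timeline = []
    consumed = 0  # characters already placed on the timeline
    for sentence in sentences:
        start_ms = audio_ms * consumed // total_chars
        consumed += len(sentence)
        end_ms = audio_ms * consumed // total_chars
        timeline.append({
            "text": sentence,
            "start_ms": start_ms,
            "end_ms": end_ms,
            "start_frame": ms_to_frame(start_ms, fps),
            "end_frame": ms_to_frame(end_ms, fps),
        })
    return timeline
```

With a 5-second track at 25 fps, the script "Hello there. Welcome!" splits into two segments covering frames 0 to 75 and 75 to 125, which is the kind of boundary a gesture preset could be scheduled against.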

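Step-by-Step Guide

How to Generate an AI Presenter Video

Cuevo compiles one portrait, one script, and a few action tags into a fully rendered virtual spokesperson video. Follow the five steps below to deliver your first finished video within minutes.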

Upload Presenter Image
PNG, JPG up to 20MB
1
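Load the First-Frame Portrait
Upload a half-body portrait photo, or describe the look you want and let the text-to-image engine generate the first frame for you.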
GESTURE PRESETS
Stable Presentation
Expressive Hand Gestures
Cheerful Dance Motion
2
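Select Driving Actions
Multi-select from the gesture, extreme, and dance preset libraries; each selected action runs as an independent parallel render task.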
Text-to-Speech
Type script
Voice Clone
Use voice profile
Audio File
Upload WAV/MP3
3
Choose a Voiceover Method
Enter a script and pick a built-in TTS voice, bind a cloned voice sample, or upload your own WAV/MP3 audio track.
ASPECT RATIO
Landscape 16:9
Portrait 9:16
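INFERENCE STEPS: 20 Steps
4
Adjust Output Parameters
Set the aspect ratio, resolution, frame rate, and inference steps to balance rendering speed against image quality.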
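BATCH GENERATOR: Completed
Stable
Gesture
5
Batch Generate and Download with One Click
Click Batch Generate; each selected action renders to its own MP4. Preview every version in the gallery and download the ones you are happy with.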
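
The five steps above amount to building one render job per selected action and running the jobs in parallel. The Python sketch below shows that fan-out with a thread pool. It is a rough illustration, not Cuevo's backend: `RenderJob`, `render`, `batch_generate`, and the output file naming are all hypothetical, and `render` is a placeholder that only returns the file name a real renderer would write.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderJob:
    """Inputs gathered in steps 1 to 4; field names are illustrative."""
    portrait: str              # first-frame image (PNG or JPG)
    audio: str                 # voice track (WAV or MP3)
    preset: str                # one selected action preset
    aspect: str = "16:9"
    resolution: str = "720p"
    steps: int = 20            # inference steps, as in the mockup above


def output_name(job: RenderJob) -> str:
    """Derive one MP4 file name per preset, e.g. 'thumbs-up_16x9_720p.mp4'."""
    slug = "-".join(job.preset.lower().split())
    return f"{slug}_{job.aspect.replace(':', 'x')}_{job.resolution}.mp4"


def render(job: RenderJob) -> str:
    """Placeholder for the real render call; returns the would-be output file."""
    return output_name(job)


def batch_generate(portrait, audio, presets, max_workers=4, **options):
    """Create one job per selected preset and run the jobs concurrently."""
    jobs = [RenderJob(portrait, audio, preset, **options) for preset in presets]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input jobs.
        return list(pool.map(render, jobs))
```

Calling `batch_generate("host.png", "voice.wav", ["Stable Presentation", "Cheerful Dance Motion"], aspect="9:16", resolution="1080p")` returns one file name per preset, in selection order. Threads suit this sketch because each job would mostly wait on a remote GPU service; the actual scheduling used by the product is not described on this page.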

FAQ

Learn how Cuevo AI drives digital human pose generation, voice-cloned voiceover, and batch video rendering.

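Q. How do I select and drive different actions and poses for the digital human?

The action preset library in the Cuevo AI workbench includes a wide range of 3D skeleton actions covering regular explanation, high-frequency interaction, and dance steps. Select one or more actions, and the system applies the chosen poses smoothly to the digital human video.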

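Q. How does the system achieve voice cloning and high-precision lip-sync?

We integrate zero-shot voice cloning with a precise lip-shape matching algorithm. Upload a short audio reference, and the system reproduces your voice and speech prosody with high fidelity, then maps the audio track's features to the digital human's lips and facial expressions with millisecond precision.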

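Q. Which quality specifications and batch export options do generated videos support?

The workbench supports aspect ratios including 16:9 landscape and 9:16 portrait, along with 720p/1080p resolution and frame rate settings. The system supports concurrent multi-shot batch rendering, and finished videos can be downloaded losslessly in high definition with one click.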

