Speech simulation
Overview
The ZegoAvatar SDK provides a speech simulation feature. Based on the sound waves of a voice, this feature drives a virtual avatar to change its mouth shapes in real time, so the avatar can express itself like a real person. This feature is widely applicable in social interaction and voice chat live streaming scenarios.
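To build intuition for how sound waves can drive mouth shapes, the sketch below maps the loudness of a 16-bit PCM frame to a mouth-open weight in [0, 1]. This is an illustration only, not the ZegoAvatar SDK's internal algorithm; the class and method names are hypothetical, and the SDK produces full expression data for you.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only: maps the loudness of a 16-bit PCM frame to a mouth-open
// weight in [0, 1]. The real SDK derives complete expression data internally.
public class MouthShapeSketch {

    // Computes the root mean square (RMS) amplitude of 16-bit little-endian PCM samples.
    static double rms(ByteBuffer pcm) {
        pcm.order(ByteOrder.LITTLE_ENDIAN);
        long sumSquares = 0;
        int samples = pcm.remaining() / 2;
        for (int i = 0; i < samples; i++) {
            short s = pcm.getShort();
            sumSquares += (long) s * s;
        }
        return samples == 0 ? 0.0 : Math.sqrt((double) sumSquares / samples);
    }

    // Normalizes the RMS against the 16-bit full scale and clamps to [0, 1].
    static double mouthOpenWeight(ByteBuffer pcm) {
        double normalized = rms(pcm) / 32768.0;
        // Hypothetical gain of 4 so normal speech opens the mouth visibly.
        return Math.min(1.0, normalized * 4.0);
    }

    public static void main(String[] args) {
        // Silence -> mouth closed.
        ByteBuffer silence = ByteBuffer.allocate(320);
        System.out.println(mouthOpenWeight(silence)); // 0.0

        // A loud constant signal -> mouth fully open (clamped to 1).
        ByteBuffer loud = ByteBuffer.allocate(320);
        for (int i = 0; i < 160; i++) loud.putShort((short) 16000);
        loud.flip();
        System.out.println(mouthOpenWeight(loud)); // 1.0
    }
}
```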
Prerequisites
Before you implement the speech simulation feature, ensure that the following conditions are met:
- Integrate the ZegoAvatar SDK into your project. For more information, see Integrate the SDK.
- Create a basic virtual avatar by referring to Create a virtual avatar.
Implementation steps
Refer to the following procedures to implement the speech simulation feature.
1. Start voice detection
- Before you start voice detection, ensure that the microphone permission is granted.
- If the ZegoCharacterHelper class is used, you do not need to call any IZegoCharacter APIs directly.
After the basic virtual avatar is created, call the startDetectExpression API, set the drive mode to Audio, and use the microphone to detect sound waves. Then, in the callback, call the setExpression API of ZegoCharacterHelper to set facial expressions and drive the mouth shape changes of the virtual avatar.
// Start voice detection.
ZegoAvatarService.getInteractEngine().startDetectExpression(ZegoExpressionDetectMode.Audio, expression -> {
    // Drive mouth shape changes of the virtual avatar.
    mCharacterHelper.setExpression(expression);
});
2. Customize audio collection
You can call the setCustomAudioDelegate API to set a custom audio data collection delegate. (You need to inherit AudioDataDelegate and implement the onStart and onStop methods.) After audio data is collected, call the sendAudioData API to send the data to the SDK.
public class CustomAudioDataDelegate extends AudioDataDelegate {
    private boolean isRunning = false;

    @Override
    public void onStart() {
        // Start audio collection.
        isRunning = true;
    }

    @Override
    public void onStop() {
        // Stop audio collection.
        isRunning = false;
    }

    // This is the audio pre-processing callback of Express. Send the data collected by Express to the ZegoAvatar SDK.
    public void onProcessCapturedAudioData(ByteBuffer data, int dataLength, ZegoAudioFrameParam param) {
        if (isRunning) {
            // Parameters of sendAudioData:
            // data: the source PCM data.
            // dataLength: the data length.
            // dataType: the number of bits per sample. 0 indicates 16 bits, and 1 indicates 8 bits.
            // timeStamp: the time elapsed from the collection start time to now, in milliseconds.
            super.sendAudioData(data, dataLength, 1 /* If the data provided by RTC is 8-bit, pass 1. */,
                    super.getDurationMs() /* This is a parent-class method and can be called directly. */);
        }
    }
}
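The timeStamp argument above is the audio duration accumulated since collection started. As a hedged illustration of what a helper like getDurationMs tracks (the class below is hypothetical, not part of the SDK), the milliseconds covered by captured PCM can be derived from the byte count, sample rate, channel count, and bytes per sample:

```java
// Illustration only: accumulates a millisecond timestamp from raw PCM frame
// sizes, mirroring what a helper such as getDurationMs() would track.
public class PcmDurationTracker {
    private final int sampleRate;     // e.g. 16000 Hz
    private final int channels;       // e.g. 1 for mono
    private final int bytesPerSample; // 2 for 16-bit PCM, 1 for 8-bit PCM
    private long totalBytes = 0;

    public PcmDurationTracker(int sampleRate, int channels, int bytesPerSample) {
        this.sampleRate = sampleRate;
        this.channels = channels;
        this.bytesPerSample = bytesPerSample;
    }

    // Call once per captured frame with the frame's byte length.
    public void addFrame(int dataLength) {
        totalBytes += dataLength;
    }

    // Milliseconds of audio collected so far.
    public long getDurationMs() {
        long bytesPerSecond = (long) sampleRate * channels * bytesPerSample;
        return totalBytes * 1000 / bytesPerSecond;
    }
}
```

For example, at 16 kHz mono 16-bit PCM, a 640-byte frame holds 320 samples and therefore covers 20 ms of audio.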
3. Stop voice detection
When the application goes to the background or the user exits the current page, call the stopDetectExpression API to stop voice detection.
// Stop voice detection.
ZegoAvatarService.getInteractEngine().stopDetectExpression();