
AI-based noise reduction in real-world scenarios

Last updated: 2024-09-09 18:34

Scene-based AI noise reduction recognizes the current audio scene in real time and automatically adjusts the AI noise reduction strategy to deliver the best noise reduction and audio quality. Two common scenes are currently supported:

  • Call scene: all sounds other than the human voice are treated as noise and removed. Building on steady-state noise suppression (see Audio 3A processing for details), non-steady-state noise is also removed while the voice stays high-fidelity. This covers mouse clicks, keyboard typing, tapping, air conditioning, kitchen utensils, noisy restaurants, ambient noise, coughing, blowing, and other non-voice noise, as well as voice reverberation in small rooms.
  • Music scene: the noise reduction effect is adjusted automatically to preserve music quality. Music detection runs in real time on the microphone input, and in sound card, singing, or near-field music scenes the noise reduction level is adjusted automatically to keep the music high-fidelity.

  • Before using the AI noise reduction feature, contact ZEGOCLOUD technical support to obtain a special SDK package.
  • Starting from version 3.0.0, ZEGO Express SDK supports intelligent recognition of music scenes. In a music scene, AI noise reduction automatically lowers the noise reduction level to improve audio quality. To use this feature, contact ZEGOCLOUD technical support for a special SDK package and configuration.
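
The scene switching described above can be pictured with a small sketch. This is an illustration only, not the SDK's internal algorithm: the class name, the per-frame music probability input, the thresholds, and the suppression levels are all assumptions made for the example. It shows one plausible way to switch between a call scene and a music scene with hysteresis, so that the scene does not flip on every frame.

```java
// Illustrative only: NOT the ZEGO Express SDK's internal algorithm.
// All names, thresholds, and levels below are hypothetical.
final class SceneAwareNoiseReduction {
    enum Scene { CALL, MUSIC }

    // Hypothetical hysteresis thresholds on the music probability.
    static final double ENTER_MUSIC = 0.8;
    static final double EXIT_MUSIC = 0.4;

    private Scene scene = Scene.CALL;

    // Update the scene from a per-frame music probability in [0, 1]
    // (assumed to come from some music detector on the microphone input).
    Scene update(double musicProbability) {
        if (scene == Scene.CALL && musicProbability >= ENTER_MUSIC) {
            scene = Scene.MUSIC;
        } else if (scene == Scene.MUSIC && musicProbability <= EXIT_MUSIC) {
            scene = Scene.CALL;
        }
        return scene;
    }

    // Hypothetical suppression strength: full for calls, gentle for music.
    static double suppressionLevel(Scene scene) {
        return scene == Scene.MUSIC ? 0.2 : 1.0;
    }

    public static void main(String[] args) {
        SceneAwareNoiseReduction nr = new SceneAwareNoiseReduction();
        double[] probabilities = {0.1, 0.9, 0.6, 0.3};
        for (double p : probabilities) {
            Scene s = nr.update(p);
            System.out.println(p + " -> " + s + ", level " + suppressionLevel(s));
        }
    }
}
```

With the sample probabilities in `main`, the scene stays in music at 0.6 because that value is above the exit threshold, and returns to the call scene only at 0.3.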

Advantages

  • 80% of the noise can be eliminated.
  • Low latency.
  • Low memory usage, similar to traditional noise reduction.
  • Low CPU usage.
  • Music scene recognition accuracy reaches 99%.

Use cases

This feature is suitable for one-on-one or multi-person audio and video calls such as voice chat rooms, meetings, and voice gaming, as well as live streaming or online KTV scenarios that involve sound cards, singing, or near-field music.

To enable music scene recognition, please turn on the music detection switch and contact ZEGOCLOUD technical support to configure the music detection function.

Noise that can be eliminated

Developers can use this feature to eliminate the following noises:

Scene            Typical noises
Meeting room     Keyboard sound; table tapping sound
Office           Keyboard sound; colleagues talking
Transportation   Car horn sound; whistling sound of cars passing by; car music sound; rain and windshield wiper sound
Internet cafe    Keyboard sound; surrounding people talking
Coffee shop      Chair dragging sound; surrounding people talking; sharp collision sound

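Effect demonstration

Office

The original audio contains mouse clicks, keyboard sounds, applause, friction sounds, office chatter, air conditioning noise, and more.

Original audio: [audio sample]

After AI noise reduction: [audio sample]

Public places

The original audio contains rain, trams, stir-fry cooking sounds, cars whizzing past, and more.

Original audio: [audio sample]

After AI noise reduction: [audio sample]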

Prerequisites

Before implementing the AI denoising feature, please make sure:

  • A project has been created in the ZEGOCLOUD Console, and a valid AppID and AppSign have been obtained. For details, please refer to Console - How to view project information.
  • ZEGO Express SDK has been integrated into the project to implement basic real-time audio and video functions. For details, please refer to Integrate the SDK and Implement a basic video call.

Steps to use

Follow these steps to configure AI noise reduction:

  1. Contact ZEGOCLOUD technical support to enable the music detection feature. Skip this step if it is already enabled.

  2. Initialize the engine and log in to a room. For the specific process, refer to "Create engine" and "Join room" in the video call implementation guide.

  3. Call the enableANS interface to enable noise suppression, which makes the human voice clearer.

  4. After enabling noise suppression, call the setANSMode interface to set the ANS mode and enable AI denoising. Some of the AI denoising modes are listed below; for more modes, refer to ZegoANSMode.

    AI denoising mode: applicable scenarios
    ZegoANSMode.AI: Lightweight mode with low power consumption and package size, still providing good denoising effects. Suitable for indoor noise environments and relatively comfortable regions in China.
    ZegoANSMode.AI_BALANCED: Balanced mode that completely eliminates noise while preserving the human voice without loss. Slightly increased power consumption. Suitable for complex communication environments such as outdoor markets, transportation, and regions with severe noise interference.
    ZegoANSMode.AI_LOW_LATENCY: Low latency mode that maintains pure denoising effects and high-fidelity voice quality even with a 10 ms delay. Suitable for latency-sensitive scenarios such as game voice chat, game team communication, and real-time singing.

    // Enable ANS (noise suppression).
    engine.enableANS(true);
    // Set the AI noise suppression mode according to your requirements.
    // Note: after the ANS mode is set to one of the AI modes of ZegoANSMode,
    // ZEGO Express SDK forcibly disables transient noise suppression [enableTransientANS].
    engine.setANSMode(ZegoANSMode.AI);
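
The mode table and the two calls above can be tied together with a small helper. The following is a minimal sketch under stated assumptions: `AnsMode` and `AnsEngine` are local stand-ins that mirror `ZegoANSMode` and the `enableANS` / `setANSMode` calls so the example compiles without the SDK, and `Scenario` is a hypothetical app-level label that is not part of the SDK. In a real project you would presumably pass the SDK engine and `ZegoANSMode` values instead.

```java
// Sketch only: AnsMode and AnsEngine are stand-ins for the SDK types,
// and Scenario is a hypothetical app-level enum (not part of the SDK).
final class AnsModeChooser {
    // Mirrors the three AI modes listed in the table above.
    enum AnsMode { AI, AI_BALANCED, AI_LOW_LATENCY }

    // Hypothetical scenario labels an app might define for itself.
    enum Scenario { QUIET_INDOOR, NOISY_OUTDOOR, GAME_VOICE, REAL_TIME_SINGING }

    // Stand-in for the two engine calls used in this guide.
    interface AnsEngine {
        void enableANS(boolean enable);
        void setANSMode(AnsMode mode);
    }

    // Map a scenario to a mode, following the "applicable scenarios" column.
    static AnsMode chooseMode(Scenario scenario) {
        switch (scenario) {
            case NOISY_OUTDOOR:
                return AnsMode.AI_BALANCED;
            case GAME_VOICE:
            case REAL_TIME_SINGING:
                return AnsMode.AI_LOW_LATENCY;
            default:
                return AnsMode.AI;
        }
    }

    // Enable noise suppression first, then select the AI mode (steps 3 and 4).
    static void configure(AnsEngine engine, Scenario scenario) {
        engine.enableANS(true);
        engine.setANSMode(chooseMode(scenario));
    }

    public static void main(String[] args) {
        AnsEngine printingEngine = new AnsEngine() {
            @Override public void enableANS(boolean enable) {
                System.out.println("enableANS(" + enable + ")");
            }
            @Override public void setANSMode(AnsMode mode) {
                System.out.println("setANSMode(" + mode + ")");
            }
        };
        configure(printingEngine, Scenario.GAME_VOICE);
    }
}
```

Keeping the scenario-to-mode mapping in one function makes it easy to adjust if testing shows a different mode works better for a given environment.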