
Speech Recognition

Introduction

Speech Recognition, often called Automatic Speech Recognition (ASR) or Speech-to-Text (STT), is the capability of a machine or program to identify words spoken aloud and convert them into readable text.
Sound itself is a wave: a continuous variation in air pressure over time.

Feature Extraction

First, convert the original analog signal into digital form by sampling and quantizing it.
Then, divide the digital audio into short frames (commonly around 20 to 25 ms each) and extract a set of features from each frame.

The goal is to identify the patterns and features of each frame so that it can be mapped to the correct phoneme (the basic unit of sound).
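
The digitize-then-frame steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 16 kHz sample rate, 25 ms frame, and 10 ms hop are common choices rather than values from this page, the input is a synthetic sine tone instead of real speech, and the two per-frame features (energy and zero-crossing count) are simple stand-ins for the richer features, such as MFCCs, that real recognizers typically use. All function names are hypothetical.

```python
import math

def make_sine(freq_hz, duration_s, sample_rate):
    """Stand-in for a recorded waveform: a pure tone with values in [-1, 1]."""
    n = round(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def quantize(samples, bits=16):
    """Map float samples in [-1, 1] to signed integers (the 'digital format')."""
    max_int = 2 ** (bits - 1) - 1
    return [int(round(max(-1.0, min(1.0, s)) * max_int)) for s in samples]

def split_frames(samples, frame_len, hop_len):
    """Cut the signal into overlapping fixed-length frames; a short tail is dropped."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop_len)]

def frame_features(frame):
    """Two very simple features: mean energy and number of sign changes."""
    energy = sum(s * s for s in frame) / len(frame)
    zero_crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return energy, zero_crossings

sample_rate = 16000                       # assumed: 16 kHz
signal = quantize(make_sine(440, 0.1, sample_rate))
frames = split_frames(signal, frame_len=400, hop_len=160)  # 25 ms frames, 10 ms hop
features = [frame_features(f) for f in frames]
print(len(frames), "frames,", len(features[0]), "features per frame")
```

In a real pipeline, the per-frame feature vectors would be passed to an acoustic model that scores which phoneme each frame most likely belongs to.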

WER & CER

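WER (Word Error Rate) and CER (Character Error Rate) are the standard metrics for measuring the accuracy of speech recognition. Both count the substitutions, deletions, and insertions needed to turn the recognized text into the reference transcript, divided by the length of the reference in words (WER) or characters (CER). Lower is better.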
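
As a rough sketch of how these metrics can be computed, the Python below uses the Levenshtein (edit) distance between the reference and the recognized text. The function names are hypothetical, and the tokenization is deliberately naive (whitespace splitting, no case or punctuation normalization); evaluation toolkits usually normalize text before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, delete, insert)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)

print(wer("the cat sat on the mat", "the cat sit on mat"))
print(cer("hello", "hallo"))
```

In the first call there is one substitution ("sat" to "sit") and one deletion (the second "the") against six reference words, so the WER is 2/6. Because insertions are counted too, both metrics can exceed 1.0 when the hypothesis is much longer than the reference.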


Last updated 25 days ago