What is Automated Speech Recognition?

Updated on Feb 03, 2026

This guide explains the basics of Automated Speech Recognition (ASR) and related concepts. More information regarding Automated Speech Recognition can be found here.

What is Automated Speech Recognition (ASR)?

ASR is a technology that recognizes spoken language and automatically generates text transcripts.

Do all my videos receive transcripts?

All new videos uploaded to the platform are transcribed automatically. However, media uploaded before automatic transcription was enabled is not transcribed, and videos longer than 4 hours are not transcribed automatically. Videos can still be submitted for transcription, or transcripts can be uploaded manually.

How long does it take for my video to be transcribed?

Transcription can take up to twice the length of the video. For example, a 1-hour video may take up to 2 hours to produce a transcript. Processing may take longer during periods of high demand.
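The rule of thumb above can be expressed as a short calculation. The following is a minimal Python sketch, not part of the platform: the function name and the constant are hypothetical, and the factor of 2 is simply the "up to double" guideline from this guide, which does not account for delays under high demand.

```python
from datetime import timedelta

# Assumption: the "up to twice the video length" guideline from this guide,
# treated as a rough upper bound under normal demand.
TRANSCRIPTION_FACTOR = 2


def estimate_max_transcription_time(video_length: timedelta) -> timedelta:
    """Return a rough upper bound on how long transcription may take."""
    return video_length * TRANSCRIPTION_FACTOR


# A 1-hour video may take up to 2 hours.
print(estimate_max_transcription_time(timedelta(hours=1)))  # 2:00:00
```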

If a recording is longer than 4 hours, it is not eligible for ASR transcription. This is due to limits set by EchoVideo's web host. You can transcribe these videos manually and upload the VTT file to apply it to the media. Alternatively, you can edit longer videos into shorter segments, which will then be transcribed whenever the ASR feature is toggled (either upon saving after editing or when posted to a course).
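For anyone preparing a manual transcript, a VTT (WebVTT) file is plain text: a WEBVTT header line, followed by cues, each with a start and end timestamp and the spoken text. The following Python sketch shows one way to build such a file from a list of timed lines. The function names and the sample cues are hypothetical, and the sketch covers only the basic cue layout, not styling or positioning options.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(cues: list[tuple[float, float, str]]) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) cues."""
    lines = ["WEBVTT", ""]  # header line, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical sample cues for illustration only.
sample_cues = [
    (0.0, 2.5, "Welcome to today's lecture."),
    (2.5, 6.0, "We will begin with a short review."),
]
print(build_vtt(sample_cues))
```

The resulting text can be saved with a .vtt extension, using UTF-8 encoding, before it is uploaded.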

What is the difference between transcripts and closed captioning?

Transcripts are a textual representation of the spoken audio in a video. Closed captions meet ADA requirements for accessibility and include contextual audio such as background noises. For example, consider a scene in which someone opens a door and greets another person. A transcript would record only the spoken words, while closed captions would also include speaker identification and sounds such as the door opening and other background noises.

Can transcripts be edited?

Yes, transcripts can be edited within the Echo360 environment or downloaded as a text file for offline editing and reuploading.
