The Spoken Language Analysis module of Open Brain AI offers a suite of tools for speech-to-text conversion and for the automatic analysis of the transcribed texts across the different linguistic levels.
The first step in the Spoken Language Analysis module is transcription. Open Brain AI performs automatic transcription with an Automatic Speech Recognition (ASR) system, Google Speech-to-Text, one of the most accurate commercial ASR systems available.
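As an illustration, a minimal sketch of how a transcription request to Google Speech-to-Text might look from Python is shown below; the file name, audio encoding, sample rate, and language code are assumptions, not the module's actual configuration.

```python
from google.cloud import speech

def transcribe(path: str) -> str:
    """Send a local audio file to Google Speech-to-Text and return the transcript."""
    client = speech.SpeechClient()
    with open(path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,  # assumed WAV/PCM input
        sample_rate_hertz=16000,                                   # assumed sampling rate
        language_code="en-US",                                     # assumed language
        enable_word_time_offsets=True,                             # word-level timestamps for later alignment
    )
    response = client.recognize(config=config, audio=audio)
    return " ".join(result.alternatives[0].transcript for result in response.results)

print(transcribe("sample_recording.wav"))  # hypothetical file name
```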
There are two options for dealing with background noise in the audio files:
If there is more than one speaker in the audio file, the Spoken Language Analysis module splits the transcription into separate transcripts for each speaker. This is useful in clinical settings, where you may want to analyze the speech of a patient and their clinician separately.
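As a rough illustration, the snippet below assembles per-speaker transcripts from Google Speech-to-Text's speaker diarization output; the speaker counts and field handling are assumptions rather than the module's actual implementation.

```python
from collections import defaultdict
from google.cloud import speech

def split_by_speaker(path: str, num_speakers: int = 2) -> dict[int, str]:
    """Return one transcript per speaker tag using Google Speech-to-Text diarization."""
    client = speech.SpeechClient()
    with open(path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        diarization_config=speech.SpeakerDiarizationConfig(
            enable_speaker_diarization=True,
            min_speaker_count=num_speakers,   # assumed: patient plus clinician
            max_speaker_count=num_speakers,
        ),
    )
    response = client.recognize(config=config, audio=audio)
    # The final result carries the words of the whole recording with speaker tags.
    words = response.results[-1].alternatives[0].words
    transcripts: dict[int, list[str]] = defaultdict(list)
    for w in words:
        transcripts[w.speaker_tag].append(w.word)
    return {tag: " ".join(tokens) for tag, tokens in transcripts.items()}
```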
The Spoken Language Analysis module aligns the words in the transcription with the sound wave. This allows you to perform further acoustic analysis, such as measuring the duration of words.
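For example, once word-level start and end times are available from the alignment, word durations follow directly; the data structure below is a simplified assumption of what an aligner might return.

```python
# Hypothetical alignment output: (word, start_time_s, end_time_s) tuples.
aligned_words = [
    ("the", 0.32, 0.45),
    ("patient", 0.45, 0.98),
    ("described", 1.02, 1.61),
]

# Word duration is simply end time minus start time.
durations = {word: round(end - start, 3) for word, start, end in aligned_words}
print(durations)  # {'the': 0.13, 'patient': 0.53, 'described': 0.59}
```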
The Spoken Language Analysis module subsequently analyzes the grammar of the transcribed text, covering its morphology, phonology, syntax, lexicon, and semantics. A GPT-3 large language model, a type of artificial intelligence that can understand and generate human language, analyzes the text and the morphosyntactic measures in combination to provide:
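As a hedged sketch of this kind of pipeline, the snippet below computes a few simple morphosyntactic measures with spaCy and passes them to an OpenAI model together with the transcript; the model name, prompt, and chosen measures are illustrative assumptions, not Open Brain AI's actual configuration.

```python
import spacy
from openai import OpenAI

nlp = spacy.load("en_core_web_sm")
client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze(transcript: str) -> str:
    doc = nlp(transcript)
    # Toy morphosyntactic measures: token count, mean sentence length, POS counts.
    tokens = [t for t in doc if not t.is_punct]
    measures = {
        "n_tokens": len(tokens),
        "mean_sentence_length": len(tokens) / max(len(list(doc.sents)), 1),
        "pos_counts": {pos: sum(t.pos_ == pos for t in tokens) for pos in ("NOUN", "VERB", "ADJ")},
    }
    prompt = (
        "You are a clinical linguist. Given the transcript and the morphosyntactic "
        "measures below, summarize the speaker's grammatical profile.\n\n"
        f"Transcript: {transcript}\n\nMeasures: {measures}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the text refers to a GPT-3 model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```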
The Spoken Language Analysis module also provides acoustic measures from the speech recordings, including the pitch, loudness, and duration of the sounds.
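As an illustrative sketch, comparable measures can be extracted from a recording with the Praat-based parselmouth library; the file name is hypothetical, and this is a stand-in for the module's own acoustic pipeline rather than its actual code.

```python
import parselmouth

snd = parselmouth.Sound("sample_recording.wav")    # hypothetical file

pitch = snd.to_pitch()                             # fundamental frequency (F0) contour
f0 = pitch.selected_array["frequency"]
f0 = f0[f0 > 0]                                    # keep voiced frames only

intensity = snd.to_intensity()                     # intensity contour in dB (loudness proxy)

print(f"Duration:       {snd.duration:.2f} s")
print(f"Mean F0:        {f0.mean():.1f} Hz")
print(f"Mean intensity: {intensity.values.mean():.1f} dB")
```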
Acoustic analysis can be used to identify abnormalities in speech production, such as those that occur in people with aphasia.
In sum, the Spoken Language Analysis module combines speech-to-text with automatic analysis of the transcribed texts across the different linguistic levels, giving clinicians and researchers tools to assess the speech of people with speech, language, and communication disorders.