Speech recognition (in many contexts also known as automatic speech recognition, computer speech recognition, or voice recognition) is the process of converting a speech signal to a set of words by means of an algorithm implemented as a computer program. Speech recognition applications that have emerged in recent years include voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), simple data entry (e.g., entering a credit card number), and preparation of structured documents (e.g., a radiology report).
Defining the Problem
According to "Survey of the State of the Art in Human Language Technology" (1997) by Ron Cole et al., speech recognition is the process of converting an acoustic signal, captured by a microphone or a telephone, to a set of words. The recognized words can be the final results, for such applications as command and control, data entry, and document preparation. They can also serve as the input to further linguistic processing in order to achieve text formatting or speech understanding.
Speech recognition systems can be characterized by many parameters as in the table below.
| Parameters | Range |
|---|---|
| Speaking Mode | Isolated words to continuous speech |
| Speaking Style | Read speech to spontaneous speech |
| Enrollment | Speaker-dependent to speaker-independent |
| Vocabulary | Small (< 20 words) to large (> 20,000 words) |
| Language Model | Finite-state to context-sensitive |
| Perplexity | Small (< 10) to large (> 100) |
| SNR | High (> 30 dB) to low (< 10 dB) |
| Transducer | Voice-cancelling microphone to telephone |
An isolated-word speech recognition system requires that the speaker pause briefly between words, whereas a continuous speech recognition system does not. Spontaneous, or extemporaneously generated, speech contains disfluencies and is much more difficult to recognize than speech read from a script. Some systems require speaker enrollment (a user must provide samples of his or her speech before using them), whereas other systems are said to be speaker-independent, in that no enrollment is necessary. Some of the other parameters depend on the specific task. Recognition is generally more difficult when vocabularies are large or contain many similar-sounding words. When speech is produced as a sequence of words, language models or artificial grammars are used to restrict the combinations of words. The simplest language model can be specified as a finite-state network, where the permissible words following each word are explicitly given. More general language models approximating natural language are specified in terms of a context-sensitive grammar.
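The finite-state network described above can be sketched in a few lines of Python. This is only an illustration: the vocabulary, the `<start>` and `<end>` markers, and the names `NETWORK` and `is_permitted` are all assumptions made for this example and do not come from any particular recognizer.

```python
# Hypothetical toy network for a voice-dialing task.
# Each key maps a word to the set of words allowed to follow it.
NETWORK = {
    "<start>": {"call", "dial"},
    "call": {"home", "office"},
    "dial": {"home", "office"},
    "home": {"<end>"},
    "office": {"<end>"},
}


def is_permitted(words):
    """Return True if the word sequence follows a path through NETWORK."""
    state = "<start>"
    # Append the end marker so that incomplete phrases are rejected.
    for word in list(words) + ["<end>"]:
        if word not in NETWORK.get(state, set()):
            return False
        state = word
    return True


print(is_permitted(["call", "home"]))  # True
print(is_permitted(["home", "call"]))  # False
```

A recognizer constrained this way only needs to consider the listed successors at each step, which is why such small grammars suit tasks like voice dialing.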
One popular measure of the difficulty of the task, combining the vocabulary size and the language model, is perplexity, loosely defined as the geometric mean of the number of words that can follow a word after the language model has been applied. In addition, there are some external parameters that can affect speech recognition system performance, including the characteristics of the environmental noise and the type and placement of the microphone.
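To make the loose definition of perplexity concrete, here is a minimal Python sketch. It assumes the usual probabilistic formulation (the exponential of the negative mean log-probability of the words), which reduces to the geometric mean of the branching factor when every permitted next word is equally likely. The function names and the sample numbers are illustrative assumptions, not values from the survey.

```python
import math


def perplexity(probabilities):
    """Perplexity of a word sequence, given the model probability of each word."""
    log_sum = sum(math.log(p) for p in probabilities)
    return math.exp(-log_sum / len(probabilities))


def branching_perplexity(branching_factors):
    """Geometric mean of the number of words permitted at each position.

    Assumes each permitted word is equally likely, so a position with
    b choices contributes a probability of 1 / b.
    """
    return perplexity([1.0 / b for b in branching_factors])


# Ten choices at every position gives a perplexity of about 10.
print(branching_perplexity([10, 10, 10]))
# Two choices, then eight: the geometric mean is about 4.
print(branching_perplexity([2, 8]))
```

Under this reading, the "small (< 10)" end of the table corresponds to tasks where the language model leaves only a handful of plausible words at each point.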
Linux :: Speech Technology
CVoiceControl - CVoiceControl is a speech recognition system that allows the user to connect spoken commands to Unix commands.
FreeSpeech - Free Speech Recognition for Linux - Openmind (Freespeech) is a free speech recognition project for Linux. It will be designed so that it can be easily integrated into any application or window manager, as well as the KDE and GNOME desktop environments.
IBM ViaVoice SDK for Linux - The ViaVoice kit provides the necessary tools to develop applications that incorporate speech recognition using Linux.
The Festival Speech Synthesis System - Festival is a general multi-lingual speech synthesis system developed at CSTR. It offers a full text-to-speech system with various APIs, as well as an environment for development and research of speech synthesis techniques.
The MBROLA PROJECT - Multi-lingual text to speech synthesis. Free software download for research purposes.
Viavoice Mailing List Official Website - Archive of messages on the Viavoice mailing list.