Voice Recognition For Mac

Control your computer by voice with speed and accuracy. Dragon speech recognition software is better than ever. Talk and your words appear on the screen. Say commands and your computer obeys. Dragon is 3x faster than typing and it's 99% accurate. Master Dragon right out of the box, and start experiencing big productivity gains immediately.

If you use a speech recognition program such as Nuance's Dragon Dictate or the new Dragon Express to dictate to your Mac, you may be using the default headset provided with the software, or you may have picked up some other kind of microphone to use for dictation. While the headset that Nuance includes with Dragon Dictate is acceptable, if you want to get better accuracy and use a comfortable microphone for speech recognition, it's worth looking at the many different types of mics available.

Here's an overview of the different types of microphones you can use with speech recognition software, how they work, and what might be the best mic for the way you work.

Three types of microphones

There are three types of microphones that you can use with speech recognition software. The most common type of microphone is a USB headset. Nuance includes one of these with boxed versions of Dragon Dictate. While the headset they provide is acceptable, there are many headsets that are much better.

The second type of microphone is wireless. There are two technologies for wireless: Bluetooth and DECT. Each offers different advantages and disadvantages.

Finally, you can use a desktop microphone, which allows you to work without wires or without wearing the mic. (Technically, you can use your Mac's internal microphone, or the one in your monitor if you use an Apple display, but this won't give you the best results with speech recognition software.)

Good headsets may cost more than the actual speech recognition software you use, but if you plan to spend a lot of time dictating, the amount of time you save using speech recognition software ensures that this investment will pay for itself very quickly.
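To put a rough number on that payback claim, here is a minimal break-even sketch. The function name and every input are hypothetical: the mic price is in the range discussed in this article, but the hourly value of your time and the dictation speedup are assumptions you should replace with your own figures.

```python
def hours_to_break_even(mic_price, hourly_value, speedup):
    """Hours of typing work you must replace with dictation before the mic pays for itself.

    mic_price    -- cost of the microphone in dollars
    hourly_value -- what an hour of your time is worth, in dollars (assumption)
    speedup      -- how many times faster you dictate than type (assumption, must be > 1)
    """
    if mic_price < 0 or hourly_value <= 0 or speedup <= 1:
        raise ValueError("need mic_price >= 0, hourly_value > 0 and speedup > 1")
    # An hour of typing takes 1/speedup hours when dictated,
    # so each typing hour replaced saves (1 - 1/speedup) hours.
    hours_saved_per_typing_hour = 1 - 1 / speedup
    value_saved_per_typing_hour = hourly_value * hours_saved_per_typing_hour
    return mic_price / value_saved_per_typing_hour


# Hypothetical example: a $280 headset, time valued at $30/hour,
# dictation assumed to be twice as fast as typing.
print(round(hours_to_break_even(280, 30, 2), 1), "hours of typing replaced")
```

The estimate ignores time spent correcting recognition errors, which is exactly where a better microphone helps, so treat it as a lower bound on how long the payback takes with a poor mic.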

Wired headsets: Basic tool for talking to a computer

Headsets are the most common type of microphone that people use to talk to a computer. You may be using a headset now to talk to friends or family via iChat or Skype, so using one for dictation won't seem very different. However, the type of headset you use to chat over the Internet is not at all what you need for speech recognition software.

First, consider the pros and cons of using a headset. On the plus side, headsets offer very good accuracy—speech recognition software will get more words right because the microphone connected to a headset is generally in a good position. On the other hand, headsets can be annoying to wear; they mess up your hair, and if you wear glasses, headsets press them against your head, and the wires that tether you to your computer can prevent you from moving around.

Speech recognition software is very sensitive to the quality of the voice the microphone pipes into your computer. This sound quality depends on the ambient noise in the environment in which you dictate. For this reason, headsets designed for speech recognition include noise canceling features that eliminate noises around you—be they the voices of your coworkers, the sounds of cars coming from an open window, or phones ringing in nearby offices.

This is why you cannot use just any headset for speech recognition. Even if Dragon Dictate offers fairly good accuracy with a cheap headset, you'll still spend a lot of time correcting mistakes, and it would probably be more efficient to type than to dictate.

If you plan to do any serious dictation, you should look for headsets that are specifically designed for speech recognition. The $100 Plantronics Blackwire 435 is an interesting wired USB headset, with two separate earpieces that are connected together by wires, but that you can only wear using over-the-ear adapters. You can either use both of them if you want stereo sound, or just the one with the microphone boom if you're doing basic dictation or VoIP calls. This microphone is very light, and, in spite of the fact that you have to wear it over your ear, is fairly comfortable. It also contains an in-line control device, allowing you to change the volume or mute the microphone whenever you want. However, while the accuracy is very good, there is a slight hiss in the earpiece, and a slight echo as my voice seems to come through the earpiece.

One headset made especially for speech recognition and use in noisy environments is theBoom C from UmeVoice. This $150 headset provides excellent accuracy, in part because it has a very long boom, the part of the headset that sticks out in front of your mouth with the actual microphone at its tip. While many headsets have a boom that positions the microphone near the corner of your mouth, theBoom headsets have extra-long booms, so the microphone is almost directly in front of your mouth. For all its accuracy, I found it to be one of the most uncomfortable headsets I have ever worn: it is hard plastic, and the shape doesn't fit well on my head. If you plan to use this type of headset, you should try it on first to see if you think you can wear it for a long time.

There are plenty of other headsets designed for speech recognition, and if you wish to use a wired headset, it's worth looking around to see which models are available.

Wireless headsets: Untether yourself

While wired headsets offer excellent accuracy, they keep you tied to your computer. That long, sinuous cable, that gets tangled whenever you reach for something at the far corner of your desk or knocks over your coffee cup, can be an annoyance. In addition, some people like to move around while they dictate; I like to stand up, pace in my office, and have nothing forcing me to remain seated at my desk. After all, one of the reasons to use dictation software is so you don't have to keep your hands on your keyboard.

As I mentioned previously, there are two types of wireless technology: Bluetooth and DECT. The former is commonly used for those tiny earpieces that people use with cell phones. Because of the way Bluetooth works and the way Bluetooth earpieces are designed, they don't offer good accuracy with speech recognition software. The frequency range of Bluetooth is somewhat limited, and Bluetooth earpieces are very short and their booms don't reach anywhere near the corner of your mouth. When I tested the Plantronics Voyager, a Bluetooth earpiece that Nuance used to provide with Dragon Dictate (Nuance now offers the Plantronics Calisto), the sound quality was poor and there was interference coming into my ear.

On the other hand, DECT technology offers clear advantages for use with speech recognition software. Plantronics' $280 Savi 440, a DECT headset, offers wideband audio, an extended frequency range, and a noise canceling microphone to provide much better quality audio than Bluetooth devices. In addition, DECT technology offers greater range than Bluetooth. You probably won't dictate 100 feet from your Mac, but you could with this headset; Bluetooth, however, is limited to around 30 feet, and even then, the reception isn't ideal. This headset also offers three different ways to wear it: a standard, over-the-head headband, with a cushion on the earpiece; a behind-the-neck headband; and an over-the-ear earpiece. I found the last of these to be uncomfortable, and the behind-the-neck headband pressed against my glasses, causing irritation. In the end, the standard over-the-head headband turned out to be the most comfortable, and this microphone is so light that I barely notice it.

While accuracy is very good with this microphone, it is slightly inferior to a standard wired headset that has a boom closer to the front of the mouth. The Savi 440 is well-designed, however, with a longer boom than what you are used to seeing on a wireless earpiece; the boom almost reaches the corner of my mouth. Since the quality of the microphone itself is so good, this is an excellent microphone for speech recognition in a quiet environment.

Desktop microphones: Comfort and freedom

If you don't want to wear a mic, then a desktop microphone might be for you. The major disadvantage of a desktop mic is that it needs to be more or less in front of your mouth, and if you turn your head or stand up, recognition will suffer. But you are free from wires and annoying devices that mess up your hair, press against your glasses, or irritate your ears.

I tried several desktop microphones, and two models stood out. Blue Microphones' $150 Yeti is a very large, old-fashioned type of microphone, which is designed for making recordings on a computer. While not specifically designed for speech recognition, the Yeti offers excellent sound quality and works quite well with Dragon Dictate. If you choose the cardioid setting, the mic picks up sound in front rather than all around it, ensuring that background noise from behind the microphone is not picked up. In my tests, the Yeti offered very good accuracy, but given the size of the microphone, it can get in the way. If you plan to do other types of recordings in addition to dictation (such as podcasts), this is an excellent microphone that will allow you to do both.
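To illustrate why the cardioid setting rejects sound from behind, here is a small sketch of the textbook cardioid pickup pattern, in which sensitivity falls off as (1 + cos θ) / 2 with the angle θ from the front of the mic. This is the idealized formula, not a measurement of the Yeti itself, and the function name is my own.

```python
import math


def cardioid_gain(angle_deg):
    """Relative sensitivity of an ideal cardioid mic, from 1.0 (front) down to 0.0 (rear).

    angle_deg is the angle between the sound source and the front axis of the mic.
    Idealized textbook pattern; real microphones only approximate it.
    """
    return 0.5 * (1 + math.cos(math.radians(angle_deg)))


# Compare a voice in front of the mic with noise from the side and from behind.
for angle in (0, 45, 90, 135, 180):
    print(f"{angle:3d} degrees: relative sensitivity {cardioid_gain(angle):.2f}")
```

In this ideal model, sound arriving from the side is picked up at half strength and sound from directly behind is cancelled, which matches the behavior described above: your voice in front comes through while noise behind the microphone is suppressed.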

For a desktop microphone that is both accurate and doesn't get in the way, SpeechWare's $279 USB 3-in-1 TableMike is certainly one of the best available. With wideband audio and noise cancellation, this microphone is designed specifically for speech recognition, and you can set it on your desk with the tip of the microphone more than a foot from your mouth and get excellent accuracy. The standard version of this microphone comes with a 15-inch flexible boom; I found this to be just a bit too short, requiring the microphone's base to be too close to my keyboard. The company also offers an optional telescopic boom that extends to 19 inches; I found this length to be ideal, allowing me to move the base just far enough away for it to be practical on my desk.

The disadvantage to desktop microphones is that there is a sweet spot for getting good recognition. You can turn your head a bit, and it will still work very well, but if you want to slide over to the side of your desk, or turn to the side to look at something, say, on a table next to your desk, then you either have to move the microphone or turn back to dictate.

The best microphone for you

In this overview, I have discussed the three different types of microphones that work well with speech recognition software. Each user will have different needs and imperatives, and you should consider these carefully before investing in an expensive microphone. Ideally, you should make sure that you can return your purchase if it doesn't suit you.

For me, wearing a headset is an annoyance. Being tied to my computer by a wire is exactly what I don't want if I'm using speech recognition software to dictate to my Mac. However, I work in a home office with little background noise. If you work in a busy office with lots of people chattering and phones ringing around you, you may need a headset because the position of the microphone boom in front of your mouth will ensure that the noise canceling blocks out all that ambient sound.

Wireless microphones are wonderful, especially the Plantronics Savi 440 that I tested, but since they work on batteries, you have to make sure that they stay charged. (The Savi 440 has a charging base, and you can buy additional batteries to swap in when you need to.) Some of them can be very uncomfortable, especially if they just hook over your ear. Wearing them for long periods of time can be annoying.

The epiphany that I had when testing all these microphones was discovering that a good desktop microphone such as the TableMike offers numerous advantages. The accuracy of this microphone, even at a distance of around 12 inches from my mouth, is as good as any headset; in a quiet environment, even 18 inches is fine. Also, with this microphone on my desk, I don't need to reach for a headset and put it on my head if I only want to dictate a paragraph or two, such as to reply to an email. I can keep the mic handy, with its flexible boom in a vertical position, then, if I want to dictate something, bend the boom, activate the microphone in Dragon Dictate, and start talking. This microphone is even good enough that I can lean back in my chair and dictate in a comfortable position; I'm not locked into a rigid position as with other desktop microphones.

There are literally hundreds of microphones that you can use for speech recognition. They fit into the three families I have described here, and prices range from $50 to several hundred dollars. Given the difference in quality among these microphones, you definitely get what you pay for. If you plan to use dictation software frequently, you should seriously think of investing in a good microphone.

[Senior contributor Kirk McElhearn writes about more than just Macs on his blog Kirkville. Twitter: @mcelhearn. Kirk is the author of Take Control of Scrivener 2.]

Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. Here is a listing of such, grouped in various useful ways.

Acoustic models and speech corpus (compilation)

The following list presents notable speech recognition software engines with a brief synopsis of characteristics.

Application name	Description	Open-source	License	Operating system	Programming language	Supported language, note	Offline or online
CMU Sphinx	HMM	Yes	BSD style	Cross-platform	Java	English	Offline
HTK		No	HTK specific	Cross-platform	C	English; version 3.5 released December 2015
Julius	HMM trigrams	Yes	BSD style, non-commercial	Cross-platform	C	Japanese, English; [2]	Offline
Kaldi	Neural net	Yes	Apache	Cross-platform	C++	English
RWTH ASR	RWTH Aachen University	No	RWTH ASR, non-commercial use only	Linux, macOS	C++	English

Macintosh

Application name	Description	Open-source	License
Dragon for Mac (discontinued 2018)	macOS; by Nuance	No	Proprietary
Dragon Dictate (discontinued)	macOS; by Nuance	No	Proprietary
MacSpeech Scribe (discontinued)	Transcription from recorded text; acquired by Nuance
iListen (discontinued)	PowerPC Macintosh; discontinued by MacSpeech; acquired by Nuance
Speakable items	Included with macOS
ViaVoice (discontinued)	IBM Product; acquired by Nuance
Voice Navigator	Original GUI voice control; 1989

Cross-platform web apps based on Chrome

The following list presents notable speech recognition software that operates in a Chrome browser as web apps. These apps make use of the HTML5 Web Speech API.^[1]

Application name	Description	Open-source	License	Price	Note
Speechmatics^[2]	Cloud based and on-premise automatic speech recognition	No	Proprietary	From £0.06 per minute of audio

Mobile devices and smartphones

Many mobile phone handsets, including feature phones and smartphones such as iPhones and BlackBerrys, have basic dial-by-voice features built in. Many third-party apps have implemented natural-language speech recognition support, including:

Application name	Description	Open-source	License	Price	Note
Assistant.ai	Assistant for Android, iOS and Windows Phone	No	Proprietary, freeware	Free	Discontinued
Dragon Dictation		No	Proprietary, freeware	Free
Google Now	Android voice search	No	Proprietary, freeware	Free
Google Voice Search		No	Proprietary, freeware	Free
Microsoft Cortana	Microsoft voice search	No	Proprietary, freeware	Free
Siri Personal Assistant	Apple's virtual personal assistant	No	Proprietary, freeware	Free
Alexa – Amazon Echo	Amazon's personal assistant	No	Proprietary
SILVIA	Android and iOS	No
Vlingo

Windows

Windows built-in speech recognition

Windows Speech Recognition version 8.0 by Microsoft comes built into Windows Vista, Windows 7, Windows 8 and Windows 10. Speech Recognition is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding version of Windows; that is, you cannot use the speech recognition engine in one language if you use a version of Windows in another language. Windows 7 Ultimate and Windows 8 Pro allow you to change the system language, and therefore change which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant included in Windows 10.

Add-ons for Windows 7 speech recognition

Voice Finger – software for Windows Vista and Windows 7 that improves the Windows speech recognition system by adding several extensions to accelerate and improve the mouse and keyboard control.

Windows 7, 8, 10 third-party speech recognition

Braina – Dictate into third party software and websites^[3], fill web forms and execute vocal commands.^[4]
Dragon NaturallySpeaking from Nuance Communications – Successor to the older DragonDictate product. Focus on dictation. 64-bit Windows support since version 10.1.
SpeechMagic – Formerly owned by Philips; acquired by Nuance Communications. Medical industry focus according to Frost & Sullivan. Standalone or embedded.^[5]
Tazti – Create speech command profiles to play PC games and control applications – programs. Create speech commands to open files, folders, webpages, applications. Windows 7, Windows 8 and Windows 8.1 versions.^[6]

Windows XP or 2000 only

Microsoft Speech API – Speech recognition functionality included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications, but since that is aimed at developers building speech applications, the pure SDK form lacks any user interface, and thus is unsuitable for end users.

Built-in software

Microsoft Kinect includes built-in software which allows speech recognition of commands.
Older generations of Nokia phones, such as the Nokia N Series (before using Windows 7 mobile technology), used speech recognition for names from the contact list and a few commands.
Siri, Apple's personal assistant for iOS, originally implemented in the iPhone 4S, uses technology from Nuance Communications.
Cortana, Microsoft's personal assistant built into Windows Phone and Windows 10.

Interactive voice response

The following are interactive voice response (IVR) systems:

Genesys^[7]
HTK – copyrighted by Microsoft, but allows altering software for licensee's internal use
LumenVox ASR
Tellme Networks; acquired by Microsoft

Unix-like x86 and x86-64 speech transcription software

Janus Recognition Toolkit (JRTk)^[8]^[9]

Discontinued software

IBM ViaVoice – Embedded version still maintained by IBM.^[10] No longer supported for versions above Windows Vista.^[11] Untested above macOS 10.4 or on Macintoshes with an Intel chipset.^[12]
Quack.com; acquired by AOL; the name has now been reused for an iPad search app.
SpeechWorks from Nuance Communications.
Yap Speech Cloud – Speech-to-text platform acquired by Amazon.com.

References

^'Web Speech API Specification'. dvcs.w3.org. Archived from the original on 2016-06-21.
^Orlowski, Andrew. 'Total recog: British AI makes universal speech breakthrough'. The Register. Situation Publishing. Retrieved 17 May 2018.
^'Speech Recognition Software for Windows PC – Braina'. www.brainasoft.com. Archived from the original on 2015-04-07.
^'Dynamic Faceting-List of Most 57 Speech Recognition SWs and Web Services'. Archived from the original on February 13, 2019. Retrieved February 23, 2019.
^'Philips SpeechMagic named European Technology Leader by Frost & Sullivan'. www.frost.com. Archived from the original on 2008-04-15.
^O'Neill, Mark (2013-11-06). 'Control your PC with these 5 speech recognition programs'. PC World. Archived from the original on 2014-01-01. Retrieved 2013-12-30.
^'Interactive Voice Response'. Genesys. Archived from the original on 2016-10-14.
^[1]^{[dead link]}
^Lavie, A.; Waibel, A.; Levin, L.; Finke, M.; Gates, D.; Gavalda, M.; Zeppenfeld, T.; Zhan, Puming (1 April 1997). 'Janus-III: speech-to-speech translation in multiple languages'. 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE Xplore. 1. pp. 99–102. CiteSeerX 10.1.1.36.6967. doi:10.1109/ICASSP.1997.599557. ISBN 978-0-8186-7919-3.
^'Archived copy'. Archived from the original on 2010-08-08. Retrieved 2010-06-29.
^'Nuance product support for Microsoft Windows 7'. Nuance Communications, Customer Help. Retrieved 2019-03-16.
^'ViaVoice for Mac OS X on Intel Chipset'. Nuance Communications, Customer Help. Retrieved 2019-03-16.