Wednesday, April 28, 2010

Windows speech registry hacking for fun and profit

I've been using Windows Speech Recognition for over a year now, with Vocola providing almost all of my custom commands and serving to smooth out some of WSR's rough edges. Learning to work well with a speech recognition system takes time, and the ability to customize the system is critical to productivity. To its credit, Microsoft provides several mechanisms for customizing the behavior of both the speech user interface and the speech recognition engine itself.

Several of the most useful customizations I've made involve fiddling with the registry to improve system performance and recognition accuracy. For example, I'm a longtime Firefox and Thunderbird user, and one of the more painful aspects of my transition to WSR was getting these applications to perform acceptably. By default, Windows Speech uses Active Accessibility for these applications, enabling you to select a Firefox tab or Thunderbird mailbox by saying all or part of its name. Unfortunately, the performance of these applications made them nearly unusable: FF started to bog down with only a few tabs open, and using Thunderbird was completely unproductive. The performance also disrupted macro behavior: the speech engine was constantly spinning, so macros would become erratic, sometimes even after I switched applications.

I guessed that Active Accessibility was the problem: perhaps Firefox exposes the text of all the hyperlinks in all open documents, not just the ones I can see right now. Whatever the cause, at some point the app would hit some sort of internal WSR threshold, after which performance would rapidly fall apart.

Nosing around in the registry, I stumbled across the following key:

HKEY_CURRENT_USER\Software\Microsoft\Speech\Preferences\AppCompatDisableMSAA

Hello! From the existing entries, this key appears to accept string values indicating applications that should have Active Accessibility disabled. I added two new string values ("firefox.exe" and "thunderbird.exe"), restarted everything, and BAM! Looks like my computer was built this century after all!

Of course, disabling Active Accessibility disables the "built in" voice support for menus, hyperlinks, tabs, folders, etc., but given the performance difference I think that's a tradeoff I can live with.

Another problem was dictation: while WSR's dictation in supported applications is generally fine, dictation into non-supported applications involves using the correction panel, a slow and tedious process. Vocola includes improved dictation support, and another WSR registry key provides an effective mechanism for forcing the speech system to use Vocola for dictation in an application:

HKEY_CURRENT_USER\Software\Microsoft\Speech\Preferences\AppCompatDisableDictation

By adding an application entry here, you disable WSR dictation, which effectively *enables* Vocola dictation for the listed application.

(As with the previous key, add the name of the application as a string value.)
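If you'd rather not click through regedit for each application, here is a rough Python sketch that writes out the text of a .reg file for either key. It is only a sketch under assumptions: the function name reg_file_text is made up for this example, and I'm assuming each entry is a string value named after the executable with empty data. Check the entries already present under the key on your own machine and mirror their form before importing anything.

```python
# Sketch: build .reg file text that adds per-application string values
# under one of the Speech\Preferences subkeys discussed in this post.
REG_HEADER = "Windows Registry Editor Version 5.00"
PREFS = r"HKEY_CURRENT_USER\Software\Microsoft\Speech\Preferences"


def reg_file_text(subkey, exe_names):
    """Return .reg text adding one string value per executable under subkey.

    reg_file_text is a hypothetical helper, not part of WSR or Vocola.
    """
    lines = [REG_HEADER, "", f"[{PREFS}\\{subkey}]"]
    for exe in exe_names:
        # Assumption: the value name is the executable's file name and the
        # data is an empty string. Compare with the existing entries first.
        lines.append(f'"{exe}"=""')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Applications for which Active Accessibility should be disabled.
    print(reg_file_text("AppCompatDisableMSAA",
                        ["firefox.exe", "thunderbird.exe"]))
```

Save the output as a .reg file and import it with regedit, after exporting a backup of the key. Passing "AppCompatDisableDictation" as the subkey produces the matching file for the dictation setting. As with the manual edit, restart the speech system and the affected applications afterwards.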

Here is a post related to these keys.

Today I discovered another set of settings related to the microphone. Through trial and error, I found that I could increase recognition accuracy by adjusting the volume level of my microphone. WSR sets its own microphone level on startup and at certain other times (switching from off to on state?), leading to a cat-and-mouse fight between us over the volume control knob. After exploring some other options, I stumbled on this post by Marty Markoe explaining how to set the microphone settings used by Windows Speech.

Taken together, these settings vastly improve my productivity with Windows Speech. With them, I can tune the performance of the speech system globally and for individual applications. Combined with Vocola's powerful, easy-to-use macro capabilities and dictation support, these adjustments turn WSR into a capable system for computing by voice.
