Reviewing the last 111 years, it is easy to check off how technology has reduced physical labor:
• Cars replaced walking and horses
• Aircraft went from dirigibles to propeller-driven planes to jets
• Automatic transmissions replaced manual stick shifts
• Digital photography replaced film
• Remote control boxes replaced television rotary dials
The use of voice as an input device has also evolved from its early days. Speech-to-text programs have been around since the 1980s. Covox launched voice-to-text software in 1982 for the growing personal computer industry, with the IBM PC in the lead. Another company founded in 1982, Dragon Systems, became a leader in speech recognition. Nuance Communications (formerly ScanSoft, Inc.) now owns and sells its well-known product, Dragon NaturallySpeaking.
Voice recognition is not just for getting documents created. For example:
• Cordless and cell phones introduced us to voice-activated dialing
• GPS mapping and directions equipment allows for voice commands
• Cars respond to a growing number of voice-activated commands
• Voice-controlled appliances for the house and the office are emerging
An article about the evolution of voice control appeared in the December 7, 2011 issue of BusinessWeek. It points to the rumors about Apple TV, the Microsoft Xbox 360 game console, and the growing number of electronics vendors (Samsung, LG, Sharp, Sony and others) gearing up to move from button and touchpad controls to voice command and control.
One of the current salvos is Apple's inclusion of Siri, its interactive voice recognition assistant, with the new iPhone 4S. Asking about the weather, the stock market or directions is standard stuff. Any action that one can perform by touch is a potential task for Siri. You can say ‘send a text message’, then say the recipient’s name from the contact list, confirm which phone number, dictate the message and send.
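The text-message exchange described above can be pictured as a small step-by-step dialog. The sketch below is only an illustration under stated assumptions: it works on utterances that have already been transcribed to text, and the `TextMessageDialog` class, the `CONTACTS` table and the spoken replies are all hypothetical names and data invented here, not anything taken from Siri.

```python
from dataclasses import dataclass, field

# Hypothetical contact list: name -> {label: number}. Illustrative data only.
CONTACTS = {
    "alice": {"mobile": "555-0100", "home": "555-0101"},
    "bob": {"mobile": "555-0200"},
}


@dataclass
class TextMessageDialog:
    """Walks through: command -> recipient -> phone number -> message -> send."""
    state: str = "idle"
    recipient: str = ""
    number: str = ""
    body: str = ""
    sent: list = field(default_factory=list)

    def hear(self, utterance: str) -> str:
        """Consume one already-transcribed utterance and return the reply."""
        text = utterance.strip().lower()
        if self.state == "idle":
            if text == "send a text message":
                self.state = "need_recipient"
                return "Who should I send it to?"
            return "Sorry, I did not understand."
        if self.state == "need_recipient":
            if text not in CONTACTS:
                return "I could not find that contact. Who should I send it to?"
            self.recipient = text
            numbers = CONTACTS[text]
            if len(numbers) == 1:
                # Only one number on file, so there is nothing to confirm.
                self.number = next(iter(numbers.values()))
                self.state = "need_body"
                return "What should the message say?"
            self.state = "need_number"
            return "Which number: " + " or ".join(numbers) + "?"
        if self.state == "need_number":
            numbers = CONTACTS[self.recipient]
            if text not in numbers:
                return "Which number: " + " or ".join(numbers) + "?"
            self.number = numbers[text]
            self.state = "need_body"
            return "What should the message say?"
        if self.state == "need_body":
            # Keep the dictated message exactly as spoken.
            self.body = utterance.strip()
            self.state = "confirm"
            return "Ready to send. Say send or cancel."
        # Remaining state is "confirm".
        if text == "send":
            self.sent.append((self.number, self.body))
            self.state = "idle"
            return "Message sent."
        if text == "cancel":
            self.state = "idle"
            return "Message cancelled."
        return "Say send or cancel."
```

Feeding it "send a text message", "Alice", "mobile", a dictated sentence and then "send" walks through the same steps as the spoken example, one prompt at a time.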
This is pretty basic stuff. By 2013, voice commands will be everywhere. Saying words distinctly helps with today’s voice input, but someone born and raised in Alabama speaks very differently from someone from Maine, so advances in the technology will need to handle differences in tone and inflection. After all, we can discern when someone is speaking with a “happy voice” or an “angry voice.” There is at least one project underway that will detect a person’s mood from verbal cues.
Siri and Xbox voice control are growing in use today. The expectations are that Apple’s TV set will have voice command; new Windows operating systems for PCs and Xboxes will have gesture and voice control; and Google will implement voice-activated search beyond what is accessible now. It is also clear that Google TV will return.
The consumer electronics companies will promote talking to the TV through voice-enabled apps for smartphones and tablets. Xfinity/Comcast already has a downloadable app that lets customers program the DVR. Remote control functions can be sent to the TV through the smartphone’s internet connection. Comcast is testing the addition of voice-control features. LG, Sony, Panasonic, Toshiba, Samsung and Sharp will all test similar apps.
Each family member will be able to set their own voice commands to program show recordings, change channels and access the web. Of course, there could be a battle of the voice controls that will have to be managed by some responsible person, such as an adult. Those individuals who are push-button phobic will have some speaking issues as they learn how to say which version of CSI they actually want to record. As with all technology advances, it can be anticipated that the transition to the new will be easier for some and harder for others.
Nuance, maker of the popular Dragon dictation software suites, also supplies the technology Apple has used for Siri. It appears that many manufacturers have turned to Nuance to help transform remotes rather than eschew them. Nuance’s Thompson says TV, DVD, and set-top box makers are all working on models that look more like iPhones, some with touchscreens rather than that gaggle of unused buttons. Some of the prototypes are designed around a single prominent button that activates a microphone, he says. Cost will be a challenge, since such a device would need a microphone and Wi-Fi antenna instead of the infrared emitters now commonly used.
Nuance has estimated that 5% of TVs could be voice controlled by Christmas 2012. Of course, there are several problems to solve, such as which command takes precedence and how a TV would distinguish commands from normal conversation. But there’s hope. SRI International, the research institute that worked on Siri before spinning it off into a separate company, has been working on solutions, including a project that can discern people’s moods from verbal cues, something that could potentially be used to differentiate commands.
Mike Thompson of Nuance Communications goes on to say that interactive remote controls will have touchscreens rather than buttons. This is similar to the Logitech Harmony series of remotes, where several different screens change the action of the button that is pressed. Vlingo, an app maker, introduced voice apps for smartphones late last year and is expected to announce a voice recognition product for TVs at CES 2012.
We have all marveled at Dick Tracy’s wrist radio. The TV series Knight Rider was all about a car that could act better than its human star. Robots that respond to voice commands and conversation have been demonstrated. Siri on the iPhone 4S is a real-world demonstration of voice interaction. The key is that what the human says and what the machine hears must be the same thing. If you tell your automobile’s GPS mapping device that you want to go to Las Vegas, be sure that the directions take you to Nevada rather than New Mexico.
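One way to picture the Las Vegas problem is a lookup that refuses to guess when a spoken place name matches more than one location. The sketch below is a toy illustration, not how any GPS product is known to work; the `PLACES` table and the `resolve_destination` function are hypothetical names made up for this example.

```python
from typing import Optional

# Hypothetical lookup table: spoken place name -> states that have a city by that name.
PLACES = {
    "las vegas": ["Nevada", "New Mexico"],
}


def resolve_destination(spoken: str, state: Optional[str] = None) -> str:
    """Pick a destination, or ask a follow-up question when the name is ambiguous."""
    name = spoken.strip().lower()
    candidates = PLACES.get(name, [])
    if state is not None:
        # The speaker named a state, so keep only matching candidates.
        wanted = state.strip().lower()
        candidates = [s for s in candidates if s.lower() == wanted]
    if not candidates:
        return "No match found."
    if len(candidates) == 1:
        return f"Routing to {name.title()}, {candidates[0]}."
    # More than one match: ask instead of guessing.
    return f"Did you mean {name.title()} in " + " or ".join(candidates) + "?"
```

Calling `resolve_destination("Las Vegas")` produces a clarifying question, while `resolve_destination("Las Vegas", "Nevada")` goes straight to routing.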
People are getting used to seeing others walking around talking to the air that surrounds them. These are people with a Bluetooth headset synced with a smartphone that is connected to a cellular tower that sends the signals out into cyberspace. It is not just messages and conversations. Voice will be used to open the garage door, turn on the house lights and preheat the oven to 400 degrees. At the beginning it will be a novelty for upscale users only, but prices will drop quickly and more will be in use by the end of this decade.
Video cameras are spreading to every city street and intersection. As voice technology advances, we will have embedded microphones in our houses, offices, cars and all ‘smart’ devices. Devices will be listening, ready to jump into action, just as Captain Kirk began his commands with the word “Computer …”
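The “Computer …” convention is one simple way for an always-listening device to tell a command from ordinary conversation: act only on speech that begins with an agreed wake word. The sketch below assumes the speech has already been transcribed to text, and the `extract_command` function and `WAKE_WORD` constant are hypothetical names used only for illustration.

```python
from typing import Optional

WAKE_WORD = "computer"  # assumption: mirrors the "Computer ..." phrase above


def extract_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the command that follows the wake word, or None for ordinary talk."""
    words = transcript.strip().split()
    if not words:
        return None
    # Ignore punctuation attached to the first word, e.g. "Computer,".
    first = words[0].strip(",.!?:;").lower()
    if first != wake_word:
        return None
    command = " ".join(words[1:]).strip()
    return command or None
```

With this rule, "Computer, turn on the house lights" yields a command, while "I bought a new computer yesterday" is ignored because the wake word does not come first.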