Posey's Tips & Tricks
Testing Out Microsoft Word's New Transcription Capabilities
Brien has long been a loyal user of Dragon's dictation feature. How do Microsoft Word's latest transcription improvements compare to that old stalwart?
A couple of years ago, I took Microsoft Word's native dictation capabilities for a test drive. I had been using Dragon NaturallySpeaking for many years to dictate the books and articles that I write, and was curious how Word would compare. My conclusion: Word managed to accurately recognize most of my spoken words, but punctuation was problematic.
Lately, I have heard that Word's dictation capabilities have improved. At the same time, I also noticed that Microsoft recently added a Transcribe feature that makes it possible for Word to automatically create a written transcript of an audio file. Right now, this feature is only available in Word on the Web, but I'm sure that it will eventually make its way into the various Word apps. That being the case, I wanted to find out how well the new transcription feature works.
Of course, before I could test the transcription feature, I needed an audio file to transcribe. It just so happens that every article on Redmondmag.com -- including this one -- offers the option to listen to audio of the article. You will notice the "Click to Listen to This Article" link in Figure 1 below (and also at the top of this page). I decided to create an .MP3 version of one of my articles, then use it to compare Word's transcription capabilities to those of Dragon NaturallySpeaking.
In the interest of time, I limited my test to the article's first two paragraphs. In case you are wondering, the text-to-speech engine correctly pronounced everything in the article's header and in those first two paragraphs. That includes my name, which text-to-speech engines always seem to get wrong. All of this is to say that the audio recording was of good quality and I don't have any qualms about using it to test Word's transcription capabilities.
Prior to performing the test, I assumed that Word was probably going to do a much better job of transcribing the file than Dragon would. There are two reasons. First, Dragon NaturallySpeaking requires you to verbalize the punctuation that you want to use. An audio representation of one of my articles certainly doesn't include any verbalized punctuation. Dragon does have an option to automatically add commas and periods, as shown in Figure 2, but I have never tried using this feature before.
The other reason I thought that Dragon might not do such a good job is that the software has been trained on my own unique speech patterns. As such, it is not optimized for the synthetic speech in my sample file (but then again, neither is Word).
I started with testing Dragon NaturallySpeaking. Here is the text that Dragon produced:
Posey's moonshot. What's it like to do IT work in space between zero gravity and the cumbersome spacesuit every day IT tasks like connecting some power and data cables suddenly become a lot more complicated by Brian Posey May 29, 2020 whenever I attend an IT conference. There are two questions that I always get asked by attendees first and perhaps most predictably I get asked what it's like to train to go to space, the second question is how in the world. I went from working in IT to training to be a commercial astronaut. The funny thing is, I often ask myself the same question to this day I continue to work in both IT and astronautics in many ways the two careers could not be more different. I can't seem to ever recall doing an IT project that ended with a parachute jump into the water or fighting off crushing G forces in an effort to avoid
I'm not going to cover every issue in the above text because I want to get on with seeing what Word can do. However, Dragon lost all of the paragraph breaks and dropped the last two words after "avoid" (which were, incidentally, "passing out"). Additionally, Dragon's punctuation can best be described as random and sporadic. Even so, Dragon did accurately recognize the words that were being spoken.
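For readers who would rather check word-level accuracy mechanically than by eye, here is a minimal Python sketch. It is not part of the test described in this article, and the function names (words, compare) are hypothetical. It ignores punctuation and capitalization, then uses the standard library's difflib to report what fraction of the reference words show up in the transcript and which ones are missing. The reference string is only a short excerpt based on the ending discussed above, not the full article text.

```python
import difflib
import re


def words(text):
    """Lowercase the text and return its words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def compare(reference, transcript):
    """Return (matched_fraction, missing_words) for a transcript against a reference."""
    ref, hyp = words(reference), words(transcript)
    matcher = difflib.SequenceMatcher(a=ref, b=hyp, autojunk=False)
    # Count the reference words that line up, in order, with transcript words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    # Collect reference words that were dropped or replaced in the transcript.
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing.extend(ref[i1:i2])
    return matched / len(ref), missing


# Assumed excerpt: the transcript stops at "avoid"; the source ended with "passing out".
reference = "fighting off crushing G forces in an effort to avoid passing out"
transcript = "fighting off crushing G forces in an effort to avoid"

fraction, missing = compare(reference, transcript)
print(f"{fraction:.0%} of reference words matched; missing: {missing}")
```

Because punctuation is stripped before the comparison, a transcript with flawless words but random commas and periods would still score 100 percent, which mirrors the distinction drawn above between word recognition and punctuation.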
So what about Word on the Web? You can see the transcription option in Figure 3. Like Dragon NaturallySpeaking, Word simply asked me to upload the audio file. There was no option to insert (or omit) punctuation. Rather than simply pasting the text as I did with the text created by Dragon, let me give you a screen capture instead:
Like Dragon, Word seems to have had no problem recognizing the spoken words. Even though the resulting text is not exactly in paragraph form, it is far easier to read than the text produced by Dragon. Likewise, the punctuation is far from perfect, but still much better than what Dragon produced.
Interestingly, both applications cut off the last two words, leading me to suspect that the audio file ended too abruptly.
In any case, Word's transcription feature does seem to be a viable option for those who wish to transcribe an audio recording.
Brien Posey is a 20-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.