Days before the opening of this year's Paris Olympics, my colleague Alex Kirschner argued that the Games were a now-or-never moment for Peacock. The four-year-old NBC streaming service has lagged behind more established competitors in popularity, customer satisfaction, and market share. So naturally, NBC teased a slew of perks in the months leading up to the opening ceremony, including “multi-view” live streaming, Snoop Dogg as a correspondent, and NFL-style “Gold Zone” coverage.
But perhaps the most questionable perk was an AI-generated likeness of veteran commentator Al Michaels, created with his permission, trained to sound exactly like him, and designed to give every fan a personalized “Your Daily Olympics Recap.”
NBC has announced that it will use AI to recreate the voice of Al Michaels for custom recaps of the 2024 Olympics. pic.twitter.com/eX5aTFBVen
— Front Office Sports (@FOS) June 26, 2024
Admittedly, there was some skepticism about whether Peacock could deliver a decent facsimile of the beloved sportscaster at scale, especially as the much-touted generative AI fad continues to produce shoddy results. (Not to mention the viewer outrage over Google's Olympics-themed AI ad, which ultimately led the company to pull it from the air.) But Michaels himself has been impressed with NBC's early results, telling Vanity Fair's Tom Kludt: “It's been amazing, frankly. It's been great. And it's been a little scary.” Last week, the Washington Post's Shira Ovide similarly called the “Al Michaels” recap “surprisingly good.” So, as an avid Olympics watcher, I knew I had to not only evaluate it for myself but also, if the reports were true, understand why it worked so well. And they were true, dear reader.
Courtesy of Nitish Pahwa
For the past week, I've opened Peacock's “Your Daily Olympics Recap” every morning on my laptop (the feature isn't available on the TV app). To set it up, I had to enter my name, choose three categories of sports I was interested in, and select two “themes” (such as “Trending Moments”) for the clips I wanted to see each day. I was a little disappointed to learn later that these choices couldn't be changed, but I understood the limitations of a brand-new feature that has signed up “hundreds of thousands of users,” as Peacock's president told Fast Company. (Incidentally, Peacock's viewership for this year's Olympics is setting some ridiculous records.) The recap lives at a URL tagged, appropriately enough, with “voice-ai.”
Sponsored by Microsoft's “AI companion” Copilot, the feature contains 14 to 18 segments averaging one minute each, each bookended by an introduction and a sign-off from “Al Michaels.” And I was amazed at how stable, smooth, and glitch-free these segments were.
Each recap opened with the first sports category: three clips from the day's biggest events (produced by the NBC team and assembled here with the help of AI), with “Al Michaels” weighing in quickly and thoughtfully on everything from Malaysia's badminton victory to individual athletes' knee injuries. That gave way to a series of full-screen highlight reels of the relevant sports and matches (no AI voice here, just moments and vibes). The pattern then repeated with another sports category and closed with a collection of “must-see” or “buzzworthy” moments, briefly contextualized by the voice bot, which signed off with a preview of what would air on NBC primetime that night.
There was a lot to like here: Al Michaels' voice didn't sound like a cheesy robot, I could watch clips of my favorite Olympic sports full screen whenever I wanted, I could jump back and forth between segments at my leisure, and each day began with a new recap plus a notice about any recaps I had missed the day before.
I noticed a few issues here and there, especially with the auto-generated subtitles (Do you know a “Noah Lyons”? Or anyone with “Thai parents” instead of “Thai parents”?). And sometimes “Michaels” sounded a bit flatter than I expected. For example, while commenting on the France vs. Egypt soccer match, the Michaels bot all but muttered that the game had a “much happier ending” for France than for Egypt. (Something similar happened when he mentioned that Germany's struggles in basketball provided “another exciting” development.) The clips were mostly thrilling but sometimes disappointing. A “must-see” highlight of a swimming race showed only the competitor diving in, while highlights of others breaking records in sport climbing or athletics showed 30 seconds or so of the actual action followed by 30 seconds or so of celebration, with nothing more of the competition itself.
Still, I was amazed at how quickly and seamlessly these presentations came together, pulling from hours of coverage across multiple sports each day and delivering neat summaries the next. I knew from Peacock's marketing that generative AI was “used to recreate the voice of Al Michaels, trained on his past NBC appearances (with his approval), and also used to create intros and synopsis commentary for clips featuring the AI voice.” There's also an element of human review for final cleanup and editing. But I wanted to see exactly how AI was used in every part of the equation: the commentary, the clips, the design.
I reached out to the folks at NBC to find out. They told me that just after the events finish for each day of the Olympics (around 4 or 5 p.m. ET), a large language model goes to work, synthesizing all the information available on Peacock about that day's competition, including schedules, lineups, winners, and losers, to produce a short script for the Al Michaels bot. “The way we work with large language models for text is by using a kind of 'prompt chaining,' breaking down very complex tasks into a series of smaller concatenated steps to achieve a more accurate, higher-quality output,” said John Jerry, Peacock's senior vice president of product and UX. “The main inputs include event metadata (what is the sport, who are the players) and pulling in information from the subtitles.” Human editors then revise the generated script so that it reads the way the real Al Michaels would naturally say it, and proofread the text for typos, especially in the athletes' names.
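For readers curious what “prompt chaining” looks like in practice, here is a minimal Python sketch of the general idea: each step's output is fed into the next step's prompt. It is not NBC's pipeline. The function names, the prompt templates, and the `stub_model` stand-in are all hypothetical, and the event values are placeholders.

```python
from typing import Callable, Dict, List


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it only echoes the prompt."""
    return f"[model output for: {prompt}]"


def run_chain(
    event: Dict[str, str],
    steps: List[str],
    model: Callable[[str], str] = stub_model,
) -> List[str]:
    """Run prompt templates in order, feeding each output into the next step."""
    outputs: List[str] = []
    previous = ""  # output of the prior step; empty for the first step
    for template in steps:
        # Each template may reference the event metadata and {previous}.
        prompt = template.format(previous=previous, **event)
        previous = model(prompt)
        outputs.append(previous)
    return outputs


# Illustrative inputs only: metadata fields and captions are assumptions.
event = {"sport": "badminton", "players": "Team A vs. Team B", "captions": "..."}
steps = [
    "List the key facts for this {sport} event ({players}) from: {captions}",
    "Pick the single most important fact from: {previous}",
    "Write a two-sentence intro based on: {previous}",
]

if __name__ == "__main__":
    for text in run_chain(event, steps):
        print(text)
```

Splitting the job into small steps like this means each intermediate output can be inspected on its own, which fits the human editing pass NBC describes.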
The voice of Al Michaels relies on a separate LLM, trained on Michaels' past broadcasts and “enhanced” with the pronunciation of names and words expected to come up in this year's Olympics. The resulting audio also undergoes human review.
“Peacock's product and data science teams are further optimizing the voice's sound by adjusting a variety of variables, including breathing, tone, speed and intonation, to make it as close as possible to Al Michaels' voice,” Jerry said.
Meanwhile, the clips are pulled from a library created daily by NBC’s Connecticut-based “Highlight Factory,” which is made up of human videographers who cut out the best moments for use across NBC’s outlets (TV shows, YouTube, social media, Peacock Reels, etc.). The AI system helps pull the right clips for the daily recap by matching metadata, but human verifiers scan them to make sure the selected clips suit the particular sport or category. The editors at every step of the process come from across NBC Sports, and they are working to make sure the feature doesn’t fail in public the way other AI products have. After all, a machine is just as likely to make mistakes as the humans who make it work. If an Al Michaels bot that sounds as good as this one requires so much manpower to resemble a disembodied version of a real person, it’s hard to imagine a “fully” automated presenter gracing the Los Angeles Olympics.
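The metadata matching described above can be pictured with a short Python sketch. This is an assumption-laden illustration, not NBC's system: the `Clip` fields, the `candidate_clips` function, and the ranking rule (count of matching theme tags) are all hypothetical. The output is only a list of candidates, which in the process NBC describes would still go to a human verifier.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Clip:
    """Hypothetical clip record; the field names are assumptions."""
    title: str
    sport: str
    tags: Set[str]


def candidate_clips(
    library: List[Clip],
    sports: Set[str],
    themes: Set[str],
    limit: int = 3,
) -> List[Clip]:
    """Keep clips in the viewer's chosen sports, ranked by matching theme tags."""
    matches = [clip for clip in library if clip.sport in sports]
    # Stable sort: clips with equal scores keep their library order.
    matches.sort(key=lambda clip: len(clip.tags & themes), reverse=True)
    return matches[:limit]  # candidates for human review, not a final cut
```

Calling `candidate_clips(library, {"swimming"}, {"trending"})` would return at most three swimming clips, with the ones tagged “trending” first.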
In a way, this is an Olympic team effort in its own right, and a remarkable demonstration of how to apply generative AI to real, additive, and efficient use cases. This is not meant to disparage anyone, especially the great Al Michaels himself. I'm not saying that Olympic AI was all benevolent: just ask the tourists in Paris dealing with the sophisticated AI-powered surveillance systems installed throughout the city, which will probably stay there even after the Olympics are over. But at least at home, you could do much worse than a robotic Al Michaels.