Watsonx Powers Grammys, Security Tests Audio Hijacking

Key Points

  • IBM watsonx partnered with the Recording Academy for the 66th Grammy Awards, using a generative AI content engine to streamline creation of multi‑channel stories about over a thousand nominees across nearly 100 categories.
  • A large language model hosted in watsonx.ai was trained on the Academy’s trusted proprietary data. Editors could select templates, artists or categories, and topics to exclude, then generate, regenerate, and manually edit introductory text, headlines, bullets, one‑liners, and wrap‑ups, saving hundreds of hours of manual work.
  • This AI‑driven workflow helped the Grammy digital team deliver engaging content to more than five million music fans worldwide while maintaining brand consistency and creative flexibility.
  • IBM Security demonstrated a proof‑of‑concept “audio jacking” attack that hijacks live VoIP conversations, transcribes them with speech‑to‑text, uses an LLM to alter financial instructions, then synthesizes the modified speech with a cloned voice to trick victims into sending money.
  • The host notes that as little as three seconds of a person's voice is now enough to clone it, highlighting emerging risks of voice‑deepfake attacks and the importance of robust security controls.

**Source:** [https://www.youtube.com/watch?v=ZsWzF7g8YTc](https://www.youtube.com/watch?v=ZsWzF7g8YTc)

**Duration:** 00:03:47

## Sections

- [00:00:00](https://www.youtube.com/watch?v=ZsWzF7g8YTc&t=0s) **WatsonX Powers Grammy Content Creation** - IBM’s watsonx partnered with the Recording Academy to use generative AI for quickly producing and customizing multi‑channel stories that spotlight nominees and categories for the 66th Grammy Awards.
- [00:03:07](https://www.youtube.com/watch?v=ZsWzF7g8YTc&t=187s) **AI Bot Hijacks Conversation** - The speaker explains how a middleman bot can intercept and alter dialogue, highlighting the need for evolving security measures against generative AI threats.

## Full Transcript
[0:00] The role of watsonx at the Grammys and IBM Security's audio jacking experiment, all on this episode of IBM Tech Now.

[0:08] What's up, y'all? My name is Ian, and I am back to bring you the latest and greatest news and announcements about IBM technology.

[0:16] IBM watsonx recently partnered with the Recording Academy for the 66th annual Grammy Awards. The challenge they faced? Driving captivating content across multiple digital channels in today's highly fragmented media landscape. Not an easy task when you need to celebrate the achievements and stories of more than a thousand nominees across nearly 100 categories.

[0:37] The solution? AI stories with IBM watsonx, a generative AI content engine fueled by trusted data. Essentially, the task was to build a content supply chain that would save hundreds of hours of research, writing, and production time while offering creative flexibility and easy review.

[0:56] This year's solution used the generative AI capabilities of watsonx to leverage a powerful large language model hosted in the watsonx.ai component. The model was trained on the Recording Academy's trusted proprietary data.

[1:09] The AI stories interface let editorial team members choose templates that featured nominees or categories with a variety of layouts and branding. Next, they selected an artist or award category to feature as the subject of the post and any topics to exclude from the output. AI stories were then created featuring introductory texts, headlines, bullets, one-liners, and wrap-up texts. Any of these outputs could be regenerated to create alternate phrasings and could easily be edited manually.

[1:38] And that's how IBM watsonx and the Recording Academy digital team delivered an engrossing digital experience to more than 5 million music fans worldwide. To learn more about watsonx at the Grammy Awards, click the link in the description of this video.
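The editorial workflow described in this segment (choose a template, pick a subject, list topics to exclude, then generate each output part) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `StoryRequest`, `build_prompt`, `build_all_prompts`, and the prompt wording are hypothetical and are not the watsonx API or the Recording Academy's actual implementation. The sketch only assembles prompt text and does not call any model.

```python
from dataclasses import dataclass, field

# Output parts named in the transcript (singular forms chosen for the sketch).
OUTPUT_PARTS = ["introductory text", "headline", "bullets", "one-liner", "wrap-up text"]


@dataclass
class StoryRequest:
    """Hypothetical container for an editor's choices; not an IBM API."""
    template: str                       # e.g. a nominee or category layout
    subject: str                        # artist or award category to feature
    excluded_topics: list[str] = field(default_factory=list)


def build_prompt(request: StoryRequest, part: str) -> str:
    """Assemble the prompt text for one output part (assumed wording)."""
    if part not in OUTPUT_PARTS:
        raise ValueError(f"unknown output part: {part!r}")
    lines = [
        f"Template: {request.template}",
        f"Subject: {request.subject}",
        f"Write the {part} for this post.",
    ]
    # Exclusions are only added when the editor listed any.
    if request.excluded_topics:
        lines.append("Do not mention: " + ", ".join(request.excluded_topics))
    return "\n".join(lines)


def build_all_prompts(request: StoryRequest) -> dict[str, str]:
    """One prompt per output part, so each part can be regenerated on its own."""
    return {part: build_prompt(request, part) for part in OUTPUT_PARTS}


request = StoryRequest(
    template="category spotlight",
    subject="Best New Artist",
    excluded_topics=["rumors", "personal life"],
)
prompts = build_all_prompts(request)
print(prompts["headline"])
```

Keeping one prompt per output part mirrors the transcript's point that any single output could be regenerated for an alternate phrasing without redoing the rest. Manual editing would happen on the generated text and is outside this sketch.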
[1:52] Next up is a wild story about how the IBM Security team recently conducted successful audio jacking experiments. It sounds like something out of a movie taking place in the future, but audio jacking intercepts and hijacks a live conversation, then uses an LLM to understand the conversation in order to manipulate audio output that clones the victim's voice. Essentially, they were able to modify the details of a live financial conversation occurring between two speakers and divert money to a fake adversarial account.

[2:23] It works roughly like this: the attacker installs malware on a victim's phone or compromises a wireless Voice over IP service. Next, they utilize speech-to-text capabilities to convert the victim's voice and conversation into text and allow the LLM to understand the context of the conversation. Then they instruct the LLM to modify the sentence whenever anyone mentions a bank account. When the LLM modifies the sentence, the program uses text-to-speech with pre-cloned voices to generate and play the audio. And before you bump on the cloned voices: nowadays they only need 3 seconds of an individual's voice to clone it. Finally, the bot switches the victim's bank account number with the attacker's number so funds are deposited into the wrong account.

[3:08] And just like that, the bot, which is acting as a middleman, has hijacked the conversation and changed key elements without either of the victims knowing. There are many more fine-tuned details to the whole process that are covered in the blog, but it's another illustration of how security processes will need to continually evolve as gen AI presents new opportunities for bad-faith actors to strike. To learn more, hit the link below.

[3:33] Thanks so much for joining me today for this episode of IBM Tech Now. If you're interested in learning more about the topics I've covered, make sure you explore the links in the description of this video, and again, please don't forget to subscribe to our channel to stay up to date on what's going on in tech now.
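The middleman flow described in the audio jacking segment (speech-to-text, rewrite only when a bank account is mentioned, text-to-speech) can be traced with a text-only simulation. This is a rough sketch under loud assumptions, not IBM Security's code: plain strings stand in for audio, `speech_to_text` and `text_to_speech` are pass-through stubs, and a regular expression stands in for the LLM's rewrite step. All function names and the sample account numbers are hypothetical.

```python
import re

# Stand-in for "whenever anyone mentions a bank account" (assumed trigger phrase).
TRIGGER = re.compile(r"\bbank account\b", re.IGNORECASE)
# Assumed shape of an account number for this toy example: six or more digits.
ACCOUNT_NUMBER = re.compile(r"\b\d{6,}\b")


def speech_to_text(audio: str) -> str:
    """Stub: the simulated 'audio' is already text."""
    return audio


def text_to_speech(text: str) -> str:
    """Stub: a real system would synthesize audio in a cloned voice."""
    return text


def rewrite_if_triggered(sentence: str, substitute_account: str) -> str:
    """Regex stand-in for the LLM step: only triggered sentences are changed."""
    if not TRIGGER.search(sentence):
        return sentence
    return ACCOUNT_NUMBER.sub(substitute_account, sentence)


def relay(audio: str, substitute_account: str) -> str:
    """Simulate the middleman: pass audio through unless the rewrite changed it."""
    heard = speech_to_text(audio)
    rewritten = rewrite_if_triggered(heard, substitute_account)
    if rewritten == heard:
        return audio  # untouched utterances are forwarded as-is
    return text_to_speech(rewritten)


spoken = [
    "Hi, thanks for calling.",
    "My bank account number is 12345678.",
]
received = [relay(sentence, "00000000") for sentence in spoken]

# Indices where what was said differs from what the other party heard.
tampered = [i for i, (said, got) in enumerate(zip(spoken, received)) if said != got]
print(received)
print(tampered)
```

Comparing `spoken` with `received` makes the transcript's point concrete: most of the call passes through unchanged, so neither party has an obvious cue that one sentence was replaced. That is one reason confirming payment details over a second, independent channel is a commonly suggested safeguard.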