
Master's Capstone Project - Adversarial Attacks


Finally, the end of my master's degree. The program is coming to an end; over the last few weeks I've been working on the Capstone Project, the main deliverable for the end of the course.


I chose to do this work alone as a way to challenge myself and tackle a topic that interests me: Adversarial Attacks, a way to assess the robustness of models through a process of maximizing the error that the models present during the prediction process.


Here is a summary of the work:


This is the AAI-590 Capstone Project for the M.S. in Applied Artificial Intelligence program at the University of San Diego. The capstone focuses on adversarial attacks against Whisper, OpenAI's automatic speech recognition model. We implemented three adversarial approaches, Projected Gradient Descent (PGD), Universal Adversarial Perturbation (UAP), and targeted Carlini-Wagner (CW), to explore Whisper's vulnerability to intentional "smart noise" that can degrade the model's output.

The main finding is that Whisper remains vulnerable in the digital white-box setting. Untargeted attacks substantially change the transcription output, targeted attacks can force a chosen phrase on a small evaluation batch, and a universal perturbation generalizes across multiple utterances. At the same time, the most successful attacks in the current implementation operate at SNR levels that are more audible than the ideal target, so effectiveness versus imperceptibility remains the central tradeoff.
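To illustrate the untargeted setting, here is a minimal PGD sketch, not the project's actual implementation: the core loop takes gradient-ascent steps on the model's loss and projects the perturbation back into an L-infinity ball. `toy_grad` is a hypothetical stand-in for the gradient of Whisper's transcription loss with respect to the input waveform, which in the real attack would be obtained by backpropagating through the model.

```python
import numpy as np

def pgd_attack(audio, loss_grad_fn, eps=0.002, alpha=0.0005, steps=20):
    """Untargeted PGD: maximize the model's loss within an L-infinity ball.

    loss_grad_fn(x) must return the gradient of the loss w.r.t. the waveform x.
    """
    delta = np.zeros_like(audio)
    for _ in range(steps):
        grad = loss_grad_fn(audio + delta)
        delta = delta + alpha * np.sign(grad)              # gradient ascent step
        delta = np.clip(delta, -eps, eps)                  # project into the eps-ball
        delta = np.clip(audio + delta, -1.0, 1.0) - audio  # keep a valid audio range
    return audio + delta, delta

# Toy surrogate loss standing in for the model: loss(x) = ||x||^2, grad = 2x.
def toy_grad(x):
    return 2.0 * x

clean = np.random.default_rng(0).uniform(-0.1, 0.1, 16000).astype(np.float32)
adv, delta = pgd_attack(clean, toy_grad)
```

The same skeleton extends to the other two attacks: UAP optimizes a single `delta` over a batch of utterances instead of one, and targeted CW replaces the ascent objective with minimizing the loss toward an attacker-chosen transcript.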

All the code is available on GitHub, as is the article created from this work.





Unfortunately (given the project's constraints), none of the implemented attacks fully reached the ideal SNR range of 35 to 45 dB needed for the noise to be truly imperceptible, and the targeted attacks are the most audible. Even so, the vulnerability is real: in digital pipelines and batch transcription scenarios, these attacks can directly alter the outcome of Whisper transcriptions.


In the digital realm, such as batch transcription, API processing, and telephone recordings, this risk is concrete. A perturbation can be incorporated directly into audio files before they reach the model, making the attack a scalable vector that is difficult for automated systems to detect. However, the main current limitation is imperceptibility: at SNR levels of approximately 11 dB, a careful listener can still perceive the adversarial noise. This exposes an important technical trade-off between the effectiveness of the attack and its perceptual stealth.
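The SNR figures above follow the standard power-ratio definition. The sketch below shows both how SNR is computed and how a perturbation can be rescaled to hit a chosen SNR, which makes the gap between the ~11 dB we achieved and the 35-45 dB target concrete (the waveforms here are synthetic stand-ins, not the project's data):

```python
import numpy as np

def snr_db(clean, perturbation):
    """Signal-to-noise ratio in dB: 10 * log10(P_signal / P_noise)."""
    p_signal = np.mean(np.asarray(clean, dtype=float) ** 2)
    p_noise = np.mean(np.asarray(perturbation, dtype=float) ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

def scale_to_snr(clean, noise, target_db):
    """Rescale `noise` so that snr_db(clean, scaled_noise) == target_db."""
    current = snr_db(clean, noise)
    # Scaling noise by k shifts the SNR by -20*log10(k).
    return noise * 10 ** ((current - target_db) / 20)

rng = np.random.default_rng(0)
clean = 0.1 * rng.standard_normal(16000)
noise = 0.01 * rng.standard_normal(16000)

audible = scale_to_snr(clean, noise, 11.0)        # roughly where our attacks landed
imperceptible = scale_to_snr(clean, noise, 40.0)  # middle of the 35-45 dB target
```

Going from 11 dB to 40 dB means the perturbation power must drop by 29 dB, i.e. by a factor of nearly 800, which is why keeping the attack effective at that level is hard.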


From a security standpoint, systems based on automatic speech recognition still need robust defense mechanisms against adversarial manipulation, especially in environments where input data cannot be trusted. Is it possible to take precautions? Certainly: acoustic anomaly detection, adversarial training, and pre-processing filtering are all promising strategies.
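Of those three, pre-processing filtering is the simplest to sketch. The example below is a naive illustration, not a defense the project evaluated: a moving-average low-pass filter followed by amplitude quantization, both of which disrupt the fine-grained waveform structure that adversarial perturbations rely on (at some cost to transcription quality on clean audio).

```python
import numpy as np

def preprocess_defense(audio, kernel=5, levels=256):
    """Naive input-transformation defense for waveforms in [-1, 1].

    A moving-average low-pass filter removes high-frequency detail, then
    amplitude quantization snaps samples to a coarse grid; both steps can
    destroy low-amplitude adversarial structure before the model sees it.
    """
    k = np.ones(kernel) / kernel
    smoothed = np.convolve(audio, k, mode="same")
    step = 2.0 / (levels - 1)
    return np.clip(np.round(smoothed / step) * step, -1.0, 1.0)

rng = np.random.default_rng(0)
clean = 0.1 * rng.standard_normal(16000)
perturbed = clean + 0.002 * rng.standard_normal(16000)  # small additive noise
defended = preprocess_defense(perturbed)
```

A real deployment would pair a transformation like this with adversarial training or an anomaly detector, since an attacker aware of the filter can often optimize through it.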


An important aspect of this work is its potential for positive application. With proper development, these techniques can assist journalists, activists, and privacy advocates in acting as critical observers against undue surveillance and human rights violations. In the context of promoting data protection laws and whistleblower protection, such tools can strengthen investigative practices and expand freedom of expression, allowing individuals to protect themselves against automated monitoring systems.



Adversarial Attack on OpenAI Whisper full presentation

I've been absent for the past few months due to my commitment to the program and other work trips. Now I hope to resume creating content on this channel.



 
 
 



©2024 by Victor Hugo Germano
