AI Scans Audio Recordings to Detect Voice Box Cancer


An AI-based tool may offer a non-invasive and more accessible way to distinguish between laryngeal cancer and benign vocal cord lesions.

People often “lose their voice” after spending the night cheering for a local sports team or singing along to their favorite songs at a concert. Such overuse can temporarily injure the vocal cords, making people’s voices sound hoarse and strained. But there’s a much more alarming cause that can also alter a person’s voice: laryngeal cancer, which may be fatal if left untreated. Clinicians typically assess this condition using invasive methods such as endoscopy and biopsy, which are at times unavailable in underserved areas.

In a recent study, researchers found that certain acoustic features could distinguish people with vocal cord lesions from those without based on their voice recordings.1 One of the characteristics that the researchers measured could even differentiate between benign and cancerous lesions. This work, led by Phillip Jenkins, a general surgery resident at Oregon Health and Science University, put forward a non-invasive and more accessible way to diagnose voice disorders. The findings were published in Frontiers in Digital Health.

This study is part of the Bridge2AI program, a National Institutes of Health consortium that aims to develop AI models to address key challenges in biomedical research. Jenkins’s team used an existing Bridge2AI-Voice dataset, which contains recordings of study participants reading the Rainbow Passage, a text that speech pathologists commonly use to assess American English speakers.2

Using AI, the researchers extracted acoustic features that have been previously associated with vocal cord pathologies: fundamental frequency, which indicates pitch and intonation; jitter, which measures fluctuations in fundamental frequency and signifies control of the vocal cord; shimmer, which quantifies fluctuations of sound wave amplitudes and may denote the presence of lesions that interfere with vocal cord movement; and harmonic-to-noise ratio (HNR), which can indicate improper closing of the vocal cords.3,4
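To make these measures more concrete, here is a minimal Python sketch of how they are commonly defined in voice analysis. It is not the researchers' pipeline, which this article does not describe. It assumes that the length of each vocal cord cycle (in seconds) and the peak amplitude of each cycle have already been measured from a recording, and that the HNR is derived from the normalized autocorrelation of the signal at the pitch period. The function names are hypothetical and chosen only for illustration.

```python
import math


def mean_f0(periods):
    """Mean fundamental frequency in Hz: cycles divided by total duration."""
    return len(periods) / sum(periods)


def _relative_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods):
    """Cycle-to-cycle variation in period length (a ratio; multiply by 100 for percent)."""
    return _relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation in peak amplitude (a ratio; multiply by 100 for percent)."""
    return _relative_perturbation(amplitudes)


def hnr_db(r):
    """Harmonic-to-noise ratio in dB, given the normalized autocorrelation r
    at the pitch period (assumed to satisfy 0 < r < 1)."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return 10.0 * math.log10(r / (1.0 - r))


# Illustrative, made-up values (not data from the study).
periods = [0.010, 0.012, 0.010]   # seconds per cycle
amplitudes = [1.0, 0.9, 1.0]      # arbitrary units
print(mean_f0(periods), local_jitter(periods), local_shimmer(amplitudes))
print(hnr_db(0.99))
```

In this sketch, a perfectly steady voice would give a jitter and shimmer of zero, while a noisier signal (a lower autocorrelation value) would give a lower HNR.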

When the researchers compared these features in 122 individuals with no voice disorders, 13 with benign vocal cord lesions, and 10 with laryngeal cancer, they found that the standard deviation of HNR differed significantly between people with benign vocal cord lesions and those without voice disorders, as well as between people with benign vocal cord lesions and those with laryngeal cancer. This suggested that among the metrics that the researchers tested, HNR may be the most indicative.

Findings from this study demonstrate the possibility of using the voice to diagnose vocal cord lesions non-invasively. In the future, Jenkins and his colleagues hope to evaluate larger datasets and incorporate more variables, such as the size of vocal cord lesions.

