In the very short term, we're deploying this tech more in a post-operation/training role. Imagine being a student pilot, getting in from your solo cross country, and pulling up the debrief with all your comms laid out and transcribed. In this setting, it's helpful for the student to get immediate feedback such as "your readback here missed this detail", etc. Controllers also have phraseology and QA reviews every 30 days where this is helpful. This will make human pilots and controllers better.
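To make that concrete, here's a toy sketch of the kind of readback check a debrief could run; the field names and inputs are invented for illustration, not our actual pipeline:

# Toy sketch of a post-flight readback check: compare structured fields
# extracted from the clearance with those extracted from the readback.
# Field names and values are illustrative assumptions only.
def readback_gaps(clearance: dict, readback: dict) -> list[str]:
    gaps = []
    for field, cleared_value in clearance.items():
        heard_value = readback.get(field)
        if heard_value is None:
            gaps.append(f"readback omitted {field} ({cleared_value})")
        elif heard_value != cleared_value:
            gaps.append(f"readback said {heard_value} for {field}, cleared {cleared_value}")
    return gaps

# e.g. "climb and maintain 5,000, turn left heading 270" read back without the heading
print(readback_gaps({"altitude": 5000, "heading": 270}, {"altitude": 5000}))
# ['readback omitted heading (270)']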
Next, we'll step up to active advisory (mapping to low assurance levels in the certification requirements). There's always a human in the loop that can respond to rare errors and override the system with their own judgement. We're designing with observability as a first-class consideration.
Looking out 5-10 years, it's conceivable that the error rates on a lot of these systems will be super-human (non-zero, but better than a human). It's also conceivable that you could actually respond "Say Again" to a speech-to-speech model that can correct and repeat any mistakes as they're happening.
Of course, that's a long way off. And there will always be a human in the loop to make a final judgement as needed.
I imagine that aviation regulatory bodies have high standards for this - a tool being fully additive to existing tools does not necessarily mean that it's cleared for use in a cockpit or in an ATC tower, right? Do you have thoughts about how you'll approach this? Also curious from a broader perspective - how do you sell any alerting tool into a niche that's highly conscious of distractions, and of not just false positive alerts but false negatives as well?
There are lots of small steps on this ladder.
The first is post-operational. An alert is triggered asynchronously and someone reviews it after the fact. Tools like this bring awareness to hot spots or patterns of error, which the human controller can then apply in real time later.
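As a rough illustration of that loop (the event fields here are made up for the example): queue the alerts with a bit of context, then roll them up so the reviewer sees where errors cluster.

# Hypothetical sketch: aggregate queued post-operational alerts into "hot spots"
# so a reviewer sees where errors cluster. Field names are illustrative only.
from collections import Counter

def hot_spots(alerts: list[dict], top_n: int = 3) -> list[tuple]:
    # Count alerts per (location, alert type) pair and return the most frequent.
    counts = Counter((a["location"], a["kind"]) for a in alerts)
    return counts.most_common(top_n)

alerts = [
    {"location": "RWY 28L", "kind": "readback_mismatch"},
    {"location": "RWY 28L", "kind": "readback_mismatch"},
    {"location": "DUMBA", "kind": "altitude_deviation"},
]
print(hot_spots(alerts))
# [(('RWY 28L', 'readback_mismatch'), 2), (('DUMBA', 'altitude_deviation'), 1)]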
A step up from that is real-time alerting, but not to the main station controller. There's always a manager in the tower who's looking over everyone's shoulder and triaging anything that comes up. That person is not as focused on any single area as the main controllers are. There's precedent for tools surfacing alerts to the manager, who then decides whether it's worth stepping in. This is probably where our product will sit for a while.
The bar to get in front of an active station controller is extremely high. But it's also not necessary for a safety net product like this to be helpful in real time.
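To make the ladder concrete, a toy routing policy might look like the sketch below; the tier names and the confidence threshold are invented, not how the product actually decides.

# Toy sketch of the routing ladder described above. Tiers and the threshold
# are assumptions for illustration, not product behavior.
from enum import Enum

class Route(Enum):
    POST_OP_LOG = "log for after-the-fact review"
    SUPERVISOR = "surface to the tower supervisor/manager"
    STATION = "push to the active station controller"

def route_alert(confidence: float, supervisor_threshold: float = 0.95) -> Route:
    if confidence >= supervisor_threshold:
        return Route.SUPERVISOR   # a human still decides whether to step in
    return Route.POST_OP_LOG      # everything else waits for the debrief

# Note: nothing routes to STATION; the bar to interrupt an active controller
# is deliberately not crossed in this sketch.
print(route_alert(0.99))  # Route.SUPERVISOR
print(route_alert(0.80))  # Route.POST_OP_LOG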
To me, speech to text and back seems like an incremental solution, but the holy grail would be the ability to symbolically encode the meaning of the words and translate to and from that meaning. People's phraseology varies wildly (even though it often shouldn't). For example, if I'm requesting VFR flight following, I can do it many different ways, and give the information ATC needs in any order. A system that can convert my words to "NorCal Approach Skyhawk one two three sierra papa is a Cessna one seventy two slant golf, ten north-east of Stockton, four thousand three hundred climbing six thousand five hundred requesting flight following to Palo Alto at six thousand five hundred," is nice, but wouldn't it be amazing if it could translate that audio into structured data:
{
  atc: NORCAL,
  requester: "N123SP",
  request: "VFR",
  type: CESSNA_172,
  equipment: [G],
  location: <approx. lat/lon>,
  altitude: 4300,
  cruise_altitude: 6500,
  destination: KPAO,
}
...for potential ingestion into other digital-only analysis systems. You could structure all sorts of routine and non-routine requests like this, check them for completeness, use them later for training, and so on. Maybe one day, display it in real time on ATC's terminal and in the pilot's EFIS. With structured data, you could associate people's spoken tail numbers with info broadcast over ADS-B and match them up in real time, too. I don't know, maybe this already exists and I just re-invented something that's already 20 years old, no idea. IMO there's lots of innovation possible bringing VHF transmissions into the digital world!

Kidding aside, yes, you're exactly right. We're already doing this to a large degree and getting better. Lots of our own data labeling and model training to make this good.
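As a purely illustrative sketch of the downstream checks that structured output enables (the field names roughly follow the example above; none of this reflects a real system): validate a flight-following request for completeness and tie the spoken tail number to a live ADS-B target.

# Illustrative only: completeness check plus ADS-B association for a parsed
# VFR flight-following request. Field names follow the example above.
from dataclasses import dataclass, fields

@dataclass
class FlightFollowingRequest:
    atc: str
    requester: str                      # spoken tail number, e.g. "N123SP"
    request: str
    type: str
    altitude: int | None = None
    cruise_altitude: int | None = None
    destination: str | None = None

def missing_fields(req: FlightFollowingRequest) -> list[str]:
    # Which fields ATC would still need to ask the pilot for.
    return [f.name for f in fields(req) if getattr(req, f.name) is None]

def match_ads_b(req: FlightFollowingRequest, targets: dict) -> dict | None:
    # Associate the spoken tail number with a live ADS-B target, if one exists.
    return targets.get(req.requester)

req = FlightFollowingRequest(
    atc="NORCAL", requester="N123SP", request="VFR", type="C172",
    altitude=4300, cruise_altitude=6500,   # destination never stated
)
print(missing_fields(req))   # ['destination'] -> prompt the pilot to complete the request
print(match_ads_b(req, {"N123SP": {"alt": 4275, "squawk": "1200"}}))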
This is effectively AGI.
And I've not seen anyone reputable suggest that our current LLM track will get us to that point. In fact, there's no known path to AGI. It would require another breakthrough in pure research, in an environment where money is being pulled out of universities.
When, not if. "Artificial intelligence" as it is presently understood is statistical in nature. To rely on it for air traffic control seems quite irresponsible.
I'd be curious about what happens when the ASR fails. This is not the place to guess or AI-hallucinate. As a pilot, I can always ask "Say Again" over the radio if I didn't understand. ASR can't do that. Also, it would be pretty annoying if my readback was correct, but the system misunderstood either the ATC clearance or my readback and said NO.
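One plausible fail-safe, sketched below with invented names and an invented threshold, would be for the system to abstain below a confidence floor (its own version of "say again") rather than assert a mismatch it isn't sure about.

# Hypothetical fail-safe sketch: abstain on low ASR confidence instead of
# guessing. The threshold and verdicts are illustrative, not a real policy.
from enum import Enum

class Verdict(Enum):
    OK = "readback matches the clearance"
    MISMATCH = "readback differs from the clearance"
    UNSURE = "low confidence: no alert, log for post-op review"

def check_readback(clearance: str, readback: str, asr_confidence: float,
                   min_confidence: float = 0.9) -> Verdict:
    if asr_confidence < min_confidence:
        return Verdict.UNSURE    # the system's own "say again"
    return Verdict.OK if clearance == readback else Verdict.MISMATCH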