The injured teenage survivor of a January 2025 shooting at a Nashville, Tennessee high school recently sued the manufacturer of an “AI gun detection” system that failed to detect the handgun used in the shooting, which left two dead, including the shooter.
According to the lawsuit, which was filed in Davidson County court last month, the security company Omnilert either knew or should have known that there were “significant operational limitations in its gun detection system that could result in detection failures during actual emergencies, including limitations based on camera placement, proximity of the weapon to camera sensors, camera angle, lighting, and weapon visibility.”
Omnilert cofounder Ara Bagdasarian declined Ars’ invitation to answer questions about the lawsuit. System Integrations, the other defendant in the case, which resold the Omnilert system, also did not respond to Ars’ request for comment.



Didn’t I just say that slapping an LLM vision model onto dozens of camera streams would be a near-impossible technical hurdle?
I never said vision LLMs don’t exist. I said they’re impractical for this use case.
Haven’t been wrong yet. You on the other hand…
There are several examples of exactly what I said, contradicting your repeated claim. Since I don’t want to talk to someone with the conversational ability of Donald Trump, demanding things be true in spite of evidence that they’re not, I’m going to be blocking you now. Have a nice day.
No one is denying the existence of vision-based LLMs. The issue is performance. It takes on the order of double (or even triple) digit seconds to process an image through an LLM. Even if it took a single second to process an image using decent server-grade hardware (which starts at about $10k per card), that’s way too much and still not fast enough.
For just 10 cameras at a facility, it would require north of $100k on GPUs alone.
Whereas a specialized computer vision model could process several dozen camera streams, in real time, on just one of those $10k cards.
An LLM would process an image in 10 seconds (generous), whereas a computer vision model operates in milliseconds. We’re talking about a 1000x difference in required processing power.
That’s why you’re wrong and have zero clue what you’re talking about.
You’re arguing that a family uses a fully loaded semi-trailer to go 200 m to the local park. It’s a clueless and asinine argument.
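
For anyone who wants to check the back-of-envelope numbers in the comment above, here is a rough sketch in Python. The card price, camera count, and the 1-second and 10-second LLM latencies come from the comment. The sampling rate of one frame per camera per second, the 10 ms latency for the specialized model, and the "one frame at a time per card, no batching" model are assumptions made purely for illustration, and the function names are hypothetical.

```python
import math

# Figures taken from the comment above.
CARD_PRICE_USD = 10_000   # "about $10k per card"
CAMERAS = 10              # "just 10 cameras at a facility"

# Assumptions for illustration only (not stated in the thread).
FRAMES_PER_SEC_PER_CAMERA = 1   # analyze one frame per camera per second
CV_LATENCY_MS = 10              # a stand-in for "milliseconds"


def cards_needed(cameras, fps_per_camera, latency_ms):
    """Cards required if each card handles one frame at a time (no batching)."""
    busy_seconds_per_second = cameras * fps_per_camera * latency_ms / 1000
    return max(1, math.ceil(busy_seconds_per_second))


def hardware_cost(cameras, fps_per_camera, latency_ms):
    """GPU spend in USD for the given load, ignoring everything but the cards."""
    return cards_needed(cameras, fps_per_camera, latency_ms) * CARD_PRICE_USD


# LLM at an optimistic 1 second per image, and at 10 seconds per image.
llm_cost_1s = hardware_cost(CAMERAS, FRAMES_PER_SEC_PER_CAMERA, 1_000)
llm_cost_10s = hardware_cost(CAMERAS, FRAMES_PER_SEC_PER_CAMERA, 10_000)

# Specialized computer vision model at the assumed 10 ms per image.
cv_cost = hardware_cost(CAMERAS, FRAMES_PER_SEC_PER_CAMERA, CV_LATENCY_MS)

# Per-image latency ratio between the 10-second LLM and the 10 ms model.
latency_ratio = 10_000 // CV_LATENCY_MS

print(f"LLM @ 1 s/image:   ${llm_cost_1s:,}")
print(f"LLM @ 10 s/image:  ${llm_cost_10s:,}")
print(f"CV  @ 10 ms/image: ${cv_cost:,}")
print(f"Latency ratio:     {latency_ratio}x")
```

Under these assumptions, the 1-second case works out to one card per camera, which is where the $100k figure comes from. Raising the sampling rate above one frame per second scales the LLM card count linearly, while the specialized model stays on a single card until its per-card budget is used up.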