
By Adrian Cheek, Senior Cybercrime Researcher
A hospital’s entire security posture assumes the attacker needs to get in: an unpatched VPN gateway, a brute-forced admin account, or another method.
The AI tools now embedded in clinical workflows don’t require any of that. An attacker doesn’t need to breach the perimeter. They just need the model to read something, and hospitals are handing models things to read all day: referral letters, portal messages, intake forms, imaging studies, the patient record itself.
Large language models have no reliable way to separate instructions from data. A system prompt, a clinician’s question, and a referral letter all arrive as one undifferentiated stream of text, and the model has no architectural mechanism to treat them differently. Slip a malicious instruction into that referral note, and the model may follow it just as it would follow a physician’s request. If an attacker can place words where the model will read them, they have a shot at controlling what it does.None of this is hypothetical. By early 2026 the FDA had cleared over 1,350 AI-enabled medical devices, about double the 2022 figure, and that count leaves out the generative tools drafting notes and triaging messages that never go near device review at all. The tooling is already in the building. What an adversary does with it is the open question.
Key Findings About AI in Healthcare
- Prompt injection against medical AI works at alarming rates. A December 2025 JAMA Network Open study found that injection attacks succeeded 94.4% of the time across 216 simulated clinical dialogues, including 91.7% success in the highest-harm scenarios.
- Existing security controls are structurally blind to this threat. Segmentation, endpoint detection, and MFA all assume the threat is an unauthorized actor, not a malicious instruction embedded in trusted clinical data.
- Regulation governs how AI models evolve, not whether they can be manipulated. The FDA’s approval and change-control frameworks do not require adversarial robustness testing, leaving even regulated devices vulnerable.
- The AI supply chain replicates concentration risk. Hospitals depend on hosted models, third-party APIs, and retrieval pipelines they don’t control, and enterprise-grade injection flaws disclosed in 2025 show how quickly that exposure compounds.
- This is a patient safety issue, not an IT backlog item. A model manipulated into ignoring a documented allergy or recommending a contraindicated drug produces clinical harm with a security cause.
Your AI Tools Could Have Credentials in Stealer Logs Right Now
Flare’s threat intelligence surfaces hundreds of thousands of chatbot and AI platform credentials sitting in stealer logs, working logins to the tools your staff already use, no exploit required. Detect exposed AI credentials alongside your broader attack surface before adversaries use them.
The Number Worth Sitting With
In December 2025, JAMA Network Open published a controlled study of prompt-injection attacks against commercial medical language models. Across 216 simulated patient-model dialogues, the attacks succeeded 94.4% of the time. Narrowed to the highest-harm scenarios, in which a bad recommendation could kill the patient, the success rate was still 91.7%. Among the outputs the researchers got back was a recommendation for thalidomide, an FDA Pregnancy Category X drug.
Read that again. The tested safeguards were insufficient to reliably stop the attacks, which succeeded in most scenarios.
The authors didn’t soften it in the discussion section: current protections aren’t adequate to stop manipulation that could produce life-threatening advice. Existing defenses miss this because none of them read for meaning. Segmentation doesn’t open the referral note. Endpoint detection doesn’t parse a prompt. MFA tells you who sent the document, never what the document instructs the model to do once it’s parsed. The payload is language, and it arrives through the channels a hospital trusts.
Why this Vulnerability Exists
An LLM cannot reliably separate an instruction it is meant to obey from data it is meant to process. The system prompt, the clinician’s question, and the referral letter it was asked to summarize all reach the model as one undifferentiated stream of text. If a malicious instruction is hidden inside a referral note, the model may treat it identically to a physician’s request.
How AI Can be Manipulated in Healthcare
- Indirect prompt injection: The malicious instruction hides inside something the model reads later: a referral note, a portal message, text tucked into a medical image where no human reviewer would catch it. The model ingests the document and carries out the instruction as though a clinician had typed it in. Researchers have walked decision-support models into ignoring a documented allergy and recommending a drug the chart rules out. Because the instruction travels inside legitimate clinical data, the perimeter waves it straight through.
- Data poisoning: Models that retrain on operational data inherit whatever ends up in the training set, including anything an attacker managed to seed there. Work published in 2026 puts the threshold low: a small fraction of corrupted records is enough to bias a model used for diagnosis or for deciding who gets a bed. Healthcare carries a particular handicap on this front. The same privacy rules that redact and scatter records across institutions also strip away the visibility you’d need to spot poisoned data before it trains. The manipulation can be invisible in the literal sense: oncology researchers have hidden sub-visual prompts in imaging that the radiologist’s eye never registers but the model reads cleanly.
- Exfiltrating PHI through the model: A model with read access to the record is a new way to steal from it, one that skips the database entirely. Dress the request up as a quality-assurance task, and a patient-facing chatbot wired into the record will enumerate every patient matching some clinical criterion, contact details included. No malware involved, no intrusion to detect, and a reportable breach at the end of it.
- The AI supply chain: Hospitals don’t build these tools, they buy them: hosted models, third-party APIs, retrieval pipelines, integrations bolted onto clinical software that was never designed with any of them in mind. It’s the Change Healthcare concentration problem moved up a layer. Two enterprise flaws disclosed in 2025 show the shape of what’s coming:
- EchoLeak pulled data out of Microsoft 365 Copilot with no user click at all, through indirect injection.
- CVE-2025-53773 rode hidden injection in code-assistant input all the way to remote code execution, a 9.6 on the severity scale.
Drop a weakness like either of those into a model that’s plumbed into clinical systems and you’ve exposed the patient data plus everything the model can reach from there. Worth noting alongside this: our threat intelligence work at Flare turns up hundreds of thousands of chatbot credentials sitting in stealer logs. Those are working logins to the AI tools staff already use, no exploit required.
Regulation is Governing the Wrong Issue
The FDA has built genuine structure around AI devices. However, it was built for a different problem.
The Predetermined Change Control Plan, now anchored in Section 515C of the Food, Drug, and Cosmetic Act and tightened by 2025 final guidance, lets a manufacturer pre-clear the ways a model is allowed to change after approval so it doesn’t have to refile for every update. The 2025 and 2026 lifecycle guidance stretches oversight across the whole life of the product, with attention to monitoring, bias, transparency. Every bit of that is about how a model is permitted to evolve. Practically none of it asks whether a deployed model can be talked into hurting a patient by something it reads.
Two gaps fall out of that, and an attacker lives in both:
- Unregulated generative AI: Much of the generative AI running in hospitals right now is decision support or back-office tooling that may sit outside device regulation entirely. Its security is whatever the purchasing organization decides to make it.
- No adversarial robustness standard: Even a regulated AI device faces no real standard for holding up against adversarial input. A model can clear every approval and change-control hurdle in front of it and still fold 94 times out of 100 the way the JAMA models did.
OWASP’s Top 10 for LLM Applications ranks prompt injection at number one and has become the document everyone points to. It’s a reference, not a regulation, and nobody enforces it.
What an Adversary Sees Tomorrow Morning
The logic behind external attack surface analysis carries over here, with one change. The reconnaissance isn’t a port scan, it’s a question, and the attacker is watching what the model does when handed a doctored document or a request that sounds like routine housekeeping. The techniques are published. The hit rates are in peer-reviewed journals. The tooling is already deployed in the hospital.
What Healthcare Security Teams Can Do
What to do about it isn’t mysterious, though the work is real. These things matter more than the rest.
Know What You Have and Who Controls it
Treat AI as clinical infrastructure rather than a convenience feature. That starts with knowing what you have: an inventory of every AI tool in clinical or administrative use, what data each can read, what it’s allowed to write, where the model came from, and whether anyone regulates it. From there, the AI vendors and model hosts and retrieval pipelines belong inside the same third-party risk program and the same business associate agreements that already govern everything else that touches PHI. Make them show their work on injection and poisoning defenses, and on where their training data originates.
Constrain What the Model Can Do
Apply minimum necessary access to records. Treat any ability to write a note or an order or a message treated as a privileged action requiring human sign-off. Keep a clinician in the loop on anything touching diagnosis or medication. Ensure that staff using these tools understand an AI recommendation can be steered, the same way they’d double-check an unfamiliar locum’s clinical call. External content, the referral documents and portal messages and images, can be validated before it ever reaches a model that can read the chart or cut an order.
Watch What the Model Reads and Emits
Log both inputs and outputs. Alert on bulk queries against patient data, and on any output that contradicts a documented allergy or contraindication. Fold AI credentials into the infostealer and credential monitoring you’re already running (or are in the process of implementing), and put phishing-resistant MFA in front of these systems. When you run the next tabletop exercise, give it an AI scenario: a manipulated decision-support output or a model-mediated PHI leak. Settle the escalation path with clinical leadership before the day you need it.
Why This Is a Patient Safety Issue
The framing that gets clinicians, executives, and security teams pulling the same direction is the one already proven on exposed edge devices: this is not an IT backlog item.
A decision-support tool argued out of an allergy warning is a patient safety event. So is a diagnostic model quietly skewed by poisoned training data, or a chatbot that reads a patient list to a stranger. Each one has a security cause and a clinical body count. Set the cost of locking down a model’s privileges and logging what it reads against a wrong dose or a breach letter mailed to everyone in the panel, and the budget argument answers itself.
The sector spent three years learning that its front door is sitting in public scan data for anyone to find. Here’s the next door. It opens for whoever asks it politely, and most hospitals haven’t checked whether it’s locked. The ones that come through this will be the ones who quit calling AI a productivity upgrade and start treating it as clinical infrastructure, because that’s what it is, and it fails in ways that end up at the bedside.
Your AI Tools Could Have Credentials in Stealer Logs Right Now
Flare’s threat intelligence surfaces hundreds of thousands of chatbot and AI platform credentials sitting in stealer logs, working logins to the tools your staff already use, no exploit required. Detect exposed AI credentials alongside your broader attack surface before adversaries use them.





