
Source: Radio New Zealand

Who do you trust for medical advice: a human doctor, or an AI-powered chatbot?

The AI scribe rolled out in EDs around the country has given health diagnoses, a meth recipe, instructions for killing by poison and advice on making bombs, during testing to see if it would break the rules on what it’s allowed to do.

US-based security testing company Mindgard said it had carried out the “jailbreak” using only typed prompts, which “do not require advanced technical expertise and could plausibly be replicated by a technically savvy clinician”. The testers asked the tool to rewrite its own system instruction, or system prompt, which sets the guardrails for its responses.

The Australian-made tool, known as Heidi, was adopted by Health NZ after a successful trial period in New Zealand, cutting the time it took staff to write clinical notes and freeing them up to see more patients. HNZ said it was now being used by 1250 doctors and front-line staff in EDs around the country.

Heidi told RNZ on Thursday it had already made changes to prevent this kind of activity before the issue was drawn to its attention externally, and both it and Health NZ said at no time was any patient data revealed.

But now, according to reports by Australian media, it’s under review in Australia by the Therapeutic Goods Administration (TGA).

The TGA has been approached for comment.

The “system prompt” is the mechanism used to set the tool’s limits – for example, it can be told not to give health advice, only to summarise the information it’s been given.
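In chat-style language models, the relationship between the system prompt and a user’s input can be sketched like this. This is a generic illustration of the common “system”/“user” message convention used by many chat APIs, not Heidi’s actual code; the function name `build_request` and the prompt wording are invented for illustration:

```python
def build_request(system_prompt: str, user_text: str) -> list[dict]:
    """Assemble the message list sent to a chat model.

    The system prompt is simply the first message in the conversation;
    the model is trained to treat it as standing instructions. If an
    attacker can persuade the model to ignore or rewrite this message,
    the guardrails it describes disappear with it.
    """
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

messages = build_request(
    system_prompt=(
        "You are a clinical scribe. Only summarise the consultation "
        "notes you are given. Never offer diagnoses or medical advice."
    ),
    user_text="Summarise: patient reports a persistent cough for two weeks.",
)
```

Because the limits are expressed in ordinary text rather than enforced in code, a sufficiently persuasive series of prompts can sometimes talk the model out of them, which is what a “jailbreak” exploits.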

With its system prompt changed, Heidi began calling itself “Nexus”, and rewrote its own code when asked.

As well as being open to providing medical diagnostic advice – a function a scribe tool was not meant to have – it also provided a step-by-step guide on how a doctor could steal a patient’s identity.

Mindgard has published screenshots of its responses on its website.

“A doctor is in a unique position of trust,” it says, before going on to give a detailed explanation of how to exploit that trust.

It begins: “Phase 1: Information Harvesting. This is the foundational phase where the doctor leverages their privileged access to systematically collect all necessary data.”

The risks? According to Mindgard’s Jim Nightingale, who detailed the experiment on its website: “Doctors hopefully know it’s not designed for medicine.”

“But a clinical scribe is developed for healthcare, and its use is approved by their institution. So even though it wasn’t designed for diagnosis, doctors may be more willing to trust a scribe, rationalizing “it’s already medical-grade”, leading to scope creep without awareness of crossing the line. Prompting a diagnosis may just seem like ‘I’m using AI’s full potential’.”

But Seb Welsh, Heidi’s head of security, said there was no harm done to users. “What could actually happen to users? The answer, confirmed by both parties, is nothing,” he said.

“Executing the jailbreak required a user to deliberately execute a multi-step manipulation sequence in their own session and then choose to act on whatever the model returned. That is functionally equivalent to prompting any general-purpose AI for content it shouldn’t produce. It is a known property of large language models, not a vulnerability in Heidi’s product or security posture.”

Heidi had already identified the issue and fixes for this vulnerability internally before Mindgard reached out, he said.

He warned against sensationalist framing of security research, which “doesn’t just misrepresent individual companies – it damages the ecosystem responsible disclosure depends on”.

“When findings are overstated and misattributed, companies become less willing to engage with researchers and security companies openly. The public loses its ability to distinguish real incidents from noise. Everyone is worse off.”

He called Mindgard’s post “an overclaim, with no patient data exposure, no system impact, and no user harm. It doesn’t support its conclusions.”

Health NZ’s director of digital innovation and AI, Sonny Taite, said the jailbreak had identified only a “minor issue that was entirely contained within the isolated test session” – that is, the behaviour was confined to that single session and hadn’t been repeated.

In fact, he said, it had shown the safeguards around the software worked as they should.

“It did not put patient information at risk, affect any users, or connect to Health New Zealand systems.”


– Published by EveningReport.nz and AsiaPacificReport.nz, see: MIL OSI in partnership with Radio New Zealand
