Beyond the AI scribe: how SignalEHR writes a clinical note you can actually sign

June 4, 2026 · 6 min read · By SignalEHR

How SignalEHR generates clinical notes: separate the voices, measure deterministically, constrain the writer, and keep a human as the author.

Every therapist has the same worry about AI notes, and most vendors quietly route around it: what happens when the software makes something up?

A progress note is part of the clinical and legal record, and it has your name on it. If the software invents a symptom, or puts a client's words in your mouth, that is the kind of error that can follow a client through their chart for years. "Mostly accurate" is not something you can sign. Here is how SignalEHR's note engine is built so you don't have to take that bet.

Where a plain AI scribe goes wrong

Most scribes do the whole thing in one move. Audio becomes a transcript, the transcript goes to a language model, and the model writes the note. That single move is where hallucination lives.

A language model is, underneath, a fluency machine. It's trained to produce text that reads like a competent clinical note, and when the transcript is clean, it does. The trouble starts when the session is messy, and sessions are usually messy. People trail off. Audio drops. Two voices land on top of each other. When the input thins out, the model does what it was built to do and fills the silence with something plausible. Because the same step both observes the session and writes it up, you can't go back and tell which sentences came from the room and which the model produced to be helpful.

Our whole design is about prying those two jobs apart. Observe in one place. Write in another. Don't let the writer invent the thing it's supposed to be describing.

First, figure out who is talking

A session is a conversation, and a note has to attribute it correctly. When a client says they've been having thoughts of not waking up, that line is the client's, and it might belong in a risk assessment. A scribe that flattens both voices into one stream has to guess who spoke. Guess wrong, and you've logged a client's disclosure as your own remark, or handed an intervention to the person who didn't make it.

So SignalEHR separates the audio into distinct speakers before it writes a line. In an individual session that's therapist and client. In couples work it's three people kept apart, therapist and both partners, which is the only way to track each partner on their own instead of averaging the room into one mood. In groups it's everyone present, tracked by who actually spoke.

This is genuinely hard to do well. People interrupt, microphones are cheap, voices overlap. So we don't trust a single pass at it. The audio gets labeled as it streams in, a second pass goes back and corrects the labels that were wrong, and if the machine still has it backwards, the therapist fixes it with one click. By the time the note is written, every line already has a name on it.
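
The three layers described above (live label, corrective pass, therapist override) can be pictured as successive corrections over the same transcript. What follows is a minimal sketch, not SignalEHR's data model: `Segment`, `resolve_speaker`, and every field name are hypothetical, chosen only to show the order of trust.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """One stretch of speech. All field names are hypothetical."""
    text: str
    live_label: str                          # assigned while the audio streams in
    second_pass_label: Optional[str] = None  # filled in by the later corrective pass
    clinician_label: Optional[str] = None    # filled in only if the therapist overrides


def resolve_speaker(seg: Segment) -> str:
    """Pick the most trusted label: clinician, then second pass, then live."""
    if seg.clinician_label is not None:
        return seg.clinician_label
    if seg.second_pass_label is not None:
        return seg.second_pass_label
    return seg.live_label


segments = [
    # The live label was right; nothing corrected it.
    Segment("How has sleep been this week?", live_label="therapist"),
    # The live label was wrong; the second pass fixed it.
    Segment("Not great, maybe four hours a night.", live_label="therapist",
            second_pass_label="client"),
    # Both automatic passes were wrong; the therapist fixed it by hand.
    Segment("Let's try the breathing exercise.", live_label="client",
            second_pass_label="client", clinician_label="therapist"),
]

attributed = [(resolve_speaker(s), s.text) for s in segments]
```

Keeping all three labels, rather than overwriting the earlier ones, is one way to leave a trail of what the machine guessed and what a person corrected.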

Live sessions and recorded ones are not the same problem

We run two pipelines on purpose, because a one-on-one and a packed family session don't behave the same way.

Live individual and couples sessions stream as they happen. The audio is transcribed on the fly and the analysis runs while the session is still going. That live read is the only way to catch something while it can still change the next ten minutes in the room, like a client's affect dropping or a risk signal climbing. The cost of working live is that speaker labels are shaky in the moment, which is the whole reason that second cleanup pass exists.

Bigger recorded sessions, groups and families, get handled after the fact instead. Handing the system the entire recording at once, rather than a live trickle, buys much cleaner speaker separation in the noisy, crowded rooms where live transcription tends to fall apart. You give up the real-time read to win accuracy where it's hardest to earn. Neither path is better than the other. They suit different moments.
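
As a rough illustration of that split, the routing decision can be written as a single function of session type. The enum names and `choose_pipeline` are assumptions made for this sketch; the article only states which kinds of session go down which path.

```python
from enum import Enum


class SessionType(Enum):
    INDIVIDUAL = "individual"
    COUPLES = "couples"
    GROUP = "group"
    FAMILY = "family"


class Pipeline(Enum):
    LIVE_STREAMING = "live_streaming"  # transcribe and analyze during the session
    RECORDED_BATCH = "recorded_batch"  # process the whole recording afterwards


def choose_pipeline(session_type: SessionType) -> Pipeline:
    """Route smaller sessions to the live path and larger ones to the batch path."""
    if session_type in (SessionType.INDIVIDUAL, SessionType.COUPLES):
        return Pipeline.LIVE_STREAMING
    return Pipeline.RECORDED_BATCH


routes = {t.value: choose_pipeline(t).value for t in SessionType}
```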

The part that is not a language model

Before a word of the note exists, a deterministic engine reads the session. Not a second AI, but a rule-based system. It listens to the sound of the conversation: the pace, the pauses, the energy behind the voice. It also reads the language in the transcript. Out of both it produces the clinical readings a therapist already thinks in: anxiety, depression, the strength of the alliance, the level of risk. As numbers.

Because it runs on rules, it has no temperature and no imagination. Send the same session through it twice and you get the same numbers twice. The affect and the risk in your note are measured, not written up from a hunch. So by the time the writing step arrives, it doesn't have to guess how distressed someone was from the mood of a transcript. That reading is already on the table, and it doesn't move between runs.
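
To show what "same session in, same numbers out" means in practice, here is a toy rule-based scorer. Everything in it is an assumption made for illustration: the feature names, the thresholds, and the phrase list are invented, carry no clinical validity, and are not the rules SignalEHR uses. The only point is that nothing in it is sampled or random.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SessionFeatures:
    """Hypothetical inputs: two acoustic measures plus the client's words."""
    words_per_minute: float
    mean_pause_seconds: float
    client_text: str


# Toy phrase list for illustration only; NOT a clinical instrument.
RISK_PHRASES = ("not waking up", "no point", "better off without me")


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    return max(low, min(high, value))


def score_session(features: SessionFeatures) -> Dict[str, float]:
    """Fixed rules, no sampling: identical input always gives identical output."""
    text = features.client_text.lower()
    risk_hits = sum(phrase in text for phrase in RISK_PHRASES)
    return {
        "slowed_speech": round(clamp((120.0 - features.words_per_minute) / 120.0), 3),
        "long_pauses": round(clamp(features.mean_pause_seconds / 5.0), 3),
        "risk_language": round(clamp(risk_hits / len(RISK_PHRASES)), 3),
    }


example = SessionFeatures(
    words_per_minute=90.0,
    mean_pause_seconds=2.5,
    client_text="Some nights I think about not waking up.",
)
first_run = score_session(example)
second_run = score_session(example)
```

Running the scorer twice on `example` produces two equal dictionaries, which is the property the paragraph above describes.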

The writer only arranges what is already there

Now, and only now, does a language model get involved, with a far smaller job than a scribe hands it. Its inputs are fixed: the verified transcript with the speakers attached, plus those measured readings. Its task is to lay that material out in whatever format the clinician keeps notes in, SOAP or DAP or BIRP or another standard structure. It works against a strict template, and any diagnostic codes arrive as suggestions to confirm, never as a diagnosis the model settled on by itself.

A scribe tells the model to write the note. We tell it to take this evidence and arrange it in this shape. A narrow instruction is a great deal harder to hallucinate against than a blank page.
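
One way to picture "arrange this evidence in this shape" is a prompt builder whose only inputs are the attributed transcript, the measured readings, and a fixed list of sections. This is a sketch under assumptions: `build_writer_prompt`, the instruction wording, and the placeholder code `EXAMPLE-CODE-1` are all hypothetical, and no model is called here.

```python
from typing import Dict, List, Sequence, Tuple

SOAP_SECTIONS = ("Subjective", "Objective", "Assessment", "Plan")


def build_writer_prompt(
    transcript: List[Tuple[str, str]],   # (speaker, text) pairs, already verified
    readings: Dict[str, float],          # output of the deterministic step
    suggested_codes: List[str],          # suggestions only, for the clinician to confirm
    sections: Sequence[str] = SOAP_SECTIONS,
) -> str:
    """Assemble a narrow instruction: fixed evidence in, fixed sections out."""
    lines = [
        "Arrange ONLY the evidence below into the listed sections.",
        "Do not add symptoms, quotes, or events that are not in the evidence.",
        "Sections: " + ", ".join(sections),
        "",
        "TRANSCRIPT (speaker-attributed):",
    ]
    lines += [f"[{speaker}] {text}" for speaker, text in transcript]
    lines += ["", "MEASURED READINGS:"]
    lines += [f"{name}: {value}" for name, value in sorted(readings.items())]
    lines += ["", "CODE SUGGESTIONS (clinician must confirm; not a diagnosis):"]
    lines += [f"- {code}" for code in suggested_codes]
    return "\n".join(lines)


prompt = build_writer_prompt(
    transcript=[
        ("therapist", "How has sleep been?"),
        ("client", "About four hours a night."),
    ],
    readings={"long_pauses": 0.5, "slowed_speech": 0.25},
    suggested_codes=["EXAMPLE-CODE-1"],  # placeholder, not a real diagnostic code
)
```

Swapping `SOAP_SECTIONS` for a DAP or BIRP section list changes the shape of the note without changing what evidence the writer is given.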

So, about hallucination

We won't tell you we've solved it. Anyone promising an AI that can't be wrong is selling you the exact confidence you should be suspicious of. What we've done is make it a lot less likely, and then put a person at the end anyway.

The writer is handed nothing but the real session, so there is far less room for outside information to creep in. The clinical readings come from a reproducible engine, so the numbers don't wander. The speakers are kept separate, so nobody's words get reassigned. The template and the suggest-don't-diagnose rule keep the prose pinned to the evidence. And every note lands in the chart as a draft, for the clinician to read, edit, and sign. The software proposes. The clinician decides. The setup lowers the odds of a mistake, and the human catches whatever is left before it becomes part of the record.
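
The draft-then-sign gate can be sketched as a small state check. The `Note` class, its status strings, and `is_part_of_record` are hypothetical names for this illustration, and the rule that signed notes cannot be edited is an assumption of the sketch, not something the article states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Note:
    body: str
    status: str = "draft"            # every generated note starts as a draft
    signed_by: Optional[str] = None

    def edit(self, new_body: str) -> None:
        # Assumption of this sketch: a signed note is locked.
        if self.status == "signed":
            raise ValueError("signed notes cannot be edited in this sketch")
        self.body = new_body

    def sign(self, clinician_id: str) -> None:
        # Only a named clinician can move a note out of draft.
        if not clinician_id:
            raise ValueError("a clinician must sign the note")
        self.status = "signed"
        self.signed_by = clinician_id


def is_part_of_record(note: Note) -> bool:
    """A note counts as part of the record only after a clinician signs it."""
    return note.status == "signed" and note.signed_by is not None


note = Note(body="Generated draft text.")
before_signing = is_part_of_record(note)
note.edit("Draft text, corrected by the clinician.")
note.sign("clinician-123")  # hypothetical identifier
after_signing = is_part_of_record(note)
```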

It doesn't ride on any one model

Worth saying plainly: none of this is bolted to a particular vendor or model. The transcription, the speaker separation, the writer, those are all parts we can swap as better ones turn up, and better ones keep turning up. What doesn't change is the shape around them. Separate the voices. Measure with something deterministic. Fence in the writer. Leave a human as the author.
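
A simple way to picture swappable parts inside a fixed shape is a pipeline function that takes each stage as an argument. The stage signatures and stub components below are invented for this sketch. They stand in for real transcription, speaker separation, measurement, and writing, and do none of those things for real.

```python
from typing import Callable, Dict, List, Tuple

Attributed = List[Tuple[str, str]]  # (speaker, text) pairs

Transcriber = Callable[[bytes], List[str]]
Diarizer = Callable[[List[str]], Attributed]
Measurer = Callable[[Attributed], Dict[str, float]]
Writer = Callable[[Attributed, Dict[str, float]], str]


def run_pipeline(audio: bytes, transcribe: Transcriber, diarize: Diarizer,
                 measure: Measurer, write: Writer) -> str:
    """The fixed shape: separate voices, measure, then hand both to the writer."""
    utterances = transcribe(audio)
    attributed = diarize(utterances)
    readings = measure(attributed)
    return write(attributed, readings)


# Stub components, for illustration only.
def fake_transcribe(audio: bytes) -> List[str]:
    return audio.decode("utf-8").split("|")


def alternating_diarize(utterances: List[str]) -> Attributed:
    return [("therapist" if i % 2 == 0 else "client", u)
            for i, u in enumerate(utterances)]


def count_measure(attributed: Attributed) -> Dict[str, float]:
    return {"client_turns": float(sum(1 for s, _ in attributed if s == "client"))}


def plain_writer(attributed: Attributed, readings: Dict[str, float]) -> str:
    return "\n".join(f"{s}: {t}" for s, t in attributed)


def bracket_writer(attributed: Attributed, readings: Dict[str, float]) -> str:
    return "\n".join(f"[{s}] {t}" for s, t in attributed)


audio = b"How are you?|Tired."
note_a = run_pipeline(audio, fake_transcribe, alternating_diarize, count_measure, plain_writer)
note_b = run_pipeline(audio, fake_transcribe, alternating_diarize, count_measure, bracket_writer)
```

Only the last argument differs between the two calls, so the writer changes while the order of steps around it stays put.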

That's the part that makes a note trustworthy, and it'll still hold after this year's favorite model is old news. We aren't betting the chart on one AI. We're betting on a process that stays honest no matter what's plugged into it.

The honest version

A scribe is tuned to sound like a good note. We tuned for a note that's faithful to the session that actually happened. It's a smaller promise than "the AI does your notes for you," and it asks the clinician to stay in the loop. A therapist should be the one signing the chart anyway. If you've been waiting for an AI note you'd genuinely put your name on, that's the bar we wrote toward.

Built for clinicians who read every note before they sign it

Run one of your own sessions through it

Start a 14-day trial and put a real session through the pipeline. Full features, no credit card, and your sign-off is still the gate.
