2026-05-12 · 7 min read · #multi-agent #transparency

From Black Box to Glass Box: Why Multi-Agent Systems Need to Show Their Work

When a senior consultant recommends a decision, you can ask them why. They'll tell you the three reports they read, the customer call that surprised them, and the heuristic they've used for ten years. You may not agree, but you can argue back.

When an AI agent recommends a decision, you usually can't. The model gives you an answer, maybe a confidence score, and an aura of expertise. If you press it, you get either a polished restatement of the same answer, or — worse — a confident hallucination dressed up as reasoning.

This is the trust gap that's keeping multi-agent systems out of serious enterprise workflows. And it's the problem our HCI International 2026 paper is about.

What "transparency" actually means here

In the LLM world, "transparency" has become a buzzword with at least three meanings:

Process transparency — which agent ran, in what order, calling which tools.
Source transparency — which documents the agent retrieved and used.
Confidence transparency — how sure the agent is, and where its uncertainty comes from.

Most production systems do one of these badly. Almost none do all three. A "glass box" system makes all three legible without drowning the user in raw logs.
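
One way to picture the three layers is as fields that travel with the answer itself. The following is a minimal sketch under assumptions, not the data model of any particular system: every class and field name (`Step`, `Source`, `Confidence`, `GlassBoxAnswer`, `missing_layers`) is hypothetical, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    """Process transparency: one agent action, kept in execution order."""
    agent: str
    tool: Optional[str] = None  # None when the agent only reasoned


@dataclass
class Source:
    """Source transparency: a document the agent retrieved and used."""
    title: str
    excerpt: str


@dataclass
class Confidence:
    """Confidence transparency: a score plus where the doubt comes from."""
    score: float  # 0.0 to 1.0
    uncertainty_reasons: list = field(default_factory=list)


@dataclass
class GlassBoxAnswer:
    text: str
    process: list = field(default_factory=list)
    sources: list = field(default_factory=list)
    confidence: Optional[Confidence] = None

    def missing_layers(self) -> list:
        """Name the transparency layers this answer fails to expose."""
        missing = []
        if not self.process:
            missing.append("process")
        if not self.sources:
            missing.append("sources")
        if self.confidence is None:
            missing.append("confidence")
        return missing


# Invented sample data: the same answer text, with and without the layers.
black_box = GlassBoxAnswer(text="Renew the contract.")
glass_box = GlassBoxAnswer(
    text="Renew the contract.",
    process=[Step(agent="researcher", tool="search")],
    sources=[Source(title="Supplier audit", excerpt="No major findings.")],
    confidence=Confidence(score=0.7, uncertainty_reasons=["single pricing source"]),
)
```

A check like `missing_layers()` makes it easy to see which of the three layers a given answer leaves out.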

The same answer, but three new things to interrogate.

What enterprise users actually ask for

We ran a study with eight enterprise users — strategy analysts, IT decision-makers, and one extremely opinionated procurement lead. We showed them the same multi-agent system in two flavours: a "black box" version that just gave answers, and a "glass box" version that showed the three layers above.

Two findings surprised me:

1. Trust didn't come from accuracy. It came from auditability. Users in the glass-box condition were more willing to act on the system's answer even when they could see the system was sometimes wrong. The reason: they could spot when it was wrong and route around it.

2. Too much detail is just as bad as too little. Our first prototype showed every retrieved chunk, every tool call, every confidence score. Users stopped reading after fifteen seconds. The winning version had a one-line summary plus a "show me why" expander — most users opened the expander once, then trusted the summaries afterwards.

The bar isn't "explain everything." It's "explain enough that I can decide whether to dig deeper."
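
The summary-plus-expander pattern can be sketched as a function that returns one line by default and the supporting detail only on request. This is a text-only illustration, assuming a hypothetical `render_explanation` helper and invented sample strings; a real UI would use its own component model.

```python
def render_explanation(summary: str, details: list, expanded: bool = False) -> str:
    """Return the collapsed one-liner, or the summary plus every detail line."""
    if not expanded:
        # Collapsed view: one line, with a hint of how much is hidden.
        return f"{summary} [show me why: {len(details)} items]"
    # Expanded view: the same summary, followed by the supporting detail.
    lines = [summary] + [f"  - {d}" for d in details]
    return "\n".join(lines)


# Invented sample data for illustration only.
details = ["retrieved: supplier audit", "tool call: price lookup"]
collapsed = render_explanation("Supplier B is cheaper over 3 years.", details)
expanded = render_explanation("Supplier B is cheaper over 3 years.", details, expanded=True)
```

The collapsed string is what a user sees first; the expanded one is what sits behind "show me why".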

How we built it

The implementation is less exotic than you might expect:

Each agent logs structured events (retrieved, reasoned, delegated, concluded) to a per-request trace.
A small LLM call summarises each trace into one human-readable sentence per agent.
The UI renders a collapsed timeline; clicking expands the raw trace with citations.
Confidence is reported per sub-answer, not per whole answer: averaging confidence over a 3-step chain is misleading, because one shaky step can undermine the conclusion while the average still looks healthy.
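
The four pieces above can be sketched roughly as follows. This is an illustrative sketch under assumptions, not the production implementation: the `Trace`, `Event`, and `Kind` names are hypothetical, `summarise` is a plain template standing in for the small LLM call, and the events and confidence values are invented.

```python
from dataclasses import dataclass, field
from enum import Enum
from statistics import mean
from typing import Optional


class Kind(Enum):
    RETRIEVED = "retrieved"
    REASONED = "reasoned"
    DELEGATED = "delegated"
    CONCLUDED = "concluded"


@dataclass
class Event:
    agent: str
    kind: Kind
    detail: str
    confidence: Optional[float] = None  # only set on CONCLUDED events here


@dataclass
class Trace:
    """Per-request trace: an ordered list of structured events."""
    request_id: str
    events: list = field(default_factory=list)

    def log(self, agent, kind, detail, confidence=None):
        self.events.append(Event(agent, kind, detail, confidence))

    def agents(self):
        """Agents in order of first appearance."""
        return list(dict.fromkeys(e.agent for e in self.events))


def summarise(trace, agent):
    """Template stand-in for the LLM summary: one sentence per agent."""
    events = [e for e in trace.events if e.agent == agent]
    retrieved = sum(1 for e in events if e.kind is Kind.RETRIEVED)
    conclusion = next(
        (e.detail for e in reversed(events) if e.kind is Kind.CONCLUDED),
        "no conclusion",
    )
    return f"{agent} used {retrieved} source(s) and concluded: {conclusion}"


# Invented 3-step chain for illustration.
trace = Trace("req-1")
trace.log("planner", Kind.DELEGATED, "asked market and pricing agents")
trace.log("market", Kind.RETRIEVED, "analyst report")
trace.log("market", Kind.CONCLUDED, "demand is growing", confidence=0.95)
trace.log("pricing", Kind.RETRIEVED, "supplier quote")
trace.log("pricing", Kind.CONCLUDED, "unit cost is stable", confidence=0.90)
trace.log("planner", Kind.REASONED, "combined market and pricing findings")
trace.log("planner", Kind.CONCLUDED, "expand in Q3", confidence=0.40)

# Collapsed timeline: one sentence per agent.
timeline = [summarise(trace, a) for a in trace.agents()]

# Per-sub-answer confidence versus a single averaged number.
per_step = [e.confidence for e in trace.events if e.kind is Kind.CONCLUDED]
average = mean(per_step)  # about 0.75, which looks healthy
weakest = min(per_step)   # 0.40, the step a reader should inspect
```

In this made-up chain the average of roughly 0.75 hides the planner's 0.40, which is exactly the sub-answer a reader most needs to see.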

None of that is rocket science. The hard part is deciding what not to show. Every "useful debugging detail" in the UI costs you a user.

What I'd do differently

If I started this project tomorrow, I'd:

Treat the explanation layer as a first-class feature from day one. We bolted it on after the agents were working, and it showed.
Build the "show me why" view first, then the answer. The view forces you to know what the system actually knows.
Stop calling these systems "agents." Users hear "robot." They want "research assistant": something fast, well-read, and answerable for its own claims.

The bigger point

The conversation around AI safety often gets framed as a research problem. In enterprise deployment, it's mostly a UX problem. The model is allowed to be imperfect. The system around it has to make those imperfections visible, navigable, and ignorable when the user knows better. That's what "glass box" means in practice — and it's the only version of multi-agent AI I've seen people actually use twice.

The paper goes into more detail. If you'd like a preprint when it's ready, ping me — [email protected].
