Article updated 2026-06-04.
Written by Christofer Sandin.

UX design for AI

For most people, AI is a black box: something that produces answers without explaining how or where the result came from, often with a confidence that’s out of all proportion to how reliable it actually is. When users don’t understand what’s happening or why, it becomes hard to trust the result.

The solution isn’t to try to make the black box completely transparent. We don’t know exactly why a language model picks one word over another — it’s a complex probabilistic calculation that goes over most of our heads.

But we can do something else. We can swap the black sheet metal for plexiglass. Not to show the mechanics themselves, but to show the process and what’s happening, which choices are being made and with what degree of confidence. That’s where we can build trust, and that’s where those of us building AI services have a job to do.

Calibrate trust between distrust and over-trust

Trust in AI isn’t a scale where “more” is always better. It’s a zone between two different kinds of failure:

  • Active distrust. The user doesn’t trust the system and ignores it, which makes the whole process pointless.
  • Over-trust and automation bias. The user accepts the system’s output without scrutiny, which can be catastrophic when the system gets it wrong.

What we should aim for is something we can call calibrated trust — a confidence that’s proportionate to what the system can actually deliver, situation by situation. It isn’t a falsely confident trust, but a trust that rests on four legs:

  • Capability to actually carry out the task.
  • Benevolence to act in my interest, without hidden motives.
  • Integrity to make the process transparent and ethical.
  • Predictability to make consistent judgments over time.

A language model is built on probability and can therefore never guarantee trust. But perhaps it can strive toward it by letting the user assess those four dimensions.

Surface decisions where they’re made

Today, feedback from AI systems is either nonexistent or overwhelming. Either you just get a spinning symbol and hope something good will appear, or you get an enormous amount of text that nobody reads. Both cases leave the user with no way to relate to the result.

What we should do instead is run a Decision Node Audit. Transparency, in this case, is about surfacing the moments when the AI departs from rules and makes an educated guess. It’s these probabilistic choices that need to become visible. We don’t need to show every token or the entire chain of reasoning, just the forks where a deterministic rule ends and a probabilistic judgment takes over.

For a recommendation system in an e-commerce setting, it might look like this:

  • Comparing search terms against our products…
  • Comparing your behavior against the purchase patterns of similar customers…
  • Weighing product suggestions against available stock and delivery time…

Each such point is a fork. And at every fork, it’s worth disclosing what was chosen, why, and with what degree of reliability. If the result isn’t good, in some cases you can also rewind to an earlier choice and see whether a different direction would have been better.
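As a rough illustration, the forks described above could be recorded as small structured entries. This is a minimal sketch, not a reference implementation: the names `DecisionNode`, `AuditTrail` and `rewind_to` are hypothetical, and the confidence values are assumed to come from a source that has actually been calibrated.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionNode:
    """One fork where a probabilistic judgment took over from a fixed rule."""
    step: str          # what the system was doing at the fork
    chosen: str        # the option it picked
    reason: str        # short, user-facing justification
    confidence: float  # 0.0-1.0, assumed to be calibrated elsewhere


@dataclass
class AuditTrail:
    """Hypothetical container for the forks of one run."""
    nodes: list[DecisionNode] = field(default_factory=list)

    def record(self, node: DecisionNode) -> None:
        self.nodes.append(node)

    def rewind_to(self, index: int) -> "AuditTrail":
        """Return a copy that keeps the forks up to and including `index`,
        so the process can be re-run from that point."""
        return AuditTrail(nodes=self.nodes[: index + 1])

    def summary(self) -> list[str]:
        """One user-facing line per fork: what was chosen, why, how reliably."""
        return [
            f"{n.step}: chose {n.chosen} ({n.reason}, confidence {n.confidence:.0%})"
            for n in self.nodes
        ]


# Illustrative data only, loosely following the e-commerce example.
trail = AuditTrail()
trail.record(DecisionNode("Matching search terms", "running shoes",
                          "closest product category", 0.9))
trail.record(DecisionNode("Comparing purchase patterns", "trail models",
                          "similar customers bought these", 0.6))
earlier = trail.rewind_to(0)  # step back to the first fork
```

Keeping the reason and the confidence on the same record as the choice is what lets an interface show all three together at each fork.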

This gives the user a window into the process. That goes a long way, provided we’re also honest about what’s happening. Concretely, we need three things:

  • Audit trails that show where in the process each decision was made.
  • Graceful error handling so the system doesn’t go silent or confabulate when something goes wrong.
  • A deliberate distance from trust washing — a way of giving the impression of transparency without actually being transparent.

Two examples of what trust washing can look like in practice:

A chat assistant answers, “I based this on your latest quarterly report and the sales trend over the past six months” — even though the model never read them. It has provided a plausible justification for the answer being correct. The user believes they’ve received a traceable source citation, but it’s pure fabrication.

An analytics tool shows “Reliability: 92%” next to a conclusion. The number looks precise and scientific, but it isn’t calibrated against any actual outcomes — it’s a design choice meant to make the user feel safe. The sense of safety rests on a measurement that doesn’t exist.

In both cases, the product looks transparent and reliable, but there’s no substance behind the claims. They’re only there to give us false reassurance.
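One way to tell a real reliability figure from a decorative one is to compare it against recorded outcomes. The sketch below assumes a log exists of whether each displayed conclusion later proved correct; the function name `calibration_gap` and the sample data are hypothetical.

```python
def calibration_gap(stated_confidence: float, outcomes: list[bool]) -> float:
    """Difference between the confidence shown to users and the share of
    those conclusions that later turned out to be correct."""
    if not outcomes:
        # Without recorded outcomes the displayed number cannot be verified.
        raise ValueError("no recorded outcomes to calibrate against")
    observed = sum(outcomes) / len(outcomes)
    return stated_confidence - observed


# Made-up log: ten conclusions were each shown with "Reliability: 92%",
# and six of them later proved correct.
outcomes = [True] * 6 + [False] * 4
gap = calibration_gap(0.92, outcomes)  # roughly 0.32, i.e. clearly overconfident
```

A gap near zero suggests the displayed figure means something. A large gap, or no outcome log at all, suggests the number is decoration.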

Not all transparency is free

This is where it’s easy to overshoot. If we show every choice, every probability score and every step in the log, the user drowns in information and trust is eroded rather than built. Transparency needs to be matched to the weight of the action.

A simple model is to think of the AI’s actions along two axes: effect (low or high consequence) and reversibility (easy or hard to undo). Transparency looks different in each of the four quadrants.

  • Low effect and reversible. A discreet toast notification is enough. The AI can rename a file and just mention it in passing.
  • Low effect but irreversible. A simple undo button: minimal friction, but the option to step back.
  • High effect but reversible. Visible logs and history for asynchronous workflows, e.g., when the AI moves leads in a CRM.
  • High effect and irreversible. Stop. Require active human approval before anything happens, e.g., a transaction over a certain amount.
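The four quadrants above can be expressed as a small lookup. This is only a sketch: `Treatment` and `transparency_for` are hypothetical names, and a real product would need its own rules for deciding what counts as high effect or as reversible.

```python
from enum import Enum


class Treatment(Enum):
    TOAST = "discreet toast notification"
    UNDO = "undo button"
    LOG = "visible log and history"
    APPROVAL = "human approval required before acting"


def transparency_for(high_effect: bool, reversible: bool) -> Treatment:
    """Map an action's position on the two axes to a level of transparency."""
    if high_effect:
        return Treatment.LOG if reversible else Treatment.APPROVAL
    return Treatment.TOAST if reversible else Treatment.UNDO


# Renaming a file: low effect, easy to undo.
rename_file = transparency_for(high_effect=False, reversible=True)
# A large transaction: high effect, hard to undo.
large_payment = transparency_for(high_effect=True, reversible=False)
```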

It’s about showing the right thing at the right moment. Plexiglass isn’t transparent everywhere. It shows what needs to be shown, where it needs to be shown.

Distrust begins with silence

Another source of unease is when the user doesn’t know whether something is expected of them or whether they should just wait. That’s why tools like Claude Code work with continuously updated checklists that show what the system is doing at any given moment.

A good status update has a simple anatomy — active verb + specific object + constraint. The difference between a black box and a transparent process often comes down to word choice.

  • Weak: “Searching for tickets.”
  • Strong: “Searching for flights with SAS for a price under SEK 6,000.”

The first message says nothing. The second says the system has understood the question, that it’s working on it, and what principle it’s following. The user can await the result with peace of mind — and if something goes wrong, they can see where it went wrong.
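The anatomy of verb, object and constraint can be made explicit in code, so that a status message missing one of the parts is easy to spot. This is a minimal sketch with hypothetical names (`StatusUpdate`, `missing_parts`). It only detects empty parts, not vague ones.

```python
from dataclasses import dataclass


@dataclass
class StatusUpdate:
    verb: str        # active verb, e.g. "Searching for"
    obj: str         # specific object, e.g. "flights with SAS"
    constraint: str  # the rule being applied; an empty string means none given

    def render(self) -> str:
        """Join the non-empty parts into one status line."""
        parts = [self.verb, self.obj, self.constraint]
        return " ".join(p.strip() for p in parts if p.strip()) + "."

    def missing_parts(self) -> list[str]:
        """Names of the parts that were left empty."""
        named = (("verb", self.verb), ("object", self.obj),
                 ("constraint", self.constraint))
        return [name for name, value in named if not value.strip()]


weak = StatusUpdate("Searching for", "tickets", "")
strong = StatusUpdate("Searching for", "flights with SAS",
                      "for a price under SEK 6,000")
```

Here `strong.render()` reproduces the strong message above, while `weak.missing_parts()` reports that the constraint is missing.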

Just as important is designing for partial success. In agentic systems, outcomes fall on a gray scale: often 90% of the task succeeds, but some part doesn’t. It’s tempting to throw out a giant error message, but that’s the wrong response. Instead, we should show what succeeded, and be concrete about what didn’t and why. For example:

  • “Flight booked.”
  • “Hotel booked.”
  • “Rental car needs your attention: no availability.”

Often it’s an external service acting up, not the system’s own functions.
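A partial-success report along these lines might be sketched as follows. The names `StepResult` and `report` are hypothetical, and the wording is tied to the booking example for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepResult:
    name: str                     # e.g. "Flight"
    succeeded: bool
    reason: Optional[str] = None  # why the step failed, if it did


def report(results: list[StepResult]) -> list[str]:
    """One line per step: say what succeeded, and be concrete about
    what did not and why, instead of raising a single error."""
    lines = []
    for r in results:
        if r.succeeded:
            lines.append(f"{r.name} booked.")
        else:
            reason = r.reason or "unknown reason"
            lines.append(f"{r.name} needs your attention: {reason}.")
    return lines


trip = [
    StepResult("Flight", True),
    StepResult("Hotel", True),
    StepResult("Rental car", False, "no availability"),
]
messages = report(trip)
```

Because each step carries its own result, a failure in one external service does not hide what the other steps achieved.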

But isn’t this just good old-fashioned UX?

At first glance, this might look like an ordinary UX question. That’s partly true. But AI-based services are met with a degree of skepticism from the start, often before users have even tried them. That means transparency, feedback, and predictability aren’t “nice to have” in these systems — they decide whether the service gets a chance at all.

The more deeply AI is integrated into people’s lives, the more important intent and integrity become. It’s not enough for the system to be transparent when it makes a decision. Users also need to be able to trust why the decision was made.

In closing

Plexiglass isn’t transparent everywhere. It shows what needs to be shown, when it needs to be shown. It isn’t a panoramic window, but it is an honest window into the process.

For those of us building AI services, it comes down to designing the interfaces, warning signals, and checkpoints that let a user actually trust the result — with the right kind of trust, for the right kind of task. And that’s what makes it possible to deliver something useful going forward.
