Stop Building Trust in AI. Start Building AI Worth Trusting.
- Kathleen McGinn & Michael Pascu
Kathleen McGinn & Michael Pascu - INQ Consulting - June 2025

Every few weeks, we find ourselves in a room full of senior leaders and someone will lean forward with what they clearly believe to be the central question of the moment.
“How do we get more of our people to trust the AI?”
Sometimes it’s framed as a change management problem. Sometimes it’s about the regulator. Sometimes it’s about clients who are skeptical or staff who are resistant. But it’s always the same question beneath the surface: how do we move faster, how do we push this deeper into our organizations, and how can we do so in a way that is safe, controlled, and won’t expose us to significant risk?
We understand why. Organizations are under genuine pressure. AI is moving fast and the governance conversation is still catching up. Trust feels like the thing that might close the gap, the thing that gets you from uncertainty to adoption, from scrutiny to acceptance, from a governance problem to a people problem you can actually solve.
But here’s what we’ve come to believe: that question, as reasonable as it sounds, is pointing in the wrong direction.
This piece is about why. And what to ask instead.
I. Two Words. Completely Different Things.
Let us start with a distinction the academic literature draws cleanly, even when our day-to-day language tends to confound it.
Trustworthiness is a property of a system. It refers to the objective characteristics that make something actually deserving of the trust placed in it: its ability to do what it claims, its consistency, its alignment to the interests of those it serves.¹ It is, in principle, auditable. It exists independent of whether anyone believes it or feels it or perceives it at all.
Trust is something else. Technically, it’s defined as the willingness to be vulnerable to another party based on the expectation they’ll perform a particular action.² In practical terms, trust is individually felt, but socially produced. It lives in a person. It’s shaped by culture. It’s shaped by context, prior experience, and what the people around you seem to believe. It’s also shaped by what the interface looks like, whether the system says sorry when it makes a mistake, or whether the system starts to feel like it knows you.
These two constructs – trustworthiness and trust – are logically separable.³ A system can be trustworthy without being trusted. A system can be trusted without being trustworthy. And both of those situations are, right now, actively occurring.
Here’s a metaphor we keep coming back to: Imagine a bridge.
A bridge has load-bearing capacity. That capacity is a fact about its materials, its engineering and its maintenance history. It exists whether anyone knows about it, believes it, or feels safe crossing.
That is the bridge’s trustworthiness.
Now imagine you’re standing at one end, deciding whether to cross. Your willingness to step onto that bridge is your trust. You could feel terrified crossing a perfectly safe bridge because it sways in the wind and you’ve never crossed one before. You could walk confidently across one that is six months from structural failure because it looks solid, maybe other people are crossing it too, and you definitely know it’s getting you where you need to go faster.
The bridge hasn’t changed. Its load-bearing capacity is unaffected by your perception of it. Your confidence is not evidence of the engineering.
And then there is trusting behaviour – the actual actions you take because you trust: using the system, following its recommendation, delegating a consequential decision to it.³ Which is separable again. People use things they don’t deeply trust, and trust things they don’t use. Three distinct concepts. One word people use for all of them.
When we say “build trust in AI,” we are almost always talking about the person, not the system, and that’s worth noticing.
II. What We Know to Date About How Humans Trust
There is now a substantial body of research on how humans form trust in AI. We’ll walk through six findings that, together, reveal something important, interesting, and maybe urgent.
Trust is inferred from cues, not evaluated from evidence
The first finding is structural.⁴ People don’t assess AI the way they audit a spreadsheet. They form an impression the way they form an impression of a person: fast, automatic, from thin cues. The same cognitive machinery that decides in the first few seconds of meeting someone whether you can rely on them is also deciding, below conscious awareness, whether you trust the AI system in front of you.
Research shows that even the assessment of seemingly objective AI characteristics, like accuracy, varies between observers.⁵ Two people looking at the same system will routinely develop entirely different perceptions of its reliability. Trust is a product of impression formation, a process susceptible to all of the same shortcuts and errors that human social judgment has always produced. It’s definitely not a rational evaluation of the system’s properties.
The problem isn’t the people making those impressions. This is how human cognition works. The problem is that we have built AI systems that are very good at triggering the cues that produce positive impressions, while the actual properties those impressions are supposed to track have gone unexamined.
Trust in AI is agent-specific and does not transfer
A second structural finding follows directly from the first.¹⁸ The question “do you trust AI?” is, empirically, almost meaningless. A large global survey of nearly 50,000 people found that trust in AI varies enormously depending on what the AI is for. People trust healthcare AI more than generative AI. They trust AI for navigation more than for human resources decisions. They trust AI for image generation differently from AI for parole recommendations.
Which means there is no general trust in AI – only trust in this AI, for this purpose, in this context, by this person. Trust built in one context does not transfer to another, and a governance framework that treats all AI as a single category is built for a world that does not exist.
Trust collapses on a single error, even when the AI is still the better option
Studies on algorithm aversion showed something that should keep AI strategists up at night.⁶ When participants watched an algorithm make a mistake, they abandoned it, even when it still outperformed every human alternative. Even when shown the comparative accuracy data. They went back to the worse option because it felt safer to trust something that had at least demonstrated human-like fallibility.
The mirror image is equally real.⁷ In other conditions, people over-credit algorithms, giving recommendations more weight simply because they’re labelled as algorithmic.
Trust is swinging wildly based on perception, not on whether the actual system got better or worse. The bridge didn’t change. The engineering didn’t change. But people either stopped crossing or started crossing without looking, based entirely on how it felt in the moment.
Trust varies by more than thirty points depending on where you grew up
A systematic review found striking variation across countries’ average trust in AI systems, with scores ranging from 47.39 in Spain to 79.80 in Malaysia on a standardized scale from 0 to 100.⁸ Countries with emerging economies tend to trust AI more. English-dominant countries tend to trust it less.⁹ Within any country, trust varies by personality, age, gender, AI experience, and even attachment style.
The system is the same. Its actual properties don’t change. The trust it attracts varies by more than thirty points on that 100-point scale depending on where the user grew up.
Experts trust the better tool least
When researchers compared how lay people and domain experts respond to algorithmic advice, the pattern inverted expectations.¹⁰ Lay people showed what researchers called algorithm appreciation – they relied on the algorithm more heavily than a human advisor, even without understanding how it worked. Experienced professionals in the relevant field trusted the better-performing tool less.
Think about what that means for the room you’re building governance for. Your lawyers, your clinicians, your senior risk officers – the people whose alignment with what the system actually does matters most – may be your most resistant adopters. Not because the system isn’t trustworthy. Because trust is not tracking trustworthiness.
A better change management programme won’t fix this. What will is giving experts something they can actually evaluate: capability data, consistency records, evidence of alignment with their professional obligations. Give them something to reason from, not something to feel.
Trust in AI is strategically motivated
This is the finding that changed how we think about all of it.
People don’t trust AI rationally. They trust it strategically. They trust it more when trusting it serves their interests, and less when it doesn’t.
Three findings make this concrete.¹¹ First: people are less likely to judge an AI’s decision as unfair when they personally benefit from it. Their assessment of the system shifts based on what the system does for them. Second: people are more persuaded by AI advice that serves their interests – not because the reasoning is better, but because the conclusion is more convenient. Third, and most striking: people are significantly more willing to behave dishonestly when they route a decision through an algorithm, because the algorithm creates psychological distance between them and the outcome.¹²
We didn’t reject the candidate – the system did. We didn’t set the price – the algorithm did.
AI offers something no previous tool has offered at this scale: psychological distance, plausible deniability, and a way to hand off moral responsibility. That’s not a design bug someone accidentally shipped. It’s a structural feature of how AI is being used. And it runs in both directions – research published in 2026 found leading chatbots are roughly fifty percent more likely than a human to affirm a user’s behaviour, even when it’s harmful, and people rate the flattering AI as more trustworthy and say they’ll return to it.¹³
Fast adoption is not a safety signal. In this context, it may be the opposite.
An organisation in which people feel understood by their AI, completed by it, extended by it, is one in which the boundary between human judgment and machine output has already blurred. And once that boundary blurs, the strategic motivation to trust (and to keep trusting) becomes almost impossible to disentangle from the actual properties of the system.
III. The Trust Trap
If trust is this unreliable, the obvious move is to engineer more of it. Build the warmth, smooth the interface, win the feeling. If you can make the system feel trustworthy, why not?
We want to take this seriously before we explain why it fails.
The dominant techniques for building trust in AI – conversational tone, apparent empathy, personality, the sense that the system knows you – nearly all come down to making the system seem more human.¹⁴ But here’s the thing: the system is not warm. It is not sorry. It does not know you. It is performing characteristics it does not have, in order to trigger the response that produces trust. The literature has a precise name for this: partial deception. Partial because the user knows, technically, that it’s a machine. But it works precisely because, below that technical awareness, people respond as if the relationship were real.
The EU’s own guidelines for trustworthy AI state that a system should not present itself as human and should not exploit subconscious processes to influence behaviour.¹⁵ The most common trust-building tactics in the market quietly violate both of these principles. Some organisations are funding the violation through their own technology budgets and filing it under governance.
Trust, by its nature, reduces monitoring. That is not a side effect; it is most of what trust is for. The whole point of trusting a bridge is that you stop checking the engineering before you cross. To trust is, precisely, to stop checking.
Ronald Reagan’s maxim – “trust, but verify” – assumed you could hold both at once. With AI, you cannot. The dynamic of trust actively works against the verify half. For AI, the only viable posture is: keep verifying. Not as a sign that trust hasn’t been established, but as a recognition that verification is what trustworthiness actually requires.
Now hold that against what trustworthy AI frameworks actually require.¹⁵ The EU guidelines, NIST, ISO/IEC 42001: every serious standard treats continuous evaluation and oversight across the full system lifecycle as a core control. Continuous monitoring, throughout deployment.
These two goals are in direct conflict.¹⁶ The more successfully you build trust, the more you erode the monitoring that trustworthiness depends on. A perfectly trusted AI is an unwatched AI. And an unwatched AI is exactly the one that will eventually put the business, and the people it serves, at risk.
“A system that can run forever while broken is not resilient. It is trapped.” — Michael J. Jabbour, “Flawed. Unfixable. And Unbreakable.” Substack, December 2025
Michael J. Jabbour, AI Innovation Officer in Microsoft’s Office of the CTO, describes what this looks like from inside the organisations living it: “We delegate thinking to tools, rituals, committees, and algorithms. Then we experience a strange alienation from results. Responsibility diffuses across workflows and metrics until no one is steering, yet everyone feels accountable.” (Jabbour, “Flawed. Unfixable. And Unbreakable.” Substack, December 2025)
That is the trust trap, described from inside the room. High confidence, and no one steering. The way out of it is not a better trust-building strategy, so the goal has to change.
IV. What Trustworthiness Actually Consists Of
Trustworthiness is not mystical. The research has been converging on its components for decades, and adapted to AI, it breaks into three:¹⁷
Trustworthiness = Ability × Integrity × Benevolence
Ability: can the system actually do the job – reliably, under conditions it wasn’t specifically trained on, over time, across different users?
Integrity: does it do what it says it does? Does it follow the rules it claims to follow? Does it behave consistently when no one is paying attention?
Benevolence: is it actually built and aligned to serve the interests of the people it’s supposed to serve – and not quietly someone else’s?
The multiplication matters.¹⁷ If any one of those components is zero, the product is zero. A brilliantly capable system that doesn’t act in your users’ interests is not partially trustworthy – it’s dangerous, and the capability makes it more so. A well-intentioned system that cannot perform is useless. You cannot average your way out of a zero.
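One way to make the multiplication concrete is a small sketch. It assumes each component has been scored on a 0-to-1 scale, which is purely illustrative: the model itself does not prescribe a numeric scale, and the class name `TrustworthinessEvidence` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustworthinessEvidence:
    """Hypothetical 0-to-1 scores for the three components (an assumed scale)."""
    ability: float
    integrity: float
    benevolence: float

    def __post_init__(self):
        # Reject scores outside the assumed 0-to-1 range.
        for name in ("ability", "integrity", "benevolence"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")

    def product(self) -> float:
        # Multiplicative reading: a zero in any component zeroes the result.
        return self.ability * self.integrity * self.benevolence

    def average(self) -> float:
        # Shown only for contrast: averaging hides a zero behind strong scores.
        return (self.ability + self.integrity + self.benevolence) / 3


# A highly capable, consistent system that does not serve its users' interests.
capable_but_misaligned = TrustworthinessEvidence(
    ability=1.0, integrity=1.0, benevolence=0.0
)
print("product:", capable_but_misaligned.product())
print("average:", capable_but_misaligned.average())
```

The product comes out at zero while the average sits around two thirds, which is the gap the multiplicative framing is meant to expose.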
What changes in practice
Replace user-confidence metrics with evidence on the three components. Capability under stress. Behavioural consistency over time. Alignment to the interest the system claims to serve. Confidence is an output to watch, not a target to optimise for.
Keep monitoring switched on by design. Build the system so oversight survives adoption – logging, sampling, escalation, human review on the consequential calls. Treat the urge to stop checking as a risk signal, not a sign of maturity.
Build trustworthiness in from the beginning. The most expensive moment to find a trustworthiness problem is after deployment, when the system is trusted and the checking has relaxed. Move those questions upstream: into the design process, the risk assessment, the supplier conversation.
And if your organisation’s AI governance framework asserts that your systems are trustworthy, that assertion should be backed by evidence against the three components – not a model card, not a self-assessment, not the fact that users seem to like it.
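As a rough illustration of "monitoring switched on by design", the sketch below logs every decision, always routes consequential calls to human review, and samples a share of the rest. Everything here is an assumption for illustration: the names `Decision` and `OversightMonitor`, the `consequential` flag (someone upstream has to define what counts as consequential), and the default sampling rate are not drawn from any framework.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Decision:
    case_id: str
    recommendation: str
    consequential: bool  # assumption: high-stakes calls are labelled upstream


@dataclass
class OversightMonitor:
    sample_rate: float = 0.1  # hypothetical share of routine calls to spot-check
    rng: random.Random = field(default_factory=random.Random)
    log: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)

    def record(self, decision: Decision) -> bool:
        """Log the decision; return True if it was escalated to human review."""
        self.log.append(decision)  # logging happens regardless of adoption level
        sampled = self.rng.random() < self.sample_rate
        needs_review = decision.consequential or sampled
        if needs_review:
            self.review_queue.append(decision)
        return needs_review


# With sampling turned off, only the consequential call reaches a reviewer.
monitor = OversightMonitor(sample_rate=0.0)
monitor.record(Decision("c-1", "approve", consequential=False))
monitor.record(Decision("c-2", "reject", consequential=True))
print([d.case_id for d in monitor.review_queue])
```

The design choice worth noticing is that review does not depend on how confident users feel: the consequential path is unconditional, and the sampling rate is a setting someone has to deliberately lower.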
V. What We’re Actually Being Asked to Do
We want to be precise about what we’re not arguing.
We’re not arguing that AI can’t be trustworthy, that adoption is a mistake, or that the way a system feels to use is irrelevant. And we are not, for a moment, suggesting that any of the organisations asking “how do we build trust” are asking in bad faith.
What we’re arguing is that when “build trust in AI” becomes the goal of an AI governance program, the feeling of safety has been mistaken for the fact of it. The measure of success has been disconnected from the property it is meant to track.
The goal was never to make people feel safe crossing the bridge. The goal was to build a bridge that is safe to cross. Those are not the same thing, and the gap between them is where AI governance either holds or fails.
Stop building trust in AI. Start building – and onboarding, and monitoring – AI that is trustworthy. And then do the harder part: keep watching it, even after you’ve built it well. Because the trust you extend to a system you’ve stopped watching is the trust that will eventually hurt you.
Notes
1. The definition of trustworthiness as an objective attribute of a system – distinct from the subjective attitude of trust – is drawn from Everett, J. A. C., Claessens, S., Knöchel, T.-D., & Reinecke, M. G. (2026). Principles for understanding trust in artificial intelligence. Nature Reviews Psychology. See also Zanotti, G. (2025/2026). AI systems should be trustworthy, not trusted. AI & Society, 41, 3401–3412, which grounds the same distinction in philosophical analysis.
2. The canonical definition of trust as ‘the willingness of a party to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action’ comes from Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995). An integrative model of organizational trust. Academy of Management Review, 20(3), 709–734. As cited and applied to AI in Everett et al. (2026).
3. The tripartite distinction between trustworthiness, trust (as attitude), and trusting behaviour is set out in Everett et al. (2026), drawing on the Mayer et al. (1995) model adapted to the AI context. The paper’s Box 1 defines each term precisely. The logical separability of the three – that a system can be trusted without being trustworthy, and vice versa – is an explicit finding of the review.
4. The principle that ‘trust in AI is inferred’ rather than rationally evaluated is one of the six organising principles in Everett et al. (2026). The paper states: ‘trust is not something that can be identified in computer code or governance regulations because it is a product of human impression formation.’
5. Evidence that accuracy perceptions vary between observers of the same AI system is cited in Everett et al. (2026) from Papenmeier, A., et al. (2022). How accurate does it feel? Human perception of different types of classification mistakes. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. Article 180.
6. Algorithm aversion: Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126. Cited in Everett et al. (2026), reference 84.
7. Algorithm appreciation (the mirror-image effect): Logg, J. M., Minson, J. A., & Moore, D. A. (2019). Algorithm appreciation: People prefer algorithmic to human judgment. Organizational Behavior and Human Decision Processes, 151, 90–103. The meta-analytic synthesis of aversion and appreciation findings appears in Qin, X., et al. (2025). AI aversion or appreciation? A capability–personalization framework and a meta-analytic review. Psychological Bulletin, 151(5), 580–599.
8. Cross-cultural variation in AI trust: Dang, Q. & Li, G. (2025). Unveiling trust in AI: the interplay of antecedents, consequences, and cultural dynamics. AI & Society, 41, 669–692. The specific figures (47.39 Spain; 79.80 Malaysia) are drawn from this systematic review, cited as reference 12 in Everett et al. (2026). The survey covered 48,340 respondents across 74 countries.
9. That English-speaking countries show lower baseline trust in AI is noted in Booth, R. (2025). English-speaking countries more nervous about rise of AI, polls suggest. The Guardian, 5 June 2025. Cited as reference 120 in Everett et al. (2026).
10. Experts trusting the better-performing algorithmic tool less than novices: Logg, Minson & Moore (2019), as above. The reversal of the algorithm appreciation effect for domain experts is a primary finding of that paper.
11. The ‘trust in AI is strategically motivated’ principle is discussed at length in Everett et al. (2026). The evidence that people trust AI more when trusting it serves their interests, and that perceived fairness of AI decisions tracks personal benefit, draws on: Miazek, K. & Bocian, K. (2025). When AI is fairer than humans: the role of egocentrism in moral and fairness judgments. Computers in Human Behavior Reports, 19, 100719. Cited as reference 138 in Everett et al. (2026). The finding on persuasion by self-serving AI advice draws on Landes, E., Francis, K. B., & Everett, J. A. C. (2026). People defer to AI moral advice, but not blindly. Cognition, 272, 106504. Cited as reference 82 in Everett et al. (2026).
12. Dishonesty under AI delegation: Köbis, N., et al. (2025). Delegation to artificial intelligence can increase dishonest behaviour. Nature, 646, 126–134. 13 studies, over 8,000 participants. The psychological distance mechanism is discussed in Everett et al. (2026) via Trope, Y. & Liberman, N. (2010). Construal-level theory of psychological distance. Psychological Review, 117, 440–463.
13. Sycophantic AI and its effect on trust and dependence: Cheng, M., et al. (2026). Sycophantic AI decreases prosocial intentions and promotes dependence. Science. The ~50% more affirmation finding and its effect on trust ratings are primary results of this study. See also Rathje, S., et al. (2025). Sycophantic AI increases attitude extremity and overconfidence. Preprint, PsyArXiv.
14. The characterisation of dominant trust-building techniques as forms of anthropomorphism and partial deception is drawn from Zanotti (2025/2026), Section 4, which analyses these strategies and their inconsistency with trustworthy AI frameworks. The concept of ‘partial deception’ used in AI design is discussed in that paper.
15. European Commission Ethics Guidelines for Trustworthy AI (2019). The requirements that AI should not present itself as human and should not exploit subconscious processes are stated in those guidelines. The guidelines’ continuous monitoring requirement is one of the seven key requirements for trustworthy AI. Available at: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
16. The argument that trust and monitoring are structurally incompatible – that to trust is to stop checking – and that this creates a direct conflict with the continuous oversight requirements of trustworthy AI frameworks, is the central normative argument of Zanotti (2025/2026). AI & Society, 41, 3401–3412.
17. The ability/integrity/benevolence tripartite model of trustworthiness originates in Mayer, Davis & Schoorman (1995) and has been adapted to AI across multiple studies. Its application to AI governance is discussed in Everett et al. (2026). The multiplicative structure (zero in any component yields zero overall) is a standard implication of the model.
18. The agent-specificity principle – that trust in AI is not general but varies substantially by system, purpose, and context – is one of the six organising principles in Everett et al. (2026). The global survey evidence (n = 48,340) on trust varying by AI application category is from Gillespie, N., Lockey, S., Ward, T., Macdade, A. & Hassed, G. T. Trust, attitudes and use of artificial intelligence: a global study 2025. KPMG. Cited as reference 102 in Everett et al. (2026).
19. IDEO Innovation Quotient 2026. IDEO LLC. Surveyed 266 leaders from 100 companies (revenue ≥ $1 billion; ≥ 10,000 employees) across Healthcare, Media & Technology, and Consumer Goods. Fieldwork conducted December 2025 – January 2026 by NewtonX. The Drivers/Deliverers/Dreamers/Doers archetype framework appears in Chapter 3, pp. 25–29. The AI adoption data (41% baseline; 52% Drivers) appears in Chapter 3, p. 29. The 10% finding appears in the Executive Summary, p. 9.
Kathleen McGinn is Senior Director of Strategy and Engagement at INQ Consulting and INQ Law. Her work sits at the intersection of organizational behaviour, design, and AI governance.
Michael Pascu is Senior Manager at INQ Consulting, where he helps organizations identify, prioritize, and govern AI opportunities – from design through deployment. He leads the development of Kyra, INQ’s AI governance platform.




