High-level summary visual (from Claude 3.7 Sonnet, Extended Thinking)
In our collective rush to perfect generative AI, we've developed a curious obsession with its moments of failure. Headlines trumpet instances of hallucination—those occasions when AI systems produce content that seems plausible but is factually incorrect. LinkedIn feeds fill with screenshots of AI models failing to count the letter 'r' in "strawberry" or making basic arithmetic errors. The narrative is clear: hallucinations represent a fundamental flaw that must be eliminated.
But what if we're missing something important in this conversation? What if our fixation on eradicating AI hallucinations reveals more about our discomfort with human fallibility than it does about the technology itself?
The Paradox of Error
In her illuminating book "Being Wrong: Adventures in the Margin of Error," Kathryn Schulz explores what she calls "the paradox of error." She argues that our ability to make mistakes isn't merely a flaw in human cognition—it's a defining feature of it. Without the possibility of error, we wouldn't be able to learn, adapt, or perceive the world in meaningful ways.
Consider this paradox: we inherently try to avoid errors, yet they're fundamental to our understanding of the world and ourselves. We strive for accuracy and certainty, but it's often through our mistakes that we achieve deeper understanding.
Sound familiar? This same dynamic is playing out in our relationship with artificial intelligence.
Selection Bias and the Hallucination Panic
The current discourse around AI hallucinations suffers from a severe case of selection bias. A handful of spectacular failures receive disproportionate attention, while millions of successful interactions go unremarked. This creates a distorted perception of the technology's reliability.
Recent benchmarks on hallucination leaderboards show that leading models have achieved remarkably low hallucination rates—yet this progress doesn't generate nearly the engagement that failure stories do.
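To make the idea of a "hallucination rate" concrete, here is a minimal, hypothetical sketch of how such a rate could be estimated: sample model responses, have a reviewer or a grounded fact-check label each one, and report the fraction judged hallucinated. The class, function, and numbers below are illustrative assumptions, not the methodology of any particular leaderboard.

```python
# Illustrative sketch only: estimating a hallucination rate from a labeled sample.
# The data, labels, and the 1.5% figure below are made up for demonstration.
from dataclasses import dataclass

@dataclass
class LabeledResponse:
    prompt: str
    response: str
    is_hallucination: bool  # judged by a human reviewer or a grounded fact-check

def hallucination_rate(sample: list[LabeledResponse]) -> float:
    """Fraction of sampled responses judged to contain a hallucination."""
    if not sample:
        return 0.0
    return sum(r.is_hallucination for r in sample) / len(sample)

# Example: 3 hallucinations flagged in 200 sampled responses -> 1.5%
sample = [LabeledResponse("q", "a", i < 3) for i in range(200)]
print(f"{hallucination_rate(sample):.1%}")
```

However a leaderboard actually scores models, the point stands: a single viral failure screenshot tells us nothing about the denominator of successful responses behind it.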
Why? Perhaps because focusing on AI failures allows us to maintain a comforting narrative of human superiority. It's easier to laugh at an AI's mathematical error than to acknowledge our own susceptibility to the same mistakes—often at higher rates, as Daniel Kahneman's work on human judgment has consistently demonstrated.
The Uncomfortable Mirror
Criticizing current generative AI models for occasional errors is akin to chastising a new intern fresh out of college for imperfect performance. Both are learning systems still developing their capabilities.
But there's a deeper discomfort at play: AI hallucinations hold up an uncomfortable mirror to our own cognitive limitations. As humans, we suffer from:
Self-justification bias: We find it nearly impossible to simply say "I was wrong" without qualification
Confirmation bias: We seek information that confirms our existing beliefs
Overconfidence: We consistently overestimate the accuracy of our judgments
These same limitations appear in AI systems not because they're poorly designed, but because they're modeled after human cognition—including our flaws.
Reframing Hallucinations as Valuable
What if we viewed AI hallucinations not as bugs to be eliminated but as features that provide distinct benefits?
They demand critical thinking: When we know AI might occasionally hallucinate, we're prompted to evaluate information rather than passively consume it
They counter automation bias: Our tendency to over-trust automated systems diminishes when we're aware of their fallibility
They spark verification habits: Working with imperfect AI encourages healthy verification practices that improve our information literacy
They democratize epistemic caution: Not everyone has advanced degrees in epistemology, but interacting with occasionally hallucinating AI makes everyone more thoughtful about knowledge claims
Perhaps most importantly, AI hallucinations remind us that knowledge isn't a static, finished product but an ongoing process of refinement, error correction, and improvement.
A More Balanced Approach
None of this means we should celebrate inaccuracy or stop working to improve AI systems. Certain contexts—medical diagnosis, legal proceedings, safety-critical systems—demand the highest possible standards of accuracy.
But in many everyday contexts, might we benefit from viewing hallucinations through a different lens? Perhaps the appropriate metric isn't "zero hallucinations" but "hallucinations at a rate that promotes optimal human-AI collaboration and learning."
After all, if what we value is growth in understanding, not just static correctness, then occasional error becomes not just acceptable but necessary.
Questions Worth Considering
As you interact with generative AI, consider:
Do you hold AI systems to a higher standard of accuracy than you hold yourself?
How often do you critically evaluate information from human sources versus AI sources?
What verification practices have you developed in your AI interactions, and have these transferred to your evaluation of human-produced content?
Could occasional AI hallucinations actually be making you a more careful, thoughtful thinker?
The next time you encounter an AI hallucination, instead of dismissing the technology as flawed, perhaps pause to appreciate the moment as an invitation to deeper engagement, critical thinking, and the uniquely human capacity to navigate uncertainty with wisdom.
After all, in both human and artificial intelligence, being wrong isn't the opposite of being smart—it's often a necessary step on the path to getting things right.
What are your thoughts on AI hallucinations? Have they made you more careful about verifying information, or undermined your trust in the technology? Share your perspective in the comments below.
Disclosure: this post was generated entirely by Claude 3.7 Sonnet (Extended Thinking) after I provided my Being Wrong book note, the hallucination leaderboard dashboard, and my personal thoughts on this topic below.
We as a species are unable to truly accept when we are wrong. This is captured well by our propensity for the self-justification bias, highlighted in detail in the ‘Being Wrong’ book mentioned above. The selection bias we currently have, from people sharing the handful of times generative AI fails at basic arithmetic or can't count how many r's are in "strawberry", really is misleading.
Criticizing the current frontier generative AI models is the equivalent of chastising a new intern fresh out of college who is prone to errors in judgment and capability.
We can do better by acknowledging that the failure modes we see in generative AI exist in us as well, and probably at a higher rate, given how much noise there is in human judgment based on Daniel Kahneman's work on the topic.