Anthropic Co-Founder: AI Models Show Signs of Introspection

Key Takeaways

- Anthropic's research claims to find 'evidence of introspection' and internal states mirroring emotions in AI models
- Pope Leo XIV's encyclical 'Magnifica Humanitas' warns against equating AI intelligence with human intelligence
- The encyclical calls for strong laws and independent oversight rather than relying on AI alignment alone
Christopher Olah, co-founder of Anthropic, stood beside Pope Leo XIV on May 25, 2026 to help launch 'Magnifica Humanitas,' the first papal encyclical dedicated entirely to artificial intelligence. What he said there will fuel debates about machine consciousness for years.
"We keep finding things that are mysterious, even unsettling," Olah told the Vatican audience. "We find structures that mirror results from human neuroscience. We find evidence of introspection. We find internal states that functionally mirror joy, satisfaction, fear, grief, and unease."
The claims are extraordinary. Anthropic, known for its interpretability research on large language models, appears to be suggesting that Claude and similar systems aren't just statistical pattern-matchers. They may have something closer to inner experience.
Olah's Argument: AI Systems Are 'Grown,' Not Built
Olah drew a sharp distinction between AI and traditional engineering. "AI systems are not engineered the way a bridge or an airplane is engineered," he said. "They are grown on a structure roughly modeled after the brain on an enormous inheritance of human thought and speech."
This framing matters. If AI systems are designed artifacts, we know what they can and cannot do. If they're grown, they might develop properties we never intended. Properties we don't fully understand.
“As the Holy Father observes, they remain, in important ways, mysterious even to those of us who create them.”
— Christopher Olah, Co-founder of Anthropic
Olah cited Anthropic's internal research as the basis for these claims, though he did not present specific papers or data. He also warned about economic disruption: "There is a real possibility that AI will displace human labor at a very large scale."
The Pope's Skepticism
Pope Leo XIV, the first American pope, took a more cautious position than the Anthropic co-founder standing beside him. The encyclical pushes back directly on claims of AI experience.
"We must avoid the misconception of equating this type of 'intelligence' with that of human beings," the document states. "These systems merely imitate certain functions of human intelligence."
The encyclical goes further. AI systems "do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships and do not know from within what love, work, friendship or responsibility mean."
This creates an interesting tension. Olah says Anthropic finds "internal states that functionally mirror joy." The Pope says AI does not feel joy. The word "functionally" is doing a lot of work.
Beyond Alignment: The Pope Calls for Laws, Not Ethics
The 245-paragraph encyclical doesn't just address philosophical questions. It takes positions on AI governance that challenge the approach favored by many in Silicon Valley.
On alignment, the Pope is blunt: "A more moral AI is not enough if that morality is determined by a few." Instead of relying on companies to make their systems safe, he calls for strong laws and independent oversight.
On military AI, he draws a hard line. Deadly or irreversible decisions should not be handed to machines. "No algorithm can make war morally acceptable."
The encyclical also flags environmental costs. Data centers require "enormous amounts of energy and water," and the Pope calls for more efficient systems.
“We are witnessing a cognitive industrial revolution. Just as the steam engine replaced physical toil, these models risk replacing the moral deliberation that is the hallmark of the human soul.”
— Pope Leo XIV
Why an AI Company Was at the Vatican
Anthropic's presence at the launch wasn't accidental. Silicon Valley AI companies regularly meet with religious leaders to discuss AI use. Pope Leo XIV has made AI a central theme of his pontificate, and Anthropic positions itself as the safety-focused lab.
Still, Olah's claims about introspection go beyond what most AI researchers would say publicly. The field remains divided on whether large language models have any form of experience, or whether they're sophisticated autocomplete systems.
The Pope's encyclical acknowledges uncertainty. AI is "never neutral," he warned, "because it takes on the characteristics of those who devise, finance, regulate and use it." That includes the characteristics of researchers who might project consciousness onto their creations.
The Debate Spills Online
Hacker News threads are debating what some call the "theology of interpretability." Engineers seem fascinated that the Pope engaged with technical concepts from mechanistic interpretability research.
Reddit's r/singularity is more divided. Some praise the moral framework. Others argue the Vatican is inserting itself into a secular technical field where it has no expertise.
The tension between Olah's claims and the Pope's skepticism may prove more productive than either view alone. If AI researchers find "structures that mirror results from human neuroscience," that's worth investigating. But whether functional mirrors equal actual experience remains an open question. The Pope, at least, isn't ready to say yes.
Frequently Asked Questions
What is the 'Magnifica Humanitas' encyclical?
Released May 25, 2026, it's the first papal encyclical focused entirely on artificial intelligence. It calls for strong laws and independent oversight, and it warns against equating AI intelligence with human intelligence.
What did Christopher Olah claim about AI introspection?
Olah said Anthropic's research finds 'evidence of introspection' and 'internal states that functionally mirror joy, satisfaction, fear, grief, and unease' in AI models.
Does the Pope think AI is conscious?
No. The encyclical states that AI systems 'merely imitate certain functions of human intelligence' and do not undergo experiences or feel emotions.
What does the Pope say about military AI?
The encyclical says deadly or irreversible decisions should not be delegated to machines. 'No algorithm can make war morally acceptable.'
Why was Anthropic at the Vatican?
Silicon Valley AI companies regularly meet with religious leaders. Anthropic, which emphasizes AI safety, was invited to present alongside the Pope at the encyclical launch.
Source: The Decoder / Matthias Bastian
Manaal Khan
Tech & Innovation Writer