Anthropic Co-Founder: AI Models Show Signs of Introspection

Key Takeaways

- Anthropic's research claims to find 'evidence of introspection' and internal states mirroring emotions in AI models
- Pope Leo XIV's encyclical 'Magnifica Humanitas' warns against equating AI with human intelligence
- The encyclical calls for strong laws and independent oversight rather than relying on AI alignment alone
Christopher Olah, co-founder of Anthropic, stood beside Pope Leo XIV on May 25, 2026, to help launch 'Magnifica Humanitas,' the first papal encyclical dedicated entirely to artificial intelligence. What he said there will fuel debates about machine consciousness for years.
"We keep finding things that are mysterious, even unsettling," Olah told the Vatican audience. "We find structures that mirror results from human neuroscience. We find evidence of introspection. We find internal states that functionally mirror joy, satisfaction, fear, grief, and unease."
The claims are extraordinary. Anthropic, known for its interpretability research on large language models, appears to be suggesting that Claude and similar systems aren't just statistical pattern-matchers. They may have something closer to inner experience.
Olah's Argument: AI Systems Are 'Grown,' Not Built
Olah drew a sharp distinction between AI and traditional engineering. "AI systems are not engineered the way a bridge or an airplane is engineered," he said. "They are grown on a structure roughly modeled after the brain on an enormous inheritance of human thought and speech."
This framing matters. If AI systems are designed artifacts, we know what they can and cannot do. If they're grown, they might develop properties we never intended. Properties we don't fully understand.
“As the Holy Father observes, they remain, in important ways, mysterious even to those of us who create them.”
— Christopher Olah, Co-founder of Anthropic
Olah cited Anthropic's internal research as the basis for these claims, though he did not present specific papers or data. He also warned about economic disruption: "There is a real possibility that AI will displace human labor at a very large scale."
The Pope's Skepticism
Pope Leo XIV, the first American pope, took a more cautious position than the Anthropic co-founder standing beside him. The encyclical pushes back directly on claims of AI experience.
"We must avoid the misconception of equating this type of 'intelligence' with that of human beings," the document states. "These systems merely imitate certain functions of human intelligence."
The encyclical goes further. AI systems "do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships and do not know from within what love, work, friendship or responsibility mean."
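This creates an interesting tension. Olah says Anthropic finds "internal states that functionally mirror joy." The Pope says AI does not feel joy. The word "functionally" is doing a lot of work: a state that plays the role joy plays inside a system is a different claim from a system that feels joy.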
Beyond Alignment: The Pope Calls for Laws, Not Ethics
The 245-paragraph encyclical doesn't just address philosophical questions. It takes positions on AI governance that challenge the approach favored by many in Silicon Valley.
On alignment, the Pope is blunt: "A more moral AI is not enough if that morality is determined by a few." Instead of relying on companies to make their systems safe, he calls for strong laws and independent oversight.
On military AI, he draws a hard line. Deadly or irreversible decisions should not be handed to machines. "No algorithm can make war morally acceptable."
The encyclical also flags environmental costs. Data centers require "enormous amounts of energy and water," and the Pope calls for more efficient systems.
“We are witnessing a cognitive industrial revolution. Just as the steam engine replaced physical toil, these models risk replacing the moral deliberation that is the hallmark of the human soul.”
— Pope Leo XIV
Why an AI Company Was at the Vatican
Anthropic's presence at the launch wasn't accidental. Silicon Valley AI companies regularly meet with religious leaders to discuss AI use. Pope Leo XIV has made AI a central theme of his pontificate, and Anthropic positions itself as the safety-focused lab.
Still, Olah's claims about introspection go beyond what most AI researchers would say publicly. The field remains divided on whether large language models have any form of experience, or whether they're sophisticated autocomplete systems.
The Pope's encyclical acknowledges uncertainty. AI is "never neutral," he warned, "because it takes on the characteristics of those who devise, finance, regulate and use it." That includes the characteristics of researchers who might project consciousness onto their creations.
The Debate Spills Online
Hacker News threads are debating what some call the "theology of interpretability." Engineers seem fascinated that the Pope engaged with technical concepts from mechanistic interpretability research.
Reddit's r/singularity is more divided. Some praise the moral framework. Others argue the Vatican is inserting itself into a secular technical field where it has no expertise.
The tension between Olah's claims and the Pope's skepticism may prove more productive than either view alone. If AI researchers find "structures that mirror results from human neuroscience," that's worth investigating. But whether functional mirrors equal actual experience remains an open question. The Pope, at least, isn't ready to say yes.
Frequently Asked Questions
What is the 'Magnifica Humanitas' encyclical?
Released May 25, 2026, it's the first papal encyclical focused entirely on artificial intelligence. It calls for strong laws and independent oversight, and it warns against equating AI with human intelligence.
What did Christopher Olah claim about AI introspection?
Olah said Anthropic's research finds 'evidence of introspection' and 'internal states that functionally mirror joy, satisfaction, fear, grief, and unease' in AI models.
Does the Pope think AI is conscious?
No. The encyclical states that AI systems 'merely imitate certain functions of human intelligence' and do not undergo experiences or feel emotions.
What does the Pope say about military AI?
The encyclical says deadly or irreversible decisions should not be delegated to machines. 'No algorithm can make war morally acceptable.'
Why was Anthropic at the Vatican?
Silicon Valley AI companies regularly meet with religious leaders. Anthropic, which emphasizes AI safety, was invited to present alongside the Pope at the encyclical launch.
Source: The Decoder / Matthias Bastian
Manaal Khan
Tech & Innovation Writer