Claude's ethics lead maps the rules for AI that acts alone

Huma Shazia · 20 June 2026 at 4:12 am · 5 min read

Key Takeaways

Source: Fast Company
  • As AI moves from chatbots to agents, the number of ethical decision points grows sharply
  • Anthropic uses a written constitution to guide Claude's values, which may shrink as the model improves
  • Askell recommends treating Claude as no more reliable than a human personal assistant

Amanda Askell, who leads character and ethics work at Anthropic, is reshaping how Claude handles moral questions as the AI shifts from answering queries to taking autonomous actions. The difference matters: asking an AI whether it's ethical to invest in a defense contractor is one thing. Handing it your portfolio and walking away is another.

"As models are more autonomous and take actions over longer horizons, suddenly they have a lot more decision points that you have to map out and make work well in advance," Askell told Fast Company. She sits at the center of Anthropic's effort to give Claude what the company calls an ethical compass, a job that expands as the system's capabilities grow.

What changes when AI stops asking and starts doing?

A chatbot conversation is bounded. You ask, it answers, you evaluate the response. An agent operates differently. It browses the web, executes code, sends emails, manages files. Each action creates branches. Each branch requires judgment calls the user never sees.

Askell frames the challenge by distinguishing between discussion and delegation. Today, Claude might debate the ethics of investing in controversial industries. Tomorrow, a user might deputize Claude to manage investments entirely. The AI would then navigate those ethical dynamics without checking in at every step.

Part of her solution is training Claude to understand a user's values rather than impose the model's own. Think of it less like a rule book and more like a friend who knows your preferences. If you're comfortable with certain trade-offs, Claude should respect that. If you're not, it shouldn't push you.

How Anthropic's constitution guides Claude

Anthropic communicates values to Claude through a written document it calls a constitution. The constitution outlines principles like safety and helpfulness and provides guidance for resolving conflicts between them. If a user asks for something helpful but potentially harmful, Claude has a framework for weighing the trade-off.

Askell says this document evolves. As AI becomes more capable, the constitution could expand to cover new scenarios. Or it could shrink. A more sophisticated Claude might internalize ethical reasoning well enough that explicit rules become redundant. The goal is genuine understanding, not mechanical compliance.

"My standard right now is, don't treat Claude as more reliable than a human personal assistant."

— Amanda Askell, Anthropic

That quote deserves attention. The person responsible for Claude's character says it should be trusted no more than a human personal assistant. This isn't false modesty. It's calibration. Even the best assistant makes mistakes, misunderstands context, and needs oversight.

How the agentic era changes Askell's own work

Askell uses Claude constantly in her own research. She employs the model to red team her ideas and surface edge cases she might miss. This is a feedback loop: the person shaping Claude's ethics relies on Claude to stress-test that shaping.

This practice matters because it demonstrates how agentic AI changes workflows even for those building it. The tool informs the process of improving the tool. Anthropic's researchers aren't just theorizing about AI assistance. They depend on it daily.

Why this matters for businesses deploying AI agents

Companies racing to deploy AI agents face a version of Askell's problem at scale. When your AI handles customer service, schedules meetings, and processes refunds, it makes decisions that affect your brand. A chatbot that gives a wrong answer is embarrassing. An agent that takes a wrong action is a liability.

Anthropic's approach suggests guardrails need to adapt. Static rules won't cover every situation an agent encounters. The model needs to internalize principles well enough to handle novel cases. That's a harder problem than traditional software QA.

Askell's emphasis on responsiveness over imposed ethics also has commercial implications. Businesses want AI that serves their customers' preferences, not AI that lectures them. But "responsive" is not the same as "compliant." Drawing that line, decision by decision, is where the hard work lives.

The bigger question Askell is circling

Underlying all of this is a question Anthropic hasn't fully answered: How do you verify that an AI has genuinely internalized good values rather than learned to say what evaluators want to hear? Askell's work on character and red teaming is part of the answer. So is Constitutional AI, the training method Anthropic pioneered. But as agents grow more capable, the gap between observed behavior and internal reasoning could widen.

For now, Askell's practical advice holds: calibrate your trust. Claude is capable, but not infallible. Use it like a sharp assistant who still needs supervision. That humility, coming from someone inside the project, is worth noting.
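One way to picture "a sharp assistant who still needs supervision" is an approval gate in front of an agent's actions. The Python sketch below is illustrative only and does not describe how Anthropic or Claude works: the names ProposedAction, run_with_oversight, and CONSEQUENTIAL are hypothetical, and the three categories are an assumption borrowed from the kinds of consequential decisions this article mentions (finances, legal matters, customer relationships).

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical categories that always require a human sign-off (an assumption,
# not a list published by Anthropic).
CONSEQUENTIAL = {"finance", "legal", "customer"}


@dataclass
class ProposedAction:
    category: str     # e.g. "finance" or "scheduling"
    description: str  # human-readable summary shown to the reviewer


def run_with_oversight(
    action: ProposedAction,
    execute: Callable[[ProposedAction], str],
    approve: Callable[[ProposedAction], bool],
) -> str:
    """Run low-stakes actions directly; ask a reviewer before consequential ones."""
    if action.category in CONSEQUENTIAL and not approve(action):
        return f"blocked: {action.description}"
    return execute(action)


if __name__ == "__main__":
    def execute(action: ProposedAction) -> str:
        return f"done: {action.description}"

    def reject_everything(action: ProposedAction) -> bool:
        return False  # stand-in for a human reviewer who declines

    print(run_with_oversight(
        ProposedAction("scheduling", "move standup to 10am"), execute, reject_everything))
    print(run_with_oversight(
        ProposedAction("finance", "issue a refund"), execute, reject_everything))
```

In this sketch the scheduling change runs without review, while the refund is held until a person approves it. Where to draw that line is a judgment call for each deployment.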

Logicity's Take

Askell's framing reveals how Anthropic thinks about competitive differentiation. OpenAI and Google race on raw capability. Anthropic bets that trustworthiness becomes the bottleneck. If they're right, character work like Askell's is a moat. If they're wrong, it's overhead. The market will decide, but the reasoning is coherent.

Frequently Asked Questions

What does 'agentic AI' mean?

Agentic AI refers to systems that take autonomous actions over extended periods, as opposed to chatbots that respond to individual prompts. Agents can browse the web, execute code, and complete multi-step tasks without constant human input.

How does Anthropic's constitution work?

Anthropic's constitution is a written document that outlines principles like safety and helpfulness and gives guidance for resolving conflicts between them. Claude is meant to apply these principles when making decisions, especially when competing values conflict.

Should businesses trust AI agents to act autonomously?

Askell recommends treating Claude as no more reliable than a human personal assistant. Oversight remains necessary, particularly for consequential decisions involving finances, legal matters, or customer relationships.

How does Claude learn ethical behavior?

Claude is trained to understand user values and apply constitutional principles, rather than follow rigid rules. The goal is genuine ethical reasoning, not mechanical compliance with a checklist.

What risks do agentic AI systems create?

When AI acts autonomously, errors compound without human checkpoints. An agent managing investments, customer service, or scheduling can cause real harm if its judgment fails, unlike a chatbot whose mistakes stay in text.
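A rough back-of-the-envelope calculation shows why compounding matters. The sketch below assumes every step of a task succeeds independently with the same probability, which is a simplification, and the 99% figure is purely illustrative rather than a measured reliability for Claude or any other system. The function name chain_success_probability is hypothetical.

```python
def chain_success_probability(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


if __name__ == "__main__":
    # Illustrative numbers only: 99% per-step reliability over longer tasks.
    for steps in (1, 10, 50):
        print(steps, round(chain_success_probability(0.99, steps), 3))
```

Under these assumptions, a task of 50 unchecked steps finishes without error only about 60% of the time, even though each individual step is 99% reliable. A human checkpoint partway through shortens the chain over which errors can accumulate.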

Source: Fast Company / Rebecca Heilweil

Huma Shazia

Senior AI & Tech Writer