Anthropic’s revamp of its constitution document for Claude is an attempt to solidify its position as a safety-first, responsible AI model maker, and a move that reflects the continued value enterprises place on model transparency and openness.
The generative AI model maker on Jan. 21 introduced a new Claude Constitution, which differs from the original Constitutional AI document it released in 2023. The original constitution gave the Claude foundation model family an extensive set of rules to follow.
The revamped constitution provides general principles, a focus on reasoning, and a four-tier priority system that establishes a hierarchy of safety, ethics, compliance, and helpfulness. The document gives Claude reasons for following certain rules and hints that the models might possess some form of consciousness.
The Claude Constitution underscores that, while much remains unknown about how AI models work, enterprises are right to assume that each model has a bias shaped by its training and the principles that guide it.
How Versus What
With the Claude Constitution, Anthropic aims to provide greater transparency, giving enterprises confidence that the vendor continues to care about keeping its models in bounds. That stands in contrast to some model providers, notably Elon Musk’s xAI, which haven’t prevented their models from doing inappropriate things, such as undressing images of women.
“[Anthropic] does seem generally interested in delivering AI with a set of principles,” said Bradley Shimmin, an analyst at Futurum Group. “That is something that companies can put some semblance of trust in building out their software.”
The changes Anthropic made to its new constitution are designed to give Claude a reason to act in a certain way, rather than just telling it what to do, said Arun Chandrasekaran, an analyst at Gartner.
“The goal is to help the model exercise good judgment across new and unforeseen situations by applying broad principles rather than following specific rules,” he said.
The emphasis that Anthropic places on teaching the models to reason about principles could mean more “reliable behavior in edge cases,” Chandrasekaran added, referring to rare and extreme situations where a model’s output is unpredictable, such as when the model is used in a new application it hasn’t been trained on.
“This is important for enterprise deployments where unexpected scenarios are inevitable,” he said. An unexpected scenario could be as simple as applying the technology to a use case no one had thought of before.
“What we’re talking about here is something that’s more akin to philosophy and ethics and less of a strictly engineering-oriented approach to AI, and with alignment and trust with these models,” Shimmin said. He added that the emphasis on trust is tied to the idea that the models could possess consciousness or reason in ways similar to humans.
The Value of Transparency
The new constitution also shows the continued emphasis enterprises place on transparency in model training. Anthropic is not the only AI model provider trying to cater to this enterprise need.
Open source model vendors such as IBM, Nvidia, Meta and AI2 aim to be transparent about their models by providing training data and recipes.
“This idea of thinking about transparency and alignment and ethics is critical,” Shimmin said. He added that even enterprises are grappling with these concepts as they design around their own data. However, enterprises should not treat the guidance and principles Anthropic provides as a guarantee that the model will never go astray. Regardless of a model’s principles, there is still a need for domain expertise, Shimmin said.
Moreover, Anthropic’s principles might also limit creative freedom, leaving enterprises feeling stuck with Claude’s perspective, Chandrasekaran said.