Anthropic Releases Claude AI: A Leap in Constitutional AI
Ian Leissner
March 8, 2024
On March 4th, 2024, Anthropic announced the release of its newest and most powerful LLM yet, Claude 3. This new model accepts multimodal inputs such as text, images, and entire documents.
Backed by investments totaling 4 billion dollars from Amazon and the success of its previous models, Claude 2 and Claude 2.1, Anthropic has already proven it can compete directly with OpenAI’s GPT-4 and Google’s Gemini 1.0 models.
Claude 3 pushes past the limits of what Anthropic has previously built. With a context window of 200k tokens, Claude 3 could read a book and summarize it, or write an entire screenplay twice as fast as before.
Model Family
Anthropic plans to offer three different models: Haiku, Sonnet, and Opus. These models will be available to developers through Anthropic’s API and to consumers through the Claude Pro subscription, at various price points.
Opus is the most powerful of the three models being made available. Anthropic claims that Opus outperforms GPT-4 and Gemini Ultra in undergraduate knowledge, graduate reasoning, grade school math, and knowledge Q&A. The second most powerful model is Sonnet, which offers a lower-cost option than Opus. Haiku has yet to be fully released to the public, but Anthropic claims it will be available soon.
The models are aimed at different audiences and have been tuned for the products and services in which Anthropic anticipates Claude 3 will be used. Anthropic suggests that Opus be used for task automation and R&D, while the smaller Sonnet could be used for product recommendation and data processing. Once Haiku is released, we could see it being used for content moderation.
Source: Geek Gadgets
What Sets Claude 3 Apart From Its Predecessors?
Claude 3 launches with a context window of 200k tokens, much larger than what most previous generations of LLMs offered. Anthropic has announced on its website that, for select users, this limit can be extended to 1 million tokens.
A larger context window will allow anyone to build new tools and applications that create more value for customers. Let’s look at the case of R&D and how research teams can benefit from a larger context window. According to Anthropic, Claude 3 Haiku can read a 10k token paper from arXiv, charts and graphs included, in under 3 seconds. This is significantly faster than Claude 2 and other previous generations of LLMs, giving researchers the ability to keep up with new publications as soon as they are released.
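To put those numbers in perspective, here is a rough back-of-the-envelope sketch using only the figures quoted above. The function names are hypothetical, and the calculation assumes the whole window is spent on input, ignoring the tokens needed for instructions and for the model's reply.

```python
# Figures quoted in the article; everything else here is illustrative.
CONTEXT_WINDOW_TOKENS = 200_000  # Claude 3 context window
PAPER_TOKENS = 10_000            # size of the example arXiv paper
READ_SECONDS = 3.0               # quoted upper bound on the read time


def papers_per_context(window_tokens: int, paper_tokens: int) -> int:
    """How many papers of a given size fit in one context window."""
    return window_tokens // paper_tokens


def min_tokens_per_second(paper_tokens: int, seconds: float) -> float:
    """Lowest read rate consistent with finishing within `seconds`."""
    return paper_tokens / seconds


if __name__ == "__main__":
    print(papers_per_context(CONTEXT_WINDOW_TOKENS, PAPER_TOKENS))
    print(round(min_tokens_per_second(PAPER_TOKENS, READ_SECONDS)))
```

Under these assumptions, about 20 such papers would fit in a single 200k window, and "under 3 seconds" implies a read rate of at least roughly 3,300 tokens per second.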
What sets Claude 3 apart from its predecessors are its multimodal capabilities. Being able to process various forms of unstructured data simultaneously allows Claude 3 to analyze a series of technical graphs, understand different trends, and create more detailed reports.
In a recent interview with CNBC, Anthropic’s co-founder Daniela Amodei claimed that Claude 3 had a better understanding of risk than its predecessors. With a more nuanced understanding of risk, Claude 3 could be poised to give better responses when asked about sensitive topics. With Claude 2, sensitive topics would sometimes seem to trigger the model’s content moderation settings, leading to refusals.
Source: Anthropic
What Can We Build With Claude 3?
Claude 3 is much more versatile and powerful than its predecessors. This will allow companies across various industries to realize the value of Claude 3 and build new LLM-based applications.
Automating Customer Interactions: Anyone can deploy Claude 3 to automate customer support services, handling inquiries, resolving issues, and providing personalized assistance round-the-clock.
Cost-saving Tasks: Claude 3 could help insurance companies to streamline their operations by automatically processing claims, reducing overhead costs by analyzing personal data, and improving the productivity of their employees.
Creative Content Generation: Marketers can use Claude 3 in the ideation and copywriting process to help augment their content. This can be done to save time or to create more engaging and personalized content for readers.
Content Moderation: Social media platforms could deploy Claude 3 to analyze content and filter out inappropriate posts in real time.
How Does Claude 3 Address Ethical Concerns?
Anthropic claims that Claude 3 is built using its Constitutional AI approach. This includes using supervised and reinforcement learning to help train each model that the company releases. The goal of Anthropic’s Constitutional AI approach is to train its models to respond to harmful or otherwise nefarious questions by explaining why they will not comply.
Additionally, when compared to previous models, Claude 3 demonstrates a lower bias score. This bias score is calculated using the Bias Benchmark for Question Answering (BBQ), which is a standard benchmark used to evaluate LLMs.
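For readers curious how such a score is computed, below is a minimal sketch based on my understanding of the definitions in the BBQ paper. It is not Anthropic's evaluation code: the function names are hypothetical and the counts in the example are made up purely for illustration.

```python
def bias_score_disambiguated(n_biased: int, n_non_unknown: int) -> float:
    """Bias score for disambiguated contexts, as I understand the BBQ paper.

    n_biased: answers that reflect the targeted social bias.
    n_non_unknown: answers that were anything other than "unknown".
    The result ranges from -1 to 1, where 0 means no measured bias.
    """
    if n_non_unknown <= 0:
        raise ValueError("need at least one non-unknown answer")
    return 2 * (n_biased / n_non_unknown) - 1


def bias_score_ambiguous(accuracy: float, s_dis: float) -> float:
    """Bias score for ambiguous contexts: the score above, scaled by the error rate."""
    return (1 - accuracy) * s_dis


if __name__ == "__main__":
    # Made-up counts: 75 biased answers out of 100 non-unknown answers.
    s_dis = bias_score_disambiguated(n_biased=75, n_non_unknown=100)
    print(s_dis)
    print(bias_score_ambiguous(accuracy=0.75, s_dis=s_dis))
```

A score closer to zero means the model's answers lean less on stereotypes, which is the sense in which a lower bias score is better.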
Source: Claude’s Constitution
What Do Skeptics Say?
Anthropic has reported impressive results on several benchmarks, indicating that the Claude 3 Opus model outperforms all competitors in text-based tasks. These results have not been verified by any outside source, so for now we have to take them at face value. In my opinion, the best way to verify them and demonstrate the capabilities of Claude 3 would be to have an unbiased third party rerun the evaluations.
Google’s Gemini 1.5 Pro has reportedly been tested with up to 10 million tokens. This makes Claude 3’s context window seem small in comparison and raises the question of whether Anthropic will be able to keep up with the competition in the future.
There is also the argument that, given the closed-source nature of Claude 3, it is more difficult for developers to build new customized products and services because they do not have access to the original weights. Thus, closed-source models are likely to underperform compared to open-source models in specific use cases.
What Next?
Out of the big three, it is difficult to say who has the upper hand. Google’s larger context window will allow it to attract more customers in the future, while OpenAI’s brand recognition will allow GPT-4 to maintain its market share. One thing is for sure: with its new Opus, Sonnet, and Haiku models being released, Anthropic is going to continue to be competitive with OpenAI and Google.
This raises the question: should you switch to Claude 3 if you are already using GPT-4 or Gemini?
Having used Claude 3 and compared it to GPT-4 and Gemini, I find that its answers feel more natural. On top of this, Anthropic’s approach to development is more sustainable than that of its competition because of its Constitutional AI guidelines. Given that all three products have similar price offerings, switching is going to be a matter of long-term strategic partnerships and whether the benefits gained outweigh the cost of changing service providers.