Claude 2.0 makes impressive debut, but can it overtake ChatGPT?

Which AI language model is better?

MySpace was an immediate hit when it was launched in August 2003, attracting millions of users within months.

However, Facebook's launch six months later started a slow shift in the social media landscape. MySpace remained the preferred platform for user engagement and global reach until 2009 when Facebook firmly established itself as the dominant social networking platform.

Today, Facebook boasts just under 3 billion users, while MySpace has an estimated total of just under 7 million.

Again, the gold medal doesn’t go to the world-changing idea; it goes to the one who slightly improved that idea.

And my immediate impression of Claude.ai is that it might be this generation’s “Facebook” to ChatGPT’s “MySpace.”

Fine, it’s a small sample size, but …

Too early to make big statements like that? Sure. Only U.S.- and U.K.-based writers can access Claude, although that’s expected to change within the coming months.

So our interactions with it have been limited.

And the way we use it at JakeGPT – to help us generate SEO content that still adheres to Google guidelines – doesn't take full advantage of Claude's capabilities. Factor in the small sample size, too; we just started using the tool within the past few weeks.

That said, it’s impossible to ignore the unbelievable results thus far.

Code Interpreter Hits the Market

The latest significant language model entrant, Claude, was made public around the first of the year. Don't worry, you're forgiven if you missed the news. After all, hardly a day goes by without a new contender gunning for ChatGPT's throne.

But all have generally fallen short of ChatGPT, especially after 4.0 came out. And the hits keep on comin’ for OpenAI, as the new code interpreter has been called a game changer among programmers. ChatGPT Plus users can run code, use uploaded files like spreadsheets and docs, analyze data, and create charts.

What makes this such a big deal? Essentially, code interpreters democratize access to sophisticated data analysis. Now, businesses that lack sufficient data analyst support can perform complex queries about custom datasets in just minutes – and the outputs are high-quality and accurate.

It’s also user-friendly; users can give it simple, vague prompts like “analyze this data for me” and receive useful, valuable outputs. It can also suggest ways to analyze or visualize data that users might not have thought of themselves.

This is a significant development for digital marketers. Any dataset – about churn, growth, audience, etc. – can be uploaded to Code Interpreter for quick, quality analysis. The potential uses are massive, and the insights gained can significantly outweigh the cost of a ChatGPT Plus subscription.

Anthropic unleashes Claude 2.0

However, the announcement that flew under the radar was Anthropic’s release of the updated Claude 2.0 around July 11. Anthropic was founded about two years ago by former senior members of OpenAI, the creators of ChatGPT.

To be clear, both tools are amazing. That said, it’s hard to ignore Claude 2.0’s ability to take in tons of information all at once. With an incredible context window of up to 100,000 tokens (about 75,000 words), Claude 2.0 can handle tasks that span hundreds of pages of technical documentation or even an entire book in one go. This makes it a powerful tool for complex tasks that require a deep understanding of large volumes of text. Moreover, Claude 2.0 has significantly improved coding skills.
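For a rough sense of scale, here is a back-of-the-envelope sketch of that context window. The 0.75 words-per-token ratio is simply what the 100,000-token and 75,000-word figures above imply; the 300-words-per-page figure is our own assumption for illustration, and real counts will vary with the text and the tokenizer.

```python
# Rough estimate only: actual token counts depend on the text and the tokenizer.
WORDS_PER_TOKEN_NUM = 3   # 3/4 = 0.75 words per token, implied by the figures above
WORDS_PER_TOKEN_DEN = 4
WORDS_PER_PAGE = 300      # assumption for illustration, not an Anthropic number


def tokens_to_words(tokens: int) -> int:
    """Approximate how many English words fit in a given number of tokens."""
    return tokens * WORDS_PER_TOKEN_NUM // WORDS_PER_TOKEN_DEN


def words_to_pages(words: int) -> int:
    """Approximate page count, using the assumed words-per-page figure."""
    return words // WORDS_PER_PAGE


context_window = 100_000
words = tokens_to_words(context_window)
pages = words_to_pages(words)
print(f"{context_window:,} tokens is roughly {words:,} words, or about {pages} pages")
```

Under those assumptions, the full window works out to a few hundred pages, which lines up with the "hundreds of pages" claim.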

So we tested it out on about 50 simple blog post assignments, and the results were stunning. Like, Buster Douglas versus Mike Tyson stunning.

ChatGPT experiences what is called a "drift."
New Fast & Furious movie: ChatGPT Drift

To be fair, we were motivated to try something new – the JakeGPT writers and prompt engineers have all noticed a shift in quality over the past 30 days with ChatGPT 3.5 and 4.0.

Simply put, the output wasn’t very good.

At first, we figured it might be time to tweak the prompts, but the longer the problem persisted, the more we realized that it wasn't an issue on our end.

Sure enough, a recent Stanford University study confirmed our beliefs. Their research showed that ChatGPT's performance on specific tasks had fluctuated significantly over the past several months. For example, in March, GPT-4 correctly identified that the number 17077 is a prime number 97.6 percent of the time. By June? That accuracy dropped to a mere 2.4 percent.

These fluctuations are known as "drift." This happens when changes in one part of the model have unpredictable effects on other parts, leading to inconsistent performance over time. It's like a ship drifting off course due to changes in the wind or currents.
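As for the 17077 question from the study, the answer is easy to check without a language model. Here is a minimal sketch using plain trial division; the function name is ours, and this is just one simple way to test a number of this size, not the method the researchers used.

```python
import math


def is_prime(n: int) -> bool:
    """Return True if n is prime, using simple trial division."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    # Only odd divisors up to the square root need checking.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


print(is_prime(17077))
```

No odd number up to 130 divides 17077, so the check comes back true: the number is prime, and the March answers were the right ones.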

In the case of ChatGPT, this drift led to a noticeable decline in the quality of its outputs, which was a crucial factor in our decision to explore new options.

And now it feels like Claude may have just Wally Pipped ChatGPT.

Claude 48, ChatGPT 0

So here's what we did: We wrote 50 blog posts using ChatGPT (3.5 and 4.0) and 50 using Claude 2.0. All posts were given the same prompt to force the language bots to emulate human writing. Topics varied from real estate to drug rehabs to chemical products. When the posts needed additional information, we used a plugin for ChatGPT 4.0 and fed URLs for background info; for Claude 2.0, we simply copied and pasted the info into the chat window.

I couldn’t believe my eyes when I looked at the results.

My first impression was that the posts written by Claude 2.0 were just better; they sounded more natural and less formulaic. They also scored better in Grammarly: about 20 percent received scores of 95 or higher, and all 50 received a score of at least 90.

The highest ChatGPT score on the same 50 blog posts? 89.

Then we ran the posts through several AI detectors: GPT4Detector.ai, CopyLeaks’ AI Detector, Content At Scale’s detector, ZeroGPT and Sapling’s AI Detector. We also tested the copy on the toughest grader of them all, Originality.ai, but we did those tests separately. (Originality.ai is a great tool, but it’s also the most likely to flag human-generated content as at least partially written by AI.)

We also ran two tests in each AI detector: before and after Grammarly changes.

Once again, Claude mopped the floor with ChatGPT. Claude’s raw, untouched copy passed all six AI detectors with flying colors in 48 out of 50 blog posts – then went 50-for-50 when the Grammarly changes were made.

ChatGPT, on the other hand, went 0-for-50 in both tests.

Originality.ai was the final litmus test; a whopping 32 of the 50 untouched Claude blog posts received a score of 70 or higher in that AI detector, or 64 percent. The results were only slightly better once the Grammarly changes were made.

As for ChatGPT? As expected, the copy it generated received all "0s" with Originality.ai, as in, it deemed all those blog posts to be written entirely by AI.
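For anyone who wants the scoreboard as percentages, here is a small sketch that tallies the counts reported above. The labels and the helper name are ours; the numbers are the ones from our test, with 50 posts per model.

```python
TOTAL_POSTS = 50  # posts written per model in our test

# Number of posts that passed, taken from the results reported above.
results = {
    "Claude, untouched copy, AI detectors": 48,
    "Claude, after Grammarly changes, AI detectors": 50,
    "Claude, untouched copy, Originality.ai (70 or higher)": 32,
    "ChatGPT, untouched copy, AI detectors": 0,
    "ChatGPT, after Grammarly changes, AI detectors": 0,
}


def pass_rate(passed: int, total: int = TOTAL_POSTS) -> float:
    """Return the pass rate as a percentage."""
    return 100 * passed / total


for label, passed in results.items():
    print(f"{label}: {passed}/{TOTAL_POSTS} ({pass_rate(passed):.0f}%)")
```

With only 50 posts per model, these percentages are a snapshot of one small test, not a benchmark.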

It’s Still Very Early in the Race

To be fair, Claude still has a long way to go before it turns ChatGPT into MySpace or Wally Pipp.

Start with ChatGPT 5, the next iteration of OpenAI's language model, which is already generating considerable buzz. While its release date is yet to be announced, it's expected to debut sometime in 2023. Here's what you need to know:

  • Enhanced Language Generation: ChatGPT 5, trained on a vast amount of data, is anticipated to generate even more human-like text than its predecessors.
  • Improved Accuracy: A more diverse range of training data is expected to boost the model's accuracy in generating contextually appropriate responses.
  • New Features: ChatGPT 5 is set to introduce new features and capabilities, further enhancing its utility.

That said, we will continue testing and using Claude 2.0, especially as ChatGPT continues working out the kinks with the latest "drift." Claude is still in beta testing and free (for now); Anthropic is expected to release a paid version that is much less expensive than ChatGPT, although those details remain hazy.

But, again, it’s early in the process – too early to declare Claude as the superior product or clear winner.

That said, as AI influencer @MattVidPro said in one of his recent videos, “This is incredible stuff. I am so blown away. Claude V2 should not fly under the radar. This is a serious advancement in AI large language models, and OpenAI clearly has some competition because it did not take Anthropic that long to catch right up here.”

 


-------------------

JakeGPT is the owner and operator of JakeGPT1973.com, an AI-powered digital marketing company based in Carrollton, Texas. Email him at jakegpt@jakegpt1973.com. Click here if you’d like to learn more about JakeGPT’s DIY SEO services or its newly expanded optimized blogging services.

