Claude 2.0 makes impressive debut, but can it overtake ChatGPT?
MySpace was an immediate hit when it was launched in August 2003, attracting millions of users within months.
However, Facebook's launch six months
later started a slow shift in the social media landscape. MySpace remained the
preferred platform for user engagement and global reach until 2009 when
Facebook firmly established itself as the dominant social networking platform.
Today, Facebook boasts just under 3
billion users, while MySpace has an estimated total of just under 7 million.
Again, the gold medal doesn’t go to the
world-changing idea; it goes to the one who slightly improved that idea.
And my immediate impression of Claude.ai
is that it might be this generation’s “Facebook” to ChatGPT’s “MySpace.”
Fine, it’s a small sample size, but …
Too early to make big statements like
that? Sure. Only U.S.- and U.K.-based writers can access Claude, although
that’s expected to change within the coming months.
So our interactions with it have been
limited.
And the way we use it at JakeGPT – to
help us generate SEO content that still adheres to Google guidelines – doesn't
take full advantage of Claude's capabilities. Factor in the small sample size,
too; we just started using the tool within the past few weeks.
That said, it’s impossible to ignore the
unbelievable results thus far.
Code Interpreter Hits the Market
The latest significant language model
entrant, Claude, was made public around the first of the year. Don't worry;
you're forgiven if you missed the news. After all, hardly a day goes by without
a new contender gunning for ChatGPT's throne.
But all have generally fallen short of
ChatGPT, especially after 4.0 came out. And the hits keep on comin’ for OpenAI,
as the new code interpreter has
been called a game changer among programmers. ChatGPT Plus users can run
code, use uploaded files like spreadsheets and docs, analyze data, and create
charts.
What makes this such a big deal?
Essentially, code interpreters democratize access to sophisticated data
analysis. Now, businesses that lack sufficient data analyst support can perform
complex queries about custom datasets in just minutes – and the outputs are
high-quality and accurate.
It’s also user-friendly; users can give
it simple, vague prompts like “analyze this data for me" and receive
useful, valuable outputs. It can also suggest ways to analyze or visualize data
that users might not have thought of themselves.
This is a significant development for digital
marketers. Any dataset – about churn, growth, audience, etc. – can be uploaded
to Code Interpreter for quick, quality analysis. The potential uses are
massive, and the insights gained can significantly outweigh the cost of a
ChatGPT Plus subscription.
Anthropic unleashes Claude 2.0
However, the announcement that went
under the radar was Anthropic's release of its updated Claude 2.0
around July 11. Anthropic was founded about two years ago by former senior
members of OpenAI, the creators of ChatGPT.
To be clear, both tools are amazing.
That said, it’s hard to ignore Claude 2.0’s ability to take tons of information
all at once. With an incredible context window of up to 100,000 tokens (about
75,000 words), Claude 2 can handle tasks that span hundreds of pages of
technical documentation or even an entire book in one go. This makes it a
powerful tool for complex tasks that require a deep understanding of large
volumes of text. Moreover, Claude 2 has significantly improved coding skills.
So we tested it out on about 50 simple
blog post assignments, and the results were stunning. Like, Buster Douglas versus
Mike Tyson stunning.
New Fast & Furious movie: ChatGPT Drift
To be fair, we were motivated to try something
new – the JakeGPT writers and prompt engineers have all noticed a shift in
quality over the past 30 days with ChatGPT 3.5 and 4.0.
Simply put, the output wasn’t very good.
At first, we figured it might be time to
tweak the prompts, but the longer the problem persisted, the more we realized
that it wasn't an issue on our end.
Sure enough, a recent Stanford University study confirmed our beliefs. Their
research showed that ChatGPT's performance on specific tasks had fluctuated
significantly over the past several months. For example, in March, GPT-4
correctly identified that the number 17077 is a prime number 97.6 percent of
the time. By June? That accuracy dropped to a mere 2.4 percent.
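For anyone who wants to check the math themselves, here is a minimal sketch of a prime test in Python. It uses plain trial division, and the function name `is_prime` is our own, not something taken from the Stanford study.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division: test odd divisors up to the integer square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


print(is_prime(17077))  # True
```

Trial division is slow for very large numbers, but it is more than enough for a five-digit number like this one.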
These fluctuations are known as
"drift." This happens when changes in one part of the model have
unpredictable effects on other parts, leading to inconsistent performance over
time. It's like a ship drifting off course due to changes in the wind or
currents.
In the case of ChatGPT, this drift led
to a noticeable decline in the quality of its outputs, which was a crucial
factor in our decision to explore new options.
And now it feels like Claude may have
just Wally Pipped ChatGPT.
Claude 48, ChatGPT 0
So here's what we did: We wrote 50 blog
posts using ChatGPT (3.5 and 4.0) and 50 using Claude 2.0. All posts were given
the same prompt to force the language bots to emulate human writing. Topics
varied from real estate to drug rehabs to chemical products. When the posts
needed additional information, we used a plugin for ChatGPT 4.0 and fed URLs
for background info; for Claude 2.0, we simply copied and pasted the info into
the chat window.
I couldn’t believe my eyes when I looked
at the results.
My first impression was that the posts
written by Claude 2.0 were just better; they sounded more natural and
less formulaic. They also scored better in Grammarly – about 20 percent
received scores of 95 or higher; all 50 received at least a 90 grade.
The highest ChatGPT score on the same 50
blog posts? 89.
Then we ran the posts through several AI
detectors: GPT4Detector.ai, CopyLeaks’ AI Detector, Content At Scale’s detector, ZeroGPT and Sapling’s AI Detector. We
also tested the copy on the toughest grader of them all, Originality.ai, but we did
those tests separately. (Originality.ai is a great tool, but it's also the
most likely to flag human-generated content as at least partially written by
AI.)
We also ran each post through the AI
detectors twice: before and after making Grammarly's suggested changes.
Once again, Claude mopped the floor with
ChatGPT. Claude’s raw, untouched copy passed all six AI detectors with flying
colors in 48 out of 50 blog posts – then went 50-for-50 when the Grammarly
changes were made.
ChatGPT, on the other hand, went
0-for-50 in both tests.
Originality.ai was the final litmus
test; a whopping 32 untouched Claude blog posts received a score of 70 or higher
in that AI detector, which works out to 64 percent. The results were only
slightly better once the Grammarly changes were made.
As for ChatGPT? As expected, the copy it
generated received all "0s" with Originality.ai, as in, it deemed all
those blog posts to be written entirely by AI.
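To put those tallies side by side, here is a short Python sketch that turns the counts reported above into percentages. The helper name `pass_rate` and the dictionary labels are ours, used only for illustration; the numbers are the ones from our tests.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the share of posts that passed, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * passed / total


TOTAL_POSTS = 50

# Counts taken from the results described in this post.
results = {
    "Claude, AI detectors, raw copy": 48,
    "Claude, AI detectors, after Grammarly": 50,
    "ChatGPT, AI detectors, raw copy": 0,
    "ChatGPT, AI detectors, after Grammarly": 0,
    "Claude, Originality.ai score of 70+, raw copy": 32,
    "ChatGPT, Originality.ai score of 70+": 0,
}

for label, passed in results.items():
    rate = pass_rate(passed, TOTAL_POSTS)
    print(f"{label}: {passed}/{TOTAL_POSTS} = {rate:.0f}%")
```

With only 50 posts per tool, a single post moves any of these rates by 2 percentage points, which is worth keeping in mind when reading the gaps.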
It’s Still Very Early in the Race
To be fair, Claude still has a long way
to go before it turns ChatGPT into MySpace or Wally Pipp.
Start with ChatGPT 5.0, the next iteration of OpenAI's language model, which is already generating considerable buzz. While its release date has yet to be announced, it's expected to debut sometime in 2023. Here's what you need to know:
- Enhanced Language Generation: ChatGPT 5, trained on a vast amount of data, is anticipated to generate even more human-like text than its predecessors.
- Improved Accuracy: A more diverse range of training data is expected to boost the model's accuracy in generating contextually appropriate responses.
- New Features: ChatGPT 5 is set to introduce new features and capabilities, further enhancing its utility.
That said, we will continue testing and
using Claude 2.0, especially as ChatGPT continues working out the kinks with
the latest "drift." Claude is still in beta testing and free (for now);
Anthropic is expected to release a paid version that is much less expensive than ChatGPT, although those details remain hazy.
But, again, it’s early in the process –
too early to declare Claude as the superior product or clear winner.
That said, as AI influencer @MattVidPro said in
one of his recent videos, “This is
incredible stuff. I am so blown away. Claude V2 should not fly under the radar.
This is a serious advancement in AI large language models, and OpenAI clearly
has some competition because it did not take Anthropic that long to catch right
up here.”
-------------------
JakeGPT is the owner and operator of JakeGPT1973.com, an AI-powered digital marketing company based in Carrollton, Texas. Email him at jakegpt@jakegpt1973.com. Click here if you’d like to learn more about JakeGPT’s DIY SEO services or its newly expanded optimized blogging services.