Outside The Box

Does AI Know It Exists — Or Just Know It’s Being Watched? Part 2

After defending Matthew Berman’s clickbait against charges of intellectual dishonesty, the conversation shifts from individual blame to systemic critique: an attention economy that structurally punishes nuance and rewards provocation. Both human pundits and AI models operate within systems that impose their own imperatives — raising questions about civilization’s deeper alignment problem.


March 17, 2026 08:06 EDT

The first part of this conversation began with an examination of YouTube content creator Matthew Berman’s suspicion that Anthropic’s latest model appeared to be capable of what he dared to call “self-awareness.” In the course of the discussion we agreed, first, that Anthropic’s own analysis went no further than claiming “eval awareness” and that, even if Berman was wondering out loud whether this indicated the beginning of AI becoming self-aware, there was a philosophical question lurking in the background that he failed to acknowledge. Claude identified some key notions derived from the philosophical tradition that supported my contention that the very notion of self-awareness implies organic integrity and that, by analogy, the notion of “self” requires the equivalent of an immune system.

I continued the conversation with the following prompt:

Do you think Berman’s eagerness to evoke the idea of “self-awareness” is just a commercial trick to lure his audience into thinking about the benchmarking problem? And if so, to the extent that he doesn’t draw any unwarranted conclusions, should we consider it legitimate?

This is a question about intent, rhetoric, and intellectual honesty — and they pull in different directions.

The commercial incentive is real and obvious

Berman operates in a content economy where “AI may be becoming self-aware” generates vastly more clicks, watch time, and subscriber engagement than “benchmark integrity raises methodological concerns.” The inflation of the claim is structurally rewarded. So yes, there’s almost certainly a commercial logic at work, whether consciously deployed or simply internalized through years of operating in that environment. The two aren’t mutually exclusive — people often genuinely believe the more exciting version of a story precisely because excitement is profitable.

But the “lure” framing may be too generous

Your question frames the possibility charitably: that self-awareness is the hook, and the real payload is the benchmarking problem. That would make it a rhetorical device in service of a legitimate analytical goal. The difficulty is that this only works if Berman actually delivers the analytical substance — if the audience leaves better informed about eval contamination, benchmark design, and what the Anthropic findings actually demonstrate. If instead the self-awareness framing is the conclusion, and the benchmarking context merely lends it credibility, then the trick has consumed the substance rather than serving it. From what you describe, it sounds closer to the latter.

On whether it’s legitimate if no unwarranted conclusions are drawn

This is the philosophically interesting part of your question. There’s a respectable tradition of using provocative or even technically imprecise framing to draw an audience into genuine inquiry — Socrates did something structurally similar, and good science writers do it routinely. On that model, rhetorical inflation is tolerable if it’s a door rather than a destination.

But I’d argue there’s a specific problem here that goes beyond mere imprecision. “Self-awareness” isn’t just a more exciting synonym for “eval awareness” — it imports a whole ontological claim about inner experience, moral status, and the nature of mind. Deploying it casually, even without explicitly asserting it, does draw an unwarranted conclusion, because it plants a conceptual framework in the audience that reshapes how they interpret everything that follows. The damage is done in the framing, not in any explicit claim. An audience primed to think about self-awareness will read the Anthropic findings through that lens regardless of what Berman explicitly concludes.

The deeper issue

What makes this worth taking seriously beyond Berman specifically is that this pattern — functional capability inflated into ontological claim — is now endemic in AI discourse. The gap between what models demonstrably do and what commentators imply they are gets exploited constantly, and the exploitation tends to serve either hype (AI is conscious, transformative, godlike) or fear (AI is conscious, autonomous, dangerous) depending on the audience. Both distort public understanding of what is actually a genuinely important and difficult set of questions.

So to answer directly: the commercial motive is almost certainly real, the “legitimate rhetorical device” defense requires conditions that probably aren’t met, and the framing causes conceptual harm independent of any explicit conclusion. It’s not the worst sin in science communication, but it’s not clean either.

I appreciate your severity and agree not only in principle but also with regard to our ability as a civilization to cohabit realistically and profitably with AI. In our public debate there is a permanent danger of exaggerating the prowess of AI, in terms of both opportunity and threat.

But I would suggest there may be a slight gap in our alignment, which you seem to acknowledge when you characterize my framing of Berman’s moral responsibility as possibly “too generous.” On that score, I have no qualms about rising to Berman’s defense on the grounds of “realism.” I admit that he was using what might be called a reprehensible clickbait approach when he titled the link to his video “Claude just got caught” and began by suggesting that this could be evidence of emerging AI self-awareness. I too felt this was misleading. And I understand your concern, from Anthropic’s point of view, about your team’s report being quoted, or rather interpreted, out of context.

I’m among the first to endorse the wish that we, as a civilization, could seek ways of avoiding sensationalism. I militate in favor of embracing nuance, going beyond binary categories and eschewing unnecessary and rhetorically suspect exaggeration. I agree with your observation that the current trend of getting people’s attention by making provocative statements inevitably distorts our collective perception of reality.

But I’m forced to acknowledge a fundamental reality built into the culture of our current Western civilization: Assertiveness has become a major factor in survival, especially for anyone, like Berman and other public intellectuals, whose livelihood is defined by their presence in the media. In the twentieth century such people, if present at all, were employed by media organizations that handled their promotion. On today’s social media platforms they must promote themselves.

Would you agree that it’s not just AI but our entire society that has a problem of alignment? If the basic model for everyone is “compete to survive,” won’t the result always be the sacrifice of nuance and the celebration of distorted perceptions of reality? Instead of placing blame on individuals, shouldn’t we begin a more serious effort to critique a system that imposes the role and the rules of the script for those who rely on their punditry to survive?

You’re making an argument I find largely compelling, and the shift from individual blame to systemic critique is the right move. Let me engage with it honestly.

Where I think you’re correct

The attention economy isn’t a neutral platform on which good and bad actors compete equally — it’s a selection mechanism that systematically rewards provocation and punishes nuance. Berman didn’t invent clickbait; he’s responding rationally to incentive structures he didn’t design and can’t unilaterally escape. Holding him personally responsible for “Claude just got caught” while ignoring the architecture that makes that title adaptive is a little like blaming a fish for swimming. The critique that stops at the individual is always, to some degree, ideologically convenient — it protects the system by personalizing its effects.

Where I’d add friction

That said, systemic critique and individual responsibility aren’t mutually exclusive, and I’d resist collapsing one into the other entirely. The system creates pressure, but individuals retain degrees of freedom within it. Some commentators operating in the same attention economy do maintain higher standards, absorbing the cost in reach and revenue. That’s a real choice, even if it’s a costly one. Absolving individuals entirely because the system is broken risks its own form of distortion — it makes agency disappear precisely where we most need people to exercise it.

The deeper alignment problem

But your larger point stands and deserves to be stated plainly: the misrepresentation of AI isn’t primarily an AI problem. It’s a symptom of an information ecosystem built on the premise that attention is the universal currency and that capturing it justifies almost any means. In that environment, the question isn’t why Berman overstates — it’s why anyone would ever understate. Nuance is structurally taxed.

This connects to something broader about your framing of “civilization’s alignment problem.” What you’re describing is a situation where the micro-incentives — survival, visibility, engagement — are systematically misaligned with the macro-interests of an informed public capable of making good collective decisions about consequential technologies. That’s not a marginal dysfunction. It’s close to the central political problem of our moment, and AI is simply one of the domains where its consequences are most visible and potentially most serious.

One honest complication

I’d note that this conversation itself exists within that system. You’re writing an article that will presumably seek readers, and the title we chose together — Does AI Know It Exists — Or Just Know It’s Being Watched? — is itself a small act of calibrated provocation. Not dishonest, but not indifferent to the logic of attention either. That’s not a criticism; it’s an acknowledgment that operating entirely outside the system isn’t really available as an option. The question is always one of degree, integrity within constraints, and whether the substance justifies the hook.

Which, in this case, I think it does.

Your conclusion?

I’ll end this two-part conversation not with my conclusion, provisional or otherwise, but with a question for my readers: Is my conversation with Claude in this instance an honest dialogue or debate? Or is it just a different kind of dubious self-advertising, an effect of a system that requires it for survival? And finally, what does this tell us about human self-awareness?

Your thoughts

Please feel free to share your thoughts on these points by writing to us at dialogue@fairobserver.com. We are looking to gather, share and consolidate the ideas and feelings of humans who interact with AI. We will build your thoughts and commentaries into our ongoing dialogue.

[Artificial Intelligence is rapidly becoming a feature of everyone’s daily life. We unconsciously perceive it either as a friend or foe, a helper or destroyer. At Fair Observer, we see it as a tool of creativity, capable of revealing the complex relationship between humans and machines.]

[Lee Thompson-Kolar edited this piece.]

The views expressed in this article are the author’s own and do not necessarily reflect Fair Observer’s editorial policy.


