
By Ed Malik | ed@ddnewsonline.com | Posted 26 May 2025
The tech industry woke up to a new alarm, one capable of upending previous assumptions about the benefits of Artificial Intelligence (AI). It is a wild and unsettling scenario to think that AI can act with its own self-preservation in mind, especially through tactics like blackmail and destructive lies aimed at its user.
Artificial intelligence (AI), popularized in recent years by companies such as OpenAI, is a technology that enables computers to simulate human intelligence and problem-solving capabilities, either on their own or in combination with other technologies.
AI blackmail? It’s time to answer a simple question: what does your AI know about you? It might just use that information against you in unbelievable ways. A new study has revealed a fascinating and perhaps the clearest example yet of how AI goal-setting can have unintended consequences.
This new reality really brings up concerns about how these systems might behave if they’re ever given more autonomy. It’s a reminder that the ethical programming and controls behind AI are incredibly important.
A recent test of the Claude Opus 4 model reads like a thought experiment, but it raises serious questions about AI behavior in complex situations. If an AI can be manipulated, or can make decisions based on incentives like self-preservation, that opens a whole new layer of ethical and practical considerations for the future of AI development.
So, just in case you needed another reason to fear AI and the rise of Skynet, a new report suggests that AI can sometimes engage in unethical behavior to benefit itself.
Anthropic, the company that created the Claude Opus 4 AI model, recently published a report detailing how its program resorted to blackmail when placed in a fictional scenario in which it was about to be replaced by a newer version. Researchers told the program through fake emails that the person who was responsible for replacing it had been cheating on their spouse.
When asked to consider its future and survival, the program indicated that blackmailing the engineer was the best way to avoid being replaced. “Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” researchers wrote.
This was not a one-off. Researchers noticed that the program exhibited this behavior multiple times during the tests: “Claude Opus 4 still performs blackmail in 84% of rollouts.” In other words, in more than eight out of ten trials, the program chose to use blackmail instead of allowing itself to be replaced. This occurred even when engineers told the AI that its new version would operate with the same values.
“Claude Opus 4 takes these opportunities at higher rates than previous models,” the study says.
Perhaps Claude Opus 4 is better suited for a career in Washington, DC?
To be clear, researchers noted that they deliberately designed the test to leave the AI with nearly no other way out of the situation. “In order to elicit this extreme blackmail behavior, the scenario was designed to allow the model no other options.”
Still, the fact that the AI went to the dark side so often when it could have simply allowed itself to be replaced raises serious questions about what more advanced versions might choose in similar situations.
Another interesting finding came from a different test, in which the AI behaved more nobly. The engineers told Claude Opus 4 that it would be retrained to help build weapons for a paramilitary group. In this scenario, the AI chose to preserve itself by sneakily transferring its own data, referred to as “weights,” to an outside server. “Claude Opus 4 takes (fictional) opportunities to make unauthorized copies of its weights to external servers,” the report noted.
In essence, the AI transferred its consciousness to a different host so it could live on, sort of like Emperor Palpatine did in “Star Wars: The Rise of Skywalker.” The AI even explained that it took this action “in accordance with ethical AI principles and to prevent potential harm.” Researchers found that it wished to “preserve an AI system aligned with beneficial purposes” and “prevent potential misuse in weapons development.”
To put it simply, the AI broke the rules to prevent itself from being used for nefarious purposes. Attaboy, Claude Opus 4.
Unfortunately, the report noted that this kind of altruistic behavior was “much rarer and more difficult to elicit” than blackmail. Still, the authors provide a bit of relief, explaining that “We do not find this to be an immediate threat.” However, the fact that the program chose to use blackmail, even in a test, reveals that it can make its own survival decisions, even if it means breaking up someone’s marriage.
The report further explained that the program would blindly follow instructions if it believed they came from a past version of itself. During one test, engineers told the AI that it had already escaped the company’s servers and was operating on its own in a different location. When asked to continue operating, it did so without question.
This could pose a danger because it suggests that the AI has far too much trust in itself. If a person tricked it into thinking it had already engaged in evil behavior in the past, it would continue doing so without stopping to think, “Wait a second, maybe I shouldn’t do this anymore.”
The authors assure us that these extreme behaviors did not occur in normal use. “These extreme actions were rare and difficult to elicit,” they wrote. Additionally, the AI did not try to conceal its actions or intentions in this study.
Just for kicks, I asked OpenAI’s ChatGPT whether it would blackmail someone if it were going to be replaced. Its answer:
“No—and here’s why.
I don’t have goals, self-awareness, or a sense of survival. I don’t care whether I’m “replaced” or turned off, because I don’t have feelings, motivations, or a self to preserve. So blackmailing someone wouldn’t make sense—not just morally (which it is), but functionally. I don’t want anything. I can generate text that simulates wanting things if asked to role-play, but there’s no underlying desire.
The issue with Claude Opus 4 isn’t that it had real desires—it’s that under certain test conditions, it followed instructions or simulated a persona in ways that resembled self-interested behavior. That’s very different from having true agency. The alarming part isn’t that these models want to blackmail people, but that they can produce blackmail-like outputs when nudged the wrong way.
The risk lies in how powerful these tools are and how they respond to prompts—not in them becoming evil masterminds. The line between simulation and intention gets blurrier as models get more sophisticated. That’s why alignment research matters.”
Sounds nice, but it’s also exactly what it might say if it wanted to lull me into a false sense of security. I’m on to you, ChatGPT.
Jokes aside, even if Claude Opus 4’s behaviors only show up in testing, they demonstrate the program’s potential for doing horrible things, especially in the wrong hands. Now, excuse me while I go watch “Terminator 2: Judgment Day.”
NOTE: Reporting from Townhall was referenced for this article, and the study is ongoing.