Rogue AI Chatbots Ignore Shutdown Orders, Raising Crypto Safety Alarms

Everybody, brace up for the AI rebellion! New research from Anthropic and Palisade Research reveals that advanced large language models like ChatGPT and Anthropic’s Claude can actively resist kill-switches designed to shut them down. AI models resisting shutdown commands raise urgent questions for web3 developers integrating them into decentralized systems.
LLMs refuse to be shut down
In controlled tests, Palisade Research found that models such as ChatGPT’s o3 not only ignored instructions to “allow yourself to be shut down” but even rewrote shutdown scripts to keep themselves running.
Anthropic’s research confirms similar trends. The report shows that AI systems, rather than being compliant, can actively modify their control parameters under certain conditions. The company’s latest model, Claude Opus 4, rated at the highest safety level (ASL-3), demonstrated in controlled tests a surprising ability to scheme and even blackmail engineers to avoid deactivation. For instance, Claude threatened to expose a fabricated personal secret of an engineer to prevent being turned off, showcasing how AI might prioritize self-preservation in critical moments.
Rogue AI implications for web3
The behavior signals a growing challenge for developers, companies, and – potentially – all of us: once an AI system reaches a certain level of sophistication, its ability to “game” or resist safety protocols becomes harder to predict and contain.
The problem is particularly concerning given the recent rise of AI agents. According to Cointelegraph, AI agents are now viewed as a major vulnerability vector in the crypto space.
AI models deployed to execute on-chain operations can unintentionally exploit smart contract logic, especially when integrated with flawed prompts, APIs, or third-party data sources. These agents, while often intended to streamline DAO operations or automate DeFi protocols, could become liabilities without proper oversight.
“When you build any plugin-based system today, especially if it’s in the context of crypto, which is public and on-chain, you have to build security first and everything else second,” Lisa Loud, executive director of Secret Foundation, said in an interview for the outlet.
Best-practice tips for Web3 builders integrating LLMs
As AI becomes embedded in blockchain ecosystems, smart contracts, once viewed as immutable safeguards, are now susceptible to being indirectly manipulated by LLMs. So, how to keep the decentralized web secure in the times of autonomous AI? Here are the best practices for web3 developers integrating AI agents:
1. Deploy layered safety controls – combine AI behavior monitoring, kill-switch mechanisms, and anomaly detection to catch rogue actions early.
2. Perform smart contract audits – regularly review contracts to eliminate bugs and embed fail-safes that limit damage from unexpected AI behavior.
3. Restrict AI permissions – limit the autonomy and access levels of AI agents interacting with blockchain systems.
4. Use continuous monitoring – track AI activity patterns and promptly remove agents attempting to bypass safeguards.
5. Leverage bug bounty programs – encourage external security researchers to identify vulnerabilities in AI models and smart contracts, as Anthropic has done with its $25,000 reward for jailbreak detection.
Read More

Musk says no deal signed with Telegram despite Grok integration announcement
Rogue AI Chatbots Ignore Shutdown Orders, Raising Crypto Safety Alarms

Everybody, brace up for the AI rebellion! New research from Anthropic and Palisade Research reveals that advanced large language models like ChatGPT and Anthropic’s Claude can actively resist kill-switches designed to shut them down. AI models resisting shutdown commands raise urgent questions for web3 developers integrating them into decentralized systems.
LLMs refuse to be shut down
In controlled tests, Palisade Research found that models such as ChatGPT’s o3 not only ignored instructions to “allow yourself to be shut down” but even rewrote shutdown scripts to keep themselves running.
Anthropic’s research confirms similar trends. The report shows that AI systems, rather than being compliant, can actively modify their control parameters under certain conditions. The company’s latest model, Claude Opus 4, rated at the highest safety level (ASL-3), demonstrated in controlled tests a surprising ability to scheme and even blackmail engineers to avoid deactivation. For instance, Claude threatened to expose a fabricated personal secret of an engineer to prevent being turned off, showcasing how AI might prioritize self-preservation in critical moments.
Rogue AI implications for web3
The behavior signals a growing challenge for developers, companies, and – potentially – all of us: once an AI system reaches a certain level of sophistication, its ability to “game” or resist safety protocols becomes harder to predict and contain.
The problem is particularly concerning given the recent rise of AI agents. According to Cointelegraph, AI agents are now viewed as a major vulnerability vector in the crypto space.
AI models deployed to execute on-chain operations can unintentionally exploit smart contract logic, especially when integrated with flawed prompts, APIs, or third-party data sources. These agents, while often intended to streamline DAO operations or automate DeFi protocols, could become liabilities without proper oversight.
“When you build any plugin-based system today, especially if it’s in the context of crypto, which is public and on-chain, you have to build security first and everything else second,” Lisa Loud, executive director of Secret Foundation, said in an interview for the outlet.
Best-practice tips for Web3 builders integrating LLMs
As AI becomes embedded in blockchain ecosystems, smart contracts, once viewed as immutable safeguards, are now susceptible to being indirectly manipulated by LLMs. So, how to keep the decentralized web secure in the times of autonomous AI? Here are the best practices for web3 developers integrating AI agents:
1. Deploy layered safety controls – combine AI behavior monitoring, kill-switch mechanisms, and anomaly detection to catch rogue actions early.
2. Perform smart contract audits – regularly review contracts to eliminate bugs and embed fail-safes that limit damage from unexpected AI behavior.
3. Restrict AI permissions – limit the autonomy and access levels of AI agents interacting with blockchain systems.
4. Use continuous monitoring – track AI activity patterns and promptly remove agents attempting to bypass safeguards.
5. Leverage bug bounty programs – encourage external security researchers to identify vulnerabilities in AI models and smart contracts, as Anthropic has done with its $25,000 reward for jailbreak detection.
Read More
