Home / World News / Google, OpenAI, Anthropic step up fight against rising AI cyber threats
Google, OpenAI, Anthropic step up fight against rising AI cyber threats
Leading global AI leaders are racing to fix security flaws in chatbots that hackers are exploiting to steal data and launch cyberattacks
LLMs face another risk called data poisoning, where attackers insert hostile material into training data so models learn to misbehave. (Photo/Representative image)
3 min read Last Updated : Nov 03 2025 | 6:02 PM IST
Don't want to miss the best from Business Standard?
Top AI players, including Google DeepMind, Anthropic, OpenAI and Microsoft, are stepping up efforts to stop a key security weakness in large language models (LLMs), according to a report by Financial Times. The flaw, known as “indirect prompt injection”, allows attackers to hide commands in web pages, emails or other inputs that trick models into revealing sensitive information or acting in harmful ways.
In other words, prompt injection vulnerability happens when user prompts change the LLM’s behaviour or output in unintended ways. These inputs can influence the model even if people can’t see or understand them, as long as the model processes the hidden content.
Why the flaw matters
The flaw makes LLMs valuable not only to legitimate users but also to criminals. It widens the attack surface for phishing, fraud, code generation for malware and deepfake-enabled scams. Companies using AI tools risk data leaks and financial loss if the problem is not fixed.
AI teams are reportedly combining tactics, automated red teaming, external testing and AI-powered defence tools to detect and limit malicious uses. But experts warn that there is no complete fix yet because LLMs are built to follow instructions and cannot reliably tell the difference between trustworthy and harmful inputs, the news report said. ALSO READ: Cyberattack becomes personal: How AI is making identity theft easy
How the attacks work
Indirect prompt injection hides malicious instructions inside otherwise normal-looking content. When an LLM processes that content, it may obey the hidden instruction, for example, to dump confidential files or reveal private data. This same tendency also enables “jailbreaking”, where users prompt models to ignore safeguards.
LLMs face another risk called “data poisoning”, where attackers insert hostile material into training data so models learn to misbehave.
Threat landscape
AI has already made cybercrime more accessible. Tools help amateurs write harmful code and allow pros to scale attacks. LLMs can quickly generate new malicious code, complicating detection. Studies and industry reports link AI to a rise in ransomware, phishing and deepfake fraud. AI can also harvest personal data from public profiles to support sophisticated social engineering, Financial Times reported.
Companies have seen cases where language models were abused end-to-end, credential harvesting and system infiltration were used to extort organisations for large sums.
What companies should do
Experts urge firms to monitor for new threats, limit who can access sensitive datasets and restrict use of powerful AI tools. Security teams should combine automated detection, human review and strict access controls to reduce risks.
The race between attackers and defenders will continue. AI builders are improving testing and safeguards, but researchers warn that the root tension – models designed to follow instructions – makes a full solution difficult.
You’ve reached your limit of {{free_limit}} free articles this month. Subscribe now for unlimited access.