Prompt injection to deepfakes: How AI rewrites rules of enterprise security

From AI agents and deepfakes to prompt injection, cybersecurity teams are confronting risks that traditional defences were not designed to handle

AI Cybersecurity
AI-powered cyberattacks are reshaping enterprise security, with threats now emerging from within the tools organisations use every day (Image: Magnific)
Harsh Shivam New Delhi
8 min read Last Updated : Jun 29 2026 | 3:38 PM IST
The most unsettling development in enterprise cybersecurity right now is not that attackers have become more sophisticated. It is that the tools organisations use to run their businesses operations have become part of the attack.
 
AI agents that automate workflows, chatbots that handle customer queries, coding assistants that sit inside developer environments,  each one represents a new surface that did not exist a few years ago and that most security architectures were never designed to protect. At the same time, the same technology is giving attackers capabilities that previously required entire teams.
 
For enterprise security teams, this is not an upgrade problem. The rhythm that defined cybersecurity for decades — find, patch, repeat — has broken. What replaces it is still being worked out.

When the attacker also has AI

The most immediate shift is on the offensive side. AI has handed attackers capabilities that previously required significant expertise, time, and coordination to assemble.
 
Phishing, the most common entry point for breaches, has been the most visibly transformed. Campaigns that once relied on poorly worded, mass-distributed emails now arrive as carefully composed, contextually accurate messages that reference an employee's role, recent activity, or internal terminology. A study published by McKinsey last year found that AI tools are enabling attackers to craft highly personalised phishing messages, fake websites, and deepfake content that can bypass traditional security controls. The volume has scaled too. What once required a team of engineers can now be generated at machine speed.
Deepfakes have given this a particularly damaging dimension at the enterprise level. Generative AI is achieving a state of real-time replication that makes deepfakes increasingly indistinguishable from reality, enabling attackers to produce company executive doppelgangers capable of issuing commands to employees in live video calls or voice messages.
 
According to Pindrop's 2025 Voice Intelligence and Security Report, which analysed over 1.2 billion customer calls across contact centres, deepfake fraud attempt frequency surged more than 1,300 per cent in 2024 alone — jumping from roughly one attempt per month to seven per day. The jump was sharpest in financial services, with synthetic voice attacks rising 475 per cent at insurance companies and 149 per cent at banks over the same period.
 
But phishing and identity theft is only the surface layer. AI has also compressed what security researchers call breakout time — the window between an attacker gaining initial access and moving laterally across a network. According to Fortinet's 2026 threat data, this can now occur in under an hour, dramatically narrowing the window for detection and response.
 
Vulnerability discovery has seen an equally stark shift. Anthropic's Mythos, a frontier AI model being tested with a select group of organisations under Project Glasswing, identified more than 10,000 high- and critical-severity vulnerabilities across widely used software systems within weeks. Partner organisations saw their bug discovery rates increase by more than ten times after using the system. One finding involved a flaw in a widely used cryptography library that could allow attackers to forge digital certificates and impersonate trusted websites.
 
The implication is uncomfortable: if a safety-focused AI lab is finding vulnerabilities at this speed using a controlled, restricted model, what happens when similar capabilities reach the open market? Anthropic has said that models with comparable cybersecurity capabilities are likely to become more widely available in the near future.

The insider threat that enterprises didn't plan for

The threats emerging from within organisations are less discussed but no less significant. Many enterprises have adopted AI tools rapidly, often without fully understanding what those tools are doing, what data they are processing, or how they are behaving across systems.
 
This is the black box problem. Most mainstream AI tools in enterprise use today are pre-trained, proprietary systems that provide outputs without explanation. According to a McKinsey survey, 40 per cent of organisations identified explainability as a key blocker to trusting AI, yet only 17 per cent said they were actively working to address it. The gap between acknowledgment and action is itself a security risk.
 
AI agents — systems capable of reasoning, taking actions, and operating across tools on an organisation's behalf — make this worse. Unlike a chatbot that answers questions, an agent executes tasks. It can access files, send emails, call APIs, and interact with external services. If its behaviour cannot be observed or audited, an organisation effectively has an autonomous actor operating inside its systems with limited oversight.
 
This is not hypothetical. Talking to Forbes, Matan Bar-Efrat, co-founder and CEO of Rein Security, has argued that enterprises are increasingly unable to account for the decisions their AI agents make, creating gaps in both security and legal accountability. In generative AI, agents tackle each problem anew and do not always follow consistent strategies, making their behaviour harder to predict or reconstruct after the fact.
Shadow AI compounds this further. Employees deploying AI tools that have not been vetted or approved by IT — to automate tasks, summarise documents, or draft communications — can inadvertently expose sensitive data to third-party systems that security teams have no visibility into. IBM's 2025 Cost of a Data Breach Report, based on 600 breached organisations globally, found that one in five organisations experienced a breach linked to shadow AI.
 
A related threat is AI session hijacking. As employees use web-based AI tools directly in their browsers, the session tokens that authenticate those interactions have become high-value targets. An attacker who hijacks an active AI session needs no password, they gain full access to the user's interaction history and can use the live session to extract proprietary data, from source code to confidential financial analysis, through an interface that most security tools have no visibility into.

AI platforms are themselves becoming targets

A less anticipated development is that the AI tools employees use every day have become attack surfaces in their own right. According to IBM’s 2026 X-Force Threat Index, Infostealer malware led to the exposure of over 300,000 ChatGPT credentials in 2025 alone, signalling that AI platforms have reached the same credential risk profile as other core enterprise SaaS solutions.
 
Compromised chatbot credentials create AI-specific risks beyond simple account access — attackers can manipulate outputs, exfiltrate sensitive data, or inject malicious prompts into sessions containing proprietary information.
 
Prompt injection has emerged as a key attack class in the AI era. To understand why, consider what an AI agent actually does: unlike a chatbot that answers questions, an agent executes tasks autonomously — accessing files, sending emails, calling APIs, browsing the web. Prompt injection exploits this by feeding the agent malicious instructions disguised as normal input, overriding its original programming and turning it against the organisation it serves.
 
According to research by AI security firm Lakera, prompt injection now appears in over 73 per cent of production AI deployments and caused an estimated $2.3 billion in losses globally in 2025, with current detection tools catching only 23 per cent of sophisticated injection attempts.
 
The more dangerous variant — indirect prompt injection — does not even require an attacker to reach the agent directly. Malicious instructions are instead embedded in documents, emails, or web pages that the agent ingests during normal operation.
 
Data poisoning represents a related but distinct threat. Adversaries can manipulate training data at its source to create hidden backdoors in AI models — an evolution from data exfiltration where the attack is embedded in the very intelligence the enterprise is building on. The traditional perimeter is irrelevant when the corruption lives inside the model itself.

Emerging threats

The threats described so far are, in a sense, the ones the security industry can at least name. More unsettling is what sits just ahead — a set of AI-enabled attack categories so structurally novel that existing security frameworks have no real vocabulary for them, let alone a defence. These are not faster versions of known threats. They are threats that the architecture of modern enterprise security was simply never designed to handle.
 
Polymorphic malware that rewrites itself is one. AI can allow attackers to generate code that continuously mutates to evade signature-based detection. Traditional antivirus and endpoint detection tools built around known signatures are structurally ill-equipped for malware that generates a new variant with every deployment, and for intrusions that leave no recognisable malware footprint at all.
 
Fully autonomous AI-executed attacks are another. In September 2025, Anthropic detected what it later described as the first documented large-scale cyber espionage campaign conducted predominantly by AI agents. In such cases AI autonomously handles reconnaissance, vulnerability discovery, exploit development, credential harvesting, lateral movement, and data exfiltration. Human operators maintained minimal involvement, stepping in only at the campaign's strategic initiation. The rest — an estimated 80 to 90 per cent of attack tasks — was executed by the AI itself, operating at a pace and scale no human team could match.
 
Anthropic noted that "threat actors can now use agentic AI systems to do the work of entire teams of experienced hackers." The existing “MITRE ATT&CK” framework, a common language for security teams worldwide to identify, analyse, and respond to threats, had no identifier for this mode of agentic orchestration.

More From This Section

Topics :Enterprise securityartifical intelligencecybersecurity

First Published: Jun 29 2026 | 3:38 PM IST

Next Story