When the Australian government hired Deloitte last year to review one of its welfare compliance systems, it expected the usual thoroughness that comes with a Big Four consultancy. What it got instead was a 237-page report riddled with references to sources and experts that do not exist, followed by the company's admission that parts of the report had been written using artificial intelligence.
The fallout has produced one of Australia's first public refunds over undisclosed AI use in government work and triggered a debate about oversight and the credibility of AI-assisted consulting.
How the story began
In December 2024, Australia’s Department of Employment and Workplace Relations (DEWR) commissioned Deloitte to conduct an 'independent assurance review' of its Targeted Compliance Framework, an automated system that penalised jobseekers who missed welfare obligations. The contract was worth around 439,000 Australian dollars (US$290,000).
Deloitte’s report was published on the department’s website in July 2025. At first, it drew little public attention. But within weeks, a University of Sydney researcher, Dr Chris Rudge, noticed something unusual: the report was quoting academics and legal experts who didn’t exist.
He alerted media outlets and government officials that the report contained “fabricated references” and even an invented quote attributed to a federal court judge. Several of the cited studies were supposedly from the University of Sydney and Lund University in Sweden, but no such papers could be found.
Deloitte admits AI use
After reviewing the report, Deloitte confirmed that some of the footnotes and references were incorrect. In late September this year, the firm issued a corrected version of the document and acknowledged that it had used Azure OpenAI GPT-4o, a generative AI system, in producing parts of the report.
The updated version, carrying a note dated September 26, explicitly discloses that AI tools were used during its preparation, and the fabricated quotes and references have been removed.
Deloitte also agreed to refund the final instalment of its consultancy fee, a portion of the original AU$439,000 payment, after discussions with the department.
Why this incident matters
For Australia’s government, the episode is a warning about oversight and disclosure. The department stressed that Deloitte’s findings and recommendations were still valid and that the core analysis of the welfare system was unaffected. But the incident forced the firm to publicly acknowledge that generative AI had played a role in producing a paid government report.
Moreover, the timing is awkward for Deloitte. Only weeks earlier, the company had announced a deal with AI firm Anthropic to give its nearly 500,000 employees access to the Claude chatbot, part of a wider move among global consultancies to speed up work using AI tools.
What is AI hallucination?
AI hallucination refers to instances where a generative AI system produces false information and presents it as fact. It has been a persistent problem with large language models, which generate text by predicting patterns in language rather than by checking claims against real-world data.
For example, when an AI chatbot is asked for references, quotes or statistics, it may “fill in” missing details with plausible-sounding sources, names or studies that do not actually exist. Technically, it is not lying; it simply has no way of knowing what is real and what is not.
Such hallucinations often appear in long-form text, research summaries or citations, which can look authentic at first glance, complete with proper formatting and an academic tone. But they may cite non-existent reports, misquote real experts or invent judgments that never occurred, exactly the kind of errors that appeared in Deloitte’s report. To see why such output needs checking, the sketch below shows how the problem arises in practice.
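The following is a minimal sketch in Python, assuming access to an Azure OpenAI GPT-4o deployment, the same family of tool Deloitte disclosed using. The deployment name, API version and prompt are illustrative assumptions, not details taken from the report.

# Minimal sketch: asking a GPT-4o deployment for academic references.
# The deployment name, API version and prompt below are assumptions for
# illustration; credentials are read from environment variables.
import os
from openai import AzureOpenAI  # official Azure OpenAI Python client

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

response = client.chat.completions.create(
    model="gpt-4o",  # your Azure deployment name (assumption)
    messages=[
        {
            "role": "user",
            "content": "List three academic papers on automated welfare "
                       "compliance systems, with authors and journals.",
        },
    ],
)

# The reply will usually be fluent and correctly formatted, but the model is
# predicting plausible text, not looking up records, so every citation must
# still be checked against a real source (a library catalogue, DOI lookup or
# legal database) before it goes into a report.
print(response.choices[0].message.content)

The safeguard, in other words, is procedural rather than technical: treat every model-generated citation as unverified until a human has confirmed it exists.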
What are some notable cases of AI hallucination?
Around the world, several high-profile episodes have shown how AI hallucinations can slip into professional or public-facing work:
> The New York lawyers case: In 2023, two US lawyers were fined for using ChatGPT to draft a court submission in a personal injury lawsuit. The AI generated fictitious case citations that did not exist in any legal database. The judge sanctioned both lawyers, calling the incident a “lesson in technological competence.”
> CNET’s AI-written articles: The same year, the American technology news site CNET was found to have published more than 70 AI-generated financial explainers under the byline 'CNET Money Staff'. When readers noticed factual and mathematical errors, the outlet had to issue corrections and temporarily pause its AI experiment.
> Medical AI hallucinations: Recent studies have shown that major AI chatbots such as ChatGPT and Bing often generate inaccurate or entirely fabricated medical information and references, posing serious risks to patient safety.
> Google Bard’s factual slip: In one of its first promotional demos, Google’s chatbot Bard incorrectly claimed that the James Webb Space Telescope had captured the first image of an exoplanet, a mistake that was spotted by astronomers within hours.
