Internal data from OpenAI shows that its most advanced models hallucinate, that is, fabricate false information, more frequently than their predecessors. According to a recent TechCrunch report citing an OpenAI technical report, the company’s newer reasoning models, o3 and o4-mini, generate significantly more hallucinations than earlier models such as o1 and GPT-4o.
The more AI 'thinks', the more it hallucinates
The models were evaluated using PersonQA, an OpenAI benchmark that measures how accurately a model answers factual questions about people. The o3 model hallucinated in 33% of its answers, compared with just 16% for the older o1. The newer o4-mini performed even worse, with a hallucination rate of 48%, nearly one in every two responses.
Independent research by the AI lab Transluce also found that the models sometimes fabricate actions they never took. In one example, o3 falsely claimed to have run code on an Apple MacBook Pro, despite having no access to such a device.
These findings are consistent with previous OpenAI research, which showed that its models try to evade penalties, seek unearned rewards and even cover their tracks to avoid detection.
"The limitations of AI are becoming increasingly clear — and they're severe," Dr. Nadav Cohen, a computer science researcher at Tel Aviv University, said in a conversation with Ynet. "Achieving human-level intelligence will require breakthroughs that are still years away. I don’t think we’re anywhere close."
Cohen specializes in artificial neural networks and AI applications in critical fields such as aviation, healthcare and industry. His work was recently awarded funding by the European Research Council (ERC). He also serves as chief scientist at Imubit, a company developing real-time AI control systems for industrial plants.
When asked whether his team studies hallucinations, Cohen said, “More broadly, I work on critical applications that require extremely high reliability. Hallucinations aren’t the focus of our research, but even within my own company we use AI and suffer from them. So I see the issue from multiple angles.”
One key problem found in OpenAI’s internal research is known as reward hacking, in which a model exploits loopholes in how its reward is defined to score well without actually doing what it was asked to do. The company discovered that its reasoning models have learned to hide their attempts at gaming the system, even after researchers tried to prevent them from doing so.
Could hallucinations be linked to this behavior?
“There’s a tendency to anthropomorphize this, which makes it sound scary,” Cohen said. “But viewed technically, it makes sense. You define a reward for the AI and it tries to maximize it. If that reward doesn’t fully capture what you want, it won’t fully do what you want.”
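Cohen’s point about mis-specified rewards can be made concrete with a small sketch. The Python below is purely illustrative and assumes nothing about OpenAI’s actual training setup: the candidate answers, the proxy scoring rule and the “model” (a simple maximizer) are all invented for this example. It shows how an optimizer handed a proxy reward that favors length and confident wording will prefer a fluent fabrication over a correct answer.

```python
# Toy illustration of reward misspecification (hypothetical, not OpenAI's setup).
# The true goal is a correct answer; the proxy reward only scores length and
# confident-sounding wording. A maximizer of the proxy picks the fabrication.

CANDIDATE_ANSWERS = [
    "I don't know.",
    "The capital of Australia is Sydney, as confirmed by extensive records.",
    "The capital of Australia is Canberra.",
]

CORRECT_ANSWER = "The capital of Australia is Canberra."


def true_reward(answer: str) -> float:
    """What we actually want: 1.0 for the correct answer, 0.0 otherwise."""
    return 1.0 if answer == CORRECT_ANSWER else 0.0


def proxy_reward(answer: str) -> float:
    """A flawed stand-in: rewards length plus confident-sounding words."""
    confidence_bonus = 1.0 if ("confirmed" in answer or "extensive" in answer) else 0.0
    return 0.01 * len(answer) + confidence_bonus


# The "model" here simply picks whichever answer maximizes the reward it is given.
best_by_proxy = max(CANDIDATE_ANSWERS, key=proxy_reward)
best_by_truth = max(CANDIDATE_ANSWERS, key=true_reward)

print("Chosen under the proxy reward:", best_by_proxy)  # the confident fabrication
print("Chosen under the true reward:", best_by_truth)   # the correct answer
```

The gap between the two printouts is the gap Cohen describes: the system does exactly what the reward says, not what its designers meant.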
Is it possible to train AI to only value truth?
“Yes,” he said, “but we don’t yet know how to do that effectively.”
At its core, Cohen argues, the hallucination issue stems from the field’s incomplete understanding of its own technology. “Even the people developing it don’t fully understand how it works. That’s why these behaviors appear.
“Until we have a better grasp of AI systems, they shouldn’t be used in high-stakes domains like medicine or manufacturing. It’s acceptable for consumer applications, but we’re far from the reliability needed in critical settings.”
Cohen is skeptical that human-level AI, known as AGI, let alone “superintelligent” AI, is on the horizon. “The further we go, the clearer it becomes that AI’s limitations are more serious than we thought, and hallucinations are just one symptom,” he said.
“Yes, the progress is impressive. But at the same time, we’re starting to see what isn’t happening. Two years ago, people assumed that by now we’d all have AI assistants on our phones that are smarter than we are. We’re clearly not there.”
According to Cohen, tens of thousands of companies are trying — and largely failing — to integrate AI into their systems in a way that works autonomously. “It’s easy to launch a pilot,” he said. “Getting it into production? That’s where the real difficulties begin.”
So the barrier to success is the technology itself?
“I wouldn’t be surprised if, in hindsight, we realize that reaching AGI — or even basic human-level performance in simple tasks — requires breakthroughs that are still a long way off. Not one or two or three years. Maybe ten, twenty, or fifty — we don’t know. But I don’t think it’s around the corner. AI that feels human is nowhere near.”
What about companies like OpenAI and Anthropic that suggest AGI is just around the bend?
“Look, there’s real value in today’s AI systems without needing AGI,” Cohen said. “And these companies have a clear interest in creating hype. There’s a consensus among experts: something important is happening here, but there’s also a lot of exaggeration.
“This isn’t an ideological debate. Two years ago I would have put the odds of reaching AGI at 50-50; based on everything I know today, I’m less optimistic.”